Milestone 2 pentagram
This is the submission page for Milestone 2 by Team pentagram.
- 1 Attend a Panel to Hear from Workers and Requesters
- 2 Reading Others' Insights
- 2.1 Worker perspective: Being a Turker
- 2.2 Worker perspective: Turkopticon
- 2.3 Requester perspective: Crowdsourcing User Studies with Mechanical Turk
- 2.4 Requester perspective: The Need for Standardization in Crowdsourcing
- 2.5 Both perspectives: A Plea to Amazon: Fix Mechanical Turk
- 3 Do Needfinding by Browsing MTurk-related forums, blogs, Reddit, etc
- 4 Synthesize the Needs You Found
Attend a Panel to Hear from Workers and Requesters
Some observations and ideas noted during the morning panel session:
- some sites list available HITs: an easy way to find tasks undertaken by workers
- the joy of giving back to community is a motivating point for turkers
- HITExploiter - a script to automate and rate tasks using Turkopticon
- Motivation for workers in MTurk
- wanting to help people
- social concepts
- the worker community is very friendly and helpful; more or less like a Facebook group discussing anything under the sun
- worker <-> requester interaction
- email requesters directly (when instructions are not clear)
- invite requesters to forums
- verify credibility of requesters
- experienced workers often help new requesters learn to use the GUI, the API and so on
- the biggest hurdles for newbie workers
- finding a matching job - very hard
- poor UI in Amazon Mturk
- requires scripting knowledge to make decent money
- outside the US, it is more difficult to find work
- frustration due to poorly paying jobs
- some very important suggestions --> ask questions
- the dropoff rate on MTurk is severe, because of rejections and the lack of feedback
- Some common thoughts
- prevailing wage of $8-$10 per hour. Why? (demographically and ethically a minimum wage)
- main problem: was the task taken seriously?
- how to detect cheating: use open-ended questions
- time vs. money
- balancing act
- give incentives/bonus
- requesters don't have time to follow forums to learn what is said about their tasks
- requesters prefer personal email
- no assurance that tasks will be completed
- very complicated to get a task done by a specific worker
- threshold for rejecting HITs adopted by many requesters
- use open-ended questions: responses copied directly from Wikipedia, or gibberish, imply bad work
- use timers
- feel that workers didn't read instructions completely
- problem in India and outside US
- proxy accounts
- Indians using US-based MTurk accounts to turk and make money
- account selling - very common pre-2012
- Some mistakes by requesters
- giving too little time for tasks (no tutorial from Amazon's side)
Some observations and ideas noted during the evening panel session:
- Workers feel the peak HIT time differs slightly from typical working hours, i.e. 7 AM to 3 PM. Some others, however, feel that weekends and unusual hours like midnight to 3 AM are more productive, as fewer people are working on the better HITs at those times.
- Workers are ambivalent about the system of assigning ratings to workers. Since there is no concept of task-specific rating, a person who has a high rating in transcription might be misconstrued as skilled at translation, for example. oDesk overcomes this problem with a recommendation method where requesters can send private messages to other requesters, recommending a worker.
- While MTurk can also be used for some volunteer work like filling free surveys, workers generally prefer working for money.
- Micro-tasks on MTurk provide a highly variable income. What takes 2 days to earn in one week might take 6 days in another.
- Workers would prefer tasks with better tagging and better descriptions. They would not want to see nonsensical tags like "Hey! This work is fun!" on something like a 40-minute survey.
- Requesters would like to see a platform which is conducive for social science research, where demographics of the population are known. This would ensure that people don't fake survey data.
- Personalised tasks are really hard to convert to turk-able tasks, so some tasks can't be put up on MTurk at all.
- Single-person tasks are also hard to request work for. For example, analysing audio files is a single person task.
- Requesters would love to have a system where they can request small, trivial tasks and, based on the outcomes, handpick the productive workers to assign the harder task to.
- In order to check the correctness of the user input, it is not necessary to give questions which have answers already. There are other ways like attention checks or honesty checks.
- Requesters prefer breaking tasks down to the smallest possible extent and then requesting workers to complete them.
- Requesters themselves feel the worker rating system is bad. They feel workers are defensive about their rating, and that deters them from working properly. At the end of the day, it also results in a bad relationship between the requester and the worker.
- Also, as the requester, there's always a chance that a worker with a high rating might not be good, as other requesters also might have been reluctant to reject a HIT.
- If a requested HIT does not get an adequate response, requesters go to forums to find out from workers whether the task is priced appropriately, and adjust the pricing accordingly.
Reading Others' Insights
Worker perspective: Being a Turker
The paper supplements the frequently discussed issues and solutions of requesters on crowdsourcing platforms with the rarely discussed issues of the Turkers. It clearly outlines the one-sidedness of current crowdsourcing platforms in terms of:
- Information asymmetry
- Imbalance of power between the requesters and turkers
The research seems to be elaborate and realistic in terms of the opinions gathered.
Observations about workers
- Clearly, even though some workers do Turking as a source of entertainment or experience, most want monetary gains from spending their time and effort. The most mature workers aim not only at high-paying jobs but also at interesting ones. Pay expectations vary: some are satisfied with earning something extra, while others are dissatisfied with doing excessive work for small pay.
- Workers earn widely varying amounts of cash on crowdsourcing platforms. It has to be noted that the cash made is usually not enough to rely upon as a constant source of income. The interest common to all turkers is to make more cash than in their previous attempt.
- Turker Nation and similar forums are typically used by turkers for reviewing requesters, though the crowdsourcing platforms themselves have hardly any review system for requesters.
- Interaction between turkers and requesters would be a great way to exchange needs and make tasks more viable from both perspectives. It also offers the prospect of amateur turkers and requesters being guided by experienced ones.
- It also has to be considered that forums such as these tend to become stagnant accusation portals after some point. Though there are instances of turkers owning up to individual mistakes, it largely seems to be a blame game.
Worker perspective: Turkopticon
1) What observations about workers can you draw from the readings? Include any that are strongly implied but not explicit.
2) What observations about requesters can you draw from the readings? Include any that are strongly implied but not explicit.
Requester perspective: Crowdsourcing User Studies with Mechanical Turk
This paper discusses how micro-task markets like Amazon Mechanical Turk provide a potential paradigm for engaging a large number of users at low time and monetary cost.
Further, it briefly explains the advantages and disadvantages of such a paradigm with examples of experiments conducted on users, and discusses measures requesters can take to ensure the correctness of the input provided by workers.
Advantages of micro-task markets
- 1. They are really good for getting small human-intelligence tasks done.
- 2. They are very useful for tasks which require a lot of human participation, like surveys, design testing, rapid prototyping, ratings, performance measures etc.
- 3. The potential of having a lot of people from diverse background providing input is very appealing.
- 4. Cost of acquiring user input is very low.
Disadvantages of micro-task markets
- 1. Inability to ensure authenticity of work.
- 2. Effort needed by requester to validate the input obtained.
- 3. Workers try to game the system by providing nonsensical responses in a short span of time.
- 4. People may be deterred by the low pay of the tasks.
In the paper, they describe two experiments:
Experiment 1: Workers were asked to rate Wikipedia articles on a 7-point Likert scale, with the aim of matching the ratings Wikipedia experts would give. The ratings were to be based on how well written, factually accurate, neutral, well structured, and overall high quality the article was. Workers also had to fill in a free-form text box with feedback.
- RESULTS: Very weak similarity between the experts' ratings and the crowd's ratings. Many responses were irrelevant. The free-form text had semantically empty content in some cases.
Experiment 2: Same as above, except that workers first had to answer some quality-control questions with definite, known answers before attempting the actual rating. They also had to write 4-6 keywords describing the article, which ensured that they read it.
- RESULTS: Much greater similarity to the experts' ratings.
Guidelines to requesters
The paper suggests that requesters should add verifiable questions to the task, to ensure worker honesty, and make the worker believe that the answers would actually be scrutinised.
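As a rough illustration of this guideline, the sketch below filters submissions using verifiable questions, a minimum time on task, and the 4-6 keyword requirement from the second experiment. It is only a sketch under assumptions: the `Submission` fields, the question ids and answers in `GOLD_ANSWERS`, and the `MIN_SECONDS` threshold are hypothetical and are not taken from the paper or from the MTurk API.

```python
from dataclasses import dataclass, field


@dataclass
class Submission:
    """One worker's response to a rating task (hypothetical structure)."""
    worker_id: str
    answers: dict          # question id -> the worker's answer
    seconds_spent: float   # time the worker spent on the task
    keywords: list = field(default_factory=list)


# Hypothetical verifiable questions whose answers the requester already knows.
GOLD_ANSWERS = {"num_references": "12", "num_images": "3"}

# Assumed threshold: anything faster is treated as not having read the article.
MIN_SECONDS = 30.0

# The second experiment asked for 4-6 keywords describing the article.
MIN_KEYWORDS, MAX_KEYWORDS = 4, 6


def passes_quality_checks(sub: Submission) -> bool:
    """Return True if the submission passes all three checks."""
    gold_ok = all(
        str(sub.answers.get(qid, "")).strip().lower() == expected.lower()
        for qid, expected in GOLD_ANSWERS.items()
    )
    time_ok = sub.seconds_spent >= MIN_SECONDS
    keywords_ok = MIN_KEYWORDS <= len(sub.keywords) <= MAX_KEYWORDS
    return gold_ok and time_ok and keywords_ok


def filter_submissions(subs):
    """Keep only the submissions that pass the quality checks."""
    return [s for s in subs if passes_quality_checks(s)]
```

A requester would probably want to review borderline submissions by hand rather than reject them automatically, since an unclear question can make an honest worker fail a check.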
Requester perspective: The Need for Standardization in Crowdsourcing
Observations about workers
- Tasks are chosen by workers through an online spot market.
- Workers are not sued or sacked for unsatisfactory task completion; the only consequence is that they don't get paid for the HITs they completed.
- Tasks in high demand generally require only low-skilled workers.
- Workers need to strictly and constantly follow the rules in case of standardised tasks.
- Workers are free to choose any task (tasks differ in level of difficulty and in the skill set required).
- Method for suitable task retrieval by workers is inadequate and inefficient.
- Users of crowdsourcing platforms often get mixed, unpredictable results.
- In the "curated garden" approach, practitioners gain the scalability and cost savings of crowdsourcing.
Observations about requesters
- Requesters can post any kind of task they want, on their own terms, conditions and pricing.
- Reputation systems are weak and easily subverted.
- Occasionally, scammers are encountered too.
- Some simply recruit workers (as accomplices) for criminal or offensive activities.
- Don't price the tasks nominally.
- Cannot rely on the quality of the task performed.
- Requesters evaluate the answers independently.
Both perspectives: A Plea to Amazon: Fix Mechanical Turk
The author of this blog is an experienced professor in the Department of Information, Operations, and Management Sciences at the Leonard N. Stern School of Business of New York University.
At the time of writing, he had almost 4 years of experience using AMT. In the blog post, he gives a critical analysis of what is missing from the platform.
Some thoughts of author
- A need to evolve
- the author stresses that Amazon has completely distanced itself from the workings and policies of MTurk (Amazon's hands-off approach)
Observations about workers
Trustworthiness guarantee for requesters
- requesters on MTurk are acting like slave masters
- some common problems with requesters
- a) reject good work
- b) not pay on time
- c) incomplete info on tasks
- new requesters tend to leave the market if they are not guided by experts on how to post tasks
- some objective characteristics that workers should look for in a requester before working for him
- a) speed of payment
- b) rejection rate for requester
- c) volume of work posted
- these call for a system which can present all this information in a format that is accessible to every worker
- a trustworthy market environment reduces the search costs for both requester and worker
A better user interface
- make task finding an easy process for workers
- workers have no means of navigating through the sea of tasks to find those that match their interests
- this forces workers to select tasks based on a few priorities; this in turn leads to uncertainty in the completion time of tasks posted on MTurk
- some solutions proposed by author
- a) an interactive browsing system
- b) improved search engine
- c) a recommender system to post HITs to workers
Observations about requesters
A better UI to post tasks
- less technical overhead ==> better online marketplace
- requirements that every requester must satisfy
- a) quality assurance for submitted HITs for a task
- b) proper allocation of qualifications
- c) break tasks into a feasible workflow
- d) classify workers
- The author points out TurKit, an external toolkit for running iterative tasks, which has been very user-friendly, especially for requesters
- MTurk requires requesters to build their applications from scratch to fit their needs
A better and true reputation system for workers
- the current reputation system uses the number of HITs completed and the approval rate, both of which are easy to manipulate
- why a good reputation system? Because if a requester can't differentiate a good worker from a bad one, he tends to assume that every worker is bad
- suggestions from author for a new reputation system
- a) More public qualification tests
- b) Track working history of workers
- c) Rating of workers
- d) Disconnect payment from rating
- e) Classify HITs and rating
- f) API for all the above features
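One possible reading of suggestions (b) and (e) is to track a worker's history per task category rather than as a single overall approval rate. The sketch below is a minimal illustration under that assumption; the function name, the category labels and the `(category, approved)` record format are hypothetical and are not part of the blog post or of the MTurk API.

```python
from collections import defaultdict


def approval_rates_by_category(history):
    """Compute one approval rate per task category for a single worker.

    `history` is an iterable of (category, approved) pairs, where
    `approved` is True if the requester accepted that piece of work.
    """
    totals = defaultdict(int)
    approved_counts = defaultdict(int)
    for category, was_approved in history:
        totals[category] += 1
        if was_approved:
            approved_counts[category] += 1
    # Every category in `totals` has at least one record, so no division by zero.
    return {cat: approved_counts[cat] / totals[cat] for cat in totals}
```

With a breakdown like this, a worker who is excellent at transcription would not automatically appear equally skilled at translation.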
A critical fact stressed by the author:
A labor marketplace is not the same thing as a computing service. Even if everything is an API, the design of the market still matters.
Do Needfinding by Browsing MTurk-related forums, blogs, Reddit, etc
List out the observations you made while doing your fieldwork. Links to examples (posts / threads) would be extremely helpful.
Synthesize the Needs You Found
List out your most salient and interesting needs for workers, and for requesters. Please back up each one with evidence: at least one observation, and ideally an interpretation as well.
A set of bullet points summarizing the needs of workers.
- Example: Workers need to be respected by their employers. Evidence: Sanjay said in the worker panel that he wrote an angry email to a requester who mass-rejected his work. Interpretation: this wasn't actually about the money; it was about the disregard for Sanjay's work ethic.
A set of bullet points summarizing the needs of requesters.
- Example: requesters need to trust the results they get from workers. Evidence: In this thread on Reddit (linked), a requester is struggling to know which results to use and which ones to reject or re-post for more data. Interpretation: it's actually quite difficult for requesters to know whether 1) a worker tried hard but the question was unclear or very difficult or an edge case, or 2) a worker wasn't really putting in a best effort.