Milestone 2 Betzy
- 1 Attend a Panel to Hear from Workers and Requesters
- 2 Reading Others' Insights
- 3 Do Needfinding by Browsing MTurk-related Forums, Blogs, Reddit, etc.
- 4 Synthesized Needs
Attend a Panel to Hear from Workers and Requesters
In this section we present some of the observations we gathered during the panel.
- Research showed that 40% of HITs are scams
- The majority of HITs are academic tasks, which results in “seasonal work” tied directly to semester breaks
- Typical tasks on AMT include surveys, market research, image tagging, data labeling (categorization), rating personalized recommendations, etc.
- Workers try to work during off-peak hours (when the HIT-to-worker ratio is higher) to get HITs more easily, while also considering the time of day when most HITs are posted
- Usually money is the only incentive
- Workers think that testing their skill sets could benefit both workers and requesters.
- The approval rate is too general; it says nothing about specific skills.
- Income varies greatly with the time of year (for example, it is low in summer)
- Workers have difficulty finding HITs due to uninformative tags or descriptions
- The payment model affects quality: with hourly payment the worker tries to do the work well, whereas with a flat rate workers may compromise quality to finish faster
- Money, ethics, and interesting tasks (e.g., not just copy-paste work) are workers’ selection criteria for HITs
- It is very difficult for beginner workers to find work on AMT, and many of them leave after a short time
- Workers are willing to help each other; where no communication channel is provided, they build or find their own (forums, chat rooms, etc.)
- Workers and requesters would really appreciate documentation and tutorials on how to work with the system
- Requesters try to avoid workers whose demographic data is incorrect, especially for social-studies tasks.
- Requesters ask workers directly to clarify HIT specifications (by initially creating a test project)
- AMT does not offer support for creating and managing personalized tasks for each worker
- Requesters find it helpful to create micro-tasks that check the quality of work against a gold standard
- Requesters identify a strong correlation between the quality of work and the granularity of the HITs
- Requesters very rarely reject work, as rejection generates negativity on the worker side, and workers then take up the requester’s time asking for the rejection to be reversed. This also makes the approval rate not very useful.
- The Master qualification does not perform well at filtering out spammers.
- A 99% approval rate and custom qualifications (a pool of workers the requester trusts) are the most commonly used qualifications.
- Sometimes voluntary work is expected, for example for tasks that benefit the community.
- Requesters use email communication to ask workers to work on specific HITs.
- It is very difficult and time-consuming for requesters to distinguish workers who take tasks seriously from those who do not
- While paying too little for a HIT is unfair to workers, paying too much is also a problem: it seems to select for workers who tend to cheat rather than do the work properly
- Requesters have no easy way of assigning HITs to a particular worker or group of workers on AMT
- It is not easy for requesters to decide on the right amount of time to allow for completing a HIT
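The gold-standard micro-task idea mentioned above can be sketched as a small reliability check: seed a batch with HITs whose answers are already known and score each worker against them. This is an illustrative Python sketch under our own assumptions; the function and variable names are not AMT features.

```python
def gold_standard_accuracy(submissions, gold):
    """Estimate a worker's reliability from embedded gold-standard HITs.

    submissions: dict mapping HIT id -> the worker's answer
    gold: dict mapping HIT ids with known answers -> the correct answer
    Returns the fraction of answered gold HITs that were correct,
    or None if the worker answered no gold HITs.
    """
    checked = [hit for hit in gold if hit in submissions]
    if not checked:
        return None  # nothing to conclude about this worker
    correct = sum(submissions[hit] == gold[hit] for hit in checked)
    return correct / len(checked)

# Example: the worker answered three gold HITs, two of them correctly.
answers = {"h1": "cat", "h2": "dog", "h3": "bird", "h4": "fish"}
gold = {"h1": "cat", "h2": "dog", "h3": "cow"}
score = gold_standard_accuracy(answers, gold)  # 2/3
```

A requester could then accept work only from workers whose gold accuracy stays above a chosen threshold.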
Reading Others' Insights
Worker perspective: Being a Turker
- Workers’ main motivation is earning money, either to compensate for poor pay at a “regular” job or simply to improve cash flow
- They invest a lot of “invisible work” in researching methods of finding better paying jobs and optimizing their skills and time working on jobs
- They react quickly to bad feedback from Requesters and will voice their opinion on online forums such as TurkNation
- They prefer the independence this type of job market offers them and reject outside “help” from the government
- While they insist on being able to regulate the AMT market themselves, their preferred anonymity and independence hinder them from acting as a single force to achieve this goal
- Workers appreciate good feedback from Requesters and will share this with other turkers on platforms outside of AMT
- They rely heavily on platforms such as TurkNation to gather information about good and bad Requesters
- Turkers often find the compensation for the work they provide unfair and long for more involvement from AMT as a regulatory body
- Requesters have an unbalanced advantage over workers when it comes to rating and blocking them
- They use additional platforms like TurkNation to communicate with workers in order to improve their HIT design and in turn improve the quality of labor they receive
- Many requesters view turkers as cheap, simple-minded laborers and set unfair wages as a result
Worker perspective: Turkopticon
Turkopticon is a platform where workers review requesters on four qualities: communicativity, generosity, fairness, and promptness; these reviews are then shown on AMT via a browser plugin.
- Workers are not supported that much by AMT when it comes to work rejection, and dispute resolution techniques do not scale on massive crowds
- Many workers are interested in building long-term work relationships with certain requesters, akin to full-time employment
- The worker community is very diverse: some Turk for a living and others do it because they are bored; some complete tasks properly and others just try to cheat
- Turkopticon was built because there are bad requesters out there, and workers need to know which requesters are good and which are bad
- Requesters attempt to write positive reviews about themselves and flag negative reviews of themselves
- Many requesters do not respond to workers’ messages/emails
Requester perspective: Crowdsourcing User Studies with Mechanical Turk
- Many workers will take advantage of the system when a HIT is poorly designed.
- When HITs are designed poorly, the work product produced by turkers is often unreliable
- Well-designed HITs produce work comparable to expert work in the relevant field, making turkers a viable workforce
- Workers are fast in their responses
- Requesters need cheap, fast labor
- Requesters must design their HITs to include questions with quantitative and verifiable answers so that bad workers are easy to identify
- Requesters must design HITs that require as much time to scam as to perform correctly
“Crowdsourcing User Studies With Mechanical Turk” (Aniket Kittur et al.) implies that turkers represent a viable source of labor whose work product is comparable to work performed by “experts” in a given field. Turkers work fast and respond quickly to newly posted jobs. Designing HITs well helps identify workers who take advantage of the system by providing bogus answers to save time.
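One common way to operationalize “quantitative and verifiable answers” is redundancy: post the same HIT to several workers, take the majority answer, and flag dissenters for review. A minimal illustrative sketch (our own code, not an AMT feature):

```python
from collections import Counter

def majority_vote(answers):
    """Return the most common answer among redundant submissions
    for the same HIT (ties broken by first occurrence)."""
    return Counter(answers).most_common(1)[0][0]

def flag_dissenters(answers, consensus):
    """Indices of workers who disagreed with the consensus -- a cheap
    signal of careless or scamming workers, not proof by itself."""
    return [i for i, a in enumerate(answers) if a != consensus]

# Example: three workers label the same image-tagging HIT.
labels = ["cat", "cat", "dog"]
consensus = majority_vote(labels)              # "cat"
suspects = flag_dissenters(labels, consensus)  # [2]
```

In practice a requester would track how often each worker is flagged across many HITs before drawing any conclusion.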
Requester perspective: The Need for Standardization in Crowdsourcing
- Workers need to learn the intricacies of the interface of each requester.
- Most of the high demand crowdsourcing tasks require relatively low-skill workers who need to follow instructions to get the work done.
- Requesters create the ‘best practices‘ for each type of work from scratch.
- Requesters lack the option to reuse other requesters’ designs or knowledge, so the same mistakes are repeated over and over.
- Requesters price the tasks with little knowledge of the market condition.
Both perspectives: A Plea to Amazon: Fix Mechanical Turk
- Requesters have a lot of difficulty using AMT to publish tasks because of its unfriendly, limited-functionality UI.
- Need to build the quality assurance system from scratch.
- Need to ensure proper allocation of qualifications.
- Need to break tasks into HIT-able workflows.
- Need to stratify workers according to quality.
- Often ask different workers to perform the same HITs to ensure quality, which makes the process expensive in terms of time and money.
- It is very difficult to predict the completion time of the posted tasks.
- New requesters are not easily integrated into the marketplace, as their HITs are mostly done by spammers or inexperienced workers; this results in bad quality and causes new requesters to leave.
- Experienced workers do only a few HITs for new requesters until those requesters gain their trust.
- Workers have difficulty finding HITs or requesters. Requesters need to put a worker’s name as a tag in the HIT in order for that worker to find it.
- Workers use priority queues to pick the tasks to work on (most recent HITs, or HIT groups with the most HITs).
- Workers would like to have a screening test before doing the actual work, to see whether they are eligible for the work or not
- Seeing the rejection rate of a requester helps the workers when deciding on which HITs to work on
- The rejection system makes it very difficult for new workers to get started. They may get rejections just because they didn’t understand a task properly; it is then difficult for them to improve their rating, as they cannot get new HITs. https://www.reddit.com/r/mturk/comments/2yorvw/guys_i_think_i_may_have_just_really_screwed/
- The standard qualifications offered by AMT often influence the quality of work negatively (VusterJones). In addition, it seems unclear to the community how these qualifications are assigned (iamralph)
- The main ‘issue‘ raised by workers is payment: paying on time and paying fairly. Ex: (kodemage) http://rh.reddit.com/r/mturk/comments/2clggw/i_am_a_requester_on_mturk_what_suggestions_do/
- Workers don’t seem to agree about the purpose of AMT: some see it as an additional source of income, others as a potential main job, and the latter therefore expect higher payments.
- Many requesters have difficulty deciding on the right amount of money per HIT; many questions on "http://rh.reddit.com/r/mturk/search?q=flair%3ARequester%2BHelp&sort=top&restrict_sr=on&t=all" mention this
- Requesters who pay too little often get very bad results
Synthesized Needs
- Workers need to learn about the requesters they are going to work for.
- Workers need to find HITs they want to work on quickly.
- Workers need to work on more standard interfaces to be more efficient and accurate.
- Workers need to be able to easily perform sophisticated searches, something like a graph search over HITs, requesters, and other information. In many forums, and in the morning panel, many workers said that it is really difficult to find interesting work; the goal is to minimize the time spent browsing for HITs.
- Workers need to be shown as little repetitive work as possible, drawn from different HIT pools and optimized as much as possible. Requesters say that repetitive work really affects result quality, which tends to decrease after doing the same thing for some time. If workers find the work interesting and are paid fairly for it, they will likely deliver better results, which also serves the requesters’ needs.
- Workers benefit from a network of workers as well as requesters to communicate with one another.
Evidence: This can be seen in “Being A Turker” (Example 8: Evolution of a Requester). A couple of users gave negative feedback about a Requester, who then appears to have changed the design and payment of his HITs, which resulted in a better user experience for the workers. Interpretation: Workers’ opinions matter to Requesters, who do not receive an official rating but recognize the importance of outside platforms like TurkNation in shaping other workers’ opinions and behavior.
- Workers need some sort of feedback system to rate Requesters.
Evidence: In “Being a Turker” (Example 14), defectturk had a bad experience with a Requester and regarded the payment for his HITs as unreasonably low, while another user, majeski, responded that although he too regarded the pay as too low, he occasionally does work for the Requester in order to “get his numbers up.” Interpretation: AMT should enable turkers to rate their Requesters. This in turn would push Requesters to adjust their payment in order to get skilled labor for their requests. (Turkers do not wish the government to interfere and introduce a “minimum wage”; they wish to shape the market themselves.)
- Requesters need to publish tasks in an easy, quick way and have guidance from the UI. Evidence: “Crowdsourcing User Studies With Mechanical Turk” presents two experiments showing that the design of a HIT severely influences the quality of work received. Furthermore, “Being A Turker” shows in the aforementioned Example 8 that Requesters value the opinion of the workers and will make adjustments accordingly. Interpretation: Novice requesters require guidance in designing HITs before getting started, and a forum to communicate with other requesters to learn how to improve beyond that.
- Requesters need to learn about their workers’ reputations.
- Requesters need to manage the execution time of their HIT groups.
- Requesters need to learn from the experience of other requesters to avoid repeating the same mistakes.
- Requesters need to learn about market conditions to better estimate the price of their HITs.
- Requesters need to use standard interfaces to build HITs.
- Requesters need to be able to validate workers’ skills or data (for example, demographics and other worker data are especially important for social studies)
- Requesters need to be able to distribute work to specific workers or groups of workers in a configurable way, say 40% of the HITs to one worker and 60% to another. Currently, requesters on AMT apply qualifications to their HITs or write the worker’s name in the HIT title to achieve this. This ability is important because people who trust each other, have worked together, and have achieved good results want to continue that working relationship.
- People who are requesters and workers at the same time need to be able to “trade” HITs so that they don’t have to pay with money; say one of them is a graphic designer and the other a web developer, they need to be able to do work for each other.
- Requesters need to be assisted by the system when deciding how much to pay for a particular task. This was discussed during one of the panels, and in many forums it is described as a big problem, because paying the wrong amount usually leads to bad results. Categorizing tasks by type and other properties, and analyzing similar work from the past, would help a requester decide on a fair, reasonable amount for a particular task.
- The platform needs to remedy externalities (fight negative ones and motivate positive ones).
- The platform needs to enforce standards to raise the value of crowdsourcing and promote efficiency.
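The configurable distribution need above (e.g., 40% of HITs to one trusted worker and 60% to another) could be sketched as a small batch splitter. AMT offers no such API; the worker names and shares below are illustrative assumptions.

```python
import random

def assign_hits(hit_ids, shares, seed=0):
    """Split a batch of HITs among trusted workers by configured share.

    shares: dict mapping worker -> fraction of the batch (should sum to 1).
    Returns a dict mapping worker -> list of HIT ids; rounding
    remainders go to the last worker listed.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    ids = list(hit_ids)
    rng.shuffle(ids)           # avoid always giving the same HITs to the same worker
    assignment, start = {}, 0
    workers = list(shares)
    for i, worker in enumerate(workers):
        if i == len(workers) - 1:
            assignment[worker] = ids[start:]  # remainder to the last worker
        else:
            n = round(shares[worker] * len(ids))
            assignment[worker] = ids[start:start + n]
            start += n
    return assignment

batch = [f"HIT-{i}" for i in range(10)]
split = assign_hits(batch, {"worker_A": 0.4, "worker_B": 0.6})
# worker_A receives 4 HITs, worker_B the remaining 6
```

A real implementation would also need to track completion and reassign unfinished HITs, but the split itself is this simple.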