AltaMira Milestone 2 Submission:
- 1 Attend a Panel to Hear from Workers and Requesters
- 2 Reading Others' Insights
- 3 Do Needfinding by Browsing MTurk-related Forums, Blogs, Reddit, etc.
- 4 Synthesize the Needs You Found
Attend a Panel to Hear from Workers and Requesters
We attended Panel 2 at 6pm PST. Here are some of our observations:
- A lot of the work (apparently up to 50%) comes from academic sources.
- The workers we spoke with were involved in academic studies and chose tasks in the higher pay bracket (not the low-paying $0.01 HITs).
- Workers' schedules vary: many HITs are posted between 7am and 4pm, and while some workers prefer to work during their off time, others work regular hours.
- Workers do not treat MTurk or freelancing as their only source of income. In a 1:1 conversation, I was told you need strong financial skills to make freelancing a full-time job. If you have a mortgage, this is probably not the best route.
- Workers' earnings are highly variable: sometimes they make a lot of money, sometimes they don't, so they have to balance it out over weeks.
- Skills are not well measured on MTurk, and workers do not like the qualification system.
- Rejection percentages (whether or not they are employed by requesters) are a cause of concern and grief for workers.
- Requesters generally don't reject work, because of the fear of backlash on forums and the outsized impact each rejection has on a worker. They are more likely to accept even bad work to avoid the fallout of rejecting it.
- In the words of two long-term requesters, MTurk's rejection system in particular is broken: it is not a system that requesters actually use, and it does not quantify what they expect from workers.
- Approval rate is one of the most commonly used qualifications for a HIT.
- Requesters like the idea of microtasks and of using them to create a gold standard.
- Requesters don't prefer the Masters qualification because it costs more, is hard to get, and doesn't always offer better results.
- Finding the right person for a task is not something MTurk's system supports well; it has no good way to deliver the right workers.
- Regardless of the platform, oDesk and MTurk both have very high user ratings; oDesk, which weighs rejection rates less severely, still has an average user rating of 4.5/5.
Reading Others' Insights
Worker perspective: Being a Turker, TurkOpticon, Plea to Amazon
1) What observations about workers can you draw from the readings? Include any that may be strongly implied but not explicit.
- The majority of Turkers have low incomes (under $10k).
- A small number of the most active Turkers do most of the tasks (estimates of 3,011–8,582 active workers out of the 15,059 and 42,912 observed, even though the reported number is 500k).
- Pay is a significant motivating factor for turkers
- (Implied) Pay rates are low on MTurk, so there may be reasons beyond money that people work on MTurk.
- Turkers use TurkerNation to find out about the best and worst requesters; this is also the most popular part of the site.
- Turkers primarily do HITs because they like them
- There is a significant difference between the motivations of US and Indian Turkers: US Turkers don't make that much (about minimum wage at best), while in India the same pay would be a good income.
- A lot of Turkers live hand to mouth, and MTurk is used as a source of "additional income" rather than primary income.
- Ethics matter for Turkers: they want to be valued and to work on tasks that match their skills.
- Turkers use forums heavily to find the best requesters to work for (and to call out bad requesters); these forums tend to be a problem for requesters.
- Amazon's payment methods are a cause for concern on MTurk: the payment cycle allows requesters 30 days to evaluate and pay for work.
- Generally MTurk is heavily biased towards requesters.
- The worker reputation system is broken (as mentioned above): it is too easy to game, and the penalties for low ratings are too severe.
- The UI needs a reboot: more ways to sort and categorize HITs, and a way of predicting completion times.
Requester perspective: Crowdsourcing User Studies with Mechanical Turk, The Need for Standardization in Crowdsourcing, Plea to Amazon
2) What observations about requesters can you draw from the readings? Include any that may be strongly implied but not explicit.
- Amazon deliberately favors requesters over workers on MTurk.
- Requesters don't have a reputation system they can manage on Amazon. Reputation is derived from forums like Turker Nation using tools like TurkOpticon.
- Requesters write positive reviews for themselves on forum sites.
- Requesters incur low costs using MTurk, compared to small user studies, which can cost much more.
- Not knowing the user base is both an advantage and a disadvantage on MTurk.
- There is widespread gaming of the system with uninformative responses; including verifiable parts in tasks makes the system less susceptible to gaming.
- Response rates are very fast on MTurk for tasks that don't include verification.
- UI: It is too difficult to post tasks; they sometimes require hiring developers to get them to show up correctly, and everything has to be built from scratch.
- The reputation system for workers is broken; everything that exists right now (completion %, qualifications, approval rate) is easy to game.
Do Needfinding by Browsing MTurk-related Forums, Blogs, Reddit, etc.
List out the observations you made while doing your fieldwork. Links to examples (posts / threads) would be extremely helpful.
Synthesize the Needs You Found
List out your most salient and interesting needs for workers, and for requesters. Please back up each one with evidence: at least one observation, and ideally an interpretation as well.
A set of bullet points summarizing the needs of workers.
- Example: Workers need to be respected by their employers. Evidence: Sanjay said in the worker panel that he wrote an angry email to a requester who mass-rejected his work. Interpretation: this wasn't actually about the money; it was about the disregard for Sanjay's work ethic.
A set of bullet points summarizing the needs of requesters.
- Example: requesters need to trust the results they get from workers. Evidence: In this thread on Reddit (linked), a requester is struggling to know which results to use and which ones to reject or re-post for more data. Interpretation: it's actually quite difficult for requesters to know whether 1) a worker tried hard but the question was unclear or very difficult or an edge case, or 2) a worker wasn't really putting in a best effort.