Winter Milestone 2 AtinMittra
- 1 Attend a Panel to Hear from Workers and Requesters
- 2 Reading Others' Insights
- 2.1 Worker perspective: Being a Turker
- 2.2 Worker perspective: Turkopticon
- 2.3 Requester perspective: Crowdsourcing User Studies with Mechanical Turk
- 2.4 Requester perspective: The Need for Standardization in Crowdsourcing
- 2.5 Both perspectives: A Plea to Amazon: Fix Mechanical Turk
- 2.6 Soylent: A Word Processor with a Crowd Inside
- 3 Synthesize the Needs You Found
Attend a Panel to Hear from Workers and Requesters
Chris (Assistant Professor at UPenn. Conducts research on Natural Language Processing. Posts many requests on MTurk for academic purposes)
- Is able to test a hypothesis on MTurk within one day
- Acknowledges the importance of well-designed HIT requests
- Will release "small batch" of request instructions to gain feedback on their wording
- Can be difficult to work with non-native English speakers as their e-mails and feedback may be difficult to understand
- Talks about the inhumane symbolism of identifying workers by serial numbers instead of names
- Applies a threshold: if a worker gets multiple-choice answers right only 25% of the time, he knows they are guessing randomly. "It gets tricky when they're doing better than chance but worse than doing tasks conscientiously."
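Chris's chance-vs-conscientious threshold can be sketched as a one-sided binomial test. This is a hypothetical illustration, not his actual pipeline: the 25% baseline assumes four-option multiple-choice questions, and the 0.85 "conscientious" cutoff and 0.05 significance level are assumed parameters.

```python
from math import comb

def p_value_above_chance(correct, total, chance=0.25):
    """One-sided binomial test: probability of getting at least `correct`
    answers right if the worker were guessing uniformly at random."""
    return sum(comb(total, k) * chance**k * (1 - chance)**(total - k)
               for k in range(correct, total + 1))

def classify(correct, total, chance=0.25, alpha=0.05, good=0.85):
    """Sort a worker into Chris's three informal buckets (assumed cutoffs)."""
    if correct / total >= good:
        return "conscientious"
    if p_value_above_chance(correct, total, chance) > alpha:
        return "likely guessing"
    # better than chance, but worse than doing tasks conscientiously
    return "gray zone"
```

For example, a worker with 26 correct out of 100 is statistically indistinguishable from random guessing, while one with 60/100 lands in the tricky gray zone Chris describes.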
Peter (Requester primarily posting book-labeling HITs)
- Puts a lot of effort into designing HIT instructions because he can't pay a lot of money
- Instead of rejecting the work of rogue workers, he instituted qualifications. He mentioned that rejecting work will affect the long-term engagement of even productive workers.
Xiao (Assistant Professor at University of Arkansas School of Business. Primarily uses MTurk for survey research)
- Something to grapple with is attention span of workers
- Goes to TurkerNation to find good workers
Laura (Started using MTurk as income when her son was born. A disability prevents her from pursuing alternative career opportunities)
- Takes care of kids at home while working on MTurk
- Uses a combination of scripts, TurkerNation, and forum threads to find suitable tasks
- The most important thing is her approval rating; she never risks going below 99%. Will try a new requester.
- Is constantly aware of the threat of rejection: "for a new worker, [rejection] is death."
- Takes screenshots of work-completion confirmation emails. If an issue arises with a Requester, she uses the screenshots as evidence to prove completion. She acknowledges that human error may be the cause of a quarrel, so she remains polite.
Rochelle (Worker since 2008. Advocate for new workers)
- Doesn't schedule breaks during work
- "You're kind of always on the edge of your seat... Because there are workers around the world, this is a 24 hour cycle...7 days a week."
- Needs a high approval rating. Is more hesitant with new requesters; will send an email to make sure there is a real, responsive human on the other side.
- Tries 1 or 2 tasks to see if a batch is worth doing, cross-referencing her performance with the estimated time on the HIT. Checks forums to further vet the Requester and the task batch.
@SpamGirl (Crowdworker and Worker advocate)
- Tries 2 minutes of a HIT to check it out; if she is trying a HIT, it is because other people have said it was lucrative for them. Acknowledges wide variation in performance and efficacy due to differences in worker skill level.
- Checks community boards frequently when deciding whether to take a potential HIT
Reading Others' Insights
Worker perspective: Being a Turker
- "Number of active Turkers is between 15,059 and 42,912; and that 80% of the tasks are carried out by the 20% most active (3,011–8,582) Turkers."
- "Turkers (56%) are U.S. based, but there is a growing number of Indian Turkers (36%) and other nationalities. Nearly one-third of respondents had a median annual income of <$10,000."
- Common issues for Turkers include "employers who don’t pay; identifying scams; the cost (to workers) of poorly designed tasks."
- Turkers use a 'Requesters hall of fame/shame' forum to report Requesters and warn the community
- There are forums for 'community interests' and 'prayers and good vibes'
- Generally, wages made from HITs are not sufficient for full-time employment
- Workers communicate good HIT design to Requesters
- There is a subcommunity of Workers who explain to others how to cheat tasks
- Generally, the community of Workers encourages each Worker to be cordial with Requesters in event of quarrel
2) What observations about requesters can you draw from the readings? Include any that are strongly implied but not explicit.
- Requesters will use a "Majority Rules" metric to assess the quality of HITs
- Can block Workers (whom Amazon may then suspend from MTurk) but can't be blocked by Workers
Worker perspective: Turkopticon
Requester perspective: Crowdsourcing User Studies with Mechanical Turk
- A small number of Workers choose to cheat Requesters and the system, corrupting the reputation of the workforce.
- Cheating Workers don't even try to complete a HIT when they see a qualifying test
- Requesters use qualifying questions or tests to filter cheating Workers
- May disqualify Workers from future tasks based on performance on past tasks
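The qualification-based filtering described in these bullets can be sketched as a simple gate on past performance. This is a hypothetical example; the 90% accuracy cutoff and the 5-task minimum are assumed parameters, not values from the reading.

```python
def eligible_workers(history, min_accuracy=0.9, min_tasks=5):
    """Grant a qualification only to workers with a strong track record.
    `history` maps worker_id -> (approved_tasks, total_tasks); workers
    with too little history are excluded rather than trusted by default."""
    return {wid for wid, (approved, total) in history.items()
            if total >= min_tasks and approved / total >= min_accuracy}
```

Here a worker with 9/10 approvals qualifies, one with 4/10 is filtered out, and one with only 3 completed tasks is excluded for lack of history.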
Requester perspective: The Need for Standardization in Crowdsourcing
Both perspectives: A Plea to Amazon: Fix Mechanical Turk
Soylent: A Word Processor with a Crowd Inside
Synthesize the Needs You Found
List out your most salient and interesting needs for workers, and for requesters. Please back up each one with evidence: at least one observation, and ideally an interpretation as well.
A set of bullet points summarizing the needs of workers.
- Example: Workers need to be respected by their employers. Evidence: Sanjay said in the worker panel that he wrote an angry email to a requester who mass-rejected his work. Interpretation: this wasn't actually about the money; it was about the disregard for Sanjay's work ethic.
A set of bullet points summarizing the needs of requesters.
- Example: requesters need to trust the results they get from workers. Evidence: In this thread on Reddit (linked), a requester is struggling to know which results to use and which ones to reject or re-post for more data. Interpretation: it's actually quite difficult for requesters to know whether 1) a worker tried hard but the question was unclear or very difficult or an edge case, or 2) a worker wasn't really putting in a best effort.