WinterMilestone 2 biubiubiu


Attend a Panel to Hear from Workers and Requesters

by @xi.chen

Requesters' Side

  • Convey Thoughts
    • Observation: The professor specifically mentions that for academic tasks he needs to pay special attention to clearly introducing the work to be done. Furthermore, requesters need feedback to see whether they have expressed themselves clearly.
    • Interpretation: Requesters and workers lack effective ways to communicate and confirm that they understand each other.
    • Need: Requesters need to ensure that they are understood perfectly by workers.
  • Trust the Quality of Work
    • Observation: Some requesters introduce qualification tests before judging quality of work. The professor outsources requests and communication with workers to leave a good impression and maintain a good reputation in order to attract good workers.
    • Interpretation: Quality of work is not ensured and has to be manually checked. Replying to workers takes lots of time and energy, which is difficult to handle for small requesters.
    • Need: Quality of work needs to be ensured, and requesters want to be trusted by workers.
  • Time Estimate
    • Observation: Time estimates are typically off, which results in inappropriate hourly wages. Requesters need to find data on their own and share their data manually in unofficial forums.
    • Interpretation: Individuals vary a lot, and important data are not provided officially or comprehensively, so the data requesters find is inaccurate or unrepresentative. As a result, the estimated wage ends up inappropriate.
    • Need: Requesters need to estimate task time and pay accurately and reasonably.
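
To make the Time Estimate point concrete, here is a minimal sketch of how an off time estimate changes the hourly wage a worker actually earns. The function name and the sample numbers are hypothetical illustrations, not figures from the panel.

```python
def effective_hourly_wage(reward_usd: float, minutes_per_task: float) -> float:
    """Hourly wage implied by a per-task reward and the time one task takes."""
    if minutes_per_task <= 0:
        raise ValueError("minutes_per_task must be positive")
    return reward_usd * 60.0 / minutes_per_task


# Hypothetical example: the requester assumes a task takes 2 minutes,
# but workers actually need 5 minutes for the same $0.20 reward.
intended = effective_hourly_wage(0.20, 2)
actual = effective_hourly_wage(0.20, 5)
print(f"intended: ${intended:.2f}/hour, actual: ${actual:.2f}/hour")
```

With the same reward, the wage scales inversely with the real task time, so underestimating the time by a factor of 2.5 cuts the effective wage by the same factor.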

Workers' Side

  • Schedule Problem
    • Observation: Working time is unpredictable. Workers either have to stay alert all day long, or they use scripts and turn to forums for help.
    • Interpretation: Task-posting is decided only by requesters and thus seems unpredictable to workers.
    • Need: Workers should get notifications when new tasks are available, or tasks should be released periodically so that workers can plan their schedules accordingly.
  • Trust Problem
    • Observation: Workers value their approval rate. Laura probes a requester and gets a first impression before completing many HITs for a new requester.
    • Interpretation: Workers lack an effective way of knowing whether a requester is worth working for.
    • Need: Workers need to trust a requester before working for him/her.
  • Search Cost
    • Observation: Workers' earnings can be quite different. From Laura's words, I guess that the more time one can devote, the more one is likely to earn.
    • Interpretation: Since most HITs take only several minutes, the only reason explaining such variation is search cost. Workers have to spend much time looking for well-paid tasks and tasks that interest them.
    • Need: Workers need to spend less time searching and more time actually working.
  • Negotiation if Rejected
    • Observation: Workers have to back up their work themselves in case evidence of honest work is needed. Besides that, it seems that providing a way to contact the requester is not mandatory.
    • Interpretation: Workers cannot negotiate with requesters through official channels if their work gets rejected. Workers no longer have access to their work once it is submitted.
    • Need: Workers want their honest work respected and recorded even if the work doesn't meet requesters' standards.

Reading Others' Insights

Worker perspective: Being a Turker

by @alchemist

1) What observations about workers can you draw from the readings? Include any that may be strongly implied but not explicit.

  • The majority (56%) are U.S. based, but there is a growing number of Indian Turkers (36%) and other nationalities.
  • A lot of them have low income (one-third had a median annual income of <$10,000).
  • Money is their primary motivator (at least for the posters on Turker Nation).
  • They have problems in: employers who don’t pay; identifying scams; the cost (to workers) of poorly designed tasks.
  • They cannot rate Requesters.
  • On Turker Nation, by far the largest area is devoted to the ‘Requesters hall of fame/shame ratings’ where Turkers can discuss their experiences with Requesters -- for Turkers on Turker Nation, the primary concern is to find good Requesters and avoid bad ones.
  • Turker Nation is primarily used by US workers.
  • Some Turkers accept somewhat (but not much) lower pay if a task was more enjoyable.
  • The idea that Turkers’ actions en masse send messages to Requesters and that Turkers are responsible for promoting fair pay is a dominant theme of Turker Nation discussions.
  • Turkers’ annual earnings ranged from ~$50 all the way to ~$15k. The highest earnings were made by experienced Turkers, and they state they only take well-paying, more professional AMT work.
  • Turkers are interested in comparison with other Turkers to gain information and knowledge.
  • They set themselves targets, e.g. to make $10 per day, or to double (or better) the previous year’s amount.
  • The importance of their AMT income varies depending on earning ability and other life circumstances (for some, AMT is their primary source of income, for others it is supplementary).
  • Payment for AMT work is so low that even the best workers we know of make an annual income equivalent to working full-time at the US minimum wage (although we do not know how many hours these Turkers are working).
  • Even those doing AMT work just for extra money (e.g. a particular purchase) do so because they do not have enough money from other sources.
  • AMT has some benefits over traditional labour markets: Regular or set hours are not required, money does not have to be spent on transport costs, and judgments are restricted to the work you submit rather than your personal appearance and the way you present yourself.
  • Turkers are understandably offended when Requesters reject HIT submissions for reasons they do not understand: This not only deprives them of money they believe they have rightly earned, but it has a damaging effect on their approval rating.
  • Turkers do not have a reciprocal system action to blocking (just avoidance and publicising) and it is complicated for them to prove their innocence.
  • Some (genuine) Turkers complain about being unfairly labeled as bots, spammers, etc.
  • noble intent – will actually just lead to closing AMT, and they would lose this source of income.

2) What observations about requesters can you draw from the readings? Include any that may be strongly implied but not explicit.

  • They look at how to motivate better, cheaper and faster worker performance to get good data from workers, quickly and without paying much.
  • They can rate Turkers.
  • Requesters have better information on the Turkers than vice versa, as well as greater powers of redress.
  • Requesters engage directly with Turkers in ‘Requesters hall of fame/shame ratings’ on Turker Nation.
  • Some Requesters mistakenly believe that Turkers do HITs for fun, and thus that they do not need to pay good wages.
  • A bad Requester may reject submissions or block a Turker (bar a Turker from working for them) for no good reason, and AMT does not have a good mechanism to punish these behaviors.
  • Direct, open, polite, and respectful communication is highly valued by Turkers.

Worker perspective: Turkopticon

by @frostao

1) What observations about workers can you draw from the readings? Include any that may be strongly implied but not explicit.

  • Amazon Mechanical Turk's participation agreement grants employers full intellectual property rights over submissions regardless of rejection. As a result, workers have no legal recourse against employers who reject work but still use it.
  • Workers rarely hear back from requesters when they email them to complain about a rejection, because the cost to requesters of going through workers' emails exceeds the amount they pay the worker.
  • Workers have limited options for dissent within Amazon Mechanical Turk itself. Mostly they have to leave the platform if they get too many rejections.
  • In many cases, workers are paid below minimum wages.
  • Many workers feel that their work is regularly rejected unfairly.
  • Workers want faster payment.

2) What observations about requesters can you draw from the readings? Include any that may be strongly implied but not explicit.

  • Requesters can hardly respond to workers' emails about rejections, because the cost of going through those emails exceeds the amount they pay the worker.
  • Requesters take action only if they receive a lot of complaints from workers about unfair treatment.

Requester perspective: Crowdsourcing User Studies with Mechanical Turk

by @juechi

1) What observations about workers can you draw from the readings? Include any that may be strongly implied but not explicit.

  • Some workers “game” the system and provide nonsense answers to decrease their time spent and thus increase their rate of pay.
  • Some provide false personal information, including demographic information and expertise.
  • Only a small group of workers tried to take advantage of the system multiple times.
  • In the first experiment, users completed the tasks extremely fast.

2) What observations about requesters can you draw from the readings? Include any that may be strongly implied but not explicit.

  • Acquire user input that is both low-cost and timely enough to impact development
  • Collect user input from a large and diverse set of participants
  • In the first experiment, requesters required users to fill out a free-form text box to provide a check on whether users had in fact attended to the article or had just provided random ratings.
  • In the second experiment, requesters required users to complete four questions that had verifiable, quantitative answers before doing the task.
  • Detect suspect answers
  • Include or exclude users from future tasks based on their responses to past tasks
  • Use automated pre-tests
  • Use Mechanical Turk as a recruitment device and host the user study oneself, using a simple API to send and receive participant information from Amazon

Requester perspective: The Need for Standardization in Crowdsourcing

by @qinwei

1) What observations about workers can you draw from the readings? Include any that may be strongly implied but not explicit.

  • Workers are not trained or screened and do not have incentives for good performance (such as not wanting to lose a job).
  • Come and go easily.
  • Receive offers on tasks that differ in difficulty and skill requirements, for different rates of pay and with different pricing structures.
  • Reputation is weak and easily subverted.
  • Difficulty of searching for tasks.
  • Troubled by scammers and spammers.
  • Need to learn the intricacies of the interface for each separate employer.
  • Need to adapt to the different quality requirements of each employer.
  • Giving feedback is costly and invites retaliation or scares off future trading partners.

2) What observations about requesters can you draw from the readings? Include any that may be strongly implied but not explicit.

  • Could not ensure that hired workers — after suitable training — could complete tasks easily, predictably and in a way that training was easy to replicate for new workers.
  • Most of their tasks are relatively low-skilled and require workers to closely and consistently adhere to instructions for a particular, standardized task.
  • Make offers that differ in difficulty and skill requirements, for different rates of pay and with different pricing structures.
  • Reputation is weak and easily subverted.
  • Some buyers are simply recruiting accomplices for nefarious activities.
  • Difficult to price work, predict completion times and gain quality.
  • Troubled by spammers and fraud.
  • Have to implement from scratch the “best practices” for each type of work. Long-term employers can learn from mistakes while newcomers have to learn the lessons the hard way.
  • Need to price each unit of work without knowing the conditions of the market, and this price cannot fluctuate without removing and reposting the tasks.
  • Giving feedback is costly and invites retaliation or scares off future trading partners.
  • Have no incentive to post evaluation of their workers, as this is a signal earned after a significant cost.

Both perspectives: A Plea to Amazon: Fix Mechanical Turk

by @xi.chen

1) What observations about workers can you draw from the readings? Include any that may be strongly implied but not explicit.

  • Experienced and good workers treat new requesters cautiously. They simply don't devote much time and effort to a new requester until information about that requester is known, such as payment speed, rejection behavior, and the likelihood of returning to the marketplace to post tasks in the future.
    • Afraid of being rejected for no reason or waiting too long for payment, good workers complete only a few of an unknown requester's tasks.
    • If the requester has only one-off tasks and they are relatively difficult, good workers will not spend time learning how to do them well.
  • Workers tend to seek help from some well-known forums.
  • If workers get rejected,
    • they want to keep their work to themselves;
    • they need a way to appeal and can win if their work is properly done.
  • Certain worker-facing applications can improve efficiency.
  • Workers want to find the tasks they want as easily as possible:
    • they want to find certain requesters' HITs by searching names of requesters;
    • they want to complete HITs that they're interested in and good at;
    • currently on MTurk, workers use priority queues to choose HITs;
    • they want HITs separated by type (also mentioned in the requesters part).

2) What observations about requesters can you draw from the readings? Include any that may be strongly implied but not explicit.

  • New requesters' tasks will be done mostly by careless workers, also known as "spammers", which discourages requesters' interest in the marketplace.
  • It is impossible to predict the completion time of the posted tasks due to user-unfriendly interfaces on MTurk.
  • Requesters want to post tasks easily:
    • small requesters want to cut overhead and other costs when using crowdsourcing marketplaces. But now they, and every other requester, need to spend extra effort and time building accessible systems on their own.
  • Requesters need more details besides simply "Approval Rate" and "Number of Completed HITs":
    • they want assurance that they will get results of good quality.
    • they need to know if a worker is qualified for their tasks.
    • working history of a worker can reveal something important.
    • rate and pay a worker according to submitted work.
  • Conscientious requesters say that honest workers' reputation should not be affected if their work gets rejected because its quality cannot meet requesters' standards.
  • Task and Rating Categorization.

Soylent: A Word Processor with a Crowd Inside

by @alchemist

1) What observations about workers can you draw from the readings? Include any that may be strongly implied but not explicit.

  • Ross et al. found that Mechanical Turk had two major populations: well-educated, moderate-income Americans, and young, well-educated but less wealthy workers from India.
  • In their experiments, many of the raw results that Turkers produce are unsatisfactory -- about 30% of the results from open-ended tasks are poor, which is unacceptable to the end user.
  • Turkers exhibit high variance in the amount of effort they invest in a task. A Lazy Turker does as little work as necessary to get paid. Eager Beavers go beyond the task requirements in order to be helpful, but create further work for the user in the process.
  • Both the Lazy Turker and the Eager Beaver are looking for a way to clearly signal to the requester that they have completed the work. Without clear guidelines, the Lazy Turker will choose the path that produces any signal and the Eager Beaver will produce too many signals.
  • Turkers working on complex tasks can accidentally introduce substantial new errors, e.g. replacing existing grammar errors with new errors of their own.

2) What observations about requesters can you draw from the readings? Include any that may be strongly implied but not explicit.

  • To prevent errant crowd workers from contributing too much, too little, or introducing errors, these Requesters employed a Find-Fix-Verify pattern which, rather than asking a single crowd worker to do all the work, recruits one set of workers to find candidate areas for improvement, then collects a set of candidate improvements, and finally filters out incorrect candidates.
  • They developed Soylent, which uses crowdsourcing to tackle problems that are currently infeasible for AI algorithms (e.g. proofreading).
  • The Find-Fix-Verify pattern separates open-ended tasks into three stages where workers can make clear contributions:
    • The first stage, Find, asks Turkers to identify patches of the user’s work that need more attention. For example, when proofreading, the Find stage asks for at least one phrase or sentence that needs editing, and aggregates independent opinions to find the most consistently cited problems, which are then fed in parallel into the Fix stage. (Soylent keeps patches where at least 20% of the workers agree.)
    • The Fix stage recruits workers to revise a patch. Each task now consists of a constrained edit to an area of interest, e.g. the worker can see the entire paragraph but only edit the text directly containing the patch. 3–5 workers propose revisions for each patch to produce sufficient viable alternatives.
    • The Verify stage performs quality control on revisions. The unique alternatives generated in the Fix stage are randomized in order and 3–5 new workers are asked to vote on them (either to vote on the best option or to flag poor suggestions). All Fix workers are banned from participating in the Verify stage to ensure that Turkers cannot vote for their own work.
  • Results of proofreading: Soylent’s proofreading algorithm caught 33 of the 49 errors (67%). For comparison, Microsoft Word’s grammar checker found 15 errors (30%). Combined, Word and Soylent flagged 82% of all errors.
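
The three stages described above can be sketched in a few lines. This is an illustrative outline only, not Soylent's actual implementation: real Turkers are replaced by scripted stand-in functions, all names (`find_stage`, `scripted_fixer`, and so on) and the sample sentences are hypothetical, and the Verify stage uses only the "vote on the best option" variant. The 20% agreement threshold, the de-duplication of alternatives, and the shuffled candidate order follow the description above. Keeping Fix workers out of the Verify pool is the caller's responsibility here.

```python
import random
from collections import Counter


def find_stage(patches, finders, agreement=0.20):
    """Keep patches flagged by at least `agreement` of the Find workers."""
    votes = Counter()
    for finder in finders:
        votes.update(set(finder(patches)))  # one vote per worker per patch
    return [i for i in range(len(patches)) if votes[i] / len(finders) >= agreement]


def fix_stage(patch, fixers):
    """Collect the unique revisions proposed by the Fix workers."""
    return list(dict.fromkeys(fixer(patch) for fixer in fixers))


def verify_stage(patch, candidates, verifiers, rng):
    """Shuffle the candidates, let Verify workers vote, return the winner."""
    shuffled = list(candidates)
    rng.shuffle(shuffled)
    votes = Counter(verifier(patch, shuffled) for verifier in verifiers)
    return votes.most_common(1)[0][0]


def find_fix_verify(patches, finders, fixers, verifiers, seed=0):
    """Run Find, then Fix and Verify on every patch that survives Find."""
    rng = random.Random(seed)
    result = list(patches)
    for i in find_stage(patches, finders):
        candidates = fix_stage(patches[i], fixers)
        result[i] = verify_stage(patches[i], candidates, verifiers, rng)
    return result


# Scripted stand-ins for crowd workers (hypothetical, for illustration only).
def scripted_finder(indices):
    return lambda patches: list(indices)


def scripted_fixer(rewrites):
    return lambda patch: rewrites.get(patch, patch)


def scripted_verifier(preferred):
    return lambda patch, cands: next((c for c in cands if c in preferred), cands[0])


patches = ["Their going home.", "The sky is blue.", "He dont know."]
# Six Find workers; patch 1 is flagged by only one of them (below 20%).
finders = [scripted_finder(ix) for ix in ([0, 2], [0], [0, 2], [1], [], [])]
fixers = [
    scripted_fixer({"Their going home.": "They're going home.", "He dont know.": "He doesn't know."}),
    scripted_fixer({"Their going home.": "They are going home.", "He dont know.": "He doesn't know."}),
    scripted_fixer({"Their going home.": "There going home.", "He dont know.": "He don't know."}),
]
# Verify workers are a separate pool from the Fix workers.
verifiers = [
    scripted_verifier({"They're going home.", "He doesn't know."}),
    scripted_verifier({"They're going home.", "He doesn't know."}),
    scripted_verifier({"They are going home.", "He doesn't know."}),
]
revised = find_fix_verify(patches, finders, fixers, verifiers)
print(revised)
```

In this toy run the second sentence is left untouched because only one of six Find workers flagged it, while the other two sentences are replaced by the revision that gets the most Verify votes.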

Synthesize the Needs You Found

Worker Needs

By @frostao, @xi.chen

  • Workers need to trust the requesters.
    • Evidence: Laura probes a requester and gets a first impression before completing many HITs for a new requester.
    • Interpretation: Workers do not know if a requester is worth working for.
  • Workers need to have a voice
    • Evidence: The Turkopticon research paper mentions that it is hard for workers to get in touch with requesters. This is one of the reasons Turkopticon was built.
    • Interpretation: Workers want to tell requesters what they want and be able to argue about potentially unfair rejections.
  • Workers need to find good tasks that they like easily
    • Evidence: The research paper The Need for Standardization in Crowdsourcing mentions that workers usually find it difficult to search for good tasks they like.
    • Interpretation: Workers need a better way of finding tasks that they like and that match their skills.
  • Workers need to get paid faster
    • Evidence: The Turkopticon research paper mentions that many workers depend on payment from their work for basic needs such as groceries and utility bills. Those workers want to receive their payment faster.
    • Interpretation: Workers want to be paid for completed work sooner so that they can use what they earned.
  • Workers should be able to schedule their days.
    • Evidence: Laura has to take care of her children and herself while finishing tasks on MTurk. But things are unpredictable, and sometimes she does not even have time to have lunch or to calm her children.
    • Interpretation: Task-posting is decided only by requesters and thus seems unpredictable to workers. Workers should be able to manage their days and take good care of themselves.

Requester Needs

By @juechi

  • Requesters need to post tasks easily (including pricing tasks, predicting completion time, and a friendly UI)
    • Evidence
      • The unfriendly UI of AMT forced requesters to use external links to conduct user studies.
      • Requesters are always troubled by pricing tasks and predicting completion time.
    • Interpretation: Requesters are experts in their own fields, but they have no guidance on how to price their tasks. Due to the unfriendly UI, it is also difficult for requesters to predict a completion time that applies across a diverse set of workers.
  • Requesters need to trust workers
    • Evidence
      • Requesters required users to answer verifiable questions before doing tasks.
      • Requesters employed the Find-Fix-Verify pattern.
      • Requesters are troubled by “spammers”.
      • Requesters need more details besides simply "Approval Rate" and "Number of Completed HITs".
    • Interpretation: Every requester wants task results of good quality. They therefore have to employ several methods to protect their tasks from bad workers.
  • Requesters need to communicate with workers efficiently
    • Evidence
      • Requesters can hardly respond to workers' emails.
      • Giving feedback is costly and invites retaliation or scares off future trading partners.
    • Interpretation: Requesters and workers cannot communicate within AMT. They have to email each other, which costs both sides a lot of time.
  • Requesters need to trust the platform
    • Evidence
      • New requesters' tasks will be done mostly by careless workers, also known as "spammers", which discourages requesters' interest in the marketplace.
      • Requesters use automated pre-tests to ensure that the physical environment of the workers is suitable for completing the tasks.
      • Some Requesters mistakenly believe that Turkers do HITs for fun, and thus that they do not need to pay good wages.
    • Interpretation: AMT doesn't have a protection mechanism for new requesters. Also, AMT has done nothing to help build a healthy marketplace.

Milestone Contributors

Slack usernames of all who helped create this wiki page submission: @frostao, @alchemist, @xi.chen, @qinwei, @juechi