WinterMilestone 2 westcoastsfcr

From crowdresearch

Attend a Panel to Hear from Workers and Requesters


Report on some of the observations you gathered during the panel.

  • Both workers and requesters use scripts to help with work through MTurk
 - quality control scripts (R)
 - searching for HITs (W)
 - calculation of worker rating (W) 
  • Workers use online forums such as TurkerNation (a worker-maintained community where workers cooperate with each other)
 - Teach each other tricks and hacks they can use 
 - Daily Thread: where they post HITs they think are worth working on (shares work)
 - Workers will also use chats between workers where they will post HITs
  • When workers can work is unpredictable and around the clock; they have notifications (e.g. via Snapchat) set up to tell them when a profitable HIT is available
 - Workers have to put their lives on hold (e.g. delaying lunch or putting their kids to bed) if they are in the middle of working or a HIT worth doing appears at a certain time 
 - There may not be many HITs worth doing available at any one time 
  • Accepting new tasks v. repeated tasks
 - Important: how will the task affect the worker's rating? 
      - One worker uses a script to calculate their worst-case approval rating (i.e. assuming all pending HITs get rejected); if they are in danger of going below a 99% approval rating, they stop and wait to see whether the requester starts approving 
     - "Don't put too many eggs in one basket" 
 - One worker tended to select from requesters that have a good history 
 - Workers will email a requester who is new to make sure the requester is responsive (so that if their work is rejected, they know the requester will respond)
     - Some requesters don't want to interact; they just want the work done
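The worst-case approval-rating check described above amounts to a small calculation; a minimal sketch, assuming the worker can see their approved, rejected, and pending HIT counts (the function names are hypothetical, not the panelist's actual script):

```python
def worst_case_approval_rate(approved: int, rejected: int, pending: int) -> float:
    """Worst-case approval rating, assuming every pending HIT gets rejected."""
    total = approved + rejected + pending
    if total == 0:
        return 1.0  # no history yet; nothing to lose
    return approved / total

def safe_to_continue(approved: int, rejected: int, pending: int,
                     threshold: float = 0.99) -> bool:
    """Stop taking new HITs if a full rejection of pending work
    would drop the rating below the threshold (here, 99%)."""
    return worst_case_approval_rate(approved, rejected, pending) >= threshold
```

This captures the "don't put too many eggs in one basket" logic: pending work from a single requester is treated as potential rejections until it is actually approved.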
  • Task design process
 - MTurk allows requesters to quickly test a hypothesis and design an experiment 
 - Important: understand the task well enough to write a task that workers can understand 
 - Small batch posted as trial; results checked to see if they make any sense 
 - Workers e-mail requesters and suggest when things are unclear 
 - The time required to interact with workers did not scale well 
     - Hired someone to monitor e-mail account and respond to people who felt their work was rejected unfairly or to answer questions
 - Built up the requester's reputation - you want good workers to engage with your task 
      - If the requester rejected someone's work, they directed the worker to a link where the disagreement would be adjudicated 
     - Extra 20% of money dedicated to resolving these disagreements 
 - The requester aimed for productivity-optimized HITs 
 - Surveys are created with an external tool; MTurk templates are not used 
 - Ability to have a HIT inside an iframe allows you to build your own interface 
     - Useful since some HITs are not standardizable 
  • How do you make sure people are honestly and carefully answering those questions?
 - Go to communities of good workers 
 - Pay well 
  • The requester automatically accepts HITs and results are generally good
  • Rather than rejecting work, which could hurt a worker's long-term approval rating, one requester used a qualification that gets taken away if the worker's performance falls below a certain percentage
  • Workers are presented as serial numbers; there is no face-to-face interaction
  • How do you generate your projected earnings?
 - Based on experience, one worker estimates projected earnings in her head for batch HITs
 - Time estimates (e.g. for surveys) are often inaccurate
       - Requesters may overestimate or underestimate (they want people to accept the HIT) 
      - Worker forums are used for time estimates; someone will do the HIT and report how long they took 
      - The amount of time any worker takes on a task can vary (what people report may not be the same for others) 
 - Participate and try the HIT out; invest 2 minutes 
 - One worker kept spreadsheets tracking requesters and how long their postings generally took 
 - Huge amount of variability in earnings on a day-to-day basis; very difficult to budget 
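The earnings projections described above boil down to simple arithmetic: pay per HIT divided by an observed completion time (often taken from forum reports or a 2-minute trial run), aggregated over a batch. A minimal sketch, with hypothetical names and inputs:

```python
def effective_hourly_rate(pay_cents: float, minutes_per_hit: float) -> float:
    """Effective hourly wage in dollars, given pay per HIT and time per HIT."""
    if minutes_per_hit <= 0:
        raise ValueError("time per HIT must be positive")
    return (pay_cents / 100.0) * (60.0 / minutes_per_hit)

def projected_batch_earnings(pay_cents: float, minutes_per_hit: float,
                             hits_available: int, hours: float) -> float:
    """Project earnings for a work session, capped by the HITs available.

    Note: real earnings vary because reported completion times differ
    from worker to worker, as the panelists pointed out.
    """
    doable = min(hits_available, int(hours * 60 / minutes_per_hit))
    return doable * pay_cents / 100.0
```

For example, a 25-cent HIT that takes 2 minutes works out to $7.50/hour, which is only achievable if the batch actually contains enough HITs to fill the session.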
  • Rejection
 - One requester set two thresholds: 
      - Rejection: performing at chance (worker clicking randomly) 
      - Approval: worker doing well, reasonably conscientious
      - Middle range: have to decide whether to approve or reject
 - If a new worker is rejected, their worker rating is greatly affected 
      - They are locked out of the vast majority of good paying work 
       - It takes a while (several hundred HITs) before they can accept HITs that pay well
 - One worker takes screenshots of all the work that they do
      - If there's a rejection, they e-mail the requester with the screenshots of the work to clear up situation 
 - One worker sets goals for themselves in terms of how much they want to make  
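The two-threshold rejection policy above can be sketched as a simple triage over a worker's accuracy on checkable questions. The threshold values and names here are illustrative assumptions, not the requester's actual figures:

```python
def triage(correct: int, total: int,
           chance_level: float = 0.25,    # e.g. chance accuracy on 4-option questions
           approve_level: float = 0.80) -> str:
    """Classify a worker's submitted batch:
    'reject'  - performing at (or below) chance, i.e. clicking randomly
    'approve' - doing well, reasonably conscientious
    'review'  - middle range; the requester must decide manually
    """
    if total <= 0:
        raise ValueError("total must be positive")
    accuracy = correct / total
    if accuracy <= chance_level:
        return "reject"
    if accuracy >= approve_level:
        return "approve"
    return "review"
```

The design point is that only the clearly-at-chance cases are rejected automatically; the ambiguous middle band gets human judgment, which limits unfair hits to a worker's approval rating.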
  • Workers describe experience with requester in 5 words:
 - anonymous 
 - frustrating 
 - tricky 
 - unpredictable 
 - great
 - terrible 
  • Requesters describe experience with worker in 5 words:
 - incredible enablers of scientific research
 - power: as a requester, you have it; as a worker, you don't 
 - lonely, isolated
 - diverse
 - passionate
 - futuristic 
 - dynamic 
 - humans
 - frustrating (both)
 - tiring (both) 
 - empowered (both)

Reading Others' Insights

Worker perspective: Being a Turker

1) What observations about workers can you draw from the readings? Include any that are strongly implied but not explicit.

  • Primarily work for money
  • Use online forums to obtain more information about requesters, to become more efficient
  • Turkers may lack money from other sources (e.g. their normal full-time job)
  • Higher earning workers are more experienced on MTurk and take on more professional work
  • Turkers set themselves targets/goals for earnings
  • Workers are ethical and cheating is generally frowned upon; requesters are responsible for their own mistakes
  • Workers will help out a requester if help is asked for
  • Communication between the worker and requester leads to more efficiency for both parties
  • Novices focus on getting their approval rating and HIT count up
  • Workers spend invisible time searching for jobs to do, on online forums
  • Experienced Turkers concerned w/ approval rating and getting access to wider selection of jobs and higher paying HITs
  • Workers are afraid that regulation will drive cost up for requesters and cause the requesters to leave the platform
  • Workers believe in their ability to influence the market through their actions (accepting high pay, rejecting low pay)
  • Invisibility has positives (anonymity; workers can work for whomever they want, whenever they want) and negatives (workers are easy to criticize and misunderstand)

2) What observations about requesters can you draw from the readings? Include any that are strongly implied but not explicit.

  • There is a misconception that the workers are having fun doing the work
  • Have information (i.e. approval rating) on workers
  • Have the greater power to fix unfair situations
  • Can easily approve or reject, denying the worker of any pay for the time spent
  • Requesters can adjust their pay and promptness for approval/rejection based on experience, communication
  • May view information sharing among workers as encouraging people to work around qualifications

Worker perspective: Turkopticon

1) What observations about workers can you draw from the readings? Include any that are strongly implied but not explicit.

     -Workers take a risk when they take jobs because if the requester is unsatisfied with their work, the requester can withhold their wages. And if the worker contacts the requester, the requester is not obligated to respond.
     -Workers also take a risk when they accept a job because their work is compared through an algorithm that may or may not be functioning correctly. If the algorithm is incorrect, they won't get paid.
     -While it wasn't explicitly stated, invisibility is something that works to the advantage of turkers (e.g. no biased payment based on gender, race, etc.)
     -While workers don't seem to want AMT to be government regulated, they are unhappy with pay being below minimum wage.
          -Perhaps establish a minimum price that requesters can set
     -It's treated like a normal job, with workers trying to build relationships with employer(s)
     -There is an unwritten set of rules/beliefs that all true turkers adhere to 
     -Turkers can get banned due to a low approval rating, but a requester's behavior is never analyzed

2) What observations about requesters can you draw from the readings? Include any that are strongly implied but not explicit.

    -Requesters see the work done by workers as a simple computation. They don't think about the time and effort that went into providing this information. They take the work for granted.
    -If the work produced is not what they expected they can essentially steal wages from their workers without legal repercussions.
    -Requesters have little to no accountability to their workers.
    -Due to invisibility requesters tend to treat workers somewhat like slave labor
    -Requesters expect clean and thorough work for small pay, yet tend to assume workers are lazy
    -The sheer number of workers a requester deals with influences their attitude toward them, since each worker seems easily replaceable 
    -Requesters are the main source of income for Amazon, so they get priority as far as rights go
    -There needs to be an easier way for requesters to give feedback to workers
         -1000:1 worker-to-requester ratio

Requester perspective: Crowdsourcing User Studies with Mechanical Turk

1) What observations about workers can you draw from the readings? Include any that are strongly implied but not explicit.

  • Workers have low participation costs for accepting and completing simple, short tasks
  • A small subset of the workers take advantage of the system multiple times
  • Workers are more likely to give meaningful answers on less “gamey” tasks when there are verifiable, more qualitative questions they have to answer first

2) What observations about requesters can you draw from the readings? Include any that are strongly implied but not explicit.

  • Having many participants for user testing is costly for potential requesters
  • Many participants are needed to catch errors and issues
  • Diverse population from all over the world provided on MTurk
  • Lack of demographic information, etc. provided on MTurk
  • Requesters spend time finding, removing, and rejecting responses
  • Requesters must design tasks so that a less-thoughtful answer takes as much effort as an authentic answer
  • Requesters have multiple ways to detect suspect responses: short task durations, comments repeated verbatim across multiple tasks
  • Requesters have no way to control the environment of the worker
  • Requesters can't assign participants to experiments

Requester perspective: The Need for Standardization in Crowdsourcing

1) What observations about workers can you draw from the readings? Include any that are strongly implied but not explicit.

  • Workers make their own hours
  • Can choose to perform tasks of varying difficulty
  • Are required to learn different task specifications for different requesters
  • Are required to adapt to quality expectations of different requesters
  • Are forced to constantly adapt due to the lack of standardization of tasks
  • May not earn as much as they should for a task because the task's pay rate is set without reflecting the current market

2) What observations about requesters can you draw from the readings? Include any that are strongly implied but not explicit.

  • Requesters can suggest tasks of varying difficulty
  • Requesters need to be mindful when deciding how much to offer a worker per task
  • Spammers are a threat to quality and task completion

Both perspectives: A Plea to Amazon: Fix Mechanical Turk

1) What observations about workers can you draw from the readings? Include any that are strongly implied but not explicit.

  • Good workers get paid the same amount as bad workers
  • Workers will not complete many HITs of a new requester until the workers know that the requester is legitimate, pays promptly, and does not reject work unfairly
  • Spammers and inexperienced workers are the ones who end up working on HITs from new requesters
  • Workers can’t search for a requester unless the requester put their name in the keywords
  • Workers mainly use 2 sort orders:
  - see the most recent HITs
  - see the HIT groups with the most HITs
  • Workers need a cleaner interface with more organization
  • Workers actually fear requesters. This isn't healthy in any workplace
  • Workers need a reputation system that doesn't rely on refusal of payment

2) What observations about requesters can you draw from the readings? Include any that are strongly implied but not explicit.

  • In order to get good results, requesters build their own quality assurance system, ensure qualifications from workers, break tasks into a workflow, rank workers according to quality, etc.
  • There are a few big requesters and many small requesters posting tiny tasks
  • To get quality, they have the same task repeated many times by many workers
  • Requesters can reject good work and not pay for work they get to keep
  • Requesters don’t need to pay on time
  • New requesters who post a large batch are likely to get low quality results from spammers and inexperienced workers
  • Redundancy shouldn't be necessary to get good results
  • There should be a simpler way to post tasks
  • Requesters need a qualification test to help lower redundancy and increase quality of work

Soylent: A Word Processor with a Crowd Inside

1) What observations about workers can you draw from the readings? Include any that are strongly implied but not explicit.

  • Allows humans to utilize crowd tools
  • Uses workers for tasks that have a solidified answer by mixing technology and crowd work
  • Gives set parameters for workers so they can't misinterpret a task
  • Would help ensure workers don't have their pay rejected
  • Workers aren't given a misguided task

2) What observations about requesters can you draw from the readings? Include any that are strongly implied but not explicit.

  • Helps eliminate the issue of the lazy turker and the eager beaver
  • Provides a more reliable system for editing
  • Allows requesters to get results that resemble the accuracy of a computer, with the common sense of a human
  • Allows edits to preserve the original meaning
  • Allows human error to be fixed by an algorithm, whose algorithmic errors are then caught by a human.

Do Needfinding by Browsing MTurk-related forums, blogs, Reddit, etc

List out the observations you made while doing your fieldwork. Links to examples (posts / threads) would be extremely helpful.

Synthesize the Needs You Found

List out your most salient and interesting needs for workers, and for requesters. Please back up each one with evidence: at least one observation, and ideally an interpretation as well.

Worker Needs

A set of bullet points summarizing the needs of workers.

  • Example: Workers need to be respected by their employers. Evidence: Sanjay said in the worker panel that he wrote an angry email to a requester who mass-rejected his work. Interpretation: this wasn't actually about the money; it was about the disregard for Sanjay's work ethic.
  • Workers need to feel assured that the requester is a good requester (i.e. responsive, fair, and pays within a reasonable amount of time after a HIT is submitted). Evidence: During the panel, one of the words used by the workers to describe their experiences with requesters was "anonymous". And, in the paper "Worker perspective: Being a Turker", there's mention of "information asymmetry" between the workers and the requesters. Interpretation: Requesters can use a worker's approval rating to help decide whether or not to take on the worker, but workers have no equivalent information.
  • Workers need to reduce the invisible work they spend looking for HITs, on forums, etc. Evidence: In the worker panel, there was a worker who used a script and browsed forums for HITs found by other people in order to find good HITs (i.e. a good requester behind the HIT, good pay). Interpretation: Workers do a lot of unpaid work in order to find good-paying work.
  • Workers need requesters to understand that this is a job. Their choice to be a worker may not be a fully voluntary one, and the work that they do is not necessarily fun. Evidence: In the panel and in "Worker perspective: Being a Turker", there was a worker who reported a response one requester had given them, which was to get a real job. Workers in this paper also report MTurk as a way to make ends meet while they wait for their other job to pick up. And during the panel, one worker was disabled and was not able to find work in the traditional sense. Interpretation: The requester didn't understand the worker's situation at all, and might have assumed that the worker was just too lazy to get a more traditional job. But in fact, workers may have no other option.
  • Workers need more stability and predictability in their work life. Evidence: In the panel, one of the workers described how unpredictable the work was. They sometimes had to choose between lunch or a well-paying HIT. Some got up at odd hours of night in order to work. Interpretation: Workers may have a difficult time making enough money with set hours since requesters choose when to post the HITs and there are requesters from all around the world, posting HITs 24/7.
  • Workers need some sort of invisibility to help protect against discrimination, while also having some system besides the government that regulates pay on MTurk. Evidence: the Turkopticon paper's feminist HCI discussion. Interpretation: This invisibility allows workers not to have to worry about biased payment and/or hiring
  • Workers need a cleaner and more organized interface. Evidence: Panos's plea to Amazon, and the fact that Turkopticon, TurkerNation, etc. even exist. Interpretation: Amazon is lazy and doesn't care about the workers, since workers produce no revenue for them.
  • Workers need consistency across tasks from different requesters. Evidence: We learn from "The Need for Standardization in Crowdsourcing" that workers are required to learn different task specifications from different requesters and are forced to constantly adapt to the quality expectations of different requesters. Interpretation: Due to the variability in what a requester wants from a worker, both in the quality of results expected and in the details of the actual task at hand, workers constantly have to adapt and lose time switching from requester to requester, task to task. This affects the growth of newer requesters.

Requester Needs

A set of bullet points summarizing the needs of requesters.

  • Example: requesters need to trust the results they get from workers. Evidence: In this thread on Reddit (linked), a requester is struggling to know which results to use and which ones to reject or re-post for more data. Interpretation: it's actually quite difficult for requesters to know whether 1) a worker tried hard but the question was unclear or very difficult or an edge case, or 2) a worker wasn't really putting in a best effort.
  • Requesters need a way to communicate with their workers effectively when scaling up. Evidence: A requester on the panel stated that they had to hire someone to respond to workers' questions and handle appeals of rejections. In "Turkopticon", it is reported that there is a 1000:1 worker-to-requester ratio on MTurk. Interpretation: It becomes increasingly difficult to communicate with workers the more HITs you put out, due to the sheer number of workers.
  • Requesters need workers to understand the task at hand clearly. Evidence: In the worker panel, a requester put out a small batch of tasks first as a test batch to clarify instructions and to make sure workers understand what to do. Interpretation: Requesters may put out tasks that are badly designed and in return they get bad results, for which they may blame the worker.
  • New requesters need to have confidence in the platform itself. Evidence: In the paper "A Plea to Amazon: Fix Mechanical Turk", the author describes how experienced workers don't want to risk their approval rating on a new requester. New requesters are more likely to get spam or bad results, since the workers willing to take that risk tend to be spammers or inexperienced. Interpretation: New requesters don't know the platform well and have not invested much time in it. If they get bad results right off the bat, they might assume that any results they get from the platform are bad, and leave the platform.
  • Requesters need to reduce the amount of work they do in order to get good results. Evidence: In the paper, "A Plea to Amazon: Fix Mechanical Turk", the author reported requesters building their own quality assurance system, ensuring qualifications from workers, ranking workers according to quality, etc. Interpretation: Requesters have to invest a lot of time into maintaining the quality of results that they get back from workers.
  • Requesters need to understand that lazy turkers are often a result of the requester's lack of effort in creating the task. Evidence: "Crowdsourcing User Studies with Mechanical Turk". Interpretation: When the task was designed well enough to force workers to actually do the work, the amount of spam significantly dropped.