Milestone 2 UWI


Attend a Panel to Hear from Workers and Requesters

  • Many of the panelists do sociology research or work with sociology researchers.
  • One worker talked about how difficult the MTurk interface is to use the first time. After he got used to it, though, he found the platform more consistent than other crowdsourcing platforms. He came to MTurk because he wanted to do more individual work.
  • The panelists talked a lot about the community of Turkers. There are many websites for Turkers, such as Turk Nation and MTurk Forum, where people share new HITs, information, and experience. Someone mentioned that a new requester was surprised that workers communicated with each other. Experienced Turkers help new Turkers get used to the platform, give them advice, and share information about HITs.
  • Someone thought that the initial motivation for Turkers is money, but the community is what really helps people keep doing the job. Workers who get their first rejection can be really frustrated, and they can find help in the community.
  • Workers email requesters about the work, giving them suggestions on the format, the payment, and the required qualifications of their HITs. Someone said that the more standardized the platform is, the more HITs would be posted on it, giving workers more work and more money.
  • Two requesters seldom reject workers, but they use different ways to test whether workers take the tasks seriously. One newer requester uses the time a worker spends on the instruction page: he thought that someone who spends less time on the instructions might not be paying attention and might not do a good job.
  • The other requester said he used to use this method, but found that spending less time on the instruction page does not necessarily mean a worker isn't paying attention to it. So he usually posts a small question after the instructions; workers who really understand the instructions can answer it correctly.
  • Some of the requesters keep each task as small as possible. Sometimes they post small pilot tasks to measure how much time people need to finish the task.
  • One requester said that at first he didn't realize that the time on a HIT is the maximum time, not the minimum time.
  • A requester said that MTurk doesn't allow downloads because of intellectual property concerns. Sometimes he wants workers to use new tools to do the tasks, but he can't provide the tools to them.
  • Someone worked on internal crowdsourcing platforms at Twitter and Google. He said a lot of companies have internal crowdsourcing platforms.
  • Someone thought that MTurk is too hard to get started with and that it lacks training and tutorials. One worker said that the community wants qualified, professional workers to stay, so since Amazon doesn't provide training, community members help new workers themselves.
  • They said that since MTurk doesn't allow workers outside the US to sign up, some people sell American accounts to workers in India.

Reading Others' Insights

Worker perspective: Being a Turker

1) What observations about workers can you draw from the readings? Include any that may be strongly implied but not explicit.

2) What observations about requesters can you draw from the readings? Include any that may be strongly implied but not explicit.

Worker perspective: Turkopticon

1) What observations about workers can you draw from the readings? Include any that may be strongly implied but not explicit.

2) What observations about requesters can you draw from the readings? Include any that may be strongly implied but not explicit.

Requester perspective: Crowdsourcing User Studies with Mechanical Turk

1) What observations about workers can you draw from the readings? Include any that may be strongly implied but not explicit.

  • In experiment 1, worker response was extremely fast, "with 93 of the ratings received in the first 24 hours after the task was posted, and the remaining 117 received in the next 24 hours".
  • In experiment 1, workers finished the tasks in a very short time. "Many tasks were completed within minutes of entry into the system, attesting to the rapid speed of user testing capable with Mechanical Turk".
  • In experiment 1, "many of the invalid responses were due to a small minority" of workers.
  • In the two experiments, workers gave responses of very different quality: in experiment 1, 48.6% of the responses were invalid, but in experiment 2 only 2.5% were invalid.

2) What observations about requesters can you draw from the readings? Include any that may be strongly implied but not explicit.

  • The requesters used Mechanical Turk to conduct user studies (gathering ratings of Wikipedia article quality).
  • The requesters compared the workers' ratings to those of an expert group of Wikipedia administrators from a previous experiment.
  • The requesters designed a new experiment to "make creating believable invalid responses as effortful as completing the task in good faith". They added four questions with quantitative, verifiable answers before the subjective rating of article quality; a rough sketch of how such verification questions might be used to filter responses follows this list.
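
The reading does not give an implementation, but pairing the subjective rating with verifiable questions suggests a simple filtering step on the requester's side. The sketch below is a hypothetical illustration in Python; the field names, "gold" values, and miss threshold are assumptions for the example, not details from the paper. Responses that get too many of the verifiable questions wrong are flagged as likely invalid.

  # Hypothetical sketch: flag responses that fail verifiable "gold" questions.
  # Field names and correct values below are illustrative assumptions.
  GOLD_ANSWERS = {
      "num_references": 12,    # "How many references does the article list?"
      "num_images": 3,         # "How many images does the article contain?"
      "num_sections": 7,       # "How many sections does the article have?"
      "first_edit_year": 2004, # "In what year was the article created?"
  }

  def is_valid(response, max_misses=1):
      """Keep a response only if it misses at most max_misses gold questions."""
      misses = sum(1 for field, correct in GOLD_ANSWERS.items()
                   if response.get(field) != correct)
      return misses <= max_misses

  responses = [
      {"num_references": 12, "num_images": 3, "num_sections": 7,
       "first_edit_year": 2004, "quality_rating": 5},
      {"num_references": 1, "num_images": 0, "num_sections": 2,
       "first_edit_year": 1999, "quality_rating": 7},  # likely low-effort
  ]
  kept = [r for r in responses if is_valid(r)]
  print("kept", len(kept), "of", len(responses), "responses")

With a check like this, producing a believable invalid response requires answering the verifiable questions correctly anyway, which is roughly as much effort as doing the task in good faith.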

Requester perspective: The Need for Standardization in Crowdsourcing

1) What observations about workers can you draw from the readings? Include any that may be strongly implied but not explicit.

2) What observations about requesters can you draw from the readings? Include any that may be strongly implied but not explicit.

Both perspectives: A Plea to Amazon: Fix Mechanical Turk

1) What observations about workers can you draw from the readings? Include any that may be strongly implied but not explicit.

2) What observations about requesters can you draw from the readings? Include any that may be strongly implied but not explicit.

Do Needfinding by Browsing MTurk-Related Forums, Blogs, Reddit, etc.

List out the observations you made while doing your fieldwork. Links to examples (posts / threads) would be extremely helpful.

Synthesize the Needs You Found

List out your most salient and interesting needs for workers, and for requesters. Please back up each one with evidence: at least one observation, and ideally an interpretation as well.

Worker Needs

A set of bullet points summarizing the needs of workers.

  • Example: Workers need to be respected by their employers. Evidence: Sanjay said in the worker panel that he wrote an angry email to a requester who mass-rejected his work. Interpretation: this wasn't actually about the money; it was about the disregard for Sanjay's work ethic.

Requester Needs

A set of bullet points summarizing the needs of requesters.

  • Example: requesters need to trust the results they get from workers. Evidence: In this thread on Reddit (linked), a requester is struggling to know which results to use and which ones to reject or re-post for more data. Interpretation: it's actually quite difficult for requesters to know whether 1) a worker tried hard but the question was unclear or very difficult or an edge case, or 2) a worker wasn't really putting in a best effort.