Milestone 2 teamtrojan

From crowdresearch
Revision as of 15:48, 11 March 2015 by Rashmiputtur (Talk | contribs) (Requester perspective: Crowdsourcing User Studies with Mechanical Turk)

Attend a Panel to Hear from Workers and Requesters

It was a great experience to attend the hangouts session. The hangouts helped in understanding the perspectives of workers and requesters better.

Some observations made during the meeting:

From the requester's perspective:

1. It is important to post tasks that have the potential to produce more responses.

2. Try to post new categories of tasks.

3. The obtained responses can be useful in social science research.

4. Requesters may leverage demographic variety.

5. They face problems related to validating responses.

6. Experience of the requester plays a vital role in describing a new task.

7. It is challenging for requesters to break tasks down into micro-tasks or tasks appropriate to crowdsourcing platforms.

8. Importance must be given to the language used and description of the tasks.

9. Some panelists felt that it was appropriate to allow workers to post questions about the task description, thereby letting them help frame the instructions.

10. It is important for requesters to “Get the right people to do the right tasks at the right time”. It is also important to allow workers to submit successive responses.

11. Requesters must discern whether participants are paying attention and responding honestly.

12. Make tasks granular and pay workers via milestones achieved.

13. Many panelists felt that workers' responses must not be rejected.

14. Requesters may provide incentives other than money. Many felt, however, that posting tasks without pay would defeat the purpose of crowdsourcing.

15. It is difficult to rate a worker based on acceptance percentage, since requesters do not know how the percentage is calculated.

16. When requesters do not receive enough responses, there are various ways to promote the tasks, including determining whether the task is appropriately priced and checking whether the instructions are absurd.

From the worker's perspective:

1. An important factor in finding reasonable tasks is the time of day. Many panelists were of the opinion that early mornings or late nights were suitable times. It was also observed that weekends have fewer tasks than weekdays, but more workers. From the worker's perspective, “It is important to be there when tasks or HITs are posted”.

2. Communicating with the requester or asking questions can help workers find better, more suitable tasks.

3. Common filters used to scan through tasks are: Time, Monetary reward, Interests and Task ethics.

4. Most workers may perform hourly tasks.

5. Workers may consider the correlation between the time spent on a task and the quality of the response.

6. Cold start is a problem on a few crowdsourcing platforms, oDesk for example. There should be some sort of test that lets workers showcase their skills.

7. It is important to convert practical experience to digital skill.

8. Concerns were raised about whether the amount of money earned is highly variable or static. Panelists were unanimous that it is highly variable.

9. Workers would appreciate it if it were easier to find work that suits their interests.

10. Informative tags on tasks and categorization of tasks help workers find appropriate work faster.

Reading Others' Insights

Worker perspective: Being a Turker

1) What observations about workers can you draw from the readings? Include any that may be strongly implied but not explicit.

2) What observations about requesters can you draw from the readings? Include any that may be strongly implied but not explicit.


From the worker's perspective:

A. Amazon Mechanical Turk

1. Limited/No Authority: In Amazon Mechanical Turk (AMT), workers have limited options for dissent if their work is rejected. They have no legal recourse against employers who reject their work and then go on to reuse it.

2. Wage Theft: In AMT, workers are sometimes victims of wage theft when employers deliberately reject their work.

3. Interchangeable Treatment: Due to surplus labor, workers are treated interchangeably, i.e., no effort is made to stop a worker from leaving the platform.

4. Difficulty in gaining approval rating: When a worker's task is rejected, his/her approval rating falls and the system then withholds high-priority tasks from the worker, which hinders his/her chances of improving the approval rating.

5. Communication Issues: Both Amazon and requesters frequently fail to reply to workers' concerns.

6. No Minimum Wage Policy: There is no fixed minimum wage per HIT or per hour for workers, which makes their income highly variable.

7. Reason for working: The purpose of working on this platform varies for each worker, from fun or passing the time to paying electricity bills.
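The approval-rating problem in point 4 comes down to simple arithmetic: a single rejection hurts a worker with a short history far more than one with a long history. The sketch below illustrates this under stated assumptions. The function names and the 95% cutoff are hypothetical and are not taken from the readings or from AMT's documentation.

```python
def approval_rate(approved, rejected):
    """Percentage of reviewed tasks that were approved, or None if none reviewed."""
    total = approved + rejected
    if total == 0:
        return None
    return 100.0 * approved / total


def meets_threshold(approved, rejected, min_rate=95.0):
    """True if the worker's approval rate is at least min_rate.

    min_rate=95.0 is an assumed, illustrative cutoff.
    """
    rate = approval_rate(approved, rejected)
    return rate is not None and rate >= min_rate


# One rejection each, but very different histories (made-up numbers).
newcomer = (9, 1)     # 9 approved, 1 rejected
veteran = (999, 1)    # 999 approved, 1 rejected

for name, (ok, bad) in (("newcomer", newcomer), ("veteran", veteran)):
    print(name, approval_rate(ok, bad), meets_threshold(ok, bad))
```

With these made-up numbers, the newcomer falls below the assumed cutoff after one rejection while the veteran does not, which is the feedback loop the point describes.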

B. Turkopticon

1. Workers' mutual aid: Workers enter reviews, ratings and comments about requesters, which helps other workers avoid spammers.

2. Search Functionality: Workers can search using keywords such as a requester's name or ID.

3. Anonymity: Workers' email addresses are obfuscated when they post reviews and comments about requesters, which protects them from retribution.

Crowdsourcing User Studies with Mechanical Turk

From the requester's perspective:

1. Requesters may view Mechanical Turk as a low-cost, time-saving alternative for obtaining user input on extensive or intensive tasks.

2. They may aim to access a wide pool of users and obtain user input on a significantly larger scale. They may also aim to benefit from the geographic diversity of participants.

Two experiments were conducted to determine the usability of Mechanical Turk in user studies. The empirical study asked users to rate a set of fourteen Wikipedia articles in an effort to match user ratings with Wikipedia administrator ratings. The articles were chosen at random from a pool covering a range of expert ratings.

Experiment 1:

Users were asked to evaluate an article on a 7-point scale based on factors such as factual correctness, structure, neutrality and overall quality, for a reward of $0.05. They could also provide optional feedback on improvements to the article through a text box.

The optional feedback was intended to help determine the veracity of the ratings provided by users.

The following observations were made from the results:

1. Users responded very quickly. For many responses, the response time was on the order of a few minutes.

2. Agreement between the user and admin ratings was very low.

3. Close inspection of the responses indicated widespread gaming of the system. Such responses were uninformative, non-constructive or copied text.

4. The remaining responses were too sparse to contribute to a constructive research finding.

Experiment 2:

This experiment built on the previous one to try to reduce the number of invalid responses and malicious use of the system. It helped users provide better subjective responses by including a questionnaire whose answers could be verified.

The following observations were made from the results:

1. The number of responses per user decreased compared to the previous experiment.

2. A significant correlation between the user ratings and admin ratings was observed.

3. Fewer responses were observed to be invalid.

4. The median time to submit a response was higher.
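The comparison behind these observations, checking how closely crowd ratings track administrator ratings with and without a verifiable check, can be sketched in Python. This is a minimal illustration, not the study's actual analysis. The field names (`article`, `rating`, `passed_check`), the helper functions and the toy data are all hypothetical, and Pearson correlation is an assumed choice of statistic.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant ratings")
    return cov / sqrt(var_x * var_y)


def mean_rating_per_article(responses, require_check=False):
    """Average rating per article, optionally keeping only checked responses."""
    by_article = {}
    for r in responses:
        if require_check and not r["passed_check"]:
            continue  # drop responses that failed the verifiable questions
        by_article.setdefault(r["article"], []).append(r["rating"])
    return {a: sum(v) / len(v) for a, v in by_article.items()}


def agreement_with_admins(responses, admin_ratings, require_check=False):
    """Correlate mean crowd ratings with admin ratings over shared articles."""
    crowd = mean_rating_per_article(responses, require_check)
    articles = sorted(a for a in admin_ratings if a in crowd)
    return pearson([crowd[a] for a in articles],
                   [admin_ratings[a] for a in articles])


# Toy data invented for illustration only (not from the study).
admin_ratings = {"A": 2, "B": 4, "C": 6}
responses = [
    {"article": "A", "rating": 2, "passed_check": True},
    {"article": "A", "rating": 7, "passed_check": False},
    {"article": "B", "rating": 4, "passed_check": True},
    {"article": "B", "rating": 1, "passed_check": False},
    {"article": "C", "rating": 6, "passed_check": True},
    {"article": "C", "rating": 1, "passed_check": False},
]

print("all responses:", agreement_with_admins(responses, admin_ratings))
print("checked only: ", agreement_with_admins(responses, admin_ratings, True))
```

In this toy data, the careless responses pull the averages away from the admin ratings, and filtering on the check restores agreement. Real data would be noisier, and the right filtering rule depends on how the verifiable questions are designed.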

The above experiments produce important insights about requesters:

1. Experiment 1 showed that, to benefit from Mechanical Turk, requesters must pay special attention to how tasks are formulated; otherwise the results may defeat the task's purpose.

2. The more subjective a task is, the harder it is for the requester to validate answers, as observed in Experiment 1.

3. Lack of knowledge about participants' experience, difficulty in contacting them, and limited demographic information raise concerns about the accuracy and correctness of responses.

4. The results of Experiment 1 did not favor using crowdsourcing platforms for research purposes. This is an important consideration for a requester.

5. The differing results of the two experiments show how important task design is for researchers who want to benefit from these platforms.

The Need for Standardization in Crowdsourcing

From the worker's perspective:

1. Adaptation Problems: Since workers currently receive no training, they must work hard to learn the intricacies of each employer's interface and adapt to each employer's requirements.

2. Variable Income: The amount a worker earns for the same number of hours can differ greatly from one day to the next, so this source of income is highly variable.

3. Insecurity: Without proper training or a specification of the required skill set, a worker is always uncertain about the outcome, i.e., whether the work will be accepted by the requester.

4. Skills mismatch: Workers have trouble finding the jobs they want, which often leads to a mismatch between tasks and skills or talent.

5. Flexibility: Workers are free to choose the kind of work they wish to perform from the available tasks, and can easily experiment with different types of tasks.

Both perspectives: A Plea to Amazon: Fix Mechanical Turk

1) What observations about workers can you draw from the readings? Include any that may be strongly implied but not explicit.

2) What observations about requesters can you draw from the readings? Include any that may be strongly implied but not explicit.

Do Needfinding by Browsing MTurk-related forums, blogs, Reddit, etc

List out the observations you made while doing your fieldwork. Links to examples (posts / threads) would be extremely helpful.

Synthesize the Needs You Found

List out your most salient and interesting needs for workers, and for requesters. Please back up each one with evidence: at least one observation, and ideally an interpretation as well.

Worker Needs

A set of bullet points summarizing the needs of workers.

  • Example: Workers need to be respected by their employers. Evidence: Sanjay said in the worker panel that he wrote an angry email to a requester who mass-rejected his work. Interpretation: this wasn't actually about the money; it was about the disregard for Sanjay's work ethic.

Requester Needs

A set of bullet points summarizing the needs of requesters.

  • Example: requesters need to trust the results they get from workers. Evidence: In this thread on Reddit (linked), a requester is struggling to know which results to use and which ones to reject or re-post for more data. Interpretation: it's actually quite difficult for requesters to know whether 1) a worker tried hard but the question was unclear or very difficult or an edge case, or 2) a worker wasn't really putting in a best effort.