Milestone 2 munichkindl


Attend a Panel to Hear from Workers and Requesters

Panel 1

  • To new requesters, the options available when setting up a HIT are sometimes confusing. For instance, one requester thought the maximum time for a HIT was an estimate shown to workers, when it is actually a system-side hard limit on completion time.
  • Some requesters say the ideal length for a survey on MTurk is 8 to 10 minutes in order to get valuable data.
  • Furthermore, requesters cannot be certain that their uploaded tasks will actually be completed; tasks may never be taken at all, or workers may stop halfway through.
  • The system does not make it possible to pre-select workers who match certain demographics or who hold a specific, verified qualification.
  • Workers value the flexibility to work from any place, at any time.
  • Workers dislike that the supply of tasks is inconsistent: they cannot rely on getting enough tasks to work on.
  • Some people work on Turk full time, while others work part time in addition to a 'real life' job.
  • A group of Turkers created the 'Dynamo agreement' (WeAreDynamo.org), aiming to get requesters to commit to some kind of minimum wage and fair working conditions.
  • Turkers and other crowd workers have created various forums to chat with and support each other, a kind of online equivalent to an office 'break room', because they feel they cannot talk about their work with non-crowd-workers.

Panel 2

  • The rejection rate on Mechanical Turk is broken. Requesters usually don't reject work even if it is not useful, so a worker's rejection rate is not a trustworthy quality signal.
  • On Mechanical Turk it is usually better to break work down into smaller tasks in order to get better results.
  • It's useful to first test a worker with a task for which you already have ground truth, to decide whether you can trust the rest of their results (see the sketch after this list).
  • On Mechanical Turk, it would be useful to be able to assign a specific worker to a specific task.
  • It's hard to run personalized tasks on Mechanical Turk, such as tests of Twitter's suggestion algorithm.
  • Successful workers find jobs during unconventional hours.
  • For some research tasks it would be useful to have access to specific demographic groups.
  • Mechanical Turk should have a better way to verify a worker's identity, such as where the worker lives, their age, and so on. This is crucial for some social research.
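
As a concrete illustration of the ground-truth screening idea above, here is a minimal sketch using boto3's MTurk client. The HIT ID, the gold answer, and the extract_answer helper are placeholder assumptions for illustration, not anything the panelists described:

    # Sketch of gold-standard screening: check submitted answers for a HIT
    # whose correct answer is known, and record which workers got it right.
    # GOLD_HIT_ID, GOLD_ANSWER, and extract_answer are placeholders.
    import xml.etree.ElementTree as ET
    import boto3

    mturk = boto3.client(
        "mturk",
        region_name="us-east-1",
        # Sandbox endpoint; drop this argument for the live marketplace.
        endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
    )

    GOLD_HIT_ID = "PLACEHOLDER_HIT_ID"  # HIT whose correct answer we know
    GOLD_ANSWER = "cat"                 # the known ground truth

    ANSWER_NS = ("{http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas"
                 "/2005-10-01/QuestionFormAnswers.xsd}")

    def extract_answer(answer_xml):
        """Pull the first FreeText field out of the QuestionFormAnswers XML."""
        root = ET.fromstring(answer_xml)
        node = root.find(f".//{ANSWER_NS}FreeText")
        return node.text if node is not None else None

    trusted = set()
    # Pagination omitted for brevity.
    resp = mturk.list_assignments_for_hit(HITId=GOLD_HIT_ID,
                                          AssignmentStatuses=["Submitted"])
    for assignment in resp["Assignments"]:
        answer = extract_answer(assignment["Answer"])
        if answer and answer.strip().lower() == GOLD_ANSWER:
            trusted.add(assignment["WorkerId"])
        # Approve either way: the panel noted rejections are a blunt tool,
        # so the gold task decides trust, not payment.
        mturk.approve_assignment(AssignmentId=assignment["AssignmentId"])

    print(len(trusted), "workers passed the gold-standard check")

The trusted set could then be granted a custom qualification (e.g. via associate_qualification_with_worker) so that follow-up HITs only accept screened workers.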


Reading Others' Insights

Worker perspective: Being a Turker

1) The paper focuses on Turkers who mostly do microtasks, i.e. tasks that are difficult for computers to do. It also states that the majority of Turkers are from the US, which raises the question of whether Turkers are really working for the money. The paper puts a lot of emphasis on challenging previous research, which concluded that most workers work for money rather than for fun or learning. However, as the paper proceeds, it becomes evident that the primary reason for turking on Amazon Mechanical Turk is indeed money. A review of Turkers' posts on a forum suggests that they are happy with the amount of money they earn, even though it is below the minimum wage. It is also worth noting that the highest-earning Turkers are the ones with the highest status. Turkers usually set daily or monthly goals to motivate themselves; these targets are sometimes a monetary amount, sometimes a given number of tasks.

2) The paper outlines that a great requester is honest, is a good communicator, and thinks of the workers as much as of themselves. A great requester should be able to tell workers precisely what is needed, without having to deliver multiple additional blocks of information later on. Requesters should also have access to better tools to express their work. The paper also mentions that workers get frustrated when a requester is unresponsive or late in replying, even if that requester has high reviews.

Worker perspective: Turkopticon

The authors analyse the crowdsourcing platform Amazon Mechanical Turk as a site of technically mediated worker-employer relations. They present the development and the final version of Turkopticon, a system that allows workers to publicize and evaluate their relationships with employers. Finally, they discuss the potentials and challenges of sustaining activist technologies that intervene in large socio-technical systems.

The main reason for the development of Turkopticon is that the authors see an inequality between the rights of requesters and workers, which is also caused by Amazon's interests.

The paper states that workers, partly because they come from different countries, have different views on how well paid and valued their work is (what is a lot of money in India is not enough to survive on in the US!). Some workers engage with AMT just for fun, while others depend on the income. Workers' pay is below average, and they give up every right to their intellectual property. Additionally, any piece of work can be rejected by requesters without a reason being given, and workers have limited options of dissent within AMT itself. Payment can take as long as 30 days. If workers consider the system unfair, they have very few options. Before the forums about AMT became important and Turkopticon was developed, little communication among workers was possible, which was made worse by the fact that they come from different cultures and speak different languages.

On the other hand, it is easy for requesters to see workers not as humans but as a sort of API. They can reject any piece of work without having to justify it, which also means they do not have to pay for it. If workers want to complain, requesters don't even have to answer.

Turkopticon aims to close the communicational gap between the workers, as well as between workers and requesters.

Requester perspective: Crowdsourcing User Studies with Mechanical Turk

The paper discusses the quality of work provided by crowd workers, in particular on Amazon Mechanical Turk, using a series of experiments. The first experiment asks Turkers to rate Wikipedia articles. The ratings are then compared with those of the official Wikipedia rating board, and it turns out that the crowd ratings were only marginally sufficient. A later inspection of the task reveals that a group of Turkers actually gamed the result by sharing ratings in order to work less. The requester was only able to detect this by close inspection, which shows a flaw in such a crowd grading system.

It also shows that while experts can rate an article based on specific criteria, a platform like Amazon Mechanical Turk can provide the opinion of the majority population; this in turn helps to obtain an optimal balance by combining the expert and user ratings. An important part of the paper also discusses problems from a requester's point of view, such as the inability to control the test environment: which browser the user is using, how the design looks on different screens, and, particularly for interactive grading of design work, screen color calibration. The platform also lacks collaborative tasking, i.e. allowing multiple people to contribute to one task, such as creating content together. There is simply no way for Turkers to interact with each other while doing a task.
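
The "optimal balance" the paper gestures at could be as simple as a weighted combination of the two rating sources. A minimal sketch; the 0.6 weight and the sample scores are made-up illustrations, not numbers from the paper:

    # Illustrative blend of a few expert ratings with many crowd ratings.
    # The weight and the example scores are invented for demonstration.

    def blended_rating(expert_scores, crowd_scores, expert_weight=0.6):
        """Weighted mean of the expert average and the crowd average."""
        expert_avg = sum(expert_scores) / len(expert_scores)
        crowd_avg = sum(crowd_scores) / len(crowd_scores)
        return expert_weight * expert_avg + (1 - expert_weight) * crowd_avg

    # Experts rate a Wikipedia article harshly, the crowd more mildly:
    print(blended_rating([3, 4], [5, 6, 5, 6, 4]))  # -> 4.18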

Requester perspective: The Need for Standardization in Crowdsourcing

1) What observations about workers can you draw from the readings? Include any that may be strongly implied but not explicit.

Although this paper strongly focuses on possible advantages of and methods for standardizing crowdsourcing platforms, there is certain information about current behaviors of requesters and workers to be extracted.

Workers have only very limited possibilities to filter tasks for desirable criteria such as skill requirements, difficulty, or payment structure. In addition, they need to manually check each requester's reputation beforehand in order to avoid possible fraud, as well as the level of quality expected by the requester. Due to non-standardized task user interfaces, they need to adjust to each one separately. The majority of workers work on low-skilled tasks and are said to perform better with very detailed instructions. On some platforms (such as crowdsource.com) they receive online training before being allowed to accept a certain type of task, but this is not a common standard.


2) What observations about requesters can you draw from the readings? Include any that may be strongly implied but not explicit.

On the requester side, much redundancy occurs when creating tasks. With few or no 'best practices' to refer to, everyone defines their own user interface without knowing its effectiveness. For comparison, in empirical research there is a number of standardized interview questionnaires to choose from; no such standard is yet established for microlabor tasks. Furthermore, categorization of tasks is only possible on a very broad level. As for task pricing, requesters have no way to adjust prices according to market demand without reposting the task.
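
To make the standardization argument concrete, a machine-readable task description could look something like the sketch below. All field names are hypothetical; the paper does not propose a specific schema:

    # Hypothetical standardized task description. The fields mirror the
    # filter criteria workers currently lack (skill, difficulty, pay,
    # duration) plus a pricing knob that avoids reposting the task.
    import json

    task = {
        "category": "image-labeling",       # fine-grained, filterable type
        "skill_requirements": ["english"],  # lets workers filter by skill
        "difficulty": "low",
        "estimated_minutes": 5,             # reliable completion-time info
        "base_reward_usd": 0.50,
        "demand_multiplier": 1.0,           # adjust pay without reposting
        "ui_template": "standard/image-labeling-v1",
    }
    print(json.dumps(task, indent=2))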

Both perspectives: A Plea to Amazon: Fix Mechanical Turk

The author of this post from 2010 proposes improvements to AMT based on the following observations:

It is difficult for requesters to assess the quality of workers beforehand. Work can be rejected without justification, which harms the worker's reputation. Moreover, if the platform is not user-friendly, good workers are tempted to leave AMT, reducing the overall quality of the platform.

New requesters have trouble gaining traction on AMT, because workers are careful about accepting tasks from them: their work could easily go unpaid or be rejected. It is also difficult for requesters to manage the complex API or the execution time. The author claims that many requesters hire full-time developers to produce the microtasks; by doing so, however, the work becomes less cost-effective.

The author states that without changes addressing these needs, AMT will not remain successful for much longer.

Do Needfinding by Browsing MTurk-related forums, blogs, Reddit, etc

List out the observations you made while doing your fieldwork. Links to examples (posts / threads) would be extremely helpful.

  • This comment touches on one of the points we got in Panel 2: the platform needs a way to verify workers' identities, not only so requesters can trust them, but also to save time, since identity information is requested so often across tasks that workers spend a lot of time re-entering it.
  • Trust is a real issue for requesters, as we could see in the panels and on Reddit. See thread
  • Workers are very sensitive to payment issues, both concerning the transparency of the process and the speed of processing. (1, 2, 3)
  • Workers would appreciate standardized guidelines, accepted by all requesters, that they can rely on when working on the same category of tasks
  • They also wish the platform provided performance indicators about themselves and key indicators about requesters
  • Furthermore, they value clear and reliable information about completion time, both before starting and while fulfilling a task
  • Much emphasis is put on the fact that workers want to spend as little time as possible on non-value-creating activities before and during the completion of a task
  • In order to be even more flexible about their working location, and to avoid wasting time finding out whether a task is doable in a mobile phone browser, workers would appreciate a well-designed app for HIT completion. There are already some platforms offering mobile work, but these tend to be linked to location-based tasks, i.e. mobility is the core of the business idea rather than a supplement
  • For requesters new to crowdsourcing, it is difficult to determine an appropriate price for their tasks, which is why they ask for advice in forums.
  • There are strong, differing opinions within the community on what counts as fair pay, or whether there even has to be a standard.

Synthesize the Needs You Found

List out your most salient and interesting needs for workers, and for requesters. Please back up each one with evidence: at least one observation, and ideally an interpretation as well.

Worker Needs

  • Workers need a way to store their personal information, because they currently re-enter it in different tasks quite often. If there were a profile from which a task could take that information automatically, workers would save time and be able to focus on the real tasks.
  • Workers need to retain more rights to their intellectual property. Right now, requesters can use their work even in case of rejection.
  • Workers need to connect more with each other. For now, they can only get information about requesters or help each other on external platforms.
  • Workers need to be connected to the requesters. Until now they can only contact them via email, and no answer is enforced.
  • Workers need to be incentivized by more than money. Although they often state money as their primary motivation, they also want appreciation and feedback on their work.
  • Workers need to be taken seriously. In this thread, some are complaining about opaque qualification schemes, as well as about the platform's customer support not thoroughly investigating their task- or account-related technical problems. Their anger stems less from the technical problems themselves than from being ignored and dismissed by the platform's support team.
  • Workers need to be able to rely on timely payment. They are frustrated if they do not get paid on time and do not know the reasons for the delay. Some are even willing to trade off money for faster processing.

Requester Needs

  • Requesters need to trust the identity and qualifications of workers. Right now Mechanical Turk doesn't give the requester any way to confirm information about a worker (such as where they live, their age, gender, and so on).
  • Requesters need to select specific demographic groups for some tasks. If they want to do so now, they need to run some type of pre-survey. (It would be helpful to be able to specify exactly what kind of worker they need and segment workers the way systems like Google AdWords allow; see the sketch after this list.)
  • Requesters need to be cost-effective. At present they sometimes even need to hire full-time programmers to create tasks for AMT.
  • Requesters need the ability to modify tasks. Currently they have to create a new task even if they only want to change one parameter.
  • Requesters need to be good communicators: they should be able to communicate what their task is and, if necessary, justify why their payment is sufficient for the task.
  • Requesters should be able to control the test environment to establish a baseline, for example requiring a specific browser to maintain a uniform view of the visible content, or a specific color calibration when reviewing design work.
  • Requesters should also be allowed to set up collaborative work environments, so that multiple Turkers can work in sync on a task whose subtasks are linked to each other.
  • Requesters should also be able to provide a space where workers can interact under certain guidelines to share information and work more efficiently.
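
How far does the current API actually get a requester toward demographic targeting? A minimal sketch using boto3's MTurk client: the built-in Locale qualification can restrict a HIT by country, but age or gender targeting would still require a pre-survey. The title, reward, and survey URL below are placeholders:

    # Sketch: restrict a HIT to US-based workers via the built-in
    # Worker_Locale qualification. This is the closest MTurk gets to
    # demographic targeting today; finer segmentation needs a pre-survey.
    import boto3

    mturk = boto3.client(
        "mturk",
        region_name="us-east-1",
        # Sandbox endpoint; drop this argument for the live marketplace.
        endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
    )

    EXTERNAL_QUESTION = """
    <ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
      <ExternalURL>https://example.com/survey</ExternalURL>
      <FrameHeight>600</FrameHeight>
    </ExternalQuestion>
    """

    hit = mturk.create_hit(
        Title="10-minute survey (US residents only)",  # placeholder
        Description="Short academic survey",
        Keywords="survey, research",
        Reward="1.50",                                 # placeholder, in USD
        MaxAssignments=50,
        AssignmentDurationInSeconds=30 * 60,
        LifetimeInSeconds=3 * 24 * 60 * 60,
        Question=EXTERNAL_QUESTION,
        QualificationRequirements=[{
            "QualificationTypeId": "00000000000000000071",  # Worker_Locale
            "Comparator": "EqualTo",
            "LocaleValues": [{"Country": "US"}],
            "ActionsGuarded": "DiscoverPreviewAndAccept",
        }],
    )
    print("HIT created:", hit["HIT"]["HITId"])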