Winter Milestone 1 stormsurfer


Experience the life of a worker on Mechanical Turk

Reflect on your experience as a worker on Mechanical Turk. What did you like? What did you dislike? If you're from outside the USA and unable to access MTurk, try the worker sandbox, or other sites like CrowdFlower or Microworkers or Clickworker.

As a high schooler, I wasn't able to access Mechanical Turk. (Registration requires an Amazon Payments account, which I cannot create as an individual younger than 18.) I instead tried the worker sandbox, so I'll reflect on my experience with that.

Task

For the worker sandbox, I completed 20 HITs at a rate of $0.05 each for "functional norming." The directions read: "Consider the image below. Please indicate which of the activities listed on the right could be done in the environment depicted in the image from the location where the photo was taken. If you are unsure about an action, or feel that the activity would be unusual in this environment, do not check the box. Click on any action to show examples." The full task can be found here.

[Screenshot: MTurk worker details]

Positive qualities

  • Easy to find tasks: It was extremely simple to find a task. I could search for tasks and filter the results to show only HITs that are available to me and pay above a certain amount.
  • Easy money: At least for the task I completed, the ratio of money earned to time spent was quite high. I could easily earn above minimum wage within an hour, and for individuals in third-world countries where the cost of living is much lower, this is a lot of money! (The cost of accessing the Internet, however, may be higher.) One caveat, however, is that all of my HITs are still pending; it is possible that I might have to redo all of them.
  • Specific tasks: The task was quite specific and clear, and I understood what I was supposed to do.

Negative qualities

  • No feedback: Did I do the task right? Partially right? Completely wrong? There is no way for me to know! Even after two hours, my submissions (all 20) are still pending. I completed 20 HITs for the same task, and it is possible that I will receive $0.00 for all of my work if it is deemed incorrect. If I had received feedback after my first 1-2 HITs, I could easily have improved and completed the task better the next few times. I understand that the requester may not currently be online to approve my submissions or may be swamped with reviewing other submissions, but an easy fix would be to provide 1-2 "sample" HITs. Similar to sample questions/answers on a standardized test, the requester could give sample responses to 1-2 HITs to give the worker an idea of what the requester is looking for.
  • Vague expectations: I was unsure whether most actions/activities could be performed in the environment. Again, related to the last point, if I received immediate feedback on a few sample HITs, I would get a better idea of how well an activity should fit the environment.
  • Completion time unclear: What is the estimated time to complete a task? Only the allotted time is shown, and it says little; for example, for the above task I was allotted 60 minutes, which is far more than enough time to complete it. This is the case for most tasks, so it is difficult for me to analyze which task is most worthwhile to complete. As a worker, I want to maximize the amount of money I can make in a period of time. A helpful feature would be to show, on average, how long other workers took to complete the same task and give a money earned/time estimate (e.g. dollars/hour) for each task; a sketch of such an estimate follows this list. This would create competition among requesters (for the highest dollars/hour ratio) and ensure fair wages for workers.
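
To make the suggestion above concrete, here is a minimal sketch (in Python) of how a dollars/hour estimate could be computed from task history. The function, its inputs, and the numbers are hypothetical; MTurk does not currently expose previous workers' completion times.

    from statistics import median

    def estimated_hourly_rate(reward_usd, completion_seconds):
        """Estimate dollars/hour for a task from its history.

        reward_usd: payment per HIT, e.g. 0.05
        completion_seconds: how long previous workers took, in seconds
        """
        if not completion_seconds:
            return None  # no history yet; nothing to estimate
        typical = median(completion_seconds)  # robust to a few slow outliers
        return reward_usd / typical * 3600  # convert $/second to $/hour

    # Example: a $0.05 HIT that past workers finished in roughly 30-45 seconds
    print(estimated_hourly_rate(0.05, [30, 42, 35, 38, 29]))  # ~5.14 dollars/hour

Showing this figure next to each task would let workers compare tasks at a glance, creating exactly the competitive pressure on requesters described above.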

Conclusion

The takeaway: MTurk definitely has some really great features, including the ability to easily search for and find tasks and the specificity of each task. I can easily earn money, and in third-world countries where the cost of living is much lower, this could be a lot! However, there are two essential features that I feel must be added: a feedback loop between the requester and the worker, and an estimated "salary" for completing each task. The former would benefit both parties, decreasing the time the requester waits for a task to be completed and increasing the worker's "salary" over a period of time. The latter would create competition among requesters and ensure fair wages for workers.

Experience the life of a requester on Mechanical Turk

Reflect on your experience as a requester on Mechanical Turk. What did you like? What did you dislike? Also attach the CSV file generated when you download the HIT results. If you're from outside the USA and unable to access MTurk, you can try the MTurk requester sandbox, or CrowdFlower or Microworkers or Clickworker.

Unfortunately, because I was using the worker sandbox, no worker ever completed my task. However, I'll reflect on my experience with creating the task itself.

Task

Write a short summary of the Wikipedia page

[Screenshot: MTurk requester details]

Positive qualities

  • Easy to create task: It was extremely simple to create a task. I could easily start a new project, and Mechanical Turk gave me several templates to choose from. These templates had pre-filled values and recommendations, and I could easily edit the values to fit my needs. I could set the criteria that a worker needed to meet to be able to do a specific task and could specify the title, description, and tags so that a worker could easily find my task.
  • Easy to manage: The manage view is very simple, and I can see how many assignments in a batch are completed. The dashboard gives me several vital pieces of data, including estimated completion time, effective hourly rate, and batch progress, and divides the batches into three sections: batches in progress, batches ready for review, and batches already reviewed. Once a batch is submitted, I can also easily edit the task if, after receiving some results, I realize that I was not specific enough and need to change the directions. Daemo's prototype task feature is a more effective alternative approach to solving this problem.

Negative qualities

  • Potential for abuse: I might be wrong because I never actually received any results for my task, but Mechanical Turk seems really easy to abuse from a requester's standpoint. I can reject a worker's work so that I don't have to pay him/her while still using that work. In other words, I can download his/her results for the task and then reject them on the basis that they were not good enough, so that I would not need to pay.

Conclusion

The takeaway: MTurk, in my opinion, gives too much power to the requester. I can set unfair wages (and workers can get lost in determining which tasks are best for them in terms of wages), and I can easily reject work when it is actually satisfactory. I feel that it would be hard for the worker to trust the requester on Mechanical Turk and similar platforms.

Survey of workers

I'd be interested to see whether somebody with access to the non-sandbox version of Mechanical Turk could create a task on MTurk that surveys workers to get feedback from them about the platform. I'm not sure if Amazon allows this, but if it does, it would be an efficient way to gather workers' thoughts to improve Daemo.

Readings

MobileWorks

  • What do you like about the system/what are its strengths? MobileWorks presents a lightweight application that is extremely useful in third-world countries, especially India, where users may not have access to a desktop or high bandwidth. This is something that should be kept in mind when creating Daemo: pages that are too "flashy" and data-heavy will be slow to load in countries like India where connections may be slow. Furthermore, MobileWorks is quite specific about what users can accomplish (human optical character recognition), and because the work is done on a small device, it presents a quite innovative solution in which the text is split into multiple segments. I also really liked how each piece of text is verified by at least one other user, which is important! (A sketch of this verification idea follows this list.)
  • What do you think can be improved about the system? MobileWorks could potentially expand to uses beyond human OCR and support other languages (e.g. Hindi, Chinese, etc.).
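
As a minimal sketch of the verification idea mentioned above: a segment's transcription could be accepted only once enough workers agree on it. This illustrates redundant agreement in general, not MobileWorks' actual implementation; the function and the agreement threshold are assumptions.

    from collections import Counter

    def verify_segment(transcriptions, min_agreement=2):
        """Accept the most common transcription of a text segment
        only if enough independent workers agree on it."""
        if not transcriptions:
            return None
        text, count = Counter(transcriptions).most_common(1)[0]
        return text if count >= min_agreement else None

    print(verify_segment(["cat", "cat", "bat"]))  # "cat" -- two workers agree
    print(verify_segment(["cat", "bat"]))         # None -- needs another worker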

Daemo

  • What do you like about the system/what are its strengths? Daemo focuses on requester quality, which is something not usually enforced by other platforms (including Mechanical Turk). Boomerang would build trust between workers and requesters, incentivizing requesters to provide high-quality instructions and fair pay and workers to deliver high-quality work efficiently. Prototype tasks are also especially useful because they allow the requester to refine his/her instructions, ensuring that each task can be done more efficiently. Requesters get their work done faster, and workers get their pay more quickly.
  • What do you think can be improved about the system? Even though instructions may become more specific through the prototype task feature, the quality of work the requester is looking for can still often be vague. For example, it is common for different teachers to have different grading styles; even for the same free-response question, each teacher will take a slightly different approach to determining which answers are better and which are worse. The same can be said of requesters and worker output. To make sure the worker knows what the requester is looking for, the requester should provide 1-2 sample responses to sample tasks (similar to sample questions/answers on a standardized test). Creating this baseline would help calibrate workers and requesters. Furthermore, to ensure fair wages for workers, there needs to be competition among requesters. Tasks should show an estimated wage based on task history (i.e. money earned/time for previous workers on that task) so that workers can quickly scan and determine which task will earn them the most money in the smallest period of time. This would drive the price of tasks up to a fair amount.

Flash Teams

  • What do you like about the system/what are its strengths? Using a flash team is an effective and efficient way to carry out a project or build a product. It is easy to find specialized workers willing to do a subset of a large task, and the small pieces of the project can easily be moved around. Foundry is effective because it allows a requester to simply specify an input and an output, and the workers take care of the specialized parts. Foundry reminds me of Agile, which allows employees in a work environment to organize themselves to achieve more in less time.
  • What do you think can be improved about the system? It seems that a requester still needs to do a large amount of planning to organize the workers, and I don't think the time the requester spent planning and managing the workers is included in the times reported by the paper. On the other hand, for projects not using Foundry, teams have to manage themselves, and this time is included in the project completion time. As a result, I feel that the completion times reported for Foundry are slightly deflated when compared to completion times for projects not using Foundry. Furthermore, it seems that the requester must, to an extent, have experience managing members of a team. It also seems that the requester has limited control over the direction of the project and that the workers are given too much leeway in deciding where the project is headed.

Milestone contributors

Slack usernames of all who helped create this wiki page submission: @shreygupta98