Milestone 2 padawans
- 1 Observations Gathered During Panel 1 Meeting
- 2 Reading Others' Insights
- 3 Needfinding by Browsing MTurk-related forums and Reddit
Observations Gathered During Panel 1 Meeting
We attended the Panel 1 meeting held on March 9th, 2015 at 10 pm IST. The following are a few of our observations:
Observation 1: The requesters have full responsibility for deciding the cost of all the HITs. This means that they can keep it as low as they wish.
Interpretation: A large number of task types run on MTurk, and it is practically impossible to set a standard price for each. Hence, the requester is given full authority.
Need: AMT can build separate categories to segregate all the tasks and specify a range for the cost.
Observation 2: It is very difficult for new workers to grow. In order to get a decent reputation, they first need to work with requesters who either pay very low wages or reject work unfairly.
Interpretation: The good, high-paying requesters would rather go with workers who already have a good reputation in order to ensure quality of work.
Need: New workers need a way to find good work early on, so that doing decent work earns them a good reputation.
Observation 3: Workers can contact requesters through email and forums, but the requesters may not even reply to them.
Interpretation: Requesters do not have that kind of time/funds to reply to each of the queries on email or forum.
Need: AMT should introduce a few categories for the reasons of rejection to make it easy for the requesters and also, give some idea to the worker. Basically, the requester will just have to choose one of the options that he/she thinks is the reason for rejection.
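The idea of a fixed list of rejection reasons can be sketched as a small data structure. This is only an illustration: the category names, the `RejectionReason` and `Rejection` types, and the `reject` function are all hypothetical and do not correspond to anything AMT actually offers.

```python
from dataclasses import dataclass
from enum import Enum


class RejectionReason(Enum):
    # Hypothetical categories; the panel did not name specific ones.
    INSTRUCTIONS_NOT_FOLLOWED = "Instructions were not followed"
    INCOMPLETE = "Submission was incomplete"
    FAILED_ATTENTION_CHECK = "An attention check was failed"
    OTHER = "Other"


@dataclass(frozen=True)
class Rejection:
    assignment_id: str
    reason: RejectionReason

    def message_for_worker(self) -> str:
        # The worker always sees which predefined reason was chosen.
        return f"Assignment {self.assignment_id} was rejected: {self.reason.value}."


def reject(assignment_id: str, reason: RejectionReason) -> Rejection:
    # The requester must pick one of the predefined options; free text is not accepted.
    if not isinstance(reason, RejectionReason):
        raise TypeError("reason must be one of the predefined RejectionReason options")
    return Rejection(assignment_id, reason)
```

The requester's effort is reduced to picking one option, while the worker still receives a concrete explanation.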
Observation 4: The requester might be conducting a survey on a western population, but the turker may not be from that part of the world at all.
Interpretation: Many workers in countries like India use a proxy, “buy” MTurk accounts, or ask a US citizen to set one up for them.
Need: Amazon should introduce a mechanism to verify the nationality of the worker so that the data generated for the requesters is genuine.
Observation 5: In the case of surveys, it is difficult to find out whether the turker has devoted the right amount of time and has answered the questions genuinely.
Interpretation: Even timers give a very small amount of information.
Reading Others' Insights
The following are our observations on each of the five papers listed in Milestone 2. We have given the observations of raw behaviors and issues of workers and requesters.
1. Worker perspective: Being a Turker
It has been observed that the main reason why workers work is money.
Workers often face problems such as unfair rejection of their work, slow payment, low wages, lack of communication between requester and worker, requesters not paying, poorly designed requests, etc.
Judging from the functions of AMT, it is better suited to requesters than to workers because of information asymmetry and an imbalance of power: workers cannot rate requesters, while requesters can rate workers. Requesters usually have more information about the worker.
Workers also feel that tasks should be designed better.
AMT doesn’t provide any mechanism for direct communication between worker and requester. Often they interact through blogging sites.
It has been observed that the wages paid are quite low, even though people are content with them.
Workers are mostly interested in the following topics on blogging sites like TurkerNation: comparing themselves to others to gauge their potential and earning capability, and discussing how to increase their income to a certain target.
For some, AMT is the primary income, while for others it is a supplementary income.
For Indians, the wage is decent enough; for American workers, however, it is quite low.
Another instance that shows that AMT favors requesters over workers is that workers can be blocked by requesters on AMT, but not vice versa. Too many blocks lead to account suspension, and it requires a lot of effort to prove innocence.
It can be noticed on AMT forums that workers are often considered to be unworthy. They often complain about requesters labelling them as bots, spammers, etc.
Requesters often design their HITs badly, leading to wrong answers or spamming.
Workers can be broadly categorized into novices and experts. Novices tend to work for lower wages because they need to improve their approval rate.
Approval rate is very important, as a high rate opens up more requests with better pay.
2. Turkopticon: Interrupting Worker Invisibility in Amazon Mechanical Turk
AMT renders Turk workers invisible in the sense that they have very little say, as can be concluded from the points that follow.
The requesters have full intellectual property rights over the work submitted by workers. This can often lead to wage theft. The requesters may use the worker's submission and also reject it at the same time. The worker cannot do anything about it.
A worker who thinks that the requester has been unfair in rejecting their work can contact him/her, but a reply is not guaranteed. This is because, as the paper mentions, replying to everyone adds to the requester's cost.
Dissatisfied workers leave the system. They do not have any other option. AMT can afford this loss because of the large number of workers already there.
Workers, even in the US, are paid below minimum wage in many cases.
The survey conducted to find out more about the loopholes in the system was framed as a “Workers' Bill of Rights”. It was observed that workers gave short answers to open-ended questions but detailed answers to provocative questions.
In this survey, many wanted a quicker payment mechanism. It often happens that they do not receive their earnings within a fair amount of time.
When a requester rejects a worker's work, the worker's rating goes down, and AMT then hides the tasks that require a higher rating from that worker. Thus, such a worker is deprived of some work that they might be able to perform well.
Turkers come from different parts of the world and have different requirements and skills. For example, an Indian turker might find $10 for a job “generous”, while an American worker might think that it is not enough. Also, Indian turkers tend to be highly educated and face lower costs of living than Americans.
3. Requester perspective: Crowdsourcing User Studies with Mechanical Turk
In the first experiment, the task received a decent number of replies very quickly. However, the quality of the responses was not good. A large number of workers spent as little as a minute reviewing an article, which shows a lack of dedication and interest.
Experiment 1, which used AMT as a user measurement tool, was not successful, as a large percentage of the responses gathered were false. This led to their rejection.
The requesters had to spend a lot of time going through the responses to judge them and reject the bogus responses.
The design of the request affects the quality of response gathered from workers to a great extent. This was evident from experiment 2.
Requesters should design a few questions which can be concretely verified and which make it easy to segregate wrong responses. Such questions also signal to the turkers that their answers will be scrutinized, and thus warn them.
Requesters should also keep in mind that task completion should be as simple and as quick as possible, since turkers weigh their time against the money. Certain patterns observed among the responses indicate bogus answers, such as extremely short task durations and comments that are repeated verbatim across multiple tasks.
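The two warning signs mentioned above (very short task durations and comments repeated verbatim) could be checked roughly as follows. This is a minimal sketch and not the procedure used in the paper: the `Response` type, the `flag_suspicious` function and the 60-second threshold are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Response:
    worker_id: str
    duration_seconds: float
    comment: str


def flag_suspicious(responses, min_seconds=60.0):
    """Return, for each response, a list of reasons it looks bogus (empty if none).

    min_seconds is an assumed threshold, not a value taken from the paper.
    """
    # Compare comments after trimming surrounding whitespace only.
    comments = [r.comment.strip() for r in responses]
    counts = Counter(c for c in comments if c)
    flags = []
    for response, comment in zip(responses, comments):
        reasons = []
        if response.duration_seconds < min_seconds:
            reasons.append("too_fast")
        if comment and counts[comment] > 1:
            reasons.append("repeated_comment")
        flags.append(reasons)
    return flags
```

Flagged responses would still need a manual look, since a fast or terse answer is not necessarily a dishonest one.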
The experiments suggested that platforms like MTurk are well suited for user studies that combine both subjective and objective information.
MTurk does not provide any easy way to assess the ecological validity of an experiment.
Requesters need to understand what kind of work is well suited to micro-task markets and should then know how to design tasks effectively in order to get as few bad responses from turkers as possible.
4. The Need for Standardization in Crowdsourcing
In general, requesters would like to ensure that hired workers, after suitable training, can complete the tasks assigned to them easily. The training given to the workers should also be easy to replicate for new workers. This is required for maximum productivity and quality of service.
In crowdsourcing, it has been observed that requesters for most of the high-demand tasks (which are generally low skilled) require workers to follow the instructions for a particular standardized task properly. This is to make sure that the work is done in a proper manner since there is no way to make sure that the worker is qualified and has genuine knowledge.
A crowdsourcing market has been compared to an open bazaar where workers are free to come and go at will and to choose tasks based on their difficulty and skill requirements for different pay rates. Similarly, the requesters place work offers which are diverse in nature. Workers have a lot of choice, and each takes up whichever offer suits them.
Scammers are common in crowdsourcing markets. Scammers are requesters who recruit accomplices for malicious activities. Although the crowdsourcing market seems like a disorganized market, it has various plus points. Such a market provides both workers and requesters a lot of flexibility.
The current state of the crowdsourcing market is that there is no standardization. Because of this, the prices that requesters offer for tasks are uneven. It is also difficult to predict completion times and to ensure quality, and the way that workers can search for tasks is inadequate. Currently, each requester generates their own work requests, prices them independently and evaluates the answers separately from everyone else.
Requesters have to implement ‘the ideal practices expected from workers’ for each type of work. Requesters who have been in this business for long have learned from their mistakes and fixed their design problems, but new requesters learn the lesson of bad design the hard way.
Requesters need to price their work unit without knowing the conditions of the market and this price cannot fluctuate without removing and reposting the tasks.
5. A Plea to Amazon: Fix Mechanical Turk
Requesters face a lot of problems, such as scaling up, managing a complex API, managing execution time and ensuring quality.
The following need to be incorporated within the system:
A Better Interface To Post Tasks (Useful for Requesters)
Every requester has to do a lot of work on their own, which means that they might not be benefiting that much. Among other things, a requester has to:
(a) build a quality assurance system from scratch, (b) ensure proper allocation of qualifications, (c) learn to break tasks properly into a workflow, (d) stratify workers according to quality.
The system has a large number of small requesters and only a few big ones. This means that a large proportion of the requesters find it difficult to grow.
A Worker Reputation System (Useful for Requesters)
Better reputation profile for workers. Why? A market without a reputation mechanism turns quickly into a market for lemons: When requesters cannot differentiate easily good from bad workers, they tend to assume that every worker is bad. This results in good workers getting paid the same amount as the bad ones. With so low wages, good workers leave the market. At the end, the only Turkers that remain in the market are the bad ones (or the crazy good ones willing to work for the same payment as the bad workers.)
A better mechanism is required to help requesters judge the reputation of workers. If a requester cannot differentiate between the good and the bad turkers, then he/she will end up paying the same amount to both kinds. Hence, even the good ones are poorly paid. Eventually, these good workers leave the market.
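The "market for lemons" reasoning above can be illustrated with a toy simulation. It is a sketch under assumptions that are ours, not the blog post's: requesters pay every worker the same wage, set to the average quality of the workers still in the market, and a worker leaves once that wage falls below a fixed share (`reservation_share`) of what their own quality is worth. The function name and all parameters are hypothetical.

```python
def lemons_market(qualities, value_per_quality=1.0, reservation_share=0.8):
    """Toy model: a uniform wage drives the better workers out round by round.

    Returns the qualities of the workers who remain and a list of
    (wage, number_of_workers) pairs, one per round.
    """
    remaining = sorted(qualities)
    rounds = []
    while remaining:
        # Requesters cannot tell workers apart, so everyone gets the average.
        wage = value_per_quality * sum(remaining) / len(remaining)
        rounds.append((wage, len(remaining)))
        # A worker stays only if the wage covers their reservation wage.
        stay = [q for q in remaining
                if reservation_share * value_per_quality * q <= wage]
        if len(stay) == len(remaining):
            break  # nobody left this round, so the market is stable
        remaining = stay
    return remaining, rounds
```

With qualities 1 through 10 and the default parameters, the market shrinks from 10 workers to 6, 4, 3, 2 and finally 1, leaving only the lowest-quality worker. A reputation system that lets requesters pay by quality would break this spiral.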
A Requester Trustworthiness Guarantee (Useful for Workers)
Requesters are being treated as “slave masters”. They can reject work and yet use it. Even if they accept it, they need not make the payment on time.
Newcomers post big batches of HITs. Legitimate workers will do a little bit of work and then wait and see. Nobody wants to risk a mass rejection, which can be lethal for the reputation of the worker. Given the above, who are the workers who will be willing to work on HITs of the new, unproven requester? You guessed right: Spammers and inexperienced workers. Result? The requester gets low quality results, gets disappointed and wonders what went wrong. This should not happen. Workers should not be afraid.
Requesters new to the market may post big batches of HITs. Legitimate workers will do a fraction of the work and then observe the requester, because they do not want a mass rejection of their work, which could be fatal for their reputation. So, inexperienced workers and spammers do the work, and the requesters get poor-quality results. This should not happen. Workers should not have this kind of fear of rejection.
A Better Task Search Interface (Useful for Workers)
It is not possible for a worker to browse through the available tasks that might interest him/her or to search for a requester he/she might want to work with. As a result, the completion times of the tasks follow a power law, making it effectively impossible to predict the completion time of a posted task.
Needfinding by Browsing MTurk-related forums and Reddit
1. MTurk Forum
Observation: This is a centralized thread for new turkers. Most of the questions asked by them have easy solutions or relate to the use of the platform. For example,
1) One of them asked about the full form of TO, whose answer was Turkopticon.
2) A worker couldn't understand how to update the pending amount. The answer was pretty simple. He just had to click on it.
2. Reddit Forum
Observation 1: Workers are not happy about the rejection/approval system.
evd 1: black_tee said in her post “First thing I will say, is that I support the removal of approval/rejection system”
Observation 2: Turkers want proper mechanism to communicate with the requesters.
evd 2: black_tee said in her post “One thing that would be useful is an in-platform messaging system where it is easy to communicate with requesters”.
Observation 3: Turkers want mobile apps to allow them to work on HITS.
evd 3: studystack said “I think it would be great if there were an app for doing HITs from your cell phone”.
Observation 4: Turkers demand a system wherein they can rate requesters based on their HITs, payments and defaults. This may be a grading policy or a support team.
evd 4: yokogake said the following “A way in platform to rate the requestor. This would include things like rejection rate, user rating etc”. In reply to this, clickhappier said “A support team that actually investigates and acts on flagged/reported requesters in a reliable and timely manner”.
Observation 5: Requesters want features that allow them to increase the pay of a HIT.
evd 5: studystack said “As a requester, I'd also like a way to increase the pay for a particular HIT”. In his reply, clickhappier said “Sounds good, but only if such a feature would only allow pay increases. If pay decreases could suddenly stealthily happen.