Milestone 1 taskforce

Experience the life of a Worker on Mechanical Turk

We experienced being a worker on three different platforms: AMT (in its Sandbox version), CrowdFlower and Clickworker. AMT and Clickworker are both microtask labor marketplaces, in which crowd workers register to accomplish simple tasks and receive a small amount of money in return. CrowdFlower is a microtask crowdsourcing platform that acts as an intermediary between requesters and marketplaces. CrowdFlower tasks are published in different marketplaces, including, for example, Clickworker. For each of these platforms, we list the positive and negative aspects that we identified.

Amazon Mechanical Turk - Sandbox

Positive aspects

  • It is easy to navigate through the tasks.
  • They provide clear instructions.
  • Workers have clear information about their progress within the task.

Negative aspects

  • As sandbox workers, we did not get paid.


CrowdFlower

Positive aspects

  • Monetary rewards are attained on task completion.
  • Tasks without pre-screening/expertise requirements are available.
  • By moving up the levels, workers get access to more tasks. Additionally, workers get information on their progress.

Negative aspects

  • Few tasks are available to work on as a new worker.
  • 3rd party channels have to be used to receive work and rewards.
  • Some tasks are highly time-consuming and pay very little.
  • Tasks don’t always take the approximate amount of time that is stated (some take longer).
  • Tasks must be completed within a time limit set by requesters (some workers might find tight time pressure annoying).


Clickworker

Positive aspects

  • Workers can earn a referral bonus by recruiting other workers: the recruiter earns 5.00 EUR once the invited person has earned 10.00 EUR.
  • Workers can specify their interests in different topics and indicate whether these are hobbies or know-how. Workers may specify their profession, provide a work example, and give information about their skills and the languages they speak. The goal of filling in such a profile is to get a personalized list of available tasks.

Negative aspects

  • The presentation (UI) of available jobs is poor. The worker can only see 3 jobs at once. This gives the impression that it is not a crowded marketplace.
  • For a worker who usually works on AMT or CrowdFlower, it is not completely clear whether the reward shown for a task is for the complete batch of tasks or for an individual microtask.
  • Workers are not allowed to modify some information about native languages that is provided by default.
  • The list of skills to select from covers only typewriting and translation, which is very limited.
  • After selecting the skills and other parts of the profile, we could not work on any task (i.e. no task was displayed as available).

Experience the life of a Requester on Mechanical Turk

We experienced being a requester on CrowdFlower. We list the positive and negative aspects that we identified and provide the CSV file [1] for a job published on CrowdFlower. For privacy reasons, we removed the information about the workers who contributed to the microtasks.


CrowdFlower

Positive aspects

  • CrowdFlower offers the possibility to create jobs using templates for common types of tasks such as sentiment analysis, surveys, search relevance assessment and content categorization.
  • Test questions (aka gold units) can be created to 1) instruct workers on how they should accomplish the task and 2) measure the accuracy of workers. Test questions are created as example cases of the microtasks, for which the requester provides the answer. Moreover, test questions have an open-ended text field in which crowd workers can give feedback to requesters. Based on this feedback, requesters may update the answers of the test questions, and workers are then paid retroactively.
  • Requesters may adjust many parameters of their microtasks (e.g. number of units per page, number of assignments and workers per unit etc.).
  • CrowdFlower publishes the microtasks using a large set of channels (i.e. marketplaces).
  • It is possible to select language crowds (German- or French-speaking crowds). However, CrowdFlower does not disclose how language proficiency is assessed.
  • Requesters can choose between a high-quality workforce (level 3) and a high-speed workforce (level 1).
  • CrowdFlower provides more and more tips on how to design microtasks (via their Web site and in dialogs). Their API and requester UI documentation has also grown considerably in the last 1.5 years. Several success stories have been published as blog posts.
  • There is a CrowdFlower community on Twitter. CrowdFlower tries to motivate people and make them feel good (e.g. they ran the so-called postcard challenge).
  • The platform offers several aggregation modes.
  • The results are provided in different reports, which may be downloaded as CSV files (all responses, aggregated responses, contributors, etc.).
  • It is possible to set webhooks and notifications (although previous versions of the free platform had more notification features than the current one).
  • Although they changed their billing system, they still provide access to a free version of the platform for academic researchers. They also support the GitHub student developer pack.
  • It is possible to organize jobs in projects, and provide team-based access.
  • They provide an open data library (“Data for everyone”) to let other people reuse the crowdsourced data.
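
The test-question and aggregation points above can be illustrated with a small sketch. It is not CrowdFlower's actual implementation: the function names, the data layout and the 0.7 trust threshold are assumptions made for illustration, and majority voting is only one possible aggregation mode.

```python
from collections import Counter


def worker_accuracy(judgments, gold):
    """Fraction of a worker's test-question answers that match the requester's answers."""
    scored = [(unit, answer) for unit, answer in judgments.items() if unit in gold]
    if not scored:
        return None  # the worker has not answered any test question yet
    correct = sum(1 for unit, answer in scored if gold[unit] == answer)
    return correct / len(scored)


def aggregate_majority(all_judgments, gold, min_accuracy=0.7):
    """Majority vote per regular unit, counting only sufficiently accurate workers.

    min_accuracy is a hypothetical threshold, not a documented platform default.
    """
    votes = {}
    for worker, judgments in all_judgments.items():
        accuracy = worker_accuracy(judgments, gold)
        if accuracy is None or accuracy < min_accuracy:
            continue  # skip untrusted contributions
        for unit, answer in judgments.items():
            if unit not in gold:  # test questions are not part of the result
                votes.setdefault(unit, Counter())[answer] += 1
    return {unit: counts.most_common(1)[0][0] for unit, counts in votes.items()}


if __name__ == "__main__":
    gold = {"g1": "positive", "g2": "negative"}
    all_judgments = {
        "w1": {"g1": "positive", "g2": "negative", "u1": "positive"},
        "w2": {"g1": "positive", "g2": "negative", "u1": "positive"},
        "w3": {"g1": "negative", "g2": "positive", "u1": "negative"},
    }
    print(aggregate_majority(all_judgments, gold))
```

In this sketch a worker who fails the test questions (w3) is simply ignored during aggregation. A real platform might instead weight answers by accuracy, which is why the threshold is exposed as a parameter.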

Negative aspects

  • CrowdFlower does not support AMT anymore.
  • Microtasks are designed using CML - their own markup language (which is easy to learn, but is not standard).
  • With their new billing system, the requester is given an estimate of the microtask cost. However, this price may increase while the microtasks are running (e.g. when many untrusted contributions are received due to an error in the design of the microtasks).
  • There are some features of the jobs which may only be updated via the API (e.g. after_gold option).
  • It is not possible to contact the workers that worked for the microtasks.
  • It is not possible to connect two (or more) jobs within the platform.
  • The requester has little control over how the microtasks are distributed across the selected channels.
  • Integration with, e.g., JavaScript does not always work reliably.
  • The communication that requesters can have with contributors is very limited, unless the requester implements something in the UI of the microtask.
  • It is no longer possible to create your own qualification tests (i.e. containing things that are not test questions).
  • It is not possible to pay different rewards for different units.
  • It is not possible to implement task assignment within the platform - it needs to be done at the microtask level.
  • Test questions have the double goal of testing accuracy and training. It is not possible to separate these two purposes.
  • Worker-worker collaboration is very difficult to enable.
  • It is not possible to create test questions in a survey.

Explore alternative crowd-labor markets

We analyzed two non-microtask crowdsourcing platforms: oDesk and Elance. The main differences between these platforms and the previous ones are 1) the size and difficulty of the tasks published and 2) the interaction that takes place between requesters and workers. For example, Elance serves freelancers in the domains of IT & Programming, Design & Multimedia, Writing & Translation, Sales & Marketing, Admin Support, Finance & Management, and Engineering & Manufacturing. Both individuals and companies may perform jobs, and requesters can choose either. We list the positive and negative aspects that we identified in these platforms.


oDesk

Positive aspects

  • As a requester it is possible to browse experts.
  • There is not only keyword search, but also skills-based search.
  • Jobs are recommended to workers based on their profile.
  • The platform offers a jobs feed to workers.
  • Jobs are classified in categories. There is a large list of categories from which workers should select 10.
  • There is an option to see your own profile “as others see it” (as on Facebook).
  • oDesk Readiness tests are qualification tests intended to ensure that workers are knowledgeable.
  • The platform offers the possibility to exchange messages.
  • It is possible to link your own accounts on LinkedIn, GitHub, StackOverflow, Facebook, Google+ and Twitter to tell more about yourself.
  • It is possible to create teams of workers.
  • There are several kinds of notifications (e.g. job offer has been updated).
  • They offer a downloadable time tracker.

Negative aspects

  • The work must be done outside the platform. oDesk only publishes job offers. This might lead to a less controlled environment or to overhead for the requester in terms of quality assurance.
  • It requires more effort to review work done.

Differences with microtask crowdsourcing

  • The task itself is not posted. What is posted is an advertisement for the job.
  • The reward is much higher than in microtask crowdsourcing, but the work is also much more complex, larger and more time-consuming.
  • There is a much richer matchmaking.
  • Workers apply for work and then they get accepted / rejected.

More detailed comparison between oDesk and CrowdFlower

A stark difference between oDesk and CrowdFlower, from a requester’s point of view, is the notion of ‘expertise’. While CrowdFlower currently has no provisions for directly accessing the expertise or background information of crowd workers, oDesk helps requesters identify the most suitable ‘freelancers’ according to their skill set. Since a requester on CrowdFlower can filter workers only by geography, by the third-party channels they use, or by their platform-specific ‘level’, the requester lacks the transparency that would help attract the ideal workforce. On the other hand, a requester on oDesk can sift through several ‘categories’ to find the ideal ‘freelancers’. From a worker’s point of view, there are few restrictions with respect to knowledge or skills for taking on tasks on CrowdFlower. In contrast, freelancers on oDesk are chosen by requesters based on the skills they present and are paid accordingly.

Another glaring difference in how these two platforms operate is the contribution of individual workers towards a solution. While CrowdFlower relies on the crowdsourcing paradigm, where small contributions from a number of workers are accumulated in order to solve a problem at a higher level (tapping into the ‘Wisdom of the Crowd’), oDesk workers act more as individuals. Often the solutions that requesters seek on oDesk are attainable by hiring either a single ‘freelancer’ or a small ‘team’ that contributes heavily to completing the task at hand.

oDesk protects requesters with a ‘money-back’ stipulation that allows dissatisfied requesters to reject the work of particular freelancers without any monetary loss. While CrowdFlower provides a requester with several means of control (such as gold standard/test questions, crowd/channel filters, and so forth), the monetary rewards that workers receive cannot be returned to a dissatisfied requester.

On the whole, CrowdFlower lacks the transparency that oDesk offers with respect to its workforce. Less complex tasks are more cost-effective on CrowdFlower.


Elance

Positive aspects

  • The platform provides records about freelancers, such as country, skills, feedback based on an averaged 5-star rating, number of reviews, and hourly rate.
  • From our experience, the workers are very qualified and committed to completing the tasks.

Negative aspects

  • The payment is generally high. It might start at $7-8 for very basic work done by workers from specific countries and soar to $50-70 per hour for highly skilled jobs.
  • The system is not designed to scale. Interaction with workers spans multiple milestones and requires the requester to be constantly involved.



MobileWorks

MobileWorks is a crowdsourcing platform designed to be used on mobile phones. The platform specifically targets workers in developing countries, who might not have access to computers and might have limited knowledge of English. For this reason, its tasks are OCR tasks: given an excerpt of a scanned text document, workers type the characters they identify in the image.

Positive aspects

  • The platform focuses on workers in developing countries.
  • Questions are given to 2 workers until answers match.
  • A worker’s historical accuracy is used to determine the future payment for tasks given to that worker.
  • Workers can see in real time the amount of money they are earning.
  • Workers who tested it found it usable and recommended it to family and friends.
  • According to the 2011 paper, MobileWorks plans to extend the approach to audio transcription, language translation, and language subtitling.
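
The agreement rule mentioned above (a question goes to two workers until their answers match) could be sketched as follows. This is an illustrative reading of that one sentence, not the MobileWorks implementation: the function name, the pairwise walk through the answers, the whitespace normalization and the `max_pairs` cap are all assumptions.

```python
def first_agreed_answer(answers, max_pairs=5):
    """Return the first answer on which a pair of workers agrees, or None.

    `answers` is the sequence of typed answers in arrival order; workers are
    consumed two at a time, and at most `max_pairs` pairs are consulted.
    """
    last_start = min(len(answers) - 1, 2 * max_pairs)
    for i in range(0, last_start, 2):
        first = answers[i].strip()
        second = answers[i + 1].strip()
        if first == second:
            return first  # both workers typed the same text: accept it
    return None  # no pair agreed; the question would need more workers


if __name__ == "__main__":
    # The first pair disagrees ("hel1o" vs "hello"), the second pair agrees.
    print(first_agreed_answer(["hel1o", "hello", "hello", "hello "]))
```

Under this reading the cost of a question grows with its difficulty, since ambiguous scans consume more worker pairs before two answers match.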

Negative aspects

  • (Only) OCR tasks are available.
  • The small size of the mobile screen limits the things to be shown to the user.


mClerk

The paper presents a paid crowdsourcing platform aimed at low-income workers in developing countries. The platform supports image-based tasks that are distributed to low-end mobile phones. Worker outputs are sent via SMS, making the process affordable and easy to access. The system specializes in the digitization of local-language text: workers transliterate the image of a word using an English keyboard. As a reward, users receive mobile phone balance.

Positive aspects

  • The proposed platform seems to exploit the opportunity of using workers’ free time slots. Streaming crowd work to mobile devices increases workers’ flexibility and, therefore, their willingness to do the work.
  • The platform was not perceived as a part-time job but as another offer from the mobile company. Workers therefore saw themselves as taking advantage of a special offer rather than doing a job. Even though this assumption was not properly validated, it might be interesting to explore whether paying workers in service balance increases their willingness to do the work.
  • mClerk appeals to low-income workers in developing countries who are usually unable to participate in crowdsourcing markets, either because of limited Internet connectivity or because of a limited set of skills and command of English.

Negative aspects

  • While the proposed system is supposed to increase social welfare, it does not teach workers new skills. It would be beneficial to consider learning-by-doing systems.
  • No mechanism to prevent cheating was employed. The system is based on agreement; however, low-quality outputs are not reflected in the monetary rewards received by crowd workers.

Flash Teams

Positive aspects

  • Crowdsourcing complex work (such as creating animated films, developing courses and design prototypes, etc.) by relying on team structures.
  • Leverages the expertise across different domains, easily accessible through oDesk.
  • Teams are generated in a dynamic fashion and can either grow or shrink as per the task requirements.
  • By using Flash Teams tasks can be accomplished at faster rates than traditional approaches.

Negative aspects

  • Although the approach seems scalable and quite efficient in achieving objectives, a limiting constraint is the availability of ‘adequate experts’. While oDesk provides a wide range of experts from different domains, Flash Teams depend heavily on this supply.
  • Errors in the work generated in initial ‘blocks’ can create potential obstacles in the working of dependent ‘blocks’.
  • Delays can occur when teams do not function according to plans or if required experts are not available within the timeframe.

Other related publications

We did not work on further publications.

Our initial wishlist for a microtask crowdsourcing platform

  • Expert and non-expert workers should peacefully co-exist in the labor market. There is a demand for both sorts of work (crowd wisdom vs. experts) and an ideal crowdsourcing platform would support this.
  • There should be a recommendation to match workers and tasks (in either a worker to task manner, or a task to worker manner) in order to fulfill expectations of both requesters and workers. However, this does not need to be done in a restrictive way. For instance, it might be worthwhile to weight and prime knowledgeable workers, but at the same time allow other workers (who are interested) to do the work as well.
  • Crowdsourcing agents should be able to have CVs in microtask (and macrotask) crowdsourcing to represent themselves and their work across platforms. This would encourage recognition for (micro / macro) crowd work.
  • It would be useful to use machine-readable metadata in crowdsourcing platforms for various purposes (e.g. data integration and annotation of crowdsourced data). This is a topic that is being discussed by the CrowdSem [2] community.
  • The terminology used in different marketplaces should be homogenized. AMT talks about HITs, while CrowdFlower refers to jobs and pages. This might be confusing for some workers and requesters.
  • Microtask crowdsourcing platforms should adopt many of the features included in macrotask crowdsourcing platforms (e.g. jobs feed).
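
The non-restrictive matching wished for above might look like the following sketch. It ranks tasks by how well a worker's skills cover the task's requirements, but never removes a task from the list, so interested non-experts can still pick it. The task layout (`id`, `skills`) and the coverage score are hypothetical choices for illustration, not a feature of any of the platforms discussed.

```python
def rank_tasks(worker_skills, tasks):
    """Order tasks by skill coverage for one worker without filtering any out."""
    skills = set(worker_skills)

    def coverage(task):
        required = set(task["skills"])
        if not required:
            return 0.0  # tasks without requirements get a neutral score
        return len(required & skills) / len(required)

    # sorted() is stable, so tasks with equal coverage keep their original order.
    return sorted(tasks, key=coverage, reverse=True)


if __name__ == "__main__":
    tasks = [
        {"id": "t1", "skills": ["translation"]},
        {"id": "t2", "skills": ["python", "sql"]},
        {"id": "t3", "skills": []},
    ]
    for task in rank_tasks({"python"}, tasks):
        print(task["id"])
```

The same scoring could be applied in the other direction (ranking workers for a task), which corresponds to the task-to-worker manner mentioned in the wishlist.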