In the first milestone, we explored and studied several existing crowdsourcing platforms, both hands-on and through papers. Through this experience, we got a basic sense of crowdsourcing platforms and their strengths and weaknesses. On this page, we reflect on our experiences using Amazon Mechanical Turk as both a worker and a requester. Later, we compare Mechanical Turk with an alternative crowd-labor platform. In the last part, we share our opinions on three platforms as presented in papers on crowdsourcing systems.
Experience the life of a Worker on Mechanical Turk
Amazon Mechanical Turk (AMT) is a good platform for gathering multiple workers from different locations to complete a tedious task together. Almost anyone can register an account on AMT and earn money by completing quick cognitive tasks.
However, the platform is not aesthetically pleasing. The login/registration page is text-heavy and looks messy, and it is not obvious that the "Get Result From Mturk Workers" button on the introduction page switches users to a Requester account. Once logged in, we found the page confusing. At first glance, each listing shows a task name, such as "Find the count of comments on a website," along with a link called "View a HIT in this group". We did not know the difference between these two links or what would load if we clicked either one, so we hesitated to click. We also did not know what "group" refers to. The search bar reads "Find (...) containing (...) that pay at least $(...)." We guessed that we should type keywords into it, but on reading the sentence carefully, we were not sure what belongs in the box after the word "containing." Moreover, tasks appeared in no particular order, with no way to sort them. Some tasks asked workers to "request qualification," and some asked workers to take a qualification test. Clicking "request qualification" only tells you to contact the requester, without explaining why, what the requirements are, or how to meet them. We never found a page explaining what a qualification means on AMT. Although tasks can be filtered with the checkbox "for which you are qualified," we remained unclear about which qualifications we had, what they refer to, and how to earn them. Once we accepted a task, we had to redo all the previous steps because we had forgotten to click "Accept HIT". We also did not know whether answers to a HIT are reviewed.
Because we lacked information about any review process, we entered random numbers to complete our task and earned quick money without being held responsible. As workers, we could also have asked a friend to help us complete the task, and the system would not know whether we finished it ourselves. Therefore a worker's qualification, which is graded on work quality, is doubtful. Furthermore, once we finished a task, we received no notification that the money was earned, and when the status showed pending, we had no idea how long it would stay in that state.
Experience the life of a Requester on Mechanical Turk
We used the Mechanical Turk requester sandbox to post a request for workers. The request involved sorting fashion elements into appropriate categories: tops, bottoms, footwear, one-piece, and accessories. We chose the categorization request type because it seemed to be the only one available when we first got started. Once we chose it, creating the actual task was relatively simple. Since the requester sandbox did not require payment, the request was published quickly. However, once it was published, it was extremely hard to get back to the request and view its status.
After we confirmed that the request was published, we searched for it under HITs but could not find it anywhere. Ten minutes later, we tried searching for "categories" under HITs and finally found it. We then asked a team member to try working on the request, but the system would not allow her to accept it. It seemed as though some qualification restriction had been applied to the request, which bewildered us because we had not set any qualifications at all.
We kept trying to get the categorization request to work without qualifications but could not figure it out, so unfortunately we have no results to show. We did learn, however, that Amazon takes a 10% commission on top of the reward amount that we set for workers.
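As a rough budgeting sanity check, the fee structure we observed can be sketched as follows (the 10% rate is what we saw in the sandbox, and the example numbers are our own illustrative assumptions, not official pricing):

```python
# Sketch: estimated requester cost for a batch of HITs, assuming the
# 10% commission we observed (actual MTurk fees may differ).

def estimated_cost(reward_per_assignment: float,
                   assignments_per_hit: int,
                   num_hits: int,
                   commission_rate: float = 0.10) -> float:
    """Total cost = worker rewards plus the commission charged on top."""
    rewards = reward_per_assignment * assignments_per_hit * num_hits
    return round(rewards * (1 + commission_rate), 2)

# e.g. $0.05 per assignment, 3 assignments per HIT, 100 HITs
print(estimated_cost(0.05, 3, 100))  # -> 16.5
```

In other words, a requester should plan for the posted reward times 1.1, which is easy to overlook when setting a batch budget.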
We then tried to create a data collection task. On the first page, it was easy to input the title and instructions for our task. On the second page, we were allowed to edit the layout of the task pages. This request type follows a strict format, with the instructions on top, a table in the middle, and an input box at the bottom, so we had to re-type the instructions. In addition, changing the format requires editing the page's source code, which is really difficult for people with no coding knowledge.
- Getting started to create a request wasn’t difficult.
- Actually creating the categories request was relatively simple.
- The process was quite straightforward.
- There was no clear indication of the type of request being created. The text on the button keeps changing, and it links to a different place every time you click it. Some links are orange and some are blue.
- It was not obvious that you could create a request that was not category based.
- Once the request was created, it was extremely hard to find.
- The layout of the data collection task is hard to change. The only effective way is to edit the source code, which is not convenient for people without coding knowledge.
- Managing requests was hard. Certain concepts such as batches and qualification types are not explained clearly.
Explore alternative crowd-labor markets
Compare and contrast the crowd-labor market you just explored (TaskRabbit/oDesk/GalaxyZoo) to Mechanical Turk.
MobileWorks is a mobile web-based crowdsourcing platform that lets people in developing countries participate in microtasks, such as human optical character recognition (OCR).
Likes & Strengths of The System:
- Sidesteps local constraints and exploits cheap mobile Internet: many people in India lack access to desktop computers and have limited English literacy, but mobile phone penetration is relatively high. Many phones are simple yet capable of browsing the Internet, and mobile data is very cheap – even people with a salary under $2 USD can afford it.
- Provides a livable wage for workers: workers are paid based on their previous performance, so as their accuracy increases, they earn a higher wage. Participants were able to complete about 120 tasks per hour using MobileWorks, so paying approximately 0.18 to 0.20 Indian Rupees per task matched their regular hourly wages. The authors also foresee that workers' speed will increase as they become more competent and familiar with the tasks, and that each task will take less time as data transfer rates improve, effectively raising workers' pay rates.
- Easy process: each task is divided into multiple small pieces for multiple workers. Workers can work at any location and at any time. Quality is controlled through multiple entry: every task is assigned to two different workers, and reassigned until two answers match. This process is easy for employers to manage and for workers to participate in.
- Satisfactory accuracy: workers averaged 120 tasks per hour, and overall accuracy was 99% using the multiple-entry scheme.
- Positive feedback: ten out of ten users rated the system higher than 4 out of 5 stars, and all of them were willing to recommend it to others.
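The multiple-entry scheme described above can be sketched as follows (the function name and simulated responses are our own illustrations; the paper does not spell out the exact algorithm):

```python
# Sketch of MobileWorks-style "multiple entry" quality control:
# a task keeps being handed to workers until two independent
# answers match, and that agreed answer is accepted.

from typing import Iterator

def multiple_entry(task_id: str, answers: Iterator[str]) -> str:
    """Collect answers one worker at a time; accept a value once
    two different workers have submitted the same answer."""
    counts: dict[str, int] = {}
    for answer in answers:
        counts[answer] = counts.get(answer, 0) + 1
        if counts[answer] == 2:  # two matching entries -> accept
            return answer
    raise RuntimeError(f"no agreement reached for task {task_id!r}")

# Simulated worker responses for one OCR word task: the second
# worker misreads the word, so a third worker is needed.
responses = iter(["cat", "cot", "cat"])
print(multiple_entry("word_001", responses))  # -> cat
```

This also makes the cost model clear: each task costs at least two worker payments, and more whenever workers disagree.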
Dislikes & Weaknesses of The System:
- According to the article, the system was only tested on human OCR tasks. Although the authors indicated that they want to study other types of work, the feasibility and results for those tasks are not yet clear. For example, if the system were used for audio transcription, the results might be unsatisfactory because workers could spend more time per task and earn less.
- Moreover, workers may lose interest halfway through a task once they find it hard to earn a sustainable wage. In that situation, requesters may struggle to find enough workers for their tasks.
- Furthermore, because anyone can participate at any location and time, task accuracy in a real working environment may be lower than in a pilot study; work quality can vary with location and time.
- Finally, the article does not mention any process for verifying workers' qualifications. Workers may be overconfident about their competence and believe they can complete a task, leaving employers to waste time and effort dealing with inferior results.
mClerk is a mobile application that was built specifically to target users in semi-urban areas of India and introduce them to crowdsourcing.
Likes & Strengths:
- The most interesting thing about the system was how it targeted the perfect users - i.e. people in semi-urban areas with strong social circles.
- We found the use of leaderboards to gamify the crowdsourcing process very fascinating. It was great to see how this encouraged users to work harder toward completing tasks.
- Another thing we found key in mClerk's success was its use of reminders to refresh a worker’s memory about their pending tasks.
- We found that dividing the project into two phases was a very smart move. We particularly liked how the team used bonuses in phase 2 to reveal changes in the user’s behavior.
Dislikes & Weaknesses:
- SMSes were not free for all of mClerk's potential users. This could have prevented new users from joining since they were so sensitive to price.
- Although we realize that a mobile refill might have been the easiest and most convenient way to compensate users, it could have been better to pay them by another method, such as hard cash or some form of medical coverage. This might have been more meaningful for users and might even have prevented them from misunderstanding the system.
Flash Teams is a platform for expert crowdsourcing that organizes paid experts into modular, computationally managed teams.
Likes & strengths of the system
- The Flash Teams system enables teamwork. This means that more complicated and professional tasks can be assigned using this platform and relatively high quality results can be expected.
- The team is modular, combining several blocks. One or more people take charge of each block, and each block has an assigned manager. The team is managed by the system; users set the blocks and the time limit for each module required to complete the task. The structure is similar to that of a traditional organization, which makes it easy to understand and lets everyone on the team focus on work related to his or her expertise.
- The team is very elastic. If an earlier group finishes its work ahead of schedule, the system recalculates the start times of later groups and sends them a notification.
- The team uses a pipeline workflow. As soon as a later group has enough input from the earlier group, its work can begin, saving a lot of time.
- Workers in adjacent groups can communicate with each other, which lets workers ask questions about the earlier group's work. Workers are also connected to users: users can give feedback, and workers can ask questions about the users' needs.
- Since the working plan is set by the system in advance, as long as all the workers in the block follow the plan, the task can be done effectively and efficiently.
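The elastic, pipelined scheduling described above can be sketched as follows (a simplified sequential model of our own, not Flash Teams' actual scheduler): when an upstream block finishes early, the system recomputes downstream start times from the actual finish time and can then notify those teams.

```python
# Sketch: recompute downstream block start times in a sequential
# pipeline when an upstream block finishes early (simplified model,
# not the real Flash Teams scheduler).

def reschedule(durations: list[int], finished: dict[int, int]) -> list[int]:
    """durations[i] = planned hours for block i.
    finished maps block index -> actual finish time in hours.
    Returns the (re)computed start time of each block."""
    starts = []
    t = 0
    for i, planned in enumerate(durations):
        starts.append(t)
        # The next block starts when this one actually finished,
        # or at its planned finish time if it is still in progress.
        t = finished.get(i, t + planned)
    return starts

# Three blocks planned for 4h each; block 0 finishes after only 3h,
# so blocks 1 and 2 shift one hour earlier and can be notified.
print(reschedule([4, 4, 4], {0: 3}))  # -> [0, 3, 7]
```

The same recomputation also supports the pipelining point: a downstream start time only depends on when enough upstream output actually exists, not on the original plan.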
Dislikes & weaknesses of the system
- Since the team is modular, it is hard for members to communicate and discuss the work as a whole team. They cannot brainstorm together, inspire each other, or examine the project's problems from different perspectives, and they cannot talk through misunderstandings, disagreements, and conflicts.
- The platform allows workers in later blocks to contact workers in earlier blocks, but workers in an earlier block cannot change their results even if problems surface in those conversations. In addition, only people in adjacent blocks can communicate with each other. Moreover, because it is an online platform, someone may finish his or her work and simply leave the group.
- It is hard for workers to find and solve problems in pipeline workflows. For example, the user research team might not know the limitations that back-end developers face and may propose something the developers cannot implement at all. In addition, problems from the early stages of the project cannot easily be iterated on if nobody notices them.
- To post a task on this platform, users need to know the inputs and outputs of the project. Most users just have an idea or a goal; they need experts to translate those ideas and goals into specific inputs and outputs.
- Though the platform allows users to give feedback to workers during the process, it is hard for users to do so because they cannot see explicit results. Users may also not have time to monitor the team and give feedback at all times, and the feedback could be biased.
- There is no evaluation mechanism for work in this platform.
Narula, P., Gutheim, P., Rolnitzky, D., et al. MobileWorks: A Mobile Crowdsourcing Platform for Workers at the Bottom of the Pyramid. Human Computation, 2011.
Gupta, A., Thies, W., Cutrell, E., et al. mClerk: Enabling Mobile Crowdsourcing in Developing Regions. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 2012: 1843-1852.
Retelny, D., Robaszkiewicz, S., To, A., et al. Expert Crowdsourcing with Flash Teams. Proceedings of the 27th Annual ACM Symposium on User Interface Software and Technology. ACM, 2014: 75-85.