Milestone 1 TuringMachine
- 1 Human Intelligence and Computational Thinking
- 2 Experience the life of a Worker on Mechanical Turk
- 3 Experience the life of a Requester on Mechanical Turk
- 4 Explore alternative crowd-labor markets
- 5 Readings
Human Intelligence and Computational Thinking
The rise of mobile computing and the World Wide Web has helped harness the collective wisdom of crowds at a large scale. Today, crowd computing is tightly woven into our social and professional lives. In this assignment, we analyze systems that combine human and machine intelligence to solve large-scale problems that neither can solve alone. Our analysis is based on the genomes of collective intelligence (Malone et al. 2010), a framework proposed by Professor Thomas Malone and his team at the Massachusetts Institute of Technology. The framework provides the essential building blocks required to design crowd computing systems.
- Goal: What is the main objective of the system?
- Staffing: Who are the workers and requesters? Where do they come from?
- Incentives: What motivates people to participate and contribute?
- Process: How is it being done? How can machine intelligence and mathematics help establish the task workflow and harness the wisdom of the crowd?
We expand the framework to describe the current state of crowdsourcing platforms. In the figure below, we highlight TaskRabbit, MTurk, GalaxyZoo, MobileWorks, mClerk, and oDesk along with other platforms. The figure is an abstract representation of the Crowd Computing, Human Computation, Collective Intelligence, and Crowd Solving phenomena that describe the science of crowdsourcing.
Experience the life of a Worker on Mechanical Turk
Mechanical Turk is one of the platforms that have helped crowdsourcing flourish over the past decade. The initial goal of the MTurk system was to improve the recommendation engine and machine learning computations carried out at Amazon. However, Amazon quickly realized the business opportunity and opened the marketplace to the public. In what follows, we describe our experience as a worker: a "turk", a self-operating automaton. We worked on research survey and image labeling tasks. Our approved HITs were worth $0.50. We didn't receive any bonuses, and one of our tasks remained in pending status.
- Extra income: MTurk provided an opportunity to join a global workforce and earn some side income in our free time.
- Freedom and flexibility to choose and prioritize the HITs: We liked the fact that workers can choose HITs based on the various criteria offered in the drop-downs. For instance, we were able to prioritize HITs based on reward and the amount of time required to complete the task.
- Low cognitive load tasks: We found that HITs do not require STEM qualifications, so people with diverse skill sets and economic backgrounds have an equal opportunity to participate. In our experience, most HITs, such as image labeling, content creation, OCR, audio transcription, and data collection, were repetitive in nature and didn't require any special training.
- MTurk interface and HITs workflow: We analyzed the MTurk interface using Nielsen's heuristics. We found that the interface is consistent with most of the usability standards, except for visibility of system status during the review process.
- No collaboration with peers: We found that the lack of collaboration among workers denies them social learning and networking opportunities. We experienced complete isolation while working on HITs.
- Incentives: The amount of money offered was very low. We didn't feel a sense of glory while working on the tasks. According to one study, 19% of workers in the US earn less than $20,000 a year (Ross et al. 2010).
- Trust factor & Monopoly of requesters: We didn't receive any feedback on one of the tasks, and it was unclear whether we would be paid for the time we had spent. There is certainly a big trust issue between workers and requesters. We found the Turkopticon plugin, created by UCSD researchers, very helpful. Turkopticon helps workers report and avoid shady employers.
- Intense Competition: We found the marketplace extremely competitive. New workers need to spend a significant amount of time figuring out hidden tricks before they start making any money.
- False sense of a global marketplace: According to Ross et al. 2010, 57% of workers come from the US, 32% from India, and 11% from the rest of the world. However, it is hard to understand why Amazon has stopped recruiting workers in India. This indicates the high uncertainty and risk involved in a worker's career.
Experience the life of a Requester on Mechanical Turk
We conducted a sentiment analysis experiment on MTurk. We divided the prediction market task into three batches. To understand the effect of price, we chose price points of $0.01, $0.10, and $0.11. The average time per assignment ranged from 31 to 88 seconds, and the effective hourly rate varied from $0.409 to $11.613.
- First batch: Predict S&P 500 index prices.
  - 25 assignments, each with a reward of $0.01.
  - Results: 3 workers completed the task, i.e. a 15% submission rate.
- Second batch: Predict S&P 500 index prices.
  - 5 assignments, each with a reward of $0.10.
  - Results: 5 workers submitted the task, i.e. a 100% submission rate.
- Third batch: Predict Apple stock prices.
  - 8 assignments, each with a reward of $0.11.
  - Results: 8 workers submitted the task, i.e. a 100% submission rate.
- Scalable Economy: MTurk is a scalable global marketplace that connected us to workers with a wide variety of skills and educational backgrounds. According to Ross et al. 2010, around 42% of the workers have undergraduate degrees and 15% have advanced degrees.
- Problems MTurk can solve: MTurk is highly efficient and fast for image labeling, content creation, optical character recognition, audio transcription, and data collection tasks that impose a low cognitive load.
- Task Creation Process: As a requester, we found it very important to provide structured and unambiguous instructions to the workers. In the research study Exploring Iterative and Parallel Human Computation Processes, Little et al. 2010 provide guidelines on how to structure HITs using iterative and parallel workflows. We found the MTurk API to be a great tool for automating the task creation and qualification design process.
- Workers feedback on the tasks: After the tasks were completed, we tried to reach out to the workers via MTurk emails. However, we didn't hear back from them. As a requester, we feel it is very hard to establish and nurture an employee-employer relationship. There is no way to interact with the workers and understand their mindset while they perform certain tasks.
- Lack of creativity: The MTurk platform is limited to HITs that impose a low cognitive load. However, we believe human imagination and creativity play a vital role in the workplace. It is not clear to us how MTurk will solve problems that require expertise in STEM fields. For instance, the problems that EteRNA, a crowd computing game, is solving are far more challenging than the HITs on MTurk.
- Waiting time: Among the three batches we submitted, one batch had a 15% submission rate. We are still waiting for someone to pick up the task.
- Potential risk and danger of child labour: We had no way of finding out who performed the assigned HITs. There is a potential risk of workers delegating or outsourcing the crowdsourced tasks to child labour.
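The effective hourly rates we report for the batches above follow directly from the reward and the average time per assignment. The sketch below is a minimal illustration in Python. The function name `effective_hourly_rate` is our own, and the pairing of rewards with completion times is an assumption: we assume the 88-second average belongs to the $0.01 batch and the 31-second average to the $0.10 batch, which reproduces the two reported endpoints.

```python
def effective_hourly_rate(reward_usd: float, seconds_per_assignment: float) -> float:
    """Convert a per-assignment reward into an hourly wage in USD."""
    if seconds_per_assignment <= 0:
        raise ValueError("seconds_per_assignment must be positive")
    # One hour holds 3600 / seconds_per_assignment assignments.
    return reward_usd * 3600.0 / seconds_per_assignment


# Assumed pairings of reward and average completion time (not stated in the text).
lowest_rate = effective_hourly_rate(0.01, 88)   # $0.01 batch, slowest average
highest_rate = effective_hourly_rate(0.10, 31)  # $0.10 batch, fastest average

print(f"lowest:  ${lowest_rate:.3f} per hour")
print(f"highest: ${highest_rate:.3f} per hour")
```

Under these assumptions, a ten-fold increase in reward combined with a faster completion time accounts for the wide spread in hourly rates.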
Explore alternative crowd-labor markets
We build on the previous sections and compare GalaxyZoo, TaskRabbit, and oDesk with MTurk. We break down our analysis into four factors: goal, staffing, incentive, and workflow.
- GalaxyZoo, a citizen science project, uses crowdsourcing for morphological classification of galaxies. The GalaxyZoo platform is available in multiple languages.
- TaskRabbit allows requesters to outsource small jobs to workers in their local neighborhood. Typical jobs include cleaning, handyman work, and moving.
- oDesk provides a globally distributed workforce of skilled contractors. It helps freelancers connect with potential businesses.
- MTurk is a popular online marketplace that helps carry out structured micro-tasks such as image labeling, content creation, OCR, audio transcription, and data collection.
MTurk, oDesk, and TaskRabbit are more focused on employment creation. However, the goal of GalaxyZoo is to involve citizens in scientific research. The design of the platform is tailored towards building a sustainable community of volunteers. This is an example of identity-based attachment, in which people feel connected to the group as a whole. TaskRabbit and oDesk have the potential to develop bond-based attachment between workers and requesters. However, we find that MTurk workers operate in complete isolation. Identity-based and bond-based attachments are discussed in Building Successful Online Communities: Evidence-Based Social Design, Tausczik, Dabbish, and Kraut 2012.
Staffing, Incentive & WorkFlow
- GalaxyZoo is an outstanding example of citizens and scientists working together for the advancement of science. GalaxyZoo volunteers include amateur astronomers, students, and senior citizens. Volunteers use visual appearance and patterns to classify galaxies into elliptical, spiral, and irregular classes. Over 150,000 volunteers have helped classify hundreds of thousands of galaxies. Research shows that volunteers want to participate in and contribute to the scientific endeavor and are less concerned about making money (Galaxy Zoo: Motivations of Citizen Scientists, Raddick et al. 2013).
- TaskRabbit workers are pre-certified and background-checked. The company website claims that every task is insured up to $1,000,000. Workers use mobile applications to respond to the tasks assigned to them. According to an article in The Verge, around 70% of workers hold a bachelor’s degree, 20% hold a master’s degree, and 5% hold a PhD. There is direct interaction between workers and requesters. Workers are motivated to work harder to get longer-term assignments and a stable income. The article also highlights TaskRabbit's initiative to provide health benefit programs for workers. TaskRabbit employs gamification and a leaderboard system to encourage top performers.
- The oDesk website allows businesses to post job descriptions and interview candidates for specific projects. Businesses can review candidate profiles and ratings before making a hiring decision. This structure helps businesses hire top talent at a reasonable rate. There is a significant amount of direct interaction between contractors and businesses.
- MTurk workers have a wide variety of skills and educational backgrounds. According to Ross et al. 2010, around 42% of the workers have undergraduate degrees and 15% have advanced degrees. MTurk allows requesters to design qualifications for tasks and recruit skilled workers. There is no direct interaction between workers and requesters.
In our experience, GalaxyZoo, TaskRabbit, oDesk, and MTurk serve different purposes and attract different talent. We found that the components of a crowdsourcing system are similar to Lego blocks. System architects have to decide what problem they want to solve and then assemble the system from blocks that fit together. In the figure above, we have highlighted various problems that crowd computing systems are solving.
Readings
In this paper, the authors (Narula et al. 2011) describe the design and architecture of MobileWorks, a crowdsourcing platform that can provide respectable wages to workers performing data digitization. 10 workers from Mumbai and Delhi were assigned the task of digitizing a set of handwritten documents and scans from the stock pages of 19th-century newspapers.
- Socioeconomic Factor: The MobileWorks system makes a paid crowdsourcing platform accessible to workers at the lower end of the economic pyramid. This novel approach can help create jobs and support future socioeconomic development. Crowdsourced tasks can reach hundreds of thousands of workers through easily available mobile Internet technology.
- Incentive and Real Time Feedback: Real-time earnings feedback can motivate workers to improve the quality of the work they submit to requesters. Payment as a function of historic tasks and quality can help create digital profiles and reputations for workers. This is similar to a design feature used in TaskRabbit and EteRNA. It would be interesting to learn how workers responded when they got the feedback: were they motivated to continue working, or did they drop out of pursuing future tasks? Further research can be done to understand the effect of real-time feedback. Games with a purpose and EteRNA have shown that real-time feedback can greatly boost workers' confidence and motivate them to continue. For further information, see General Techniques for Designing Games with a Purpose, Luis von Ahn and Laura Dabbish 2008.
- Parallel and Iterative Workflow: The study reports that workers completed 120 tasks in one hour with an accuracy of 99%. The tasks were divided into multiple segments and then redistributed to the workers in a parallel and iterative fashion. In early studies of crowdsourcing, Bernstein et al. 2010 and Little et al. 2010 showed that parallel and iterative techniques are highly effective for structuring OCR tasks. These techniques can improve the quality of submissions. The paper restates the importance of task restructuring using parallel and iterative workflows.
- Usage patterns and Intelligent Task Scheduling: In the future, the MobileWorks platform can gather data about the factors that affect worker performance. For instance, the system can record the time periods when workers are most productive and schedule tasks during that time of day. An artificial intelligence planner can allocate tasks by searching through the data space.
- Friend Sourcing & Recruitment: It is unclear from the paper how the recruitment process will be scaled. The survey section of the paper highlights that all of the users said they were more than likely to recommend the system to their friends and family. This can be an opportunity to design a referral system for recruitment. For instance, a current worker can use text messages to reach out to their social circle; the worker will get paid depending on how many friends they bring on board.
- Further User Research: The user research was conducted with 10 participants over a period of two months. The paper doesn't answer the following questions:
  - What methodologies were used to evaluate the mobile interface?
  - What problems did workers face while performing the tasks?
  - What was the educational or professional background of the users?
- The paper highlights that the tasks were assigned by requesters and workers had no freedom to choose them. Will this model be sustainable over a long period of time? We would like to conduct further surveys and research to understand what workers think about having tasks assigned to them. This raises an opportunity to design a mobile web interface that allows workers to choose the tasks they would like to work on. This would reduce the monopoly of requesters and lead to fair payment and a socially optimal outcome.
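The parallel and iterative workflows discussed above can be illustrated with a small simulation. This is a minimal sketch under our own assumptions and not the MobileWorks implementation: workers are modeled as plain Python functions, a segment is represented by its true text (which a real worker would read from a scan), and the names `parallel_transcribe`, `iterative_transcribe`, `careful`, `sloppy`, and `fix_first_error` are hypothetical.

```python
from collections import Counter
from typing import Callable, List

Transcriber = Callable[[str], str]    # transcribes a segment from scratch
Improver = Callable[[str, str], str]  # improves an earlier draft of a segment


def parallel_transcribe(segment: str, workers: List[Transcriber]) -> str:
    """Parallel workflow: workers answer independently, the majority answer wins."""
    answers = [worker(segment) for worker in workers]
    return Counter(answers).most_common(1)[0][0]


def iterative_transcribe(segment: str, workers: List[Improver], draft: str = "") -> str:
    """Iterative workflow: each worker builds on the previous worker's draft."""
    for worker in workers:
        draft = worker(segment, draft)
    return draft


# Simulated workers (hypothetical behaviour chosen for illustration).
def careful(segment: str) -> str:
    return segment


def sloppy(segment: str) -> str:
    return segment.replace("8", "3")  # misreads every 8 as a 3


def fix_first_error(segment: str, draft: str) -> str:
    """Correct only the first wrong character found in the draft."""
    for i, (truth, typed) in enumerate(zip(segment, draft)):
        if truth != typed:
            return draft[:i] + truth + draft[i + 1:]
    return draft


segment = "1887 wheat 42"
voted = parallel_transcribe(segment, [careful, sloppy, careful])
refined = iterative_transcribe(
    segment, [fix_first_error, fix_first_error], draft=sloppy(segment)
)
print(voted)
print(refined)
```

In the parallel case, redundancy outvotes a single careless worker. In the iterative case, quality improves step by step as each worker corrects part of the previous draft.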
In this paper, the authors (Gupta et al. 2012) describe the design and evaluation of mClerk, a text-message-based crowdsourcing platform. During the five-week study, 239 users (55% students, 45% shopkeepers, and others) actively digitized over 25,000 words.
- Socioeconomic Factor: The intention of mClerk is to improve the livelihood of low-income families using paid mobile crowdsourcing. This is another great example of crowdsourcing-driven economic development.
- Local Language Digitization: Several languages in India and across the globe represent rich history and culture. Many scholars have written their work in these local languages. The novel idea of language digitization can help preserve historical knowledge and share it with future generations.
- Portability, crowdsourcing on the go: mClerk, a text-message-based system, allows users to perform crowdsourced tasks at any location; this gives workers greater flexibility to manage their activities.
- Information Diffusion & Incentives: The dynamics of information propagation through power users, i.e. hubs, helped the researchers increase the number of workers from 10 to 239. This could be a design feature for a sustainable recruitment process.
- Power Users i.e. Hubs in the Network: The researchers collected data about power users (Figure 2 in the paper). This data will help us understand the distribution and evolution of the social network (normal, log-normal, or power law). Computational network science and theoretical models such as Erdős–Rényi can help explore fundamental questions such as: How will future communities of workers form? Who will be the next power users?
- Information Overload: The cell-phone text inbox can be flooded with crowdsourcing task messages. The paper doesn't discuss how users will handle this information overload. This raises the risk of losing important text messages from friends and family. Workers should have the autonomy to choose how many text messages they would like to receive.
- Payment Process & Incentives: Low-income workers want to make extra money so that they can help out their families. The daily challenge in their lives is access to food and medicine, not cellular credits. However, the researchers paid the workers in cellular credits instead of money. It is hard to see how this payment method can be sustained over time. Cellular credits alone cannot be an incentive to participate in crowdsourcing.
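To make the question about power users and degree distributions concrete, the sketch below compares two toy network models of the same size as the mClerk study (239 users). It is an illustration under our own assumptions, not an analysis of the paper's data. The Erdős–Rényi model connects every pair of workers with equal probability. The preferential attachment model, which we add for contrast and which is not mentioned in the paper, lets each new worker be recruited by an existing worker chosen in proportion to that worker's current number of connections. All function names are hypothetical.

```python
import random
from itertools import combinations
from typing import List


def erdos_renyi_degrees(n: int, p: float, rng: random.Random) -> List[int]:
    """Degrees in a random graph where each pair is linked with probability p."""
    degree = [0] * n
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            degree[u] += 1
            degree[v] += 1
    return degree


def preferential_attachment_degrees(n: int, rng: random.Random) -> List[int]:
    """Degrees when each newcomer links to one recruiter picked by degree."""
    if n < 2:
        raise ValueError("need at least two workers")
    degree = [1, 1]     # two initial workers who know each other
    endpoints = [0, 1]  # every worker appears once per connection
    for newcomer in range(2, n):
        recruiter = rng.choice(endpoints)  # well-connected workers are likelier
        degree.append(1)
        degree[recruiter] += 1
        endpoints.extend([recruiter, newcomer])
    return degree


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 239                  # number of users reported in the study
pa = preferential_attachment_degrees(n, rng)
er = erdos_renyi_degrees(n, 2 / n, rng)  # p chosen to match the mean degree

print("largest hub, preferential attachment:", max(pa))
print("largest hub, Erdős–Rényi:", max(er))
```

Comparing the largest hub in each model against the power-user data in the paper would be one way to judge which growth process better describes the recruitment network.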
In this paper, the authors (Retelny et al. 2012) propose flash teams, a framework for dynamically assembling and managing teams of experts. The participants in the study were chosen from oDesk.
- Human Creativity: Most crowdsourcing platforms are focused on solving structured micro-tasks. However, domain expertise and creativity have a huge role to play in the success of crowdsourcing (see EteRNA and FoldIt). The flash team framework will help organize virtual teams of domain experts that can solve complex interdependent problems.
- Computational Management: Flash teams are elastic and can grow and shrink on demand. Requesters have control over the task and can work with DRIs to complete the work on time and at high quality.
- Pipelining the Workflow: The idea of handoffs helps prevent blocking the pipeline; handoffs also help reduce the waiting time for experts in the queue.
- Automated Planner & workflow: An automated planner can search through the various intermediate transitions to find the shortest paths and optimize the task workflow.
- Elastic nature of the team: The elastic nature of the team is a great idea. However, it should be tested on projects from different domains. For instance, in a large-scale software engineering project, adding more people to the team increases communication overhead and delays. Untrained requesters may feel tempted to add more people to the team, but adding people to a late project makes it later; see The Mythical Man-Month, Frederick Brooks.
- Automated Team Selection: We would like to evaluate the performance of the automated planner against a recommender engine. Collaborative filtering or Singular Value Decomposition algorithms can play a vital role in the selection of workers and requesters for specific projects.
- Graph of Workers and Requesters: In traditional crowdsourcing, requesters and workers are strangers to each other. However, there are many projects on which friends can be requesters and workers. Using various parameters, an automated team selection algorithm can search through the network graph and find which members of the network are well suited for a specific task. This will minimize potential trust issues and help complete the tasks faster.
- Fair payments with Nash Equilibrium & Vickrey–Clarke–Groves mechanism: The platform still needs to work out the payment structure. Implementing a second-price auction would give workers and requesters an incentive to bid their true valuations and reach a fair payment agreement (Algorithm for optimal winner determination in combinatorial auctions, Sandholm 2000).
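The second-price idea can be sketched for the simplest case: one task, several workers, each stating the lowest payment they would accept. This is a minimal illustration under our own assumptions and not a payment scheme from the flash teams paper. The function `reverse_second_price` and the worker names are hypothetical, and the general Vickrey–Clarke–Groves mechanism for bundles of tasks is more involved.

```python
from typing import Dict, Tuple


def reverse_second_price(asks: Dict[str, float]) -> Tuple[str, float]:
    """Lowest ask wins the task; the winner is paid the second-lowest ask."""
    if len(asks) < 2:
        raise ValueError("need at least two bidders")
    ranked = sorted(asks.items(), key=lambda item: item[1])
    winner = ranked[0][0]
    payment = ranked[1][1]
    return winner, payment


# Hypothetical workers; each ask equals the worker's true cost for the task.
truthful = reverse_second_price({"ana": 12.0, "ben": 15.0, "chi": 20.0})

# If ana overstates her cost, she loses a task she could have done at a profit.
overstated = reverse_second_price({"ana": 16.0, "ben": 15.0, "chi": 20.0})

# If ana understates her cost, her payment does not change.
understated = reverse_second_price({"ana": 5.0, "ben": 15.0, "chi": 20.0})

print(truthful, overstated, understated)
```

Because the payment depends only on the other bids, misreporting never raises the winner's payment and can only cost a worker the task, which is why truthful bidding is the sensible strategy in this single-task setting.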