Milestone 1 TuringMachine

Team TuringMachine

  • Neil Gaikwad
  • Vishnu Ramachandran
  • Kristiono Setyadi

Human Intelligence and Computational Thinking

The rise of mobile computing and the World Wide Web has helped us harness the collective wisdom of crowds at a large scale. Today, crowd computing is tightly woven into our social and professional lives. In this assignment, we analyze systems that combine human and machine intelligence to solve large-scale problems that neither can solve alone. Our analysis is based on the genomes of collective intelligence (Malone et al., 2010), a framework proposed by Professor Thomas Malone and his team at the Massachusetts Institute of Technology. The framework provides the essential building blocks required to design crowd computing systems:

  • Goal: What is the main objective of the system?
  • Staffing: Who are the workers and requestors? Where do they come from?
  • Incentives: What motivates people to participate and contribute?
  • Process: How is the work done? How can machine intelligence and mathematics help establish the task workflow and harness the wisdom of the crowd?

We expand the framework to describe the current state of crowdsourcing platforms. In the figure below, we highlight TaskRabbit, MTurk, GalaxyZoo, MobileWorks, mClerk, and oDesk along with other platforms. The figure is an abstract representation of the Crowd Computing, Human Computation, Collective Intelligence, and Crowd Solving phenomena that describe the science of crowdsourcing.

Figure (GCI.png): Genomes of Collective Intelligence, an abstract view of the crowd computing world. Derived from the framework proposed by Malone et al., 2010.
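
As a rough illustration of how the four dimensions can be used to compare platforms, the sketch below encodes each platform as a small "genome" record. The classifications are our own simplified reading of the platforms, not values taken from Malone et al.

  # A minimal sketch: each platform as a record over the four genome dimensions.
  # The values below are our own rough simplification, not canonical labels.
  platforms = {
      "MTurk":      {"goal": "structured micro-tasks", "staffing": "open crowd",
                     "incentives": "money",            "process": "independent HITs"},
      "GalaxyZoo":  {"goal": "galaxy classification",  "staffing": "volunteers",
                     "incentives": "love of science",  "process": "redundant voting"},
      "oDesk":      {"goal": "expert freelancing",     "staffing": "skilled contractors",
                     "incentives": "money",            "process": "direct contracts"},
      "TaskRabbit": {"goal": "local errands",          "staffing": "vetted workers",
                     "incentives": "money",            "process": "bid and assign"},
  }

  def group_by(dimension):
      """Group platform names by the value of one genome dimension."""
      groups = {}
      for name, genome in platforms.items():
          groups.setdefault(genome[dimension], []).append(name)
      return groups

  print(group_by("incentives"))  # GalaxyZoo stands out as the non-monetary platform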

Experience the life of a Worker on Mechanical Turk

Mechanical Turk is one of the platforms that have helped crowdsourcing flourish for over a decade. The initial goal of the MTurk system was to improve the recommendation engine and machine learning computations carried out at Amazon. However, Amazon quickly realized the business opportunity and opened up the marketplace to the public. In what follows, we describe our experience as a worker, a "turk", named after the self-operating automaton. We worked on research survey and image labeling tasks. Our approved HITs were worth $0.50. We didn't receive any bonuses, and one of our tasks remained in pending status.

Likes:

  • Extra income: MTurk provided an opportunity to join a global workforce and earn some side income in our free time.
  • Freedom and flexibility to choose and prioritize HITs: We liked the fact that workers can choose HITs based on various criteria available in the drop-downs. For instance, we were able to prioritize HITs based on the reward and the amount of time required to complete the task.
  • Low cognitive load tasks: We found that HITs do not require a STEM qualification, and people with diverse skill sets and economic backgrounds have an equal opportunity to participate. In our experience, most of the HITs, such as image labeling, content creation, OCR, audio transcription, and data collection, were repetitive in nature and didn't require any special training.
  • MTurk interface and HIT workflow: We analyzed the MTurk interface using Nielsen's heuristics. We found that the interface is consistent with most of the usability standards, except for the visibility of system status during the review process.

Dislikes:

  • No collaboration with peers: We found that the lack of collaboration among workers denies them social learning and networking opportunities. We experienced complete isolation while working on HITs.
  • Incentives: The amount of money offered was very low, and we didn't feel a sense of accomplishment while working on the tasks. According to one study, 19% of workers in the US earn less than $20,000 a year (Ross et al., 2010).
  • Trust factor & monopoly of requestors: We didn't receive any feedback on one of the tasks, and it was unclear whether we would ever get paid for the time we had spent. There is certainly a big trust issue between workers and requestors. We found the Turkopticon plugin created by UCSD researchers very helpful; Turkopticon helps workers report and avoid shady employers.
  • Intense competition: We found the marketplace extremely competitive. New workers need to spend a significant amount of time figuring out hidden tricks before they start making any money.
  • False sense of a global marketplace: According to the same study (Ross et al., 2010), 57% of workers come from the US, 32% from India, and 11% from the rest of the world. However, it is hard to understand why Amazon has stopped recruiting workers in India. This indicates the high uncertainty and risk involved in a worker's career.

Experience the life of a Requester on Mechanical Turk

We conducted a sentiment analysis experiment on MTurk. We divided the prediction market task into three batches. To understand the effect of price, we chose price points of $0.01, $0.10, and $0.11. The average time per assignment ranged from 31 to 88 seconds, and the effective hourly rate varied from $0.409 to $11.613.

  • First batch – Predict S&P500 index prices.
    • 25 assignments each with reward $0.01.
    • Results: 3 workers completed the task, i.e. a 12% submission rate.
  • Second batch – Predict S&P500 index prices.
    • 5 assignments each with reward $0.10
    • Results: 5 workers submitted the task i.e. 100% submission rate.
  • Third batch – Predict Apple Stock Prices.
    • 8 assignments each with reward $0.11
    • Results: 8 workers submitted the task i.e. 100% submission rate

Results data: HITResultsTuringMachine.csv
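
For reference, the submission rates and effective hourly rates above follow from simple arithmetic over the batch numbers. A small sketch, using the reported slowest completion time (about 88 seconds for the $0.01 batch) and a fastest time of roughly 34 seconds back-calculated from the reported maximum hourly rate:

  # Arithmetic behind the batch statistics reported above.
  def effective_hourly_rate(reward_dollars, avg_seconds_per_assignment):
      """Scale the per-assignment reward up to an hourly wage."""
      return reward_dollars / avg_seconds_per_assignment * 3600

  def submission_rate(completed, posted):
      """Percentage of posted assignments that workers actually completed."""
      return 100.0 * completed / posted

  print(effective_hourly_rate(0.01, 88.0))   # ~0.409 dollars/hour (slowest batch)
  print(effective_hourly_rate(0.11, 34.1))   # ~11.61 dollars/hour (fastest batch)
  print(submission_rate(3, 25))              # 12.0 for the first batch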

Likes:

  • Scalable economy: MTurk is a scalable global marketplace that connected us to workers with a wide variety of skills and educational backgrounds. According to Ross et al. (2010), around 42% of workers have undergraduate degrees and 15% hold advanced degrees.
  • Problems MTurk can solve: MTurk is highly efficient and fast for image labeling, content creation, optical character recognition, audio transcription, and data collection tasks that require low cognitive load.
  • Task creation process: As requestors, we found that it is very important to provide structured and unambiguous instructions to the workers. In Exploring Iterative and Parallel Human Computation Processes, G. Little et al. (2010) provide guidelines on how to structure HITs using iterative and parallel workflows. We found the MTurk API to be a great tool for automating the task creation and qualification design process; a minimal sketch follows this list.
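
As an illustration of the automation mentioned above, the sketch below creates a HIT programmatically with the boto3 MTurk client. The endpoint shown targets the requester sandbox; the title, reward, and external URL are hypothetical values, not the HITs we actually posted.

  # A minimal sketch of automating HIT creation with the boto3 MTurk client.
  # Assumes AWS credentials are configured; all task values below are illustrative.
  import boto3

  mturk = boto3.client(
      "mturk",
      region_name="us-east-1",
      endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",  # sandbox
  )

  question_xml = """<ExternalQuestion
    xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
    <ExternalURL>https://example.com/prediction-task</ExternalURL>
    <FrameHeight>400</FrameHeight>
  </ExternalQuestion>"""

  response = mturk.create_hit(
      Title="Predict tomorrow's closing value of a stock index",
      Description="Give your best one-day-ahead prediction.",
      Keywords="prediction, finance, survey",
      Reward="0.11",                        # dollars, passed as a string
      MaxAssignments=8,
      LifetimeInSeconds=3 * 24 * 3600,      # keep the HIT listed for three days
      AssignmentDurationInSeconds=10 * 60,  # ten minutes per worker
      Question=question_xml,
  )
  print("Created HIT:", response["HIT"]["HITId"])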

Dislikes:

  • Worker feedback on the tasks: After completion of the tasks, we tried to reach out to the workers via MTurk email, but we did not hear back from them. As requestors, we feel it is very hard to establish and nurture an employee-employer relationship. There is no way to interact with the workers and try to understand their mindset while they perform certain tasks.
  • Lack of creativity: The MTurk platform is limited to HITs that require low cognitive load. However, we believe human imagination and creativity play a vital role in the workplace. It is not clear to us how MTurk would solve problems that require expertise in STEM fields. For instance, the problems that EteRNA, a crowd computing game, is solving are far more challenging than the HITs on MTurk.
  • Waiting time: Among the three batches we submitted, one batch had only a 12% submission rate. We are still waiting for someone to pick up the remaining assignments.
  • Potential risk and danger of child labour: We had no control over finding out who actually performed the assigned HITs. There is a potential risk of a worker delegating or outsourcing crowdsourced tasks to child labour.

Explore alternative crowd-labor markets

We build on the previous sections and compare GalaxyZoo, TaskRabbit, and oDesk with MTurk. We break down our analysis into four factors: goal, staffing, incentives, and workflow.

Goal

  • GalaxyZoo, a citizen science project, uses crowdsourcing for the morphological classification of galaxies. The GalaxyZoo platform is available in multiple languages.
  • TaskRabbit allows requestors to outsource small jobs to workers in their local neighborhood. Typical jobs include cleaning, handyman work, and moving.
  • oDesk provides a globally distributed workforce of skilled contractors. It helps freelancers connect with potential businesses.
  • MTurk is a popular online marketplace that helps carry out structured micro-tasks such as image labeling, content creation, OCR, audio transcription, and data collection.

MTurk, oDesk, and TaskRabbit are more focused on employment creation, whereas the goal of GalaxyZoo is to involve citizens in scientific research. The GalaxyZoo platform is tailored towards building a sustainable community of volunteers. This is an example of identity-based attachment, in which people feel connected to the group as a whole. TaskRabbit and oDesk have the potential to develop bond-based attachment between workers and requestors; MTurk, in contrast, felt completely isolating. Identity-based and bond-based attachments are discussed in Building Successful Online Communities: Evidence-Based Social Design (Tausczik, Dabbish, and Kraut, 2012).

Staffing, Incentives & Workflow

  • GalaxyZoo is an outstanding example of citizens and scientists working together for the advancement of science. GalaxyZoo volunteers include amateur astronomers, students, and senior citizens. Volunteers use visual appearance and patterns to classify galaxies into elliptical, spiral, and irregular classes. Over 150,000 volunteers have helped classify hundreds of thousands of galaxies. Research shows that volunteers want to participate in and contribute to a scientific endeavor and are less concerned about making money (Galaxy Zoo: Motivations of Citizen Scientists, Raddick et al., 2013).
  • TaskRabbit workers are pre-certified and background-checked. The company website claims that every task is insured up to $1,000,000. Workers use mobile applications to respond to the different tasks assigned to them. According to an article in The Verge, around 70% of workers hold a bachelor's degree, 20% hold a master's degree, and 5% hold a PhD. There is direct interaction between workers and requestors, and workers are motivated to work harder to get longer-term assignments and a stable income. The article also highlights TaskRabbit's initiative to provide health benefit programs for workers. TaskRabbit employs gamification and a leaderboard system to encourage top performers.
  • The oDesk website allows businesses to post job descriptions and interview candidates for specific projects. Businesses can review candidate profiles and ratings before making a hiring decision. This structure helps businesses hire top talent at a reasonable rate. There is a significant amount of direct interaction between contractors and businesses.
  • MTurk workers have a wide variety of skills and educational backgrounds. According to Ross et al. (2010), around 42% of workers have undergraduate degrees and 15% hold advanced degrees. MTurk allows requestors to design qualifications for tasks and recruit skilled workers. There is no direct interaction between workers and requestors.

In our experience, GalaxyZoo, TaskRabbit, oDesk, and MTurk serve different purposes and attract different talent. We found that the components of a crowdsourcing system are similar to Lego blocks: system architects have to determine what problem they want to solve and then assemble the system from blocks that fit together. In the figure above, we have highlighted various problems that current crowd computing systems are solving.

Readings

MobileWorks

In this paper, the authors (Narula et al., 2011) describe the design and architecture of MobileWorks, a crowdsourcing platform that can provide respectable wages to workers performing data digitization. Ten workers from Mumbai and Delhi, India, were assigned the task of digitizing a set of handwritten documents and scans of the stock pages of 19th-century newspapers.

Strengths

  • Socioeconomic factor: The MobileWorks system makes a paid crowdsourcing platform accessible to workers at the lower end of the economic pyramid. This novel approach can help create jobs and support future socioeconomic development. Crowdsourced tasks can reach hundreds of thousands of workers through widely available mobile Internet technology.
  • Incentives and real-time feedback: Real-time earnings feedback can motivate workers to enhance the quality of the work they submit to requestors. Payment as a function of task history and quality can help build digital profiles and reputations for workers, similar to design features used in TaskRabbit or EteRNA. It would be interesting to learn how workers responded to the feedback: were they motivated to continue working, or did they drop out of future tasks? Further research can be done to understand the effect of real-time feedback. Games with a Purpose and EteRNA have shown that real-time feedback can immensely boost workers' confidence and motivate them to continue (General Techniques for Designing Games with a Purpose, Luis von Ahn and Laura Dabbish, 2008).
  • Parallel and iterative workflow: The study highlights that workers completed 120 tasks in one hour with an accuracy of 99%. The tasks were divided into multiple segments and then redistributed to workers in a parallel and iterative fashion. In early studies of crowdsourcing, Bernstein et al. (2010) and G. Little et al. (2010) showed that parallel and iterative techniques are highly effective for structuring OCR tasks and can improve the quality of submissions. The paper restates the importance of task restructuring using parallel and iterative workflows; a schematic sketch follows this list.
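
The sketch below illustrates the general shape of such a workflow, an iterative improvement step followed by a parallel vote, in the spirit of Little et al. (2010). The improve and vote steps are simulated placeholders, not the MobileWorks implementation; a real system would post a HIT for each step.

  # A schematic sketch of an iterative-improve / parallel-vote workflow for
  # OCR-style tasks. The worker steps are simulated placeholders only.
  import random

  def improve(transcription):
      """One worker edits the current best transcription (simulated here)."""
      return transcription.replace("0", "o").replace("1", "i")

  def vote(candidate_a, candidate_b, n_voters=3):
      """Several workers vote between two candidates in parallel (simulated here)."""
      votes = sum(random.choice([0, 1]) for _ in range(n_voters))
      return candidate_b if votes > n_voters / 2 else candidate_a

  def iterate_and_vote(initial_guess, rounds=3):
      best = initial_guess
      for _ in range(rounds):
          improved = improve(best)     # iterative step: one worker edits
          best = vote(best, improved)  # parallel step: workers pick the better version
      return best

  print(iterate_and_vote("handwr1tten st0ck prices"))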

Improvements

  • Usage patterns and intelligent task scheduling: In the future, the MobileWorks platform could gather data about the factors that affect worker performance. For instance, the system could record the time periods when workers are most productive and schedule tasks during that time of day; an artificial intelligence planner could then allocate tasks by searching through this data space. A minimal scheduling sketch follows this list.
  • Friendsourcing & recruitment: It is unclear from the paper how the recruitment process will scale in the future. The survey section of the paper highlights that all of the users said they were more than likely to recommend the system to their friends and family. This is a potential opportunity for designing a referral system for recruitment: a current worker could use text messages to reach out to their social circle and get paid depending on how many friends they bring on board.
  • Further user research: The user research was conducted with 10 participants over a two-month period. The paper doesn't provide answers to the following questions:
    • What methodologies were used to evaluate the mobile interface?
    • What problems did workers face while performing the tasks?
    • What was the educational or professional background of the users?
    • The paper notes that tasks were assigned by requestors and workers had no freedom to choose them. Will this model be sustainable over a long period of time? We would like to conduct further surveys and research to understand what workers think about a random requestor assigning them tasks. This also raises an opportunity to design a mobile web interface that allows workers to choose the tasks they would like to work on. This would reduce the monopoly of requestors, and a second-price auction with a reserve price could help design a mechanism for fair payment and a socially optimal outcome.
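
As a minimal illustration of the scheduling idea above, the sketch below picks a worker's historically most productive hour from a hypothetical completion log; a planner could then route new tasks into that window.

  # A minimal sketch of scheduling tasks into a worker's most productive hour.
  # The log format and timestamps below are hypothetical.
  from collections import Counter
  from datetime import datetime

  completion_log = [  # (worker_id, timestamp of a completed task)
      ("w1", "2016-01-10T09:12:00"), ("w1", "2016-01-10T09:40:00"),
      ("w1", "2016-01-11T21:05:00"), ("w1", "2016-01-12T09:30:00"),
  ]

  def most_productive_hour(worker_id, log):
      """Return the hour of day in which the worker completed the most tasks."""
      hours = Counter(
          datetime.fromisoformat(ts).hour for wid, ts in log if wid == worker_id
      )
      hour, _count = hours.most_common(1)[0]
      return hour

  print("Schedule w1's tasks around hour:", most_productive_hour("w1", completion_log))  # 9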

mClerk

In this paper, the authors (Gupta et al., 2012) present the design and evaluation of mClerk, a text-message-based crowdsourcing platform. mClerk uses a little-known protocol on Nokia and Ericsson phones to distribute graphical tasks to workers. During the five-week study, 239 users (55% students, 45% shopkeepers and others) actively digitized over 25,000 words.

Strengths

  • Socioeconomic factor: The intention of mClerk is to improve the livelihood of low-income families through paid mobile crowdsourcing. This is another great example of crowdsourcing-driven economic development. The researchers built technology that works on low-end mobile phones, which are cheaply available in the market.
  • Local language digitization: Several languages in India and across the globe carry a rich history and culture, and many scholars have written their work in local languages. The novel idea of local-language digitization can help preserve this historical knowledge and share it with future generations.
  • Portability, crowdsourcing on the go: mClerk, as a text-message-based system, allows users to perform crowdsourced tasks at any location; this gives workers greater flexibility to manage their activities.
  • Information diffusion & incentives: The dynamics of information propagation through power users helped the researchers grow the number of workers from 10 to 239. Social-network-based recruitment can be a sustainable model for scaling the workforce, and the findings in the mClerk research open new opportunities for further controlled experiments. Another research study, by Tausczik et al. (2012), highlights the advantages of bond-based attachment in various network settings.

Improvements

  • Power users, i.e. hubs in the network: The researchers collected data about power users (Figure 2 in the paper). This data can be used to understand the distribution and evolution of the social network (normal, log-normal, or power law). Computational network science and theoretical models such as Erdős–Rényi can help explore fundamental questions such as: How will future communities of workers form? Who will be the next power users? A minimal sketch follows this list.
  • Information overload & usability: The phone's text inbox can be flooded with crowdsourcing task messages, and the paper doesn't discuss how users would handle this information overload. This raises the risk of losing important text messages from friends and family. Workers should have the autonomy to select how many task messages they would like to receive. One possibility is merging mClerk with the MobileWorks solution; this would give users two options (web-based and text-based) for managing their tasks effectively.
  • Payment process & incentives: Low-income workers want to make extra money so that they can help out their families. The daily challenge in their lives is limited access to food and medicine, not cellular credits. However, the researchers provided cellular credits to the workers instead of money, and it is hard to see how this payment method can be sustained over time. Cellular credits alone cannot be the incentive to participate in crowdsourcing; we believe further surveys and research need to be done in this area.
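
As a minimal illustration of the network analysis suggested above, the sketch below compares the maximum degree of a hypothetical recruitment graph with an Erdős–Rényi random graph of the same density; in practice the edge list would come from the recruitment data behind Figure 2 of the paper. It assumes the networkx library is available.

  # A minimal sketch: compare a hypothetical recruitment graph against an
  # Erdős–Rényi random graph with the same density. Requires networkx.
  import networkx as nx

  # Hypothetical "who recruited whom" edges; node 0 is a power user.
  edges = [(0, i) for i in range(1, 12)] + [(1, 12), (1, 13), (2, 14), (3, 15)]
  observed = nx.Graph(edges)

  n, m = observed.number_of_nodes(), observed.number_of_edges()
  p = 2.0 * m / (n * (n - 1))                    # match the observed edge density
  random_model = nx.erdos_renyi_graph(n, p, seed=42)

  print("observed max degree:   ", max(d for _, d in observed.degree()))
  print("Erdos-Renyi max degree:", max(d for _, d in random_model.degree()))
  # A much heavier tail in the observed graph points to hub-like power users.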

Flash Teams

In this paper, the authors (Retelny et al., 2014) propose Flash Teams, a framework for dynamically assembling and managing teams of experts. The researchers demonstrate how Flash Teams can accomplish a broad class of goals, including design prototyping, educational course development, and film animation, in a short amount of time. The participants in the study were recruited from oDesk, a global online work platform.

Strengths

  • Exploring human creativity: The Flash Teams framework helps organize virtual teams of domain experts that can solve complex, interdependent problems. Most crowdsourcing platforms are focused on solving structured micro-tasks; however, domain expertise and creativity have a huge role to play in the success of crowdsourcing. Platforms like EteRNA and Foldit have already demonstrated how crowd solving can help advance complex science. This is a big paradigm shift in the design space of crowd computing systems.
  • Computational management: Flash teams are elastic and can grow and shrink on demand. Requestors have control over the task and can work with DRIs (directly responsible individuals) to complete the work on time and at high quality. This points to a future of work in which computational systems help requestors find and manage top talent around the globe.
  • Pipelining the workflow: The idea of handoffs helps prevent blocking the pipeline and reduces the waiting time for experts in the queue. In addition, queueing theory, geometric random variables, or the Poisson distribution can help estimate the waiting time for workers further down the pipeline; a minimal queueing sketch follows this list. In Crowd-Powered Systems, M. Bernstein (2012) discusses a similar model for real-time crowdsourcing.
  • Automated planner & workflow: An automated planner can search through various intermediate transitions to find shortest paths and optimize the task workflow. This is a faster way to identify and recruit the expert crowd and structure the task flow.
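
As a minimal illustration of the queueing idea above, the sketch below uses the standard M/M/1 result (Poisson arrivals, exponential service) to estimate how long a handoff spends at one pipeline stage; the arrival and service rates are hypothetical.

  # A minimal sketch: expected time a handoff spends at one pipeline stage,
  # modeled as an M/M/1 queue. The rates below are hypothetical.
  def mm1_time_in_system(arrival_rate, service_rate):
      """Average waiting plus service time per handoff, assuming Poisson arrivals
      at arrival_rate and exponential service at service_rate (same time unit)."""
      if arrival_rate >= service_rate:
          raise ValueError("Unstable queue: arrival rate must stay below service rate.")
      return 1.0 / (service_rate - arrival_rate)

  # e.g. 3 handoffs arrive per hour and the downstream expert finishes 4 per hour:
  print(mm1_time_in_system(3.0, 4.0), "hours per handoff on average")  # 1.0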

Improvements

  • Elastic nature of the team: The elastic nature of the team is a great idea; however, it should be tested on projects from different domains. For instance, in a large-scale software engineering project, adding more people to the team increases communication overhead and delays. Untrained requestors may feel tempted to add more people to the team, but adding people to a late project makes it later (see The Mythical Man-Month, Frederick Brooks). Therefore, further controlled experiments should be conducted to evaluate the workflow and its potential risks.
  • Automated team & workflow recommendation: We would like to evaluate the performance of the automated planner against a recommender engine. With the help of system logs and usage patterns, we can construct content profiles and a factor matrix; this data can then be used to devise collaborative filtering or singular value decomposition algorithms. Such automation can play a vital role in the selection of workers and requestors for specific projects.
  • Graph of workers and requestors: In traditional crowdsourcing, requestors and workers are strangers to each other. However, there are many projects on which friends can be requestors or workers, and forming a group of like-minded people can help increase trust and the quality of work. Using parameters from organizational behavior theories, an automated team selection algorithm could search through the network graph and find out which members of the network are well suited for a specific task. For instance, in a research study (Evidence for a Collective Intelligence Factor in the Performance of Human Groups, Woolley et al., 2010), the authors argue that having more women on a team increases the collective intelligence of the team. Intelligent algorithms can use these ideas to build the next generation of a productive workforce.
  • Fair payments with Nash equilibrium & the Vickrey–Clarke–Groves mechanism: The paper doesn't provide many details about the payment structure. We believe ideas from mechanism design can help build a sustainable marketplace. For instance, a second-price auction with a reserve price gives workers and requestors an incentive to bid their true valuations and reach a fair payment agreement; a minimal sketch follows. For further details, please see Algorithm for Optimal Winner Determination in Combinatorial Auctions, Sandholm, 2000.
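
As a minimal illustration of the mechanism mentioned above, the sketch below runs a single-item second-price (Vickrey) auction with a reserve price; the bids and reserve are hypothetical, and a full VCG mechanism for team formation would generalize this idea.

  # A minimal sketch of a single-item second-price (Vickrey) auction with a
  # reserve price. Bids and the reserve below are hypothetical values.
  def second_price_with_reserve(bids, reserve):
      """Return (winner, price), or None if no bid meets the reserve.

      The winner pays the larger of the reserve and the second-highest bid,
      which makes truthful bidding a dominant strategy for the bidders."""
      eligible = {name: b for name, b in bids.items() if b >= reserve}
      if not eligible:
          return None
      ranked = sorted(eligible.items(), key=lambda kv: kv[1], reverse=True)
      winner = ranked[0][0]
      price = ranked[1][1] if len(ranked) > 1 else reserve
      return winner, price

  print(second_price_with_reserve({"requestor_A": 12.0, "requestor_B": 9.5}, reserve=8.0))
  # ('requestor_A', 9.5): A wins the worker's time and pays the second-highest bid.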