Milestone 8 TuringMachine Foundation 3

==Input and output transducers==
=== Foundation 3: External quality ratings ===
Tasks get vetted or improved by people on the platform immediately after they are submitted, and before workers are exposed to them. Results are likewise vetted and tweaked, for example through peer review.

Metaphor of credit ratings: rather than just people rating each other, have an (external?) authority or algorithm responsible for credit ratings (A, B, C, etc.)

Benefit: this reduces the incentive to get positively-biased five-star ratings on everything — those ratings become meaningless.
  
==Challenges==
'''Cost''': who pays for this? In other words, can this be done without hugely increasing the cost of crowdsourcing?
*Is this a group/authority? For example, Wikipedia reviews are subjective and based on voting. Or is it an algorithm?
*If it’s a group, who pays for their time to review you?
*From Anand: “How do you do skills-based ratings, etc., without hindering tasks with a requirement to categorize them?”
  
'''Speed''': is it possible to do this quickly enough to give near-immediate feedback to requesters? Say, 2–4 minutes? As spamgirl reports from her recent survey of requesters, "The #1 thing that requesters love about AMT is that the moment I post tasks, they start getting done."
==Solutions==
We proposed a semi-automated rating mechanism in [http://crowdresearch.stanford.edu/w/index.php?title=Milestone_7_TuringMachine ''Trust Rank, Sustainable Reputation Framework for the Crowdsourcing Marketplace''].
  
From Edwin: What happens when I have a task that I know is hard but I want workers to just try their best and submit? I’m OK with it being subjective, but the panel would just reject my task, which would have been frustrating.
We believe the cost of setting up an external rating mechanism is high. However, the internal rating mechanism can be made more robust. This system can evolve and learn from historical rating behavior. For instance, to overcome the problem of biased voting we propose the following mechanism:
  
From Edwin: Could this help deal with people feeling bad when rejecting work? Maybe we need a new metaphor, like revision.
'''Hypothesis: A worker will not prefer to continue doing business with a bad requestor, and a requestor will not prefer to continue doing business with bad workers.''' <br>
No worker or requestor will prefer to continue working with the other party after a bad experience.
* We tune our recommendation system based on the ratings given by workers and requestors.
* If a worker gives a good rating to a requestor, the '''algorithm pairs them for the next 3-5 tasks''' (see the sketch after the payoff matrix below). In this situation, truth telling is the dominant strategy for good agents; please see the payoff matrix. This encourages workers and requestors to provide honest ratings.
* During sign-up, the system will present a detailed demo of the Ranking System to both requestors and workers.
  
[[File:NeilEcono.png|750px|center|thumb|Game Theoretic Setting with Nash Equilibrium]]
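To make the pairing rule concrete, here is a minimal sketch, assuming a 1-5 rating scale and hypothetical names (<code>update_pairing</code>, <code>pairings</code>); it illustrates the idea above and is not the platform's actual implementation.

<syntaxhighlight lang="python">
import random

PAIRING_THRESHOLD = 4        # assumed: ratings are on a 1-5 scale
PAIRING_TASK_RANGE = (3, 5)  # pair the two parties for the next 3-5 tasks

def update_pairing(worker_id, requestor_id, rating, pairings):
    """Record a worker's rating of a requestor; a good rating reserves the
    next few of that requestor's tasks for this worker."""
    if rating >= PAIRING_THRESHOLD:
        # Truth telling pays off: a good rating buys continued collaboration.
        pairings[(worker_id, requestor_id)] = random.randint(*PAIRING_TASK_RANGE)
    else:
        # An honest bad rating simply ends the pairing.
        pairings.pop((worker_id, requestor_id), None)
    return pairings

def consume_pairing(worker_id, requestor_id, pairings):
    """Called when a paired task is assigned; decrement the remaining quota."""
    key = (worker_id, requestor_id)
    if key in pairings:
        pairings[key] -= 1
        if pairings[key] <= 0:
            del pairings[key]
</syntaxhighlight>

The same rule would apply symmetrically when a requestor rates a worker, so both sides gain from reporting honestly.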
'''Hypothesis''': Well-designed tasks receive better-quality submissions. <br>
<u>'''Task Improvement Solution 1: Fixing the TASK with aggregated feedback'''</u>
In this scenario we leverage the ''one-to-many relationship in the task graph, i.e. one Requestor, many Workers''.
* Hypothesis: '''Emergency Brake: tasks that have serious design flaws are easy to fix.'''<br>
** If a task has serious design flaws, the majority of workers will have trouble understanding it. At present there is no mechanism to capture this feedback.
** We propose a UI control that lets workers immediately report issues related to task flaws. This is similar to the emergency brake used on trains.
** Once a task is posted it has a timestamp associated with it. If 45% of the workers working on the task pull the Emergency Brake to report unclear instructions, the task is placed on hold and the requestor is notified (see the sketch after the figure below).
** Workers can provide feedback on the task instructions.
** We can learn how workers use alerts and add a penalty for misusing them.
** '''This is a low-cost, fast mechanism.'''

[[File:NeilBreak.png|950px|center|thumb|Emergency Brake]]
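A minimal sketch of the Emergency Brake threshold logic, assuming a hypothetical <code>notify_requestor</code> callback and the 45% threshold mentioned above; the data structures are illustrative only.

<syntaxhighlight lang="python">
from dataclasses import dataclass, field

BRAKE_THRESHOLD = 0.45  # fraction of active workers needed to place the task on hold

@dataclass
class Task:
    task_id: str
    active_workers: set = field(default_factory=set)
    brake_pulls: set = field(default_factory=set)
    on_hold: bool = False

def pull_brake(task, worker_id, notify_requestor):
    """Register an emergency-brake pull; hold the task once 45% of the workers
    currently on the task have reported unclear instructions."""
    if worker_id not in task.active_workers:
        return  # only workers actually on the task may pull the brake
    task.brake_pulls.add(worker_id)
    if not task.on_hold and len(task.brake_pulls) / len(task.active_workers) >= BRAKE_THRESHOLD:
        task.on_hold = True
        notify_requestor(task.task_id, "unclear instructions")
</syntaxhighlight>

Because the check runs on every pull, the requestor can be notified within seconds of the threshold being crossed, which keeps the mechanism low cost and fast.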
 
 
<hr>
'''Hypothesis 1''': Over time, workers will gain experience in the marketplace and will participate in the review process for monetary or volunteering benefits. The ranking system will help select the experts from the crowd. <br>
'''Hypothesis 2''': The amount of time and money spent reviewing bad-quality results will exceed the amount of time spent improving the task and getting immediate feedback. The money saved can be used to pay the experts. <br>

<u>'''Task Improvement Solution 2: Trust Circle Design Process: Requestor - Expert Worker - Supervisors - Workers'''</u>
* The figure below highlights the detailed review process.
* Select Expert Workers using [http://crowdresearch.stanford.edu/w/img_auth.php/5/53/NgaikwadLF.png '''automated algorithms'''].
* Select supervisors from the set of Expert Workers. Expert Workers are paid more and selected from a pool of highly accomplished individuals. We have also designed a [http://crowdresearch.stanford.edu/w/index.php?title=Milestone_7_TuringMachine '''ranking mechanism'''] that can be integrated with the system to motivate workers to perform well and move into the class of Expert Workers. <br>
'''Motivation for becoming an Expert''':<br>
*<u>Intrinsic motivation</u>: Bad-quality submissions and cheating behavior affect the entire crowdsourcing community. Most workers want to stop the bad actors and are willing to volunteer their time. However, the current system has no mechanism that involves workers in filtering out bad submissions. The pool of experts is a motivated group of individuals who want to maximize social welfare.
*<u>Extrinsic motivation</u>: Expert Workers are paid more for the experience and managerial skills they bring in. In addition, we propose a [http://crowdresearch.stanford.edu/w/index.php?title=Milestone_7_TuringMachine '''reward system'''] that will encourage workers to do well and move into the class of socially recognized experts.

'''Algorithm'''
* Rating parameters for requestors: Generosity, Fairness, Promptness, and Communicativity are widely used ranking parameters (''Irani et al. 2013''). In addition, we introduce Probability of Payment Default (PPD) as a parameter to indicate the historical track record of the requestor.
* Workers: Similarly, a rating for workers can be derived from honesty, professionalism, past performance, commitment, and skill sets, parameters widely used when hiring workers at Fortune 500 companies.
* Finally, we associate a weight with each parameter and determine the final score as a function of the parameters. In the long run, as we collect more parameters, we can reduce the dimensionality of the data using PCA, select the components that explain most of the variance in the ranking, and derive the loadings (weights) from them (see the sketch after the figure below).
* Another method for rank prediction is a [http://crowdresearch.stanford.edu/w/img_auth.php/5/53/NgaikwadLF.png '''latent factor recommendation'''] system, which can automatically learn the features required for rating prediction (a minimal sketch appears under '''Scores''' below).
* We can also incorporate extra parameters used on recruitment sites such as [http://www.glassdoor.com/Reviews/Google-Reviews-E9079.htm ''Glassdoor''].

[[File:Fig3.0gaikwad.png|750px|center|thumb]]
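As a rough illustration of the PCA-based weighting idea above, the sketch below builds a composite requestor score from a hypothetical parameter matrix (generosity, fairness, promptness, communicativity, payment reliability). The numbers, the use of only the first principal component, and the absolute-value loadings are simplifying assumptions, not the proposal's final scoring function.

<syntaxhighlight lang="python">
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical per-requestor parameter matrix, one row per requestor:
# [generosity, fairness, promptness, communicativity, payment reliability (1 - PPD)]
X = np.array([
    [4.5, 4.0, 3.5, 4.0, 0.98],
    [2.0, 2.5, 3.0, 1.5, 0.80],
    [5.0, 4.5, 4.5, 5.0, 0.99],
    [3.0, 3.5, 2.0, 2.5, 0.90],
])

# Standardize so no single parameter dominates purely because of its scale.
Xs = StandardScaler().fit_transform(X)

# Use the loadings of the first principal component as parameter weights.
pca = PCA(n_components=1).fit(Xs)
weights = np.abs(pca.components_[0])
weights /= weights.sum()

# Final score: weighted sum of the standardized parameters.
scores = Xs @ weights
print("weights:", np.round(weights, 3))
print("scores:", np.round(scores, 3))
</syntaxhighlight>

With more parameters collected over time, the same pipeline can keep several components and combine their loadings instead of only the first.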
 
  
===Realtime Feedback & Peer Pressure to Perform Better===
 
* The figure below highlights worker A's dashboard. ''<u>Please read the diagram from #0 to #5, i.e. from the bottom to the top.</u>''
* Worker A receives real-time feedback and motivational messages.
* Worker A can see live task statistics, and the performance of his colleagues can motivate him to participate and do well in the task.

'''Over time, the system can build a network graph of workers and requestors who work well together. This can be further extended to build teams that can work together on highly complex tasks''' (see the sketch after the dashboard figure below).
 
  
[[File:Gaikwad2.png|900px|center|Class]]
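A minimal sketch of the collaboration-graph idea above, assuming hypothetical input data, a 1-5 rating scale, and an arbitrary "good rating" threshold; it shows one possible way to link agents who rate each other well and to suggest teams from the resulting graph.

<syntaxhighlight lang="python">
import networkx as nx

GOOD_RATING = 4  # assumed 1-5 scale

def build_collaboration_graph(completed_tasks):
    """completed_tasks: iterable of (worker_id, requestor_id,
    worker_rating_of_requestor, requestor_rating_of_worker)."""
    g = nx.Graph()
    for worker, requestor, w_rating, r_rating in completed_tasks:
        if w_rating >= GOOD_RATING and r_rating >= GOOD_RATING:
            # Edge weight counts how many mutually good collaborations occurred.
            if g.has_edge(worker, requestor):
                g[worker][requestor]["weight"] += 1
            else:
                g.add_edge(worker, requestor, weight=1)
    return g

def suggest_teams(graph, min_size=3):
    """Connected groups of agents who have worked well together."""
    return [c for c in nx.connected_components(graph) if len(c) >= min_size]
</syntaxhighlight>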
'''Scores'''
[[File:NeilPca.png|750px|center|thumb|Score Prediction Analytics]]
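To make the score-prediction idea concrete, here is a minimal latent-factor sketch (hypothetical data; plain SGD matrix factorization, not the exact model behind the analytics above): missing worker-requestor ratings are predicted from learned latent vectors.

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical worker x requestor rating matrix; 0 marks a missing rating.
R = np.array([
    [5, 4, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
], dtype=float)

n_workers, n_requestors = R.shape
k, lr, reg, epochs = 2, 0.01, 0.02, 2000
rng = np.random.default_rng(0)
P = rng.normal(scale=0.1, size=(n_workers, k))     # worker latent factors
Q = rng.normal(scale=0.1, size=(n_requestors, k))  # requestor latent factors

observed = [(i, j) for i in range(n_workers) for j in range(n_requestors) if R[i, j] > 0]
for _ in range(epochs):
    for i, j in observed:
        err = R[i, j] - P[i] @ Q[j]
        P[i] += lr * (err * Q[j] - reg * P[i])
        Q[j] += lr * (err * P[i] - reg * Q[j])

predicted = P @ Q.T  # predicted scores, including the missing entries
print(np.round(predicted, 2))
</syntaxhighlight>

The appeal of this approach is that the features behind the prediction are learned from the rating history itself rather than hand-picked.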
<hr>
'''Ranking'''
[[File:NeilRank.png|750px|center|thumb| Rank Classification]]
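Connecting back to the credit-rating metaphor, a minimal sketch of how continuous scores might be bucketed into letter grades; the cut-offs below are hypothetical and are not taken from the rank classification above.

<syntaxhighlight lang="python">
# Hypothetical cut-offs mapping a normalized 0-1 trust score to a credit-style grade.
GRADE_CUTOFFS = [(0.85, "A"), (0.70, "B"), (0.50, "C"), (0.0, "D")]

def classify(score):
    """Bucket a normalized score into a letter grade, credit-rating style."""
    for cutoff, grade in GRADE_CUTOFFS:
        if score >= cutoff:
            return grade
    return "D"

# Example: scores produced by the weighting or latent-factor models above.
print({agent: classify(s) for agent, s in {"w1": 0.91, "w2": 0.66, "r1": 0.42}.items()})
</syntaxhighlight>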
<hr>
The System: For further details please see [http://crowdresearch.stanford.edu/w/index.php?title=Milestone_7_TuringMachine ''Trust Rank, Sustainable Reputation Framework for the Crowdsourcing Marketplace''].
[[File:Gaikwad's risk diversification framework.png|750px|center|thumb|Trust Rank, Sustainable Reputation Framework for the Volatile Crowdsourcing Marketplace]]
