Difference between revisions of "Milestone 4 Task Ranking by Crowdgeist"

Latest revision as of 08:20, 8 February 2016

Outline of current system

Crowdsourcing markets currently rely on ranking to match those who want tasks completed with those who do the work. Workers want work suited to their own needs, skills and, possibly, personalized criteria. Current reputation systems do not adequately differentiate the quality of the tasks completed by workers: requesters complain that they get poor results from some workers and have no guarantee of high-quality results, and they would like some KPI to evaluate a worker's reputation (@seku).

Trust and power are broken because traditional approaches to ranking do not differentiate in ways that fully meet the needs of workers or requesters. Moreover, ranking has a single dimension, and our view is that this single dimension flattens workers' attitudes and possibilities. The quest for five-star, mono-dimensional recognition leads, in many cases, to reputation inflation (Horton 2015) or threatened retaliation, and has also led to the emergence of add-ons such as Turkopticon, which let workers filter out requesters who are notorious for not paying. Ranking workers and requesters may result in (endless?) retaliation, over-competition, gaming and, at times, inaccurate matching of tasks from requesters to workers, and is in some cases deemed unfair (Horton 2010). In summary, current rating systems do not take into account all of the possible criteria and datasets that could make the matching of task to worker more efficient and effective (Horton 2012 - Matchmaking).

What we propose

We propose to investigate multi-dimensional matching systems that are based on diversity rather than on a mono-dimensional, final, numeric ranking. The idea is a matching system that matches jobs with workers and also workers with requesters.

We will test different contexts with vectorization algorithms and social matrix factorization.
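As a rough illustration of the matrix factorization idea, the sketch below learns latent vectors for workers and tasks from a handful of observed scores and then predicts a score for a pair that was never observed. It is only a minimal sketch under stated assumptions: the data, the function names (`factorize`, `predict`, `rmse`) and the hyperparameters are all hypothetical, and the social term of social matrix factorization (tying a worker's vector to those of connected workers) is left out.

```python
import random
from math import sqrt


def factorize(ratings, n_workers, n_tasks, k=2, steps=2000, lr=0.01, reg=0.02, seed=0):
    """Plain matrix factorization trained by stochastic gradient descent.

    ratings: list of (worker, task, score) triples for observed pairs only.
    Returns (P, Q), where P[w] and Q[t] are k-dimensional latent vectors.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(0, 0.5) for _ in range(k)] for _ in range(n_workers)]
    Q = [[rng.uniform(0, 0.5) for _ in range(k)] for _ in range(n_tasks)]
    for _ in range(steps):
        for w, t, r in ratings:
            err = r - sum(P[w][f] * Q[t][f] for f in range(k))
            for f in range(k):
                pw, qt = P[w][f], Q[t][f]
                # Gradient step on squared error with L2 regularization.
                P[w][f] += lr * (err * qt - reg * pw)
                Q[t][f] += lr * (err * pw - reg * qt)
    return P, Q


def predict(P, Q, w, t):
    """Predicted fit between worker w and task t."""
    return sum(p * q for p, q in zip(P[w], Q[t]))


def rmse(P, Q, ratings):
    """Root mean squared error on the observed pairs."""
    return sqrt(sum((r - predict(P, Q, w, t)) ** 2 for w, t, r in ratings) / len(ratings))


# Hypothetical observations: (worker id, task id, score from 1 to 5).
observed = [(0, 0, 5), (0, 1, 1), (1, 0, 4), (1, 2, 1), (2, 1, 5), (2, 2, 4)]
P, Q = factorize(observed, n_workers=3, n_tasks=3)
print("fit error on observed pairs:", round(rmse(P, Q, observed), 3))
print("predicted score for worker 0 on unseen task 2:", round(predict(P, Q, 0, 2), 2))
```

Because every worker and task is described by several latent dimensions rather than one number, two workers with the same average score can still be recommended very different tasks.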


To recognize possible alternative systems, we propose to analyze:


Spotify

task: listen to supergood music

personality: my preferences give me more pleasure than the top 10


Wikipedia

task: write articles about everything known to mankind

personality: my skills and preferences allow me to write an article whenever I want and on the subject I want, rather than following a list of priorities and needs in an encyclopaedia


OkCupid / Tinder

Matching systems on dating sites like OkCupid match people with each other in great detail (https://www.okcupid.com/help/match-percentages). Visual matching in Tinder can also be interesting: you only see each other if you match each other.


Kindle suggestion system (the “Customers who bought this also bought” and “Customers who viewed this also viewed” sections).

Task: suggests books one might be interested in reading, but keeps the kid-related material separate.


Current Stanford Crowd Research Collective - Daemo

We could also insert Daemo in our list if any ranking system is tested.


Time banking / Timerepublic

Hours exchange.

Experiment

Specific details to follow.

The Results

We expect multidimensional percentage ranks to make accurate matching possible, without the drawback of retaliation and with stronger identification by both workers and requesters.
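One way to picture a multidimensional percentage rank is a two-sided score: how well the task satisfies the worker's weighted preferences, how well the worker satisfies the requester's, and a combination of the two. The sketch below is an assumption for illustration only; the dimension names, the weights and the use of a geometric mean are hypothetical choices, not the formula of any existing platform.

```python
from math import sqrt


def satisfaction(preferences, offer):
    """Weighted share of one side's preferences that the other side meets.

    preferences: {dimension: (wanted_value, weight)}
    offer:       {dimension: value}
    """
    total = sum(weight for _, weight in preferences.values())
    if total == 0:
        return 0.0
    met = sum(
        weight
        for dim, (wanted, weight) in preferences.items()
        if offer.get(dim) == wanted
    )
    return met / total


def match_percentage(worker_prefs, worker_profile, requester_prefs, task_profile):
    """Two-sided match from 0 to 100: the geometric mean of both satisfactions.

    A geometric mean is used so that a match is only high when BOTH sides
    are satisfied; if either side scores 0, the match is 0.
    """
    worker_side = satisfaction(worker_prefs, task_profile)
    requester_side = satisfaction(requester_prefs, worker_profile)
    return round(100 * sqrt(worker_side * requester_side))


# Hypothetical example: a translator and a translation task.
worker_prefs = {"topic": ("translation", 3), "deadline": ("flexible", 1)}
worker_profile = {"language": "Japanese", "experience": "expert"}
requester_prefs = {"language": ("Japanese", 2), "experience": ("expert", 2)}
task_profile = {"topic": "translation", "deadline": "fixed"}

print(match_percentage(worker_prefs, worker_profile, requester_prefs, task_profile))  # 87
```

Since the result describes a pair rather than a person, there is no public score attached to a worker or requester that could be inflated or used for retaliation.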


Contributors:

@amdp

@acossette

@baxterstockman

@ferlin87

@arichmondfuller


References and Notes

OkCupid's psychological questionnaires ask many questions (some very private but entertaining) in order to match you quite accurately through a percentage number from 1 to 100. The company seems to put a lot of research into how to make recommendations.

Collaborative filtering has a cold start problem, because the algorithms are based on historical usage data. We'd like to be able to recommend new workers to tasks even when there is no usage data to analyze.

Predictive preferences with deep learning: tasks are recommended to workers based on signals from a comprehensive set of parameters including ability and availability (skills, timescale, etc.), and worker and requester preferences are determined from historical usage data. This is an alternative to the ranking system, based on roles, diversity and compatibility.

(@amdp idea) We believe that requesters and workers have different strengths, and that multi-dimensional matching can make them happier. A key element would be a hidden algorithm that infers information about workers and users along various dimensions (for example, how much they value work being done accurately or on time).

@amdp suggests a simple comparison methodology: a system with ranking versus one without. For sure, the one without does not exist in the online labour market.
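The cold start problem can be sidestepped, at least in part, by matching on declared attributes instead of usage history. The sketch below is a minimal illustration under that assumption: a new worker's profile (for example from a questionnaire) is compared to each task's requirements with cosine similarity. The profile keys, task names and the `recommend_tasks` helper are hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * v.get(dim, 0.0) for dim, value in u.items())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend_tasks(worker_profile, tasks, top_n=3):
    """Rank tasks for a worker with no history, using declared skills only.

    Tasks with zero similarity are dropped rather than shown at the bottom.
    """
    scored = [(cosine(worker_profile, required), name) for name, required in tasks.items()]
    scored.sort(reverse=True)
    return [name for score, name in scored[:top_n] if score > 0]


# Hypothetical new worker with no completed tasks yet.
new_worker = {"japanese": 1.0, "translation": 0.8}
tasks = {
    "translate_menu": {"japanese": 1.0, "translation": 1.0},
    "ux_review": {"ux": 1.0, "design": 0.5},
    "proofread": {"translation": 0.5, "english": 1.0},
}

print(recommend_tasks(new_worker, tasks))  # ['translate_menu', 'proofread']
```

Once a worker has accumulated some history, such content-based scores could be blended with collaborative-filtering predictions; how to weight the two is left open here.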

Information on what’s required of us for this milestone:

http://crowdresearch.stanford.edu/w/index.php?title=Winter_Milestone_4

https://www.youtube.com/watch?v=aLmr2HvoBKw

http://crowdresearch.stanford.edu/w/img_auth.php/3/36/02-01-research.pdf


Examples to look at:

Example1: Gupta A, Thies W, Cutrell E, et al. mClerk: enabling mobile crowdsourcing in developing regions. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 2012: 1843-1852. http://crowdresearch.stanford.edu/w/index.php?title=File:MClerk_(private).pdf

Example2: Narula P, Gutheim P, Rolnitzky D, et al. MobileWorks: A Mobile Crowdsourcing Platform for Workers at the Bottom of the Pyramid. Human Computation, 2011, 11: 11. http://crowdresearch.stanford.edu/w/img_auth.php/f/fa/MobileWorks_%28private%29.pdf

Example3: Vaish R, Wyngarden K, Chen J, et al. Twitch crowdsourcing: crowd contributions in short bursts of time. Proceedings of the 32nd annual ACM conference on Human factors in computing systems. ACM, 2014: 3645-3654. http://crowdresearch.stanford.edu/w/img_auth.php/e/e4/Twitch_Crowdsourcing_%28private%29.pdf

A team that does research and matching and is always working on improving the matching algorithms.

Questionnaires about personal working preferences that are fun to answer.


http://www.colyvan.com/papers/Fry.pdf


We need this sort of differentiation when we put specific skills into the task scenario. What I need and want as a worker when doing Japanese translation is very different from what I need and want consulting on UX, for example.


https://medium.com/@ConsenSys/basic-income-on-the-blockchain-fair-money-45662889077c#.uj3wk9orj

https://ourbasicincome.wordpress.com/2015/06/18/circles-universal-basic-income/

http://aboutcircles.com/ (forum discussing the above Circles)

https://www.ethereum.org/


https://en.wikipedia.org/wiki/Freigeld


http://www.grin.com/en/e-book/266178/optimized-ranking-based-techniques-for-improving-aggregate-recommendation

Online DATING

http://www.shore.ac.nz/massey/fms/Colleges/College%20of%20Sciences/IIMS/RLIMS/Volume04/Automated_Schema_Matching_Techniques-An_Exploratory_Study.pdf

http://www.colyvan.com/papers/Fry.pdf

OkCupid: multi-ranking (questionnaires, very detailed matches).

Very detailed information and very precise matches; everyone also gives very closed information.

vs. Tinder:

Anonymous, superficial (location-based) matches; once you have them you can dive deeper. Tinder is not the best example, as it is location-based and its matching is very superficial, but it is human and intuitive because it works on pictures.


http://www.buzzfeed.com/coralewis/why-a-marxist-social-policy-is-gaining-ground-in-silicon-val#.yh5Q7YW8nO


Peer-to-peer exchange is when two parties trade goods or services directly with each other. Exchange services such as Uber, Lyft, Airbnb, etc. offer information systems to support the matching of supplier and demander. Such exchange services exemplify rapid growth and demonstrate the power of the so-called sharing economy. Many different services have emerged in this way, from simple ride sharing, car sharing (getaround.com), parking lot sharing (parkatmyhouse.com), workspace sharing (liquidspace.com) and temporary overnight sharing (couchsharing.com) to textbook sharing (chegg.com). The most common form of exchange is a good or service for money (Caroll 2014).

Since these exchanges are handled privately rather than as a business, most information systems provide a rating mechanism to build up credibility. A typical rating system is a 5-point Likert scale represented as stars (Uber, Airbnb). To stay above a certain rating, the work might involve a high amount of “emotional work” (Hall & Krueger 2015, Rogers 2015). Depending on the matching system of the service provider and the number of ratings, a single review that is negative, or simply less than 100%, can have severe consequences for the worker. Furthermore, money as a trading currency discriminates against people who are unable to have or earn money, such as people with disabilities, the sick, children or the unskilled. An alternative is time sharing.

hOurWorld is one such timebanking provider. Users can post and request tasks, offering a service or accepting an offer. The currency is time, so every service has the same value: for example, one hour of a blue-collar job is worth the same as one hour of a white-collar job. Both parties have to agree to the actual exchange of the time value (approving credit). A rating system as described above does not exist. The exchange is based purely on offers and requests from people. In real life, however, the timebank lives from the offers of people (Bellotti et al. 2014). Further research shows that engagement and helping out in a local community has psychological benefits for an individual. Among other benefits, volunteering is associated with increased personal well-being, reduced depression and increased self-esteem (Thoits 2001, Post 2005).
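The timebanking mechanism described above (one hour equals one credit regardless of the job, and credit moves only once both parties approve) can be sketched as a small ledger. This is a hypothetical illustration, not hOurWorld's actual implementation or API; the class and method names are assumptions, and balances are allowed to go negative for simplicity.

```python
class TimeBank:
    """Minimal time-credit ledger: 1 hour of any service = 1 credit."""

    def __init__(self):
        self.balances = {}   # member -> hours of credit
        self.pending = {}    # exchange id -> exchange awaiting approval
        self._next_id = 1

    def propose(self, provider, receiver, hours):
        """Record a completed service that still needs both approvals."""
        if hours <= 0:
            raise ValueError("hours must be positive")
        exchange_id = self._next_id
        self._next_id += 1
        self.pending[exchange_id] = {
            "provider": provider,
            "receiver": receiver,
            "hours": hours,
            "approved": set(),
        }
        return exchange_id

    def approve(self, exchange_id, member):
        """Approve an exchange; credit is transferred once both sides agree.

        Returns True if this approval completed the transfer.
        """
        exchange = self.pending[exchange_id]
        parties = {exchange["provider"], exchange["receiver"]}
        if member not in parties:
            raise ValueError("only the two parties can approve")
        exchange["approved"].add(member)
        if exchange["approved"] != parties:
            return False
        hours = exchange["hours"]
        self.balances[exchange["provider"]] = self.balances.get(exchange["provider"], 0) + hours
        self.balances[exchange["receiver"]] = self.balances.get(exchange["receiver"], 0) - hours
        del self.pending[exchange_id]
        return True


bank = TimeBank()
exchange = bank.propose(provider="gardener", receiver="lawyer", hours=2)
bank.approve(exchange, "gardener")   # no transfer yet: one approval only
bank.approve(exchange, "lawyer")     # both agreed: 2 hours are credited
print(bank.balances)                 # {'gardener': 2, 'lawyer': -2}
```

Note that the ledger stores no rating at all: the only signal it keeps is how many hours each member has given and received.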


Caroll, J. (2014). Co-Production Issues in Time Banking and Peer-to-Peer Exchange.

Bellotti et al. (2014). Towards Community-Centered Support for Peer-to-Peer Service Exchange: Rethinking the Timebanking Metaphor

Thoits, P. A. & Hewitt, L. N. Volunteer work and wellbeing. Journal of Health and Social Behavior 42, (2001), 115–131

Hall, J. & Krueger, A. (2015). An Analysis of the Labor Market for Uber’s Driver-Partners in the United States.

Rogers, B. (2015). The Social Cost of Uber.

Post, S. Altruism, happiness, and health: It’s good to be good. International Journal of Behavioral Medicine 12, 2 (2005), 66-77.

