WinterMilestone 3 Dubs: Dark Horse


Below we list our dark horse ideas for a reputation system that could increase both the quality of work and the trust between requesters and workers:

− Based on the reputation approaches outlined in the white paper "Daemo: A Crowdsourced Crowdsourcing Platform", all interactions between workers and requesters flow back and forth between requesters -> workers and workers -> requesters, but workers never rate other workers and requesters never rate other requesters. Our approach takes as given that the former system (where one kind of user rates another) has already seen traction and is a good enough model for today's marketplaces, so we'll shed some light on what an approach that considers feedback from multiple types of users could look like.

− First of all, despite being able to rate requesters, workers work solely by themselves, never really interacting with other workers. That seems like a reasonable approach for a two-sided marketplace. However, we believe we can draw inspiration from how results are ranked in search engines. The intuition behind Google's PageRank algorithm is that a relevant page tends to link to other relevant pages, while the contrary is also true (non-relevant pages tend not to link to pages that are relevant for specific keywords). With that in mind, we think workers should be able to anonymously rate other workers, so that the overall quality of a given worker's output is calculated from feedback coming both from requesters and from other workers.
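
To make this concrete, below is a minimal Python sketch of how such a blended score could be computed. Everything in it is an assumption for illustration: the Worker structure, the 1-5 rating scale and the peer_weight parameter are hypothetical, not part of Daemo's implementation. Echoing the PageRank intuition, each peer rating is weighted by the reputation of the worker who gave it.

  from dataclasses import dataclass, field

  @dataclass
  class Worker:
      # Ratings received from requesters (1-5 scale, hypothetical).
      requester_ratings: list[float] = field(default_factory=list)
      # Anonymous peer ratings as (rating, rater_reputation) pairs.
      peer_ratings: list[tuple[float, float]] = field(default_factory=list)

  def combined_score(worker: Worker, peer_weight: float = 0.4) -> float:
      # Plain average of requester feedback.
      requester_avg = (sum(worker.requester_ratings) / len(worker.requester_ratings)
                       if worker.requester_ratings else 0.0)
      # Peer feedback weighted by each rater's own reputation, so that
      # endorsements from reputable workers count for more.
      total = sum(rep for _, rep in worker.peer_ratings)
      peer_avg = (sum(rating * rep for rating, rep in worker.peer_ratings) / total
                  if total > 0 else 0.0)
      return (1 - peer_weight) * requester_avg + peer_weight * peer_avg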

For now, we'll take it as given that the best predictor of a worker's future behaviour is their past behaviour. That means that if a worker has performed consistently high-quality work in past assignments, it is more likely they will perform well on future tasks too. We also assume that the habit of performing well builds, in the worker's mind, a certain standard for how tasks should be done and for what differentiates a high-quality result from a mediocre one, even if the worker is not conscious of it. Therefore, we believe we can use high-quality workers as filters for both other workers' output and newcomers' work.

First, we think it could be a good idea to differentiate reputation points coming from requesters and reputation points coming from other workers' feedback, since a new requester might be tricked into believing a piece of work is high quality when, compared against the history of similar work performed on that crowdsourcing platform, it should be considered average. So whenever a new requester joins the platform, he or she would see how a given worker ranks both according to the feedback of other requesters (did the worker finish the task on time, was the result as expected, and so on) and according to the feedback of other workers (how the worker's output ranks against the platform's own quality standard).

Second, we believe such a system works best if none of the high-quality workers know whom they're rating, and ideally they wouldn't even know they're rating someone at all. Suppose a crowdsourcing platform has 100 workers, 10 of them known to be consistently high-quality workers specialised in OCR-related work. If a new worker interested in OCR tasks arrives, we have no way to determine beforehand whether he or she is a high-quality worker.
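
Continuing the sketch above, the requester-facing view could report the two reputation channels separately rather than as one blended number; the function and field names are illustrative assumptions.

  def reputation_card(worker: Worker) -> dict:
      # Two separate indices, shown side by side to requesters:
      # - requester_index: finished on time, result as expected, etc.
      # - peer_index: how the output ranks against the platform's own
      #   quality standard, judged anonymously by other workers.
      req = worker.requester_ratings
      peers = [rating for rating, _ in worker.peer_ratings]
      return {
          "requester_index": sum(req) / len(req) if req else None,
          "peer_index": sum(peers) / len(peers) if peers else None,
      }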

In order to determine that, we might assign the newcomer some number x of fake OCR tasks, which we then present to the workers already known to be good ones, who judge whether those tasks were performed to a high-quality standard or not (a sketch of this bootstrap step follows the list below). That initial feedback would allow us to do two things:

1 - Determine if that worker is ready to perform tasks coming from real requesters, and whether they'd perform according to our platform's standards; and
2 - Keep a list of related high-quality workers. For example, if 9 out of 10 high-quality workers rate the newly arrived worker as high quality, it is very likely that he or she indeed meets the quality standard of our platform, and therefore their future ratings of other workers should weigh more than those of workers who fail to be included in the high-quality group.

As time passes, we'll have a system that clearly distinguishes between high-quality, average-quality and below-average-quality workers, so that we're able to assign them tasks accordingly.
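
Here is a minimal sketch of that bootstrap step, under the same illustrative assumptions as above: the newcomer completes a few fake OCR tasks, the known high-quality workers anonymously vote on each result, and a threshold decides admission to the trusted group. The 0.9 threshold and the doubled rating weight are hypothetical choices, not a specification.

  def passes_vetting(votes: list[bool], approval_threshold: float = 0.9) -> bool:
      # True if enough trusted workers judged the fake-task results to be
      # high quality. With 10 trusted raters and a 0.9 threshold this
      # reproduces the "9 out of 10" example above.
      if not votes:
          return False
      return sum(votes) / len(votes) >= approval_threshold

  def rating_weight(is_high_quality: bool) -> float:
      # Hypothetical weighting: ratings from vetted high-quality workers
      # count double when scoring other workers' output.
      return 2.0 if is_high_quality else 1.0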

The advantage of such a system is that we'd be able to provide new requesters not only with feedback coming from other requesters, but with feedback from the whole community on our platform, which makes the measure of a worker's quality more likely to be accurate. Also, after some time and learning, it would single out the different tiers of workers, to whom the platform could offer higher earnings or other forms of compensation for their high-level work, strengthening the community without lowering the morale of the team as a whole, since the index reporting reputation from other workers would be visible only to requesters and the system would never reveal who reviewed whose work.

An important point to keep in mind, however, is the possibility of a worker gaming the system if he or she becomes able to determine which tasks come from real requesters and which are work from fellow workers. That person could then report everybody else's work as average or below-average quality so as to benefit from the perks only high-quality workers get access to. One way to counter that is to use learning over time to check a specific worker's ratings against other workers' ratings, so we can determine whether the data coming from that rater is an outlier and whether we should discard it. Another downside to this approach is that the platform may have to spend financial resources to run this vetting each time a new worker arrives. That cost could be mitigated over time by deducting it from the worker's earnings, or simply by charging it to requesters, since the system would be a strategic advantage over other platforms.
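
As a sketch of that anti-gaming check, suppose several workers rate the same piece of work: a rater whose scores sit, on average, far from the consensus of the other raters gets flagged, and their ratings can be discarded. The score pairing and the 1.5-point cutoff on a 1-5 scale are illustrative assumptions.

  from statistics import mean

  def is_outlier_rater(rater_scores: list[float],
                       consensus_scores: list[float],
                       max_gap: float = 1.5) -> bool:
      # rater_scores[i] and consensus_scores[i] refer to the same task:
      # the suspect's rating versus the mean rating of all other raters.
      # A large average gap suggests systematic down- or up-rating.
      gaps = [abs(r - c) for r, c in zip(rater_scores, consensus_scores)]
      return bool(gaps) and mean(gaps) > max_gap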

In short, we use the skills of highly rated workers to rate other workers, creating links between like-minded workers and discerning between high-quality workers and those who aren't, so that for future tasks we know exactly which workers are most recommended for each kind of task.

To make it clearer, we have drawn the following diagrams:

  1. The traditional way of thinking about a reputation system Diagram
  2. Our Dark Horse idea applied to a reputation system Diagram

Note that in the second diagram the dashed line represents a task flowing from one worker (Worker 2 - unknown reputation) to another worker (Worker 1 - good reputation), while the solid line represents the evaluation.

Milestone Contributors

Our team:

  • Gabriel Bayomi - @gbayomi
  • Flavio Scorpione - @scorpione
  • Henrique Orefice - @horefice
  • Lucas Bamidele - @lucasbamidele
  • Teogenes Moura - @teomoura

The members in bold are the ones who came up with the idea in the first place, but the contribution is collective. You can check our other ideas here:

  • http://crowdresearch.stanford.edu/w/index.php?title=WinterMilestone_3_Dubs:_Representation
  • http://crowdresearch.stanford.edu/w/index.php?title=WinterMilestone_3_Dubs_ReputationIdea:_Daemo_Points
  • http://crowdresearch.stanford.edu/w/index.php?title=WinterMilestone_3_Dubs:_Long_Term_Ideas
  • http://crowdresearch.stanford.edu/w/index.php?title=WinterMilestone_3_Dubs:_Actionable_Ideas