Winter Milestone 4 Algorithmic hummingbirds
IMPROVING REPUTATION SYSTEMS FOR CROWD-SOURCING PLATFORMS
SUMMARY OF WINTER MEETING 4
We take forward the need-finding process of week 3 by narrowing the needs down into specific directions and organizing them into full-fledged research proposals under the following three themes: task authorship, task ranking, and open governance.
The current generation of crowd-sourcing platforms is surprisingly flawed, and the flaws are often overlooked or sidelined. They include poor interface design, difficulty in task hunting, unfair payment systems, poor worker-requester communication and representation, and so on.
So, the idea is to channel efforts in this direction by solving, or at least minimizing, some of these issues by building a next-generation crowd-sourcing platform, Daemo, integrated with a reputation system, Boomerang. The improved platform would focus on auto-tagging or auto-classification of tasks using machine learning and artificial intelligence algorithms, on clustering through guild systems, and on other minor issues. The approach is expected to reduce redundancy and to ensure better communication not only between requesters and workers but also among co-requesters and co-workers, which should yield better results within limited time bounds (as compared to existing platforms) and more efficient and representative crowd platforms.
Crowd-source; Daemo; Boomerang;
Crowd-sourcing is typically defined as the process of obtaining services, ideas, or content by soliciting contributions from a large group of people, especially from an online community, rather than from traditional employees or suppliers. So, what exactly happens on these crowd-sourcing platforms? For example, Amazon Mechanical Turk (popularly MTurk) is a crowd-sourcing Internet marketplace that enables individuals and businesses (known as requesters) to coordinate the use of human intelligence to perform Human Intelligence Tasks (HITs), that is, tasks that computers are currently unable to do. In the current scenario, requesters are unable to ensure high-quality results and workers are unable to work conveniently. The current generation of crowd-sourcing platforms, such as TaskRabbit and Amazon Mechanical Turk, does not ensure high-quality results, produces inefficient tasks, and suffers from poor worker-requester relationships. To overcome these issues, we propose a new standard (the next de facto platform), Daemo, which includes Boomerang, a reputation system that introduces alignment between ratings and likelihood of collaboration. We hope to show that increased communication between requesters and workers, and also among co-requesters and co-workers, yields higher-quality results because the efforts of both sides are better channeled, and that new pricing models may attract more workers.
BRIDGING THE COMMUNICATION GAP
As suggested above (and by me, in the milestone 3 submission - []), requesters and workers could go through an intermediary who holds payment until the work is completed and the task is verified. The involvement of a third party would ensure fairer payment (and, hopefully, better involvement and higher-quality results). But could this role be crowd-sourced as well, with professional requesters and professional workers chosen, integrated into the system, and together deciding on the payment policy for a particular task in the best interests of everyone? We could experiment with such a system first and then correlate (and extrapolate from) the results obtained. Does communication really correlate with much better results? One way to check whether the expected results are met is to experiment. For example, consider two groups of online crowd workers matched on experience and, where possible, on interests, age, and gender. One group has full communication privileges: the workers are free to communicate with the requesters and vice versa, and also among themselves. The other group cannot communicate directly with peers or requesters. Following this procedure under specified conditions for a sufficient amount of time, tabulating the results of each group separately and independently, and finally comparing the two should show whether communication leads to fewer rejections, happier workers (and requesters), and hence higher-quality results.
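The final comparison of the two groups could be made concrete with a standard two-proportion z-test on rejection rates. The following is only a minimal sketch under stated assumptions: the function name is ours, the choice of test is ours (the proposal does not name one), and the counts in the usage example are made-up placeholders, not experimental results.

```python
import math


def two_proportion_z_test(rejected_a, total_a, rejected_b, total_b):
    """Return (z, two-sided p-value) for H0: both groups share one rejection rate.

    Group A is assumed to be the communicating group and group B the
    non-communicating group; a negative z means A was rejected less often.
    """
    if total_a <= 0 or total_b <= 0:
        raise ValueError("each group needs at least one submitted HIT")
    rate_a = rejected_a / total_a
    rate_b = rejected_b / total_b
    # Pooled rejection rate under the null hypothesis of no difference.
    pooled = (rejected_a + rejected_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        # Every HIT was rejected, or none was: the groups cannot be told apart.
        return 0.0, 1.0
    z = (rate_a - rate_b) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Placeholder counts for illustration only.
    z, p = two_proportion_z_test(rejected_a=30, total_a=400,
                                 rejected_b=60, total_b=400)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value together with a negative z would support the claim that communication goes with fewer rejections. It would not by itself establish higher-quality results, which would need a separate quality measure.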
DEALING WITH UNFAIR REJECTIONS
To deal with unfair rejections, we need to build a system that is resistant to spammers, so that requesters can be convinced that the worker may well have tried their best and that the task was simply not clear enough or some technical issue persisted. When a new worker joins the platform, they are given enough resources and documentation to become well acquainted with the interface, the policies of the platform, and so on. The new worker is first given simpler tasks or gold-standard tasks (where the answers are already known), and their behavior is tracked in real time, for example the number of keystrokes per minute and the movement of the mouse. Their answers are matched against the known solutions and a ranking is assigned to the worker. Consistently low scores may result in the worker being flagged. (Here, we are assuming that the interaction is not heavily dependent on the nature of the task.)
Whenever a requester marks a task as easy (or a task is low-paid), it goes to the top of the task feed of new workers. As a worker gains experience (that is, successfully completes x HITs), they climb the ranking and are assigned tasks of intermediate or high difficulty. This ensures that new workers don't have to compete unfairly with professionals.
To ensure that a rejection is justified, each HIT is submitted along with these statistics, so that the requester can see that the result was not due to a lack of effort on the part of the worker. If the requester decides to reject the work, it is mandatory to justify why the work was rejected. The requester may or may not allow the worker to re-attempt the task, at their discretion. A worker who feels wrongly flagged may appeal to the intermediary, whose decision shall be final and binding.
This part of the paper mainly deals with task quality design aspects, such as pricing models and failure analysis. Pricing models are fixed with the help of an intermediary body (consisting of representatives of both parties) in consultation with the requester and the requester's team (if any). Failure to complete a given task, or to meet expectations, can be accounted for through communication between the people involved within the system. We sincerely hope that the methods suggested above will help build a better crowd platform, for a better world where crowd workers are represented and respected.
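The gold-standard scoring, flagging, and feed-ordering ideas above can be sketched roughly as follows. This is an illustration under our own assumptions, not Daemo's or Boomerang's actual implementation: the class and function names, the three difficulty labels, and all three constants (including the value standing in for the unspecified "x" HITs) are hypothetical, and keystroke or mouse tracking is left out.

```python
from dataclasses import dataclass, field

# All three constants are illustrative assumptions; the proposal does not fix them.
FLAG_THRESHOLD = 0.6    # flag when recent gold-task accuracy falls below this
MIN_GOLD_ATTEMPTS = 5   # gold tasks needed before a worker can be flagged
PROMOTION_HITS = 20     # stands in for the unspecified "x" completed HITs

DIFFICULTY_RANK = {"easy": 0, "intermediate": 1, "difficult": 2}


@dataclass
class Worker:
    worker_id: str
    completed_hits: int = 0
    gold_results: list = field(default_factory=list)  # one bool per gold task

    def record_gold_answer(self, answer: str, known_answer: str) -> bool:
        """Compare an answer with the known solution and store the outcome."""
        correct = answer.strip().lower() == known_answer.strip().lower()
        self.gold_results.append(correct)
        return correct

    def gold_accuracy(self):
        """Share of gold tasks answered correctly, or None with no attempts."""
        if not self.gold_results:
            return None
        return sum(self.gold_results) / len(self.gold_results)

    def is_flagged(self) -> bool:
        """Flag only consistently low scores over the most recent gold tasks."""
        recent = self.gold_results[-MIN_GOLD_ATTEMPTS:]
        if len(recent) < MIN_GOLD_ATTEMPTS:
            return False  # too little evidence to flag a newcomer
        return sum(recent) / len(recent) < FLAG_THRESHOLD


def order_feed(tasks, worker):
    """Easy tasks first for new workers, harder tasks first once promoted."""
    experienced = worker.completed_hits >= PROMOTION_HITS
    return sorted(tasks,
                  key=lambda task: DIFFICULTY_RANK[task["difficulty"]],
                  reverse=experienced)
```

Flagging on a window of recent gold tasks, rather than on a single miss, is one way to read "consistently low scores". A requester-facing view could attach `gold_accuracy()` to each submitted HIT as part of the statistics mentioned above.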