WinterMilestone 10 AlgorithmicHummingBirds



The goals and discussion this week across all three domains focus on refining last week's work as we push toward our fast-approaching deadlines. At the end of the session we were left with three open-ended questions: (a) How exactly does leveling work? (b) What is the effect of promotion? (c) What is the minimal viable product of a social space for a guild?


A second pilot study was launched, and authored tasks were released on Amazon Mechanical Turk. It resulted in roughly 10 to 15% variance with 65-75% relative accuracy (after scaling). The team also hand-labeled more data to release.

This week the focus is on edge cases, boundary conditions and questions with respect to tasks with low variance or high disagreement. We also need an effective mechanism to identify such areas of interest.


The goal last week was to work on data collection mocks, study planning, and outlines. It was all about designing the information-gathering aspect of the front-end interface and thinking about its back-end implementation. The team has also been doing a great job on the timer aspects.

The basic idea for now is a matrix representation in which workers can try out all combinations, such as viewing both time and reputation, only reputation, only time, and so on. We could pair this matrix with a graph or tree data structure, which could make retrieval efficient and easy.
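The combinations described above can be enumerated mechanically. The following is a minimal sketch, assuming just two on/off cues named `time` and `reputation`; the function names and the "neither" label are hypothetical and not part of any existing Daemo code.

```python
from itertools import product

# Assumed cue names; the text mentions only time and reputation.
CUES = ("time", "reputation")

def condition_matrix(cues=CUES):
    """Return every on/off combination of the cues as a list of dicts."""
    return [dict(zip(cues, flags))
            for flags in product((False, True), repeat=len(cues))]

def label(condition):
    """Human-readable name for one condition, e.g. 'time + reputation'."""
    shown = [cue for cue, visible in condition.items() if visible]
    return " + ".join(shown) if shown else "neither"

conditions = condition_matrix()
labels = [label(c) for c in conditions]
```

Adding a third cue to `CUES` would double the number of conditions, so the matrix stays manageable only for a small number of cues.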

We could also use the collected data to test the hypothesis that workers are more inclined to do tasks for requesters with a low rejection rate.


This week we will focus on refining our answers to the previous week's open-ended questions.


The design test flight wing has been actively engaged in front end mocks and overall UI.


The current generation of crowd-sourcing platforms is surprisingly flawed, and these flaws are often overlooked or sidelined.

So, the idea is to channel efforts toward guilds and experiment to see to what extent they help minimize some of these issues, by building a next-generation crowd-sourcing platform, Daemo, integrated with a reputation system, Boomerang. The improved platform is expected to yield better results within limited time bounds (compared to existing platforms) and to be more efficient and representative.

Author Keywords

Crowd-source; Daemo; Boomerang; Guilds;


The worker is expected to fulfill criteria set up by the guild network, and his/her work is reviewed by senior members. We look at prolonged or continuous leveling with exams and critical tasks, which would be reviewed directly, with no involvement of any third-party agencies.

Instead of examination-based evaluation for leveling, we could look at workers' overall record, their reputation in the recent past, their quality rating, their position within the guild, and so on, which in my opinion might lead to more accurate results.

When a worker joins the guild and starts working, over time he/she builds up a set of successfully completed tasks to his/her credit. The worker then pays a small percentage of what he/she has earned, and this is routed to guild funds. The funds are used to pay for the peer review evaluations that follow. Each evaluation is a “task” in itself (it probably happens on a random subset of completed tasks rather than on every task, since each task would be reviewed by the task requester anyway). All the tasks that have been reviewed enter the worker's review record, which directly affects the worker's reputation within the guild: the worker may jump a few rankings and earn reputation, or he/she might lose a little by falling in the reputation ranking system.

Here, we might take the average of peer, senior, and requester ratings, and together these would affect the worker's standing. These evaluations are not final and binding: the worker holds the right to challenge them, which guards against unfair manipulation of reputation rankings.
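One way to read the averaging idea is a plain unweighted mean over whichever ratings are available. The sketch below assumes all three ratings share one numeric scale; the function name and the choice to skip missing ratings are assumptions, not a decided policy.

```python
from statistics import mean

def combined_rating(peer=None, senior=None, requester=None):
    """Unweighted mean of the ratings that are present.

    Assumes all ratings use the same numeric scale. Skipping missing
    ratings is an illustrative choice only.
    """
    ratings = [r for r in (peer, senior, requester) if r is not None]
    if not ratings:
        raise ValueError("at least one rating is required")
    return mean(ratings)

# Example: three ratings on an assumed 1-5 scale.
example = combined_rating(peer=4.0, senior=5.0, requester=3.0)
```

A weighted mean (for instance, giving senior reviews more weight) would be a natural variation, but the text does not specify weights, so none are assumed here.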

However, when we consider a random subset, we need to be careful not to pick gold standard or prototype tasks for evaluation. Those are based on ground truth, so it would make little sense for the worker to pay for a review of such a task or for peers to evaluate tasks whose answers are fairly obvious.
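The fee and sampling steps described above could look roughly like this. It is a sketch under stated assumptions: the 5% rate, the `gold` flag on each task, and both function names are hypothetical placeholders.

```python
import random

def guild_fee(earnings, rate=0.05):
    """Share of earnings routed to guild funds (the 5% rate is an assumption)."""
    return round(earnings * rate, 2)

def pick_for_review(tasks, k, seed=None):
    """Sample up to k completed tasks for peer review.

    Gold standard or prototype tasks (marked with an assumed 'gold' flag)
    are excluded, since their answers are already known.
    """
    eligible = [t for t in tasks if not t["gold"]]
    rng = random.Random(seed)
    return rng.sample(eligible, min(k, len(eligible)))

completed = [
    {"id": 1, "gold": False},
    {"id": 2, "gold": True},
    {"id": 3, "gold": False},
    {"id": 4, "gold": False},
]
picked = pick_for_review(completed, k=2, seed=0)
fee = guild_fee(200.0)
```

Filtering before sampling, rather than sampling and then discarding gold tasks, keeps the number of reviewed tasks predictable.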


There could be automatic thresholds that a worker needs to cross in order to get promoted.

x_(i+1) points, y_(i+1) tasks: PROMOTION #i+1

x_(i) points, y_(i) tasks: PROMOTION #i

x_(i-1) points, y_(i-1) tasks: PROMOTION #i-1

x_(i-2) points, y_(i-2) tasks: PROMOTION #i-2

and so on and so forth.
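The threshold ladder above can be sketched as a lookup over per-level minimums. The numbers below are invented purely for illustration, and the rule that a worker must clear every lower level first is an assumption.

```python
# Hypothetical (min_points, min_tasks) per level, lowest level first.
THRESHOLDS = [(0, 0), (100, 20), (300, 60), (700, 150)]

def level_for(points, tasks, thresholds=THRESHOLDS):
    """Return the highest level whose thresholds, and all lower ones, are met."""
    level = 0
    for i, (min_points, min_tasks) in enumerate(thresholds):
        if points >= min_points and tasks >= min_tasks:
            level = i
        else:
            break  # a worker cannot skip a level they have not cleared
    return level
```

For example, `level_for(350, 70)` gives level 2 under these invented thresholds, while `level_for(350, 10)` stays at level 0 because the task count for level 1 is not met.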

We could also have tasks reviewed and peer reviewed, with promotion based on the ranking, points, and badges received. Promotion here would be more of a social decision (they would be paid for it).

Or we could have automatic ranking systems in which promotions are also largely automatic (for higher-ranked workers) and would manifest as reputation within the system.

Or we could have third-party decisions, where a person who performs well consistently would be considered for promotion.

Any of these could be considered, but we might need to make a collective decision on which to choose, ideally supported by substantial experimentation and data.

Price points will be set manually by the guild initially but will later be adjusted based on actual mean hourly wages for each guild level. Pricing is therefore dynamic and based on real market behavior, as opposed to a computationally simplified model.
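A minimal sketch of this switch from manual to data-driven pricing follows. The `min_samples` cutoff and the function name are assumptions; the text does not say when the adjustment should begin.

```python
from statistics import mean

def price_for_level(manual_price, observed_hourly_wages, min_samples=30):
    """Price point for one guild level.

    Uses the guild's manual price until enough wage observations exist
    (min_samples is an assumed cutoff), then the observed mean hourly wage.
    """
    if len(observed_hourly_wages) < min_samples:
        return manual_price
    return mean(observed_hourly_wages)

# Too few observations: fall back to the manually set price.
early = price_for_level(6.0, [5.0, 7.0], min_samples=3)
# Enough observations: use the observed mean.
later = price_for_level(6.0, [5.0, 7.0, 9.0], min_samples=3)
```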


We could have a Forum or Discussion Board with public and private areas where tasks and ideas would be discussed, like a peer doubt-clarification session within a guild. However, we could have requesters or requester assistants monitor the session to make sure that the answers themselves, or crucial clues to the tasks, are not shared. Other peers could flag a comment or reply if they feel it is abusive or gives away answers, and if that proves to be so, it could directly affect the poster's reputation ranking. We could “clone” a GitHub kind of model where a worker or requester can open an “issue” tagged with the task's hashtag. It is open for members of that particular guild to join. The visibility of the task and the discussion is limited to the guild. It also has options for private chat between peers and requesters, where visibility is limited.

We could have a Q&A community with voting (the Stack Overflow model, for instance), which would prevent everyone from spamming the requester's inbox with similar questions. A worker could post a question (regarding the task, the evaluation policy, the pricing, etc.) and anyone would be able to see and answer it. The requester could verify and correct answers later. This could also be extended so that workers receive important updates about the task, such as changed deadlines.

We could have a Knowledge Base. Assume that a task spans multiple domains or requires deep experiential knowledge to solve, or that a requester wants workers to refer to a paper or website before attempting it. It would of course make sense to put this material up in the discussion forum, but it would become too cumbersome to search for a piece of information, since there would be a lot of activity. Workers would end up missing a crucial part of the information, and attempting the task without the requisite knowledge would end in rejection or give rise to unforeseen variance. Naturally, such a task would be rewarding, not only in terms of knowledge but also in terms of monetary prospects. In such a case, a worker would either not have the confidence to rise to the challenge, or would have the confidence but be unsure of the direction in which to proceed. If this really is the case, the task would be open only to a very small group of highly skilled workers (probably at the top) within the guild, and the requester would have very little data on which to base his/her results. So, we could have a separate platform where the requester could choose to guide (if he/she has the time) a few interested workers on how to proceed, what to read, and so on.


We could set up a Discourse-style system with flair levels and a Slack-like organization with management options and so on.