Winter Milestone 7 Algorithmic Hummingbirds

From crowdresearch



We have been detailing specific aspects of the core idea and configuring it for particular functionality in three domains: task authorship, task ranking, and open governance. The main goal for this week is to synthesize these ideas into a systems design proposal.

Tracking the progress in each of the three domains:


The goal here is to align and converge the visions behind the work published over the last few weeks.

We continue to focus on variance, launch pilot studies with different volunteer groups, and start writing the method section. We then apply well-known statistical methods to the obtained data.

a. We also looked at a back-off strategy that applies sentiment analysis to the data (for which we can take advantage of our own crowd and check empirical variance before testing outside this crowd body). For this purpose we use benchmarked tasks (tasks used to standardize crowdsourcing algorithms, so to speak). So, can poorly designed tasks give us insight into human interpretation and psychology that could be used recursively to refine task design and to understand workers better, and can this data provide crucial information to other research domains? The major challenges would be storing such a huge amount of data, applying inferential methods to it (which might require preprocessing to bring it into a structured form), and the overhead involved in making it work not just for people of a particular geography or literacy level but for a global population.
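The empirical-variance check described above can be sketched very simply: collect each annotator's label on a benchmarked task and compute the per-task variance, so that high-disagreement tasks stand out before we test outside our own crowd. This is only a minimal sketch; the task IDs and the numeric label encoding (-1/0/+1 for negative/neutral/positive) are illustrative assumptions, not part of the actual pipeline.

```python
from statistics import pvariance

def label_variance(labels_per_task):
    """Per-task empirical variance of numeric sentiment labels
    (assumed encoding: -1 = negative, 0 = neutral, +1 = positive)."""
    return {task: pvariance(labels) for task, labels in labels_per_task.items()}

# Hypothetical labels from our own crowd on two benchmarked tasks.
labels = {
    "tweet_001": [1, 1, 1, 1],    # annotators agree: zero variance
    "tweet_002": [-1, 1, 0, 1],   # annotators disagree: high variance
}
variances = label_variance(labels)
```

Tasks whose variance stays high even for well-formed items are candidates for the "poorly designed task" analysis above.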

We could further explore the idea of design templates (preferably different ones for different types of tasks). We could experiment with two groups, one that uses this feature and one that doesn't, and see whether result quality varies significantly. Further, we could allow requesters to reuse their previous designs and templates for future tasks.
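The two-group comparison above needs a significance check; one simple, assumption-light option is a permutation test on the difference in mean quality. The quality scores below are hypothetical, and the permutation test is a sketch of one possible analysis, not the study's committed method.

```python
import random

def permutation_test(group_a, group_b, trials=10000, seed=0):
    """Estimate a p-value for the observed difference in mean quality
    by repeatedly reshuffling group labels and re-measuring the gap."""
    rng = random.Random(seed)
    observed = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
    pooled = group_a + group_b
    extreme = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        a, b = pooled[:len(group_a)], pooled[len(group_a):]
        if abs(sum(a) / len(a) - sum(b) / len(b)) >= observed:
            extreme += 1
    return extreme / trials

# Hypothetical result-quality scores: template group vs. control group.
with_template = [0.9, 0.85, 0.92, 0.88, 0.91]
without_template = [0.7, 0.75, 0.68, 0.72, 0.74]
p = permutation_test(with_template, without_template)
```

A small p-value would indicate that the template intervention is associated with a real quality difference rather than noise.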

We had been looking for volunteers over the past week, working mainly on labeling tweets (sentiment labeling), labeling football scenario reactions, and labeling restaurant features.

The goal for the upcoming week is to run the authored tasks and gather analytics on the resulting data (along with a display of the rejection rate). We also improve Boomerang's classification system, where the details database is currently overwritten. If a worker has prior experience of some form, guilds need to be configured to account for it (this may not matter much for micro-tasks).
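The rejection-rate display mentioned above reduces to a small aggregation over the submission log. A minimal sketch, assuming a log of (task, accepted) pairs; the task names are illustrative and not actual Daemo identifiers.

```python
def rejection_rate(submissions):
    """Fraction of rejected submissions per task, from a log of
    (task_id, accepted) pairs."""
    totals, rejected = {}, {}
    for task_id, accepted in submissions:
        totals[task_id] = totals.get(task_id, 0) + 1
        if not accepted:
            rejected[task_id] = rejected.get(task_id, 0) + 1
    return {t: rejected.get(t, 0) / n for t, n in totals.items()}

# Hypothetical submission log from the volunteer labeling tasks.
log = [("label_tweets", True), ("label_tweets", False),
       ("label_tweets", True), ("label_restaurants", True)]
rates = rejection_rate(log)
```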


The focus is on building a system that relies on personal details and information that the worker might not share accurately. So we incentivize this process: revealing information could yield a potentially better experience. We also focus and converge our thoughts into a methods section writeup.

We customize task time on an hourly basis. We could have some kind of B-tree structure running in the back end that exhaustively calculates all possible configurations. This would also help surface rejection information.
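Whatever index structure the back end ends up using, the exhaustive calculation itself is just a Cartesian product over the per-parameter options. A brute-force sketch of that enumeration, with an entirely hypothetical configuration space (duration and pay are placeholder parameters, not Daemo's actual schema):

```python
from itertools import product

def all_configurations(options):
    """Exhaustively enumerate every task configuration from
    per-parameter option lists (a brute-force stand-in for the
    proposed back-end search)."""
    keys = sorted(options)
    return [dict(zip(keys, combo))
            for combo in product(*(options[k] for k in keys))]

# Hypothetical configuration space for an hourly task.
configs = all_configurations({
    "duration_hours": [1, 2, 4],
    "pay_per_hour": [6.0, 8.0],
})
```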

We can have requesters tag tasks while authoring them. Alternatively, could we use deep learning techniques (e.g., convolutional neural networks) with previous or pre-generated data as a training set, and build an auto-tagging system integrated with Boomerang for Daemo?
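Before reaching for deep networks, the auto-tagging idea can be prototyped with a much simpler bag-of-words classifier trained on the manually tagged tasks. This is a stand-in sketch, not the proposed architecture; the class name, training phrases, and tags are all hypothetical.

```python
from collections import Counter, defaultdict
from math import log

class AutoTagger:
    """Minimal naive-Bayes-style text classifier: learn from manually
    tagged tasks, then predict a tag for new task descriptions."""
    def __init__(self):
        self.word_counts = defaultdict(Counter)
        self.tag_counts = Counter()

    def train(self, text, tag):
        self.tag_counts[tag] += 1
        self.word_counts[tag].update(text.lower().split())

    def predict(self, text):
        words = text.lower().split()
        def score(tag):
            total = sum(self.word_counts[tag].values()) + 1
            s = log(self.tag_counts[tag])
            for w in words:  # add-one smoothing for unseen words
                s += log((self.word_counts[tag][w] + 1) / total)
            return s
        return max(self.tag_counts, key=score)

tagger = AutoTagger()
tagger.train("label the sentiment of these tweets", "sentiment")
tagger.train("rate restaurant food and service", "restaurants")
tag = tagger.predict("classify sentiment in tweets")
```

Once such a baseline is in place, its accuracy gives a floor that any heavier model would need to beat.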

The goal for the upcoming week is to design the front end and also account for the incentives of the system, such that not all tasks are simply accepted or rejected; any rejection also shows up in your task feed.


Refine the guild to a minimalistic structure. We also discuss output transducers, where members of the guild would process the output before it is returned to the requester.

The goal for the week is to tease apart mentorship (for new members) and guild prototyping with some threshold mechanism.

4. DESIGN TEST FLIGHT We are working on aspects where requesters can choose to post into a personal guild or into a general pool of tasks globally accessible within Daemo. Guild owners will assign tasks to a worker; an assigned task appears at the top of the list, but workers continue to have access to all other tasks.
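The feed-ordering rule above (assigned tasks pinned to the top, everything else still visible) can be sketched in a few lines. The task dictionaries and IDs below are illustrative, not Daemo's actual data model.

```python
def order_feed(tasks, assigned_ids):
    """Pin tasks assigned by a guild owner to the top of a worker's
    feed while keeping every other task accessible below them."""
    assigned = [t for t in tasks if t["id"] in assigned_ids]
    others = [t for t in tasks if t["id"] not in assigned_ids]
    return assigned + others

# Hypothetical feed: one general-pool task, one guild-assigned task.
feed = order_feed(
    [{"id": 1, "title": "general pool task"},
     {"id": 2, "title": "guild-assigned task"}],
    assigned_ids={2},
)
```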

5. RESEARCH ENGINEERING The volunteers have landed many changes live on Daemo and are now working on the implementation of other patches.


The current generation of crowdsourcing platforms is surprisingly flawed, and these flaws are often overlooked or sidelined.

So the idea is to channel efforts in the direction of guilds and experiment to see to what extent this helps minimize some of these issues, by building a next-generation crowdsourcing platform, Daemo, integrated with a reputation system, Boomerang. The improved platform is expected to yield better results within limited time bounds (compared to existing platforms) and, ultimately, a more efficient and representative crowd platform.

Author Keywords

Crowdsourcing; Daemo; Boomerang; guilds; sub-guild systems


Crowdsourcing is typically defined as the process of obtaining services, ideas, or content by soliciting contributions from a large group of people, especially from an online community, rather than from traditional employees or suppliers. So what exactly happens on these crowdsourcing platforms? For example, Amazon Mechanical Turk (popularly MTurk) is a crowdsourcing Internet marketplace that enables individuals and requesters to coordinate the use of human intelligence, via Human Intelligence Tasks (HITs), to perform work that computers are currently unable to do. In the current scenario, neither are requesters able to ensure high-quality results, nor are workers able to work conveniently. Current-generation crowdsourcing platforms like TaskRabbit, Amazon Mechanical Turk, and so on do not ensure high-quality results, produce inefficient tasks, and suffer from poor worker-requester relationships. To overcome these issues, we propose a new standard (the next de facto platform), Daemo, which includes Boomerang, a reputation system that introduces alignment between ratings and the likelihood of collaboration.


Consider a guild system to be a connected group of workers working in a similar domain. When a task is first published, we assume that an auto-tagging engine is in place and the task gets tagged. The auto-tagging feature would work roughly as follows: it is built using machine learning techniques. We ask requesters to manually tag tasks initially; the machine "learns" from these, and once manual tagging is dropped, the machine automatically infers which domain a particular task belongs to. It is not possible, or even theoretically correct, to assume that this system is fully accurate, so we introduce a few design interventions.

The default tag of any given task is "general". If the auto-tagging system fails to recognize the domain of a particular task, or the author specifies no explicit constraints on the qualification of the worker who attempts it, then the task is open to any audience who wishes to attempt it (given that Boomerang is to be improved to filter out spam workers by studying their user interaction). If the task is tagged under one or more domains, then we open the task to the guild or channel of that domain first. It moves outside that periphery only if the task fails to be completed within the specified time frame. An experiment in this direction may confirm or falsify the hypothesis about the effect on output quality. The clear disadvantage is that the distribution of opportunity may seem unfair, or the restriction of tasks prejudiced, but let us assume for now that domain-tagged and general-category tasks are more or less equal in number, as they should eventually turn out to be. Also, what if a task requires collaboration between two or more domains, or is tagged under multiple domains but does not really require that sort of collaboration? These ideas are explored later in this document.
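The routing rule described above (general tasks to everyone, tagged tasks to the matching guild first, with escalation to the general pool after a deadline) can be sketched as follows. The field names, audiences, and timestamp convention are illustrative assumptions, not Daemo's actual API.

```python
import time

GENERAL = "general"

def route_task(task, now=None):
    """Decide which audience sees a task: the guilds for its tags
    first, falling back to the general pool if the guild deadline
    passes with the task untaken."""
    now = now if now is not None else time.time()
    tags = task.get("tags") or [GENERAL]
    if tags == [GENERAL]:
        return "general-pool"
    if now < task["guild_deadline"]:
        return "guilds:" + ",".join(tags)
    return "general-pool"  # escalate after the guild time frame expires

# Hypothetical task tagged "sentiment" with a guild-first deadline.
task = {"tags": ["sentiment"], "guild_deadline": 100.0}
before = route_task(task, now=50.0)
after = route_task(task, now=150.0)
```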

Talking about guilds and their computational capability, we can have requesters interact with one representative of a guild community (but would an equal distribution of power work better? What about guilds with larger populations?). Tasks are distributed among guild members. Collaboration and transparency are encouraged within the guild (of course, interactions need to be monitored to prevent security issues or answer-sharing).

The possible interventions in this regard could be:

a. Workers joining a guild present a dossier with links to a CV, GitHub, blogs, etc.

b. Pull attributes from the database, where they gravitate towards the skill vectors the guild supports; these are pulled, refined, and updated.

c. Create a leveling system based on proximity to the various vectors (no proximity to any of them would mean the system recommends professional training to acquire the needed attributes).

d. With real-time systems, managers of the guild can have an accounting of the workforce when a task arrives.
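The proximity-based leveling in (c) can be sketched with cosine similarity between a worker's attribute vector and the guild's skill vector. The skill axes, thresholds, and level counts below are all hypothetical choices for illustration.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def level(worker_vec, guild_vec, thresholds=(0.3, 0.6, 0.9)):
    """Map a worker's proximity to the guild's skill vector onto
    levels; level 0 (no proximity) signals that professional
    training is recommended."""
    sim = cosine(worker_vec, guild_vec)
    return sum(sim >= t for t in thresholds)

guild = [1.0, 1.0, 0.0]                  # hypothetical guild skill axes
novice = level([0.0, 0.0, 1.0], guild)   # orthogonal to the guild
expert = level([1.0, 1.0, 0.1], guild)   # closely aligned
```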

Daemo workers can essentially visualize leveling up as they gain expertise and experience.

Using the whole system, could we reproduce online or social learning on Daemo?


We can configure the guild along the lines of a Stack Overflow-style interface, where a pre-established system helps users manage complicated tasks within the guild. Essentially, a major guild would be composed of sub-guilds (smaller units of guilds) that work in collaboration while remaining quite manageable.
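Structurally, a guild composed of sub-guilds is a simple tree, which makes operations like "who can this task reach?" a recursive traversal. A minimal sketch; the guild names and members are hypothetical.

```python
class Guild:
    """A guild composed of sub-guilds, which are themselves
    smaller guilds."""
    def __init__(self, name, members=()):
        self.name = name
        self.members = list(members)
        self.sub_guilds = []

    def add_sub_guild(self, guild):
        self.sub_guilds.append(guild)
        return guild

    def all_members(self):
        """Members of this guild plus all its sub-guilds, recursively."""
        found = list(self.members)
        for sub in self.sub_guilds:
            found.extend(sub.all_members())
        return found

# Hypothetical major guild with two sub-guilds.
nlp = Guild("nlp", ["ana"])
nlp.add_sub_guild(Guild("sentiment", ["ben", "cho"]))
nlp.add_sub_guild(Guild("translation", ["dee"]))
members = nlp.all_members()
```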


We could explore a professional-association model that builds and classifies users like a skill tree, where people are organized into a pyramid-like structure: people with the highest skill levels sit at the top, and people with the lowest skills (or perhaps new workers) sit near the bottom. We would expect to find more people at intermediate levels than at the top or bottom. Of course, if that is true, then it wouldn't be a pyramid anymore, but the analogy is used here for illustration purposes only.


We said we would encourage collaboration within the guild. But wouldn't too many people being involved in a single task, from one or more domains, cause confusion and hence delay? Can we optimize some ratio in this regard? Can we implicitly or explicitly weight the tags before optimizing this ratio?


You give feedback to members within your guild and to members across domains with respect to a particular task you worked on. Then we apply Boomerang within the guild, where the feedback rating directly impacts your chances of working with that person again.


This part of the paper mainly deals with the open governance design aspects of special guilds. We sincerely hope that the methods suggested above will help build a better crowd platform, for a better world where crowd workers are represented and respected.