Milestone 1 Sky

From crowdresearch
Revision as of 03:23, 5 March 2015 by Farzadsalimijazi (Talk | contribs) (Experience the life of a Requester on Mechanical Turk)




First of all, we should say that MTurk is great. It creates many opportunities for industry, for scientific research, and for the many people who want to earn some money on top of their income (because it’s hard to earn enough to cover all of your expenses). So we like the concept, and it’s a great starting point. As you read the rest of this page, keep in mind that we like MTurk overall; these are the things that we think would make it a better tool for its mission.

MTurk is a great BASIC crowdsourcing platform (it has the basic features of a comprehensive platform)

We think that if we had to choose one basic platform and try to improve it, it would be MTurk, because other platforms cover only a small part of its vision. However, each feature could be improved with some small changes.

Experience the life of a Worker on Mechanical Turk

We will first address the problems and obstacles in using MTurk and then propose possible solutions to them. Although writing solutions is not part of this milestone and the deadline is close, we will discuss some solutions here and add more, in more detail, in future updates. We use this structure because some solutions address several problems. In fact, you can find some of the solutions in the descriptions of other systems.

We would appreciate it if you added anything to it. Let’s start :) Enjoy!


Let’s see what the problems are, from the very start:

Authentication Requirement

First of all, according to the group’s feedback on the registration process, I think it is too frustrating, and an SSN is not the only way to authenticate. Lots of people, including sonal.7271 and myself, are reluctant to provide this, as she said, CONFIDENTIAL information. You may say the SSN is required for tax information or that we need it for payments!?

Another problem is that the MTurk signup mechanism really restricts which workers and requesters can participate (we can say you are going to be approved only if you are a US citizen or permanent resident). We need something for the whole world, not just the US, and crowdsourcing actually has more contributors in other countries than in the US.

Long account approval/rejection time

If you have kept track of Slack over the last few days, you have seen everybody complaining about the long approval period. However, I think it is not necessary at all with a better self-service account approval/rejection structure. (I will explain in the next update.)

Finding HITs or work

Now we are registered and we want to use it as a worker. The first question is how to find work to do. What are the ways of finding HITs in MTurk? MTurk provides a search text box for searching by keyword and two radio buttons for specifying whether you want results based on your qualifications or not. There are a couple of problems that all of us who have tried to find HITs are aware of...

Personalizing the search results

Yes, this is the first question: how can I find HITs based on my skills?! I spent lots of time looking for a way to search by specific skills and couldn't find one. For example, if I have skills x, y, and z, how can I search among hundreds of thousands of HITs? Some of you may say requesters can use proper keywords and workers can use proper keywords to search :-| ! So what has MTurk done here to justify charging a 10% commission!?! (We will propose some solutions.)

Also, if you look at requesters' keyword selection, you see it is not going to help much, because the keywords are almost the same across a big category. For example, most HITs are surveys, and all of them have "survey" in their keywords plus one or two keywords that are too detailed; I bet even workers with the relevant skills and knowledge are not going to search for those words.

Bad results even in keyword search (try it, please!)

Okay, let’s say we have no other way, so let’s search by keywords :|. The problem is that it does not work for multi-keyword searches!! For example, take a HIT with three keywords (Survey, CW, Approve): you get that HIT as a result only when all of the words in your query exist in the HIT's keywords. It means that if your query has a single extra word, the result will be “Your search did not match any HITs.”!! Are you serious! So how can I search? (We will discuss this in the next section :)) In other words, keyword search only works for single-keyword queries and for queries in which every word exists in the HIT's keywords :(. CW is an abbreviation of Casting words; if you search for "Casting words" you get no results, although the HIT has "Casting words" in its description and "Castingwords" in its keywords. (But we can solve this easily; we will talk about it in the next section ;))
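One possible fix for the search behavior described above, as a minimal sketch and not MTurk's actual search: rank HITs by how many query words they contain instead of requiring every word to match, and compare keywords with spaces and case removed so that "Casting words" finds "Castingwords". The HIT records, the field names (`title`, `keywords`, `description`), and the scoring rule are all assumptions made up for illustration.

```python
import re

def words(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def squash(text):
    """Lowercase the text and drop spaces/punctuation: 'Casting words' -> 'castingwords'."""
    return re.sub(r"[^a-z0-9]", "", text.lower())

def score(query, hit):
    """Hypothetical relevance score: number of query words found in the HIT."""
    hit_words = words(hit["description"])
    for keyword in hit["keywords"]:
        hit_words |= words(keyword)
    matched = len(words(query) & hit_words)
    # Extra point when the whole query equals a keyword once spaces are removed.
    if squash(query) in {squash(k) for k in hit["keywords"]}:
        matched += 1
    return matched

def search(query, hits):
    """Return titles of HITs matching at least one query word, best match first."""
    scored = [(score(query, hit), hit["title"]) for hit in hits]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored]

# Made-up sample data mirroring the example in the text.
sample_hits = [
    {"title": "Transcription check",
     "keywords": ["Survey", "CW", "Approve", "Castingwords"],
     "description": "Approve a transcript from Casting words"},
    {"title": "Image labels",
     "keywords": ["image", "tagging"],
     "description": "Tag objects in photos"},
]
```

With this rule, a query with an extra word such as "survey approve extra" still returns the first sample HIT, because partial matches are ranked instead of discarded.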

No strategy for specifying a fair value for a HIT

OK, let’s say we were lucky, searched for the proper keywords, and found an awesome HIT (I did it ;'( ). But wait a minute: JUST $0.02 (at least it is not 1 cent).

Fair value depends on two factors: the time needed to complete the task and the payment rate.

At this moment you have two questions, which are actually MTurk’s problems!

- First, $0.02 for how much time? Let’s see: there is a 'Time Allotted' field, which is the time within which you can finish the HIT or return it, but that does not answer my question, right?

Some requesters kindly write in the HIT's description how many questions there are or how much time it takes :). I appreciate MOST of them, because you never know how long "a FEW minutes" is!!

So you never know unless you accept the HIT and rely on your luck (I’m not lucky, what about you?).

- Second, payment rates are really low! I am sure there is no HIT that takes less than 1 minute, so 2 cents per minute means at most $1.20 per hour!! Obviously that is not fair!! And you are also charged tax on this $1.20!!! (So we are going to have a big tax return, right?!!)
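The arithmetic above can be written down as a small sketch. The first function converts a reward and a completion time into an hourly rate; the second goes the other way and computes the reward needed to reach a target hourly rate. The function names and the 6.00 dollar target in the usage note are hypothetical, chosen only for illustration.

```python
def hourly_rate(reward_usd, minutes_per_hit):
    """Effective hourly earnings if every HIT pays reward_usd and takes minutes_per_hit."""
    return reward_usd * 60 / minutes_per_hit

def reward_for_target(target_hourly_usd, minutes_per_hit):
    """Reward per HIT needed to reach target_hourly_usd (a hypothetical fairness target)."""
    return target_hourly_usd * minutes_per_hit / 60
```

For the example in the text, `hourly_rate(0.02, 1)` gives 1.2 dollars per hour, and the rate only drops if the HIT takes longer than a minute. Going the other way, a hypothetical target of 6.00 dollars per hour would need a reward of 0.10 dollars for a one-minute HIT.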

But there are a couple of really easy ways to address this problem (they are so easy; we will discuss them in the next update :))

Possible Malware (could be unsafe)

You have probably seen some HITs that ask you to install a plugin or click on external links. Is that safe?! Does Amazon check them for safety? If not, what are MTurk's mechanisms for reducing this risk as much as possible? Actually, lots of HITs look more like malware than work, and interestingly they are the ones with a fairer value (more than $1)!! For the previous reasons we spend too much time finding a HIT and finally end up with a suspicious one. (Most of the time I give up on doing it; what about you?)

Confusing procedure for doing HITs and getting rewarded for completed ones

Let’s say we decided to do a HIT and started the task. We are at the final stage and we have exceeded the allotted time, closed the window accidentally, or missed the page for entering the 'completion code' (it’s confusing). The only option is to contact the requester about the problem, and you will see that the only way to contact the requester is the contact link below the HIT description. And if you try to find that HIT again through search, you will see it is time-consuming, or sometimes the HIT is gone!!

And sometimes you finish the HIT but do not receive the confirmation code from it!!! (This one is against humanity.)

But there are easy solutions for this problem too!

Rejection rate is too high (work for free)

There is no mechanism for controlling the rejection rate! The reason is obvious: requesters do not pay any cost for a rejection! When the reason is obvious, the solution is obvious too (we will update later).

Also, workers cannot even tell whether a HIT comes from a requester looking for free work or not!! There is no clue for them to figure it out, right?

MTurk does not learn at all from our history

Let’s say we overcame all of these problems, completed a couple of HITs, and earned some money in exchange for our valuable time (I think Amazon does not see it the way we do). What about the next HIT?!!? Yeah, you have to go through all of these steps again for a new HIT, just like at the beginning!!! (I think Amazon should pay the commission, not us.)

This problem is the main reason for workers' lack of job security. I mean that MTurk does not encourage users to do more or help them have more options in the future, except through the Qualification concept (this is one thing that I like).

We need a mechanism that offers a stream of HITs to workers based on their skills, performance, and previous HITs, right?

MTurk does not recommend HITs, not even as keyword suggestions!!

Even the simplest recommenders could make finding HITs more efficient. Just suggesting keywords based on the current keyword could help a lot with the problem above, right?
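As a rough illustration of how simple such a keyword suggester could be, the sketch below counts which keywords appear together with the current keyword across HITs and suggests the most frequent ones. This is an assumption about one possible design, not something MTurk offers; the function name and the input format (one keyword list per HIT) are hypothetical.

```python
from collections import Counter

def suggest_keywords(current, hits_keywords, top_n=3):
    """Suggest keywords that most often co-occur with `current` in the same HIT."""
    current = current.lower()
    counts = Counter()
    for keywords in hits_keywords:
        lowered = {k.lower() for k in keywords}
        if current in lowered:
            # Count every other keyword attached to a HIT that has `current`.
            counts.update(lowered - {current})
    return [keyword for keyword, _ in counts.most_common(top_n)]
```

A worker who types "survey" would then be offered the keywords that requesters most often attach to survey HITs, which narrows the search without the worker having to guess the exact words.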

Statistics on workers' activity are not enough

We could be given more statistics in addition to how much we earned, such as our work history, the time we spent on each HIT, previous requesters, reasons for rejection...

MTurk cannot be used when people want to use it!

The majority of people want to do HITs in their spare time, mostly when they cannot do their own work. For example, I would like to use my time at the bus station to earn money, even if it is 2 cents, but I only have my phone. And if I open MTurk in my phone's browser, you know it is difficult to work with.

Improving the concept of Qualifications

Qualifications could be based on more than just a value and the number of completed HITs.

Workers cannot rate requesters

Ratings could help with lots of problems related to free riders.

Experience the life of a Requester on Mechanical Turk

Unfortunately, only one of our group members got approved, and no more than two days ago, so we could only prepare a survey as a request. It is ready to publish, but we are waiting on a funding problem with Amazon Payments. We are going to provide the CSV as soon as it is prepared and we have the results. We designed a precise and comprehensive survey based on our findings; we want to see how right we are in our opinions.

Now let’s look at MTurk from the requester's perspective.

Sign-up procedure does not scale to all countries

We can say that only Americans have a chance to use it. The question is why we need these restrictions, even from a business perspective (or from the government's, which profits through tax :| ). More workers and requesters mean more profit (commission). The concept of crowdsourcing here is that you just need to be human, not even an expert human, to do these HITs, right? (Iran, my country, is forbidden from trade, but what about other countries like India?)

Inflexible HITs

Modifying and configuring a HIT after creation is sometimes not possible and in some cases really confusing. For example, modifying the price or keywords means creating a new HIT. We need to be able to modify HITs according to workers' responses (allowing only increases is the best option, or a dynamic price based on demand over time).

Mono-motivation (just the reward value and sometimes a bonus)

If you are a real requester (not one looking for free workers), you need more types of incentive to make your HITs attractive. A reward that could turn out to be nothing (because of the rejection rate) is not enough. Having just one incentive has another problem too: as the rejection rate rises, people stop trusting that incentive (it is not as motivating as it was in the beginning).

There are a couple of easy solutions for this, right :) ? I think the MTurk designers did lots of things in favor of requesters. Not only does this go against workers' rights, it also goes against real requesters. We are going to address a couple of these issues:

HITs do not have any rating from workers

Ratings could be a reward for real requesters and get their jobs done faster. See, it works :)

Requesters cannot use workers' profile information

If you have done a survey on MTurk, you have seen that lots of the questions are repetitive. What’s the point of creating and answering these questions again and again? You know time is valuable for both sides.

Also, what if the worker tells you the wrong age :| ? Workers can provide wrong information for some reason and affect the task results.

Required information for using various crowdsourcing platforms

For more, click on Sign up required information

Explore alternative crowd-labor markets

Compare and contrast the crowd-labor market you just explored (TaskRabbit/oDesk/GalaxyZoo) to Mechanical Turk.


TaskRabbit: The tasks are manual, non-professional, non-technical jobs. The task interface is easy for the requester to work with. It prompts the requester for the type of task from a set of categories (Cleaning, Handyman, Moving Help, …). Then it asks whether the task requires the worker to have a vehicle, asks for a day and a time slot, and matches the task with an appropriate person.

Nice time-based job allocation

This would be a nice capability for our crowdsourcing platform: we could support time-dependent tasks, say that certain HITs need to be done at a specific time, collect workers' free time slots, and assign HITs according to those slots.
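A minimal sketch of what such slot-based assignment could look like, under simplifying assumptions: times are whole hours of one day, each worker takes at most one task, and tasks are assigned greedily in the order given. The function name and data layout are hypothetical and are not taken from TaskRabbit.

```python
def assign_by_time_slot(tasks, workers):
    """Greedily match time-dependent tasks to workers whose free slot covers them.

    tasks:   list of (task_id, start_hour, end_hour)
    workers: dict mapping worker name -> list of (free_from, free_until) slots
    Returns a dict task_id -> worker; tasks nobody can cover are left out.
    """
    assignments = {}
    busy = set()  # simplifying assumption: one task per worker
    for task_id, start, end in tasks:
        for worker, slots in workers.items():
            if worker in busy:
                continue
            # The task fits if one free slot fully contains its time window.
            if any(free_from <= start and end <= free_until
                   for free_from, free_until in slots):
                assignments[task_id] = worker
                busy.add(worker)
                break
    return assignments
```

A real platform would probably want a smarter matching than first-come greedy assignment, but even this version shows that the only extra data needed is a time window per HIT and free slots per worker.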


GalaxyZoo is a system for classifying images of galaxies. The classification is done by volunteers. The user can start classifying immediately after entering the web interface. At each step of classification, it gives the crowd a very short description and a bit of background knowledge, and then moves to the next step. Each image should be classified, and the system then asks the user to discuss and justify their answer before moving to the next classification step, until it converges on one of the classes.

Volunteer-based tasks

Many people, especially in developed countries, really want to contribute to science or social activities. One category of HITs that a new platform should consider supporting is this kind of task.




MobileWorks: The application process for a job takes 2-3 weeks, and they ask only for basic English comprehension and no further proof. For comparison, about 56% of Mechanical Turk workers (“Turkers”) are from the United States and 36% from India. Desktop computers have reached only 0.9% of the Indian population, whereas almost 50% have cellphones. On MTurk, the minimum commission charged is $0.005 per HIT, and if you choose to send HITs exclusively to Mechanical Turk Masters there is an additional fee of 20% of the reward you set for workers.

The problem with MobileWorks is that they have to chunk the data into very small pieces so that it can be received by SMS, and because of the small mobile screen a chunk should not be more than two words of a document. The rates depend on the size of the business. The tasks involve OCR and object tagging. The worker's accuracy is taken into account for their future payments: the payment is a function of the task and the worker's accuracy, and the cost of the task is calculated while the task is running. On Amazon's MTurk, the cost of the task is predefined by the requester. MobileWorks has a very lightweight user interface, with an authentication screen after which the tasks come up right away. The tasks are iterative, and three iterations converge to 99.89% accuracy. An average of 120 tasks can be done per hour and 0.55 dollars is paid per hour, so the average payment is about 0.0046 dollars per task. The newer update to MobileWorks is its integration with LeadGenius, providing a real working environment with the worker's desired configuration.

Pros & Cons & New Idea

Nice quality validation mechanism :)

Quality is maintained using multiple entry. Each task is distributed to two workers until two of the answers match. If a worker provides an incorrect answer, her quality rating decreases. Conversely, a worker that provides a correct answer will see an increase in quality score.

This mechanism has two advantages. First, validation is done automatically by the workers themselves, with more accuracy. (Compare it with MTurk's overall rejection strategy.) We could use this strategy for a huge share of HITs. It could also reduce suspicious rejections by requesters.

Second, it is faster and also fault tolerant.
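One reading of the multiple-entry scheme described above can be sketched as follows: answers are collected until two of them match, the matching answer is accepted, and each worker's quality score moves up or down depending on whether their answer agrees with it. The step size, the starting score of 0.5, and the clamping to the range 0 to 1 are assumptions made for the sketch; the source does not say how the quality rating is actually computed.

```python
from collections import Counter

def resolve_task(answers):
    """Collect (worker, answer) pairs in arrival order until two answers match.

    Returns (accepted_answer, answers_seen); accepted_answer is None when
    no two answers agree.
    """
    counts = Counter()
    seen = []
    for worker, answer in answers:
        seen.append((worker, answer))
        counts[answer] += 1
        if counts[answer] == 2:
            return answer, seen
    return None, seen

def update_quality(quality, accepted, seen, step=0.1):
    """Raise the score of workers who matched the accepted answer, lower the rest.

    `step` and the 0.5 starting score are hypothetical values.
    """
    for worker, answer in seen:
        delta = step if answer == accepted else -step
        new_score = quality.get(worker, 0.5) + delta
        quality[worker] = min(1.0, max(0.0, new_score))
    return quality
```

The fault tolerance mentioned above shows up in `resolve_task`: a wrong or missing answer from one worker only delays acceptance until two other workers agree.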

Nice payment strategy :) (It’s fair; at least you can make a decision before spending time)

We use the historical accuracy of the worker to model the future payment of the tasks she is assigned. Hence, for each worker, an a priori payment is a function of the task and her quality of the work. Since the cost of each task is calculated before the task is complete, the worker can view how much money she has made in real-time.

Compare this with MTurk, where the cost of the task is predefined by the requester!! And it cannot even be changed by the requester after creation. With this kind of strategy we can mitigate the fairness problem.

It also motivates workers to work better, and they have more job security.
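To make the idea of an a priori payment concrete, here is a deliberately simple sketch in which the payment is the task's base reward scaled by the worker's historical accuracy. The quoted description only says that payment is a function of the task and the worker's quality; the linear scaling, the 0.5 default for new workers, and the function name are all assumptions for illustration, not the formula used by MobileWorks.

```python
def a_priori_payment(base_reward_usd, past_results):
    """Hypothetical rule: pay the base reward scaled by historical accuracy.

    past_results: list of booleans, True for each past answer judged correct.
    """
    if past_results:
        accuracy = sum(past_results) / len(past_results)
    else:
        accuracy = 0.5  # assumed default for workers with no history
    return round(base_reward_usd * accuracy, 4)
```

Because the amount depends only on information available before the task starts, the worker can see it up front and decide whether the task is worth the time.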

Workers can work when they want to work (it’s more compatible with the crowdsourcing vision)

Most people think about using a crowdsourcing platform when they cannot do their usual work, for example on the bus. MTurk should have some HITs that can be done by phone, with a flag marking them as such, and the requester should provide content for the supported version.

Increasing the number of categories

This is part of the future work in the paper. The word **category** is key here; it is something that we do not see on the MTurk platform. We could also categorize HITs in MTurk to give workers a better platform for finding suitable tasks. Or we could offer category-based search.



mClerk: mClerk is a mobile crowdsourcing tool like MobileWorks. It delivers image tasks to low-end mobile phones and adapts requests to local languages. The accuracy of digitization was 90.1%, and the cost ranges from 0.004 to 0.01 dollars per word. It supports a limited set of mobile platforms. The mClerk system is based on the ability to send small pictures by SMS, provided by Nokia and Ericsson, and the system was adapted around this. Image segmentation is done per word. They enable the closest transliteration of the word. The system charges the requester 1.1 per word, whereas similar systems charge between 0.5 and 1.25 INR. The system provides feedback to users.

Pros & Cons & New Idea

Some pros and cons from real users of mClerk

First, the pros:

1. User Satisfaction:

“This service is great sir! You don’t need to pay anything for service activation and you get currency for sending SMS. I anyways sent 50 messages to friends daily before this. I do messages all the time, between classes with friends, in bus, even with one hand while having dinner.” This is from a user who has digitized 1,717 words.

Another user:

“I have to wait 20 mins for bus. I stand and do at bus-stop. I have stopped going to the recharge shop now, I get enough.”

2. Language Facility:

Because many local languages do not have a font on cellphones, the system provides a transliteration of the closest word in the local language, assuming that many Indians have basic knowledge of English.

3. Time Flexibility and freedom:

People can submit their work by SMS even while commuting to work or on the bus.

4. Easy registration:

Workers register simply by giving a missed call to the researchers; they get a call back and, by providing their information, mobile operator, phone company, and referrer (the person who invited them to join), they are registered in the system.

5. Troubleshooting:

Missed-call-based troubleshooting

6. Competitive environment:

At the end of the day, they send out the names of the top 5 leaders and urge workers to compete and do better.

7. Enabling Micro tasking for users with no desktop computer or mobile internet

Cons:

1. Lack of Synchronous Feedback:

Feedback is essential to users in order to succeed in future work.

2. Inflexible Troubleshooting:

The missed-call-based troubleshooting might work well in a system of 1000 workers, but managing such a troubleshooting platform is not scalable as the number of workers grows.

3. Time Pass:

Spending productive time on messaging.

“We sit at back bench in class and message during lecture.” “Earlier we [friends] used to message good-morning, goodnight, Jokes etc. Now no one does that. Everyone is busy.”

4. Lack of trust:

People usually cannot trust a system as good and simple as this.

“Is it legal? What’s your profit? I don’t want any trouble.” “It is like some code sending. What do you do using this?”

5. Not extendable to all mobile phones:

The system just works with Nokia phones and one user said:

“System works in Nokia only, so I told my friends this system is by Nokia company to increase sales.”

6. Not good for all low-income workers:

The system needs lots of free time, so it is not good for taxi drivers or guards, but it is very good for shopkeepers, students, and people whose jobs allow them to have social interactions.

7. Highly dependent on Mobile Operators:

The system is highly affected by operator blackout days and message drop rates.

Nice target: developing countries (because we can find more workers for less money)

Compare this with MTurk, where even the signup procedure stops some valuable workers from participating.

It works only on specific phones, such as Nokia

The reason is that most people in developing countries have this device, but because we want to scale our platform even to developed countries like the USA, we need to be more scalable.

Flash Teams


Expert users are assigned to a sequence of modular tasks, each of which is computationally manageable. The users are experts in design and engineering, and Flash Teams can assemble elastically for different organizations. Could we accelerate the process of prototyping and user testing with crowdsourcing? The tasks done with Flash Teams include creating an animation, software design and prototyping, and developing an online course in one day. The authors believe that even temporary groups can coordinate complicated work effectively if they are formed into a team structure with clear task and responsibility assignments. Flash Teams is a framework for assembling and managing crowd expert teams dynamically. Flash teams are sequences of linked modular tasks: the output of one task is chained as the input of another. Teams can be combined; the focus is on teamwork and modular tasks. The goal is to accomplish a design in one day. Groups can shrink and expand based on the demands of the tasks. They use Foundry to enable users to create flash teams, and they recruit from oDesk. Foundry provides the team with shared awareness of the schedule and progress. It abstracts away low-level management effort to enable oversight and guidance.

oDesk has a very good UI; it shows skill tags along with the job description and the overall payment. Jobs are very neatly organized. The problem with these crowdsourcing tools is that they give the worker all the confidential information and data even before they apply for a job; there is no level of confidentiality or security assigned to users, so the requester's privacy is not preserved at all.

Flash teams leverage the scale of the paid crowd for expert work and also manage the work computationally.

Pros: Previous crowdsourcing platforms for experts focus on a single area of expertise, such as math. Foundry uses tools such as Gantt charts to structure the team and give a visual timeline, and flash teams overcome geographic dispersion. Foundry enables coordination by encoding the responsibilities of individuals and their interdependencies. The team structure helps even strangers work together effectively. The modular structure of the system is based on management modularity theory, so it is loosely coupled and components can be connected and interact through a standardized interface. A flash team has a lightweight, reproducible, and scalable team structure. Tasks are machine-understandable, so they can be managed and manipulated, allowing the team to grow, adapt, and combine into larger organizations. Teams consist of blocks, where a block is “one expert performing a task”. Blocks have an input and an output and can be connected to other blocks. Inputs and outputs are represented as “tags”, and the connection between them is tagged with the input-output combination: if a block's input is tagged A and its output is tagged B, the connection tag is A-B. The requester manages the team by chaining blocks. Each block on the timeline has an input, an output, a title, tags, a description, and one of the skills listed on oDesk. Foundry keeps a library of all previously completed components so they can be reused as drag-and-drop blocks. It requires continuous interaction from the requester throughout the work process. The teams are elastic: they can grow and shrink based on demand. The estimated time to finish the whole task is updated dynamically at run time. Intermediate tasks are pipelined rather than waiting for the entire task to complete. Most of the tasks are successfully completed in ONE DAY!
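The block-and-tag structure described above can be sketched in a few lines. Each block carries an input tag and an output tag, its connection tag is the combination A-B, and a chain is valid only when each block's output tag matches the next block's input tag. This is an illustrative model under those assumptions, not Foundry's actual data structure; the class, the function, and the example titles are hypothetical.

```python
class Block:
    """One expert performing a task, with tagged input and output."""

    def __init__(self, title, input_tag, output_tag):
        self.title = title
        self.input_tag = input_tag
        self.output_tag = output_tag

    def connection_tag(self):
        """Combine input and output tags, e.g. input 'A', output 'B' -> 'A-B'."""
        return f"{self.input_tag}-{self.output_tag}"

def chain(blocks):
    """Check that each block's output feeds the next block's input.

    Returns the connection tags in order; raises ValueError on a mismatch.
    """
    for prev, nxt in zip(blocks, blocks[1:]):
        if prev.output_tag != nxt.input_tag:
            raise ValueError(
                f"'{prev.title}' outputs {prev.output_tag!r} but "
                f"'{nxt.title}' expects {nxt.input_tag!r}"
            )
    return [block.connection_tag() for block in blocks]

# Made-up example team: idea -> mockup -> prototype.
example_team = [
    Block("Sketch UI", "idea", "mockup"),
    Block("Build prototype", "mockup", "prototype"),
]
```

Because the tags are plain data, a requester-facing tool could run a check like `chain` automatically whenever blocks are dragged onto the timeline, which is what makes the tasks "machine-understandable".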

Quick view on strengths

Task chaining: enables dependent tasks to be chained
Task pipelining: allows dependent tasks to start without waiting for the entire preceding task to complete
Elastic teams: teams shrink and grow based on demand
Dynamic management and time-lining: enables the team to hit milestones and goals and gives individuals the feel of a team structure
Nice chain task mechanism

Quick view on weaknesses

Expert tasks: it does not provide microtasking for all kinds of jobs
Non-unified components: it would be better if Foundry and oDesk were combined into one unified platform that is transparent to the worker and the recruiter
Experimental: it has only been tried in a research environment

Nice multilevel chain task mechanism

Introducing multilevel chained tasks can be very useful. As in the example above, we break down a huge task, like creating a prototype or an animation, and send each part as a micro-task to users.

Our platform should handle this kind of multilevel task. Compare it with MTurk, where tasks are mostly independent of each other; tasks in this structure are more dependent and need collaboration.

Self-manageable by the crowd

Foundry seems to work well for elastic expert team management. However, a standardized management platform that is itself manageable by the crowd could be very beneficial, and it would also give workers the feeling of a team structure.

Nice dynamic structure for tasks

In this system we can change the attributes and parameters of a task. We can even change the team size and members. That's a cool feature that should be considered in our next-generation design. (Compare it with MTurk's static HIT settings.)