Milestone 2 Sky


We strongly encourage you to first look at the index table in our previous milestone, Milestone_1_sky. For better or worse, that milestone was already largely need finding, and most of its entries are observations from our own use of MTurk or experiences reported by professional Turkers. The previous milestone and this one complement each other.



Worker perspective

Worker Training

Workers need a specific starting point and a roadmap with specific milestones to become successful Turkers
  • Observations: In many forums and in the Hangout panel meetings, workers explicitly agreed that getting started is difficult. Most say that in their first weeks they earned very little money or ended up with too many rejections, and that the only things that got them to a professional level were repeated trial and error and asking questions in Turker communities.
  • Interpretation: Because the majority of workers remember these early difficulties, it seems beginners are confused about how to start. MTurk also does not offer a tutorial or any navigation training for beginners.
  • Need: Novice workers need a specific starting point and a roadmap showing the way to becoming a successful Turker.
  • Evidence :
    • In Getting Start, zingy says: "Starting out with mTurk is a great new experience and it can seem daunting at first". She also proposes useful milestones that can help beginners.
    • In Crowdshrimp milestone 1 this need is obvious from comparing Adaaam and Kristine. On one side, Kristine Hoang says "there was an instance where I thought I had found a $0.50 task (which was to tag home design photos), but when I found out it would take two hours to finish the task, I was no longer motivated to do it." On the other hand, Adaaam Lui says "My earnings were drastically increasing with each passing week, with the support of forum members and new friends I'd met through the turking community. By the fall, I was setting weekly goals of $2000 a week."
    • In Panel_1 David says “finding the work on Amazon is too hard to start on, requires a lot of time to find HITs and you are not aware of working with requesters who want to underpay, it takes a while, Amazon interface you can make 10 dollars the first week but be frustrated as hell.”
    • In Panel_1 Manish says: “Amazon has a difficult learning curve, finding the work is very difficult, in some cases you find underpaid hits, they hope to gain more and they just quit because of this.”
    • In Panel_1 Christie says: “we need training documentation, pertaining good workers, lose people that are lazy, prevent rejections for good people. We have to do it and care about it”
    • Christie also says: “New platform, drop-off rate is severe because of the lack of support, we need training documentation, pertaining good workers, lose people that are lazy, rejections for good people we have to do it and care about it.”


Workers need to do HITs based on guidelines from requesters and standards from Amazon
  • Observations: It seems workers ask for guidance in many forums on how to increase their approval rate.
  • Interpretation:
  • Need: Workers need guidelines for doing HITs when they start out, and standards that help them avoid scammers.
  • Evidence :
    • In this Forum [LINK] , DCI says: "comprehensive documentation, instructional videos and well trained support to help people in getting started."
    • Here is one possible consequence of the lack of guidelines: "Better support for returned HITs...find out why a HIT is being returned, e.g. if it's broken, have an option to add comments noting this so the requester is aware. Similarly, if the HIT violates the stated ToS or asks for more work than it's worth, that should be noted as well"

HIT finding

Workers need to navigate easily in the MTurk interface
  • Observations: Workers use an extra device or third-party tools to make navigation easier.
  • Interpretation: The page does not use its space efficiently, and the basic way results are displayed makes it difficult for workers to move between pages and HITs, so they end up continuously refreshing the page.
  • Need: Workers need better ways to keep track of their favorite requesters or HITs so they can do them again. This saves them a lot of time.
  • Evidence :
    • In the first meeting (see Meeting panel 1), Manish says: "lots of my friends using extra monitor to keep track of new HITs .."
    • Here LINK , DC says : "Frequently scan through every single page of mturk HITs, which display only 10 results per page"

Workers need to check their favorite HITs / Requesters
  • Observations: Workers are willing to do more of the HITs with which they have already had a good experience.
  • Interpretation: The lack of any mechanism or tool for saving a specific HIT or requester forces workers to use third-party tools, which are inefficient most of the time.
  • Need: Workers need better ways to keep track of their favorite requesters or HITs so they can do them again. This saves them a lot of time.
  • Evidence :
    • Here LINK, DCI says: "lots of others using third party tools like Chromes page monitor extension to constantly check specific pages for changes and alert the worker if any are found". He also says that using third-party tools is very time consuming.
    • Here LINK, DC says: "A star or checkbox could be added to HITs on the search page allowing people to mark and save their preferred work. This could then be used as a filter allowing workers to show only their preferred tasks that are currently available". He also says: "Functionality could be added to allow workers to subscribe to preferred requesters and receive notifications upon posting"
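
The two suggestions quoted above (starring HITs to filter the search page, and subscribing to requesters for notifications) can be sketched roughly as follows. This is only an illustration under our own assumptions: `WorkerPrefs`, `preferred_available`, `notifications`, and the dictionary fields `id` and `requester` are hypothetical names, not part of any MTurk interface.

```python
from dataclasses import dataclass, field

@dataclass
class WorkerPrefs:
    # Hypothetical per-worker preferences; MTurk offers nothing like this today.
    starred_hits: set = field(default_factory=set)
    subscribed_requesters: set = field(default_factory=set)

def preferred_available(prefs, available_hits):
    """Filter: keep only the currently available HITs the worker has starred."""
    return [h for h in available_hits if h["id"] in prefs.starred_hits]

def notifications(prefs, new_hits):
    """Subscription: newly posted HITs from requesters the worker follows."""
    return [h for h in new_hits if h["requester"] in prefs.subscribed_requesters]

# Made-up example data, for illustration only.
prefs = WorkerPrefs(starred_hits={"hit-1"}, subscribed_requesters={"req-B"})
available = [
    {"id": "hit-1", "requester": "req-A"},
    {"id": "hit-2", "requester": "req-B"},
    {"id": "hit-3", "requester": "req-C"},
]
starred_now = preferred_available(prefs, available)
alerts = notifications(prefs, available)
```

With preferences stored by the platform, workers would not need a page-monitor extension to poll specific pages for changes.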

Workers need to be informed of new HITs efficiently (news feed)
  • Observations: Workers complain about MTurk's only way of finding new HITs.
  • Interpretation: MTurk's static, inflexible page refresh forces workers to keep refreshing every second and to page through results 10 at a time. This way of showing new HITs is inefficient, and most matching HITs stay hidden from qualified workers.
  • Need: Workers need to save time when keeping track of new HITs.
  • Evidence :
    • Here LINK , DCI says: "Continuously refreshing and monitoring the search page while filtering results for the newest HITs and looking for any HITs of interest posted"

Workers need a more flexible keyword search engine
  • Observations: Workers have difficulties with the MTurk keyword search engine. We noticed this ourselves in the previous milestone: if you try it, you will see that the search engine does not support searching for multiple terms.
  • Interpretation: MTurk's very basic search engine effectively restricts workers to single-keyword searches. At the same time, because requesters often do not choose proper keywords, or because most workers search with the same common words like “survey”, a one-term keyword search does not help much.
  • Need: Workers need to find more appropriate HITs by describing the HIT they want with several keywords rather than just one, for example getting results for "Politics Survey" and not just "Survey".
  • Evidence :
    • Here LINK, DCI says: "Mturk's search engine does not allow for multiple search terms (term1 OR term2) like many search engines" or another place he says: "The search engine could be upgraded to allow for searching multiple terms"
    • We also reported this problem in our Milestone 1 Sky wiki (LINK): "For example I take a HIT with three keyword (Survey, CW, Approve) you get that HIT as result just when all of your words in your query exist in Hit's keyword. It means if your query has an extra words the result will be “Your search did not match any HITs.”!! So how can I search? It means even search based on keywords just works when you have search for one keywords and for queries that all of its words exist in keywords of HIT :(. CW is abbreviation of (Casting words) if you search "Casting words” you will get "Not found " although we have Casting words in description of the HIT and "Castingwords” in keywords."
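
The difference between the matching behavior we observed and the "term1 OR term2" behavior DCI asks for can be sketched as below. This is a minimal illustration under our own assumptions: the function names are hypothetical, and we do not know how MTurk's search is actually implemented, only how it appeared to behave in our test.

```python
def match_all(query, hit_keywords):
    """Observed behavior: a HIT is returned only if every query word is one of its keywords."""
    words = query.lower().split()
    keys = {k.lower() for k in hit_keywords}
    return bool(words) and all(w in keys for w in words)

def match_any_score(query, hit_keywords):
    """Requested behavior (term1 OR term2): number of distinct query words found among the keywords."""
    keys = {k.lower() for k in hit_keywords}
    return sum(1 for w in set(query.lower().split()) if w in keys)

# The HIT from our Milestone 1 example, with keywords (Survey, CW, Approve).
hit = ["Survey", "CW", "Approve"]

found_exact = match_all("Survey CW", hit)              # every word is a keyword
found_extra = match_all("Survey Politics", hit)        # one extra word, so no match
score_extra = match_any_score("Survey Politics", hit)  # OR matching still finds "survey"
score_cw = match_any_score("Casting words", hit)       # "cw" is stored, not "casting" or "words"
```

Ranking HITs by `match_any_score` instead of filtering with `match_all` would let an extra query word lower a HIT's rank rather than hide it entirely.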

Workers need to find the most effective keywords to use in their query
  • Observations: Workers spend too much time thinking about which keywords to use in their query, and they try lots of keywords. They often end up with “Result not found”. More general keywords like "survey" return tens of results, and navigating among them is very time consuming.
  • Interpretation: MTurk does not help workers choose effective keywords. It could offer real-time suggestions drawn from the keywords of existing HITs. This would save a lot of time for workers and even requesters, and it would reduce how often workers end up with “Result not found”, because the suggestions are based on existing HITs.
  • Need: Workers need to find the most effective keywords for their query, so that they end up with neither “not found” nor too many matched results.
  • Evidence :
    • Here LINK, DCI says: "Mturk's search engine does not allow for multiple search terms (term1 OR term2) like many search engines" or another place he says: "The search engine could be upgraded to allow for searching multiple terms"
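
The real-time suggestion idea from the interpretation above could look roughly like the following sketch. It assumes, hypothetically, that the platform can list the keywords of all currently available HITs; `build_index` and `suggest` are names we made up for illustration.

```python
from collections import Counter

def build_index(hits_keywords):
    """Count how many available HITs carry each keyword (case-insensitive)."""
    counts = Counter()
    for keywords in hits_keywords:
        counts.update({k.lower() for k in keywords})
    return counts

def suggest(prefix, index, limit=5):
    """Suggest existing keywords starting with the typed prefix, most common first."""
    prefix = prefix.lower()
    matches = [(k, n) for k, n in index.items() if k.startswith(prefix)]
    matches.sort(key=lambda kn: (-kn[1], kn[0]))
    return [k for k, _ in matches[:limit]]

# Made-up keyword lists, for illustration only.
index = build_index([
    ["Survey", "Politics"],
    ["survey", "psychology"],
    ["transcription", "audio"],
])
for_s = suggest("s", index)
for_p = suggest("p", index)
for_x = suggest("x", index)
```

Because every suggestion comes from a keyword attached to at least one available HIT, a worker who picks a suggestion cannot land on an empty result page for that term.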

Workers need to find related HITs based on their skills
  • Observations: Workers usually spend too much time navigating HITs to find ones related to their skills, sometimes more time than they spend doing HITs.
  • Interpretation: It is difficult for Turkers to find a specific HIT in a pool of HITs that are not categorized by topic or type.
  • Need: Workers need to find the HITs that best match their skills or previous experience.
  • Evidence :
    • Here LINK , newturk says "Well I just started doing this, like today, and I'd like to see a way to find hits that match my skills. I know I'm good at x,y and z but I see no way to find those hits. I spent an hour today looking through things and found only three where I felt like I had anything substantial, or even just valuable to contribute"
    • Here LINK , arkaneinic says: " Being able to separate hit's by different categories would be beneficial" and also says "Should be easy enough to at least use the same category implementation of on mturk."


Workers need to feel career justice, not unjust employment
  • Observations: Workers have often talked about unfairness compared to requesters. They think requesters have more power and more flexibility on MTurk than workers. The first incentive for all of us is the feeling that everything is fair.
  • Interpretation: There could be a couple of reasons for this observation:
    • There is a qualification process for workers, but requesters do not have to pass any qualification process
    • Workers' approval and rejection rates are used by requesters to select workers, but there is no requester history that would help workers avoid bad requesters
    • Requesters have mechanisms such as soft and hard blocks that can block even honest workers, but workers have no similar mechanisms
    • There is no mechanism to ensure fair payment according to the time spent on a HIT
  • Need: Workers need more flexibility in their activities; they need to feel career justice.
  • Evidence :
    • In LINK, Vredesbyrd says "I'd imagine the first step in creating a worker oriented crowd sourcing platform would be to find a way to ensure legitimate work can't be unfairly rejected by a requester."
    • In LINK, Lept says "The fact that Amazon itself has NOTHING to tell workers if the person they are working for is any good, but has piles and piles of statistics about workers for requesters to look at, bothers me to not end. Ideally it should be a two way street."
    • On the other hand, one requester had a good point of view. In LINK, kerek says "Workers have a lot more flexibility and commitment, we can work around any issues and learn tricks and spend time adapting. Requesters who have a frustrating first experience will just disappear. I would focus almost entirely on that side of the platform design."


Workers need to build their reputation more efficiently
  • Observations: In crowdsourcing environments it is difficult to build a reputation.
  • Interpretation: Most reputation systems lack strong support and can be subverted easily.
  • Need: A standard platform to build the worker reputation
  • Evidence :
    • With a set of sample standardized tasks used as basic templates, MTurk could build credible reputation scores, test scores, and work and payment histories. Here: "For example, one of the main innovations made by oDesk was that they logged a worker’s time spent on a task, enabling truthful hourly billing"

Workers need a "Bill of Rights" and need to be heard in the crowdsourcing system

  • Observations: Workers are invisible in AMT design
  • Interpretation: Individual workers lack the consolidation and solidarity needed to exert pressure on employers or Amazon in cases of injustice. Even in the US they are paid below the minimum wage in many cases, and technologists are not concerned about the human costs.
  • Need: Workers need a "Bill of Rights" based on ethical standards
  • Evidence: "I would also like workers to have more of a say around here, so that they can not easily be taken advantage of, and treated fairly, as they should be. Amazon seems to pay more credence to the requester, simply ignoring the fact that without workers , nothing would be done" Here.

Workers need trust: they should believe that somebody, or they themselves, is holding requesters accountable

  • Observations: Requesters can choose whether or not to pay the worker, which makes the system prone to wage theft.
  • Interpretation: Workers dissatisfied with a requester can contact the requester, but Amazon does not require the requester to answer. In many cases requesters do not answer, and workers have no option other than to leave the system.
  • Need: Workers need to be able to contact the requester about the cause of a rejection
  • Evidence: "You cannot spend time exchanging email. The time you spent looking at the email costs more than what you paid them. This has to function on an autopilot as an algorithmic system and integrated with your business processes" Here

Safety and Security

Workers need to be safe when they are working on HITs (possible malware in HITs)
  • Observations: There are many HITs that ask you to install a plugin, which may not be safe.
  • Interpretation:
    • Amazon does not check link safety
    • Amazon does not control its external links
    • Amazon does not have any rule, restriction, or punishment for such requesters
  • Need: Workers should not have to worry about their device safety or privacy
  • Evidence : Our Evidence in Milestone 1

Communication with Amazon for mediating rejections, blocks and Communication with requester

  • Observation: Amazon doesn't require requesters to answer workers
  • Interpretation: Workers dissatisfied with a requester can contact the requester, but Amazon does not require the requester to answer. In many cases requesters do not answer, and workers have no option other than to leave the system.
  • Need: A communication channel for asking the reason for rejections and questioning the requester
  • Evidence: In Panel_1 Christie says: “before you couldn’t email the requester, but overtime especially in these two years, putting up HITs in community people fix it themselves but now they make professional emails to project to requesters, and ask them why, and by the community we fight back, they shame requesters, and they also post their rejections to check if that is fair or not, they email the requester, and Turkopticon rating is a big and amazing thing as well.”

Requester perspective

Requesters need to be supported by guidelines to configure their HITs
  • Observations: It seems requesters ask for guidance in many forums to increase their approval rate.
  • Interpretation: The lack of clear documentation or tutorials that teach requesters how to create, configure, and manage requests forces them to spend too much time creating a HIT, which is sometimes frustrating.
  • Need: Requesters need comprehensive documentation, tutorials, or templates that decrease the initial time investment for creating HITs.
  • Evidence :
    • In this Forum LINK DCI says "Obviously not everyone has the proper skill set for creating their own tasks or the budget to hire people to do it for them. The more that is already done for people and the lower the initial time investment is to post work, the more requesters that will use the site".

Requesters need to do standardized tasks to test the worker, train them and ensure quality
  • Observations: Good performance is closely related to standardized instructions for the worker
  • Interpretation: Following Henry Ford, training the workforce well makes it easier for them to accomplish tasks and makes the training easier to replicate for new workers
  • Need: Standardized tasks to test the worker, train them and ensure quality
  • Evidence: By analogy, standardized measures always facilitate conversion and trading
    • For example, electricity producers are required to produce electricity adhering to some minimum standards before being able to connect to the grid and sell to other parties Here

True Reputation System for Workers
  • Observations: A market without reputation is a “market for lemons”: quality cannot be evaluated beforehand
  • Interpretation:Good workers leave the market because of low wages
  • Need: Reputation system that rates the workers
  • Evidence : “Repeatedly labeling a carefully chosen set of points is generally preferable, and we present a set of robust techniques that combine different notions of uncertainty to select data points for which quality should be improved. The bottom line: the results show clearly that when labeling is not perfect, selective acquisition of multiple labels is a strategy that data miners should have in their repertoire ”Here.

Analogous to labeling repeatedly, building the reputation over repeated tasks to ensure quality is highly valuable in ranking the workers.

    • Refer to Sky milestone 1

Result Filtering

Requesters need more advanced and efficient worker selection for HIT assignment, and more restrictions on rejection after assignment
  • Observations: Most requesters work with previous workers, but they have difficulty finding and adding their IDs or specifying them in the HIT description, which they say is frustrating.
  • Interpretation:
  • Need: Requesters need more flexibility before assignment and more restrictions after assignment, to decrease the rejection rate
  • Evidence :
    • In Panel_1 Gianluca says: “If I want a specific to be done with a worker and title the task with the worker ID and ask him/her to complete but it’s not possible and complicated.”
  • there are lots of evidence already in

Requesters want Task redesign for reliable results
  • Observations: Requesters, especially when surveying workers with questions that have no definitive answer, get uninformative responses and sometimes face malicious user behavior
  • Interpretation: Requesters conducted two experiments asking workers to rate Wikipedia pages on how well structured, well written, and factually comprehensive they were, and compared the results against expert ranks. The correlation was very low, suggesting that MTurk results for such tasks are not reliable and that users' answers are uninformative.
  • Need: Requesters need to redesign tasks into more verifiable tasks to get more reliable results
  • Evidence: Redesigning Experiment 1 (Exp1) into Experiment 2 (Exp2) raised the correlation between the true answer and workers' answers.


Exp1: a survey asking workers to rate Wikipedia pages using non-definitive questions. Exp2: the same survey under the same conditions, but asking verifiable questions, such as summarizing the text and adding 5 tags for each Wiki page, to make sure that the worker has read the page. [1]

[Courtesy: Aniket Kittur et al]

HIT management

Requesters need to trace their requests by real time status update
  • Observations: Requesters complain about the ambiguous way their HITs' status is presented. Some say they have to find their own ways to make sure HITs are done completely and carefully by a specific worker, so they have difficulty managing their HITs to get better results.
  • Interpretation: The few parameters that show a HIT's status are not enough for requesters to make decisions.
  • Need: Requesters need to trace their HITs
  • Evidence:
    • In [meeting panel 1] he says: "I do not have anyway to see whether my HITs are done or not"
    • In LINK, Vredesbyrd says "Obvious problem is that this makes the platform way less attractive to the requester, because if I had something that needed doing I certainly wouldn't want to use something that could override my decision on whether or not it's done."

Better Interface for Posting Task
  • Observations: The most recent change to MTurk was the introduction of a UI for submitting batch tasks
  • Interpretation: The interface is complex enough to need a full-time developer to deal with it
  • Need: Requesters need a simple interface to post and manage tasks
  • Evidence: Wide use of TurkIt (Here). "If every requester, in order to get good results, needs to: (a) build a quality assurance system from scratch, (b) ensure proper allocation of qualifications, (c) learn to break tasks properly into a workflow, (d) stratify workers according to quality, (e) [whatever else...], then the barrier is just too high. Only very serious requesters will devote the necessary time and effort."

Creative Survey Design Invoking Critical Thinking
  • Observations: Humans working like machines become machines
  • Interpretation: When running similar surveys that need gut responses, especially psychological ones, the researcher gets a more or less robot-like answer rather than a gut response.
  • Need: Designing surveys more creatively and effectively
  • Evidence: Sara Marshall, a Turker, says “You just get in a motion,” “And you’ll see particular questions and it’s like, if I see the same block of questions twice on the same day, I even know the pattern for my answers.” Here

Both perspectives

Workers and Requesters need to sign up more easily
  • Observation: People from other parts of the globe can’t sign up for the system, and even people from the US have had difficulties signing up.
  • Interpretation: This is mainly because of the payment system and the safety of both workers and requesters. The current system requires an SSN at the sign-up stage, which restricts the system to the U.S., so people from other countries have difficulty contributing.
  • Need: The system should handle money transfers independently, or at least be able to connect to a global transaction system.
  • Evidence: This happened to me and my teammates: when we wanted to sign up for AMT we faced this problem, and I have seen it trouble other people, like Andrea.

Workers and Requesters need a well-organized, efficient, purpose-built system in which to look up their issues
  • Observations: There are tens of forums and Turker communities around the web, each with many contributors who try to solve their issues by asking questions. That is a valuable resource, but unfortunately workers spend too much time solving their problems.
  • Interpretation: One reason could be that MTurk has no well-organized general structure for Turker community discussions that would make them efficient, for example by avoiding repeated questions or pointing to questions that have already been answered. Forums are also scattered all over the web, so workers sometimes have difficulty solving their problems quickly or receiving a proper answer in time. We sometimes see the same question in different forums, but there is no guarantee that people with that question visit both. Take Stack Overflow as an example of a better-organized approach.
  • Need: Workers and requesters need to hold discussions in an efficient and well-organized system
  • Evidence :
    • Here you can see the diversity and distribution of discussions all over the web, such as reddit, mturkgrind, ...

They need to make decisions using more accessible criteria
  • Observations: In many cases, in forums and meetings, we see that both workers and requesters have difficulties with scammers on the other side. Workers say some requesters reject work without giving a reason and that they need to know who those requesters are; requesters say they have difficulty distinguishing good workers from bad ones.
  • Interpretation:
    • One reason could be the lack of any history for either requesters or workers.
    • Another could be the lack of information in user profiles.
    • Another could be the lack of any rating or review for both requesters and workers that shows their average performance.
  • Need: Workers and requesters need more criteria for selecting their partners and making decisions: for workers, which HIT is worth spending time on; for requesters, which worker spent enough time on the HIT and gave a valuable answer.
  • Evidence :
    • here Rand4m says: "It would be useful to have a demographic page that could simply be incorporated into any task, so we don't have to be constantly filling in the same old demographic page each time: that wastes the time of both the requester -- to have to ask for it -- and the worker -- to have to do it"
    • We had other cases, but the problem is most obvious in this interview (Interview), where the Turker says:

"<oneohoneohfive> protection from spam/scammer workers, fairness and protection of users, etc

<oneohoneohfive> requesters get screwed, workers get screwed, and there's no recourse

<oneohoneohfive> it's the wild west

<karthik_> oh, what exactly do you mean by fairness and protection of users ?

<karthik_> you mean for both requesters and workers ?

<oneohoneohfive> requesters could reject legitimate work, or hard block a worker, etc

<oneohoneohfive> requesters can see their hits eaten up by scammers

<karthik_> and the requesters dont justify the rejections ?

<oneohoneohfive> they can if they feel like it "

MTurk is a business, so Workers and Requesters need customer service support and upgrades driven by their feedback
  • Observations: Workers and requesters both think that Amazon does not hear them and does not try to support them. It seems they think Amazon does not care about their feedback.
  • Interpretation: One reason could be Amazon's management strategy for MTurk, which probably does not have the same priority as Amazon's online shopping. It is interesting that there is a great deal of feedback from professional users going back to 2010, and so far we have not seen any upgrade from MTurk. This is open to discussion.
  • Need: MTurk users need customer service and upgrades
  • Evidence :
    • Here, in [meeting 2], Niloofar says "most recent frustration would be that Amazon doesn't talk to us about anything - rejections, blocks, suggestions, upgrades, etc."
    • ZSM, a Turker, replies with humor below some great feedback from another user: "Should we do one of those online petitions too? it would be so great to have that functionality"
    • Here, in the interview, the Turker says "there's an offsite feedback system workers use to rate them, otherwise amazon doesn't do shit"
    • In Panel_1 Christie says: “In the new platform, drop-off rate is severe because of the lack of support”

MTurk users need to focus on doing their jobs and not be distracted by hunting for third-party tools for efficiency
  • Observations: Workers and requesters all use some extra device or third-party tools for finding and doing their main jobs. Workers use third-party add-ons to find HITs, and requesters use add-ons and scripts to filter their results. Workers spend time contacting requesters to resolve payment issues, and requesters spend time explaining to workers why they were rejected.
  • Need: MTurk users need customer service and upgrades
  • Evidence :
    • here in LINK ladylilac says "I would say take mturk as the base and then add changes for improvement (for instance, all the add-ons that we use now would be built into the new platform already)"
    • Here LINK, LeftCoastLady talks about a guideline for using external tools: "Clearer and adhered-to guidelines related to support of external tools, especially those that require you download additional software."


Workers and Requesters need to establish some amount of trust
  • Observations: There are doubts about how trustworthy surveys on MTurk are.
  • Interpretation: Researchers don’t know how reliable the data they get from surveying Turkers is.
  • Need : Establishing some level of trust between requester and worker
  • Evidence: Sara Marshall says “I can’t comfortably separate my honesty from money, because at the end of the day, all you really have is your integrity,” she says. “And if you are willing to sell that for a dollar, it says a lot about the kind of person you are.”Here

Requesters and Workers should influence each other
  • Observation: It is hard to distinguish good requesters from bad requesters
  • Interpretation: This problem arises when there is no mechanism for rating requesters based on worker feedback. There will be requesters who refuse to pay enough for the quality of the work performed, and some will read results hastily and so may reject good results.
  • Need: Requesters and Workers should influence each other
  • Evidence: Mezza: I did one hit yesterday, and its still pending. In my opinion the pay is too low for the time required, the pay is also too slow to look past its low payment all of which is assuming you get paid at all because it is majority rules graded. Big thumbs down for me.

Users need a more capable UI for doing their tasks and getting better statistics reports
  • Observation: People have trouble using the UI to perform their tasks.
  • Interpretation: This is mainly because of the huge range of tasks that requesters may propose. There should be many categorized tools to help people perform tasks, considering that they use different front-end devices such as PCs, laptops, or touch-enabled phones. The lack of convenient tools for different platforms takes more energy from workers and leads to dissatisfaction given their low earnings.
  • Need: There should be many categorized tools to serve people in performing tasks
  • Evidence:
    • Andrea said that there aren’t many tools for statistics, visualizing data, etc. There aren’t many tools to make using the system easier: the GUI has few tools, no visualizations, and few templates, which makes posting HITs hard.