Winter Milestone 2 lightsaber



Learning from Needfinding

Participant Observation

Reasons

  • Find design opportunities
  • Build empathy
  • Process v. Practice
    • Apprentice
      • Partnerships
      • Observations
  • Complexity of culture

Deep Hanging Out (Fieldwork strategy)

  • What do people do
  • Understand the values and goals
  • Activities v. Ecology
  • Similarities v. Differences
  • Context, e.g., the time of day

Examples

  • truckers can’t use a device’s small screen → make it larger
  • apprenticeship
    • repairman
  • Walmart
    • What customers said v. What they actually did
    • Leading questions

Interviewing

Choosing Participants

  • target users
  • users of a similar system
  • non-users to broaden the potential market
  • Approximate if necessary → what type of user to choose depending on the situation
    • CS students (approximate target users) v. software engineers (target users)

The importance of being curious

  • Power v. Knowledge
  • Start with the people in the middle
  • People at the top are usually very careful about what they say, so you normally won’t get as much information out of them

What are good questions

  • Not leading
  • Open ended
  • Okay to have silence
  • What to avoid
    • like/love/want type of questions
    • frequency questions (people are sometimes not honest about how often they do things)
    • absolute scale
    • Y/N

Additional Needfinding Strategies

Diary Studies

  • Complete it at a specific time
  • Tailor the recording to the context
  • Scales better than direct observation
  • Easier tools lead to better results
  • Requires more practice, training, and reminders

Experience Sampling

  • Emotion
  • Useful to aggregate the results
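
The notes above are terse, so here is a minimal sketch of what experience sampling could look like in practice: prompt a participant at random times during the day, record an emotion rating at each ping, and aggregate the results afterwards. All parameters (waking-hour window, number of pings, rating scale, sample data) are hypothetical.

```python
# Sketch of an experience-sampling schedule and aggregation step.
import random
from statistics import mean

def daily_ping_times(n_pings=6, start_hour=9, end_hour=21, seed=None):
    """Return n_pings random prompt times (fractional hours) within the waking window."""
    rng = random.Random(seed)
    return sorted(rng.uniform(start_hour, end_hour) for _ in range(n_pings))

# Hypothetical responses collected over one day: (hour of ping, emotion rating 1-5)
responses = [(9.5, 4), (12.1, 2), (15.8, 3), (19.2, 5)]

print("ping schedule:", [round(t, 1) for t in daily_ping_times(seed=1)])
print("mean emotion rating:", mean(rating for _, rating in responses))
```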

Lead Users v. Extreme Users

  • Keep their needs in mind
  • How their answers can help you find opportunities and improve the product


Personas

  • Demographic info
  • Goals and why they use the product
  • Build empathy
  • Understand their emotions, point of view, values, etc.
  • Whether your new product will fit this user’s needs

Reading Others' Insights

Worker perspective: Being a Turker

Observations for Worker

Workers have various personal motivations, beliefs, intentions, behaviours and goals. From the scenarios described in the paper, here are a few things we observed:

  • Social desirability bias is at play, and faster pay is a significant motivating factor.
  • Turkers have little to no information about requesters, so their trust when taking on any task is fairly low.
  • Greater power of redress lies with the requester. Workers are frustrated that reputations are not handled well and that there is little communication with the other party.
  • Turkers often do an initial investigation of how much they are going to earn, and some are drawn to more attractive wages. The “is it worth it?” calculation takes up a lot of time and adds pressure (a rough sketch of this calculation follows the list).
  • Some key findings are that they treat their activities as work, where pay is the most important factor, and that they understand and orient to AMT as a labour marketplace.
  • Their biggest concerns are having enough information to make good decisions on selecting jobs, having good relationships with requesters, and how to act collectively.
  • The paper finds that the key function of Turker Nation is to help reduce this information deficit and promote better collective action; based on this, it suggests technology directions that should support these needs.
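
To make the “is it worth it?” calculation mentioned above concrete, here is a minimal sketch of the effective hourly rate a Turker might compute before accepting a HIT. The HIT titles, rewards, time estimates, and wage threshold are hypothetical, not data from the paper.

```python
# Rough "is it worth it?" check a Turker might run before accepting a HIT.
# All values (rewards, time estimates, wage threshold) are hypothetical.

def effective_hourly_rate(reward_usd: float, est_minutes: float) -> float:
    """Convert a HIT's reward and estimated completion time into an hourly wage."""
    return reward_usd / (est_minutes / 60.0)

hits = [
    {"title": "Tag 10 images", "reward": 0.05, "est_minutes": 2},
    {"title": "Transcribe a receipt", "reward": 0.25, "est_minutes": 6},
]

for hit in hits:
    rate = effective_hourly_rate(hit["reward"], hit["est_minutes"])
    decision = "accept" if rate >= 6.00 else "skip"  # personal wage threshold
    print(f"{hit['title']}: ${rate:.2f}/hr -> {decision}")
```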

Observations for Requester

Ideally, a requester wants cheaper, higher-quality, and faster worker performance, but in reality they get to choose only one:

[Image: the trade-off between cheap, high-quality, and fast work]

Worker perspective: Turkopticon

Ideally, Amazon would have changed its system design to include worker safeguards, but this has not happened. Instead, Turkopticon has become a piece of software that workers rely on as a critical tool to avoid falling into bad requesters’ traps. Turkopticon is a networking platform that workers use to review employers on AMT: a system that makes worker-employer relations visible and provokes ethical and political debate. Crowdsourcing terms such as human-as-a-service, remote person call, and labor-as-a-service raised questions about the ethics of human computation.

  • Workers have limited options for dissent. Dissatisfied workers on MTurk have little option other than to leave the system altogether, since it favours requesters the most.
  • Workers’ general concerns were that their work was regularly rejected unfairly or arbitrarily and that payment was slow. They also explicitly called for a minimum wage or a minimum payment per HIT.
  • Workers expressed dissatisfaction with employers’ and Amazon’s lack of response to their concerns.

With Turkopticon:

  • Workers explicitly helped requesters design for scale and created groups of people who saw their interests as aligned. Workers also learned new technical skills to cope with multiple HITs.
  • Workers provided and viewed reviews of a particular requester when choosing HITs.
  • Activities such as exploratory prototyping engage diverse stakeholders in identifying causes of common concern.
  • Turkopticon’s existence sustains and legitimizes AMT by helping safeguard its workers, and AMT relies on an ecosystem of third-party developers to provide functional enhancements. This is the ambivalence of success in activist technologies.

Turkopticon collects quantitative ratings from reviewers on four qualities that the authors hypothesized based on the Workers’ Bill of Rights survey (a rough aggregation sketch follows the list):

  • How responsive has the requester been to communications or concerns?
  • How well has the requester paid for the amount of time the work took?
  • How fair has the requester been in approving or rejecting work?
  • How promptly has the requester approved work?
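
As an illustration only (not Turkopticon’s actual implementation), here is a minimal sketch of how reviews on these four qualities could be aggregated into per-requester averages. The attribute names mirror the questions above; the review data is made up.

```python
# Hypothetical sketch of aggregating Turkopticon-style requester reviews.
from collections import defaultdict
from statistics import mean

ATTRIBUTES = ("communicativity", "generosity", "fairness", "promptness")

# Each review: (requester_id, {attribute: score on a 1-5 scale})
reviews = [
    ("requester_A", {"communicativity": 4, "generosity": 2, "fairness": 5, "promptness": 3}),
    ("requester_A", {"communicativity": 5, "generosity": 3, "fairness": 4, "promptness": 4}),
    ("requester_B", {"communicativity": 1, "generosity": 1, "fairness": 2, "promptness": 1}),
]

def aggregate(reviews):
    """Average each requester's scores per attribute across all of their reviews."""
    collected = defaultdict(lambda: defaultdict(list))
    for requester, scores in reviews:
        for attr in ATTRIBUTES:
            collected[requester][attr].append(scores[attr])
    return {
        requester: {attr: round(mean(vals), 2) for attr, vals in attrs.items()}
        for requester, attrs in collected.items()
    }

print(aggregate(reviews))
```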

Requester perspective: Crowdsourcing User Studies with Mechanical Turk

1) Common Issues

  • Plan user evaluations
    • Costs of acquiring participants, observers, and equipment
    • Time-consuming
  • Gather feedback from a small set of participants
    • Difficult to detect issues and errors
    • Lack of statistical reliability
    • Free-text responses are generally blank, uninformative, or copy-and-paste answers

2) Challenges for Requesters (for user measurements) on MTurk

  • The purpose of micro-tasks does not completely align with the goal of user measurements
  • Difficult to detect malicious users
  • Lack of information about users, such as demographics and expertise, and limited contact with them

3) Maintain the design of the tasks for user measurements

4) Ensure the quality of the results (a simple check is sketched below)

5) Mechanical Turk ratings provide only weak support for gauging expert ratings.
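
For the points about detecting malicious users and ensuring result quality, a common mitigation in this direction is to include objectively verifiable questions alongside the subjective ones and to discard submissions that fail them. A minimal sketch, with hypothetical field names, expected answers, and effort thresholds:

```python
# Sketch: screen MTurk-style survey responses using verifiable check questions.
# The field names, expected answer, and effort threshold are hypothetical.

def passes_checks(response: dict) -> bool:
    """Keep a response only if its verifiable answer is correct and its
    free-text answer shows minimal effort (not blank or one-word filler)."""
    verifiable_ok = response.get("article_word_count_bucket") == "500-1000"
    free_text_ok = len(response.get("improvement_suggestion", "").split()) >= 5
    return verifiable_ok and free_text_ok

responses = [
    {"article_word_count_bucket": "500-1000",
     "improvement_suggestion": "Add citations and expand the history section."},
    {"article_word_count_bucket": "100-500", "improvement_suggestion": "good"},
]

kept = [r for r in responses if passes_checks(r)]
print(f"kept {len(kept)} of {len(responses)} responses")
```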

Requester perspective: The Need for Standardization in Crowdsourcing

The Need for Standardization from the Requester’s perspective

  • With standardization in crowdsourcing, requesters come to trust that hired workers can complete tasks properly after training. It is also easier for requesters to replicate training when tasks follow a more standardized structure.
  • Many crowdsourcing tasks right now are relatively low-skilled, and requesters require workers to stick to the instructions of the particular, standardized task.

Current Status

  • At the moment there is not much disruption at the microtask level from the requester’s side in how workers are allocated tasks after they are hired, how reputation is managed on the platform, or how tasks are represented to workers. MTurk, for example, has been criticized for its lack of innovation and has not changed much since its launch.
  • With the increasing criticism of requesters, a new type of platform, called the Curated Garden, is thriving as an approach to fix this issue. Examples include uTest, MicroTask, CloudCrowd, and LiveOps. These platforms recruit and train workers for their standardized tasks and set prices for both sides of the market. This type of platform usually offers more democracy and is easier to experiment on; it provides a new way to approach the work pool and build a more scalable platform. Nevertheless, it still faces challenges because its standardized approach limits its access to both buyers and sellers.

Both perspectives: A Plea to Amazon: Fix Mechanical Turk

Observations from Requesters

  • Difficulties of scaling up
    • Difficult for requesters to reduce overhead, friction, transaction costs, and search costs
  • Lack of a better interface to post tasks
    • Decreases the quality of results because the interface confuses workers
    • Requesters are unable to properly break tasks into a workflow
  • Lack of a true and helpful reputation system for workers
    • Cannot distinguish good workers from bad ones
    • Not enough good-quality or standardized public qualification tests
    • No feature to track workers’ work history
    • No rating system for workers
    • Payment is tied to rating (rejection); this mechanism should be used only to prevent spammers, not to discourage real workers

Observations from Workers

  • Lack of a way to establish requesters’ trustworthiness
    • Workers sometimes don’t deliver high-quality results for new requesters because they are uncertain whether the new requesters are legitimate, which in turn leaves those requesters unhappy with the low-quality results
    • Workers are unable to find out how fast a requester usually pays
    • Workers are unable to find out about the requester’s rejection and appeal rate
    • Workers don’t know whether a requester has enough work and will come back to the market in the future, so they cannot decide whether to invest time in that requester’s tasks

Soylent: A Word Processor with a Crowd Inside

Features

  • Shortn shortens text to about 85% of its original length without changing its meaning
  • Crowdproof helps detect mistakes and suggests fixes
  • The Human Macro handles open-ended editing requests, such as improving citation formats or including appropriate figures

The major focused areas

  • Crowdsourcing systems
    • The Find-Fix-Verify pattern is a new design pattern for human computation algorithms; it aims to manage low-quality and overeager Turkers, identify these Turkers’ common problems, and present the findings in a clearer, more visual way.
  • AI for word processing
    • In this focus, the Human Macro helps users to automate similar editing tasks and reduce the time and cost for word processing.

Techniques for Programming Crowds

  • Add straightforward, quantitative questions to the micro-tasks, because Turkers show high variance in the amount of effort they put into a task and often produce errors.
  • The Find-Fix-Verify pattern helps split open-ended tasks into stages and gives requesters strong control over quality by having different sets of workers identify problems, fix them, and verify the fixes (see the sketch below).
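
As referenced above, here is a minimal sketch of the Find-Fix-Verify flow. The `post_task` callable is a hypothetical stand-in for posting a HIT and collecting worker answers; the agreement threshold and worker counts are illustrative, not the paper’s exact parameters.

```python
# Sketch of the Find-Fix-Verify stages described in the Soylent paper.
# post_task(prompt, n_workers) is a hypothetical callable that posts a HIT
# and returns a list of worker answers; plug in a real platform client.
from collections import Counter

def find_fix_verify(paragraph, post_task, n_find=10, n_fix=5, n_verify=5):
    # FIND: independent workers flag problem spans; keeping only spans that at
    # least 20% of workers agree on limits the impact of any single lazy or
    # overeager worker.
    spans = post_task(f"Identify a phrase that could be shortened:\n{paragraph}", n_find)
    agreed = [s for s, votes in Counter(spans).items() if votes >= 0.2 * n_find]

    patches = []
    for span in agreed:
        # FIX: a different set of workers proposes rewrites for each agreed span.
        candidates = post_task(f"Rewrite more concisely, keeping the meaning: '{span}'", n_fix)
        # VERIFY: yet another set of workers votes; the most-endorsed rewrite wins.
        votes = post_task(f"Pick the rewrite of '{span}' that best preserves meaning: {candidates}", n_verify)
        best_rewrite = Counter(votes).most_common(1)[0][0]
        patches.append((span, best_rewrite))
    return patches
```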

The ultimate goal

  • Improve the quality of workers’ output

Synthesize the Needs You Found

Summarizing the needs of workers

Workers need a better interface to use the platform effectively

  • Workers often get confused about what a task is about or about the requester’s information; it would help if they could browse and submit tasks through a better interface. Sometimes requesters do not properly break tasks into a workflow, which also confuses workers, so it is important to make the breakdown of a task clearer and easier to understand so that workers can perform better.
  • Therefore, workers need a better interface to use the platform in order to fully understand the tasks and deliver high-quality work more effectively.

Workers need a reasonable payment and feedback system for them to receive payments properly

  • The current platform connects payment to rating. Sometimes, even if a worker completes the task, s/he might still not receive payment because the requester decides the quality of the result does not meet expectations. The original purpose of this is to prevent spammers in the system; however, it actually discourages real workers from taking on more tasks because they worry they will not be paid after completing tasks in the future.
  • Therefore, workers need a better payment and feedback system so that they receive constructive feedback on how to improve, while remaining confident and staying active on the platform.

Workers need a more comprehensive reputation system to obtain more information on requesters

  • Workers are unable to find information about requesters, such as how fast a requester usually pays, his/her rejection and appeal rates, and whether the requester will bring more work to the market in the future. Without this type of information, it is difficult for workers to understand requesters and decide whether they want to complete tasks for them; most of the time, workers waste a lot of time researching requesters and how much they can expect to make.
  • Therefore, it’s important for the platform to provide a comprehensive reputation system that helps workers make better decisions when selecting tasks, maintain positive relationships with requesters, and act collectively.

Summarizing the needs of requesters

Requesters need to find a better way to scale up

  • According to A Plea to Amazon, many requesters find it difficult to reduce overhead, friction, transaction costs, and search costs when they look for workers to complete tasks. They have to spend a large amount of time creating tasks, searching for workers, reviewing the work, and fixing potential issues in the deliverables.
  • Therefore, it’s critical for requesters to increase their efficiency. One potential direction is Soylent, a crowd-powered word processor: its Find-Fix-Verify pattern can help requesters ensure the quality of task descriptions, review deliverables, and fix issues efficiently.

Requesters need a more comprehensive reputation system

  • According to A Plea to Amazon, new requesters often have a hard time getting high-quality deliverables because of the absence of a good reputation system. Workers often don’t deliver high-quality results for new requesters because they are uncertain whether the new requesters are legitimate.
  • Furthermore, requesters often find it challenging to distinguish good workers from bad ones because the platform fails to provide enough good-quality or standardized public qualification tests, does not let requesters track workers’ work history, and offers no rating system for workers.
  • Therefore, it’s critical for the platform to build a comprehensive reputation system so that requesters can not only provide more detailed information about themselves and attract more workers, but also find workers more easily and ensure the quality of the deliverables.

Research Engineering (Test flight)

We could set up Daemo and start working on the profile issue.

Milestone Contributors

  • @angelfu
  • @vijaym123