Winter Milestone 5

Due date (PST): 8:00 pm 14th Feb 2016 for submission, 12 pm 15th Feb 2016 for peer-evaluation

This week, we will accept proposals to pursue different aspects of the project, and start a design test run.

  • Youtube link of the meeting today: watch
  • Winter Meeting 5 slideshow: slides pdf
  • Youtube link of the task feed meeting 1 today: watch
  • Youtube link of the task feed meeting 2 today: watch
  • Youtube link of the task authoring meeting 1 today: watch
  • Youtube link of the task authoring meeting 2 today: watch

Research: Participate/watch additional meeting videos and submit a specific proposal (All)

Last week we asked you to pitch research ideas for the three themes - open gov, task ranking, task authoring. This week we ask you to choose at least one of the themes below (preferably the one you chose last week) and dive deep into it, with specifics. The goal is to propose a METHODS section for task authoring, and a SYSTEMS section for open gov and task ranking: one that we could execute in the coming weeks and submit to UIST or CSCW - see the template Winter Milestone 5 Templates. The submissions that are most viable and popular will be set as the direction for our research project, and the submitters may be asked to help us lead it!

a. Task authorship (write Methods section)

Watch the video recordings from the meeting, below:

  • Youtube link of the task authoring meeting 1 today: watch
  • Youtube link of the task authoring meeting 2 today: watch

The typical narrative is that workers produce work of highly varying quality. Nobody has studied the effect that requester quality has. Can we quantify the variance in requester quality, and introduce interventions to help improve it?

A specific idea raised in last week's submissions that gathered a lot of interest: could we run an experiment to demonstrate how much variance there is in requester quality for the same authoring task, vs. how much variation there is in worker quality?

[[File:Screen Shot 2016-02-01 at 8.34.50 PM.png|Task authorship]]

Mock abstract for task authorship, vision we're aiming for

*Requester Variation Determines Result Quality in Crowdsourcing*

The dominant narrative in crowdsourcing is that low-quality workers lead to low-quality work, and high-quality workers produce high-quality work. The result is that most techniques focus on identifying high-quality workers. We hypothesize that requesters vary significantly in their ability to create high-quality tasks, much as user interface designers vary significantly in their ability and training in UI design. We performed an experiment wherein 30 requesters each authored ten varied crowdsourcing tasks, and workers on Mechanical Turk were randomized to complete one requester's version of each task type. We found that while 20% of the variance in result quality is attributable to worker variation, fully 35% of the variance is attributable to requester variation. We introduce the concept of prototype tasks, which launch all new tasks to a small set of workers for feedback and revision, and find that it reduces requester variation in result quality to one-third of its baseline amount.
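One rough way the proposed variance attribution could be computed is sketched below. The column names and input file (worker_id, requester_id, quality in quality_scores.csv) are hypothetical assumptions for illustration, not part of the proposal:

```python
# Rough sketch: decompose result-quality variance into requester, worker, and
# residual components. Column names and the input file are hypothetical.
import pandas as pd

# Assumed long format: one row per submission, with a numeric quality score.
df = pd.read_csv("quality_scores.csv")  # columns: worker_id, requester_id, quality

grand_mean = df["quality"].mean()

# Crude additive model: quality ~ grand mean + requester effect + worker effect
requester_effect = df.groupby("requester_id")["quality"].transform("mean") - grand_mean
worker_effect = df.groupby("worker_id")["quality"].transform("mean") - grand_mean
residual = df["quality"] - grand_mean - requester_effect - worker_effect

total_var = df["quality"].var()
for source, component in [("requester", requester_effect),
                          ("worker", worker_effect),
                          ("residual", residual)]:
    print(f"{source:>9}: {component.var() / total_var:.1%} of quality variance")
```

In practice, a mixed-effects model with crossed random effects for workers and requesters would give better-calibrated variance components, especially with unbalanced data; the sketch above is only meant to show the shape of the analysis.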

b. Task ranking (write Systems section)

Watch the video recordings from the meetings below:

  • Youtube link of the task feed meeting 1 today: watch
  • Youtube link of the task feed meeting 2 today: watch

Requesters don't get ideal workers, and workers don't get ideal requesters - can we rank the relevant tasks, on the basis of reputation, skills, and other necessary aspects?

A specific idea raised in last week's submissions that gathered a lot of interest: could we design a task feed ranking interface and algorithm for Daemo? A combination of user-centered work and machine learning/data mining?

Mock abstract for task ranking/feed, vision we're aiming for

*Boomerang: Incentivizing Information Disclosure in Paid Crowdsourcing Platforms*

There is a massive amount of information necessary for a healthy crowdsourcing marketplace — for example accurate reputation ratings, skill tags on tasks, and hourly wage estimates for tasks — that is privately held by individuals but rarely shared. We introduce Boomerang, an interactive task feed for a crowdsourcing marketplace that incentivizes accurate sharing of this information by making the information directly impact the sharer's own future tasks and workers. Requesters' ratings of workers, and their skill classifications of tasks, are used to give early access to workers whom that requester rates highly and who are experts in that skill, so giving a high rating to a mediocre worker dooms the requester to more mediocre work from that worker. Workers' ratings of requesters are used to rank their highly rated requesters at the top of the task feed, and their estimates of active work time are used to estimate their hourly wage on other tasks on the platform.
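As a very small sketch of what the worker-facing half of this could look like in code (the Task fields, the 1-to-5 rating scale, and the scoring rule are illustrative assumptions, not Boomerang's actual design):

```python
# Illustrative sketch of a worker's task feed ranked by that worker's own
# past ratings of requesters, with estimated hourly wage as a tie-breaker.
# Field names and the scoring rule are assumptions for illustration only.
from dataclasses import dataclass

@dataclass
class Task:
    task_id: str
    requester_id: str
    price: float            # payment per task, in dollars
    est_minutes: float      # estimated active work time from worker reports

def rank_feed(tasks, my_requester_ratings, default_rating=3.0):
    """Order a worker's feed: requesters this worker rated highly float to
    the top; within a rating tier, higher estimated hourly wage wins."""
    def score(task):
        rating = my_requester_ratings.get(task.requester_id, default_rating)
        hourly_wage = task.price / (task.est_minutes / 60.0)
        return (rating, hourly_wage)
    return sorted(tasks, key=score, reverse=True)

# Example: the worker rated requester "r2" a 5 and "r1" a 2.
feed = rank_feed(
    [Task("t1", "r1", 0.50, 5), Task("t2", "r2", 0.40, 4), Task("t3", "r3", 1.00, 20)],
    my_requester_ratings={"r1": 2.0, "r2": 5.0},
)
print([t.task_id for t in feed])   # -> ['t2', 't3', 't1']
```

The requester-facing half (early access for highly rated, skilled workers) could mirror this: rank workers by the requester's own ratings and skill tags before opening the task to everyone else.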

c. Open gov (write Systems section)

In current systems, worker and requester voices go unheard, and the platform is run by a central organization that holds all control. Can we infuse the idea of open governance into Daemo?

A specific idea raised in last week's submissions that gathered a lot of interest: could workers form guilds for specific expertise areas, then run their own reputation and ranking operations, much as doctors and lawyers administer their own licensing exams? See: Open gov

Deliverable

We've synthesized some of the most popular ideas for each area. Grab at least one area (task authoring, task ranking, open gov) and develop it further into a concrete research proposal using the template - Winter Milestone 5 Templates! Here are our recommendations:

  • For task feed: write a systems section.
  • For open gov: write a systems section.
  • For task authorship: write a methods section.

A Systems section describes what the system would look like, where the novel contribution is a software system or platform that solves the problem. A Methods section describes how a specific phenomenon can be studied to understand the behavior that solves the problem, and what kinds of studies need to be run.


Check out this really helpful paper. See the template for each section here to write your submission - Winter Milestone 5 Templates!

Research Engineering (All)

This week's main issues: #77 (this is a pretty big one) and #660

Announce in #research-engineering which issue you are working on, and please keep others posted on your progress (so that we don't do duplicate work). You are encouraged to work together.

For any questions, ping @aginzberg, @dmorina, and @shirish.goyal on Slack in #research-engineering.

Design (Test Flight)

DRI: @karolina and @michaelbernstein

Great work last week; it's time to iterate on it. Watch the weekly meeting for the lengthier discussion, and refine the messaging system into a more complete mockup and storyboard: something we can go implement, rather than a very high-level idea. You can look at this week's slides to see some of the top ideas that were pitched last week.

Use Balsamiq or any tool of your choice, and share your unique mockup/wireframe of the messaging system on Daemo.

Submission

Create a Wiki Page for your Team's Submission

Create a wiki page with either a Methods or a Systems section, diving deep into one of the three themes (look at the template here: Winter Milestone 5 Templates). If you're participating in the design test run, create another wiki page and paste your screenshots or mockup/wireframe files. If you have never created a wiki page before, please see this or watch this.

[Team Representative] Submit or post the links to your ideas by 8:00 pm, 14th Feb 2016

We have a [Reddit-like service] on which you can post links to the wiki pages for your submissions, explore them, and upvote them.

Please use the same login method (Facebook, Twitter, or email address) as you've used in the past with Meteor. This will help us identify and track your contributions better.

For newcomers joining Crowd Research: when it asks you to pick your username, pick the same username as your Slack username. Please DO NOT forget to mention the milestone contributors' Slack IDs below each wiki page.

On Meteor, there are four submission categories: three for the research themes and one for the design test run.

1- [One of three mandatory] http://crowdresearch.meteor.com/category/task-rank where you can post a link to the wiki page for your task ranking proposal

2- [One of three mandatory] http://crowdresearch.meteor.com/category/task-author where you can post a link to the wiki page for your task author proposal

3- [One of three mandatory] http://crowdresearch.meteor.com/category/open-gov where you can post a link to the wiki page for your open gov proposal

4- [Test flight] http://crowdresearch.meteor.com/category/design-messenger where you can post mockup/wireframe of the Daemo messenger.

Give your posts titles which summarize your idea. Viewers should be able to get the main point by skimming the title ("Automatic Suggestion for Tasks based on Average Completion Time" is a good title. "YourTeam TrustIdea 1" is a bad title).

Please submit your finished ideas by 8:00 pm, 14th Feb 2016, and DO NOT vote/comment until then.

[Design Test Flight]

For your Messenger Design wiki submissions, please create a Google Slides deck showing the flow of your designs and link it in your wiki page. For example, see this slide show.

When you create the "Share" link, make sure to click "Anyone with the link can view".

[Everyone] Peer-evaluation (upvote the ones you like, comment on them) from 8:05 pm 14th Feb until 12 pm 15th Feb 2016

After the submission phase, you are welcome to browse through, upvote, and comment on others' ideas. We especially encourage you to look at and comment on ideas that haven't yet gotten feedback, to make sure everybody's ideas get feedback. You can use http://crowdresearch.meteor.com/needcomments to find ideas that haven't yet gotten feedback, and http://crowdresearch.meteor.com/needclicks to find ideas that haven't yet been viewed many times.

COMMENT BEST PRACTICES: Everybody on the team reviews at least 3 ideas, each supported by a comment. The comment has to justify your reason for the upvote. It should be constructive and should mention a positive aspect of the idea worth sharing. Negative comments are discouraged; instead, phrase your comment as a suggestion - if you disliked an idea, try to suggest improvements (do not criticize an idea; no idea is bad, and every idea has room for improvement).