WinterMilestone 3 stormsurfer ReputationIdea: Sample HITs for the Worker

From crowdresearch
Revision as of 09:36, 31 January 2016 by Shreygupta (Talk | contribs)

Describe (using diagrams, sketches, storyboards, text, or some combination) the idea in further detail.

Problem (Goals)

Did I do the task right? Partially right? Completely wrong? There is no way for me to know! During WinterMilestone 1 stormsurfer, even after two hours, all 20 of my submissions were still pending. I completed 20 HITs for the same task, and if my work was incorrect, I might receive $0.00 for all of it. If I had received feedback after my first 1-2 HITs, I could easily have improved and done the task better on the remaining ones. However, it is difficult for a requester to give immediate feedback to each and every worker; he/she may not be online to approve the HITs, or may be swamped with reviewing HITs from other workers.

The bottom line: workers don't know what a requester is looking for. Open-ended tasks will always leave some room for interpretation, and each requester has his/her own scale for determining what is an "accepted HIT" vs. a "rejected HIT." Likewise, it is important that requesters are consistent when "grading" HITs: the same quality of work should always be either accepted or rejected. The goal, therefore, is threefold: to give workers an idea of what quality of work requesters are looking for; to give workers an idea of what quality of work other workers are submitting (as stated during the meeting, Bayesian truth serum, or predicting others' responses, will improve quality of work, so workers should be able to predict others' responses more accurately); and to hold requesters accountable for their "grading scale" (i.e., ensuring that they are consistent among HITs).

Solution (Design)

Now, pretend that tasks are "assignments" or "tests," requesters are "teachers/professors" or "graders," and workers are "students." Teachers are grading thousands of different assignments completed by various students, but let's throw in the fact that the students have never met the teachers before, most students have not met the other students, and most teachers have not met the other teachers. As with the free-response sections on AP/IB exams and the essay section on the SAT/ACT, there is a need for standardization across all students and teachers. What qualifies as a 12/12 on the SAT/ACT essay, and what qualifies as a perfect score, or a sub-par score, on an AP or IB exam?

SAT Sample Response Winter Milestone 3 stormsurfer.png

AP Sample Response Winter Milestone 3 stormsurfer.png

MTurk Sample HITs Winter Milestone 3 stormsurfer.png

Milestone contributors

Slack usernames of all who helped create this wiki page submission: @shreygupta98