Milestone 1 supercowpowers

From crowdresearch


Experience the life of a Worker on Mechanical Turk


I made about 109 cents in an hour or two. Most of that came from two jobs for a company that wanted me to look things up on the WikiOrgChart -- 40 cents per HIT, with about two minutes of work each. Most of my time, though, was spent on HITs where I was supposed to count the number of people who entered and exited the right side of the frame in 30 seconds of security-camera footage, played twice (once to count entrances, once for exits). Incredibly boring, since the interface required me to sit through the whole clip even when I knew from the first pass that no one would exit in the 15 seconds remaining. It was doubly frustrating that no one seemed to have done even a cursory first pass with software: some of the videos were completely static, empty shots. I guess there are two sides to that -- if they had used software, I would have lost the opportunity to make those particular two cents.

Another employer had me do manual OCR for European grocery store receipts. That didn't pay too well, but it was quick and not as boring. The problem was that they ran out of HITs for me immediately after my first one. They paid me a bonus, though!

I got fairly distracted by the $1 goal we set for ourselves -- the gamification/instant-gratification aspects made me lose track of how long I sat at my laptop. I wonder what it would take for the novelty to wear off, and how I'd feel on, say, week five of trying this. I know I was inspired to keep coming back to the European receipt OCR'ers...

Experience the life of a Requester on Mechanical Turk


My task includes some semi-public company data that I would rather not explain/upload in a public/searchable forum.

In terms of the meta-experience of doing a task on Turk, this reminded me of the ways in which it's awesome and the ways in which it sucks. The awesome part is that this was all super-quick, cost me $5, and was easy to set up given that I've done similar stuff before.

The suckiness includes:

  • This would've been much harder to do if I were doing it for the first time. For example, I would've had to figure out how to use template variables for the spreadsheet data, which should really be something they explain front & center. And there are lots of other hurdles.
  • It is really common to want to bundle copies of the same task, and as far as I know there's no good way to do that. Really, the interface should let me upload an input spreadsheet with one comment per line and automatically bundle N comments per HIT for any choice of N. In the past, when I cared, I actually randomized which questions got bundled together, but it was a ridiculous JavaScript mess.
  • The output CSV is really ugly.
  • The output CSV takes real time to generate/download even though it's pretty tiny.
  • It's very hard to know what keywords to use. Amazon could help you out based on what people actually search for.
  • There is some UI/UX silliness. E.g., why do I have to view my HIT in preview (rather than HTML) mode before I can go on to the next step? And really the UX is kinda terrible in general.
  • No easy way to move tasks between Sandbox and the actual site.
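
On the bundling complaint above: absent platform support, the bundling (and the randomization) can be done offline before upload, turning a one-comment-per-line CSV into an N-columns-per-row CSV whose header matches the template's variable names. A minimal Python sketch -- the function name, file layout, and `comment_i` variable names are my own assumptions, not anything MTurk prescribes:

```python
import csv
import random

def bundle_comments(in_path, out_path, n, shuffle=True):
    """Chunk a one-comment-per-line CSV into rows of n columns,
    so an MTurk template can reference ${comment_1} ... ${comment_n}."""
    with open(in_path, newline="") as f:
        comments = [row[0] for row in csv.reader(f) if row]
    if shuffle:
        random.shuffle(comments)  # randomize which comments land in the same HIT
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([f"comment_{i + 1}" for i in range(n)])  # header row = template variables
        for start in range(0, len(comments), n):
            chunk = comments[start:start + n]
            chunk += [""] * (n - len(chunk))  # pad the final short row
            writer.writerow(chunk)
```

Each output row then becomes one HIT containing N comments, and shuffling up front sidesteps the in-HIT JavaScript randomization mess.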

Anyhow, I'm sure there are more things to complain about... Amazon clearly has not invested in making Turk nice to use. And yet, it works. :~)


My requester project was just a goofy side deal I have going on: I have been keeping a spreadsheet of the discs I get in the mail from Netflix, and their serial numbers, so I can peek at how big their warehouse is: Netflix Tank Counting. You can see my input and output CSVs here: Google Spreadsheet Link. I'd never requested a task before, but overall I think it went fine -- just frustrating to use sometimes. Definitely a bargain.

I wound up having to submit my 15 x 3 = 45 HIT batch twice. When I posted it around 12:30 Wednesday at 5 cents per HIT, I got nothing for eight hours. I reposted it around 10 AM, this time (a) offering ten cents per HIT and (b) removing the default restriction that only Master Turkers could work on the tasks. This changed everything, and the HIT results flooded in fast. It took maybe three hours before all the Netflix titles had been documented three times over. The accuracy was pretty good -- only a few obvious errors, mostly from workers not noticing that I wanted Blu-ray release dates for some titles. In fact, you can see that one Turker kinda ALL-CAPS called me out for saying Chris Pine was in Guardians of the Galaxy. (It's Chris Pratt. I was sleepy.)

It took me about as much time to format that input CSV and manage the Turkers as it would have taken to look up the info myself -- and I'll be manually copying the Turker results as well -- but if I were working with even an order of magnitude more titles, the platform would have been worth the effort. (A lot of my time was spent coming up with a description of each movie, so there'd be less ambiguity for the Turkers to deal with.)

I agree with Alya completely about how ugly the template designer is, and how fragile CSVs seem for this kind of thing. I wonder how much effort it takes to get good at using the programmatic API, and what kind of libraries there are for it. There's gotta be a way to do this without me having to click back and forth between their weird JavaScript frames. I now have $7 in my Amazon Payments account, and $6 of those are from 2006 or so, when MTurk first launched and I tried it for kicks. It has not meaningfully improved since then.

Explore alternative crowd-labor markets


I've hired on oDesk, when I needed help getting sbt-assembly configured, and I found it to be waaaaaay more personable than Mechanical Turk. Everything was negotiating with a real person: one candidate for the task tried to convince me to just switch to Maven. The worker I actually paid (a fixed amount of $500, of my employer's money) and I chatted a lot on Skype. Very different from the literally nameless and faceless interactions I had with the Turkers who filled in my Netflix spreadsheet. Probably important to note that $500 is a lot for the platform, so that might have made the requester/employer process smoother than the typical case.

However, when I tried finding work on oDesk around a year ago, just to pick up an odd gig or two, I got no traction at all. I sent out a bunch of applications, cut my rates, and never heard anything back. I was expecting the same on Mechanical Turk -- that the work would all be snapped up by existing platform veterans -- but I found Mechanical Turk really does let new folks just show up and find work, albeit much, much less lucrative work.




I like that MobileWorks worked -- it achieved high accuracy, and the workers themselves felt it was worth recommending. I like that it's focused on a particular niche that existing platforms were ignoring: people whose primary or exclusive network interface is a mobile device.

There's nothing I strongly dislike about the system, but I might think harder about:

 In our survey we found that participants were earning
 20 - 25 Indian Rupees an hour (USD 0.55) while doing
 their regular job. While using MobileWorks, participants
 were able to complete 120 tasks per hour. Therefore, on
 an hourly basis the workers should be paid approximately
 0.18 to 0.20 Indian Rupees per task to match their
 regular wages. We predict that the efficiency of the
 workers would increase as they become more competent at
 doing tasks on their mobile phone.

So the good news is that comparable OCR tasks fetch wages at that level on MTurk. But I wouldn't be so sure the wage could appreciably increase just because workers get more efficient over time. The whole appeal of MobileWorks is to greatly expand the number of people who can compete for these tasks. Without some simultaneous effort to greatly expand the number of employers who need OCR'ing performed -- and who don't mind anonymous folks from the Internet doing it (e.g., they wouldn't mind if their worst enemy read these documents) -- MobileWorks might just create downward pressure on wages from the increased labor supply, to a degree that dwarfs any individual worker's gains in speed.
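
For what it's worth, the quoted arithmetic roughly checks out: a 20-25 INR hourly wage spread over 120 tasks per hour comes to about 0.17-0.21 INR per task, bracketing the paper's 0.18-0.20 figure. A quick sketch (variable names are mine):

```python
# Per-task rate needed to match the quoted regular-job wage.
rupees_per_hour_low, rupees_per_hour_high = 20, 25  # survey wage range (INR/hour)
tasks_per_hour = 120                                # quoted MobileWorks throughput

per_task_low = rupees_per_hour_low / tasks_per_hour
per_task_high = rupees_per_hour_high / tasks_per_hour
print(round(per_task_low, 3), round(per_task_high, 3))  # 0.167 0.208
```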


  • What do you like about the system / what are its strengths?
  • What do you think can be improved about the system?


Firstly, I notice that this team subscribes to the same "rapidly iterate prototypes" spirit that guides our own project.

I like that so much of the study focuses on growing the platform organically -- I love Figure 2. I also found it interesting to read about the payment solution: paying off some or all of a worker's cell phone bill. Both are real extensions of our understanding of how phone-only crowdsourcing could work, above and beyond the MobileWorks paper. This team went further in actually implementing realistic recruitment and payment procedures. I find their speculation that market-competitive accuracy rates could be achieved plausible, especially if the per-word rate stays low.

Compared to MobileWorks and its smartphone software, I'm not sure how well this SMS-and-tiny-pictures-only approach could support other kinds of labor. I notice that in their discussion section they appeal to the same gamification spirit as a way to motivate work at even lower wages; that's exactly what I got caught up in for an hour or two on MTurk.

Flash Teams

  • What do you like about the system / what are its strengths?
  • What do you think can be improved about the system?


The main results -- the massive reduction in time and expense thanks to greater coordination through Foundry -- are great. I'd be interested to know how much time it took to recruit the teams of experts off oDesk. Would you incur that cost with each new project?