Milestone 1 TripleClicks


Welcome to the page for TripleClicks' Milestone 1 submission. Please enjoy your stay.

Experience the life of a Worker on Mechanical Turk

Turk #1

My experience as a worker on Mechanical Turk was challenging. I had previously explored MTurk to see what crowd working was like (purely out of curiosity), but this was the first time in a while that I had been back on the platform. Getting signed back into MTurk wasn't too difficult; I just had to use my Amazon.com login. What struck me, however, was that in 5+ years the platform hadn't seen any significant change.

Task Search: The process of searching for tasks was confusing, as many task titles included tags or identification numbers. Most were labeled according to context only the requester had (e.g. numerical values, jargon); very few had easy-to-scan descriptions or titles. The extremely small text size also made it difficult to click on links without considerable precision or mental effort. The metadata (HIT expiration date, time allotted, reward, HITs available) was hard to scan and, for a beginner worker, hard to navigate and prioritize (trying to find the HITs that were lucrative, interesting, could be completed quickly, etc.). I didn't know what the "Request Qualification" link meant or did until I accidentally clicked it while trying to click "Why?" and sent a request notification to a Requester (oops).

Task Selection: Viewing a HIT before accepting it was nice because it let me preview the types of HITs to expect, and I found myself using it to decide whether I wanted to work for a given Requester. Eventually, I settled on tasks that asked me to transcribe information from a photo (mostly receipts). I didn't like tasks that asked me to look up a person's contact information (that felt invasive and too much like a sales/lead-generation task, which takes considerable time), and I didn't like tasks that asked me to find a URL for a business (because I suspected a portion of the businesses might not even be online).

Task Completion: The interface for entering information was fairly straightforward, but in some instances it was hard to know whether I should do the task as-is or interpret what the requester wanted. For instance, I was transcribing a receipt. Grocery stores truncate item names to the point that they don't make sense, for example ORG GRN LETT, which I (as a human) know to be Organic Green Lettuce. Not knowing what standards the Requester had for input, I knew that typing "ORG GRN LETT" would be doing exactly what was asked, but also that there would be more value in typing the full item name. This, to me, is an example of incomplete task details and, more broadly, of the unclear standards or expectations a worker has to deal with. The odd thing is that a worker can be blocked or removed from future HITs for a minor mistake; it seems like a one-way street that workers have no way to evaluate the clarity of instructions.

Payment: After about 2 hours of submitting tasks and waiting for approval, I managed to earn my first $1.00 doing a series of HITs valued at about $0.05 each. I had fun for the first 30-45 minutes, happily transcribing receipts, but soon grew tired of constantly checking and rechecking my work for accuracy. Having to wait for HITs to be approved and for the payment to transfer to my account left me disheartened.

Overall: There is much to be improved upon in Mechanical Turk. From poor interface, search, navigation, and information design to compensation and worker-satisfaction concerns, there is much to dislike and very little to like about the platform.


Turk #2

Once I was approved for work, I looked through HITs to see what I could do to quickly make $1. Most HITs were of extremely low value, or of seemingly ill intent, such as a $1 HIT that promised it was the "easiest" around but also required you to install a media player on your computer. Since this seemed like an ideal way to get malware, I skipped it and chose a $0.10/submission HIT.

After working for roughly 15 minutes and possibly earning $1.10 (depending on whether my submissions are approved), I couldn't help but be struck by how dead this made me feel inside. As a freelancer, I'm not unfamiliar with sitting alone and staring at my screen for long stretches of time. But as a crowdworker, labeling collections of images and acting as a supplement to an algorithm, I felt like I was being micromanaged by a computer. A number of my HITs couldn't be submitted immediately because my "accuracy was too low." I was able to bring my accuracy up and then submit a couple of HITs, but the rest, where I failed to achieve the required accuracy, I simply skipped, because the time spent trying to correct mistakes wasn't worth the $0.10 per HIT.

Of course, as I completed more HITs I began to get a better feel for what was expected (again, assuming my HITs are approved) and felt a certain sense of accomplishment in being able to breeze through my submissions. But, again, having to adhere to an accuracy metric that wasn't known to me made me feel the work was all too tenuous and easily rejected by the requester.

Ultimately, I feel for whoever is trying to make ends meet as a crowdworker. There is little incentive to do quality work. It would seem that quick work is ideal, while quality work means never earning enough money. Perhaps the Masters Qualification is something to strive for and thus a reason to do a better job; such information, however, isn't made obvious to a new Turk like me.

Likes: The initial feel of working through tasks that you know only a human can do.

Dislikes: Knowing that one day a computer will be able to do these tasks; the interface, which looks like it hasn't been changed in ten years; the lack of clarity about what is required for certain HITs; the pay.

Experience the life of a Requester on Mechanical Turk

Requester #1

The experience of being a Requester on Mechanical Turk had its ups and downs. Sign-up wasn't too difficult to accomplish since, again, it used my Amazon.com credentials. However, I did find that navigating between the worker and requester portals was a bit confusing, and I inevitably got trapped in some sort of middle-ground screen linking the two. In retrospect, this is probably because very few people who are requesters are also workers and vice versa. However, it doesn't excuse what I deem to be a poor wayfinding and navigation scheme.

The Bad: Getting set up with Amazon Payments to pre-pay for HITs wasn't difficult, but again, that was likely because I already had an Amazon account. Admittedly, I never knew how much money to pre-load into my account, and I felt at times that I'd be losing nickels and dimes somewhere in the process. Because I didn't come to the Requester site with a task need in mind, it took me a while to think one up. After trying three different categories or templates/types of tasks (survey, classification, transcription from an image), I threw my hands up in frustration. Not only was it difficult to use the interface to set up simple form fields for surveys or to load a CSV with image links, I was left with a sense of urgency in having to break a task down to its smallest, most granular form. How does one deconstruct and reconstruct a task without losing details or context in the process? How do I make my instructions clear and thorough? Eventually, I went with a very simple task: tell me about your most positive or favorite childhood memory in 50-100 words. It still took a considerable amount of time to publish (making copy edits, loading funds into my account, setting qualification parameters), but I was able to breathe a sigh of relief once I saw the results start arriving. However, upon comparing the amount of time it took against how much the workers were being paid at an estimated hourly rate, I felt pretty bad about requesting a task altogether.

The Good: Outside of the ease in transferring value (pre-payment for the HITs), the positive experience I took away from being a Requester was seeing the quick completion of my task. The approval process was a little clunky because of interface issues on the client side, but it was nice to see the results and know that they came from awesome people.

Overall: There is much to be improved upon in Mechanical Turk from the Requester side. From poor flows and outdated interface design to disorganized search and navigation, a lot needs to change to help requesters create clearer and fairer tasks.

File:Batch 1843548 batch results.csv


Requester #2

Compared to the UI that Turks have to deal with, the Requester UI is slightly better. But only slightly. It's as if Amazon cares more about a requester's experience than that of a Turk. But only slightly. The way one is supposed to create a project isn't obvious to someone unfamiliar with the system. The fact that the "Get Started" button on the first page changes to a "Create a Sentiment Project" button is a bit confusing for a person, like me, who will often click something without reading all the text because he's in a hurry. Luckily, creating a sentiment project worked well for a project I'm currently working on, so I decided to give it a test run to see how people feel about certain self-help/lifestyle terms.

Creating the project had its ups and downs for me. Mostly downs. It might seem strange that a person can go through much of his life without becoming intimately familiar with spreadsheets, but that's me. And so the need to upload a .csv file gave me pause. I had assumed that I could just enter text into the browser and use that for my project, but I understand the benefit of uploading a .csv. Of course, it wasn't hard to upload once I had created one. As for the text I could enter, I wasn't certain that the Turks would answer the question the way I hoped they would, based on the default wording for sentiment questions. Unfortunately, there was no way to change the wording, so I just hoped for the best and tried to make what I could enter as understandable as possible.
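For anyone else unfamiliar with spreadsheets, here is a minimal sketch (not part of the original submission) of how an input .csv for a batch like this could be generated with Python's standard csv module. The filename, the "text" column header, and the example terms are illustrative assumptions; as I understand the batch model, each column header becomes a ${header}-style variable in the project template.

```python
# Minimal sketch: write the input .csv for a batch of sentiment HITs.
# The header name "text" and the terms below are illustrative assumptions,
# not taken from the actual project described above.
import csv

terms = ["mindfulness", "self-care", "gratitude journaling", "life coaching"]

with open("sentiment_batch.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["text"])                    # header row: becomes ${text} in the template
    writer.writerows([term] for term in terms)   # one row per term
```

As far as I can tell, each row in the file then becomes one HIT in the published batch (multiplied by the number of assignments requested).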

Funding the project was actually the biggest headache. It opened a new tab in which I could do the funding, but once the money was ready, I couldn't access my project from that new tab. It didn't matter where I clicked; it seemed as though my project had been deleted! Thankfully, when I closed the tab and reloaded the original one, my funds and project were in order and I was able to publish my project. What was nice, though, was Amazon telling me exactly how much I needed to fund the account, rather than having me deposit a lump sum that could end up sitting in the account until I used it up.

From there on, the experience was smooth. My HITs were completed and the money was paid to the Turks without my having to approve anything (which was a bit of a surprise, given that I still haven't received my money from the HITs I completed as a Turk), though it was a bit unnerving how impersonal the whole process is. Of course, there were no names for the Turks, just a string of characters. However, there is the option to give bonuses to Turks, which I would probably do for more involved work.

Likes: Knowing how much to fund my account in order to pay for the HITs; the quickness with which the HITs were completed; the ability to give bonuses to Turks.

Dislikes: The interface; the HIT creation process; the process of funding my account; the lack of connection with whoever is submitting the HITs.

File:Batch 1843164 batch results.csv

Explore alternative crowd-labor markets

TaskRabbit

I've previously used TaskRabbit for research and personal-assistance tasks, and found earlier versions of the site (2012-2013) to offer a much more positive experience than the current design (2014-2015) for similar tasks.

TaskRabbit (2012-2013, older): TaskRabbit from the perspective of a task requester was fairly straightforward to use. It allowed requesters to quickly structure and publish their tasks and either pick their own Rabbit (a task worker) via search or pick one from the Rabbits bidding on the task. For very specific research or personal-assistance tasks, I often opted to select a specific Rabbit; for more general household tasks, I preferred to accept bids. Hourly rates and estimates were fair and accurate. The care taken to protect identity and contact information before a requester and Rabbit were paired was one of the great positives of the site experience. I always found Rabbits to be accurate reflections of their ratings and opted for people who had a high number of completed tasks and high scores. Quick or proactive communication was usually a prime indicator of a Rabbit's ability to get the task done, and I appreciated the over-communication. Payment and tipping were easy through a linked credit card, and I was always happy to leave a positive review for a task well done.

TaskRabbit (2014-2015, current): TaskRabbit has since moved to a model where a requester enters their task needs and the system matches them to the "best" possible Rabbit. I don't enjoy this because in a lot of ways it feels like a loss of control, choice, and autonomy. My fear is that the system isn't as transparent as it could be, and it makes me uncomfortable. There were also times when I had time-sensitive tasks and Rabbits were either not responding (because they were marked as "online" but in fact were not) or would accept the task only to realize that they weren't qualified or able to fulfill the requirements. This would often lead to long, drawn-out quests to find a fitting Rabbit, because I'd either end up with a Rabbit who was not able to do the task (didn't have the right skills, but tried anyway) or not able to communicate actively, or I'd cancel the task altogether. Ratings on Rabbits became harder to find and oftentimes felt inflated, presenting a 4+ star Rabbit as great but leaving me with one who ended up not doing anything, not communicating for days after being assigned, and eventually requesting payment for a task that was never completed. Overall, a very negative change from the origins of TaskRabbit.

Current TaskRabbit vs. Mechanical Turk: In comparison to Mechanical Turk, there is more visibility into who a worker is on TaskRabbit and what their track record is like. It humanizes the worker. TR also takes into account the specific skills or interests a worker has, and allows them to complete tasks that are personally aligned. Rates or wages are also perceivably higher, as TR workers usually target rates above $20 per hour, depending on their geographic location and the type of task involved. On that same note, TR tasks tend to require more effort, time, and in some cases physical ability (e.g. lawn work, home repairs, moving house). In terms of ease of use, TR is easier, a reflection of its more consumer-oriented focus and more recent interface and flow styling.

Overall: Mechanical Turk does much better at breaking large information tasks down into smaller, affordable tasks (on the part of the requester) and at automating data/information needs. TaskRabbit, in contrast, works best for in-person tasks that require more depth or communication. Both, however, would benefit from changes in how tasks are published and articulated, how tasks are allocated, how much and what type of information is provided about workers' interests, skills, and abilities, the accuracy of ratings, and the fairness of wages.


oDesk/Freelancer.com

As a freelance designer and illustrator, I am always looking for ways to get new clients. So years ago, when a site much like oDesk, Freelancer.com, was brought to my attention, I immediately signed up. I could barely imagine having a more underwhelming experience. While I like the idea of freelance work being available to anyone who needs it, the fact that these people are only willing to pay far below market rates for a job left me with a bad impression. The vast majority of postings seemed to be from people hoping to get a website for under $50 (this was before Squarespace and similar sites) or a detailed portrait of their friend dressed up like a vampire for $8. It didn't help that there were people willing to fill those requests, which would have earned me below minimum wage for the amount of time I would have had to put into the jobs. So I stopped responding to "job offers" and deleted my account. It was a waste of time trying to compete with people with worse portfolios (by my own judgment, of course) who were seemingly able to survive on dollars a day.

From the looks of it, oDesk is pretty much the same kind of site as Freelancer. But while I have my personal issues with such sites, comparing them to MTurk is another story.

oDesk vs. Mechanical Turk: In comparison to MTurk, oDesk makes the worker a much greater part of the process. Of course, that's because there must actually be some correspondence between the requester and the worker, known on oDesk by the slightly less faceless term of freelancer. Having the freelancer represented by a face and name, rather than a character string, is perhaps the single biggest superficial difference. Then there is the other major difference: these tasks can't be broken up into micro-portions. The nature of the work, while often low-paid, is more involved than anything available on MTurk, since the work is also more specialized and requires a certain amount of expertise.

Overall: MTurk is more democratic in terms of who can take part: if you're a human, you can be a Mechanical Turk. oDesk requires more skill and contact. Both, however, suffer from job posters being able to post low-paying jobs and from workers driving down wages by taking those jobs on.


Galaxy Zoo

Galaxy Zoo vs. Mechanical Turk: While both of these "markets" function by breaking down tasks into micro-tasks that can be completed by just about anybody, one isn't actually a market. Galaxy Zoo, unlike MTurk, doesn't deal in money but rather in the desire to be part of a grand project, much like Wikipedia. And also unlike MTurk, it motivates its contributors intrinsically rather than with the promise of external rewards. Since no money changes hands, Galaxy Zoo dispenses with the need to register an account, and contributors can start classifying galaxies in just one click from the homepage. And since the actions taken by contributors and the questions asked by Galaxy Zoo are the same from one task to the next, Galaxy Zoo is able to create a more pleasant user experience. Part of that is a better-designed UI, which MTurk could benefit from.

Overall: The simplicity of Galaxy Zoo is very noticeable when compared to MTurk. The UI is simpler and more understandable. For both sites the tasks can be similarly mundane, but with Galaxy Zoo one never gets the feeling of being cheated with a low wage. The payoff comes from taking part in helping to classify galaxies. If you're into that kind of thing.

Readings

MobileWorks

I liked that the system had a relatively easy sign-up/log-in flow (one screen) and that tasks were pre-processed in a way that accommodates the limitations or constraints of the technology in the region. The sharing of money earned in real time is nice too, because I think it serves as a motivator or incentive to see funds accrue. However, the paper's mention of a noticeable difference in reported accuracy between more modern texts and 19th-century papers suggested an inherent unfairness that raised some questions. Does the system punish workers for harder or possibly impossible tasks (declines in their quality or accuracy ratings)? Does the system distinguish or keep separate accuracy rates for the types of tasks (modern vs. 19th century)? Is the system aware of these differences in task types, and does it check to make sure the worker is compensated fairly based on difficulty? Answers to these questions may be great starting points for improvements to the system.

mClerk

I appreciated mClerk's attention to efficiency. The fact that it spread virally also indicates that this delivery method has less friction. The use of image-based SMS, as opposed to a browser-rendered image, makes the platform much more accessible and improves the workflow by reducing mobile browser performance or latency issues, and it takes advantage of the technology, leveraging it for quick, lower-cost micro-tasking. I was also intrigued by the process of transliterating to English and the system's conversion back to the local font (to reduce tediousness and errors), and by how this might be scaled to assist with language translation services. I think there could also be an SMS short code that workers text to check their current progress, and perhaps ways to better incentivize top referrers of the system to do what they do best: share the opportunity with others, but not sit back and stop working (perhaps by sunsetting the period during which they can collect rewards based on referrals).

Flash Teams

Flash Teams was an intriguing look into the use of a system to coordinate and manage an expert team. While it's certainly more efficient in terms of time and cost (compared to self-managed workers or the current/conventional means of coordinating work), and I can always appreciate modularity, I admittedly felt disturbed by what the system might mean for the future of highly skilled workers (experts) and creative work. While I liked that it helped make work more efficient (you start once your section is active) and that it helped manage some of the challenges teams currently face (being able to fill needs quickly from a database of vetted workers), I felt that it took away from what I believe is fundamentally important: building trust in people rather than in a system or technology, fostering team bonds, creating meaningful and fulfilling work, and getting away from the idea that process is paramount in building or creating things. It made the experience of working together seem sterile and machine-like.

My concerns also extend from completing more complex tasks (beyond simple prototypes) to the pre-processing or breakdown of large, complex tasks. That process of pre-processing or breaking down complex tasks assumes that either the user (or a project manager) or the system (an AI) understands the general process, as well as the nuances of team members, their skills and expertise, and the domain knowledge needed to configure and assign tasks. An inexperienced user or requester might not have enough expertise or insight into real-world processes to complete projects, and might not be sensitive to the intricacies of the challenges (and opportunities!) that teams face in the creative/development process. An experienced user or requester might face a different challenge: the quality of output may not be up to their expected standards, and teams organized without pre-existing bonds might not take it upon themselves to produce their highest-quality work (believing that their pay is lower for doing cost- and time-efficient work). Meanwhile, an AI or pre-processing system that could break down complex tasks, understand in detail how to chunk them, and know whom to assign might already have the capabilities to carry out and implement the work on its own.

I'm still formulating my thoughts on this and can't yet articulate how to improve it, but I'd imagine improvements along the lines of some degree of worker autonomy, the ability to be fairly compensated for expertise, and consideration by the system that not all projects follow the same requirements, workflow, or process.