WinterMilestone 2 Algorithmic Hummingbirds
We started the second meeting of the Winter Crowd Research initiative with a recap of the previous meeting and of the goals we are all working towards. We then moved on to this week’s task, which centers on need-finding. We saw the need for efficient algorithms that minimize the work required, not only for successful task completion but also for globally impactful research. We also took part in activities that showed how important it is to distinguish observation from interpretation, and how asking “why” recursively can capture minute details about the broader picture.
During the first days of week 2, we spent quite some time brainstorming how to approach need-finding while taking part in panel discussions with workers and requesters who are major contributors on the Amazon Mechanical Turk (AMT) platform. We collaborated online to come up with questions for the panelists, and the questions were up-voted so that the most popular ones were sure to be asked.
Some of the questions that we asked workers on AMT, with a summary of their answers, were as follows:
1. What does the daily rhythm of a worker on a platform like AMT feel like? Workers typically start with a few threads where they check for requests, and then begin working on tasks. It is an iterative process in which they go back and forth between the two. Whenever a favorite or high-paying HIT (Human Intelligence Task) appears, they grab it immediately so as not to lose it, since the extra earnings can go a long way.
2. How are the tasks viewed differently by a beginner compared to someone who has been on the AMT platform for quite some time? Someone with experience tends to be more calculating and might ask: “If a task pays this much and takes this long, is it really worth my time, or is there a more productive task I could spend my time on? How would it affect my worker rating?”
3. How do workers track their wages? Do they use any tools or plug-ins for the purpose? Tracking is typically learned with experience, and some workers said they keep tabs using extensively detailed spreadsheets. The approach differs from one individual to another and cannot be generalized because of the large variations.
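The calculation an experienced worker describes in questions 2 and 3 can be sketched as a small wage log. This is only an illustration under stated assumptions: the `HitRecord` fields, the requester names, and the target hourly rate are hypothetical and do not come from AMT or from any tool the panelists mentioned.

```python
from dataclasses import dataclass


@dataclass
class HitRecord:
    """One completed HIT, as a worker might log it in a spreadsheet row."""
    requester: str        # hypothetical requester name
    reward_usd: float     # amount paid for the HIT
    minutes_spent: float  # time the worker actually spent


def hourly_rate(records):
    """Return effective earnings per hour across all logged HITs."""
    total_minutes = sum(r.minutes_spent for r in records)
    if total_minutes == 0:
        return 0.0
    total_reward = sum(r.reward_usd for r in records)
    return total_reward / (total_minutes / 60.0)


def worth_taking(reward_usd, estimated_minutes, target_hourly_usd):
    """True if a prospective HIT meets the worker's own target hourly rate."""
    if estimated_minutes <= 0:
        return False
    return reward_usd / (estimated_minutes / 60.0) >= target_hourly_usd


# Example log with made-up numbers.
log = [
    HitRecord("requester_a", 0.50, 5),
    HitRecord("requester_b", 1.00, 10),
]
print(hourly_rate(log))
print(worth_taking(0.10, 2, target_hourly_usd=6.0))
```

A log like this does not address the worker-rating concern raised in question 2; it only makes the time-versus-pay trade-off explicit.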
Some of the questions that we asked requesters on AMT, with a summary of their answers, were as follows:
1. What process is generally followed when designing HITs? The first step is to think about the task and write instructions clear enough that workers can understand and deliver what is expected of them. Requesters first post ad-hoc HITs and check the results; if the expected results were not delivered for some reason, they modify the HITs accordingly.
2. What kind of templates or features would requesters like to see on AMT in the future? The existing templates do not really help with designing HITs, and it would be helpful to let requesters build their own templates using tools like iFrames. Increased support for workflows would also be appreciated.
1. There is a variety of workers: some depend on platforms like AMT as their main source of income, while for others such platforms are an additional source of income. It is important to cater to both types of workers.
2. The round-robin discussions brought to light a new dimension: disabled (or rather, differently abled) workers.
3. Workers and requesters each have similar issues and struggles amongst themselves, but they deal with their problems in isolation. Whether or not that is a wise decision is, again, interpretation rather than observation.
1. Requesters are designing their HITs from scratch (as we have found, templates are of little or no use) and, at the same time, finding it difficult to get appropriate results. Too few people working on a HIT means the results are more likely to be fluctuating or inaccurate, whereas too many people working on the same HIT may give rise to redundancy. It might also be difficult, especially for academic or research HITs, to pay well (due to funding concerns), even though good pay is necessary to motivate workers to participate and to get feedback quickly enough to be impactful in the global market. Evidence: it was mentioned that, on one side, the results are not good enough (which may be due to a variety of factors) while, on the other side, there is no lack of effort.
2. There is large variation in pay, which makes livelihoods extremely unpredictable, especially for workers who rely on AMT as their main source of income. Workers put in around 15 hours a week while balancing family and other commitments, only to see work into which they put their heart and soul rejected with no clear explanation. We may also have to look into the long-term impact of this work on their health (they typically suffer from physical issues such as disturbed sleep cycles and psychological concerns such as job insecurity). Evidence: the stress levels reflected by the workers and the increased cost of living.
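The trade-off in the first point above, between too few and too many workers per HIT, can be made concrete with a simple majority vote. The sketch below is an assumption for illustration, not a method the requesters described: it aggregates the answers collected so far for one HIT and reports how strongly the workers agree, so a requester could stop collecting answers once agreement is high enough instead of paying for redundant ones. The threshold value is arbitrary.

```python
from collections import Counter


def majority_label(answers):
    """Return (label, agreement) for one HIT given answers from several workers.

    agreement is the fraction of workers who gave the winning label.
    """
    if not answers:
        raise ValueError("need at least one answer")
    label, votes = Counter(answers).most_common(1)[0]
    return label, votes / len(answers)


def needs_more_answers(answers, min_agreement=0.75):
    """True if agreement is still below the (arbitrary) threshold."""
    _, agreement = majority_label(answers)
    return agreement < min_agreement


# Three workers labelled the same image; two of them agree.
label, agreement = majority_label(["cat", "cat", "dog"])
print(label, round(agreement, 2))
print(needs_more_answers(["cat", "cat", "dog"]))
```

Majority voting treats every worker as equally reliable, which is a strong simplification; it is shown here only to make the redundancy-versus-accuracy tension tangible.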
1. Fair treatment of workers and, not to mention, fair payment of wages.
2. A process to alert workers when a high-paying job appears or when their favorite requester or favorite HIT gets posted.
3. Allowing requesters to design their own templates.
4. Modules or recommendation systems that help new workers familiarize themselves with the platform and recommend tasks, so that they are motivated to work and stay with the platform a little longer, and do not end up choosing the wrong tasks, earning less, and leaving the platform. Alternatively, plug-ins that help hold workers’ attention for longer.
5. An effective process to check work quality.
6. Support for workflows.
7. A mechanism to check and track wages.
8. Reduced earning fluctuations.
9. Rejections accompanied by valid evidence and suggestions.
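The alerting need in item 2 of this list could look roughly like the filter below. It is a minimal sketch under stated assumptions: the dictionary keys (`title`, `requester`, `reward_usd`), the requester names, and the reward threshold are hypothetical, and no real AMT API is called. How newly posted HITs would actually be fetched is left open.

```python
def should_alert(hit, min_reward_usd, favorite_requesters):
    """Return True if a newly posted HIT deserves a notification."""
    pays_well = hit["reward_usd"] >= min_reward_usd
    is_favorite = hit["requester"] in favorite_requesters
    return pays_well or is_favorite


def filter_alerts(new_hits, min_reward_usd, favorite_requesters):
    """Keep only the HITs the worker asked to be alerted about."""
    return [
        hit for hit in new_hits
        if should_alert(hit, min_reward_usd, favorite_requesters)
    ]


# Made-up listing of newly posted HITs.
new_hits = [
    {"title": "Transcribe receipt", "requester": "requester_a", "reward_usd": 0.05},
    {"title": "Survey", "requester": "requester_b", "reward_usd": 2.00},
    {"title": "Tag images", "requester": "requester_c", "reward_usd": 0.10},
]

alerts = filter_alerts(new_hits, min_reward_usd=1.00,
                       favorite_requesters={"requester_c"})
for hit in alerts:
    print(hit["title"])
```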
WORKER (TURKER) PERSPECTIVE
This work collects forum data and applies ethnomethodological analysis to provide novel depth and detail on Turker Nation workers and requesters, who described their economic status while working on tasks and illustrated their relationships from their own perspective, covering moral, emotional, practical, and ethical issues. Ethnomethodology is the analysis of naturally occurring data that eschews theorizing in order to explicate the organisation of activities as a recognizable social accomplishment. The analysis shows how the design of AMT largely favors requesters, as evidenced by information asymmetry (leading to concerns about deception and privacy violation) and power imbalance. It also suggests “Requester hall of fame/shame ratings” to increase system volatility, and it focuses on improving the community orientation of the platform by introducing SeerKRap to check cases where ethical concepts have been compromised. The authors believe that improving the turker-requester relationship may lead to better-designed HITs and support cooperation, providing more control over how the market functions.
Turkopticon is an activist system that allows workers to publicize and evaluate their relationships with requesters and that has enabled meaningful collaboration amongst workers. The work offers a distinctive vantage point: a case study of design that scales into a highly distributed system and also incorporates feminist analysis. An open question is how far such systems can penetrate existing socio-technical systems. Such activism takes shape in the wild, not only testing seeds of possible technological futures, but also attempting to steer and shift the existing practices and infrastructures of the technological present and propel them into a new dimension of the future.
The overheads of structuring and managing tasks can be reduced by task standardization, much as standardization enabled the mass production of physical goods, and this would make the crowdsourcing market scalable. Once standardization is implemented, platforms may standardize prices to meet real-world needs, enhance or upgrade the platform, add plug-ins that accurately predict task completion times, and so on. The advantages of such standardization include reusability of tasks, tasks that trade like commodities (so they can be completed without having to relearn the platform or think about requester reputation), true market pricing (priority order changes with price), the use of automated market makers to increase liquidity, stronger network effects, and easier breakdown of complicated tasks. It would probably also help remedy externalities (both positive and negative) and set standards that are practically enforceable.
This work investigates a different paradigm for gathering user input: micro-task markets such as AMT, which provide input at low cost and rapid speed. This is vital because it can substantially improve the platform. However, special care needs to be taken during task formulation in order to leverage these capabilities for user studies. It is important to have questions or tasks that are explicitly verifiable, to design tasks so that honest effort pays off better for the turker than spamming through the task, and to enforce several levels of filtering to automatically detect fraudulent responses. Even then, ecological validity cannot be ensured.
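The idea of explicitly verifiable questions can be sketched as a simple filter. In the illustration below, a few questions in each task have answers the requester already knows, and a response is kept only if enough of them are answered correctly. The question names, the answer key, and the threshold are assumptions made for this sketch and are not taken from the study itself.

```python
def fraction_correct(response, answer_key):
    """Fraction of verifiable questions the worker answered correctly."""
    if not answer_key:
        raise ValueError("answer_key must contain at least one question")
    correct = sum(
        1 for question, expected in answer_key.items()
        if response.get(question) == expected
    )
    return correct / len(answer_key)


def keep_response(response, answer_key, min_fraction=1.0):
    """True if the response passes the verifiable-question filter."""
    return fraction_correct(response, answer_key) >= min_fraction


# Hypothetical task: rate an article, plus two questions with known answers.
answer_key = {"num_images": 3, "num_sections": 5}
honest = {"num_images": 3, "num_sections": 5, "quality_rating": 4}
careless = {"num_images": 0, "num_sections": 5, "quality_rating": 7}

print(keep_response(honest, answer_key))
print(keep_response(careless, answer_key))
```

A filter like this only checks the verifiable questions; it says nothing about whether the subjective answers are sincere, and it does not address the ecological-validity concern.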
The foundations of the platform lack certain structural functionality: scaling up, managing complex APIs (application programming interfaces), controlling execution time, and ensuring the quality of task formulation and completion. To tackle these problems, the suggested approaches include a better interface (to reduce overheads and transaction costs), a worker reputation system (to prevent the rejection of genuinely completed tasks by using qualification tests, keeping track of history, and decoupling payment from ratings), a trustworthiness guarantee (so that workers have the freedom to choose whom they work for), and improved search mechanisms (following the power law).
The idea is to embed turkers in an interactive environment to support complex cognition and manipulation tasks on demand. To that end, Soylent was developed: a word processing interface that leverages crowds for proofreading, document shortening, and editing or commenting tasks. It consists of three main components: Shortn (a text shortening service that typically reduces the length to about 85% of the original), Crowdproof (a human-powered spell checker that also suggests fixes), and Human Macro (an interface for offloading arbitrary word processing tasks).