Difference between revisions of "Qualitative Analysis RDQA"

From crowdresearch

Revision as of 04:59, 23 March 2016


This page presents exploratory findings from worker responses collected from TurkOpticon. These responses suggest a potential payment expectation for survey tasks and show how workers and requesters relate within and outside of Turk.


The method chosen for this analysis was emergent, because the data source was constantly changing as workers entered new reviews into TurkOpticon and the researcher wanted the research to produce a grounded theoretical model. As a result, the findings are heavily framed by the information structure of the web site, since workers enter knowledge in the form set by the TurkOpticon designers.

TurkOpticon Website

TurkOpticon Reviews

TurkOpticon is a requester monitoring tool created and maintained by Lilly Irani at the University of California, San Diego, and others. It originates from the Turker Bill of Rights, created in 2008. The web site hosts reviews of requesters and their tasks, written by workers, that express ratings and user experiences.

TurkOpticon Rating Scheme

Turkers can rate requesters on 5-point scales along 4 dimensions: "comm", "pay", "fair", and "fast"[1]. TurkOpticon describes the measures as:

  • communicativity ("comm"): How responsive has this requester been to communications or concerns you have raised?
  • generosity ("pay"): How well has this requester paid for the amount of time their HITs take?
  • fairness ("fair"): How fair has this requester been in approving or rejecting your work?
  • promptness ("fast"): How promptly has this requester approved your work and paid?

Data Sampling Procedure

Data were collected through convenience-random sampling. The researcher collected 2 full pages of responses that were present on the site at the time of the visit. TurkOpticon makes data collection difficult because reviews stream onto the web site in real time and the pages update with changing data. To handle this, the researcher needs to keep the web site open without refreshing it in order to maintain a consistent data set. The reviews reach the researcher in effectively random order, since TurkOpticon workers from all over the world enter their experiences at times of their choosing.


RQDA is the R Qualitative Data Analysis package. The package enables researchers to enter data into a database and codify the data at 4 levels: codes, code categories, cases, and annotations. The most basic level is the code applied to a data source. To give structure to the codes, code categories can be composed of several lower-level codes. Only codes and code categories were used in this study.

Data Entry

Data were collected by copying and pasting review text into RQDA and then coding it with the review's respective ratings.

Review Example.JPG

For example, the TurkOpticon review sample would have the text beginning from "Category Validation..." to "...opinion, unacceptable." The sample would then be named, saved as an individual file, and coded with "Fair 1", "Fast 3", "Pay 1", "Comm 4", "Rejected", and "TaskValidation". Any other potential codings were included to create as comprehensive a list as possible for theory development. Other possible codings include "Error" and "ActivityWorker". Each code has a specific definition; for example, "ActivityWorker" is any activity expressed or implied as being undertaken by a worker. The result can become a list such as the Table Coding Key, which can be used to look for cross-patterns that indicate potential areas of coding interaction.
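One way to look for such cross-patterns is a code co-occurrence count. The following is a minimal sketch, not the procedure used in this study: the `codings` data frame and its file names are hypothetical, and only the code labels are taken from this page.

```r
# Hypothetical codings: one row per (file, code) pair.
codings <- data.frame(
  file = c("review01", "review01", "review01",
           "review02", "review02",
           "review03", "review03"),
  code = c("Pay 1", "Rejected", "TaskValidation",
           "Pay 5", "ContentWorker",
           "Pay 1", "Rejected"),
  stringsAsFactors = FALSE
)

# File-by-code incidence matrix: 1 if the code was applied to the file, else 0.
tab <- unclass(table(codings$file, codings$code))
incidence <- (tab > 0) * 1

# Code-by-code co-occurrence: entry [i, j] counts the files carrying both codes.
cooccur <- crossprod(incidence)

# Number of hypothetical reviews coded with both "Pay 1" and "Rejected".
cooccur["Pay 1", "Rejected"]
```

Large off-diagonal counts point to pairs of codes that tend to appear in the same review, which is where a coding interaction might be examined more closely.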

Data Review

Code: ContentWorker

Review [index] Clipping
27 [85:302] The HIT I was doing expired before I could finish, so I contacted the requester to assist me. They created another HIT for me in order to compensate me for my completed survey. I am happy with my experience with them.
66 [175:200] Would love 1000 of these.
67 [328:401] But how often does one get a nickle for writing a list of "filthy words"?
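The bracketed index appears to be a pair of character offsets into the review file: the clipping for review 66 is 25 characters long, which matches 200 - 175. The sketch below assumes a 0-based start offset and an end offset equal to start plus length; the `clip` helper and the padded `review` string are hypothetical and only illustrate the arithmetic.

```r
# Extract a clipping from a review given its [index1:index2] offsets.
# Assumption: index1 is a 0-based start offset, index2 = index1 + clipping length.
clip <- function(text, index1, index2) {
  substr(text, index1 + 1, index2)
}

# Hypothetical review text: 175 filler characters, then the clipping, then more text.
review <- paste0(strrep("x", 175), "Would love 1000 of these.", " More text follows.")

clipping <- clip(review, 175, 200)
clipping
nchar(clipping)
```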


Hairball Map: What might happen outside of Turk?


Scenario Development Example

---Begin Task Submission--- ---Evidence---
1.Worker submits work GENERIC
2.1 Requester mass rejection parameter kicks in GENERIC
3.1 Requester team screens rejected tasks [Account 46]
2.2 Requester sends verification email to worker(UNKNOWN) [Account 56]
2.3 Requester sends automated email to worker [Account 62]
2.3.1 includes a task ticket confirmation for payment [Account 17]
4.1 Requester team submits results report to Worker [Account 46]
5.1 Requester team posts to worker review page [Account 46]
---Begin Generic Email Response---
1. Worker writes email to requester GENERIC
2. Requester responds to email quickly GENERIC
---something happens---
2.2 Requester does not receive email GENERIC
2.3 Requester marks worker's email as "spam" [Account 17]
NOTE: Account 17 is a vengeful worker ("make sure I was paid my 20 cents").

The worker might have acted in a way that pushed the requester to mark the emails as "spam".

How might Requesters manipulate tasks as a response?

These strategies are areas of control that a requester can use to pursue an unknown goal across similar tasks posted sequentially. Workers monitor requesters for these changes.

1. Increase/Decrease Pay [Account 17]
2. Introduce Test Screeners before task [Account 30]
2.1 Announced/Unannounced
2.2 Paid/Unpaid
3. Increase/Decrease Task Qualification Constraints GENERIC
4. New Task Attempt Recreation [Account 27]
5. Control/Block Emails [Account 17]
5.1 Mark all email communications as spam
5.2 Mark partial emails as spam
5.3 Mark none
6. Avoid posting more tasks GENERIC
7. Partition Task Quantities [Account 27]

Turker thoughts about pay on Survey Tasks

Pasted image at 2016 03 19 05 12 AM.png

Data: TurkOpticon 5 Votes v. All Others for Survey Tasks
Test                       p-value
Welch Two Sample t-test    0.005428
Student's t-test           0.003642

Data CSV from March 19

data:  dat$Tasks.w..5s and dat$All.Others
t = 3.3403, df = 12.791, p-value = 0.005428
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 1.976024 9.246208
sample estimates:
mean of x mean of y
10.863968  5.252852

        Tasks w/ 5s    All Others
Mean    10.86396774    5.252851782
SD      4.807581499    2.259549664
N       10             10
P       0.003642357
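As a check, the reported test statistics can be recomputed from the summary figures alone. This is a sketch that uses only the means, SDs, and group sizes listed above; the variable names are illustrative and the raw CSV is not needed.

```r
# Summary statistics copied from the table on this page.
m1 <- 10.86396774; s1 <- 4.807581499; n1 <- 10   # Tasks w/ 5s
m2 <- 5.252851782; s2 <- 2.259549664; n2 <- 10   # All Others

# Welch test: unpooled standard error and Welch-Satterthwaite degrees of freedom.
v1 <- s1^2 / n1
v2 <- s2^2 / n2
t_welch  <- (m1 - m2) / sqrt(v1 + v2)
df_welch <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
p_welch  <- 2 * pt(-abs(t_welch), df_welch)

# Student's test: pooled variance with n1 + n2 - 2 degrees of freedom.
sp2       <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2)
t_student <- (m1 - m2) / sqrt(sp2 * (1 / n1 + 1 / n2))
p_student <- 2 * pt(-abs(t_student), n1 + n2 - 2)

round(c(t = t_welch, df = df_welch, p_welch = p_welch, p_student = p_student), 6)
```

Because the two groups are the same size, both tests give the same t statistic; the p-values differ only through the degrees of freedom (about 12.8 for Welch versus 18 for Student's).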


Future Directions

The data used in this analysis are provided below. Here is some guidance for examining the data yourself. First, it is useful to generate the current coding key table in a separate text file beforehand, to help understand the questions that might be asked of an RQDA file. Use the following code to generate this:




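# x is assumed to be the coding table (one row per coding, with a codename column); keep the first row for each code
tableCode <- x[match(unique(x$codename), x$codename), ]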
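The line above needs a data frame `x` to exist and does not itself write a file. The following sketch shows the whole step under stated assumptions: `x` is a hypothetical stand-in built by hand (in an open RQDA project it would presumably come from `RQDA::getCodingTable()`), and the column layout and tab-separated output format are illustrative choices, not necessarily those used for the files on this page.

```r
# Hypothetical stand-in for the coding table; with an open RQDA project this
# is assumed to come from RQDA::getCodingTable() instead.
x <- data.frame(
  cid      = c(1, 1, 2, 3, 2),
  codename = c("Pay 1", "Pay 1", "Rejected", "ContentWorker", "Rejected"),
  filename = c("review01", "review03", "review01", "review27", "review03"),
  stringsAsFactors = FALSE
)

# Keep the first row for each distinct code name.
tableCode <- x[match(unique(x$codename), x$codename), ]

# Write the code key to the working directory (output format is an assumption).
write.table(tableCode[, c("cid", "codename")], file = "ProjectCodeKey.txt",
            sep = "\t", row.names = FALSE, quote = FALSE)
```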


The text file will be created and saved into the working directory under the name ProjectCodeKey.txt. The code key table for the first analysis is provided here. For the purposes of this project, two file sets were created during sampling.


TurkOpticon RDQA 3.21.1714 database

TurkOpticon RDQA 3.19.1700 database

Data CSV from March 19