150828 wording responses
Since you're a requester, I wanted to ask you quickly about some wording for a new crowdsourcing platform called Daemo. I'm helping to work on the reputation system there, and we're trying to make the wording very clear for it. What I'd like to ask you is: what is your first impression of the following statement:
"When you rate this task: 1) you will help to improve the overall quality of results collected through Daemo, and 2) the workers you rank highly will get first access to your future work "
What does this say to you about why you should rate the task? Would you rate more tasks based on this information? Does any part of it stand out to you more than another? Does any part not matter to you? Is there something missing?
Thanks for your input! Hopefully we can create a new platform that serves you much better than mTurk has so far.
My initial thought is to swap 1 and 2, since #2 is the incentive that most requesters will pay attention to, and once people read #1 they may just zone out, since it's typical norm-setting stuff.
Quick answer: I'm confused about how rating the task and rating the worker are connected. Am I rating the "task" as in the work submitted to me by a worker, or am I rating each worker's work on the task? That raises alarm bells.
I am unsure how you imagine the rating system, but imagine having 10K HITs for image tagging. Do you expect a requester to rate 10K individual tasks? That is never going to happen! People won't rate even tens of HITs.
Requesters are used to approving or rejecting work, not providing any other feedback. The way you rank workers should depend strictly on those numbers.
After all, that's why we use crowdsourcing in the first place: to get the work done a.s.a.p., not to spend just as much time afterward rating and giving feedback.
At first I thought the "..." section was asking me as a worker to rate the task, since requesters don't usually rate tasks.
Then I thought you were asking me as a requester because of part 2).
So I think you need to make it clear who is being asked here to rate what and why.
What does this say to you about why you should rate the task? It says to me that good workers will get better rankings and therefore first dibs on work. As I work for a requester and also do some turking here and there, I can look at this from both sides, and I think this would benefit both the requester and the good worker, provided it was verifiable that the worker was indeed good (I am not convinced Master Turkers are any better than non-Masters). The good worker would get more work and be able to pick the jobs they are best suited to or like the most. The requester would be more likely to get higher-quality work.
If a new "mTurk" is created, as a requester I would like to see some kind of testing done for prospective workers. Definitely an English-level test, so it can be determined who has a good working knowledge of English. I have emailed with foreign workers many times and seen Amazon not allow some very good English speakers to do its HITs while allowing others who can barely communicate with me. I seriously doubt that anyone without basic English skills could do HITs on mTurk, unless the HITs were in their own language.
I would like to see more options for HIT design in a new provider, such as: the ability to return a rejected HIT to the original worker to be corrected or completed, so both sides are happy; more user-friendly designs (so instructions are easier to build in and harder to overlook); and easier two-way communication (I would like to know the workers who do our HITs better, not to be friendly, but to be able to answer their questions, help with problems, etc.). I would like a forum that is fair to both the worker and the requester. I would like a place to list problems we are having, when the system is down, answers to the most common problems, etc. I would like a more open and helpful co-existence with the workers who do our HITs.
Obviously pay is important to the requester and the worker, I am not sure how to make it more equitable, but I think it has to be affordable for the requester and profitable for the worker for this to work as we all want it to.
Sorry to dump on you a bit of my wish list, but if you are going to improve the wheel, make it shine too.
One other thing: I think the rejection system in mTurk is very unfair. I don't like to reject a HIT, but we do need it done correctly. I understand that a typo can slip in here and there, but programming does not understand typos, and we cannot sort good data from bad within a HIT, so we have to reject the whole thing. I wish we could reject a HIT so that it goes back to be done correctly, without a good worker taking a rejection strike; in other words, I wish I had the option to reject the work (if they cannot correct it) without also giving the worker a negative mark. So I wish I had more options for approving and rejecting HITs. I do cut good workers as many breaks as I can, but that in turn costs us money, so it is a fine line I try to walk.
To do our HITs you don't have to have a 99% rating; we are not so exacting. I would like to see other requesters not be so brutal with their approval-rating requirements; then workers would not have to be so scared of, or angry about, every rejection. We all make mistakes, and I would like some way to deal with this that is fairer to both sides.
I don't know what a task is in this context; is that completed work?
Both 1) and 2) seem too long. Both could be more concrete.
I am quite confused by these statements. I am not sure which audience they are meant to appeal to.
If they target requesters: (1) an expected contribution to a greater cause (the system) could motivate some to exert more effort, but I feel this is probably not enough for the majority of requesters; it would be better to appeal to the expected benefits of that greater effort. (2) This generally sounds good, but pondering it just a little longer raises quite a few questions: how does the system determine "better" consistently, and how is this different from me simply picking the workers who did better-quality work on my past projects, qualifying/inviting them to do my future work, and disqualifying/not inviting those of lesser quality?
These are my thoughts; I hope they are useful. I do hope you can keep me updated about this "Daemo" platform (or its real name, if that is a pseudonym), its milestones, and its launch if it actually happens.
So this is for requesters (or whatever you end up calling researchers) to rate the workers?
Are we rating each individual worker or the overall success of the project?
If this is a task that I've posted, I'm not sure why I would rate it other than 'this is a perfect task since I created it'---am I missing something, Kristy?
What strikes me is that the wording is inconsistent: in the preface, you say you are rating a _task_, but the second point says "the workers you rank highly". I assume the idea is that requesters are rating the quality of the work a particular worker did on a task.
Also, how exactly do you plan to enforce the second motivation for rating?
In any event, it's important for me (from the AutoMan and SurveyMan perspectives) that this all be accessible by an API.