Milestone 4: Analysing Failure with the "FeedbackMe" system by Team1

From crowdresearch
Revision as of 22:43, 7 February 2016 by Kamilamananova (Talk | contribs)



Analysing Failure Idea: a "FeedbackMe" system

Crowdsourcing is the process of obtaining needed services, ideas, or content from a large group of people. A wide variety of tasks can be crowdsourced, but in any crowd people have different skills and backgrounds, which means that some workers are better than others at certain tasks and worse at others. Tasks range from simple ones with binary questions (e.g. image-analysis questions such as "Is there a human face in the image?") to complex ones such as translation, or any other task whose output is intricate. Because of the lack of proper communication, it is impossible to predict what problems will arise during the creation and execution of a task. Language barriers, ambiguous questions, and the relationship between the quantity of work and the time available create a high risk of task failure. However, it has been shown that the quality of task results can be substantially improved by choosing an adequate task design (Huang et al., 2010). Both requesters and workers are interested in good-quality work, and therefore in receiving feedback that helps them improve and lowers the risk of task failure.

The design of a system

This introduction presents the design and evaluation of "FeedbackMe", a software feature designed to analyze task failure and find its causes. When giving feedback, we often have difficulty rating measurable and unmeasurable qualities, as well as observable and unobservable ones. We therefore propose the following design:

- Multiple-choice questions. We propose a series of binary questions ("Did the worker respect the deadline for every task?", which a requester can answer only with "yes" or "no") and 0-to-5 scale questions to be answered by the requester and the worker after a task is completed, whether successfully or not.

- After the participant checks the boxes in a visible feedback window containing the generated questions and an open-ended question, the checkbox results are recorded (the open-ended answer is viewable only by the requester and worker involved) and stored on the platform in both the requester's and the worker's dashboard.

- When a worker or requester repeatedly receives the same "bad" evaluation for one of the qualities, "FeedbackMe" sends that person an automatic message with proposals for further improvement.

- For example, if a requester does not use proper vocabulary or sufficient detail when describing a task, they are sent an auto-generated message suggesting that they improve their vocabulary or explain tasks more clearly. The exact same thing applies to workers.

- Feedback becomes available shortly after the task is completed and is delivered synchronously to the worker and the requester.
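The feedback flow above can be sketched in code. This is a minimal illustration, not an actual implementation: the question texts, class names, and the "two consecutive low scores" trigger threshold are all assumptions made for the sake of the example.

```python
from dataclasses import dataclass, field

# Illustrative question pools; the real system would generate these per task.
BINARY_QUESTIONS = ["Did the worker respect the deadline for every task?"]
SCALE_QUESTIONS = ["How clear was the task description?"]  # answered 0-5

@dataclass
class Feedback:
    answers: dict            # question -> "yes"/"no" or a 0-5 score
    open_comment: str = ""   # viewable only by the requester/worker involved

@dataclass
class Dashboard:
    history: list = field(default_factory=list)

    def record(self, feedback: Feedback) -> list:
        """Store feedback; return improvement tips for repeated bad marks."""
        self.history.append(feedback)
        tips = []
        for question in SCALE_QUESTIONS:
            scores = [f.answers[question] for f in self.history
                      if question in f.answers]
            # Two consecutive low scores on the same quality trigger an
            # automatic improvement message (the threshold is an assumption).
            if len(scores) >= 2 and all(s <= 2 for s in scores[-2:]):
                tips.append(f"Consider improving: {question}")
        return tips

# Usage: two low clarity ratings in a row trigger an automatic suggestion.
dash = Dashboard()
dash.record(Feedback({"How clear was the task description?": 1}))
tips = dash.record(Feedback({"How clear was the task description?": 2}))
```

The key design point sketched here is that the automatic message is driven by repeated identical "bad" evaluations, not by a single low score.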

Overall, our system benefits requesters (better, clearer tasks are posted) and workers (the quality of delivered work improves).


To demonstrate the performance of the system, we can run a test with 10 workers and 15 requesters. We split the workers into two groups: one that gives and receives feedback to and from requesters (group S) and one that does not (group P). We assume both groups have the same capacities and skills. The requesters are likewise split into groups F, N, and T (5 people each), and we assume all three groups post a mix of easy and complex tasks. Group F gives feedback using "FeedbackMe", while groups N and T do not.
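The group assignment described above can be sketched as a simple random split. The participant identifiers and the function name are illustrative assumptions; the group sizes follow the text (10 workers into S/P, 15 requesters into F/N/T).

```python
import random

def assign_groups(workers, requesters, seed=0):
    """Randomly split participants into the experimental groups."""
    rng = random.Random(seed)
    workers = workers[:]
    requesters = requesters[:]
    rng.shuffle(workers)
    rng.shuffle(requesters)
    return {
        "S": workers[:5],        # give and receive feedback
        "P": workers[5:],        # no feedback
        "F": requesters[:5],     # give feedback via "FeedbackMe"
        "N": requesters[5:10],   # no feedback
        "T": requesters[10:],    # no feedback; post the final tasks
    }

groups = assign_groups([f"w{i}" for i in range(10)],
                       [f"r{i}" for i in range(15)])
```

Random assignment (rather than self-selection) supports the assumption that groups S and P have comparable capacities and skills.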


After the experiment, we expect the following results:

- All participants will have feedback stored in the system (after passing the worker-evaluation stage, every worker receives a final feedback, and after passing the requester-evaluation stage, every requester receives a final feedback).

- Group S workers and group F requesters will answer a set of questions (multiple-choice and open-ended) after completing a task, and those who receive a "bad" evaluation will get an automatic message.

- Group S workers will perform better than group P workers on the final tasks (the tasks posted by group T requesters).

- Group F requesters will create clearer and more detailed tasks (thanks to the mutual feedback).


References

Schulze, Thimo; Seedorf, Stefan; Geiger, David; Kaufmann, Nicolas; and Schader, Martin. "Exploring Task Properties in Crowdsourcing – An Empirical Study on Mechanical Turk" (2011). ECIS 2011 Proceedings, Paper 122.

Slivkins, Aleksandrs and Vaughan, Jennifer Wortman. "Online Decision Making in Crowdsourcing Markets: Theoretical Challenges" (November 2013).


Team1: @seko - Sekandar Matin & @purynova - Victoria Purynova & @kamila - Kamila Mananova