WinterMilestone 4 Team-UXCrowd



Ensuring quality on crowdsourcing platforms by introducing a platform ready certification, sentiment analysis, and a standard gold test. By S.S.Niranga | Alka Mishra


A global phenomenon with a minimal barrier to entry, crowdsourcing has transformed people from mere consumers of products into active participants in value co-creation. The crowdsourcing ecosystem is one in which work is being redefined as an online meritocracy, where skilled work is rewarded in real time and job training is imparted immediately via feedback loops [1]. Under such working conditions, the diverse pool of untrained participants, both workers and requesters, often face mistrust and ambiguity about result quality and task authorship. This indicates a need for quality control mechanisms that account for a wide range of behaviors: bad task authorship, malicious workers, ethical workers, slow learners, etc. [2]. Although many crowdsourcing platforms offer clear guidelines, discussion forums, and tutorial sessions to overcome some of these issues, a large percentage of workers and requesters are still unfamiliar with how to use the platforms. In this paper, we propose three methods to help crowd workers produce quality output:

• Platform ready certifications

• Sentiment analysis system

• Gold Test


Crowdsourcing is used to complete a task at low production cost by publishing it to the general public. Many crowdsourcing platforms are available in the market, and most of them offer a decent service to their users [3]. Although many task requesters benefit from the system, some requesters question the quality of the work that workers deliver [4]. Half-finished task creation and lack of attention are among the reasons for this, and many platforms have their own mechanisms to prevent it. Some crowdsourcing platforms provide tutorials, practice sessions, and general forums to mitigate the risk, and some offer interactive training sessions for users. However, the majority of workers and requesters hardly use these platform features, and some do not have sufficient knowledge of the system.

Platform ready certifications

To address these issues, we introduce “platform ready certifications” for users. The certification has multiple stages:

• Platform ready (Beginner)

• Intermediate

• Expert

Each certification stage defines the proficiency of the user, and every worker must obtain the Platform ready (Beginner) certification before starting to work on a task. To achieve the certification, each user answers a series of basic questions about the platform, for example how to accept a task, how to communicate with the requester, and how to rate. The certification encourages users to learn the platform thoroughly, which in turn should improve output quality. Once a worker has completed an adequate number of projects, logged reasonable working hours, received high ratings from requesters, and contributed sufficiently to community support (such as writing articles), the worker can take the Intermediate or Expert level certification. The advanced certification levels motivate workers to become professionals and to help the community.
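The staged rules above can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration: the proposal does not give numeric thresholds, so the values in `LEVEL_REQUIREMENTS`, the 1 to 5 rating scale, and the names `WorkerRecord`, `can_accept_tasks` and `exams_unlocked` are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the proposal only says "adequate", "reasonable",
# "higher" and "sufficient" without giving numbers.
LEVEL_REQUIREMENTS = {
    "Intermediate": {"projects": 20, "hours": 50.0, "rating": 4.0, "community": 2},
    "Expert": {"projects": 100, "hours": 300.0, "rating": 4.5, "community": 10},
}


@dataclass
class WorkerRecord:
    passed_beginner_quiz: bool    # Platform ready (Beginner) questions answered
    completed_projects: int
    hours_worked: float
    average_rating: float         # assumed 1-5 scale, given by requesters
    community_contributions: int  # e.g. support articles written


def can_accept_tasks(worker: WorkerRecord) -> bool:
    """A worker may start tasks only after the Beginner certification."""
    return worker.passed_beginner_quiz


def exams_unlocked(worker: WorkerRecord) -> list[str]:
    """Return the advanced certification exams the worker may attempt."""
    if not worker.passed_beginner_quiz:
        return []
    unlocked = []
    for level, req in LEVEL_REQUIREMENTS.items():
        if (worker.completed_projects >= req["projects"]
                and worker.hours_worked >= req["hours"]
                and worker.average_rating >= req["rating"]
                and worker.community_contributions >= req["community"]):
            unlocked.append(level)
    return unlocked
```

With these assumed thresholds, a worker with 25 projects, 60 hours, an average rating of 4.2 and 3 community contributions could attempt the Intermediate exam but not yet the Expert one.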


Requesters can also take the certification, but it is not mandatory. If they do, they are identified as certified requesters, which adds value to their profile and makes workers more willing to work with them.

Sentiment analysis system

To address inefficient task authorship, Daemo proposes that workers give feedback on a prototype task and that requesters use this feedback to create a refined task. We propose that the feedback on the prototype task be presented to requesters in the form of sentiment analysis. Sentiment analysis, also called opinion mining, deduces and analyzes the emotions conveyed in text and is highly efficient for complex tasks that attract a large amount of feedback. In this case, the mood board (a visual representation) generated by analyzing the feedback would be much easier for requesters to understand and would not require any language proficiency [5].


Figure 1: Three of the 14 pieces of worker feedback on a Daemo prototype task (right), which can be used to revise the task interface (left).

As a visual example of sentiment analysis, we studied the nature of the feedback that workers provided for a particular prototype task on Daemo. The pulse of opinion and feeling about the overall task design was monitored through the collected feedback shown in Figure 1 and analyzed using the Bales Interaction Process Analysis (IPA) method. This method enabled us to identify and categorize the moods and opinions in an ongoing interaction between requester and workers [6]. Feedback from individual workers about the prototype task was categorized as one of the following:

• Positive in the Expressive-Integrative Social-Emotional Area

• Attempted answers in the Instrumental-Adaptive task area

• Questions in the Instrumental-Adaptive task area

• Negative in the Expressive-Integrative Social-Emotional Area

We altered the Bales IPA method slightly to extract the sentiment about the design of this particular task, and created a spreadsheet for the sentiment analysis followed by a graphical representation, as shown in Figures 2 and 3.



Figure 3:
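The spreadsheet step described above, counting feedback per category before charting it, can be sketched as follows. This is only an illustration under stated assumptions: the category keys mirror the four areas listed above, the labels are assumed to have been assigned by hand, and the function names `summarize` and `text_chart` are hypothetical rather than part of Daemo or of the Bales IPA method.

```python
from collections import Counter

# Short keys for the four categories used above (assumed naming).
CATEGORIES = ("positive", "attempted_answer", "question", "negative")


def summarize(labels):
    """Count hand-assigned labels and compute each category's share."""
    counts = Counter(labels)
    unknown = set(counts) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    total = sum(counts.values())
    return {
        cat: (counts[cat], counts[cat] / total if total else 0.0)
        for cat in CATEGORIES
    }


def text_chart(summary, width=20):
    """Render a simple text bar chart, one row per category."""
    rows = []
    for cat, (count, share) in summary.items():
        bar = "#" * round(share * width)
        rows.append(f"{cat:<17}{bar} {count}")
    return "\n".join(rows)


if __name__ == "__main__":
    # Placeholder labels for illustration only, not the study's data.
    example = ["positive", "question", "question", "negative"]
    print(text_chart(summarize(example)))
```

A requester-facing mood board could be built from the same per-category shares; the text chart here only stands in for that visual.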

Gold Test

Finally, we propose adding an extra layer of gold testing to the redesign of a task before it is launched on the general platform. As a means of quality control, this additional sifting could help identify reliable results from workers. The gold test is a series of questions that the requester chooses and answers before launching a particular task. These test questions are used to assess workers' performance both before they start working on a task and on an ongoing basis while they complete it. Based on their performance on the gold test questions, final results are compiled by removing the contributions of poorly performing workers. For a given task, if most workers perform poorly, the task is suspended.
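The filtering and suspension rule above can be sketched as follows. The sketch rests on assumptions that the proposal leaves open: the 0.7 accuracy threshold, the reading of "most workers" as more than half, and the names `gold_accuracy` and `filter_by_gold` are all hypothetical.

```python
def gold_accuracy(answers, gold):
    """Fraction of gold questions a worker answered as the requester did."""
    if not gold:
        raise ValueError("gold test needs at least one question")
    correct = sum(1 for question, expected in gold.items()
                  if answers.get(question) == expected)
    return correct / len(gold)


def filter_by_gold(submissions, gold, min_accuracy=0.7):
    """Split workers into kept/removed and decide whether to suspend.

    submissions maps worker id -> {question id: answer}.
    min_accuracy is an assumed threshold; the proposal gives none.
    """
    kept, removed = [], []
    for worker, answers in submissions.items():
        if gold_accuracy(answers, gold) >= min_accuracy:
            kept.append(worker)
        else:
            removed.append(worker)
    # Assumption: "most workers perform poorly" means more than half.
    suspend_task = len(removed) > len(submissions) / 2
    return kept, removed, suspend_task
```

Running the same check periodically on gold questions mixed into the live task would cover the ongoing assessment; only contributions from workers in `kept` would then enter the final results.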


[1] V. Lehdonvirta and M. Ernkvist, “Converting the Virtual Economy into Development Potential: Knowledge Map of the Virtual Economy,” 2011.

[2] D. Oleson, A. Sorokin, G. Laughlin, V. Hester, J. Le, and L. Biewald, “Programmatic Gold: Targeted and Scalable Quality Assurance in Crowdsourcing,” Artif. Intell., pp. 43–48, 2011.

[3] A. Doan, R. Ramakrishnan, and A. Y. Halevy, “Crowdsourcing systems on the World-Wide Web,” Commun. ACM, vol. 54, no. 4, p. 86, 2011.

[4] M. Allahbakhsh, B. Benatallah, A. Ignjatovic, H. R. Motahari-Nezhad, E. Bertino, and S. Dustdar, “Quality control in crowdsourcing systems: Issues and directions,” IEEE Internet Comput., vol. 17, no. 2, pp. 76–81, 2013.

[5] Y. Hong, H. Kwak, Y. Baek, and S. Moon, “Tower of Babel: A Crowdsourcing Game Building Sentiment Lexicons for Resource-scarce Languages,” Proc. 22nd Int. World Wide Web Conf. Companion Publ., pp. 549–556, 2013.

[6] R. Bales, “Interaction Process Analysis Article,” Interaction Process Analysis. Web. 25 Mar. 2012.

Milestone Contributors

S.S.Niranga @niranga,

Alka Mishra @alkamishra