Milestone 3 San Jose Spartans PowerIdea 1: Building a Task and Skill Recommendation engine

From crowdresearch


• Workers spend a lot of time searching for tasks relevant to their areas of interest. Finding a task that aligns with their skills and is priced fairly should be an easy, quick, and hassle-free process.

• Requesters don’t have efficient means of finding workers with the required skill set on their own.


Using machine learning and data mining techniques such as association rule mining, or a wide range of classifiers like decision trees and logistic regression (or their ensemble forms), a reasonably good recommendation engine can be constructed, provided there is an avenue to collect sufficient data on workers’ skills, historic task preferences, task prices, and task skill requirements.

We encounter recommendation engines in our day-to-day lives while browsing movies on IMDb or Netflix, or shopping for products and books on Amazon. A similar recommendation engine, with some tweaks to make it specific to turking needs, can be used on MTurk to make task picking a friendlier process.


A recommendation engine that:

• Suggests a list of tasks relevant to the worker’s interests and skill set.

• Suggests a list of workers with the relevant skill set, ideal for the requester’s task.


(Assumptions: The tasks fall under broadly defined categories, e.g., programming, annotating, etc. (to be refined))

At the Worker’s End:

1) Prompt new Turkers to select 5 broad categories, based on which initial suggestions will be made.

2) Check the worker’s historic tasks, gathering his/her skills and past task compensation. Based on the skills and the worker’s completed-task record, association rules can be developed using association rule mining, and the support and confidence of newly available tasks can then be computed.
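As a rough sketch of the rule-mining step, support and confidence for a rule like "workers who take annotation tasks also take transcription tasks" can be computed directly from completion histories. The data schema here (a set of task categories per worker) is purely illustrative, not a real MTurk data structure:

```python
# Hypothetical completion histories: each entry is the set of task
# categories one worker has completed (schema is illustrative only).
histories = [
    {"annotation", "transcription"},
    {"annotation", "transcription", "survey"},
    {"annotation", "survey"},
    {"transcription"},
]

def support(itemset, histories):
    """Fraction of workers whose history contains every category in itemset."""
    return sum(itemset <= h for h in histories) / len(histories)

def confidence(antecedent, consequent, histories):
    """P(consequent | antecedent): support of the union over support of the antecedent."""
    return support(antecedent | consequent, histories) / support(antecedent, histories)

print(support({"annotation", "transcription"}, histories))        # 0.5
print(confidence({"annotation"}, {"transcription"}, histories))   # 2/3
```

A library such as mlxtend automates this over large histories; the point here is only that both metrics fall out of simple set counting.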

Other classifiers such as logistic regression, decision trees, or their ensemble forms can also be trained (in lieu of association rule mining) on the worker’s past task preferences, task compensation, and skills (as a feature set) to predict the most suitable category for a worker; tasks in that category can then be presented to the worker. The classifier’s performance can be expected to improve over time as it learns more about the worker’s likes and dislikes.
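A minimal sketch of the classifier alternative, using scikit-learn. The feature encoding (pay, skill level, task length) and the training data are made-up placeholders; a real system would derive features from the worker’s actual task history:

```python
from sklearn.tree import DecisionTreeClassifier

# Toy feature set (entirely illustrative): for each past task the worker took,
# [pay in dollars, required skill level 0-2, task length in minutes].
X = [
    [0.50, 0, 5], [0.40, 0, 4],    # quick, low-skill tasks
    [5.00, 2, 60], [6.00, 2, 90],  # long, high-skill tasks
]
# Category the worker actually chose for each of those tasks.
y = ["annotation", "annotation", "programming", "programming"]

clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# Predict the most suitable category for a newly posted task.
print(clf.predict([[4.50, 2, 70]])[0])
```

Swapping in `LogisticRegression` or an ensemble such as `RandomForestClassifier` requires only changing the estimator; the feature set stays the same.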

3) A task pool consisting of tasks above a threshold confidence and support can then be presented to the worker, who picks a task from the suggested pool.
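The thresholding step itself is a simple filter. The task IDs, scores, and threshold values below are hypothetical; in practice the scores would come from the rule-mining step and the thresholds would be tuned:

```python
# Hypothetical scores for newly available tasks, as produced by the
# rule-mining step: task id -> (support, confidence). Values are made up.
task_scores = {
    "task-101": (0.40, 0.90),
    "task-102": (0.05, 0.95),  # high confidence, but the rule rarely applies
    "task-103": (0.30, 0.55),  # applicable often, but a weak rule
}

MIN_SUPPORT, MIN_CONFIDENCE = 0.10, 0.60

# Keep only tasks that clear both thresholds.
pool = [
    task for task, (sup, conf) in task_scores.items()
    if sup >= MIN_SUPPORT and conf >= MIN_CONFIDENCE
]
print(pool)  # ['task-101']
```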

4) The worker’s choice is recorded to aid future recommendations.

At the Requester’s End: A skill-matching component can be devised that finds workers with the relevant skills and suggests a pool of workers for the requester to pick from.
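One simple way to sketch such a skill-matching component is to rank workers by set overlap (Jaccard similarity) between their skills and the task’s required skills. Worker names and skill tags below are invented for illustration:

```python
def jaccard(a, b):
    """Similarity of two skill sets: |intersection| / |union|, in [0, 1]."""
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical worker skill profiles (names and skill tags are made up).
workers = {
    "worker-1": {"python", "nlp", "annotation"},
    "worker-2": {"java", "android"},
    "worker-3": {"python", "annotation"},
}

# Skills the requester's task calls for.
required = {"python", "annotation"}

# Rank workers by skill overlap; the top of the list forms the suggested pool.
ranked = sorted(workers, key=lambda w: jaccard(workers[w], required), reverse=True)
print(ranked)  # ['worker-3', 'worker-1', 'worker-2']
```

More elaborate matching (weighting rare skills, factoring in approval rates) could replace the similarity function without changing the overall flow.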