Milestone 7 sanjosespartans

From crowdresearch
Revision as of 13:33, 15 April 2015 by Adityasharma (Talk | contribs) (System)


This is the template for Milestone 7. It is the same format as last week's research proposal.

You will propose your platform in the form of an introduction to a mock research paper. Essentially, imagine you have built your system, incorporating in all the ideas that you wanted to have in it, have run your user studies and evaluations, and everything has gone as planned. How do you convince other researchers that you have built a platform that is novel and that it is more effective at addressing problems than any existing ideas that have been attempted in the past?

An introduction of a research paper summarizes the main contributions of the research. It is generally roughly 1 page (roughly 1000 words), and consists of the following components:

Pricing Mechanism for Crowdsourcing Markets

Abstract

1. Effective pricing and allocation of tasks requires good tools.

2. A framework is needed for designing mechanisms with provable guarantees in crowdsourcing markets. The framework automates pricing and task allocation for requesters on Amazon Mechanical Turk (AMT), where workers arrive online and requesters face budget constraints and task completion deadlines.

3. We present competitive, incentive-compatible mechanisms for two goals: maximizing the number of tasks completed under a budget, and minimizing the payments made for a fixed number of tasks.

4. We build a platform that enables applying these pricing mechanisms in markets such as Mechanical Turk.

Motivation

1. The advancement of the Internet has led to new labor markets, called crowdsourcing markets, in which cognitive work is distributed to hundreds of thousands of geographically dispersed workers.

2. Requesters usually outsource large quantities of simple tasks to anonymous workers. Typical work includes image labelling, sentiment analysis, content generation, and listing verification. Tasks that are too difficult or expensive to automate are the ones that usually get crowdsourced.

3. Many crowdsourcing platforms provide workers with non-monetary incentives such as entertainment, educational opportunities, information, and altruism. Despite the success of these platforms, it is hard to engineer a non-monetary incentive scheme for tedious and repetitive tasks.

4. The overwhelming majority of crowdsourcing tasks are done in exchange for payment, so running a campaign successfully requires pricing and allocating tasks effectively.

5. Designing an effective pricing and allocation mechanism is challenging because of requesters' constraints and the realities of the market.

6. Requesters face task completion deadlines and budget constraints and must account for dramatic elasticity in workforce supply. There is also large variance in the effort needed to complete different tasks, which depends on the skills and background of workers based in multiple geographical locations.

7. Few tools exist in crowdsourcing markets for pricing tasks effectively. Our goal is to develop a theoretical framework and to design mechanisms that work in practice and have provable guarantees.

8. We describe a platform that enables requesters to automate pricing in a crowdsourcing market using our mechanisms along with other pricing schemes. The framework is designed for tasks where the quality of a worker's performance yields no additional utility to the requester above a certain threshold.

9. Workers receive payment after the requester's approval on the crowdsourcing platform. We assume that requesters have access to verification schemes, and we focus on pricing and allocating tasks effectively, independent of quality.

10. We take a mechanism design approach to the pricing problem and enable workers to bid on work by expressing their cost for performing tasks and the number of tasks they wish to perform. Although most crowdsourcing platforms do not give workers this level of expression, existing APIs make the feature easy to integrate into most platforms.

11. The mechanisms presented here serve two major objectives:

a) Maximize the number of tasks performed under a budget, and minimize the payments for a given number of tasks. We consider requesters that impose a deadline for task completion, and workers who arrive according to some known distribution and can strategically misreport their cost or the number of tasks they are willing to perform.

b) The 'Incentive Compatible Mechanism' is designed so that allocation and pricing make it in every worker's interest to bid truthfully.
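To make objective (a) concrete, here is a minimal sketch of one well-known budget-feasible, truthful rule, often called proportional share. It is a simplified offline illustration and not necessarily the mechanism this proposal will use: it assumes every worker offers exactly one task, all bids are known up front, and there are no deadlines. The function name and the example bids are hypothetical.

```python
from typing import List, Tuple


def proportional_share(bids: List[float], budget: float) -> Tuple[List[int], float]:
    """Pick winners and a uniform payment under a budget.

    Assumption: each worker offers one task and bids its cost.
    Returns (indices of winning workers, payment per winner).
    """
    # Consider workers from cheapest to most expensive bid.
    order = sorted(range(len(bids)), key=lambda i: bids[i])

    # Find the largest k such that the k-th cheapest bid is at most budget / k.
    k = 0
    for rank, i in enumerate(order, start=1):
        if bids[i] <= budget / rank:
            k = rank
        else:
            break  # bids only grow and budget / rank only shrinks

    if k == 0:
        return [], 0.0

    # Threshold payment: a winner's own bid does not set its price,
    # which is what removes the incentive to overstate costs.
    payment = budget / k
    if k < len(order):
        payment = min(payment, bids[order[k]])

    return sorted(order[:k]), payment


winners, payment = proportional_share([2, 3, 4, 9, 10], budget=12)
print(winners, payment, len(winners) * payment)
```

Because each winner is paid at most budget / k, total spending never exceeds the budget. Handling online arrivals, deadlines, and multi-task bids, as described above, needs additional machinery that this sketch leaves out.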

Related Work

1. The problem of designing mechanisms for pricing tasks in crowdsourcing markets has been studied before, for example through bargaining between requester and workers to minimize work, and through 'Bandit Algorithms' to maximize the number of tasks.

2. While both are natural approaches, they leave room for frameworks that allow better theoretical guarantees. One approach treats pricing as an 'Incentive Control' problem; another develops a model of workers' effort and learns its parameters from data. The problem of designing mechanisms for procurement has been studied extensively by the algorithmic game theory community over the past decade.

3. Recently, the budget feasibility framework was introduced, where the goal is to design incentive-compatible mechanisms that maximize a requester's objective under a budget.

4. Our model accounts for the online arrival of workers, which raises a significant challenge. There is a body of literature on online mechanism design where agents arrive according to a given distribution. We consider mechanisms for buying items (rather than selling) from strategic agents, which requires different machinery.

Insight

[Figure: PricingAlgo.png]

We also crawled the TopCoder website.

Prices were taken from the prize section. The following categories of useful 'price drivers' were constructed from the data:

1) Development Type (DEV). e.g. new components or updates to existing components, what programming language will be used.

2) Quality of Input (QLY). e.g. the review score and related design statistics.

3) Input Complexity (CPX). e.g. implementation “difficulty” drivers.

4) Previous Phase Decision (PRE). e.g. the price decision of previous design phase.

16 such price drivers were constructed.

Other models studied in the research papers are as follows:

1. Multiple Linear Regression Model

PRICE = β0 + β1·TECH + β2·DEPE + β3·REQU + β4·COMP + β5·SEQU + β6·SCOR + β7·AWRD + β8·EFRT + β9·SUML + β10·WRAT + β11·REGI + β12·SUBM + β13·ISUP + β14·ISJA + β15·ISCS + β16·SIZE + ε

2. Three decision tree based learners (C4.5, CART, QUEST)

3. Two instance-based learners (KNN-1, KNN-k∈[3,7])

4. One multinomial logistic regression method (Logistic), one neural network learner (NNet), and one support vector machine for regression learner (SVMR)
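As an illustration of the multiple linear regression model listed first, the sketch below fits the coefficients β by ordinary least squares using only the standard library. To keep it short it uses just two drivers, named SIZE and COMP after the formula above; the data values are made up for the example and are not TopCoder data, and the helper names are hypothetical.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_ols(rows: Sequence[Sequence[float]], prices: Sequence[float]) -> List[float]:
    """Return [β0, β1, ..., βp] from the normal equations (XᵀX) β = Xᵀy."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 gives the intercept β0
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * y for d, y in zip(design, prices)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(beta: Sequence[float], row: Sequence[float]) -> float:
    """Predicted PRICE for one task described by its driver values."""
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))


# Made-up (SIZE, COMP) drivers and prices, generated as 100 + 50*SIZE + 20*COMP.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
prices = [190, 220, 330, 360, 450]
beta = fit_ols(rows, prices)
print([round(b, 3) for b in beta], round(predict(beta, (2, 2)), 3))
```

With all 16 drivers the same code applies unchanged, although on real data a numerical library would be the more robust choice for solving the normal equations.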

System

Pricing Tasks for Finishing on Time :

1. To prevent task starvation (tasks that stay unattended by workers), tasks should be priced right: not underpriced, and attractive enough to be taken up by workers.

2. A survival analysis model can be used to derive an optimal task price, determining the right price from historical market data. Its disadvantage is that survival analysis says nothing about how the market works or how workers decide to perform tasks.

3. Worker arrivals can be modeled with a non-homogeneous Poisson process (NHPP) based on quantitative data. Building a proper model of worker behavior also requires a descriptive model of how workers decide to take on and finish a task. Workers often select their tasks from a desirable task pool.

4. Our observation shows that workers often have preferences for the types of tasks they like to accept. We use this concept to develop a discrete choice based model for a better pricing policy and scheduling for crowdsourced tasks. In cases where complete, or even partial, information of the market is available, a requester can optimize her task attributes to increase the likelihood of workers accepting the task.

5. Discrete choice models can provide a framework to optimize the attributes of a task and therefore increase its desirability to the user. One convenient aspect of discrete choice models is that this change in desirability can be captured, quantified and used for attribute optimization.

6. Survival analysis, frequently used in epidemiology and biostatistics, is a general term for statistical techniques to determine the time until a particular event occurs. Time can be represented in any units (hours, minutes or years). What constitutes an event depends on context. For instance, in epidemiology an event usually refers to the death of the individual. In the context of maintenance scheduling, an event can be referring to a machine breakdown.

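To illustrate the survival analysis idea above, here is a minimal sketch of the Kaplan-Meier estimator, one of the simplest survival analysis techniques. It is a stand-in chosen for brevity, not necessarily the model this platform would use. Here the "event" is a task being completed, and a task that was withdrawn or still open when observation stopped counts as censored. The durations are made-up values.

```python
from typing import List, Tuple


def kaplan_meier(durations: List[float], completed: List[bool]) -> List[Tuple[float, float]]:
    """Estimate S(t), the probability a task is still unfinished after time t.

    durations[i] is how long task i was observed; completed[i] is True if the
    task finished at that time and False if it was censored.
    Returns (t, S(t)) at each distinct time where a completion occurred.
    """
    samples = sorted(zip(durations, completed))
    at_risk = len(samples)  # tasks still open just before the current time
    survival = 1.0
    curve = []
    i = 0
    while i < len(samples):
        t = samples[i][0]
        events = 0   # completions at time t
        leaving = 0  # completions plus censored tasks at time t
        while i < len(samples) and samples[i][0] == t:
            events += samples[i][1]
            leaving += 1
            i += 1
        if events:
            survival *= 1.0 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hours until completion (True) or until the task was censored (False).
hours = [2, 3, 3, 5, 8]
done = [True, True, False, True, False]
for t, s in kaplan_meier(hours, done):
    print(f"after {t}h: {s:.2f} of tasks expected to remain unfinished")
```

Estimating one such curve per price level from historical data would show how quickly tasks at each price tend to be picked up and finished, which is the quantity a deadline-driven requester cares about.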

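The discrete choice idea described above can be sketched with a multinomial logit model, a common form of discrete choice model. The utility function and its coefficients (beta_reward, beta_minutes) are hypothetical placeholders; in practice they would be estimated from market data.

```python
import math
from typing import List


def choice_probabilities(utilities: List[float]) -> List[float]:
    """Multinomial logit: P(i) = exp(u_i) / sum_j exp(u_j)."""
    shift = max(utilities)  # subtract the max for numerical stability
    weights = [math.exp(u - shift) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def task_utility(reward: float, minutes: float,
                 beta_reward: float = 2.0, beta_minutes: float = -0.1) -> float:
    """Hypothetical linear utility: workers like reward and dislike effort."""
    return beta_reward * reward + beta_minutes * minutes


# Two competing tasks already in the pool (reward in dollars, effort in minutes).
competing = [task_utility(0.50, 10), task_utility(0.40, 5)]

# Probability that a worker picks our 8-minute task at two candidate prices.
low = choice_probabilities([task_utility(0.30, 8)] + competing)[0]
high = choice_probabilities([task_utility(0.60, 8)] + competing)[0]
print(f"P(choose our task) at $0.30: {low:.3f}, at $0.60: {high:.3f}")
```

The change in choice probability between the two prices is the quantified change in desirability that a requester could use when optimizing task attributes.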
Evaluation

1. The aim is to design a mechanism that performs well, where a mechanism decides how many tasks each worker performs and how much each worker is paid.

2. As workers may report false costs, we seek an 'Incentive Compatible' mechanism, for which reporting the true cost is a dominant strategy.

3. Because an incentive-compatible mechanism guarantees that bids are truthful, its performance over the bids can be compared against a theoretically optimal algorithm that knows the workers' true costs.
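The comparison in the last point can be sketched as follows, under the simplifying assumptions that each worker offers one task and that the optimal benchmark pays every worker exactly its true cost. The function names and cost values are hypothetical.

```python
from typing import List


def optimal_task_count(true_costs: List[float], budget: float) -> int:
    """Most tasks an omniscient requester could buy by paying true costs."""
    spent = 0.0
    count = 0
    for cost in sorted(true_costs):  # cheapest workers first
        if spent + cost > budget:
            break
        spent += cost
        count += 1
    return count


def competitive_ratio(mechanism_count: int, true_costs: List[float], budget: float) -> float:
    """Fraction of the optimal number of tasks that the mechanism achieved."""
    best = optimal_task_count(true_costs, budget)
    return mechanism_count / best if best else 1.0


costs = [1, 1, 1, 1, 8]
# Suppose some truthful mechanism completed 2 tasks with a budget of 8.
print(optimal_task_count(costs, budget=8), competitive_ratio(2, costs, budget=8))
```

A ratio close to 1 means the mechanism loses little by having to pay for truthfulness; averaging the ratio over many simulated worker populations gives an empirical counterpart to a provable guarantee.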

References

1. http://www.eecs.harvard.edu/econcs/pubs/Singer_www13.pdf

2. http://www.ieor.berkeley.edu/~faridani/papers/hcomp-2011.pdf

3. http://www0.cs.ucl.ac.uk/staff/mharman/nier13.pdf