Proposal for a model to simulate evolution driven by organizational factors
We define a game-theoretic description of crowdsourcing as the basis for an agent-based simulation of its evolution. The values of the variables are given as examples; some tuning will be necessary, in particular to make sure that the strategies make sense. The focus of the model is to understand how organizational structures, and in particular the introduction of a guild concept, influence the cost and quality of production as well as wages, with the aim of a win-win evolution of the economics of crowdsourcing platforms.
Definition of Players
Two sets of actors: W (workers) and R (requesters).
Objectives
- W: obtain the maximum amount of money m for a fixed amount of time t (maximum money for the time spent)
- R: obtain a target amount of output o for a minimum cost c (the desired output at minimal cost)
Variables must synthesize the major aspects of the crowdsourcing environment. We must refrain from multiplying them and strive to keep them as few as possible, because each new variable adds a dimension to our state space. Redundant information therefore exacts a heavy price in terms of computational complexity.
In this model, we consider that all requesters generate the same work proposal. This batch is a set B of identical micro-tasks b.
Each individual task b takes a fixed time t_0=2 to complete, and its difficulty is defined by a normal distribution representing the expected proficiency of a standard worker. Considering an execution of the task as a random event drawn from this distribution, the smaller the distance to the mean, the better the quality of the output of this execution. Let d_0 denote the standard deviation of the task distribution. The price p paid by a requester per successful completion of a task is variable and is part of the requester's strategy.
The cost c of executing a batch is the sum of the total amount paid to workers for successfully completed tasks and the processing cost of unsatisfactory completions that are rejected. The latter cost has various origins: dealing with complaints from workers, managing the detection of the fault, and so on. Intuitively, we assume a linear relation between the number of rejected task outputs and this cost, so each rejection generates a fixed cost c_0 incurred by the requester.
c(B) = n_0 * p + n_1 * c_0, with n_0 the number of tasks in B and n_1 the number of rejected task outputs. n_1 is a random number that depends on the workers' execution strategies and on the requester's rejection strategy.
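The cost formula can be written down directly. The following is a minimal sketch; the function name `batch_cost` and the value c_0 = 0.25 are assumptions made for illustration, since the text does not fix c_0.

```python
def batch_cost(n_tasks: int, price: float, n_rejected: int, reject_cost: float) -> float:
    """Cost of a batch: c(B) = n_0 * p + n_1 * c_0."""
    return n_tasks * price + n_rejected * reject_cost


# Illustrative values only: a batch of 100 tasks at p = 1,
# with 20 rejected outputs and an assumed c_0 = 0.25.
print(batch_cost(n_tasks=100, price=1.0, n_rejected=20, reject_cost=0.25))  # 105.0
```

With these values, the 20 rejections add 5 to the 100 paid to workers, so a lax acceptance policy (fewer rejections) and a low price both reduce c(B).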
For a worker w in W, execution is expressed as a Gaussian distribution whose standard deviation measures the quality of the output. A task output is a random event drawn from this distribution. In comparison with the regular task completion (the normal distribution associated with our typical micro-task), we consider three possible execution strategies, each with a different standard deviation:
- Lazy execution (LEW): the standard deviation of execution by this worker is significantly larger (let's say d_0*3) than the regular standard deviation for task completion.
- Correct execution (CEW): the standard deviation of execution by this worker is the same as the regular standard deviation for task completion.
- Thorough execution (TEW): the standard deviation of execution by this worker is significantly smaller (let's say d_0/3) than the regular standard deviation for task completion.
Of course, each of these strategies leads to a different execution time (alternatively, we could use execution time to model skill level: a better execution in the same time period is a sign of better skill):
- A worker applying a lazy execution can produce an output in a fraction (let's say half or t_0*0.5) of the time required for a regular execution
- A worker applying a correct execution can produce an output exactly in the time required for a regular execution
- A worker applying a thorough execution can produce an output in a multiple (let's say double or t_0*2) of the time required for a regular execution
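The three worker strategies can be captured in a small data structure. This is a sketch under assumptions: the names `WorkerStrategy` and `execute_task` are hypothetical, d_0 is set to 1 for illustration, and each worker's output is assumed to be centred on the ideal execution (zero mean).

```python
import random
from dataclasses import dataclass

T0 = 2.0  # regular time per task, from the text
D0 = 1.0  # regular standard deviation; the unit value is an assumption


@dataclass(frozen=True)
class WorkerStrategy:
    name: str
    sigma: float          # standard deviation of the worker's output
    time_per_task: float  # time needed to produce one output


WORKER_STRATEGIES = {
    "LEW": WorkerStrategy("LEW", D0 * 3, T0 * 0.5),
    "CEW": WorkerStrategy("CEW", D0, T0),
    "TEW": WorkerStrategy("TEW", D0 / 3, T0 * 2),
}


def execute_task(strategy: WorkerStrategy, rng: random.Random) -> float:
    """Draw one output and return its distance from the ideal execution."""
    return abs(rng.gauss(0.0, strategy.sigma))


rng = random.Random(0)
for s in WORKER_STRATEGIES.values():
    print(s.name, s.time_per_task, round(execute_task(s, rng), 3))
```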
For a requester r in R, strategies are defined by the acceptance of workers' outputs and by the pricing of b. A strategy is thus described by a couple (d, p), where d is the maximum deviation admissible for accepting a task output and p is the price paid per accepted output. If a worker submits an output at a distance delta from the best execution such that delta > d, this output is rejected by the requester. We envision the following possible strategies:
- Quality First (QFR): aggressive discard of output (d is for example 0.5*d_0) associated with a high price for successful completion (p=2)
- Fair Market (FMR): regular acceptance (d=d_0) as well as regular price (p=1)
- Price First (PFR): lax acceptance of output (d is for example 2*d_0) in association with lower prices (p=0.5)
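Combining the example values above, the probability that a single output is accepted can be computed in closed form. The sketch below assumes that a worker's output is a zero-mean Gaussian around the ideal execution, so that P(delta <= d) = erf(d / (sigma * sqrt(2))); the function name `acceptance_probability` is hypothetical, and d and sigma are expressed in units of d_0.

```python
import math

# Standard deviation of each worker strategy, in units of d_0 (example values from the text).
WORKER_SIGMA = {"LEW": 3.0, "CEW": 1.0, "TEW": 1.0 / 3.0}
# Requester strategies as (d in units of d_0, price p), example values from the text.
REQUESTER = {"QFR": (0.5, 2.0), "FMR": (1.0, 1.0), "PFR": (2.0, 0.5)}


def acceptance_probability(d: float, sigma: float) -> float:
    """P(|X| <= d) for X ~ N(0, sigma^2), assuming a zero-mean output distribution."""
    return math.erf(d / (sigma * math.sqrt(2.0)))


for requester, (d, p) in REQUESTER.items():
    row = {w: round(acceptance_probability(d, s), 3) for w, s in WORKER_SIGMA.items()}
    print(requester, "p =", p, row)
```

Under this assumption, a CEW facing an FMR requester has the familiar one-sigma acceptance rate of about 68%, which gives a first sanity check on whether the example values make the strategies meaningfully different.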
These lists of strategies can of course be extended to better represent behaviours. One must keep in mind that multiplying strategies results in a quadratic growth of the number of evolution patterns, and thus in a more difficult analysis of the possible trajectories in our simulation.
At the beginning, W and R contain a diversified population applying the various strategies. The game consists of successive turns in which players interact one-to-many: one requester with several workers. Each turn organizes random interactions between requesters r and groups of workers.
Let us consider 1000 workers interacting with 100 requesters on batches of size 100, each worker delivering outputs for 12 units of time. Example of work distribution: a requester has gained access to 10 workers (random selection): 3 LEW, 5 CEW and 2 TEW.
- a LEW can produce 12 outputs in a period of 12 units of time
- a CEW can produce 6 outputs in the same period
- and a TEW can produce 3 outputs
This situation results in a weighted distribution of the output: 3*12 = 36 by the 3 LEW, 5*6 = 30 by the 5 CEW, and 2*3 = 6 by the 2 TEW, for a total of 72 outputs.
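The arithmetic of this example can be reproduced in a few lines. This is a sketch using the example values from the text (t_0 = 2, so 1, 2 and 4 time units per task); the function names are hypothetical.

```python
PERIOD = 12                                      # units of time available to each worker
TIME_PER_TASK = {"LEW": 1, "CEW": 2, "TEW": 4}   # t_0*0.5, t_0, t_0*2 with t_0 = 2
GROUP = {"LEW": 3, "CEW": 5, "TEW": 2}           # the 10 workers of the example


def outputs_per_worker(strategy: str, period: int = PERIOD) -> int:
    """Number of outputs one worker can produce within the period."""
    return period // TIME_PER_TASK[strategy]


def group_output(group: dict) -> dict:
    """Number of outputs produced by each strategy within the group."""
    return {s: n * outputs_per_worker(s) for s, n in group.items()}


produced = group_output(GROUP)
print(produced)                # {'LEW': 36, 'CEW': 30, 'TEW': 6}
print(sum(produced.values()))  # 72
```

Even if every output were accepted, this group would leave 28 outputs of the batch of 100 still to be sourced from additional workers.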
When a requester has not reached complete execution of its batch, random workers are selected in W to provide more outputs until the batch is complete. At the end of each turn, each requester computes its cost for the turn and each worker computes its revenue. A scale of target cost and revenue must be defined to evaluate the level of success of a requester or a worker during the turn.
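One turn for a single requester could be sketched as follows. Several simplifications are assumptions rather than part of the model as stated: d_0 = 1 and c_0 = 0.25 are placeholder values, extra workers are drawn by strategy from a list `pool` rather than as tracked individuals, workers are paid only for accepted outputs, and production stops as soon as the batch is complete. The function name `play_turn` is hypothetical.

```python
import random

D0, T0, PERIOD, BATCH, C0 = 1.0, 2.0, 12.0, 100, 0.25  # D0 and C0 are assumed values
# (standard deviation, time per task) for each worker strategy, example values from the text
WORKERS = {"LEW": (3 * D0, 0.5 * T0), "CEW": (D0, T0), "TEW": (D0 / 3, 2 * T0)}


def play_turn(d, p, group, pool, rng):
    """Return (requester cost, list of (strategy, revenue)) for one turn."""
    accepted = rejected = 0
    revenues = []
    queue = list(group)
    while accepted < BATCH:
        # Recruit a random extra worker once the initial group is exhausted.
        strategy = queue.pop(0) if queue else rng.choice(pool)
        sigma, time_per_task = WORKERS[strategy]
        earned = 0.0
        for _ in range(int(PERIOD // time_per_task)):
            if accepted == BATCH:
                break
            if abs(rng.gauss(0.0, sigma)) <= d:  # output within the accepted deviation
                accepted += 1
                earned += p
            else:
                rejected += 1
        revenues.append((strategy, earned))
    cost = accepted * p + rejected * C0          # c(B) = n_0 * p + n_1 * c_0
    return cost, revenues


rng = random.Random(1)
pool = ["LEW"] * 300 + ["CEW"] * 500 + ["TEW"] * 200  # assumed population mix
group = ["LEW"] * 3 + ["CEW"] * 5 + ["TEW"] * 2       # the example group from the text
cost, revenues = play_turn(d=1.0 * D0, p=1.0, group=group, pool=pool, rng=rng)
print(round(cost, 2), len(revenues))
```

The returned cost and per-worker revenues are the quantities that would then be compared against the target scales to decide whether the turn counts as a success for each actor.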
A simulation consists of the definition of an initial population of requesters and an initial population of workers, each with a selected strategy. The simulation runs several turns of the game to understand the dynamics of the composition of the population as the actors' strategies evolve.
At each turn, requesters and workers can either change their strategy or keep the current one active. This evolution results from the experience of previous turns, which the actor keeps in memory. If a strategy is consistently underperforming (meaning that a majority of turns end with what is deemed an inadequate level of success), then the actor will have to select a new strategy. The statistics of this evolution are part of the model.
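The memory-based test for an underperforming strategy could be sketched as below. The class name `PerformanceMemory` and the window of 5 turns are assumptions; the text only requires that a majority of remembered turns be unsuccessful.

```python
from collections import deque


class PerformanceMemory:
    """Remembers the success or failure of the most recent turns."""

    def __init__(self, window: int = 5):  # window length is an assumed value
        self.results = deque(maxlen=window)

    def record(self, success: bool) -> None:
        self.results.append(success)

    def underperforming(self) -> bool:
        """True when the memory is full and a majority of its turns failed."""
        if len(self.results) < self.results.maxlen:
            return False
        failures = sum(1 for ok in self.results if not ok)
        return failures > len(self.results) / 2


memory = PerformanceMemory(window=5)
for outcome in [True, False, False, True, False]:
    memory.record(outcome)
print(memory.underperforming())  # True: 3 failures out of 5 turns
```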
For example, a requester finds its cost too high with its current QFR strategy. What is its next move? It could have a 25% probability of remaining in QFR, 30% of changing to FMR and 45% of turning to PFR. The simulation makes the actor evolve randomly according to these statistics.
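The random evolution can be sketched as a draw from one row of a transition matrix. Only the QFR row below comes from the example in the text; the names `TRANSITIONS` and `next_strategy` are hypothetical, and the remaining rows would come from calibration.

```python
import random

# Transition probabilities applied when the current strategy underperforms.
# Only the QFR row is given in the text; other rows are left to calibration.
TRANSITIONS = {
    "QFR": {"QFR": 0.25, "FMR": 0.30, "PFR": 0.45},
}


def next_strategy(current: str, underperforming: bool, rng: random.Random) -> str:
    """Keep the current strategy unless it underperforms, then draw a new one."""
    if not underperforming or current not in TRANSITIONS:
        return current
    row = TRANSITIONS[current]
    return rng.choices(list(row), weights=list(row.values()), k=1)[0]


rng = random.Random(0)
draws = [next_strategy("QFR", True, rng) for _ in range(10_000)]
print({s: draws.count(s) / len(draws) for s in ("QFR", "FMR", "PFR")})
```

Over many draws, the observed frequencies should approach 25%, 30% and 45%.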
A losing situation results from persistently poor performance. An actor who consistently faces a very low level of performance must quit the game. This is a losing outcome for this actor.
Calibrating the Model
A number of values in the model are currently rough estimates. We should calibrate the model using real-world inputs. A set of questions addressed to real users (workers and requesters) must give the feedback needed to define the correct scales for the various variables:
- the scale of the standard deviation for average, lazy and top performers in terms of quality, in comparison with the rejection policy of requesters
- the scale of different prices for a task
- the scale of time spent per task
- the matrix defining the evolution statistics between strategies
The envisioned procedure is to select a typical workload (say an image tagging workload) and
- ask a group of workers to each execute the same workload, with a direct measure of their respective throughput. We then rate their output and compare the quality level with the time spent producing it.
- ask a group of requesters to first give a price per micro-task for our experimental workload and then review the previously generated output for acceptance or rejection. This gives us the price range and the statistics of the various attitudes towards rejection.
- then submit both populations to a survey of their attitudes in adverse situations. The survey must consist of simple, instinctive choices (see the examples below).
Survey - Workers
Your production is consistently rejected by requesters and only 20% of your submissions are successful. What is your choice? (select one)
- spend more time to make sure my output is better
- spend less time to increase the number of tasks I submit
You find a group of well-priced tasks. What is your choice?
- I rush my execution to make sure I can complete as many of them as possible
- I make sure that I produce a quality result so that it is accepted, even if this means spending more time on each task
You produce work of high quality but do not make much money. What is your choice?
- I want to maintain my quality standard but I will try to be more efficient
- I understand that lowering my standard helps me deliver faster and produce more output
Survey - Requesters
The output received from workers is lacking and you consistently have to recruit twice the number of workers to achieve a useful production. What is your reaction?
- find the lowest possible price to attract more workers
- raise the price to attract better workers
Which do you prefer?
- accept less-than-ideal submissions to stay within my initial budget
- reject lacking submissions even if this means committing more budget to add more workers to the crowd
Representing the Influence of New Additional Mechanisms
To compare various situations and the expected impact of different mechanisms, we have to represent their basic effect in the game. For example, an actor can be allowed to run multiple interactions during the same turn: this helps to represent how far an existing mechanism enhances the diffusion of information in a group.
The Guild Case
Introduction of the guild organization: to define in a nutshell the main action of a guild with regard to our basic focus on cost and quality, a guild acts as a filter. This filter prevents its members from accepting jobs below a minimum price p, and at the same time it blocks the actual submission of any output (realization of the random event) beyond an accepted deviation d, demanding a new submission by the same worker.
In our experiment, the main setting is to compare the outcome of the game for requesters, autonomous players and guild members with varying initial population distributions (in terms of applied strategies).
Let us develop an example: a requester proposes a batch B at price p. Guild members will participate only if p is above their minimum. If this is not the case, only autonomous workers can accept the jobs. When p is sufficient, both guild members and autonomous workers produce output. While the output of an autonomous worker is submitted directly for requester approval, the guild can decide not to submit a member's output because of insufficient quality (according to the actual distance of the output from ideal performance) and ask for a rework before submission. Rework requires time from the worker, time that limits the worker's total output capacity. A task submission coming from a guild member is consequently assured to meet the guild standard.
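The guild filter in this example can be sketched for a single guild member over one period. The function and parameter names are hypothetical, and two modelling choices are assumptions: a price exactly equal to the guild minimum is treated as sufficient, and each rework costs the same time as a first attempt.

```python
import random


def guild_member_outputs(p, p_min, d_guild, sigma, time_per_task, period, rng):
    """Deviations of the outputs a guild member actually submits in one period."""
    if p < p_min:
        return []  # the guild blocks jobs priced under its minimum
    submitted, time_left = [], period
    while time_left >= time_per_task:
        time_left -= time_per_task  # every attempt, including a rework, costs time
        deviation = abs(rng.gauss(0.0, sigma))
        if deviation <= d_guild:
            submitted.append(deviation)  # meets the guild standard, so it is submitted
        # otherwise the output is withheld and the next attempt is a rework
    return submitted


rng = random.Random(7)
# A lazy guild member (sigma = 3, 1 time unit per task) under a guild standard d = 1.
outputs = guild_member_outputs(p=1.0, p_min=1.0, d_guild=1.0, sigma=3.0,
                               time_per_task=1.0, period=12.0, rng=rng)
print(len(outputs), "submitted out of 12 attempts")
```

Compared with an autonomous worker of the same strategy, the guild member submits fewer outputs, but none of them exceeds the guild's deviation threshold, so a requester whose own d is at least as lax as the guild's never pays a rejection cost for them.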
We must also compare the outcome when guilds are involved with the outcome without guilds in terms of requester satisfaction.
Other organizational systems can be represented and tested against one another.
Several directions are possible from first results:
- Expand the model for a more accurate description of reality
- Develop a whole range of mechanisms and model their impact and interactions in simulation results
- Launch the platform and track the various variables of the model to verify the fit and predictive capacity of the model