Reputation Simulations

Summary of current simulation results

Below are the distributions from the various runs of the reputation simulations found here. Six runs have been made so far.

Baseline:

  • Simulates the basic interaction between workers and requesters: when a worker finishes a segment of a requester's job, the two rate each other, based on the quality of the worker's work and the behavior of the requester.
  • Both workers and requesters have varying thresholds for the rating they give.
  • Requesters select workers at random; a worker accepts the work as long as they have the skill to complete the task.
  • The simulation ran for 100 time iterations, in each of which workers found tasks to complete and rating then took place.
  • At the end of each time iteration, new workers and requesters were added, all starting with a reputation of 0.
  • Three reputation scores were available for rating: -1, 0, 1. These represent the check-minus, check, and check-plus model currently described in our reputation paper.
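To make the baseline loop concrete, here is a minimal sketch of it in Python. This is not the actual simulation code. The starting population sizes, the threshold ranges, the single `quality` number per agent, and the use of a plain mean of received ratings as the reputation are all assumptions made for illustration, and the skill check is left out.

```python
import random

def rate(quality, low, high):
    """Map an observed quality in [0, 1] to -1 (check-minus), 0 (check) or 1 (check-plus)."""
    if quality < low:
        return -1
    if quality < high:
        return 0
    return 1

def new_agent(rng, start_reputation=0.0):
    """A worker or requester with its own rating thresholds (ranges are assumed)."""
    return {
        "low": rng.uniform(0.2, 0.4),    # below this the agent rates -1 (assumed range)
        "high": rng.uniform(0.6, 0.8),   # at or above this the agent rates 1 (assumed range)
        "quality": rng.random(),         # work quality or requester behavior (assumed)
        "ratings": [],
        "start": start_reputation,
        "tasks": 0,
    }

def reputation(agent):
    """Mean of the ratings received so far, or the starting value if unrated (assumed rule)."""
    ratings = agent["ratings"]
    return sum(ratings) / len(ratings) if ratings else agent["start"]

def run_baseline(steps=100, seed=0):
    rng = random.Random(seed)
    workers = [new_agent(rng) for _ in range(10)]     # assumed starting population
    requesters = [new_agent(rng) for _ in range(5)]   # assumed starting population
    for _ in range(steps):
        for requester in requesters:
            worker = rng.choice(workers)              # random selection, no skill check here
            worker["tasks"] += 1
            # Both sides rate each other using the rater's own thresholds.
            worker["ratings"].append(rate(worker["quality"], requester["low"], requester["high"]))
            requester["ratings"].append(rate(requester["quality"], worker["low"], worker["high"]))
        # New arrivals at the end of every iteration, all starting at reputation 0.
        workers.append(new_agent(rng))
        requesters.append(new_agent(rng))
    return workers, requesters
```

Calling `run_baseline()` returns the final worker and requester lists, from which task counts and reputations can be read off for plotting.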

Average:

  • The only change from the baseline is that new workers and requesters now start with the average worker or requester reputation rather than 0.
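As a small illustration of this change, the starting value could be computed as below. The function name `starting_reputation` is hypothetical, and taking the mean over the current population (with 0 as the fallback when nobody exists yet) is my assumption about what "average" means here.

```python
def starting_reputation(current_reputations, default=0.0):
    """Starting reputation for a newcomer: mean of the existing population (assumed rule)."""
    if not current_reputations:
        return default  # nobody to average over yet, fall back to the baseline value
    return sum(current_reputations) / len(current_reputations)
```

For example, a new worker joining a pool with reputations 1, 0, -1 and 1 would start at 0.25 instead of 0.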

Cascade:

  • This simulates cascading work release by adding an additional feature to a job: each job has a minimum reputation that starts high and gradually decreases as time goes on, allowing more workers to complete its tasks.
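A rough sketch of the cascade rule follows. The linear decay, the starting bar of 1.0, the step of 0.25 per time iteration, and the function names are all assumptions; the description above only says the minimum starts high and gradually decreases.

```python
def min_reputation(job_age, start=1.0, step=0.25, floor=-1.0):
    """Minimum reputation required for a job that has been open for job_age iterations."""
    return max(floor, start - step * job_age)

def eligible_workers(reputations, job_age):
    """Workers whose reputation currently meets the job's minimum."""
    bar = min_reputation(job_age)
    return [worker for worker, rep in reputations.items() if rep >= bar]
```

With reputations `{"ann": 1.0, "bob": 0.5, "cy": -0.2}`, only `ann` qualifies for a brand new job, `bob` joins once the job is two iterations old, and `cy` joins by the fifth.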

4 point scale:

  • Four reputation scores were available for rating: -1, -0.33, 0.33, 1
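The three and four point scales can both be seen as evenly spaced levels between -1 and 1. The sketch below shows that, with a hypothetical `rate_on_scale` helper that splits an observed quality in [0, 1] into equal-width bins. The equal-width binning is an assumption; the simulation itself uses per-agent thresholds.

```python
def scale(levels):
    """Evenly spaced rating values from -1 to 1, e.g. 3 levels or 4 levels."""
    return [-1 + 2 * i / (levels - 1) for i in range(levels)]

def rate_on_scale(quality, levels):
    """Pick the rating for a quality in [0, 1] using equal-width bins (assumed rule)."""
    index = min(int(quality * levels), levels - 1)  # clamp so quality == 1.0 maps to the top level
    return scale(levels)[index]
```

`scale(3)` gives the check-minus, check, check-plus values, and `scale(4)` gives -1, -0.33, 0.33, 1 after rounding. Note that with four levels there is no neutral 0 rating, so every rating is either somewhat negative or somewhat positive.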

Priority:

  • Requesters selected workers whose reputation met a certain level, and gave priority to workers on the requester's 'like' list, which records workers the requester previously rated highly (in this case, with a 1).
  • There is also a block list, which has the lowest priority; workers were added to it if they were rated -1.
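The selection order described above can be sketched as follows. The function names are hypothetical, and two details are assumptions: that a new rating moves a worker from one list to the other, and that ties within a priority tier are broken by higher reputation.

```python
def update_lists(worker, rating, liked, blocked):
    """Update one requester's like and block lists after rating a worker."""
    if rating == 1:
        liked.add(worker)
        blocked.discard(worker)   # assumed: a later 1 removes the worker from the block list
    elif rating == -1:
        blocked.add(worker)
        liked.discard(worker)     # assumed: a later -1 removes the worker from the like list

def rank_candidates(reputations, liked, blocked, min_reputation):
    """Eligible workers ordered as: liked first, then unlisted, then blocked."""
    def tier(worker):
        if worker in liked:
            return 0
        if worker in blocked:
            return 2
        return 1

    eligible = [w for w, rep in reputations.items() if rep >= min_reputation]
    # Within a tier, higher reputation comes first (assumed tie-break).
    return sorted(eligible, key=lambda w: (tier(w), -reputations[w]))
```

A requester would then offer the task to the first worker in the returned list. Because liked workers always come first, a requester tends to keep returning to the same few workers, which is consistent with a more concentrated task distribution.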

Priority cascade:

  • Combination of both the priority and cascade simulations

Task Distribution for all Simulations

Below is the distribution of the number of tasks completed by each worker over the entire simulation. For reference:

  • X-Axis = Number of Tasks Completed by a Worker
  • Y-Axis = Number of Workers

These plots show the number of workers that completed a given number of tasks. For example, the baseline results show a power-law-like distribution: about 700 workers finished between 0 and 1000 tasks, while only a very small number (around 5) finished over 6000 tasks. Right away it is noticeable that the Average, Cascade, and 4 point simulations spread the completed work much more evenly across workers, a much better result than the baseline. The Priority simulation, however, produces an even worse distribution than the baseline, indicating that only a few workers are getting most of the work. I wanted to see if the cascade model could improve the priority distribution, but there is very little difference between the Priority and Priority cascade results.
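For anyone wanting to reproduce these plots from raw output, the binning can be sketched as below. The function name and the bin width of 1000 tasks are assumptions chosen to match the ranges quoted above, not the settings actually used for the graphs.

```python
from collections import Counter

def task_histogram(tasks_per_worker, bin_width=1000):
    """Count workers per bin of completed tasks; keys are the lower edge of each bin."""
    counts = Counter((n // bin_width) * bin_width for n in tasks_per_worker)
    return dict(sorted(counts.items()))
```

For example, workers with 10, 999, 1000 and 6500 completed tasks fall into the bins starting at 0 (two workers), 1000 (one worker) and 6000 (one worker).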

Reputation Distribution

From here on I look only at the four best simulations (Baseline, Average, Cascade, 4 point) and examine the reputation distribution of the workers. Each graph contains two lines, green and blue. The green line is the probability density of worker reputation and the blue line is the cumulative distribution of worker reputation. If we assume that 0 is the average, we would expect the green line to peak around 0 and the blue line to rise most steeply in that region of the x-axis. However, in the baseline, average, and cascade simulations the distribution is shifted toward the right (positive) side, indicating that on average worker reputation is higher than 0. The blue line adds more information by showing that workers with a reputation of 1 are actually the most abundant. Only the 4 point scale simulation has a roughly normal distribution around 0, which is seen in the cumulative distribution (blue line) increasing steadily.

Possible issues:

  • While this does shed more light on the differences, there may be an issue with the simulation's assumptions about rating criteria. The simulation gives workers and requesters thresholds to rate on, which are chosen at random from a given range of values. This range may be too narrow, causing a lot of very positive ratings. It is interesting that differences between the simulations appear even if this is the case.
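The reading of the blue line can be checked numerically with an empirical cumulative distribution, sketched here in plain Python. The function names and the sample reputations are made up for illustration and are not simulation output.

```python
def empirical_cdf(values, x):
    """Fraction of workers whose reputation is at most x (one point on the blue line)."""
    return sum(1 for v in values if v <= x) / len(values)

def share_at(values, x):
    """Fraction of workers whose reputation is exactly x (the size of the jump at x)."""
    return sum(1 for v in values if v == x) / len(values)

# Illustrative reputations only: most of the mass sits at 1.
sample = [-1, 0, 0, 1, 1, 1, 1, 1]
jump_at_one = share_at(sample, 1)  # 5 of the 8 workers
```

A large jump in the cumulative distribution at 1, as in this made-up sample, is what indicates that workers with a reputation of 1 are the most abundant. A distribution centered on 0 would instead climb steadily across the whole range.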

Task Distribution Over Age

One question was whether new workers would have an equal chance within the system if they had low reputation. Examining the average number of tasks completed by worker age should shed some light on this. Right away it is noticeable that in the baseline simulation, the older a worker is in the system, the more tasks they have completed. This is to be expected, because they have had more time to complete tasks. The other simulations, however, give rather different results: on average, workers of all ages complete about an equal number of tasks. This is great to see, since more work is being distributed and newer workers are not excluded. The most equal distribution is the average simulation, with the 4 point scale coming in second.
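The age comparison boils down to grouping workers by age and averaging their task counts, which could be done roughly as below. The function name and the `(age, tasks_completed)` record shape are assumptions about how the simulation output is stored.

```python
from collections import defaultdict

def mean_tasks_by_age(records):
    """Average tasks completed per worker age, from (age, tasks_completed) pairs."""
    totals = defaultdict(lambda: [0, 0])  # age -> [sum of tasks, number of workers]
    for age, tasks in records:
        totals[age][0] += tasks
        totals[age][1] += 1
    return {age: total / count for age, (total, count) in sorted(totals.items())}
```

A flat result across ages would match the Average and 4 point outcomes described above, while values that grow with age would match the baseline.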