Evaluation Plan for the Boomerang Reputation System
- We select 10 open-ended tasks and 10 closed-ended tasks from Amazon Mechanical Turk (AMT) or from public datasets.
- We recruit workers from AMT onto our platform, aiming to cover the full spectrum of reputation (poor, medium, and excellent).
- We bring requesters onto the platform and randomly assign each one to either the Boomerang condition or a control condition.
- Each requester then picks one of the tasks and (1) builds the task interface, (2) launches the task, and (3) rates the quality of each worker's submission. The requester then picks another task, prepares it, sends it to workers, and again rates each worker's output. Each requester repeats this cycle 2-3 times.
- Requesters then rank the overall quality of each worker based on the work they reviewed. For open-ended tasks we analyze the ratings at the individual level (e.g., examining the check-minus ratings each worker received on each task); for closed-ended tasks we analyze them at the aggregate level.
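The aggregation step above could be sketched roughly as follows. This is only an illustrative sketch: the check-plus/check/check-minus scale, its numeric mapping, and the data layout are assumptions, not part of the plan itself.

```python
from collections import defaultdict

# Assumed rating scale mapped to numbers (hypothetical mapping for illustration).
RATING_VALUES = {"check-plus": 1.0, "check": 0.5, "check-minus": 0.0}

def rank_workers(ratings):
    """Rank workers by their mean rating across tasks.

    `ratings` is a list of (worker_id, task_id, rating) tuples, where
    rating is one of the keys in RATING_VALUES.
    Returns worker ids sorted from best to worst mean rating.
    """
    per_worker = defaultdict(list)
    for worker_id, _task_id, rating in ratings:
        per_worker[worker_id].append(RATING_VALUES[rating])
    means = {w: sum(v) / len(v) for w, v in per_worker.items()}
    return sorted(means, key=means.get, reverse=True)

# Example: three workers, each rated on two tasks.
example = [
    ("w1", "t1", "check-plus"), ("w1", "t2", "check"),
    ("w2", "t1", "check-minus"), ("w2", "t2", "check-minus"),
    ("w3", "t1", "check"), ("w3", "t2", "check-plus"),
]
print(rank_workers(example))  # -> ['w1', 'w3', 'w2']
```

For open-ended tasks the per-worker, per-task entries in `per_worker` would be inspected individually rather than averaged; for closed-ended tasks only the aggregate ranking would be used.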