Summer Milestone 10 Content Driven Reputation

From crowdresearch
Revision as of 10:30, 30 July 2015 by Rcompton (Talk | contribs) (Measuring the Quality of Contributions in Collaborative Systems)


Content-Driven Reputation for Collaborative Systems

Authors: Luca de Alfaro and Bo Adler

Link: Paper or [1]

Main Goal

The authors aim to judge users by their actions rather than by the word of other users. In such a system, users gain or lose reputation according to how their contributions fare: users whose work is preserved gain reputation, and users whose work is undone lose it. The system does not require users to express judgments on one another.

When reputation is used to grant privileges, the authors argue that users with high reputation are likely to provide useful contributions. Reputation can also be used to estimate content quality and to identify vandalism. They note two issues with user-generated ratings:

  1. Ratings can be quite sparse
  2. Gathering the feedback and ratings requires secondary systems outside the main system

In content-driven reputation systems, every user is turned into an active evaluator of other users’ work by the simple act of contributing to the system. Furthermore, a content-driven reputation system is resistant to bad-mouthing and other attacks.

Contribution

The authors' main contribution is the idea and formulation of a pseudometric on document versions, which they demonstrate on versions of Wikipedia pages and git repositories. To support a reputation system, the pseudometric should satisfy two requirements:

  1. Outcome preserving: If two versions look similar to users, the pseudometric should consider them close. In particular, it assigns a distance of 0 to versions that look identical.
  2. Effort preserving: If a user can transform one version of a document into another via a simple transformation, the pseudometric should consider the two versions close.

A pseudometric behaves like an ordinary distance function (it is non-negative, symmetric, and satisfies the triangle inequality), except that a distance of 0 does not imply that two versions are the same: distinct versions that look identical to users can be at distance 0. The authors present the connection between version pseudometric distance and edit quality, and describe how the resulting reputation system can be made resistant to broad types of attacks. The model is inspired by wikis, but they claim it can be applied to more complex collaborative systems, such as SketchUp models.
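To make the idea of a pseudometric on versions concrete, here is a minimal sketch in Python. It is not the distance used by the authors; it assumes a hypothetical setting in which whitespace and line wrapping are invisible to readers, and it measures the word-level edit distance between two versions. The function name version_distance is made up for illustration.

```python
def version_distance(a: str, b: str) -> int:
    """Word-level edit distance between two text versions.

    Assumption (not from the paper): whitespace differences are invisible
    to readers, so versions are compared as sequences of words.
    """
    x, y = a.split(), b.split()
    # prev[j] holds the distance between the words of x seen so far and y[:j].
    prev = list(range(len(y) + 1))
    for i, word_x in enumerate(x, start=1):
        curr = [i]
        for j, word_y in enumerate(y, start=1):
            substitution = 0 if word_x == word_y else 1
            curr.append(min(prev[j] + 1,          # delete word_x
                            curr[j - 1] + 1,      # insert word_y
                            prev[j - 1] + substitution))
        prev = curr
    return prev[-1]


# Two different strings that render the same are at distance 0,
# which is what makes this a pseudometric rather than a metric.
print(version_distance("the cat sat", "the  cat\nsat"))  # 0
print(version_distance("the cat sat", "the dog sat"))    # 1
```

The distance is outcome preserving for this toy notion of "looks identical" and effort preserving in the sense that changing one word costs 1, however long the document is.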

Measuring the Quality of Contributions in Collaborative Systems

They follow the idea that the quality of an edit can be measured by how long the edit survives in the subsequent history of the document. They define the quality of an edit, relative to a judged later version, as how much closer the edit brought the document to that later version: the distance from the pre-edit version to the judged version, minus the distance from the edited version to the judged version, divided by the distance between the pre-edit and edited versions. Maximum quality is achieved when the change made by the edit is fully preserved in the judged version. Minimum quality is achieved when all of the edit's changes are undone in the judged version, which corresponds to a reversion.
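The quality measure described above can be sketched as follows. This is a reading of the summary rather than code from the paper: the names edit_quality, prev, edited and judge are hypothetical, and versions are modelled as points on a line purely so that the distance is trivial to compute.

```python
from typing import Callable, TypeVar

V = TypeVar("V")


def edit_quality(prev: V, edited: V, judge: V,
                 dist: Callable[[V, V], float]) -> float:
    """Quality of the edit prev -> edited, as judged by a later version.

    Computes (dist(prev, judge) - dist(edited, judge)) / dist(prev, edited).
    When dist satisfies the triangle inequality the result lies in [-1, 1].
    """
    work = dist(prev, edited)
    if work == 0:
        raise ValueError("the edit changed nothing, so its quality is undefined")
    return (dist(prev, judge) - dist(edited, judge)) / work


def line_distance(x: float, y: float) -> float:
    """Toy distance for illustration: versions are points on a line."""
    return abs(x - y)


print(edit_quality(0, 10, 10, line_distance))  # 1.0: the edit is fully preserved
print(edit_quality(0, 10, 0, line_distance))   # -1.0: the edit was reverted
print(edit_quality(0, 10, 5, line_distance))   # 0.0: half kept, half undone
```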

  • A pseudometric is effort preserving if the distance between versions that can be easily obtained one from the other is small.
  • A pseudometric is outcome preserving if the versions that are similar to users are close in distance.

Defining metrics for domains such as the 3D solids generated in SketchUp is not an easy problem. The main difficulty lies in meeting the outcome-preserving criterion, which requires the metric to consider designs that are visually similar to be close.

Content-Driven Reputation

The initial value of user reputation is the amount of reputation accorded to users who have never been seen in the system before, and it depends on how difficult it is to create new user accounts. In systems where users cannot easily create many accounts, the system can afford to give new users some amount of reputation. Essentially, trust follows from the cost of account creation: the more effort it takes to create an account, the more trust can be given to that account.

Updating Reputation

Reputation is updated as follows. For each edit, the quality of the edit is measured with respect to a set of future versions. For each such reference version, the reputation of the edit's author is incremented in proportion to the amount of work done, multiplied by the edit's quality and by the reputation of the author of the reference version, rescaled according to a function. The higher the reputation of the author of the judged version, the higher the weight given to the quality judgement that uses that version as reference. Rescaling is done in order to limit the influence of high-reputation users over the system. In implementing such a system, the authors found that a simple log transformation sufficed for Wikipedia. One restriction is that users are not allowed to judge their own work. A decay factor is also used to ensure that each edit causes a bounded change in the user’s reputation.

This system is a truthful mechanism in the game-theoretic sense: if a user wants to modify a document, a dominant strategy is to perform the modification as a single edit. There is no incentive to play complicated strategies in which the modification is broken up into a sequence of edits having the same cumulative effect.
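A minimal sketch of one reputation update, under stated assumptions. The summary only says that the increment is proportional to work times quality times a rescaled reputation of the judging author, that a log transformation was used for rescaling, that self-judgement is excluded, and that a decay factor bounds the total change. The exact formula, the constants scale and decay, the use of the judge's position (judge_offset) as the decay exponent, and the clamping of reputation at zero are all assumptions made here for illustration.

```python
import math


def update_reputation(reputation: dict[str, float], author: str, judge: str,
                      work: float, quality: float, judge_offset: int,
                      scale: float = 0.1, decay: float = 0.5) -> float:
    """Apply one quality judgement and return the actual change in reputation.

    judge_offset counts how many versions separate the edit from the judging
    version (0 for the next version); later judges count for less (assumption).
    """
    if author == judge:
        return 0.0  # users are not allowed to judge their own work
    # Log rescaling limits the influence of high-reputation judges.
    judge_weight = math.log1p(reputation.get(judge, 0.0))
    delta = scale * (decay ** judge_offset) * work * quality * judge_weight
    old = reputation.get(author, 0.0)
    # Assumption: reputation never drops below zero, which keeps log1p defined.
    reputation[author] = max(0.0, old + delta)
    return reputation[author] - old


reputation = {"alice": 1.0, "bob": 20.0}
# Bob's later version fully preserves a 10-unit edit by Alice.
update_reputation(reputation, "alice", "bob", work=10, quality=1.0, judge_offset=0)
print(reputation["alice"] > 1.0)  # True
```

With a geometric decay the weights over successive judges sum to at most 1 / (1 - decay), which is one simple way to keep the total effect of a single edit bounded.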

Resistance to Attacks and Dark Corners in Collaboration

This system limits the amount of reputation that can be gained from an interaction with other users, unless the contribution itself has stood the test of time. The approach is applicable if the system has no "dark corners", that is, if all edits are viewed in a timely fashion by honest users. This set of good users must consist of users who are both well-intentioned and willing to repair vandalism or damage to documents via edits. In a system with no dark corners, the author of a version only gains reputation from a future reference version in two cases:

  1. The reputation of the author of the future version is greater than the reputation of the current edit's author.
  2. The amount of time elapsed between the two versions is longer than a pre-determined amount of time, and none of the versions created within that pre-determined time objected to the edit.

These conditions ensure that a user can gain reputation only from users of higher reputation or, failing that, only if no other users objected to the edits for a pre-determined length of time. They prevent a person from using sock-puppet accounts to inflate their own reputation.
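The two conditions can be read as a simple gate applied before any reputation gain is credited. The sketch below is an interpretation of the summary, not the authors' code; the function name, its parameters, and the boolean objected_within_window (whether any version inside the time window undid or objected to the edit) are hypothetical.

```python
def may_gain_reputation(author_rep: float, judge_rep: float,
                        elapsed: float, window: float,
                        objected_within_window: bool) -> bool:
    """Decide whether a positive judgement may increase the author's reputation.

    elapsed and window are in the same time unit; window is the
    pre-determined survival time mentioned in the text.
    """
    # Case 1: the judging author has strictly higher reputation.
    if judge_rep > author_rep:
        return True
    # Case 2: the edit survived the whole window without objection.
    return elapsed > window and not objected_within_window


# A fresh sock-puppet (reputation 0) praising an edit right away earns nothing.
print(may_gain_reputation(author_rep=5.0, judge_rep=0.0,
                          elapsed=1.0, window=24.0,
                          objected_within_window=False))  # False
# The same judgement counts once the edit has survived the window unchallenged.
print(may_gain_reputation(author_rep=5.0, judge_rep=0.0,
                          elapsed=48.0, window=24.0,
                          objected_within_window=False))  # True
```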

Conclusions

Content-driven notions of edit quality and reputation are well suited to a large class of collaborative editing systems, subject to two main requirements:

  1. Ability to embed document versions in a metric space, so that distance is both effort-preserving and outcome-preserving.
  2. Presence of patrolling mechanisms that ensure that the system does not have “dark corners”.

Future work:

  • Identifying suitable notions of distance for more general collaborative editing domains.
  • Studying the social consensus dynamics that the system induces.

Reaction: This system would be heavy on task creation, in that the task of creating a specific adaptation of the pseudometric would fall on the requester. Can a requester be given suggestions for working pseudometrics? There is no actual implementation within working systems, only comparisons and implications within current systems. Proofs were provided that give a good amount of support for the system within a document setting.