Milestone 10 Aura


Team Aura Milestone 10

In this milestone, the foundational notions of "Power and Trust" are explored and expanded by introducing the triple lenses of "Fairness," "Communication," and, importantly, "Productivity," which will be explained as simple but vital values for the health of any scalable interpersonal relationship. These values define the base of a scalable system for the generation, curation, and implementation of collaborative ideas, which may be a step towards structuring a "self-healing, self-implementing" platform.

Further design ideas, informed by the trio of "Fairness, Communication, and Productivity," will be presented [some at a later date] to demonstrate some ways these concepts may be applied to further aspects of a crowdwork platform. Readers should come away with a foundational understanding of these values and an ability to view many social conflicts as a breakdown among them.

The Trio


[edit: Let me clarify that I am considering Productivity to be what unites "requester" and "worker" into a system with a common drive, despite dissimilar and sometimes competing goals and values. Structuring a system around a dichotomous rather than unified relationship seems to invite friction.]

"Productivity," or the visible communication of the success and value of effort towards achieving known progress, is a value which may be considered the "engine" of any work-centric relationship and of platforms built on such relationships. A few instances support this:

The #1 thing that requesters love about AMT, from her recent survey of requesters, is that "the MOMENT I post tasks, they start getting done." - Kristy

This is easily understood as the limiting of "barriers to productivity" being a dominant positive force in the minds of requesters, who drive both work opportunities and money into the system. Counter to this positive effect is a negative one when barriers to productivity are present, as when AMT's limited search options lead workers to build work-arounds:

All of these options waste a considerable amount of time and effort on the part of workers as well as constantly draining bandwidth from mturk. There are many different ways this could be fixed and result in more efficient work for workers and more money for mturk, so why not invest the time in upgrading the basic functionality of the site more than once every decade? - A full-time turker

In this case the barrier to productivity limits the market's ability to function (low-attention jobs become delayed, decreasing requesters' sense of Productivity), impacts workers' motivation, and portrays Amazon as a "non-trusted actor." This may further degrade workers' sense of "Fairness," since there is a "blame-bearing agent" who is perceived to control the barrier, and who may itself be seen as a barrier (thus creating a motive to subvert the barrier-controller through negative or toxic communications, unchecked workarounds [powerful userscripts], or even the creation of alternative systems).

A quote from the CEO of Ten Thirty One, an entertainment company valued at $30M, suggests the power of productivity and the importance of recognizing it when designing engaging systems and interfaces that encourage sustained and varied effort within a progress-oriented platform (be it a mobile game, a social network, a personal relationship, or a traditional or non-traditional work opportunity):

Growth can be like crack for highly driven people. It's easy to get caught up in trying to take advantage of every opportunity...

The design of progress-oriented relationships, and the systems and networks they inspire, should be informed by a drive to create opportunity (and thus provoke goal-setting) and facilitate, rather than at any point hinder, the harnessing of effort and attempts at Productivity at all levels and for all participants.

Examples of productivity in design

  • Fast registration process (or none)
  • Clear HIT formatting to encourage workers' rapid engagement with new tasks
  • Mechanisms that help requesters rapidly see "progress": powerful task-creation tools, and collaboration with workers to improve task designs through productive, communicative feedback and milestones
  • Immediate access to work upon entering the system
  • Highly visible, functional "help" features
  • Rapid payment
  • Reduction of "noise" when finding work (e.g. Curation of relevant HITs)
  • A decently internet savvy non-programmer should be able to sign up and have a very basic task (e.g., like the Masters Categorization batches on mturk) launched in a matter of minutes. There should be clear tutorials and the friendliest possible interface...Requesters who have a frustrating first experience will just disappear. I would focus almost entirely on that side of the platform design. - user
  • The more that is already done for people and the lower the initial time investment is to post work, the more requesters that will use the site. - user
  • Incorporated positive feedback
  • Modular platform design supporting users in making individual tweaks easily
  • Goal-setting and milestone mechanisms
  • Schemes to increase signs of productivity when it is perceived to wane (think an influx of platform dollars to incentivize productive work, or the positive effect of reporting a decrease in unemployment rates, even if that decrease is due to temporary positions)

Note a close association between wealth and progress. A platform which is designed to promote and even increase global and individual productivity can be a means to earn a living (and the positive benefits therein) and accomplish valuable distributed work, while supporting a fundamental drive for opportunity and progress.


Every monkey prefers grape to cucumber... Fairness (or, more intensely, unfairness) is more easily 'felt' than explained.

Fairness ties in closely with the drive for Productivity in that one basic component of what is 'fair' is a mental comparison of perceived value or worth against what others express that value to be. Twenty hours of work that is later sidelined by the boss feels unfair because value that accumulates while progressing towards a goal becomes devalued upon reaching it; this holds even for something as marginal in perceived 'value' as completing a 30-second survey that is 'disqualified' rather than paid. The determination of the 'real' value impacts the perception of fairness for all parties; thus the control and balance of that determination often underlies struggles of Power.

-Unpaid 'mass rejection' on AMT totalling "thousands of dollars" leads to a mailing campaign requesting Amazon step in to moderate [a 'one-time' action, AFAIK].

Another theme of fairness is a sense of community ownership and investment, which may encourage empathy and commiseration, as well as a defensive gate-keeping force that could act as a limiting floor to chaos. Publics (to borrow from the Dynamo paper) can spontaneously take up actions, or communicate to arrive at enforceable, if unwritten, standards and norms. Specific publics might have a "dove" effect by encouraging positivity to defend their particular area.

-Recognition of effort, no matter the scale.

-Transparent and consistent guidelines for growth and opportunity.

-Outlets exist for free expression of positive and negative feedback.

-Users are encouraged to develop a sense of community and ownership.

-Transparent communication and logic are shared with respect to decisions impacting the system.

- 'Community' is supported by recognizing the non-traditional nature of our platform and its users, using that to band together for more than just 'work'.


Communication may be seen as a tool that influences the efficacy and rate at which parties in a relationship move towards an equilibrium state of understanding, in wide-reaching cases (e.g., self:other, worker:requester, users:platform-developers, international politics, celebrity:offended party, constituents:governing body). Open access to information gives each actor thorough raw data, but the conclusions of the various actors should be actively reconciled to reach conclusions that satisfy all parties and offer improved insight into the direction of positive growth for the system or relationship.

Communication necessarily functions as a generative-receptive process, and problems of fairness and collaborative progress seem to arise when Power is maintained, or Trust loosely 'assured,' by means of barriers to communication: the power to decide and act is perceived as stripped from one party by another. [Consider the power of governments that silence dissent, or the parental means of authority known as "I'm the mom, and I say so. End of discussion."]

In a collaborative work environment, highly effective communication should inform movements for progress by ascertaining the needs and values of the system; predicting certain positive and negative outcomes of actions in the planning stages; ascertaining the capabilities of the actors to limit unrealistic expectations or unappreciated efforts (e.g., by reputation tracking, workflow accountability, audits); and guiding actors collaboratively to productive action that reflects back the worthiness of the system's needs and values. Such an exchange should promote Trust-in-Actors, a global sense of Productivity, and an enhancement to Fairness and empowerment that degrades existing inequalities. [Perhaps this also encourages Trust-in-Askers, wherein the Actors are more encouraged to take up future collaboration, as one often is when efforts are appreciated as productive contributions.]

A case of negative breakdowns in Productivity, Fairness, and Communication

Unbabel, a crowd-translation service, decides translators must connect their accounts via FB or Linkedin or immediately lose access to paid work (but not 'free' tasks). This causes backlash among users, which escalates to a point of toxicity.


Trust is an essential part of building a community that works together for a common goal. We believe that transparency and authenticity is essential to build this Trust. So we are requiring all accounts to be connected to a social profile either Facebook or LinkedIn. This ID verification will be mandatory to get paid tasks from April 25th onward.- Unbabel

User responses:

I am not too happy with this coupling. How is this helping "transparency", "authenticity" and "trust"?

Yes please explain how this helps "transparency", "authenticity" and "trust". What are you going to do with these informations? I don't trust people who spew this PR bullshit, so I would say trust on my part has gone down due to your lack of transparency.

I agree. I think this is totally unfair. I have neither. :(

Call me dramatic, but this whole thing looks so ridiculous I'm almost starting to wonder whether we should strike or something.

This is not helping us trust these actions either, Unbabel. I'm seriously doubting your intentions here.

I don't see why they shouldn't still be able to cash out on what we earned before this change. I think anything less than that is totally unfair to us considering all the hard work we have done here.

I mean, I can understand that it's also not fair to any of us (including the management) if fake users are signing up on Unbabel, but there are so many other ways to tell whether we are fake or not other than this. At least a few of them have even been suggested in this very discussion.

It's not the right way, it's the easy way...

If you have identified the problem, then you know the profiles concerned. Why not just delete them, instead of holding the whole community ransom for the faults of the few?

There is evidence in the sentiment of this discussion[1] that incomplete, and especially "superficial," communications can lead to a sense of decline in Fairness by portraying unequal Power distribution; to poor collaboration, leaving workers feeling marginalized rather than valued and validated by the system; and to degraded Trust in the actor's motives and ability to keep in mind the values of those impacted by future actions. [Note the sense of community and ownership expressed in some of the user comments, perhaps an effect of the dedication and effort invested to date.]

Further efforts by Unbabel in the discussion to explain the new requirement seemed to have less impact in reversing negative sentiment, and may have appeared as deflection and superficial assurance meant to squelch the spread of negative sentiment through the system rather than truly encourage a collaborative sense of communication.

Hi Everyone

Thank you for your comments and suggestions. We will consider some of them in further developments.

I would like to reinforce that we are not using your personal information for any other reason than account verification and identification. So we do not sell data and we do not spam.

If you don't agree with the terms that's ok, we respect that and we part ways.

I just remembered this quote: "I don't know the key to success, but the key to failure is trying to please everybody." - Bill Cosby

The statement "If you don't agree with the terms that's ok, we respect that and we part ways" speaks to the inequality and strongly reinforced duality of the system at hand: there are rule makers and rule followers, and mistrust arises when the rule makers appear to misstep. Given the relative success of the platform, this might be a case of high Productivity maintained by low Fairness and semi-"need to know" Communication that informs, rather than involves, the workforce, at the expense of tenuous Trust on both sides.

Throughout this discussion, there are clear feelings of a decline in Fairness, loss of Trust, general de-motivation (with some users stating they will 'part ways'), questioning of practices of Communication and transparency (in some ways a de-valuation of equal status becoming a level of mistrust), and a rise in the level of "toxicity" existing in the system (friction).

Improving the Unbabel case

If a collaborative process had been in place, the Unbabel team may have been able to link the platform-wide announcement to a full disclosure of the process that informed the decision. Negativity may be lessened if the actors can demonstrate that the process was informed by user needs and values - for example, by showing that the base issue of trust and authentication had reached a threshold of complaints (which offers users a limited but effective feedback tool), and by maintaining an open forum where issues were heard, considered, and incorporated into the design concepts (a more intensive collaboration effort) - or at least by offering evidence that the decision not to incorporate certain ideas was logical and informed.

Limiting barriers to productive communication, and encouraging that communication with a history of Trust-in-Actor, or even objective incentives, may promote a sense of community and unified productivity. Most efforts are likely well intentioned, and informative, collaborative communications may lessen the productivity-sapping experience of negative feedback when those intentions don't translate to results, by offering an impression of community support that diffuses the negativity and hopefully encourages future collaborative action.

A 'self sustaining/ self-repairing' market

The basis of the concept:

a step to a 'self-sustaining / self-repairing' market. Problems are identified by the crowd, solutions are gathered from the crowd and requesters, and the priority ideas are distributed back as tasks for the crowd. The users can redefine and reform the platform with diminishing need for oversight and outside technical support. Eventually we have crowdusers whose purview includes structuring the 'repairing' tasks... Then it doesn't matter if you have an idea or a contribution; the market has a place for whatever you can give

One summary of the basis of my concept is "Never lose a great idea, nor forget the worst." Any idea can be a point for growth, and that idea can come from anywhere; how can a crowd capitalize on that potential?

Some important points in building upon it:

  • the open gov mechanism should be able to scale... we could have 1 million or 100 million freelancers in the future; the platform must find a way to handle that if open gov will be implemented
  • let's remember that if this scales as we think it will, there is a good possibility that the idea feed could read like a twitter feed: prolific and fast
  • there are certain tasks that aren't sexy or high profile but are critical
  • it would be a shame to shelve a good rule/feature/policy just because of a lack of substantial votes
  • would you be willing to ditch open gov if there is an alternative which is a simpler approach?

I was especially intrigued thinking about "Does simpler mean better?" "Do fewer actors improve or limit outcomes?" "What benefits would a small group of decision makers have over a scalable group, and how could those benefits be replicated (in a much larger, distributed system)?"

The crowd would need to replicate the speed and (anticipated) quality of decision making of a small dedicated leadership group. It should have no problem generating ideas, but will need to cull through them. And due to the nature of the platform, I believe there should be no tasks that require off-platform workers (eventually...)

So a platform for governance must: 1) scale to support any number (or lack) of ideas, voters, topics, etc.; 2) function productively for users, with low-friction design limiting noise, toxicity, and sources of friction and stalling; 3) encourage Trust-in-Actor - that is, sustain visible productivity regardless of the task, including 'unsexy' ones; and 4) have benefits offsetting potential flaws.

Some of the problems and solutions I imagined seemed to have encouraging overlap with the Dynamo paper, which I read after developing the system I have in mind.

A general overview

(sort of a 1st tier: identifying needs) Users submit ideas, needs, complaints, system-maintenance-type tasks, etc. through a platform tying into, but segregated from, the work platform. Other opted-in users are able to access these ideas and vote on them directly and immediately. Incentives at this tier encourage sharing, participation, and community. This tier will have a mechanism to cull ideas.

(2nd tier: translating need to action) Submissions that pass some threshold move forward to a second tier where reputation-holding (accountable, potentially non-anonymized) crowdworkers can essentially bid to become a 'foreman' (or fore-team; consider X-Prize-scale efforts) for that idea by submitting a more fleshed-out plan to advance and accomplish it, including a labor-hours and budget estimate. This plan is translated, by the foreman or another, into tasks that can be done by crowd workers (looped back through the platform). Since great ideas are just ideas without action, a foreman's plan might sit in tier 2 until enough qualified crowd workers have signed on to fill the labor-hours anticipated for the project (perhaps including a buffer for drop-out rate, etc.). Incentives at this tier should generate competition and varied solutions. This tier should be a graveyard for ideas that are unactionable.

(A tier 2.5 might combine brief ideas together to collate issues that should be dealt with simultaneously)

(3rd tier: approval and action) An idea that has initial support, a plan, a foreman (or fore-team), and an apparent 'crew' at the ready is approved to launch either all at once, or the plan, foreman, etc. are vetted progressively (or maybe some decisions, like the crew, are left to the foreman; we need to avoid favoritism or slowing the process). Progress begins on the project. I would propose some oversight, in that communication among the crew is open to an auditing crowd who can comment on the progress or halt it under certain conditions.

(Testing tier) In order to replicate some of the 'foresight' we would hope to find in decision makers, systemic ideas are tested in n tiers, which return feedback to the foreman and crew to re-iterate as needed. We try to rely on "evidence" before rolling out changes at large scale: for example, A/B testing interface redesigns, positive user feedback in surveys, and measurable improvements in aspects of work or use (e.g., decreased downtime between tasks, improved task completion rates, improved accuracy).

(Implement) Once we have 'reason to believe' a change will have a positive effect at larger scale, it is rolled out accordingly. Hopefully the system is informed by user needs, generates a range of proposals to address them, addresses them effectively with intelligent resource allocation, and guards against future issues via smart roll-out (and reversal) policies.
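The tier progression above can be sketched as a simple state machine. This is a minimal illustration, not a specification: the tier names, vote threshold, and crew buffer are all hypothetical stand-ins for whatever gating conditions the crowd settles on.

```python
from dataclasses import dataclass
from enum import Enum, auto

class Tier(Enum):
    SUBMITTED = auto()    # tier 1: idea collected, open to votes
    PROPOSAL = auto()     # tier 2: a foreman's plan is attached, crew forming
    ACTIVE = auto()       # tier 3: crew filled, work under way
    TESTING = auto()      # n-tier testing before rollout
    IMPLEMENTED = auto()  # rolled out platform-wide
    CULLED = auto()       # dropped at any stage

@dataclass
class Idea:
    text: str
    votes: int = 0
    tier: Tier = Tier.SUBMITTED
    labor_hours_needed: int = 0
    crew_signed_on: int = 0

def advance(idea, vote_threshold=10, crew_buffer=1.2):
    """Move an idea forward one tier when its gating condition is met.

    Thresholds are illustrative: a submitted idea needs enough votes;
    a proposal needs a crew covering the labor-hours plus a drop-out buffer.
    """
    if idea.tier is Tier.SUBMITTED and idea.votes >= vote_threshold:
        idea.tier = Tier.PROPOSAL
    elif (idea.tier is Tier.PROPOSAL and idea.labor_hours_needed > 0
          and idea.crew_signed_on >= idea.labor_hours_needed * crew_buffer):
        idea.tier = Tier.ACTIVE
    return idea.tier
```

In this sketch an idea never skips a tier, matching the pipeline's intent that support, a plan, and a crew must each exist before work begins.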

Ideas market anonymity

In order to encourage the free exchange of platform/community-advancing ideas, and eliminate some of the defensiveness that arises when an idea is "owned" by an individual, user-submitted ideas are stripped of identifying data. Governance/idea-market identity is separated by tying my "crowd worker" user account to an anonymized "crowd voter" account - a blockchain could assure that I am who I am and have some vote, while I am assured that my identity is otherwise unassociated with my vote and any opinions I share.

On the other hand, there seems to be a place in the market for accountability and known "blame bearing agents," namely when moving from idea to action. By naming the 'Actors,' the system might learn which users are able to accomplish goals and allocate resources in ways that diminish risk. This could be visible or somewhat behind the scenes with a rating system.

Idea submission

Users submit ideas for the platform in a character-limited format to encourage a "to the point" approach, stripping away the "toxicity and noise" that can sometimes obscure communications or cause bias and defensiveness. Designing ideas can become "for the community," and flame-wars and defensiveness (massive friction) can perhaps be avoided by limiting communication at the points where there is the most friction for the least progress (I realize this runs counter to the discussion of open communication above).

Ideas are submitted and accessed based primarily on relevant topic: Is this regarding payments? User interface? Bugs? Color schemes? I imagine a hierarchy that can collapse and expand subtopics algorithmically; when a certain grouping of issues has enough activity, it can be promoted out of its parent category, or collapsed back when activity is limited. This should encourage focus and improve users' ability to find and report issues in the appropriate ways.

NLP or crowd-culling might be used to identify and collate issues with the same basic premise.
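The promote/collapse behavior of the topic hierarchy might look roughly like the sketch below. The tree shape, activity counts, and `promote_at` threshold are invented for illustration; a real version would presumably use a richer activity signal than raw submission counts.

```python
def visible_categories(tree, activity, promote_at=50):
    """Decide which topics appear as their own entries.

    tree: {parent_topic: [subtopics]}
    activity: {topic: recent submission count}
    A subtopic is surfaced alongside its parent once its activity crosses
    promote_at; otherwise it stays collapsed under the parent.
    """
    visible = []
    for parent, children in tree.items():
        visible.append(parent)
        for child in children:
            if activity.get(child, 0) >= promote_at:
                visible.append(child)
    return visible
```

Run periodically, this lets a hot grouping like "payment delays" surface on its own while quiet subtopics fold back into their parents.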

Accessing Ideas

Opting in to the "idea/ governance" system allows users to access the ideas and actions currently in the system. Hopefully awareness of all of this encourages a sense of community progress.

Users might be able to dig through the parent and sub-categories like a reddit or forum - seeing all 'ideas' and all 'proposals,' checking the progress of 'proposals in action,' as well as notions like "most controversial" for ideas that have split approval, "idea hell" for those with universal disapproval (but which haven't been flagged out), "idea heaven" for ideas or proposals that had lots of support but have not yet found ways to be implemented (to encourage attempts, X-Prize style), "stalling ideas" for those that are losing traction and going stale, etc.

In addition to this, I foresee a widget that allows users to view a stream of whatever categories they want to access. This would be 'incoming only,' allowing ideas to be displayed and remembered/refined, but not responded to directly. Each stream might only represent a small portion of the total ideas in the system, and submitting a new idea won't assure it's seen by any particular user. The rate of the stream might be adjustable, but is limited to allow the user to consider ideas rather than rush through, or be crushed by, them. This and a collapsible hierarchy might help users not be overwhelmed, while keeping them engaged with the community that is visibly acting on the platform.

The tasks generated in the tier 2 proposals are available for all workers in the regular market, perhaps distinguished as "proposed projects".

"Remember or Refine" (a noise limiting approach)

When an idea comes through a stream, the default lifecycle of that idea is "doomed." A user only needs to engage with the stream and "vote" if they think the idea should "be remembered by the system later." The implication is: "You might be the only one who sees this, so act on it if it's good enough." With one click, the user can inform the system that it should propagate the idea in the future. This is what I call "remember."

The other option is "refine" which allows a user to take the original character-limited suggestion and "refine" it by submitting their own edit or re-phrasing of it. Ideas that are refined propagate both the intact original and the re-worded version.

A user who clicks to "remember" or "refine" a suggestion is shown up to n (e.g., 3) previous refinements of the idea and asked if any of those better express their feelings; whichever they decide is best is propagated in that instance.

The idea behind this is a system that allows the crowd to self-moderate in terms of limiting noise. Ideas are visible, but 'forgotten' by default. Good ideas that are poorly expressed or blurred by anger are refined only by users who 'fundamentally agree' with the statement, not by those who disagree outright (since there is no ability to attach comments, there is no way to structure a discussion or flame war in the stream system, and refining an idea in a way that changes it completely necessarily propagates the original as well). The "faithful" get the power to define what is "noise" and "toxicity" surrounding and obscuring the core issue, which may be a seed for excellence.
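A minimal sketch of the remember/refine loop, assuming a plain dictionary store and made-up field names (`weight` as the propagation count, `refinements` as the list of rewordings; neither is specified in the text):

```python
def remember(store, idea_id):
    """One click: bump the idea's propagation weight so it re-enters streams.
    Ideas nobody touches keep weight 0 and are 'forgotten' by default."""
    store[idea_id]["weight"] += 1

def refine(store, idea_id, new_text, max_shown=3):
    """Record a reworded version alongside the intact original.

    Returns up to max_shown prior refinements, which the UI would show the
    user before accepting a brand-new wording. Both the original and its
    refinements keep propagating, so a refinement can never bury the source.
    """
    prior = store[idea_id]["refinements"][-max_shown:]
    if new_text not in prior:
        store[idea_id]["refinements"].append(new_text)
    store[idea_id]["weight"] += 1
    return prior
```

Note that the only mutations possible are additive (more weight, more wordings), which mirrors the design's claim that disagreement has no channel here: there is nothing to downvote or comment on.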

The stream as a (social) power-weighting mechanism

Because the spread of each issue depends on users actively propagating it, manipulating the reach of each user should impact overall awareness and activity around the issue. A reputation system can (in theory) provide uneven reach for certain users. For example, users whose ideas tend to progress far into testing stages may be given a larger initial voice for submissions of a similar type than users whose ideas are most often flagged as offensive, or otherwise tend to be 'unproductive' and fall short of action.

A user who produces 'noise' (unproductive ideas - not spam, which can be machine-filtered) will likely not be 'remembered or refined' often; thus the idea can fall out of the system rapidly. Because this user's submission gained little traction, their next submission may be adjusted on the backend to reach fewer user streams, fall lower in a reddit-style list, etc. Conversely, producers of highly effective ideas and proposals are given more voice (e.g., when a user's contributions go far in the process of creating action advancing the system). In this way, the crowd further limits its own noise and perhaps replicates a process of identifying the best informants and actors in the system.

An individual's 'social reach' in this case could vary based on the particular bit they are spreading in that instance - for example, a user whose "refined" ideas in the UI-improvements category create a lot of productive activity will have far spread in that field, but limited spread in others where the 'reputation' has not developed.

Incidentally, I might suggest that new, un-vetted users have their ideas spread at an assumed high social status, letting them fall in relative standing (which I suspect could be rapid), rather than 'proving themselves' repeatedly over time. A user's one and only idea could be just right, and maybe should be assumed to be so until evidence points otherwise.
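One way to sketch this optimistic-prior reach idea: new users start at full reach, and standing falls only as evidence accumulates. The `base` and `floor` values, and the per-category success history, are illustrative assumptions, not part of the text.

```python
def reach(history, base=100, floor=5):
    """Return how many user streams a submission should enter.

    history: list of outcomes for a user's past submissions in one category
             (1 = idea progressed toward action, 0 = idea was forgotten).
    An empty history gets full base reach (the optimistic prior for
    un-vetted users); otherwise reach scales with the success rate,
    never dropping below a small floor so no voice is silenced entirely.
    """
    if not history:
        return base  # un-vetted: assume high standing until evidence says otherwise
    success = sum(history) / len(history)
    return max(floor, int(base * success))
```

Keeping `history` per category also captures the later point that reputation is field-specific: a strong record in UI ideas would not inflate reach elsewhere.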

When to progress

I would propose that there is more than just votes to determine how long all these bits stay active in the market, and when action is taken to move a bit to the next tier. These decisions should take into account the total activity the idea or proposal generates. A controversial or complex idea may not get many votes, but its importance can be seen in views, sharing, time spent on page, etc.

At any point the system can determine the most and least 'engaging' ideas on the market and home in on them. This also eliminates strict threshold requirements and could even out the effects of periods of high and low system activity.
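An engagement score that blends votes with the softer signals named above might look like this; the weights are placeholders, not tuned values, and the field names are invented for the sketch.

```python
def engagement(idea):
    """Blend votes with softer signals so a controversial idea with few
    votes but heavy viewing still registers. Weights are illustrative."""
    return (3.0 * idea["votes"]
            + 1.0 * idea["shares"]
            + 0.1 * idea["views"]
            + 0.01 * idea["seconds_on_page"])

def extremes(ideas):
    """Return the most and least engaging ideas, the ones the system
    would home in on, instead of applying a fixed vote threshold."""
    ranked = sorted(ideas, key=engagement)
    return ranked[-1], ranked[0]
```

Because progression is driven by relative rank rather than an absolute cutoff, quiet periods still surface their best ideas, which is the evening-out effect described above.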


One of the salient issues raised in the Dynamo paper is the need for outside actors to step in when stalling is occurring. I summarize the situation as such: "Volunteers require dedication. Workers require fair compensation."

(My interpretation is that the work done by Dynamo leaders when the crowd effort waned exemplified a high level of dedication, which would be difficult to replicate by any static gate-keeping group facing growing scales of time, effort, etc.)

In this platform, there should be a constant source of income (a tax, or fee per task) which can be used to incentivize progress. The 'actors' are incentivized in the traditional way, by the opportunity for paid labor contracts. I would propose that ideas making it to the 2nd tier can be prioritized by the crowd (or inherently are, through the process of moving to that tier), and that that prioritization can inform the level of incentive available to each task - meaning the budget allotted to a task is fluid, dependent upon its priority. Low-priority tasks can get done, but carry low incentive in terms of real compensation and a tight budget. This provides a mechanism to adjust the rate at which "unsexy" tasks are taken up and assure they get done; for example, by compensating "system maintenance" tasks with a 20% bonus.
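The fluid, priority-linked budget could be sketched as below. Only the 20% maintenance bonus comes from the text; the 0.5x-1.5x priority scaling is an assumed shape for "budget is fluid, dependent upon its priority."

```python
def task_payout(base_rate, priority, unsexy_bonus=0.20, is_maintenance=False):
    """Compute a task's payout from the shared incentive pool.

    base_rate: baseline pay for the task.
    priority: crowd-assigned priority in [0, 1]; it scales the budget
              between 0.5x and 1.5x of base (an assumed range).
    System-maintenance tasks get a flat bonus (20% per the example in
    the text) so unsexy work still gets taken up.
    """
    payout = base_rate * (0.5 + priority)
    if is_maintenance:
        payout *= 1 + unsexy_bonus
    return round(payout, 2)
```

Tuning `priority` (or the bonus) is then the lever for adjusting how fast low-glamour work clears, without touching the base rate itself.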

Users who share ideas may be incentivized to a degree by Twitter-like "Your Impact" updates: tracking views and 'remember/refines,' and providing updates on the progress of actions inspired by their ideas.

On the highly complex end, users are compensated "for effort produced" at each stage. I see basically a profit-sharing scheme where every member is rewarded in proportion to the success of their effort. A user submitting an idea that gains no traction and is quickly "forgotten" receives marginal compensation. A user submitting an idea that eventually becomes a system-wide change is rewarded highly. The foreman of a proposal that doesn't "win the bid" against a range of other proposals is compensated a low amount for the effort, proportional to the support it gained. Etc.

An additional possibility is to create 'unfairness' for users who produce toxicity and noise - as an extreme example, a user who receives multiple flags for submitting 'toxicity' could lose access to certain pay opportunities or tiers. In this way, being an 'unproductive voice' leads to limited social status (in terms of reach of submissions and remember/refines), and being a 'toxic' and dysfunctional voice can lead to 'unfair' conditions, in an attempt to replicate a strong community's ability to protect itself from within (separate from the ramifications or ethics of that ability to band together and ostracize).

N-tier testers as community of doves

Tiers of feedback-providing members might be formed by identifying users who produce high-quality feedback (low noise, low bias, high resultant progress) in the associated area of impact (again, e.g., category- or 'issue'-specific specialists). Focusing on these members could increase the productivity and quality of information in each step of the process, essentially clustering users who have 'high dedication' to a particular area and whose positive feedback might act as an incentive to drive past stalling and friction. Members of these tiers would be incentivized to contribute where they make the most progressive impact, and hopefully excluded or dis-incentivized from submitting lower-quality feedback to other efforts. For example, an alpha-tier tester of 'coding work' might receive a 120% incentive in that field, but only 70% for participating in areas of lesser expertise and productive input.

Toxic Topics

The idea system could have a purpose in identifying systemic "toxic issues" in the manner of "trending topics." Any topic that receives a significant spike in attention (submissions around that topic, views of those submissions, severity of those submissions, etc.) might generate an opportunity for broader open discussion, be jumped in priority, etc.
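The "significant spike in attention" trigger could be approximated with a trailing-average comparison on one attention signal (daily submission counts, here). The window size, multiplier, and minimum count are arbitrary choices for the sketch.

```python
def trending_spike(counts, window=7, factor=3.0, min_count=10):
    """Flag a topic whose latest daily count is a spike over its baseline.

    counts: daily submission counts for one topic, oldest first.
    Flags when today's volume is at least `factor` times the trailing
    `window`-day average, and above a minimum absolute count so tiny
    topics don't trip the alarm on noise.
    """
    if len(counts) < window + 1 or counts[-1] < min_count:
        return False
    baseline = sum(counts[-window - 1:-1]) / window
    return counts[-1] >= factor * max(baseline, 1.0)
```

The same shape would work on views or severity scores; flagged topics could then be routed into the open-discussion or priority-jump paths described above.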

Things to look into (please add)