Getting Started WIP

Watch the bootcamp video to understand this page and to get started with coding: watch

Introduction to Crowdsourcing Research

One of the goals of computer scientists and researchers is to discover innovative ideas and bring them to the real world. We aim to build systems that help harness the collective intelligence of thousands of people around the globe.

Over the last few months we have been developing research ideas focused on four core foundations:

  1. Micro+macrotask market: Could the same marketplace scale from 2 to N people not just labeling images, but also Photoshopping my vacation photos or mixing my new song?
  2. Input and output transducers: Tasks get vetted or improved by people on the platform immediately after getting submitted, and before workers are exposed to them. Results are likewise vetted and tweaked.
  3. External quality ratings: Metaphor of credit ratings: rather than just people rating each other, have an (external?) authority or algorithm responsible for credit ratings (A, B, C, etc.)
  4. Open governance: Leadership shared by requesters, workers, (researchers?). Policy changes can be worked out by this group.

Getting Started with the Research Foundations

  • Crowdsourcing Research (watch): after 49.55 minutes, Professor Bernstein gives an amazing overview of the current state of research in crowdsourcing. Slides: pdf
  • Foundations Hangout watch & Slides pdf
  • Synthesis of the ideas Hangout watch & slides
  • Previous Milestones

The Hybrid Approach: Research & Production Level Code Go Together

How to Make Your First Contribution

This is a simple tutorial for adding yourself to the list of contributors on our dev site. It should introduce you to AngularJS and to the pull request process on GitHub. It covers a subset of the tutorial listed here: You should go through that tutorial to familiarize yourself with git. A sample pull request has been created to show the PR that you will create at the end of this tutorial.

You will create your Profile Page and update it every week with your accomplishments.
Accomplishments: Add the names of the features you are working on for the current and past weeks. List the production releases you have accomplished. Add the names of the people you worked with, and give appropriate credit for others' ideas and contributions.

1. Get the repository

Before you can contribute any code, you will need to sign this CLA license.

First, make a fork of the GitHub repository, located here.

Then, clone the repository into your local working directory.

You can either create a new branch in your forked repository or work off the current branch.

2. Make your changes

Frontend code is under the /staticfiles folder.

  1. Go to:
  2. Create a new url for your own profile page underneath existing urls.
  3. Upload your profile image to: We recommend uploading an image that is 200x200 px.
  4. Create a new template at /staticfiles/templates/contributors/<yourname>.html
  5. You can design this page however you'd like.
  6. Now, add yourself to the list of contributors by going to:
  7. Copy the following code snippet and add it after existing contributors. Edit it to include your name and your url.
          <div class="col-sm-2 col-xs-4 thumb">
            <a class="contributor-thumb" href="/contributors/<your url>">
              <img class="thumbnail img-responsive" src="/static/images/contributors/<image_name>.png" />
              <span><Your Name></span>
            </a>
          </div>

3. Push your changes to your branch

Now you need to add your changes. Go back to your terminal and cd into the root crowdsource-platform folder.

  1. git status - This should show you the changes you've made and the files you've edited.
  2. git add staticfiles/templates/contributors/home.html
  3. git add staticfiles/templates/contributors/<your_name_file>.html
  4. git add staticfiles/js/crowdsource.routes.js
  5. git add staticfiles/images/contributors/<your_name_image>.jpg
  6. git commit -m "Adding myself to contributor's page"
  7. git push origin [branch_name]

4. Set up the server locally

Please go through the README in order to set up your local test environment for your specific OS.

5. Verify your changes

Now that the local server is set up and running, you can view the changes you have made.

  1. Go to http://localhost:8000 to view the Crowdsource Platform
  2. At the bottom of the page, click on "Contributors"
  3. If you see yourself listed, click on your image. If you are not listed, check the /staticfiles/templates/contributors/home.html file and ensure you added the code properly.
  4. If it directs you to the HTML template you created in step 2, you've successfully added yourself to the list of contributors!

6. Submit your Pull Request

We use git pull requests to merge code into the main codebase. Your code is reviewed by the DRIs (don't worry, we're nice :)) and then approved, after which you can merge it.

  1. Go to
  2. Replace "your_github_account" with your GitHub account name and "branch_name" with your branch name.
  3. Click Create pull request.
  4. You should see a list of your changes.

Now notify your DRIs on the infra slack channel by pasting a link to your Pull Request.

How to Keep Contributing

How to Find Work To Do

The Open issues listed on the GitHub repo are features, foundations, and bugs that need to be created or fixed.

Timeline for each week

  1. We follow a weekly milestone schedule.
  2. Each release should take 5 days: Saturday, Sunday, Monday, Tuesday, Wednesday.
  3. The Pull Request should be raised on Wednesday.
  4. DRIs should finish the merge by Thursday/Friday.
  5. Saturday is the DEMO Day.

The release Cycle

  1. Development: Saturday, Sunday, Monday, Tuesday, Wednesday
  2. Staging (ready for production): Thursday
  3. Released to production: Friday
  4. Demo: Saturday
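
The weekly cycle above can be sketched as a small lookup keyed by weekday. This is only an illustration: `PHASES` and `phases_for` are hypothetical names, not part of the platform, and the sketch assumes that Saturday carries both the demo of the previous release and the first development day of the next one.

```python
from datetime import date

# Phases keyed by date.weekday(), where Monday == 0 and Sunday == 6.
# Hypothetical table built from the release cycle listed above.
PHASES = {
    5: ["Demo of the previous release", "Development"],  # Saturday
    6: ["Development"],                                  # Sunday
    0: ["Development"],                                  # Monday
    1: ["Development"],                                  # Tuesday
    2: ["Development (pull request due)"],               # Wednesday
    3: ["Staging"],                                      # Thursday
    4: ["Released to production"],                       # Friday
}


def phases_for(day):
    """Return the release-cycle phases that apply on the given date."""
    return PHASES[day.weekday()]


if __name__ == "__main__":
    print(phases_for(date.today()))
```

Keeping the schedule in one table makes it easy to adjust if the milestone days change.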

Working on the existing issues, FOCUS: Core Foundations

  1. Take a look at the Open Issues.
  2. Helpful search tags: Unassigned Open & Critical, Need Backup
  3. Choose the issue you would like to work on.
  4. If you want to raise a request for a new issue or feature, see the section below.

Fig 2. Release Cycle & Tags

What are the categories?


Creating New issue/feature Requests, FOCUS: Core Foundations

  1. If you want to create a new issue, task, or feature request, add it to the GitHub Issues (Fig 2).
  2. The labels in Fig 2 highlight the tags that need to be associated with the issue.
  3. Description: Write a clear description explaining the new request. Please explain how it contributes to enhancing the core research foundations. Then add the tags below.
  4. Add the tags Feature Request and Please Prioritize.
  5. Add one of the tags from 1 to 9 describing the category of your request.
  6. Assign the issue to yourself, and add the following DRI handles to the description so that a notification is sent immediately:

#  Category Name          DRI handles to add in the description
1  DESIGN                 @neilthemathguy
2  FRONT END ENGINEERING  @nistala, @neilthemathguy, @dmorina
3  SYSTEMS                @dmorina, @elsabakiu, @neilthemathguy, @ksetyadi
4  DATA                   @dmorina, @elsabakiu, @neilthemathguy, @ksetyadi
5  DEPLOYMENT             @ksetyadi, @dmorina, @neilthemathguy
6  SECURITY               @ksetyadi, @dmorina, @elsabakiu, @neilthemathguy
7  ANALYTICS              @neilthemathguy
8  TESTING                @neilthemathguy, @dmorina, @ksetyadi
9  OTHERS                 @neilthemathguy
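
As a rough illustration, the category-to-DRI mapping above can be kept as a lookup so that the mention line pasted into an issue description is built consistently. The names `DRI_HANDLES` and `dri_mention_line` are hypothetical helpers, not part of the platform; the handles are copied from the list above.

```python
# Hypothetical lookup built from the category list above.
DRI_HANDLES = {
    "DESIGN": ["@neilthemathguy"],
    "FRONT END ENGINEERING": ["@nistala", "@neilthemathguy", "@dmorina"],
    "SYSTEMS": ["@dmorina", "@elsabakiu", "@neilthemathguy", "@ksetyadi"],
    "DATA": ["@dmorina", "@elsabakiu", "@neilthemathguy", "@ksetyadi"],
    "DEPLOYMENT": ["@ksetyadi", "@dmorina", "@neilthemathguy"],
    "SECURITY": ["@ksetyadi", "@dmorina", "@elsabakiu", "@neilthemathguy"],
    "ANALYTICS": ["@neilthemathguy"],
    "TESTING": ["@neilthemathguy", "@dmorina", "@ksetyadi"],
    "OTHERS": ["@neilthemathguy"],
}


def dri_mention_line(category):
    """Return the handles to paste into an issue description for a category."""
    # Normalize the name so "design" and " Design " both match "DESIGN".
    handles = DRI_HANDLES[category.strip().upper()]
    return " ".join(handles)


if __name__ == "__main__":
    print(dri_mention_line("Front End Engineering"))
```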

Creating New Feature Requests, FOCUS: Generic other than the foundations

Please follow the same process as above, and add the additional tags Nice to Have and Others.

Coding Guidelines

Collaboration Guidelines

  • Being Respectful & Sensitive: We are a growing community of young researchers with different skills and backgrounds. Please respect your colleagues and community members. Let's bring everyone along and make sure no one is left behind.
  • Constructive Feedback: We agree to disagree in a respectful manner. As suggested by the TAs, negative comments are discouraged; if you dislike some aspect of a submission, make a suggestion for improvement.
  • Mentor, Help, & Make New Friends: Share your experience with the community, and pass your wisdom and knowledge on to others.
  • Strictly Prohibited: Collaboration is highly encouraged. However, please do not share your GitHub account with someone else or ask them to check in or develop code under your name. Please list the names of collaborators in the release and give appropriate credit to the person who came up with the original idea. If you have any questions about this, don't hesitate to talk to the DRIs or TAs. We believe every computer scientist should be comfortable using the tools. If you need any help setting up your environment, please reach out to the DRIs or the community.
  • Responsibility: This platform is the result of collaborative efforts, and our goal is to make the world a better place. You can own the part of the system you are working on and take responsibility for productionizing and maintaining it.

How to Submit the Work You've Completed

  1. Finish the development and testing
  2. Create the branch with the FEATURE NAME and tag ACTIVE TAG
  3. Add the GitHub issue number to the request; it helps to cross-reference the release
  4. Create the PULL REQUEST
  5. Update the description of the issue

I Need Help

  1. Ping the DRIs immediately.
  2. Raise the Help Needed tag on the Git issue you are working on.
  3. If you need more resources or backup, raise the Need Backup Resource tag on the Git issue you are working on.

If you have questions

  1. Check existing FAQs.
  2. If you don't find an answer: add your question to the FAQs list, ping the #infra channel on Slack, and escalate the issue to the DRIs.

System Architecture

This section is an overview of the System Architecture. It is provided in order to better assist you in understanding the system and where you would like to contribute.

Fig 1. System Architecture
  1. Fig 1 shows the overview of the architecture. For more Information see
  2. Data Models
  3. Front End
  4. Git Strategy
  5. Vision
  6. Active Branches: Develop, Staging, Production

Current folder structure:

  1. Backend: crowdsourcing (serializers, validators, viewsets, models, views, tests)
  2. Front end: staticfiles (css; js with Angular services, controllers, routes, and configurations; html templates)
  3. Admin: csp

System Architecture Workflow


  • Nginx is used as a reverse proxy and serves the static files.
  • Gunicorn will handle the WSGI applications, in our case the Django apps.
  • REST API: Django apps are a good way to modularize the project. After completing the main web application, we will work on a REST API with OAuth2 authentication. This app will be used by mobile and desktop clients. Other applications can be derived as the project progresses.
  • Websockets: We will need websockets for live communication between the client apps and between the users themselves. We will start with Tornado if it plays well with Django.
  • Gunicorn can run multiple web workers, and we will use Redis to handle the sessions for websockets and so on.
  • In this architecture it is easy to implement new features. One way is to group them into a module and integrate its urls in the urls.conf file. This way you can implement any feature and plug it into the existing application.
  • Another way is to extend the current code, which can be done in three simple steps:
    • Create your html templates
    • Add the class based views in the or another file(s)
    • Import the views in the urls.conf file and define your URL mappings there. This will not affect the existing features in any way.


  • The client makes a request from the web app using AngularJS ngResource, or from a native app built with PhoneGap.
  • The request is a REST API call to the Heroku-hosted Django server.
  • A request whose path is prefixed with /api/<call> is routed via Gunicorn to the Django API server running REST framework.
  • Multiple instances of the API server will be provisioned on different nodes to scale with traffic. Each request is passed round-robin from server to server until a free server is found and accepts it.
  • Django talks with the database coordinator, which itself talks only to the master database.
  • The master database either reads from the slaves or writes to the master and syncs; this will be the job of the PG coordinator. In the future the data center can be scaled using pgpool-II, middleware that works between PostgreSQL servers and a PostgreSQL database client. Its watchdog can be used to ensure high availability.
  • Data is sent back up the chain via an HTTP response on the REST API and the client is updated. No page refresh is required anywhere, which allows for a smooth native mobile interface as well. This is provided natively by Heroku, but the same setup can be used on AWS, GCE, Rackspace, or any other cloud provider to allow for maximum scaling of the application.
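
The round-robin step in the workflow above can be sketched in plain Python. This is a minimal illustration of the idea, not the platform's actual load balancing, which the text leaves to Gunicorn and the hosting provider. `RoundRobinBalancer`, the server names, and the busy/free bookkeeping are all assumptions made for the example.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand each request to the next free server, visiting servers in turn."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._ring = cycle(self._servers)  # endless iterator over the servers
        self._busy = set()

    def mark_busy(self, server):
        self._busy.add(server)

    def mark_free(self, server):
        self._busy.discard(server)

    def pick(self):
        """Return the next free server, or None if every server is busy."""
        # Try each server at most once per call so we never loop forever.
        for _ in range(len(self._servers)):
            server = next(self._ring)
            if server not in self._busy:
                return server
        return None


if __name__ == "__main__":
    # Hypothetical node names, used only for the demonstration.
    balancer = RoundRobinBalancer(["api-1", "api-2", "api-3"])
    balancer.mark_busy("api-2")
    print([balancer.pick() for _ in range(4)])
```

With `api-2` marked busy, successive picks alternate between `api-1` and `api-3`; once it is marked free again it rejoins the rotation.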

Current Data Model

Fig 3. Data Model Architecture