Watch the bootcamp video to understand this page and to get started with coding: watch
- 1 Crowdsourcing Research
- 2 Research & Engineering How to set up the environment?
- 3 What to work on?
- 4 What are the categories?
- 5 What is the Release Cycle
- 6 Coding Guidelines
- 7 Collaboration Guidelines
- 8 Example: Hello World Tutorial
- 9 Current Data Model
- 10 Vision
One of the goals of computer scientists & researchers is to discover innovative ideas and bring them to the real world. We aim to build systems that will help harness the collective intelligence of thousands of people around the globe.
For the last few months we have been developing research ideas focused on four core foundations:
- Micro+macrotask market: Could the same marketplace scale from 2 to N people not just labeling images, but also Photoshopping my vacation photos or mixing my new song?
- Input and output transducers: Tasks get vetted or improved by people on the platform immediately after getting submitted, and before workers are exposed to them. Results are likewise vetted and tweaked.
- External quality ratings: Metaphor of credit ratings: rather than just people rating each other, have an (external?) authority or algorithm responsible for credit ratings (A, B, C, etc.)
- Open governance: Leadership shared by requesters, workers, (researchers?). Policy changes can be worked out by this group
Getting Started with the Research Foundations
- Crowdsourcing Research watch: after 49.55 minutes, Professor Bernstein gives an overview of the current state of research in crowdsourcing. Slides: pdf
- Foundations Hangout watch & Slides pdf
- Synthesis of the ideas Hangout watch & slides
- Previous Milestones
The Hybrid Approach: Research & Production Level Code go together
Research & Engineering How to set up the environment?
- Please go through README
Current folder structure:
- Backend: crowdsourcing Serializers, Validators, Viewsets, Models, Views, tests
- Front End: staticfiles css, js (Angular services, controllers), Angular routes, configurations, templates html
- Admin csp
- Fig 1 shows an overview of the architecture. For more information see:
- Data Models
- Front End
- Git Strategy
- Active Branches:
Develop, Staging, Production
If you have questions
- Check existing FAQs.
- If you don't get an answer: add your question to the FAQs list, ping the #infra channel on Slack, and escalate the issue to the DRIs
What to work on?
Working on the existing issues, FOCUS: Core Foundations
- Take a look at the Open Issues
- Helpful search tags:
Unassigned Open & Critical, Need Backup
- Choose the issue you would like to work on
- If you want to request a new issue or feature, see the section below
Creating New issue/feature Requests, FOCUS: Core Foundations
- If you want to create a new issue, task, or feature request, add it to Github Issues (Fig 2)
- The labels in Fig 2 highlight the tags that need to be associated with the issue.
- Description: Write a clear description explaining the new request. Please explain how it contributes to enhancing the core research foundations.
- Add tags
Feature Request, Please Prioritize
- Add one of the tags from 1 to 9 describing the category of your request
- Assign the issue to yourself and add the following DRI handles to the description so that a notification is sent immediately:
| # | Category Name | Add DRI handles in the Description |
| 2 | FRONT END ENGINEERING | @nistala, @neilthemathguy, @dmorina |
| 3 | SYSTEMS | @dmorina, @elsabakiu, @neilthemathguy, @ksetyadi |
| 4 | DATA | @dmorina, @elsabakiu, @neilthemathguy, @ksetyadi |
| 5 | DEPLOYMENT | @ksetyadi, @dmorina, @neilthemathguy |
| 6 | SECURITY | @ksetyadi, @dmorina, @elsabakiu, @neilthemathguy |
| 8 | TESTING | @swapagarwal, @dmorina, @ksetyadi, @neilthemathguy |
Creating New feature Requests, FOCUS: Generic other than the foundations
Once the issue is assigned to you, please acknowledge.
What are the categories?
What is the Release Cycle
The release Cycle
Saturday, Sunday, Monday, Tuesday, Wednesday
- Staging Ready for Production
- Released to Production
How to submit the work
- Finish the development and testing
- Create the branch with the FEATURE NAME and tag
- Add the Git issue number to the request; it helps to cross-reference the release
- Create the
- Update the description of the issue
Timeline for each week
- We have a weekly milestone schedule
- Each release should be 5 days:
Saturday, Sunday, Monday, Tuesday, Wednesday
- Pull Request should be raised on
- DRIs should finish the Merge by
- Saturday is the
I Need Help
- Immediately ping the DRIs and add the Help Needed tag to the Git issue you are working on
- In case you need more resources/backup, raise the Need Backup Resource tag on the Git issue you are working on
Coding Guidelines
- Please follow the standard Python guidelines: Style Guide
- JavaScript Style Guide
- AngularJS Guidelines
- If you are referencing or building upon someone else's open source code, please check the license and provide appropriate credits
Collaboration Guidelines
- Being Respectful & Sensitive: We are a growing community of young researchers with different skills and backgrounds. Please respect your colleagues and community members. Let's bring everyone along and make sure no one is left behind.
- Constructive Feedback: We agree to disagree in a respectful manner. As suggested by the TAs, negative comments are discouraged; if you disliked some aspect of a submission, make a suggestion for improvement.
- Mentor, Help, & Make New Friends: Share your experience with the community, and pass your wisdom and knowledge on to others.
- Strictly Prohibited: Collaboration is highly encouraged; however, please do not share your GitHub account with someone else or ask them to check in or develop code under your name. Please list the names of collaborators in the release and give appropriate credit to the person who came up with the original idea. If you have any questions about this, don't hesitate to talk to the DRIs or TAs. We believe every computer scientist should be comfortable using the tools. If you need any help setting up the environment, please reach out to the DRIs or the community.
- Responsibility: This platform is the result of collaborative efforts, and our goal is to make the world a better place. You can own the part of the system you are working on and take responsibility for productionizing and maintaining it.
Example: Hello World Tutorial
Create your Profile Page and update it every week with your accomplishments
Accomplishments: Add the names of the features you are working on for the current and past weeks. List the production releases you have accomplished. Add the names of the people you worked with, and give appropriate credit for others' ideas and contributions.
Your first contribution
First go through the setup. To get started, we have introduced a simple tutorial in which you add yourself to the list of contributors on our dev site. It should introduce you to angular.js and to the pull request process on GitHub. This tutorial goes through a subset of the tutorial listed here: https://guides.github.com/activities/hello-world/. You should go through that tutorial to familiarize yourself with git. A sample pull request has been created to show the intended PR that you will create at the end of this tutorial - https://github.com/crowdresearch/crowdsource-platform/pull/40
Creating a new branch
Create a new branch after checking out the main repo, using the following commands.
cd into the root folder of your crowdresearch repo.
git fetch -a
git checkout develop2
git branch [branch-name]
git checkout [branch-name]
Now you're all set to start writing some code!
Frontend code is under the /staticfiles folder.
- Go to: https://github.com/crowdresearch/crowdsource-platform/blob/develop2/staticfiles/js/crowdsource.routes.js#L71
- Create a new url for your own profile page underneath existing urls.
- Upload your profile image to: https://github.com/crowdresearch/crowdsource-platform/tree/develop2/staticfiles/images/contributors. We recommend uploading an image that is 200x200 px.
- Create a new template at /staticfiles/templates/contributors/<yourname>.html
- You can design this page however you'd like.
- Now, add yourself to the list of contributors by going to: https://github.com/crowdresearch/crowdsource-platform/blob/develop2/staticfiles/templates/contributors/home.html
- Copy the following code snippet and add it after existing contributors. Edit it to include your name and your url.
<div class="col-sm-2 col-xs-4 thumb"> <a class="contributor-thumb" href="/contributors/<your url>"> <img class="thumbnail img-responsive" src="/static/images/contributors/<image_name>.png" /> <span><Your Name></span> </a> </div>
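If you would rather generate the snippet than edit it by hand, the following is a minimal sketch using only the Python standard library. The helper name `contributor_snippet` and its parameters are hypothetical and not part of the platform; the markup is copied from the snippet above, including its `.png` extension, so adjust that if your image uses another format.

```python
from html import escape
from string import Template

# Same markup as the snippet above, with placeholders for the three
# values you need to edit. "$image.png" substitutes $image, then ".png".
SNIPPET = Template(
    '<div class="col-sm-2 col-xs-4 thumb">'
    ' <a class="contributor-thumb" href="/contributors/$url">'
    ' <img class="thumbnail img-responsive"'
    ' src="/static/images/contributors/$image.png" />'
    ' <span>$name</span> </a> </div>'
)


def contributor_snippet(name, url, image_name):
    """Return the contributor <div> with the placeholders filled in.

    Hypothetical helper for illustration; values are HTML-escaped so a
    name such as "A & B" does not break the template.
    """
    return SNIPPET.substitute(
        name=escape(name),
        url=escape(url, quote=True),
        image=escape(image_name, quote=True),
    )


if __name__ == "__main__":
    # Example values only; replace them with your own.
    print(contributor_snippet("Ada Lovelace", "ada", "ada"))
```

Paste the printed line after the existing contributors in home.html.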
Pushing changes to your branch
Now you need to add your changes. Go back to your terminal and
cd into the root crowdsource-platform folder.
git status - This should show you the changes you've made and the files you've edited.
git add staticfiles/templates/contributors/home.html
git add staticfiles/templates/contributors/<your_name_file>.html
git add staticfiles/js/crowdsource.routes.js
git add staticfiles/images/contributors/<your_name_image>.jpg
git commit -m "Adding myself to contributor's page"
git push origin [branch_name]
Creating a Pull Request
We use git pull requests to merge code into the main codebase. Your code is reviewed by DRIs (don't worry we're nice :)) and then approved, after which you can merge it.
- Go to https://github.com/crowdresearch/crowdsource-platform/compare
- Change compare to your [branch-name] from above.
- Click Create pull request.
- You should see a list of your changes.
Now notify your DRIs on the infra slack channel by pasting a link to your Pull Request.
System Architecture Workflow
- Nginx is used as a reverse proxy and serves the static files.
- Gunicorn will handle the WSGI applications, in our case the Django apps.
- REST API: Django apps are a good way to modularize the project. After completing the main web application we will work on a REST API with OAuth2 authentication. This app will be used by mobile and desktop clients. Other applications can be derived as the project progresses.
- Websockets: We will need websockets for live communication between the client apps and the users themselves. We will start with Tornado if it plays well with Django.
- Gunicorn can run multiple web workers, and we will use Redis to handle the sessions for websockets and so on.
- In this architecture it is easy to implement new features. One way is to group them into a module and integrate its urls in the urls.conf file; this way you can implement any feature and plug it into the existing application.
- Another way is to extend the current code, which can be done in three simple steps:
- Create your html templates
- Add the class based views in the views.py or another file(s)
- Import the views in the urls.conf file and define your url mappings there. This will not affect the existing features in any way.
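The plug-in idea above can be illustrated without Django. The sketch below is a standard-library stand-in, not the platform's code: `URLPATTERNS`, `PROFILE_URLS`, `include`, `resolve`, and both view classes are hypothetical names that only mimic how a urls.conf file maps url patterns to class based views.

```python
import re


class HelloView:
    """A hypothetical existing class based view."""

    def get(self, name):
        return f"Hello, {name}"


class ProfileView:
    """A hypothetical new feature, shipped as its own module."""

    def get(self, username):
        return f"Profile of {username}"


# Existing url mappings: (compiled pattern, view class).
URLPATTERNS = [
    (re.compile(r"^hello/(?P<name>\w+)/$"), HelloView),
]

# Mappings that belong to the new feature module.
PROFILE_URLS = [
    (re.compile(r"^profile/(?P<username>\w+)/$"), ProfileView),
]


def include(patterns, module_patterns):
    """Plug a module's mappings in after the existing ones."""
    patterns.extend(module_patterns)


def resolve(path, patterns):
    """Return the response of the first view whose pattern matches path."""
    for pattern, view_class in patterns:
        match = pattern.match(path)
        if match:
            # Named groups in the pattern become keyword arguments.
            return view_class().get(**match.groupdict())
    return None  # no mapping; a real server would answer 404


# Adding the new module does not touch the existing mapping.
include(URLPATTERNS, PROFILE_URLS)
```

After `include` runs, `resolve("hello/world/", URLPATTERNS)` still reaches the original view, while `resolve("profile/ada/", URLPATTERNS)` reaches the newly plugged-in one.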
- The client makes a request via the web using AngularJS ngResource, or via a native app made using PhoneGap.
- The request makes a REST API call to the Heroku-hosted Django server.
- A request prefixed with /api/<call> gets routed via Gunicorn to the Django API server running REST framework.
- Multiple instances of the API server will be provisioned on different nodes to scale for traffic; each request is passed round-robin until a free server is found and accepts the request.
- Django talks with the database coordinator, which itself talks only to the master database.
- The master database either reads from slaves or writes to master and syncs; this will be the job of the PG coordinator. In the future the data center can be scaled using pgpool-II, middleware that works between PostgreSQL servers and a PostgreSQL database client. Its watchdog can be used to ensure high availability.
- Data is sent back up the chain via an HTTP response on the REST API and the client is updated. No page refresh is required anywhere, which allows for a smooth native mobile interface as well. This is provided natively by Heroku, but this setup can be used on AWS, GCE, Rackspace, or any other cloud provider to allow for maximum scaling of the application.
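The routing decisions in the flow above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the deployed configuration: `ApiServer`, `RoundRobin`, `is_api_request`, and `pick_database` are hypothetical names, "replica" stands for what the text calls a slave, and treating only `SELECT` statements as reads is a deliberate simplification.

```python
from itertools import cycle


class ApiServer:
    """Hypothetical stand-in for one API server instance."""

    def __init__(self, name, busy=False):
        self.name = name
        self.busy = busy


def is_api_request(path):
    """Requests whose path starts with /api/ go to the API servers."""
    return path.startswith("/api/")


class RoundRobin:
    """Offer each request to the servers in turn, skipping busy ones."""

    def __init__(self, servers):
        self._count = len(servers)
        self._ring = cycle(servers)

    def pick(self):
        # Try each server at most once per request.
        for _ in range(self._count):
            server = next(self._ring)
            if not server.busy:
                return server
        return None  # every server is busy


def pick_database(sql, replicas, master="master"):
    """Send writes to the master and reads to a replica, if any exist.

    Simplification: always uses the first replica and only recognizes
    statements that start with SELECT as reads.
    """
    is_read = sql.lstrip().upper().startswith("SELECT")
    if is_read and replicas:
        return replicas[0]
    return master
```

With servers `a`, `b` (busy), and `c`, successive calls to `pick()` alternate between `a` and `c`, because the busy server is skipped each time the ring reaches it.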
Current Data Model