Milestone 7 Architecture



Introduction

We propose a fault-tolerant, extensible, and modular system that scales to the level of intended usage. Our main goal is to design and create the core infrastructure to support basic interactions between workers and requesters. More specifically, we aim for a design that can automatically adapt to the power and trust structures that a given collective of requesters and/or workers defines.


Fig 1. Goal

CrowdResearch Infra Work Dynamics

The CrowdResearch teams will collaborate to collectively produce the Core Architecture. We have identified a list of tasks needed to carry out the architecture. Teams will sign up for the tasks they want to execute, and a collective of teams will produce each task. We will then connect the outputs of these tasks and have a finished architecture!

WorkflowCoreArchitecture.jpg


Actions Required

  • Sign up with your team for the tasks you want to help execute (sign up for tasks you are interested in, have experience with, want to learn from, etc.).
  • Each team needs to sign up for 1-3 tasks.
  • The teams working under a particular task need to communicate with each other and agree on a work plan for executing the task (we recommend that, for each task, one team leads the other teams).
  • For each task, the teams at the end of the week will need to collectively provide:
    • A basic design of what you will implement:
      • The expected input and output of what you will implement.
      • An explanation of how other components will communicate with your part. We recommend creating a list of the other team collectives (teams working on a certain task) that your part needs to communicate with, and talking with those collectives about how your part will interface with theirs.
    • A setup ready to execute your task.
    • The start of a preliminary implementation.

System Architecture

Components

  • Nginx is used as a reverse proxy and to serve the static files.
  • Gunicorn will handle the WSGI applications, in our case the Django apps.
  • REST API: The Django app structure is a great way to modularize. After completing the main web application we will work on a REST API with OAuth2 authentication. This API will be used by mobile and desktop clients; other applications can be derived as the project progresses.
  • Websockets: We will need websockets for live communication between the client apps and between the users themselves. We will start with Tornado if it plays well with Django.
  • Gunicorn can run multiple web workers, and we will use Redis to handle the sessions for websockets and so on.
  • In this architecture it is very easy to implement new features. One way is to group them into a module and integrate its URLs in the URLconf (urls.py); this way you can implement any feature and just plug it into the existing application.
  • Another way is to extend the current code, which can be done in three simple steps (a minimal sketch follows this list):
    • Create your HTML templates.
    • Add the class-based views in views.py or in other files.
    • Import the views in the URLconf and define your URL mappings there; this will not in any way affect the existing features.
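
A minimal sketch of those three steps, assuming a hypothetical "about" page added to an existing app named core (the app, template, and URL names are placeholders, not part of the actual design):

  # core/views.py -- step 2: add a class-based view
  from django.views.generic import TemplateView

  class AboutView(TemplateView):
      # step 1: the matching HTML template, created under templates/core/
      template_name = 'core/about.html'

  # urls.py (URLconf) -- step 3: import the view and define the URL mapping;
  # the existing entries are left untouched
  from django.conf.urls import url
  from core.views import AboutView

  urlpatterns = [
      # ... existing url() entries stay as they are ...
      url(r'^about/$', AboutView.as_view(), name='about'),
  ]
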
Fig 2. System Architecture


Workflow

  • Client makes a request via the web using AngularJS ngResource or from a native app built with PhoneGap.
  • The request makes a REST API call to the Heroku-hosted Django server.
  • Requests prefixed with /api/<call> get routed via Gunicorn to the Django API server running the REST framework (a minimal routing sketch follows this list).
  • Multiple instances of the API server will be provisioned on different nodes to scale for traffic; each request is round-robined until a free server is found and accepts the request.
  • Django talks to the database coordinator, which itself talks only to the master database.
  • Reads are served from the slaves while writes go to the master and are synced; this will be the job of the PG coordinator. In the future the data center can be scaled using pgpool-II, a middleware that sits between the PostgreSQL servers and a PostgreSQL database client. Watchdog can be used to ensure its high-availability feature.
  • Data is sent back up the chain via an HTTP response on the REST API and the client view is reloaded. No page refresh is required anywhere, which allows for a smooth native mobile interface as well. This is provided natively by Heroku, but the same setup can be used on AWS, GCE, Rackspace, or any other cloud provider to allow for maximum scaling of the application.
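
As a rough illustration of the /api/<call> path described above, the sketch below registers a hypothetical Task resource with Django REST framework and mounts it under /api/ in the URLconf. The Task model and its fields are placeholders, not part of the real data model:

  # api/views.py -- hypothetical REST framework serializer and viewset
  from rest_framework import routers, serializers, viewsets
  from api.models import Task   # placeholder model, for illustration only

  class TaskSerializer(serializers.ModelSerializer):
      class Meta:
          model = Task
          fields = ('id', 'title', 'status')

  class TaskViewSet(viewsets.ModelViewSet):
      queryset = Task.objects.all()
      serializer_class = TaskSerializer

  # urls.py -- everything prefixed with /api/ is routed to the REST framework
  from django.conf.urls import include, url

  router = routers.DefaultRouter()
  router.register(r'tasks', TaskViewSet)

  urlpatterns = [
      url(r'^api/', include(router.urls)),
  ]

Gunicorn would then serve this WSGI application with several workers (for example, gunicorn project.wsgi --workers 4) behind Nginx, matching the multi-worker setup described in the Components section.
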


Data Model Example

Fig 3. Data Model Example


  • User - This is the Django User model; it provides functions like create_user and is used for login/logout.
  • UserProfile - Inherits from the User model and extends it with more information about the user.
  • Worker - Inherits from UserProfile; this will be the worker profile.
  • Requester - Inherits from the UserProfile model as well; it contains the requester profile.
  • Skill - This model is used to maintain the skills; it allows subskills as well.
  • WorkerSkill - Maps the skills to workers; it allows skill verification.
  • Tests - These will be the tests for skills.
  • WorkerTest - Maps the tests to workers, along with the results and so on.
  • Project - This is the project itself, which is led by one Requester.
  • ProjectRequester - Defines the requester and the collaborators of a project.
  • ProjectModule - A module is a set of HITs; this defines the project modules, with a description, template, status and so on. A project can have many modules.
  • Template - This will be the template model, which contains the templates defined by the user (the HTML source).
  • Qualification - Qualifications are module-based and there can be as many qualifications as needed.
  • QualificationItem - Defines the actual qualification parts, using the workerAttribute, operator, value1 and value2 fields. This allows the requester to create any kind of qualification, e.g. username='john', age>20, country='USA', approvalRate>'80%' and so on. Complex qualification clauses can be combined with binary operators. There will be an option to specify the type of the Qualification: it can be STRICT or ALLOW_APPLICATIONS. STRICT makes the HITs appear only to workers who pass the qualification, while ALLOW_APPLICATIONS lets, for example, new users apply for the modules even though they don't fulfill the requirements of the qualifications.
  • WorkerApplications - Maintains the worker applications for modules/projects; the requester can review these applications and approve or deny them. Approving an application adds a new qualification to the module itself (username='approved username', ORed with everything else).
  • HIT - This will be the HIT model; every HIT belongs to a project module.
  • HITWorker - Contains the workers who are working on the HITs.
  • HITResult - Maintains the results submitted by the workers; it works together with the template specified for the module. There will be result requirements, which are much like qualifications but applied to the results instead of the workers, e.g. SELECTLIST1 IS NUMERIC AND SELECTLIST1 BETWEEN 1 AND 10, TEXTFIELD1 is $ANIMAL_NAME and so on. (A partial models sketch follows this list.)
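
A partial Django models.py sketch of the data model above; the field names shown are illustrative only, and the actual fields will be fixed during the Data Model Creation task:

  # models.py -- partial sketch of the data model described above
  from django.contrib.auth.models import User
  from django.db import models

  class UserProfile(User):
      # extends the Django User model via multi-table inheritance
      bio = models.TextField(blank=True)        # illustrative extra field

  class Worker(UserProfile):
      pass                                      # the worker profile

  class Requester(UserProfile):
      pass                                      # the requester profile

  class Skill(models.Model):
      name = models.CharField(max_length=128)
      parent = models.ForeignKey('self', null=True, blank=True,
                                 on_delete=models.SET_NULL)   # allows subskills

  class WorkerSkill(models.Model):
      worker = models.ForeignKey(Worker, on_delete=models.CASCADE)
      skill = models.ForeignKey(Skill, on_delete=models.CASCADE)
      verified = models.BooleanField(default=False)           # skill verification

  class Project(models.Model):
      name = models.CharField(max_length=128)
      owner = models.ForeignKey(Requester, on_delete=models.CASCADE)   # led by one Requester

  class ProjectModule(models.Model):
      project = models.ForeignKey(Project, on_delete=models.CASCADE)   # a project has many modules
      description = models.TextField()
      status = models.CharField(max_length=32)

  class Qualification(models.Model):
      TYPES = (('STRICT', 'Strict'), ('ALLOW_APPLICATIONS', 'Allow applications'))
      module = models.ForeignKey(ProjectModule, on_delete=models.CASCADE)
      type = models.CharField(max_length=32, choices=TYPES)

  class QualificationItem(models.Model):
      qualification = models.ForeignKey(Qualification, on_delete=models.CASCADE)
      worker_attribute = models.CharField(max_length=64)   # e.g. 'age', 'country'
      operator = models.CharField(max_length=16)           # e.g. '=', '>', 'BETWEEN'
      value1 = models.CharField(max_length=128)
      value2 = models.CharField(max_length=128, blank=True)
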

Task Division

Critical Tasks

  1. Data Model Creation (Django, PostgreSQL)
  2. Design Mockup Creation (CSS, Bootstrap)

Normal Tasks

Good to Have Functionalities

Core Architecture & System Functionalities

Milestones

Timeframe