Google Season of Docs Ideas 2021

About CloudCV

Welcome, and thank you for your interest in CloudCV/EvalAI!

CloudCV began in the summer of 2013 as a research project within the Machine Learning and Perception lab at Virginia Tech (now at Georgia Tech), with the ambitious goal of building platforms that make AI research more reproducible. We’re a young community working towards enabling developers, researchers, and fellow students to build, compare, and share state-of-the-art Artificial Intelligence algorithms. We have participated in Google Summer of Code every year since 2014 (2014 - 2021), over the course of which our students have built several excellent tools and features.

We are working on building an open-source platform, EvalAI, for evaluating and comparing machine learning (ML) and artificial intelligence (AI) algorithms at scale. EvalAI provides the AI research community with a scalable way to evaluate machine learning models against static ground-truth data or with the help of human evaluators. This helps researchers, students, and data scientists create, collaborate on, and participate in AI challenges organized around the globe. By simplifying and standardizing the process of benchmarking these models, EvalAI seeks to lower the barrier to entry for participating in the global scientific effort to push the frontiers of machine learning and artificial intelligence, thereby increasing the rate of measurable progress in this domain.

About EvalAI

EvalAI is an open-source platform that is helping to simplify and standardize the process of benchmarking AI models. It helps researchers, students, and data scientists to create, collaborate, and participate in AI challenges organized around the globe.

We believe that progress on several important problems in Computer Vision (CV) and Artificial Intelligence (AI) has been driven by the introduction of bold new tasks coupled with the curation of large, realistic datasets. Not only do these tasks and datasets establish new problems and provide the data necessary to analyze them, but more importantly they also establish reliable benchmarks where proposed solutions and hypotheses can be tested – an essential part of the scientific process. We help the AI community host these tasks and datasets on our platform without friction, so that progress in this domain can be measured. At EvalAI, we think that the infrastructural logistics of hosting a test dataset shouldn’t be an obstacle.

Current Status - EvalAI has been live for 4 years, during which we have hosted 100+ AI challenges with 10,000+ users who have created 100,000+ submissions. Several organizations from industry (Facebook, Google, IBM, eBay, among others) and academia (Stanford, CMU, MIT, Georgia Tech, among others) use the platform, or forked versions of it, to host their internal challenges instead of reinventing the wheel.

EvalAI's Documentation

How is EvalAI's documentation built?

EvalAI's documentation is written in Markdown and built with Sphinx, a tool that generates static HTML files, which are then hosted on Read the Docs. The latest documentation is available here.
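
For orientation, here is a minimal sketch of a Sphinx configuration that builds Markdown sources into HTML; the extension choice (recommonmark), theme, and project metadata are illustrative assumptions rather than EvalAI's exact setup.

```python
# docs/source/conf.py - minimal sketch of a Sphinx configuration that builds
# Markdown sources into static HTML for hosting on Read the Docs.
# The extension (recommonmark), theme, and metadata are assumptions made for
# illustration, not necessarily EvalAI's actual configuration.
project = "EvalAI"

# Let Sphinx parse Markdown (.md) files alongside reStructuredText (.rst)
extensions = ["recommonmark"]
source_suffix = {
    ".rst": "restructuredtext",
    ".md": "markdown",
}

# Theme commonly used for documentation hosted on Read the Docs
html_theme = "sphinx_rtd_theme"
```

Running sphinx-build over such a source tree produces the static HTML that Read the Docs then serves.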

Project Ideas

Idea 1: Run a full audit of the current documentation and add docs for challenge creation on EvalAI

The current documentation for EvalAI is outdated and inconsistent in multiple places due to the continuous development of the project since 2016. The main goal for this project is to go through the entire current documentation, check it for inconsistencies, and create a friction log for one of the most important use cases on EvalAI, i.e. challenge creation. Challenge creation involves an end-to-end pipeline: creating the challenge config, uploading it to EvalAI, and running the workers for evaluation (a packaging sketch follows the task list below). Some of the main tasks in this project will include -

  • Read the current documentation, follow the onboarding instructions for new challenge organizers, and check for inconsistencies
  • Create a document listing the missing documentation in the challenge creation process
  • Add the missing documentation for challenge creation using a zip file and using GitHub
  • Add the documentation for running the challenge workers locally and on EvalAI
  • Add an FAQ section for challenge hosts to address common errors
  • Add screenshots, GIFs, and video walkthroughs of common scenarios to help users set up challenges with minimal supervision
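
To make the zip-file flow above concrete, here is a minimal packaging sketch. The assumed file layout (a challenge_config.yaml plus an evaluation_script/ directory inside the challenge folder) mirrors common starter templates and is an assumption for illustration; the documentation written in this project would be the authoritative reference.

```python
# make_challenge_bundle.py - minimal sketch of packaging a challenge directory
# into a zip file for upload to EvalAI. The expected contents of the directory
# (challenge_config.yaml, evaluation_script/, ...) are assumptions here.
import zipfile
from pathlib import Path

def bundle_challenge(source_dir: str, out_zip: str = "challenge.zip") -> None:
    """Zip the challenge config and evaluation script for upload."""
    root = Path(source_dir)
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in root.rglob("*"):
            if path.is_file():
                # store paths relative to the challenge directory
                zf.write(path, path.relative_to(root).as_posix())

if __name__ == "__main__":
    # expects e.g. my_challenge/challenge_config.yaml and
    # my_challenge/evaluation_script/main.py to already exist
    bundle_challenge("my_challenge")
```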

Mentors - Deshraj, Rishabh

Contact Info:

Required Skills

  • A good working knowledge of English
  • Familiarity with Git, GitHub, and Markdown
  • Familiarity with the EvalAI platform
  • Familiarity with photo-editing, video-editing, and screen-recording tools

Idea 2: Add docs for code upload challenge hosting and evaluation on EvalAI

EvalAI can evaluate a machine learning model’s code inside containers, as opposed to simply evaluating a prediction file. This feature has seen rapid development in the past year or so, and we have seen an inflection point in the number of hosts who want to use it. To enable this large-scale adoption, we want to create extensive documentation for setting up code upload challenge evaluation on EvalAI (an illustrative agent sketch follows the task list below). Some of the main tasks in this project would be -

  • Update the EvalAI-CLI documentation with the newly added features and commands
  • Add documentation and architecture diagrams explaining how code upload challenges are evaluated
  • Add missing documentation for creating the environment Docker container and the agent Docker container
  • Add examples of sample environment and agent Docker containers for code upload challenges, based on challenges previously hosted on EvalAI
  • Work with one of the mentors to add docs for the various packages installed on the server and why they are installed
  • Add documentation for job scheduling on the server and where the logs for each job are stored
  • Add documentation for scaling the server resources
  • Add screenshots, GIFs, and video walkthroughs of common scenarios to help users set up challenges with minimal supervision
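
Purely as an illustration of what the agent-container documentation could anchor on, here is a trivial agent entry point; the interface (an act() method over observations) and the action names are assumptions, since every code upload challenge defines its own contract between the environment and agent containers.

```python
# agent.py - illustrative sketch of an entry point an agent Docker container
# might run in a code upload challenge. The Agent interface, observation
# format, and action names below are assumptions for illustration only.
import random

class RandomAgent:
    """A trivial agent that picks a random action for every observation."""

    def __init__(self, actions):
        self.actions = actions

    def act(self, observation):
        # a real agent would run model inference on the observation here
        return random.choice(self.actions)

def main():
    agent = RandomAgent(actions=["left", "right", "forward", "stop"])
    # in a real challenge, observations would come from the environment
    # container rather than this dummy loop
    for step in range(5):
        observation = {"step": step}
        print(step, agent.act(observation))

if __name__ == "__main__":
    main()
```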

Mentors - Ram, Rishabh

Contact Info:

Required Skills

  • A good working knowledge of English
  • Familiarity with Git, GitHub, and Markdown
  • Familiarity with the EvalAI platform
  • Familiarity with photo-editing, video-editing, and screen-recording tools

Idea 3: Add docs for remote challenge hosting and evaluation on EvalAI

A frequent use case for EvalAI is hosting a challenge on a private server to protect the test data. The aim of this project is to enrich EvalAI’s existing documentation so that hosting such challenges no longer requires an EvalAI admin in the loop (a minimal worker sketch follows the deliverables list below). Some of the main deliverables for this project would be -

  • Add documentation and architecture diagrams explaining how remote evaluation of challenges works
  • Add documentation for pulling submissions from EvalAI’s challenge queue
  • Add documentation for updating EvalAI when a submission’s status is set to “RUNNING” and “FINISHED”
  • Add an FAQ section for potential errors while setting up this pipeline
  • Add screenshots, GIFs, and video walkthroughs of common scenarios to help users set up remote challenges with minimal supervision
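
A minimal sketch of the remote-evaluation loop described above is shown below. The angle-bracketed endpoint paths, the auth token, and the payload fields are placeholders, not EvalAI's actual API; pinning down the real routes and response formats is exactly what this project's documentation would cover.

```python
# remote_worker.py - minimal sketch of a remote evaluation loop: pull queued
# submissions from the challenge queue, mark them "RUNNING", evaluate, then
# report "FINISHED" with scores. Endpoint paths, the token, and payload fields
# are placeholders/assumptions rather than EvalAI's real API.
import time
import requests

API = "https://eval.ai"                             # EvalAI server base URL
HEADERS = {"Authorization": "Bearer <host-token>"}  # placeholder auth token
CHALLENGE_ID = 1                                    # placeholder challenge id

def fetch_pending_submission():
    """Pull the next queued submission for this challenge, if any."""
    url = f"{API}/api/<pending-submissions-endpoint>/{CHALLENGE_ID}"
    resp = requests.get(url, headers=HEADERS)
    resp.raise_for_status()
    return resp.json() or None

def update_submission(submission_id, status, result=None):
    """Report a status change ("RUNNING" or "FINISHED") back to EvalAI."""
    payload = {"status": status}
    if result is not None:
        payload["result"] = result
    url = f"{API}/api/<update-submission-endpoint>/{submission_id}"
    requests.put(url, json=payload, headers=HEADERS).raise_for_status()

def main():
    while True:
        submission = fetch_pending_submission()
        if submission:
            update_submission(submission["id"], "RUNNING")
            scores = {"accuracy": 0.0}  # run the actual evaluation here
            update_submission(submission["id"], "FINISHED", result=scores)
        time.sleep(30)                  # poll the queue periodically

if __name__ == "__main__":
    main()
```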

Mentor - Rishabh

Contact Info:

Required Skills

  • A good working knowledge of English
  • Familiarity with Git, GitHub, and Markdown
  • Familiarity with the EvalAI platform
  • Familiarity with photo-editing, video-editing, and screen-recording tools

Measuring Project’s Success

Our documentation is deployed automatically as soon as a pull request is merged to the master branch, so the documentation will be published while the technical writers are working. To measure success, we will consider two metrics -

  • Reduction in the number of new challenges requiring EvalAI admins’ help for a period of 3 months after deploying the documentation. The current number of new challenges in a 3-month period is ~25.
  • Direct feedback from challenge organizers who used our documentation to create challenges, collected through a feedback form that rates the documentation on a scale of 1 to 5 for readability, clarity, context, accuracy, organization, succinctness, completeness, and findability.

We would consider the projects successful when -

  • The number of queries we receive through email drops by 25%, and the time spent on the docs website increases by 25% over the current baseline.
  • The number of challenges on EvalAI increases by 25%, and EvalAI admins are not in the loop for onboarding 50% of those challenges.
  • More than 70% of the feedback we receive from challenge hosts is positive.

Project Budget

| Budget Item | Amount | Running Total | Notes/Justification |
| --- | --- | --- | --- |
| Project 1 (Technical Writer): Run a full audit of the current documentation and add docs for challenge creation on EvalAI | $4,200 | $4,200 | 10 hr/week @ $30/hr for 14 weeks |
| Project 2 (Technical Writer): Add docs for code upload challenge hosting and evaluation on EvalAI | $4,200 | $8,400 | 10 hr/week @ $30/hr for 14 weeks |
| Project 3 (Technical Writer): Add docs for remote challenge hosting and evaluation on EvalAI | $4,200 | $12,600 | 10 hr/week @ $30/hr for 14 weeks |
| Mentor Stipend | $1,500 | $14,100 | 3 mentor stipends x $500 = $1,500 |
| Project Swag | $200 | $14,300 | Stickers for CloudCV and EvalAI |
| Total | $14,300 | | |

Additional information

Previous experience with technical writers or documentation

EvalAI has official documentation available here. Documentation of EvalAI-CLI is available here. All contributors are expected to write documentation along with their code contributions, and that documentation is reviewed as part of the pull-request review. Additionally, the core team consists of graduate students and faculty of the Machine Learning and Perception lab at Georgia Tech, who routinely publish scientific papers and technical reports at top-tier conferences. Here is the white paper explaining EvalAI and other CloudCV projects. We routinely review our writing through an internal review process in which members of the CVMLP lab review the document and leave feedback. In the past, contents of the peer-reviewed technical report (figures, explanations, video walkthroughs) developed by the core team have been contributed back to the documentation, improving its quality immensely. As part of Season of Docs, we will make sure that the technical writers, active contributors, and the core team work closely with each other to build high-quality documentation for the developers and users of EvalAI.


Previous participation in Season of Docs, Google Summer of Code or others

Continuous participation in the GSoC (2014 - 2021) and GCI (2017 - 2020) programs helps us stay connected with the community. As an organization, we have learned a lot in terms of skills such as communication and mentoring students. It has helped us improve our projects in terms of documentation and features, and has also helped us simplify the setup procedures for onboarding new students. We always take feedback from our GSoC and GCI students and incorporate it in order to make these programs even better for upcoming students.

CloudCV has an excellent track record with “graduated” GSoC students. For instance, two GSoC 2020 students have been actively maintaining their projects even after GSoC ended. Students who showcase good knowledge of the codebase and are eager to contribute are granted commit access to the project repository.

Since we have already mentored many students through the GSoC and GCI programs, we plan project timelines in advance, maintain an informal and friendly working environment, and try to ensure that students feel included in the community and never hesitate to reach out to mentors. We also help them with the low-level tasks of their projects. Moreover, we try to retain students with the organization and its projects to give them an opportunity to learn about other aspects of software development, such as performance, operations, and infrastructure; all of these factors help keep students motivated to become long-term contributors.