Submitting data to SRA

Mats Töpel edited this page Dec 18, 2015 · 4 revisions

The NCBI Sequence Read Archive (SRA, formerly the Short Read Archive) is a good place to store raw high-throughput sequencing data. Data can be uploaded and kept under embargo for the duration of a project, which may be several years, and then released to the public once the study is published. Because SRA data is stored in multiple locations, it remains safe even in the event of system malfunctions.

A few steps need to be completed before any data can be uploaded. I will briefly describe them here.

  1. First, you will need to create an account at NCBI: http://www.ncbi.nlm.nih.gov/

  2. The next step is to create a "BioProject" at https://submit.ncbi.nlm.nih.gov/subs/bioproject/ . During the BioProject submission procedure, you will need to enter information about yourself (affiliation), the project and the organism(s) under study. You will also be asked to create a "BioSample" for each of your samples. If you only have a few samples, you can enter the information for each of them at this point. If you have many, it is usually better to leave this part blank, finish the BioProject submission, and submit a batch file for the BioSamples at a later stage. Once the BioProject has been created and processed, you will get a confirmation email with the BioProject ID.

  3. Submit your BioSample information at: https://submit.ncbi.nlm.nih.gov/subs/biosample/ . You will be asked to enter meta-information about each sample. Please be as detailed as possible, as this will help other researchers in the future. You can also enter information about voucher specimens, if applicable. A template for batch submission can be found at the top of the page, next to the "New Submissions" button. During the BioSample submission process, you can specify when the data will become publicly available, up to 5 years into the future. After each BioSample submission, you will receive a confirmation email with the BioSample ID numbers.

  4. Once the information for all of your samples has been entered as BioSamples, you can start creating an SRA submission at: https://submit.ncbi.nlm.nih.gov/subs/sra/ . During the submission process, you will need to download a template (the Excel format is usually easier to work with) and enter information about your libraries (the type of library, the type of machine it was sequenced on, etc.). You will also need to associate each library with a BioProject, a BioSample and a sequence data filename. Then upload the metadata. In the next step, you will specify where the files are stored on your computer, and they will then be uploaded to the database.
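
Because every library in the SRA metadata has to point at a BioProject, a BioSample and a sequence file, it can be worth checking the filled-in template before uploading it. The following is a minimal sketch of such a check in Python, not part of the NCBI tooling. It assumes the template has been saved as tab-separated text, and the column names (`bioproject_accession`, `biosample_accession`, `library_ID`, `filename`), the function names and the accession numbers are placeholders chosen for illustration. Use the headers from the template you actually downloaded and the IDs from your confirmation emails.

```python
import csv
import io

# Assumed column names; replace them with the headers in your own template.
REQUIRED = ("bioproject_accession", "biosample_accession", "library_ID", "filename")


def read_tsv(text):
    """Parse tab-separated text into a list of dicts, one per library."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def check_sra_metadata(rows, known_biosamples, available_files):
    """Return a list of human-readable problems found in the metadata rows."""
    problems = []
    seen_files = set()
    for n, row in enumerate(rows, start=1):
        # Every library needs a project, a sample, an ID and a file name.
        for col in REQUIRED:
            if not (row.get(col) or "").strip():
                problems.append(f"row {n}: missing {col}")
        # The BioSample must be one you have already registered.
        sample = (row.get("biosample_accession") or "").strip()
        if sample and sample not in known_biosamples:
            problems.append(f"row {n}: unknown BioSample {sample}")
        # The file must exist and must not be listed twice.
        fname = (row.get("filename") or "").strip()
        if fname:
            if fname not in available_files:
                problems.append(f"row {n}: file not found: {fname}")
            if fname in seen_files:
                problems.append(f"row {n}: duplicate filename {fname}")
            seen_files.add(fname)
    return problems


# Made-up example: two libraries, the second one referring to a
# BioSample and a file that do not exist.
example = (
    "bioproject_accession\tbiosample_accession\tlibrary_ID\tfilename\n"
    "PRJNA000000\tSAMN00000001\tlib1\tlib1_R1.fastq.gz\n"
    "PRJNA000000\tSAMN00000009\tlib2\tlib2_R1.fastq.gz\n"
)
rows = read_tsv(example)
problems = check_sra_metadata(rows, {"SAMN00000001"}, {"lib1_R1.fastq.gz"})
for problem in problems:
    print(problem)
```

In practice, `known_biosamples` would hold the BioSample IDs from your confirmation emails and `available_files` the names of the files in the directory you intend to upload, for example collected with `os.listdir`. An empty list of problems only means the table is internally consistent; the submission portal still performs its own validation.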
