Reference Loader Exercise

mihailefter edited this page May 27, 2019 · 1 revision

How to load and use your own reference sequence

You only need the Reference File Loader if the reference sequence is not available under a GenBank or LRG accession number.

The Reference File Loader accepts any file in valid GenBank Flat file format.

The Reference File Loader can load reference sequences that you generated yourself using its first two options:

  • The reference sequence file is a local file.
  • The reference sequence file can be found at the following URL.

A valid GenBank Flat file may have been created by your sequence analysis software or by sequence submission tools such as network-aware Sequin.

The exercise

GenBank accession number U14680.1 represents the major BRCA1 allele containing a Proline at position 871 (dbSNP rs799917) in the Caucasian population. In this exercise, we will generate a reference sequence reflecting the major BRCA1 allele in the Afro-American population, which contains a C>T substitution at position c.2612 leading to a Leucine at position 871.
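As a quick check of the coordinates used in this exercise, the following Python sketch relates coding position c.2612 to codon 871 and to the record position edited later in the steps. The helper names are illustrative and not part of Mutalyzer. The CDS start of 120 is an assumption inferred from the numbers on this page (2731 - 2612 + 1) and should be checked against the CDS feature of the downloaded record.

```python
# Illustrative helpers (hypothetical names, not part of Mutalyzer).


def codon_of(c_pos):
    """Return (codon number, 1-based position within that codon) for a c. position."""
    return (c_pos - 1) // 3 + 1, (c_pos - 1) % 3 + 1


def c_to_record_position(c_pos, cds_start):
    """Map a c. position to a 1-based record position, assuming a contiguous CDS."""
    return cds_start + c_pos - 1


# ASSUMPTION: CDS start inferred from this page's numbers (2731 - 2612 + 1);
# verify it against the CDS feature line of the record itself.
ASSUMED_CDS_START = 120

# Standard genetic code: every CCN codon encodes proline, every CTN codon leucine,
# so a C>T change at the second base of a proline codon always yields leucine.
PRO_CODONS = {"CCT", "CCC", "CCA", "CCG"}
LEU_CTN_CODONS = {"CTT", "CTC", "CTA", "CTG"}

print(codon_of(2612))                                 # (871, 2)
print(c_to_record_position(2612, ASSUMED_CDS_START))  # 2731
```

Position c.2612 is therefore the second base of codon 871, which is why the C>T substitution changes Proline into Leucine.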

  1. Open Entrez Nucleotide in a new window.

  2. Paste the accession number U14680.1 reflecting the major BRCA1 allele containing a Proline at position 871 in the Caucasian population into the Search box and click the Search button.

  3. Click the "Send link" above the record title on the right side, select "File" under the heading "Choose destination", choose the file format "GenBank" and click the "Create File" button. Save the file as U14680.gb.

  4. Open U14680.gb in a text editor (WordPad or equivalent).

  5. Change the C at position 2731 into a T (see note 1). Save the file as U14680modP871L.gb in plain text format.

  6. Open the Reference File Loader page, select the first option, browse to locate the file U14680modP871L.gb and click the Submit button.

  7. When successful, Mutalyzer returns a UD_xxxxxxxxxxxx accession number, which can be used as a reference sequence for further checks.

  8. Compare the Name Checker results for U14680.1 and the UD_ in combination with c.2612C>T. How do you have to describe the major allele in the Caucasian population using the UD_?
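The edit in step 5 can also be made with a script instead of a text editor. The following is a minimal Python sketch, assuming the usual GenBank flat file layout: the sequence follows an ORIGIN line, each sequence line starts with the number of its first base, and the record ends with //. The function names substitute_base and rewrite_file are hypothetical. The sketch refuses to edit when the base found at the given position is not the expected one.

```python
def substitute_base(genbank_text, position, ref, alt):
    """Return genbank_text with the base at a 1-based position changed from ref to alt.

    Only the ORIGIN block is touched; spacing and line numbers are preserved.
    """
    lines = genbank_text.splitlines()
    in_origin = False
    for i, line in enumerate(lines):
        if line.startswith("ORIGIN"):
            in_origin = True
            continue
        if not in_origin:
            continue
        if line.startswith("//"):
            break
        parts = line.split()
        if not parts or not parts[0].isdigit():
            continue
        start = int(parts[0])                      # number of the first base on this line
        n_bases = sum(len(p) for p in parts[1:])   # bases on this line, ignoring spaces
        if not (start <= position < start + n_bases):
            continue
        offset = position - start
        num_end = line.index(parts[0]) + len(parts[0])
        seen = -1
        # Walk the characters after the leading number, counting letters only.
        for j in range(num_end, len(line)):
            if line[j].isalpha():
                seen += 1
                if seen == offset:
                    if line[j].lower() != ref.lower():
                        raise ValueError(
                            f"expected {ref} at position {position}, found {line[j]}"
                        )
                    new = alt.lower() if line[j].islower() else alt.upper()
                    lines[i] = line[:j] + new + line[j + 1:]
                    return "\n".join(lines) + "\n"
    raise ValueError(f"position {position} not found in ORIGIN block")


def rewrite_file(src_path, dst_path, position, ref, alt):
    """Read a GenBank file, apply one substitution, and write the result as plain text."""
    with open(src_path) as handle:
        text = handle.read()
    with open(dst_path, "w") as handle:
        handle.write(substitute_base(text, position, ref, alt))
```

For this exercise the call would be rewrite_file("U14680.gb", "U14680modP871L.gb", 2731, "C", "T"). Checking the reference base before editing guards against an off-by-one position or a file that differs from the expected record.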

Notes

  1. To generate a consistent record, you would also have to change the P at position 871 in the CDS translation.
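The change described in this note could be scripted along the following lines. This is a sketch only, assuming the record contains a single CDS feature whose /translation qualifier holds the protein sequence in double quotes, possibly wrapped over several lines. The function name substitute_residue is hypothetical.

```python
def substitute_residue(genbank_text, residue_number, ref, alt):
    """Change one amino acid in the first /translation qualifier of a GenBank record.

    residue_number is 1-based; line breaks and indentation inside the
    qualifier are skipped when counting and left untouched.
    """
    marker = '/translation="'
    start = genbank_text.find(marker)
    if start == -1:
        raise ValueError("no /translation qualifier found")
    i = start + len(marker)
    seen = 0
    # Walk to the closing quote, counting letters only.
    while i < len(genbank_text) and genbank_text[i] != '"':
        ch = genbank_text[i]
        if ch.isalpha():
            seen += 1
            if seen == residue_number:
                if ch != ref:
                    raise ValueError(
                        f"expected {ref} at residue {residue_number}, found {ch}"
                    )
                return genbank_text[:i] + alt + genbank_text[i + 1:]
        i += 1
    raise ValueError(f"translation is shorter than {residue_number} residues")
```

For this exercise the call would be substitute_residue(text, 871, "P", "L"), applied to the contents of the modified file before saving it.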