Skip to content

Conversation

@vivienyuwenchen
Copy link

@vivienyuwenchen vivienyuwenchen commented Sep 14, 2017

I finally understood the purpose of if name == "main": and moved the gene_finder(dna) portion under that branch.

Copy link

@SeunginLyu SeunginLyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks really good in general! I like your clean syntax practices and the code is really well documented with appropriate docstings. I added a few comments that are mostly stylistic guides.

from load import load_seq
dna = load_seq("./data/X73525.fa")
import doctest
from pickle import dump, load

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow pickle toolbox! great job

import random
from amino_acids import aa, codons, aa_table # you may find these useful
from load import load_seq
dna = load_seq("./data/X73525.fa")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you usually want to separate any function calls from these import statements. It would make more sense to put it under if name == "main":

elif nucleotide == 'T':
return 'A'

#doctest.run_docstring_examples(get_complement, globals(), verbose=True)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove comments in your final upload

I added two more doctests because each complementary nucleotide is in
its own if/else if branch. If one branch doesn't work and there is no
doctest for that branch, the mistake will not be caught. In fact, I
found out that I spelled nucleotide wrong in the G branch.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great practice that you are adding your own tests


return comp_str

#doctest.run_docstring_examples(get_reverse_complement, globals(), verbose=True)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another comment to remove

for nuc in range(0, len(reversed_dna)):
comp_nuc = get_complement(reversed_dna[nuc])
comp_list.append(comp_nuc)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you wanna get fancy with just one line of code :
return ''.join([get_complement(reversed_dna[nuc]) for nuc in dna[::-1]])

I added two more doctests: one in which the frame stop codon is TAA and
another in which there is no frame stop codon. This way, all the
branches are tested for error.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1


return dna

#doctest.run_docstring_examples(rest_of_ORF, globals(), verbose=True)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another comment to remove

frame1 = find_all_ORFs_oneframe(dna)
frame2 = find_all_ORFs_oneframe(dna[1:])
frame3 = find_all_ORFs_oneframe(dna[2:])

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could also use a for loop here
[find_all_ORFs_oneframe(dna[i:]) for i in range 3]

humongest = longest_ORF(dna_shuffles[i+1])

return len(humongest)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be more memory efficient to just save the max_length of longest_ORF than to save the actual longest_ORF

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants