-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updating script that loads organisms and gene info into the database, as well as requirements file, to include latest version of django-organisms and django-genes #8
Conversation
…h', to make the name clearer - also formatting and cleaning up this file, adding commands to load gene history, and removing commands for loading GO, KEGG and DO sets, as those are processed by the annotation-refinery now. Also removing update_tribe.sh file, since it was unnecessary.
…nstallation process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reviewed the PR and left a few comments. Using "for" loop will definitely reduce the number of lines and make the code more compact. I think it will be a good idea in the long run. If you want to close this PR as soon as possible, I strongly recommend that you add an issue to fix it later.
load_organisms_and_genes_to_db.sh
Outdated
python manage.py organisms_create_or_update --taxonomy_id=208964 \ | ||
--scientific_name="Pseudomonas aeruginosa" --common_name="Pseudomonas aeruginosa" | ||
|
||
#CREATE XRDBS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you use "#CREATE XRDBS" (w/o a space char after #
) on purpose? (Same format is found in other comments lines below too.) If not, you may want to add the space char to make the comment style consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No - that was actually legacy stuff from the previous files, I used a better format in the new comments, but I will change this.
@@ -0,0 +1,244 @@ | |||
# Create organism records |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you are sure that this script will be run with sh
or bash
, a line like this at the beginning will be helpful:
#!/bin/sh
or
#!/bin/bash
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, good idea, I will add that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad. The second line of code should be #!/bin/bash
. I have corrected it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah good call, thanks!
@@ -0,0 +1,244 @@ | |||
# Create organism records | |||
python manage.py organisms_create_or_update --taxonomy_id=9606 \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure whether it will be better to write a for
loop to deal with these commands. What do you think?
load_organisms_and_genes_to_db.sh
Outdated
--scientific_name="Pseudomonas aeruginosa" --common_name="Pseudomonas aeruginosa" | ||
|
||
#CREATE XRDBS | ||
python manage.py genes_add_xrdb --name=Ensembl \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question: What about a "for" loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually for this one, would it really make sense to create a for loop? According to the link you posted, I would need to declare an associative array for each of the xrdbs, which would take as many lines as just calling these commands, no?
load_organisms_and_genes_to_db.sh
Outdated
# database for each organism | ||
|
||
#HUMAN | ||
wget -P data/ -N \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"for" loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot to tell you before: when I deployed tribe on my desktop, I had to change line 28 of requirements/base.txt
from:
psycopg2==2.6
to:
psycopg2==2.7
@mhuyck reported this issue some time ago when he deployed adage-server.
I did some research on how easy it is in bash to define an array of dictionaries to be used in |
…rganisms_and_genes_to_db.sh file, as per code review comments.
@dongbohu - I started to implement the for-loops with the associative array as you suggested, but I feel like it is a more involved process (with logic that is not as trivial), and I don't know how much it would help at this point, so I think I will open an issue for it. Is that ok with you? |
Did you see my comment on updating |
About using |
Ah yes, I did, and I made the change to the psycopg2 version, I had just forgotten to commit it and push it, good catch! Just pushed it. |
When I was a Unix system administrator (~2002), sometimes I wrote C++ programs to generate shell script that had a lot of boilerplate lines, because sh/bash/csh/tcsh was not powerful enough. It was more like a workaround, not a good long-term solution. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
Thanks! I agree with your comment about not writing other programs to generate the boilerplate shell commands, as it is not a long-term solution. |
No description provided.