Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updating script that loads organisms and gene info into the database, as well as requirements file, to include latest version of django-organisms and django-genes #8

Merged
merged 6 commits into from
Nov 10, 2017

Conversation

rzelayafavila
Copy link
Contributor

No description provided.

…h', to make the name clearer - also formatting and cleaning up this file, adding commands to load gene history, and removing commands for loading GO, KEGG and DO sets, as those are processed by the annotation-refinery now. Also removing update_tribe.sh file, since it was unnecessary.
@ghost ghost assigned rzelayafavila Nov 8, 2017
@ghost ghost added the code review label Nov 8, 2017
Copy link
Contributor

@dongbohu dongbohu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed the PR and left a few comments. Using "for" loop will definitely reduce the number of lines and make the code more compact. I think it will be a good idea in the long run. If you want to close this PR as soon as possible, I strongly recommend that you add an issue to fix it later.

python manage.py organisms_create_or_update --taxonomy_id=208964 \
--scientific_name="Pseudomonas aeruginosa" --common_name="Pseudomonas aeruginosa"

#CREATE XRDBS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you use "#CREATE XRDBS" (w/o a space char after #) on purpose? (Same format is found in other comments lines below too.) If not, you may want to add the space char to make the comment style consistent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No - that was actually legacy stuff from the previous files, I used a better format in the new comments, but I will change this.

@@ -0,0 +1,244 @@
# Create organism records
Copy link
Contributor

@dongbohu dongbohu Nov 9, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are sure that this script will be run with sh or bash, a line like this at the beginning will be helpful:
#!/bin/sh
or

#!/bin/bash

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, good idea, I will add that

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad. The second line of code should be #!/bin/bash. I have corrected it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good call, thanks!

@@ -0,0 +1,244 @@
# Create organism records
python manage.py organisms_create_or_update --taxonomy_id=9606 \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure whether it will be better to write a for loop to deal with these commands. What do you think?

--scientific_name="Pseudomonas aeruginosa" --common_name="Pseudomonas aeruginosa"

#CREATE XRDBS
python manage.py genes_add_xrdb --name=Ensembl \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question: What about a "for" loop?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually for this one, would it really make sense to create a for loop? According to the link you posted, I would need to declare an associative array for each of the xrdbs, which would take as many lines as just calling these commands, no?

# database for each organism

#HUMAN
wget -P data/ -N \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"for" loop?

Copy link
Contributor

@dongbohu dongbohu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot to tell you before: when I deployed tribe on my desktop, I had to change line 28 of requirements/base.txt from:
psycopg2==2.6
to:
psycopg2==2.7
@mhuyck reported this issue some time ago when he deployed adage-server.

@dongbohu
Copy link
Contributor

dongbohu commented Nov 9, 2017

I did some research on how easy it is in bash to define an array of dictionaries to be used in for loop. Bash 4.0 (released in 2009) introduced associative array, which makes the implementation easy. See this example:
https://stackoverflow.com/questions/1494178/how-to-define-hash-tables-in-bash

@rzelayafavila
Copy link
Contributor Author

@dongbohu - I started to implement the for-loops with the associative array as you suggested, but I feel like it is a more involved process (with logic that is not as trivial), and I don't know how much it would help at this point, so I think I will open an issue for it. Is that ok with you?

@rzelayafavila
Copy link
Contributor Author

Ok @dongbohu, I think I have addressed everything. I opened up issue #9 to address the converting of commands into for-loop as per our conversations here and in person. Let me know how it looks, thanks!

@dongbohu
Copy link
Contributor

Did you see my comment on updating psycopg2 from 2.6 to 2.7 in requirements/base.txt?

@dongbohu
Copy link
Contributor

About using for loop in bash, I agree its syntax seems a little weird. Let me think about it. You don't need to do it now.

@rzelayafavila
Copy link
Contributor Author

rzelayafavila commented Nov 10, 2017

Ah yes, I did, and I made the change to the psycopg2 version, I had just forgotten to commit it and push it, good catch! Just pushed it.

@dongbohu
Copy link
Contributor

When I was a Unix system administrator (~2002), sometimes I wrote C++ programs to generate shell script that had a lot of boilerplate lines, because sh/bash/csh/tcsh was not powerful enough. It was more like a workaround, not a good long-term solution.

Copy link
Contributor

@dongbohu dongbohu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@rzelayafavila
Copy link
Contributor Author

Thanks! I agree with your comment about not writing other programs to generate the boilerplate shell commands, as it is not a long-term solution.

@rzelayafavila rzelayafavila merged commit 8b8c194 into greenelab:master Nov 10, 2017
@ghost ghost removed the code review label Nov 10, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants