
This package is designed to pull annotations from PubChem via their PUGView API. It iterates through JSON pages and retrieves the annotations. It is designed for multithreading and will work best on HPC systems with fast internet connections. Be aware that the time required will vary wildly with each annotation type.

demontrees/PubChem_Prospector


     _ __                              _,.----.  ,--.-,,-,--,    ,----.         ___                 
  .-`.' ,`.  .--.-. .-.-.   _..---.  .' .' -   \/==/  /|=|  | ,-.--` , \ .-._ .'=.'\                
 /==/, -   \/==/ -|/=/  | .' .'.-. \/==/  ,  ,-'|==|_ ||=|, ||==|-  _.-`/==/ \|==|  |,--.--------.  
|==| _ .=. ||==| ,||=| -|/==/- '=' /|==|-   |  .|==| ,|/=| _||==|   `.-.|==|, |  / -/==/,  -   , -\ 
|==| , '=',||==|- | =/  ||==|-,   ' |==|_   `-' \==|- `-' _ /==/_ ,    /|==|   \/  ,\==\.-.  - ,-./ 
|==|-  '..' |==|,  \/ - ||==|  .=. \|==|   _  , |==|  _     |==|    .-' |==|- ,   _ |`--`--------`  
|==|,  |    |==|-   ,   //==/- '=' ,\==\.       /==|   .-. ,\==|_  ,`-._|==| _|  |  |               
/==/ - |    /==/ , _  .'|==|   -   / `-.`.___.-'/==/, //=/  /==/ ,     //==/  /V/ , /               
`--`---'    `--`..---'  `-._`.___,'             `--`-' `-`--`--`-----`` `--`./  `--`                
                  _ __                  _,.---._      ,-,--.     _ __                               
               .-`.' ,`.  .-.,.---.   ,-.' , -  `.  ,-.'-  _\ .-`.' ,`.                             
              /==/, -   \/==/  `   \ /==/_,  ,  - \/==/_ ,_.'/==/, -   \,--.--------.               
             |==| _ .=. |==|-, .=., |==|   .=.     \==\  \  |==| _ .=. /==/,  -   , -\              
             |==| , '=',|==|   '='  /==|_ : ;=:  - |\==\ -\ |==| , '=',\==\.-.  - ,-./              
             |==|-  '..'|==|- ,   .'|==| , '='     |_\==\ ,\|==|-  '..' `--`--------`               
             |==|,  |   |==|_  . ,'. \==\ -    ,_ //==/\/ _ |==|,  |                                
             /==/ - |   /==/  /\ ,  ) '.='. -   .' \==\ - , /==/ - |                                
             `--`---'   `--`-`--`--'    `--`--''    `--`---'`--`---'                                
                  ,----.    _,.----.  ,--.--------.   _,.---._                                      
               ,-.--` , \ .' .' -   \/==/,  -   , -\,-.' , -  `.   .-.,.---.                        
              |==|-  _.-`/==/  ,  ,-'\==\.-.  - ,-./==/_,  ,  - \ /==/  `   \                       
              |==|   `.-.|==|-   |  . `--`\==\- \ |==|   .=.     |==|-, .=., |                      
             /==/_ ,    /|==|_   `-' \     \==\_ \|==|_ : ;=:  - |==|   '='  /                      
             |==|    .-' |==|   _  , |     |==|- ||==| , '='     |==|- ,   .'                       
             |==|_  ,`-._\==\.       /     |==|, | \==\ -    ,_ /|==|_  . ,'.                       
             /==/ ,     / `-.`.___.-'      /==/ -/  '.='. -   .' /==/  /\ ,  )                      
             `--`-----``                   `--`--`    `--`--''   `--`-`--`--'                       

                              
                                           ::::::::::                                          
                                        :::::::  ::::::!!!!!!!!!!!/!                           
                                     ::::         !!!!!!!!!!!!!!!!!!!!                         
                                   ::::        !!!!!!!!!!!!!!!!!!!!!!!!!!!                     
                                  ::::       !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!                    
                                 :::       !!!!!!!!!!!!!!+!! !!!!!!!!!!!!!!                    
                                 :::    !!!!!!!!!!!!!!    +! !!!!!!!!!!!!!!                    
                               ::::    !!!!!!!!!!!!      ;   !  !!!!!!!!!!!                    
                               ::::  !!!!!!!!!!!     ;;;;;  ;;  ;;!!!!!!!!                     
                              :::: !!!!!!!!!!!    ;;;;;;;; ;;;;;;;!!!!!!!!                     
                              ::::!!!!!!!!!!    ;;;   +      ;;;;!!!!!!!!                      
                             :::!!!!!!!!!!     ;;   ++++      ++ !!!!!!!                       
                            :::!!!!!!!!!!W     ;    ++@0   \ +0@ !!!!!!\\\\\\\\                
                            :!!!!!!!!!!WWW          +++@    \ @++!!!!!\\\\\\\\\\\\\\\\         
                            :!!!!!!!!!WWWW               |  │   !!!!\\\\\\\\\\\\\\\\\\\\\      
                            !!!!!!!!! WWWW W           | |   \  !  W\\\\\\\\\\\\     \\\\\\    
                           !!!!!!!!!!WWWWWWWW      WWWWW──|──|  WWW\\\  \\\\\\\\        \\\\\  
                          !!!!!!!!!!WWWWWWWWWWW WWWWWWWWWWWWWWWWWWW\     \\\\\\\          \\\\ 
                          !!!!!!!!!!WWWWWWWWW WWWW┌ W WWWWWWWWWWWWW       \\   \            \\ 
                          !!!!!!!!!!WWWWWWWWWWWWWW \U\ ____/  WWWWWW  W    \   \\              
                          !!!!!!!!!!WWWWWWW WWWWWW │  U UU  │  WWWWWW      \\   \              
                          !!!!!!!!!!WWWWWWWWWWWWWW  \ \  \ \│ WWWWW         \\  \\             
                          !!!!!!!!!WWWWWWWWWWWWWWWWW \nn_n__/WWWWW           \   \/            
                            !!!!!WWWWWWWWWWWWWWWWWWWWWWW W WWWWWWWW          \\  /  /          
                             !!!!!WW!WWWWWWWWWWWWWWWWWWWWWWWWWWW            / ─  -   /         
                              !!!!!!!!!!!WWWWWWWWWWWWWWWWWWWWWWWWW             /    /          
                                !!!!!!!WWWW!WWWWWWWWWWWW WWWWWW               ─   -   /        
                                     !!!!!!!W!!WWW WWWWWWW  WWWW                  - /          
                                                 WWWW W         WWW            ── ─            
                                                                                                                                                               
                                                                                                                                                
                                                                                                                                   
          PubChem Prospector:                                                                     _                                
   This package is designed to pull annotations from PubChem via their PUGView API.        __  __|_|__                        
   It iterates through JSON pages and retrieves the annotations. It is designed for      / \   _|+.+|_                     
   multithreading and will work best with HPC systems with fast internet connections.       \// |WWW|\\                          
   Be aware the time required will vary wildly with each annotation type.                       // \\       

                   
build_annotation_dict:
This function builds a dictionary of every annotation PubChem Prospector can access.
Run "build_annotation_dict()", then read the result from pc_annotations.
The dictionary is structured as {entry_type: [annotations]}.


The available entry types are:

Taxonomy, Element, Cell, Protein, Assay, Gene, Compound, and Pathway
          
Usage:                                                                                                    
          
prospector.build_annotation_dict()                                                                                  
prospector.pc_annotations
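
As a rough illustration of the {entry_type: [annotations]} layout described above, the sketch below uses a small hand-written stand-in for pc_annotations. The annotation names in it are placeholders chosen for illustration, not output captured from the package, and summarize is a hypothetical helper; the real dictionary comes from build_annotation_dict().

```python
# Hand-written stand-in for prospector.pc_annotations (illustrative values only).
sample_annotations = {
    "Compound": ["Boiling Point", "Melting Point", "Solubility"],
    "Gene": ["Gene Function"],
}


def summarize(annotations):
    """Return {entry_type: number of annotations} for a quick overview."""
    return {entry_type: len(names) for entry_type, names in annotations.items()}


# Print one line per entry type so the size of each list is easy to scan.
for entry_type, count in summarize(sample_annotations).items():
    print(f"{entry_type}: {count} annotation(s)")
```

A quick overview like this can help when deciding which annotations to request, since each one is collected separately.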
       
annotation_search:
This function is a shortcut for searching the annotations in the pc_annotations dictionary.
Give it the entry_type and the string to search for, and save the result(s) to a variable.

Usage:

z = prospector.annotation_search({ENTRY_TYPE},{SEARCH_STRING})
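
The package's own search logic is not shown here, so the following is only a guess at what such a shortcut might do: a case-insensitive substring match over the annotation list for one entry type. search_annotations and the sample dictionary are hypothetical, and the real function may match differently.

```python
def search_annotations(annotations, entry_type, search_string):
    """Case-insensitive substring search within one entry type's annotation list.

    Assumption: an unknown entry_type yields an empty list rather than an error.
    """
    needle = search_string.lower()
    return [name for name in annotations.get(entry_type, []) if needle in name.lower()]


# Illustrative stand-in for pc_annotations, not real package output.
sample = {"Compound": ["Boiling Point", "Melting Point", "Solubility"]}
matches = search_annotations(sample, "Compound", "point")
print(matches)
```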

get_dict:
This is the main function of PubChem_Prospector. It calls a series of smaller functions to build a
dictionary containing all selected annotations for the given entry type. It collects annotations
one at a time, so the more you ask for, the longer it will take, but it combines them all into a
single dictionary keyed by the numeric IDs PubChem uses for that entry type. The second argument
can be either a string naming a single annotation type or a list of annotation types.

You can also specify the number of threads. The default is 32.

Usage:

pc_dict = prospector.get_dict({ENTRY_TYPE}, {ANNOTATION_TYPE}, threads = 32)                                        
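
The package's internals are not reproduced here, so the sketch below only illustrates the general pattern described above: pages are fetched concurrently by a thread pool and merged into one dictionary keyed by numeric ID. collect_pages, fetch_page, and fake_fetch are hypothetical names, and fake_fetch stands in for a real network request so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor


def collect_pages(fetch_page, total_pages, threads=32):
    """Fetch pages 1..total_pages concurrently and merge them into one dict keyed by ID."""
    merged = {}
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map yields results in page order even though the calls overlap in time.
        for page in pool.map(fetch_page, range(1, total_pages + 1)):
            for entry_id, value in page:
                merged.setdefault(entry_id, []).append(value)
    return merged


def fake_fetch(page_number):
    """Stand-in for a network call: returns a list of (id, annotation value) pairs."""
    return [(page_number, f"value from page {page_number}"), (0, f"shared {page_number}")]


result = collect_pages(fake_fetch, total_pages=3, threads=4)
print(sorted(result))
```

Because the work is dominated by waiting on responses, threads are a reasonable fit; the best thread count depends on the connection and on any request limits the server enforces.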
													 
