ToniWestbrook / paladin Public

Pending Tasks


Toni Westbrook edited this page Jun 25, 2015 · 39 revisions

#### Research, Design, Development ####

Tool for automated/scripted GO analysis of PALADIN results to measure effectiveness (of identifying functionality, as opposed to taxonomy)
Post-processing functionality (SAM translation, embedding protein info, BAM generation, output options)
Options refinement (remove BWA options not usable/relevant to PALADIN, ensure argument checking works with new PALADIN-related options)
Code cleanup
Incorporate ORF detection hinting depending on results obtained below

Fix intentional memory leak issue in the getSequenceORF function. (6/21/15)
For each major algorithm variant that we will use, create a command line argument to select it and set appropriate options. Part of this will be detection of single vs multi frame protein translation in the reference index during the alignment phase and adjusting functionality accordingly (including for Post-processing, see below) (6/21/15)
Create script to generate a mapping between UniProt/RefSeq IDs and CDS entry for mapped reads (for use in GO work) (6/16/15)
Create second version of UniProt nucleotide database with the references removed for our 6 MCBS913 species (6/9/15)
Fix >2GB indexing issue (inherent to BWA), necessary for UniProt nucleotide testing (6/10/15)
Fix memory leak in sequence header name parsing (causing huge amounts of memory to be used)
Build nucleotide sequence database for each corresponding protein sequence in the UniProt database (for the 95% where possible)
Move multi-frame protein translation from indexing to alignment - then choose best alignments during SAM output
Research reference set used by PhyloSift, clone environment in PALADIN
Add command arguments for ORF length
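
The multi-frame translation and ORF length items above can be illustrated with a minimal Python sketch. It is not PALADIN's implementation (PALADIN is built on BWA's C code base); it assumes the standard genetic code, and the names `six_frame_translations`, `orfs`, and `min_length` are hypothetical, chosen only for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption:
# no alternative translation tables are needed for this illustration).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case nucleotide string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Translate a read in all six frames, keyed as +1..+3 and -1..-3."""
    seq = seq.upper()
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames


def orfs(protein, min_length):
    """Split a translation at stop codons (*) and keep the long segments.

    min_length is a hypothetical stand-in for an ORF length argument.
    """
    return [seg for seg in protein.split("*") if len(seg) >= min_length]


if __name__ == "__main__":
    for name, protein in six_frame_translations("ATGGCCTAAGGG").items():
        print(name, protein, orfs(protein, min_length=2))
```

Translating at alignment time, as in this sketch, keeps the reference index single-frame and leaves the choice of the best frame to a later step such as SAM output.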

#### Testing and Verification ####

For all tests, perform accompanying GO analysis (create an individual FINISHED item below for each completed test; remove this item when all tests are done)
Align: Generated metagenomic reads against UniProt (Full and Filtered) using Novo, plain
Align: Generated metagenomic reads against UniProt (Full and Filtered) using Novo, degenerate
Align: Real metagenomic reads against UniProt using BWA
Align: Real metagenomic reads against UniProt using Novo, plain
Align: Real metagenomic reads against UniProt using Novo, degenerate
Test with reads of length 150
After incorporating any hinting functionality from tests above, retest all algorithm variants on the real metagenomic reads (with full UniProt DB)

Test frequency ordering of specific stop codons vs GC content (similar to test below) to see if specific stops differ from overall pattern (6/24/15)
Verify alignment of mapped reads, develop error measure, look for patterns/recurring issues in misalignment (The non-GO portion of this is done - verification now needs to be performed at the functional level) (6/24/15)
Complete tests on GC content vs stop codon frequency frame order (6/17/15)
Generated metagenomic reads against UniProt (Full and Filtered) using BWA (6/17/15)
Complete tests on stop codon frequency frame order (6/14/15)
Run all generated metagenome reads tests again using the filtered UniProt database (6/11/15)
Complete test of algorithm variant 1 on the generated metagenome reads (6/10/15)
Run stop codon frequency stats (counts, frame order) on UniProt, MCBS913 species, and Acidovorax
Complete tests of algorithm variants 2 and 3 on the generated metagenome reads
Initial test of ORF detection strategies (See initial list here)
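
The stop codon statistics listed above (counts per frame, frame order, and GC content) could be gathered along the following lines. This is a sketch under assumptions: it reads "frame order" as ranking the three forward frames by total stop codon count, which is one plausible interpretation rather than the documented test procedure, and all function names are hypothetical.

```python
STOP_CODONS = ("TAA", "TAG", "TGA")


def gc_content(seq):
    """Fraction of G and C bases in the sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def stop_counts_by_frame(seq):
    """Count each stop codon in the three forward frames (0-based offsets)."""
    seq = seq.upper()
    counts = []
    for offset in range(3):
        frame_counts = {stop: 0 for stop in STOP_CODONS}
        for i in range(offset, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in frame_counts:
                frame_counts[codon] += 1
        counts.append(frame_counts)
    return counts


def frame_order(counts):
    """Rank frames from most to fewest stop codons (assumed meaning of
    'frame order'); ties keep the lower frame index first."""
    totals = [sum(c.values()) for c in counts]
    return sorted(range(3), key=lambda f: totals[f], reverse=True)


if __name__ == "__main__":
    read = "ATGTAATGATAG"
    counts = stop_counts_by_frame(read)
    print("GC content:", gc_content(read))
    print("Stops per frame:", counts)
    print("Frame order:", frame_order(counts))
```

Running this per sequence and grouping the results by GC content would give the kind of comparison the GC content vs stop codon items describe; the reverse-strand frames are omitted here for brevity.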

#### Paper ####
