Functionmotifs #116

tnitka · 2023-11-27T18:12:43Z

Add Motif mode for identification of functionally relevant sequence motifs

christinehc · 2023-12-12T00:20:12Z

Please remember to update the version here before pushing. Depending on whether #115 gets merged first or not, try to coordinate versions (I suggest 1.2.0 for #115 and 1.3.0 for this update, unless we do a simultaneous 1.2.0)

changelog:
- kmers can now be scored by probability score subtracting the observed kmers in a supplied background set, family set, or combining both background and family
- note: some column headers have changed, which may affect downstream analysis (e.g. integration with #115, #116)
- to handle user-supplied background files, new rules have been created to count background kmers and combine background kmer counts into a background matrix. The appropriate files for the new workflow have been created.
- extensive changes have been made to `snekmer.score` to accommodate the new changes, including:
  - `snekmer.score.score` now has 3 distinct formulae to compute probability scores according to the desired scoring method
  - `snekmer.score.feature_class_probabilities` now also integrates the scoring method
- the main scoring rule itself has been significantly altered as follows:
  - all references to the old and not-working "background subtraction" (e.g. separating sequences by "sample" or "background" labels) have been removed
  - extraneous kmer probability scores for every family are no longer calculated; only the family in question's kmer profile is scored
  - scoring method now integrated
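To make the three scoring options in the changelog above concrete, here is a minimal, dependency-free sketch. It is not the code in `snekmer.score`: the function names (`presence_probability`, `score_kmers`), the `method` values, and the subtraction formulas are all assumptions chosen for illustration. It treats a kmer's probability as the fraction of sequences in a set that contain it, then subtracts the probability observed in the background set, in the other families, or in both.

```python
# Illustrative only: names and formulas are assumptions, not snekmer.score itself.


def presence_probability(count_rows):
    """Fraction of sequences (rows) in which each kmer (column) occurs at least once."""
    n_rows = len(count_rows)
    n_kmers = len(count_rows[0])
    return [sum(1 for row in count_rows if row[j] > 0) / n_rows for j in range(n_kmers)]


def score_kmers(family, background=None, other_families=None, method="background"):
    """Score each kmer for one family under a hypothetical scoring method."""
    if method == "background":
        penalties = [presence_probability(background)]
    elif method == "family":
        penalties = [presence_probability(other_families)]
    elif method == "combined":
        penalties = [
            presence_probability(background),
            presence_probability(other_families),
        ]
    else:
        raise ValueError(f"unknown scoring method: {method!r}")

    # Start from the in-family probability and subtract each penalty in turn.
    scores = presence_probability(family)
    for penalty in penalties:
        scores = [s - p for s, p in zip(scores, penalty)]
    return scores


# Toy kmer count matrices: rows are sequences, columns are kmers.
family = [[1, 0, 2], [3, 0, 0]]
background = [[0, 1, 1], [0, 1, 0]]
others = [[1, 0, 0], [0, 0, 0]]

for name in ("background", "family", "combined"):
    print(name, score_kmers(family, background, others, method=name))
```

Under these assumed formulas, a kmer scores highest when it is common in the family and absent from whichever comparison set is used, which is the behaviour the changelog describes at a high level.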

christinehc

Overall Comments

Code is very well documented IMO, great job on that front :)
I've added (very minor) comments re: code changes.
I think the function motifs capability would strongly benefit from an example notebook or demo going through results and the interpretation thereof. We can also potentially work on a template for a summary report.
We need a docs page, as well as general updates to the docs, to reflect the function motifs workflow.

christinehc · 2024-01-23T18:49:36Z

.github/workflows/action.yml

@@ -52,7 +52,7 @@ jobs:
      - shell: bash -l {0}
        run: mamba install -y -c conda-forge snakemake==7.0 tabulate==0.8.10
      - shell: bash -l {0}
-        run: pip install -e git+https://github.com/PNNL-CompBio/Snekmer@kmer-association#egg=snekmer
+        run: pip install -e git+https://github.com/PNNL-CompBio/Snekmer@functionmotifs#egg=snekmer


I see why this was changed for local testing, but once the PR is merged, this should no longer point to the functionmotifs branch.

christinehc · 2024-01-23T19:14:35Z

snekmer/rules/motif.smk

@@ -0,0 +1,209 @@
+"""model.smk: Module for supervised kmer-based annotation models.


Should be motif.smk

changelog:
- RTD pages now include notebook-formatted pages with the learn/apply and motif demos.
- Formatted notebooks for compatibility with sphinx notebook formatting.
- Cleaned up notebooks, e.g. removing extraneous import statements

changelog:
- removed ~HEAD notebook file
- moved learn/apply and motif tutorial notebooks to docs folders to add them into sphinx documentation
- created symlinks for the above notebooks so that notebooks are still accessible from the resources directory

christinehc

My suggestions are minor and snekmer motif mode works on my machine without issues. We are close!

docs/source/getting_started/config.rst

docs/source/getting_started/usage.rst

resources/tutorial/snekmer_motif_tutorial.ipynb

christinehc · 2024-03-05T21:26:46Z

resources/tutorial/snekmer_motif_tutorial.ipynb

I would additionally love to see a (minor) deep dive on the results for the two different families, i.e. comparing the highest-scoring kmers for the two families, or brief analysis of the top N kmers. Extra points if there is some sequence motif we can use as an example that makes sense for each given family.
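One way the top-N comparison suggested above could look is sketched below. This is not taken from the tutorial notebook: the helper names (`top_kmers`, `compare_families`), the kmers, and the scores are all made up for illustration, and it assumes each family's results have already been loaded into a kmer-to-score mapping.

```python
# Hypothetical helpers and made-up scores; not taken from the Snekmer tutorial.


def top_kmers(scores, n=5):
    """Return the n highest-scoring kmers as (kmer, score) pairs, best first."""
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


def compare_families(scores_a, scores_b, n=5):
    """Split the top-n kmers of two families into shared and family-specific sets."""
    top_a = {kmer for kmer, _ in top_kmers(scores_a, n)}
    top_b = {kmer for kmer, _ in top_kmers(scores_b, n)}
    return {
        "shared": sorted(top_a & top_b),
        "only_a": sorted(top_a - top_b),
        "only_b": sorted(top_b - top_a),
    }


# Made-up kmer scores for two hypothetical families.
scores_a = {"AAK": 0.9, "KLM": 0.7, "GGS": 0.4, "PPT": 0.1}
scores_b = {"KLM": 0.8, "PPT": 0.6, "AAK": 0.2, "GGS": 0.1}

print(top_kmers(scores_a, n=2))
print(compare_families(scores_a, scores_b, n=2))
```

Kmers that land in `only_a` or `only_b` are the natural starting point for the family-specific motif discussion, while `shared` kmers are less likely to distinguish the two families.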

snekmer/motif.py

snekmer/scripts/motif_motif.py

snekmer/scripts/motif_preselect.py

snekmer/scripts/motif_rescore.py

christinehc

Looks good!

tnitka added 30 commits April 28, 2023 08:17

Fix wildcard error in motif.smk

a5ddb35

Intentionally break motif before current error to check that changes are being applied properly.

b1e5c5d

Fix most motif.smk errors, including intentional break

9b0d4c7

Fix wildcard and key errors

508a01b

Fix undefined name in motif script

9d91602

Fix input error in motif

83b90b7

Fix logging error in motif script

eca87b3

Fix input error in motif script

b26a173

Fix logging error in motif script

5b17a81

Fix input error in motif script

80bf8d0

Fix input error in motif script

c10b240

Fix input error in motif script

6aacb30

Fix compressed input handling in motif script

fcf8a2d

Fix compressed input handling in motif script

16e1256

Change input score source in motif script to the score matrix instead of CSVs

5278048

Implement bad debugging practice to be reverted in next commit

2e312d3

Fix input handling and remove bad debugging practice

cbafaff

Fix compressed input handling in motif script

7f8691e

Fix conflict between dataframe and ndarray usage in motif_motif.py

381a68b

Use common basis for all proteins in motif

77ea592

Fix input error in motif snakefile

d1f07d2

Fix input error in motif snakefile

7096a71

Fix input error in motif script

00438e7

Fix input processing error in motif

11e7d1d

Fix input processing error in motif

43efb71

Fix input processing error in motif

0422444

Fix scoring error in motif script

6e37b51

Fix error in motif scoring

4ef303f

Fix error in motif scoring

2028438

Fix error in motif scoring

cd5951b

tnitka added 4 commits December 5, 2023 11:36

Update test and fix command line parser error introduced during rebase

3c54509

fixup! Update test and fix command line parser error introduced during rebase

4e81aee

Add motif test to CI workflow

7383c42

Correct snekmer motif test environment

2d5cae5

christinehc mentioned this pull request Dec 12, 2023

Enable background subtraction / file unzipping #118

Open

tnitka and others added 2 commits December 12, 2023 14:39

chore: update _version.py

8de742d

Merge branch 'main' into functionmotifs

46b526c

christinehc requested review from biodataganache and christinehc December 20, 2023 23:46

christinehc requested changes Jan 23, 2024

tnitka and others added 8 commits January 30, 2024 10:34

Add Motif tutorial

8db7a27

Update docs to include motif

22f0a02

Add model from RFE as output in motif

4c4ad19

Update documentation for motif

9c47a7a

Fix formatting

aa1c11d

Add motif report output

6924e9e

docs: add demo pages for learn/apply and motif

9f72ba1

christinehc requested changes Mar 13, 2024

tnitka and others added 5 commits March 15, 2024 10:59

Fix formatting

4ff1b0a

Remove redundant code from motif result script

cdb04f7

move Motif tutorial into separate directory with more informative config parameters

4d86439

Add motif to README.md

a57cb02

Merge branch 'main' into functionmotifs

64f4fef

christinehc approved these changes Jun 5, 2024

tnitka removed the request for review from biodataganache June 5, 2024 20:30

tnitka merged commit c3b8d91 into main Jun 5, 2024
3 checks passed
