add support for matching names to phylogeny tips provided by Upham et al. 2019 #158

Open
jhpoelen opened this issue Jun 21, 2023 · 7 comments
Labels
enhancement New feature or request

Comments

@jhpoelen
Member

As discussed with @ajacsherman et al., we'd like to add support for matching names against phylogeny tips as published by:

Upham NS, Esselstyn JA, Jetz W. Inferring the mammal tree: Species-level sets of phylogenies for questions in ecology, evolution, and conservation. PLoS Biol. 2019 Dec 4;17(12):e3000494. doi: 10.1371/journal.pbio.3000494. PMID: 31800571; PMCID: PMC6892540.

@jhpoelen
Member Author

Data supplements published via:

Upham, Nathan S.; Esselstyn, Jacob A.; Jetz, Walter (2019), Inferring the mammal tree: Species-level sets of phylogenies for questions in ecology, evolution, and conservation, Dryad, Dataset, https://doi.org/10.5061/dryad.tb03d03

with a 4 GB zip file containing:

$ unzip -l doi_10.5061_dryad.tb03d03__v4.zip
Archive:  doi_10.5061_dryad.tb03d03__v4.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
 26162477  2023-06-21 07:48   Data_S6_patchClade_runfiles.zip
  6920653  2023-06-21 07:48   Data_S2_geneTree_files.zip
  8195052  2023-06-21 07:48   Data_S3_globalRAxML_files.zip
  4359516  2023-06-21 07:48   Data_S1_geneChecking_and_masterTaxonomy.zip
636119340  2023-06-21 07:48   Data_S8_finalFigureFiles.zip
  4520185  2023-06-21 07:48   Data_S4_patchClade_results_and_MCC.zip
3712197364  2023-06-21 07:48   Data_S7_Mammalia_credibleTreeSets_tipDR.zip
  1831991  2023-06-21 07:48   Data_S5_backboneDating_runfiles_and_MCC.zip
---------                     -------
4400306578                     8 files

with Data_S7_Mammalia_credibleTreeSets_tipDR.zip containing:

$ unzip -l Data_S7_Mammalia_credibleTreeSets_tipDR.zip 
Archive:  Data_S7_Mammalia_credibleTreeSets_tipDR.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
        0  2019-09-25 15:32   Data_S7_Mammalia_credibleTreeSets_tipDR/
1273572508  2019-07-14 23:13   Data_S7_Mammalia_credibleTreeSets_tipDR/MamPhy_fullPosterior_BDvr_DNAonly_4098sp_topoFree_NDexp_all10k_v2_nexus.trees
        0  2019-09-25 15:22   Data_S7_Mammalia_credibleTreeSets_tipDR/DNAonly_MCCs/
  3090277  2019-07-15 08:40   Data_S7_Mammalia_credibleTreeSets_tipDR/DNAonly_MCCs/MamPhy_fullPosterior_BDvr_DNAonly_4098sp_topoFree_NDexp_MCC_v2_target.tre
        0  2019-09-25 15:40   __MACOSX/
        0  2019-09-25 15:40   __MACOSX/Data_S7_Mammalia_credibleTreeSets_tipDR/
        0  2019-09-25 15:40   __MACOSX/Data_S7_Mammalia_credibleTreeSets_tipDR/DNAonly_MCCs/
      220  2019-07-15 08:40   __MACOSX/Data_S7_Mammalia_credibleTreeSets_tipDR/DNAonly_MCCs/._MamPhy_fullPosterior_BDvr_DNAonly_4098sp_topoFree_NDexp_MCC_v2_target.tre
  3144609  2019-07-15 09:25   Data_S7_Mammalia_credibleTreeSets_tipDR/DNAonly_MCCs/MamPhy_fullPosterior_BDvr_DNAonly_4098sp_topoFree_FBDasZhouEtAl_MCC_v2_target.tre
      220  2019-07-15 09:25   __MACOSX/Data_S7_Mammalia_credibleTreeSets_tipDR/DNAonly_MCCs/._MamPhy_fullPosterior_BDvr_DNAonly_4098sp_topoFree_FBDasZhouEtAl_MCC_v2_target.tre
     6148  2019-09-25 15:22   Data_S7_Mammalia_credibleTreeSets_tipDR/DNAonly_MCCs/.DS_Store
      120  2019-09-25 15:22   __MACOSX/Data_S7_Mammalia_credibleTreeSets_tipDR/DNAonly_MCCs/._.DS_Store
   351745  2019-09-24 22:50   Data_S7_Mammalia_credibleTreeSets_tipDR/DNAonly_MCCs/MamPhy_fullPosterior_BDvr_DNAonly_4098sp_topoFree_FBDasZhouEtAl_MCC_v2_PLOTTED.pdf
      177  2019-09-24 22:50   __MACOSX/Data_S7_Mammalia_credibleTreeSets_tipDR/DNAonly_MCCs/._MamPhy_fullPosterior_BDvr_DNAonly_4098sp_topoFree_FBDasZhouEtAl_MCC_v2_PLOTTED.pdf
   350503  2019-09-24 21:54   Data_S7_Mammalia_credibleTreeSets_tipDR/DNAonly_MCCs/MamPhy_fullPosterior_BDvr_DNAonly_4098sp_topoFree_NDexp_MCC_v2_PLOTTED.pdf
      233  2019-09-24 21:54   __MACOSX/Data_S7_Mammalia_credibleTreeSets_tipDR/DNAonly_MCCs/._MamPhy_fullPosterior_BDvr_DNAonly_4098sp_topoFree_NDexp_MCC_v2_PLOTTED.pdf
    10244  2019-09-25 15:32   Data_S7_Mammalia_credibleTreeSets_tipDR/.DS_Store
      120  2019-09-25 15:32   __MACOSX/Data_S7_Mammalia_credibleTreeSets_tipDR/._.DS_Store
1852405303  2019-07-15 00:19   Data_S7_Mammalia_credibleTreeSets_tipDR/MamPhy_fullPosterior_BDvr_Completed_5911sp_topoCons_NDexp_all10k_v2_nexus.trees
1881543473  2019-07-15 11:07   Data_S7_Mammalia_credibleTreeSets_tipDR/MamPhy_fullPosterior_BDvr_Completed_5911sp_topoCons_FBDasZhouEtAl_all10k_v2_nexus.trees
1298801361  2019-07-14 23:08   Data_S7_Mammalia_credibleTreeSets_tipDR/MamPhy_fullPosterior_BDvr_DNAonly_4098sp_topoFree_FBDasZhouEtAl_all10k_v2_nexus.trees
        0  2019-09-25 15:32   Data_S7_Mammalia_credibleTreeSets_tipDR/Completed_tipDR_all10k/
  1362126  2019-09-24 08:38   Data_S7_Mammalia_credibleTreeSets_tipDR/Completed_tipDR_all10k/DR-SUMMARY_MamPhy_BDvr_Completed_5911sp_topoCons_FBDasZhouEtAl_all10k_v2_expanded.txt
        0  2019-09-25 15:46   __MACOSX/Data_S7_Mammalia_credibleTreeSets_tipDR/Completed_tipDR_all10k/
      176  2019-09-24 08:38   __MACOSX/Data_S7_Mammalia_credibleTreeSets_tipDR/Completed_tipDR_all10k/._DR-SUMMARY_MamPhy_BDvr_Completed_5911sp_topoCons_FBDasZhouEtAl_all10k_v2_expanded.txt
1065782805  2019-09-23 19:13   Data_S7_Mammalia_credibleTreeSets_tipDR/Completed_tipDR_all10k/DR-matrix_MamPhy_BDvr_Completed_5911sp_topoCons_NDexp_all10k_v2.txt
1062985577  2019-09-24 02:51   Data_S7_Mammalia_credibleTreeSets_tipDR/Completed_tipDR_all10k/DR-matrix_MamPhy_BDvr_Completed_5911sp_topoCons_FBDasZhouEtAl_all10k_v2_prune5911.txt
      176  2019-09-24 02:51   __MACOSX/Data_S7_Mammalia_credibleTreeSets_tipDR/Completed_tipDR_all10k/._DR-matrix_MamPhy_BDvr_Completed_5911sp_topoCons_FBDasZhouEtAl_all10k_v2_prune5911.txt
  1368273  2019-09-23 21:57   Data_S7_Mammalia_credibleTreeSets_tipDR/Completed_tipDR_all10k/DR-SUMMARY_MamPhy_BDvr_Completed_5911sp_topoCons_NDexp_all10k_v2_expanded.txt
      176  2019-09-23 21:57   __MACOSX/Data_S7_Mammalia_credibleTreeSets_tipDR/Completed_tipDR_all10k/._DR-SUMMARY_MamPhy_BDvr_Completed_5911sp_topoCons_NDexp_all10k_v2_expanded.txt
---------                     -------
8444776570                     30 files

of which the following files appear to contain NEXUS trees:

$ unzip -l Data_S7_Mammalia_credibleTreeSets_tipDR.zip  | grep nexus
1273572508  2019-07-14 23:13   Data_S7_Mammalia_credibleTreeSets_tipDR/MamPhy_fullPosterior_BDvr_DNAonly_4098sp_topoFree_NDexp_all10k_v2_nexus.trees
1852405303  2019-07-15 00:19   Data_S7_Mammalia_credibleTreeSets_tipDR/MamPhy_fullPosterior_BDvr_Completed_5911sp_topoCons_NDexp_all10k_v2_nexus.trees
1881543473  2019-07-15 11:07   Data_S7_Mammalia_credibleTreeSets_tipDR/MamPhy_fullPosterior_BDvr_Completed_5911sp_topoCons_FBDasZhouEtAl_all10k_v2_nexus.trees
1298801361  2019-07-14 23:08   Data_S7_Mammalia_credibleTreeSets_tipDR/MamPhy_fullPosterior_BDvr_DNAonly_4098sp_topoFree_FBDasZhouEtAl_all10k_v2_nexus.trees
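
To see what the tip labels look like before deciding which resource to match against, a tree string can be scanned for its tips. This is a minimal sketch, assuming the trees inside these files are written as Newick strings with unquoted labels; the example tree and its labels are made up, and if a file uses a NEXUS Translate block the tips in the tree strings would be numbers that still need mapping back to names.

```python
import re


def newick_tip_labels(newick: str) -> list[str]:
    """Return the tip labels of one Newick string (unquoted labels only)."""
    # Drop bracketed comments/annotations such as [&rate=0.1,height=2] first,
    # because the commas inside them would otherwise look like tip separators.
    stripped = re.sub(r"\[[^\]]*\]", "", newick)
    # A tip label directly follows "(" or ","; a label after ")" belongs to
    # an internal node and is skipped.
    return re.findall(r"[(,]\s*([^\s(),:;]+)", stripped)


# Made-up tree for illustration; not taken from the Upham et al. files.
EXAMPLE_TREE = (
    "((Genus_alpha_FAMILYA_ORDERA:1.0,Genus_beta_FAMILYA_ORDERA:2.0):0.5,"
    "Genus_gamma_FAMILYB_ORDERB:3.0);"
)

if __name__ == "__main__":
    for label in newick_tip_labels(EXAMPLE_TREE):
        print(label)
```

The `*_nexus.trees` files are each over 1 GB, so in practice it is probably better to read them line by line and pass single tree strings to a function like this than to load a whole file into memory.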

@n8upham - which resource should I use to have nomer map taxonomic names to their equivalent phylogenetic trees?

@myrmoteras

There is an effort among GBIF and phylogeny specialists, and Open Tree of Life, to do this and make them accessible for use in GBIF and beyond:

  • be able to have a specimen and plot it on a tree
  • to project a tree onto a map based on the specimens used in the phylogeny

@jhpoelen
Member Author

@myrmoteras thanks for sharing that GBIF and phylogeny specialists are working on linking specimens to their associated phylogenies. Can you point to the methods they use or intend to use? Who's working on it? Where do they keep their source code?

@jhpoelen
Member Author

@n8upham pointed to

https://github.com/n8upham/MamPhy_v1/blob/master/_DATA/taxonomy_mamPhy_5911species_toPublish.csv

to use for taxonomic alignment with Upham et al. 2019 mammal phylogeny.

@jhpoelen
Member Author

jhpoelen commented Jun 28, 2023

@n8upham

n8upham commented Aug 30, 2023

More details on how to align the taxonomy file from the Mammalia phylogeny of Upham et al. 2019 (MamPhy v1.0) to the Bat Taxonomic Alignment (BTA):

  1. Go to this file: https://github.com/n8upham/MamPhy_v1/blob/master/_DATA/taxonomy_mamPhy_5911species_toPublish.csv

  2. Recommend doing the following to add this taxonomy to the BTA:

  • subset by "ord" = "CHIROPTERA"

  • do an automated match to the BTA (I would use "left_join()" in the R dplyr package, but there are many ways to do this)

  • Keep all columns of the MamPhy taxonomy -- it is the "tiplabel" column you will need to interact with the phylogenies themselves

  • for those names that don't match

    • In BTA, not MamPhy
      • add entry for which BTA species (of which taxonomy) that species is likely represented by in the MamPhy phylogeny
      • can use the "MSW3_sciName_matched" column in the MamPhy taxonomy to assess which MamPhy name matches MSW3 (or whether the name has changed since MSW3)
    • In MamPhy, not BTA
      • add row in the BTA (but perhaps this case doesn't exist?)
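
The steps above could be sketched roughly as follows, here in Python with only the standard library rather than dplyr. The column names "ord", "tiplabel" and "MSW3_sciName_matched" come from the description above; the "sciName" join column, the underscore-separated name format, the BTA name list and every data row are assumptions made up for illustration, not taken from the real files.

```python
import csv
import io

# Made-up stand-in for taxonomy_mamPhy_5911species_toPublish.csv.
# "sciName" is an assumed name column; all rows are hypothetical.
MAMPHY_CSV = """sciName,tiplabel,ord,MSW3_sciName_matched
Genus_alpha,Genus_alpha_FAMILYA_CHIROPTERA,CHIROPTERA,Genus_alpha
Genus_beta,Genus_beta_FAMILYA_CHIROPTERA,CHIROPTERA,Genus_beta
Genus_gamma,Genus_gamma_FAMILYB_RODENTIA,RODENTIA,Genus_gamma
"""

# Made-up stand-in for the list of names in the BTA.
BTA_NAMES = ["Genus_alpha", "Genus_delta"]


def read_rows(csv_text):
    """Parse CSV text into a list of dicts, keeping every column."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def align(mamphy_rows, bta_names, name_col="sciName"):
    """Left-join BTA names to the bat subset of the MamPhy taxonomy."""
    # Step 1: subset by "ord" = "CHIROPTERA".
    bats = [row for row in mamphy_rows if row["ord"] == "CHIROPTERA"]
    by_name = {row[name_col]: row for row in bats}
    # Step 2: left join. Every BTA name is kept; the MamPhy row (with all of
    # its columns, including "tiplabel") is None when there is no exact match.
    joined = [(name, by_name.get(name)) for name in bta_names]
    # Step 3: names that need manual review, in both directions.
    in_bta_not_mamphy = [name for name, row in joined if row is None]
    in_mamphy_not_bta = sorted(set(by_name) - set(bta_names))
    return joined, in_bta_not_mamphy, in_mamphy_not_bta


if __name__ == "__main__":
    joined, bta_only, mamphy_only = align(read_rows(MAMPHY_CSV), BTA_NAMES)
    for name, row in joined:
        print(name, "->", row["tiplabel"] if row else "no match")
    print("In BTA, not MamPhy:", bta_only)
    print("In MamPhy, not BTA:", mamphy_only)
```

Exact string matching is only a first pass; the two unmatched lists are where the manual review described above, including the "MSW3_sciName_matched" check, would come in.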

@ajacsherman

ajacsherman commented Sep 1, 2023 via email

@jhpoelen jhpoelen added the enhancement New feature or request label Jun 4, 2024