Feature request: partial monophyly constraints #1153

Open
seraklop opened this issue May 29, 2024 · 6 comments

Comments

@seraklop

Would it be possible to add partial monophyly constraints, that is, monophyly enforced only for a subset of taxa, while the rest are free to go anywhere in the tree? This can for instance be done in MrBayes with "constraint myconstraint partial = taxonlist1 : taxonlist2".

This would help considerably with datasets that include fossils, which often provide too little information to place them confidently enough for a hard monophyly constraint. One example is rooting constraints that apply only to extant taxa, while the fossils are left free to go anywhere.

Another example is, of course, heterogeneous datasets, where genomic data are available only for a few backbone taxa, while the information for the rest is too limited to enforce monophyly.

@rbouckaert
Member

rbouckaert commented May 30, 2024

Perhaps the MRCAPriorWithRogues is what you are looking for. It is in the BEASTLabs package. Using it requires a bit of XML editing. The easiest way is probably to set up an MRCA prior in BEAUti and specify the taxa in taxonlist1. Save the file, and open the XML in a text editor. Go to the MRCA prior, which looks something like this:

                <distribution id="taxonlist1.prior" spec="beast.base.evolution.tree.MRCAPrior" tree="@Tree.t:dna" monophyletic="true">
                    <taxonset id="taxonlist1" spec="TaxonSet">
                        <taxon id="Carp" spec="Taxon"/>
                        <taxon id="Chicken" spec="Taxon"/>
                        <taxon id="Cow" spec="Taxon"/>
                    </taxonset>
                </distribution>

Replace the spec attribute with beastlabs.math.distributions.MRCAPriorWithRogues and add the rogues in taxonlist2, so it looks like this:

                <distribution id="taxonlist1.prior" spec="beastlabs.math.distributions.MRCAPriorWithRogues" tree="@Tree.t:dna" monophyletic="true">
                    <taxonset id="taxonlist1" spec="TaxonSet">
                        <taxon id="Carp" spec="Taxon"/>
                        <taxon id="Chicken" spec="Taxon"/>
                        <taxon id="Cow" spec="Taxon"/>
                    </taxonset>
                    <rogues id="taxonlist2" spec="TaxonSet">
                        <taxon id="Dog" spec="Taxon"/>
                        <taxon id="Dolphin" spec="Taxon"/>
                        <taxon id="Duck" spec="Taxon"/>
                    </rogues>
                </distribution>

You need to have the BEASTLabs package installed to run the XML. Hope this is what you had in mind.
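
If the same edit has to be applied to several priors or files, the manual steps above (swap the spec attribute, then add a rogues element) could also be scripted. The following is only a rough sketch using Python's standard library; the helper name `add_rogues` and the embedded fragment are illustrative and not part of BEAST or BEASTLabs.

```python
import xml.etree.ElementTree as ET

# Spec string quoted in the comment above.
ROGUE_SPEC = "beastlabs.math.distributions.MRCAPriorWithRogues"


def add_rogues(distribution, rogue_ids, rogues_id="taxonlist2"):
    """Hypothetical helper: switch an MRCAPrior element to
    MRCAPriorWithRogues and append a <rogues> taxon set, in place."""
    distribution.set("spec", ROGUE_SPEC)
    rogues = ET.SubElement(
        distribution, "rogues", {"id": rogues_id, "spec": "TaxonSet"}
    )
    for taxon_id in rogue_ids:
        ET.SubElement(rogues, "taxon", {"id": taxon_id, "spec": "Taxon"})
    return distribution


# The MRCA prior from the example above, as BEAUti might write it.
FRAGMENT = """\
<distribution id="taxonlist1.prior" spec="beast.base.evolution.tree.MRCAPrior" tree="@Tree.t:dna" monophyletic="true">
    <taxonset id="taxonlist1" spec="TaxonSet">
        <taxon id="Carp" spec="Taxon"/>
        <taxon id="Chicken" spec="Taxon"/>
        <taxon id="Cow" spec="Taxon"/>
    </taxonset>
</distribution>"""

prior = add_rogues(ET.fromstring(FRAGMENT), ["Dog", "Dolphin", "Duck"])
print(ET.tostring(prior, encoding="unicode"))
```

For a complete BEAST XML file one would presumably load it with `ET.parse`, look up the prior by its id among `root.iter("distribution")`, and write the tree back out. Note that ElementTree drops XML comments by default, so it is safer to work on a copy of the file.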

@seraklop
Author

seraklop commented Jun 1, 2024

Dear Remco,

great, thanks so much! Of course that is exactly what we need.
However, now initialization apparently does not work anymore with the NJ starting tree. Is it possible that this function only works with hard constraints and not with such partial constraints?

Thanks a lot for your help,
Seraina

@rbouckaert
Member

Hi Seraina,

The ClusterTree indeed does not pick up such constraints, but the ConstrainedClusterTree does. Can you try replacing spec="ClusterTree" with spec="beastlabs.evolution.tree.ConstrainedClusterTree" in the XML element and see whether that starts?
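
For anyone who prefers to script this replacement, here is a minimal sketch in Python. The function name and the `<init>` fragment (including its id) are made up for illustration, and the other attributes a real starting-tree element carries are omitted; the function leaves them untouched. Whether the constraints also have to be passed to the element explicitly is not covered by this sketch.

```python
import xml.etree.ElementTree as ET

# Spec string quoted in the comment above.
CONSTRAINED_SPEC = "beastlabs.evolution.tree.ConstrainedClusterTree"


def use_constrained_cluster_tree(root):
    """Hypothetical helper: point every ClusterTree element at
    ConstrainedClusterTree and return the number of elements changed."""
    changed = 0
    for element in root.iter():
        spec = element.get("spec", "")
        # Match "ClusterTree" with or without a package prefix, but skip
        # elements that already use the constrained class.
        if spec.split(".")[-1] == "ClusterTree":
            element.set("spec", CONSTRAINED_SPEC)
            changed += 1
    return changed


# Placeholder fragment; a real <init> element has further attributes.
FRAGMENT = """\
<run>
    <init id="StartingTree.t:dna" spec="ClusterTree"/>
</run>"""

root = ET.fromstring(FRAGMENT)
n_changed = use_constrained_cluster_tree(root)
print(n_changed, root.find("init").get("spec"))
```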

Cheers, Remco

@AlexaViert

Dear Remco

Thank you for your help so far. Seraina and I tried the ConstrainedClusterTree, but now the analysis will not initialize because it detects negative branch lengths. Do you know what the issue could be? We have already tried different seeds.

Best,
Alexandra

@rbouckaert
Member

Hi Alexandra,

The ConstrainedClusterTree has a minimum branch length option (defaults to minBranchLength="1e-10") that should guarantee some minimum branch length. Obviously, this is not working for your analysis. If you can send me the XML I can have a look at what is causing the problem.

Have you tried using a RandomTree for initialisation? This is recommended in general in order to not bias the MCMC sample by a fixed starting point.

Cheers, Remco

@seraklop
Author

Dear Remco,
thanks so much for all your help - Alexandra is in the last weeks of her PhD, and your input already greatly helped us speed up the process!
We are now using both a random tree and a user tree as starting trees and comparing the outcomes. So we currently have a workaround for the issue with the ConstrainedClusterTree (it is a dataset with a large number of highly incomplete fossils and thus a lot of uncertainty in the topology, which makes convergence very slow, hence our need to use a good starting tree as well).
But if you would still like our input files to reproduce the issue, let us know. Otherwise, from our side, this issue can be closed.
Thanks again!
Seraina
