Releases: tskit-dev/tsdate
0.2.4
0.2.3
Jun 7th 2025
Breaking changes
- All returned nodes are now forced to be at least epsilon apart, rather than
allowing some to be the next allowable floating point number. This allows room
for mutations. Epsilon has been changed to 1e-10 from 1e-6, to minimise the
impact, as this has the potential to (marginally) change the dates of internal
(fixed) sample nodes.
- Multiple mutations at the same site above the same node are now spaced out evenly
in time, rather than placed at identical times.
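
The two spacing rules above can be illustrated with a small sketch. This is not tsdate's implementation: the helper names `enforce_min_gap` and `space_mutations` are hypothetical, and the only value taken from these notes is the epsilon of 1e-10.

```python
EPSILON = 1e-10  # minimum gap quoted in the 0.2.3 notes


def enforce_min_gap(child_time, parent_time, eps=EPSILON):
    """Return a parent time at least eps older than the child (hypothetical helper)."""
    return max(parent_time, child_time + eps)


def space_mutations(child_time, parent_time, count):
    """Place count mutations evenly inside the open interval (child_time, parent_time)."""
    step = (parent_time - child_time) / (count + 1)
    return [child_time + step * (i + 1) for i in range(count)]
```

With a gap guaranteed on every branch, several mutations above the same node can each receive a distinct time strictly between the child and the parent.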
0.2.2
Features
- An allow_unary flag (False by default) has been added to all methods.
- A set_metadata flag has been added so that node and mutation metadata can be
omitted, saved (default), or overwritten even if this requires changing the schema.
- An environment variable TSDATE_ENABLE_NUMBA_CACHE can be set to cache JIT
compiled code, speeding up loading time (useful when testing).
- The time taken for running tsdate is now recorded in the provenance data.
- Historical samples (sample nodes with time > 0) are now allowed in the
tree sequence.
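
As a minimal sketch of the caching option in the list above, the variable can be set from Python before tsdate is loaded. The value "1" and the requirement to set it before the import are assumptions, not taken from these notes, so check the tsdate documentation for the accepted values.

```python
import os

# Assumption: a value such as "1" enables the cache; the notes do not state
# which values are accepted.
os.environ["TSDATE_ENABLE_NUMBA_CACHE"] = "1"

# Assumption: the variable should be set before tsdate is first imported,
# so the import would follow here.
# import tsdate
```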
Documentation
- Various fixes in documentation, including documenting returned fits.
Breaking changes
- The return_posteriors argument has been removed and replaced with return_fit.
An instance of one of two previously internal classes, ExpectationPropagation
and BeliefPropagation, is now returned when return_fit=True, and posteriors can
be obtained using fit.node_posteriors().
- Topology-only dating (setting mutation_rate=None) has been removed for tree sequences
of more than one tree, as tests have found that span-weighting the conditional coalescent
causes substantial bias.
- The trim_telomeres parameter in the tsdate.preprocess_ts() function has been renamed
to erase_flanks, to match tsinfer.preprocess(). The previous name is kept as a
deprecated alias.
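
The renaming pattern in the last item (a new parameter name with the old one kept as a deprecated alias) can be sketched as follows. The function `preprocess_sketch` is a hypothetical stand-in, not tsdate's code, and both the warning category and the default value are assumptions made for illustration.

```python
import warnings


def preprocess_sketch(ts, *, erase_flanks=None, trim_telomeres=None):
    """Hypothetical stand-in showing how a deprecated alias can be resolved."""
    if trim_telomeres is not None:
        warnings.warn(
            "trim_telomeres is deprecated; use erase_flanks instead",
            FutureWarning,
            stacklevel=2,
        )
        if erase_flanks is not None:
            raise ValueError("Pass only one of erase_flanks and trim_telomeres")
        erase_flanks = trim_telomeres
    if erase_flanks is None:
        erase_flanks = True  # assumed default, for illustration only
    return erase_flanks
```

Code that still passes the old name keeps working but emits a warning, which gives callers time to switch to erase_flanks.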
0.2.1
Bugfixes
- Minor bug fixed with final step of algorithm (path rescaling).
Features
- Initial support for dating with unphased (or poorly phased) singleton
mutations via the singletons_phased=False option. The API is preliminary and
may change.
Documentation
- Fixed description of priors for variational gamma method, which were referred
to as 'flat' or improper but are actually empirical Bayes priors on root node ages,
fit by expectation maximization.
0.2.0
Bugfixes
- Variational gamma uses a rescaling approach which helps considerably if e.g.
population sizes vary over time.
- Variational gamma does not use mutational area of branches, but average path
length, which reduces bias in tree sequences containing polytomies.
Breaking changes
- The default method has been changed to variational_gamma.
- Variational gamma uses an improper (flat) prior, and therefore
no longer needs population_size specifying.
- The standalone preprocess_ts function also applies the split_disjoint_nodes
method, which creates extra nodes but improves dating accuracy.
- JSON metadata for mean time and variance in the mutation and node tables is now saved
with a suitable schema. This means json.loads() is no longer needed to read it.
- The mutation_rate and population_size parameters are now keyword-only, and
therefore these parameter names need to be explicitly typed out.
- The ignore-oldest option has been removed from the command-line interface,
as it is no longer very helpful with new tsinfer output, which has the root
node split. The option is still accessible from the Python API.
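
To show what keyword-only means in practice, here is a minimal sketch. The function `date_sketch` and its placeholder argument are hypothetical and only mimic the shape of the change; this is not tsdate's signature.

```python
def date_sketch(tree_sequence, *, mutation_rate, population_size=None):
    """Hypothetical signature: everything after the bare * must be passed by name."""
    return {"mutation_rate": mutation_rate, "population_size": population_size}


# Passing the rate by name works.
result = date_sketch("ts-placeholder", mutation_rate=1e-8)

# Passing it positionally raises a TypeError.
try:
    date_sketch("ts-placeholder", 1e-8)
except TypeError as err:
    positional_error = str(err)
```

Existing calls that pass these values positionally therefore need the parameter names added.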
0.1.7
[0.1.7] - 2024-01-11
Bugfixes
- In variational gamma, rescale messages at end of each iteration to avoid numerical
instability.
0.1.6
[0.1.6] - 2024-01-07
Breaking changes
- The standalone preprocess_ts function now defaults to not removing unreferenced
individuals, populations, or sites, aiming to change the tree sequence tables as
little as possible.
- get_dates (previously undocumented) has been removed, as posteriors can be
obtained using return_posteriors. The normalize terminology previously used
in get_dates is changed to standardize to better reflect the fact that the
maximum (not sum) is one, and exposed via the outside_standardize parameter.
- The Ne argument to date has been deprecated (although it is
still in the API for backward compatibility). The equivalent argument
population_size should be used instead.
- The CLI -verbosity flag no longer takes a number, but uses
action="count", so -v turns verbosity to INFO level,
whereas -vv turns verbosity to DEBUG level.
- The return_posteriors=True option with method="inside_outside"
previously returned a dict that included keys start_time and end_time,
giving the impression that the posterior for node age is discretized over
time slices in this algorithm. In actuality, the posterior is discretized
atomically over time points, so start_time and end_time have been
replaced by a single key time.
- The return_posteriors=True option with method="maximization" is no
longer accepted (previously simply returned None).
- Python 3.7 is no longer supported.
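
The verbosity change in the list above relies on the standard argparse count action, which can be sketched as follows. The parser and the `log_level` helper are illustrative rather than tsdate's CLI code, and the WARNING default for zero flags is an assumption.

```python
import argparse
import logging

parser = argparse.ArgumentParser(prog="tsdate-sketch")
# action="count" increments the value once per occurrence of the flag.
parser.add_argument("-v", "--verbosity", action="count", default=0)


def log_level(count):
    """Map the number of -v flags to a logging level."""
    if count >= 2:
        return logging.DEBUG
    if count == 1:
        return logging.INFO
    return logging.WARNING  # assumed default when no -v is given
```

Parsing `["-vv"]` gives a count of 2, which this sketch maps to DEBUG, while a single `-v` maps to INFO.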
Features
- A new continuous-time method, "variational_gamma" has been introduced, which
uses an iterative expectation propagation approach. Tests show this increases
accuracy, especially at older times. A Laplace approximation and damping are
used to ensure numerical stability. After testing, the node priors used in this
method are based on a global mixture prior, which can be refined during iteration.
Future releases may switch to using this as the default method.
- Priors may be calculated using a piecewise-constant effective population trajectory,
which is implemented in the demography.PopulationSizeHistory class. The
population_size argument to date accepts either a single scalar effective
population size, or a PopulationSizeHistory instance.
- Added support and wheels for Python 3.11.
- The .date() function is now a wrapper for the individual dating methods
(accessible using tsdate.core.dating_methods), which can be called independently
(e.g. tsdate.inside_outside(...)). This makes it easier to document method-specific
options. The API docs have been revised accordingly. Provenance is now saved with the
name of the method used as the called command, rather than "command": "date".
- Major re-write of documentation (now at https://tskit.dev/tsdate/docs/), to use the
standard tskit jupyterbook framework.
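
The idea of a piecewise-constant population size trajectory, mentioned in the feature list above, can be sketched with a simple lookup. The class `PiecewiseConstantSize` is a hypothetical stand-in and does not reproduce the interface of tsdate's PopulationSizeHistory; the half-open epoch convention is also an assumption.

```python
from bisect import bisect_right


class PiecewiseConstantSize:
    """Hypothetical stand-in, not tsdate's PopulationSizeHistory."""

    def __init__(self, sizes, breakpoints):
        # sizes[0] applies from time 0 up to breakpoints[0]; sizes[i] applies
        # from breakpoints[i - 1] (inclusive) up to breakpoints[i] (exclusive).
        if len(sizes) != len(breakpoints) + 1:
            raise ValueError("Need exactly one more size than breakpoints")
        if list(breakpoints) != sorted(breakpoints):
            raise ValueError("Breakpoints must be in ascending order")
        self.sizes = list(sizes)
        self.breakpoints = list(breakpoints)

    def size_at(self, time):
        """Return the population size in force at the given time."""
        return self.sizes[bisect_right(self.breakpoints, time)]


history = PiecewiseConstantSize(sizes=[1000, 5000], breakpoints=[100])
```

Here `history.size_at(50)` falls in the first epoch and `history.size_at(100)` in the second.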
Bugfixes
- The returned posteriors when return_posteriors=True now return actual
probabilities (scaled so that they sum to one) rather than standardized
"probabilities" whose maximum value is one.
- The population size is saved in provenance metadata (as a dictionary if
it is a PopulationSizeHistory instance).
- preprocess_ts always records provenance as being from the preprocess_ts
command, even if no gaps are removed. The command now has a (rarely used)
delete_intervals parameter, which is normally filled out and saved in provenance
(as it was before). If no gap deletion is done, the param is saved as [].
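
The difference between the two scalings named in these 0.1.6 notes (standardized values whose maximum is one, versus probabilities that sum to one) is shown in the sketch below. The function names are illustrative and are not part of tsdate.

```python
def standardize(weights):
    """Scale the values so that the maximum is one."""
    top = max(weights)
    return [w / top for w in weights]


def to_probabilities(weights):
    """Scale the values so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]
```

Both keep the relative sizes of the values, but only the second result can be read as a probability distribution over time points.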
0.1.5 - Minor release
Changelog:
- Added the time_units parameter to tsdate.date, allowing users to specify
the time units of the dated tree sequence. Default is "generations".
- Added the return_posteriors parameter to tsdate.date. If True, the function
returns a tuple of (dated_ts, posteriors).
- mutation_rate is now a required argument in tsdate.date and tsdate.get_dates.
- tsdate returns an error if users attempt to date an unsimplified tree sequence.
- Updated tsdate citation information to cite the recent Science paper.
- Support for Python 3.10.
Minor feature and bugfix release
Breaking changes
- The algorithm now operates in unscaled time (in units of generations) under
the hood, which means that tsdate.build_prior_grid now requires the parameter Ne.
Features
- Users now have access to the marginal posterior distributions on node age by running
tsdate.get_dates, though this is undocumented for now.
Bugfixes
- A fix to the way likelihoods are added should eliminate numerical errors that are
sometimes encountered when dating very large tree sequences.
Support for non-contemporaneous samples, preprocessing tree sequences
Features
- Two new methods, tsdate.sites_time_from_ts and tsdate.add_sampledata_times,
support inference of tree sequences from non-contemporaneous samples.
- New tutorial on inferring tree sequences from modern and historic/ancient samples
explains how to use these functions in conjunction with tsinfer.
- tsdate.preprocess_ts supports dating inferred tree sequences which include large,
uninformative stretches (i.e. centromeres and telomeres). Simply run this function
on the tree sequence before dating it.
- ignore_outside is a new parameter in the outside pass which tells tsdate to
ignore edges from the oldest root (these edges are often of low quality in tsinfer
inferred tree sequences).
- Development environment is now equivalent to other tskit-dev projects.