Version 5.6

@desilinguist released this 10 Jul 20:11

This is an important release: it contains a critical bugfix as well as several useful improvements.

Bugfixes

  • Fixed a critical bug in the computation of standardized mean differences (SMDs). The denominator for SMDs should use the population standard deviations, not the standard deviations computed over the subgroups themselves.
  • Added converters to the notebook header to allow correct treatment of candidate IDs with leading zeros.
  • Modified the test utility functions to catch discrepancies caused by missing leading zeros.
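
The SMD fix above can be illustrated with a minimal sketch. It is not the RSMTool implementation: the function name `smd`, the toy scores, and the choice to average the two variances in the denominator are assumptions made for illustration, and "population" is taken here to mean the full dataset rather than the subgroup. The point is only that the denominator comes from all responses while the means come from the subgroup.

```python
from math import sqrt
from statistics import mean, pstdev


def smd(human_sub, system_sub, human_all, system_all):
    """Hypothetical helper: SMD for one subgroup, scaled by full-dataset SDs."""
    # Assumption: the denominator averages the two full-dataset variances.
    denominator = sqrt((pstdev(human_all) ** 2 + pstdev(system_all) ** 2) / 2)
    # The numerator uses the subgroup means only.
    return (mean(system_sub) - mean(human_sub)) / denominator


# Toy scores for six responses; the first three form the subgroup.
human_all = [1, 2, 3, 4, 5, 6]
system_all = [2, 2, 3, 4, 5, 5]
human_sub = human_all[:3]
system_sub = system_all[:3]

value = smd(human_sub, system_sub, human_all, system_all)
print(round(value, 3))  # 0.222
```

Using the subgroup's own standard deviations in the denominator would make the SMD depend on how spread out each subgroup happens to be, which makes values hard to compare across subgroups.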

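The leading-zero problem in candidate IDs can be reproduced with pandas alone. The sketch below is an illustration, not the notebook header code: the column name `candidate` and the CSV contents are made up, and it assumes the data is read with `pandas.read_csv`.

```python
import io

import pandas as pd

# Made-up data: candidate IDs that start with zeros.
csv_text = "candidate,score\n00123,3\n04567,4\n"

# Default type inference parses the IDs as integers and drops the zeros.
inferred = pd.read_csv(io.StringIO(csv_text))

# A converter keeps the column as strings, so the zeros survive.
preserved = pd.read_csv(io.StringIO(csv_text), converters={"candidate": str})

print(inferred["candidate"].tolist())   # [123, 4567]
print(preserved["candidate"].tolist())  # ['00123', '04567']
```

Without the converter, an ID such as `00123` no longer matches the same ID stored as a string elsewhere, which is the kind of discrepancy the updated test utilities are meant to catch.
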
Improvements

  • The tables generated by rsmsummarize are now saved in the same way as for other tools.
  • rsmsummarize now shows a table with standardized coefficients for all models.
  • The predictions for the post-processed training set are now also saved.
  • Added a new notebook that shows differential feature functioning (DFF) plots by subgroup. To use it, add dff_by_group to the general_sections configuration option.
  • The features that have not been used in the model are now excluded from the datasets before they are sent to SKLL for prediction. This makes the prediction step much faster for large datasets.
  • When testing whether the standard deviation of a feature in the training set is zero, the tolerance was previously set to 1e-06. This is too loose for features with very small values (these can result from an inverse transform of acoustic likelihoods, which are logs of very small values). The tolerance has now been tightened to 1e-07.
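
The effect of the tolerance change on the zero-standard-deviation check can be seen in a small sketch. The helper `has_zero_std` and the feature values are hypothetical, and the check is written with the standard library rather than with whatever RSMTool uses internally.

```python
from math import isclose
from statistics import pstdev


def has_zero_std(values, tolerance):
    """Hypothetical check: is the standard deviation zero within `tolerance`?"""
    return isclose(pstdev(values), 0.0, abs_tol=tolerance)


# A feature with very small but clearly varying values.
tiny_feature = [2e-07, 5e-07, 8e-07]
# A feature that really is constant.
constant_feature = [0.5, 0.5, 0.5]

print(has_zero_std(tiny_feature, 1e-06))      # True: wrongly treated as constant
print(has_zero_std(tiny_feature, 1e-07))      # False: variation is detected
print(has_zero_std(constant_feature, 1e-07))  # True
```

The standard deviation of the small-valued feature is roughly 2.4e-07, which falls under the old threshold but above the new one.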

Other Minor Changes

  • Updated the utility script update_skll_model.py so that it can be used with other tools.
  • Updated tests and documentation.