Release v3.4.0 · opensanctions/yente

This release completely reworks how the OpenSanctions API scores results in the /match API.

Until now, the API has used a simple statistical model to assign a match quality score to each result it returns. With yente 3.4, we've made that mechanism more flexible: clients can now select one of a set of supported algorithms to optimise the behaviour of the API for their use case.
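As a rough sketch of what selecting an algorithm could look like from a client, the snippet below builds (but does not send) a /match request. The query parameter names `algorithm` and `threshold`, the `default` dataset path segment, the payload shape, the local base URL and the helper `build_match_request` are all assumptions for illustration and are not spelled out in these notes; please check the API documentation of your yente instance before relying on them. The algorithm name and threshold are taken from the list further down.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_match_request(base_url, dataset, name, algorithm, threshold):
    """Build a POST request for the /match API (hypothetical helper).

    Assumed, not confirmed by these notes: the `algorithm` and `threshold`
    query parameters, the dataset path segment and the payload layout.
    """
    params = urlencode({"algorithm": algorithm, "threshold": threshold})
    url = f"{base_url}/match/{dataset}?{params}"
    payload = {
        "queries": {
            "q1": {"schema": "Person", "properties": {"name": [name]}},
        }
    }
    return Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Example: ask for regression-v2 with a lowered threshold.
request = build_match_request(
    "http://localhost:8000", "default", "Jane Doe", "regression-v2", 0.6
)
print(request.full_url)
```

The request object can then be passed to `urllib.request.urlopen` (or rebuilt with any HTTP client you prefer) once the parameter names have been verified.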

This release adds three new scoring systems to augment the existing model (now called regression-v1):

regression-v2 is a new statistical model for matching people and companies. Unlike regression-v1, it uses pronunciation-based (phonetic/Soundex) comparison for entity names, and it reduces the impact of birth dates as a decision criterion. The new model will generally produce much lower scores for results, so you may want to reduce your matching threshold parameter in the API to 0.5 or 0.6.
name-based is a simple scoring mechanism based on name similarity only. It uses two criteria: the Jaro-Winkler string distance and the Soundex phonetic algorithm. This is useful for matching data where you only have entity names and no other details such as birth dates, nationalities, etc.
name-qualified uses the score from the name-based mechanism but then considers other criteria, such as birth dates, nationalities, tax and registration identifiers. If any of these mismatch between the query and the result, the score is lowered. This attempts to anticipate a simple review process that a human analyst might otherwise undertake when a result is found.
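To make the idea behind name-based and name-qualified more concrete, here is a minimal, self-contained sketch. It is not the code shipped in yente: the way the two name criteria are combined (`jw_weight`), the size of the mismatch `penalty`, the property keys and the function names are all hypothetical choices made for illustration. Only the ingredients (Jaro-Winkler, Soundex, lowering the score on mismatched details) come from the descriptions above.

```python
SOUNDEX_CODES = {
    ch: digit
    for letters, digit in (
        ("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
        ("L", "4"), ("MN", "5"), ("R", "6"),
    )
    for ch in letters
}


def soundex(word):
    """Classic four-character Soundex code, e.g. 'Robert' -> 'R163'."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    encoded = letters[0]
    prev = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = SOUNDEX_CODES.get(ch, "")
        if digit and digit != prev:
            encoded += digit
        if ch not in "HW":  # H and W do not separate repeated codes
            prev = digit
    return (encoded + "000")[:4]


def jaro(a, b):
    """Jaro similarity between two strings, in the range 0..1."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_hit, b_hit = [False] * len(a), [False] * len(b)
    matches = 0
    for i, ch in enumerate(a):
        for j in range(max(0, i - window), min(len(b), i + window + 1)):
            if not b_hit[j] and b[j] == ch:
                a_hit[i] = b_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    a_seq = [c for c, hit in zip(a, a_hit) if hit]
    b_seq = [c for c, hit in zip(b, b_hit) if hit]
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) / 2
    return (matches / len(a) + matches / len(b)
            + (matches - transpositions) / matches) / 3


def jaro_winkler(a, b, scaling=0.1):
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    score = jaro(a, b)
    prefix = 0
    for x, y in zip(a[:4], b[:4]):
        if x != y:
            break
        prefix += 1
    return score + prefix * scaling * (1 - score)


def name_based(query_name, result_name, jw_weight=0.8):
    """Hypothetical blend of Jaro-Winkler and Soundex token overlap."""
    q, r = query_name.lower().strip(), result_name.lower().strip()
    if not q or not r:
        return 0.0
    q_codes = {soundex(token) for token in q.split()}
    r_codes = {soundex(token) for token in r.split()}
    phonetic = len(q_codes & r_codes) / max(len(q_codes | r_codes), 1)
    return jw_weight * jaro_winkler(q, r) + (1 - jw_weight) * phonetic


def name_qualified(query, result, penalty=0.2):
    """Start from the name score, subtract a penalty per mismatched detail."""
    score = name_based(query["name"], result["name"])
    # Property keys are illustrative, not yente's actual field names.
    for key in ("birth_date", "nationality", "tax_number", "registration_number"):
        q_val, r_val = query.get(key), result.get(key)
        if q_val and r_val and q_val != r_val:
            score -= penalty
    return max(score, 0.0)
```

In this sketch a detail only counts against a result when both sides carry a value and the values differ; a missing birth date or identifier leaves the name score untouched, which mirrors the kind of review a human analyst might do.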

What's Changed

Bump asyncstdlib from 3.10.6 to 3.10.7 by @dependabot in #250
Bump types-aiofiles from 23.1.0.1 to 23.1.0.2 by @dependabot in #249
Bump orjson from 3.8.10 to 3.8.11 by @dependabot in #246
Bump uvicorn[standard] from 0.21.1 to 0.22.0 by @dependabot in #247
Multiple scoring algorithms by @pudo in #251
Stable patches by @pudo in #248

Full Changelog: v3.3.1...v3.4.0
