The purpose of this script is to match Makindo Person objects with InfoUSA records. Each matching process has three steps:
- Retrieve a Person object from the Makindo API.
- Match the Person object to InfoUSA records based on personal details such as name and location.
- Report to the Makindo API whether the Person object exactly matches, ambiguously matches, or does not match any InfoUSA records. Also include basic demographic data from InfoUSA.
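The reporting step above can be sketched as a small classification function. This is a minimal illustration, not the script's actual code: it assumes that one InfoUSA candidate counts as an exact match and more than one as an ambiguous match, and the names `classify_match`, `build_report`, `person_id`, `status`, and `demographics` are hypothetical rather than taken from the Makindo API.

```python
def classify_match(candidates):
    """Classify a list of candidate InfoUSA records for one Person.

    Assumption: exactly one candidate is an exact match, several
    candidates are an ambiguous match, and an empty list is no match.
    """
    if len(candidates) == 1:
        return "exact"
    if len(candidates) > 1:
        return "ambiguous"
    return "none"


def build_report(person_id, candidates):
    """Build a hypothetical report payload for one Person.

    The field names here are illustrative only; the real payload must
    follow the Makindo API documentation.
    """
    status = classify_match(candidates)
    report = {"person_id": person_id, "status": status}
    if status == "exact":
        # Assumption: demographic data is attached only when the match
        # is unambiguous.
        report["demographics"] = candidates[0]
    return report
```

Keeping the classification separate from the payload construction makes it easier to change the matching rules without touching the reporting code.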
This script was developed using Python 2.7.5 and the following libraries:
Note also the external file Parameters.json, which includes database and API connection details. This file resembles the following structure:
    {
        "mysql": {
            "host": "$DB_HOST",
            "user": "$DB_USER",
            "passwd": "$DB_PASSWD",
            "db": "$DB_NAME"
        },
        "makindo": {
            "token": "$API_KEY"
        }
    }
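As a rough sketch of how a file with this structure might be read and checked before use, the snippet below loads the JSON and lists any expected keys that are absent. The helper names `load_parameters` and `missing_keys` are hypothetical, and the default path assumes the file sits in the working directory.

```python
import json

# Keys expected in Parameters.json, taken from the structure shown above.
REQUIRED_KEYS = {
    "mysql": ("host", "user", "passwd", "db"),
    "makindo": ("token",),
}


def missing_keys(params):
    """Return a list of 'section.key' names absent from params."""
    missing = []
    for section, keys in sorted(REQUIRED_KEYS.items()):
        section_values = params.get(section, {})
        for key in keys:
            if key not in section_values:
                missing.append(section + "." + key)
    return missing


def load_parameters(path="Parameters.json"):
    """Load the parameters file and fail early if keys are missing."""
    with open(path) as handle:
        params = json.load(handle)
    missing = missing_keys(params)
    if missing:
        raise KeyError("Parameters.json is missing: " + ", ".join(missing))
    return params
```

Failing at load time gives a clearer error than a connection failure later in the run.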
This client uses the Makindo API, so its requests must comply with the Makindo API documentation.
- More sophisticated matching algorithms are necessary. For example, in cases of ambiguous query results a subsequent, broader search should be conducted. See Haystaq_pycli for an example.
- It may be desirable to modify the Makindo API to accept two additional parameters. The first would be the number of ambiguous matches: a Person with two ambiguous matches is easier to de-duplicate than one with hundreds. The second would be the version of the matching script, which would allow Makindo to re-match some Persons after the matching script is updated.
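To make the second proposal concrete, here is a hedged sketch of what a report carrying the two extra parameters could look like. Nothing here exists in the Makindo API today: the field names `ambiguous_match_count` and `matcher_version`, the version string, and both helper functions are hypothetical.

```python
MATCHER_VERSION = "0.2.0"  # hypothetical version label for the matching script


def build_extended_report(person_id, candidates, matcher_version=MATCHER_VERSION):
    """Build a hypothetical report that includes the two proposed fields."""
    # Only counts above one are ambiguous; zero or one candidate reports 0.
    ambiguous_count = len(candidates) if len(candidates) > 1 else 0
    return {
        "person_id": person_id,
        "ambiguous_match_count": ambiguous_count,
        "matcher_version": matcher_version,
    }


def needs_rematch(report, current_version=MATCHER_VERSION):
    """Return True if the report came from a different script version."""
    return report.get("matcher_version") != current_version
```

With a version recorded on each report, selecting Persons to re-match after an update reduces to filtering on `needs_rematch`.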
Ross Petchler ([email protected]) originally authored this script.
John Masterson ([email protected]) provided technical advice on behalf of Makindo.
Rachel Shorey ([email protected]) and Brad Wieneke ([email protected]) provided helpful initial code.