amida-tech/blue-button-match

blue-button-match

Automatic matching of Blue Button JSON data (detection of new, duplicate and partial match entries)


Library interfaces/APIs

This library exposes methods for matching entire health records as well as lower level methods for matching sections of health records.

This library provides the following functionality

  • Match two health records in blue-button JSON format
  • Match individual sections of above

Quick up and running guide

Prerequisites

  • Node.js (v14.19+) and NPM
  • Grunt.js

# Install dependencies
npm i

# Install the Grunt command-line interface (provides the grunt command)
npm i -g grunt-cli

# Test
grunt

Usage example

Require blue-button-match module

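// Load the matcher from the repository root and the Blue Button parser
var match = require("./index.js");
var bb = require("@amida-tech/blue-button");

// "record A" and "record B" are placeholders for the raw health record documents
var recordA = bb.parseString("record A");
var recordB = bb.parseString("record B");

// Compare the two parsed records section by section
var result = match.match(recordA.data, recordB.data);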

console.log(result);

This will produce a match object looking like this:

{
  "match": {
    "allergies" : [
      {
        "match": "new",
        "percent": 0,
        "src_id": "0",
        "dest_id": "0",
        "dest": "dest"
      },
      {
        "match": "new",
        "percent": 0,
        "src_id": "1",
        "dest_id": "0",
        "dest": "dest"
      },
      {
        "match": "new",
        "percent": 0,
        "src_id": "2",
        "dest_id": "0",
        "dest": "dest"
      },
      {
        "match": "new",
        "percent": 0,
        "src_id": "0",
        "dest_id": "1",
        "dest": "src"
      },
      ...
    ],
    "medications" : [...],
    "demographics" : [...]
    ...
  },
  "meta": {
    "version" : "0.0.1"
  },
  "errors": []
}
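To act on a result of this shape, it can help to tally the entries per section. The following is a minimal sketch: `summarize` is a hypothetical helper that is not part of the library, and `sampleResult` is a hand-written object that only mirrors the structure shown above.

```javascript
// Hypothetical helper (not part of blue-button-match): counts the entries of
// each match type in every section of a result object.
function summarize(result) {
  const summary = {};
  for (const [section, entries] of Object.entries(result.match || {})) {
    const counts = { new: 0, duplicate: 0, partial: 0 };
    for (const entry of entries) {
      // Ignore any match type this sketch does not know about (e.g. 'diff').
      if (Object.prototype.hasOwnProperty.call(counts, entry.match)) {
        counts[entry.match] += 1;
      }
    }
    summary[section] = counts;
  }
  return summary;
}

// Hand-written sample that follows the documented structure.
const sampleResult = {
  match: {
    allergies: [
      { match: "new", percent: 0, src_id: "0", dest_id: "0", dest: "dest" },
      { match: "duplicate", percent: 100, src_id: "1", dest_id: "2", dest: "dest" },
      { match: "partial", percent: 50, src_id: "2", dest_id: "1", dest: "dest" }
    ]
  },
  meta: { version: "0.0.1" },
  errors: []
};

console.log(summarize(sampleResult));
```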

Matching record explanation

Match element can be {"match" : "duplicate", "percent": 100}, {"match" : "new", "percent": 0} or {"match" : "partial", "percent": 50}.

Partial match is expressed in percent and can range from 1 to 99. Percent is included in the duplicate and new objects as well for range based calculations, but will always equal 100 or 0 respectively.
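The relationship between the percent value and the match type described above can be sketched as a small function. `matchTypeFor` is a hypothetical name used only for illustration; it is not an API of this library.

```javascript
// Hypothetical helper: maps a percent score onto the three documented match
// types (100 -> duplicate, 0 -> new, anything in between -> partial).
function matchTypeFor(percent) {
  if (!Number.isFinite(percent) || percent < 0 || percent > 100) {
    throw new RangeError("percent must be a number between 0 and 100");
  }
  if (percent === 100) {
    return "duplicate";
  }
  if (percent === 0) {
    return "new";
  }
  return "partial";
}

console.log(matchTypeFor(100)); // duplicate
console.log(matchTypeFor(0));   // new
console.log(matchTypeFor(50));  // partial
```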

Element attribute src_id is the position (index) of the element in the related section's array of the new document being merged (the new record). The meaning of element attribute dest_id depends on the 'dest' field. When {dest: 'dest'} is present, dest_id is the index of the element in the related section's array of the Master Health Record that the new entry was matched against. When {dest: 'src'} is present, dest_id is the index of an element contained within the same record as the src_id.

{
  "match":
  {
    "allergies" : [
      { "match" : "duplicate", "src_id" : 0, "dest_id" : 2 },
      { "match" : "new", "src_id" : 1 },
      { "match" : "partial", "percent" : 50, "src_id" : 2, "dest_id" : 5 },
      ...
    ],
    "medications" : [...],
    "demographics" : [...]
    ...
  }
}
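The indexing rules for src_id, dest_id and dest can be illustrated with a lookup helper. This is a sketch under stated assumptions: `resolveEntry` is a hypothetical function, both records are assumed to be plain objects keyed by section name, and the `name` fields in the sample data are placeholders rather than real Blue Button fields.

```javascript
// Hypothetical helper: returns the two elements a match entry refers to.
// src_id always indexes the new record; dest_id indexes the master record
// when dest is 'dest' and the new record itself when dest is 'src'.
function resolveEntry(entry, section, newRecord, masterRecord) {
  const srcList = newRecord[section] || [];
  const destList = entry.dest === "src" ? srcList : (masterRecord[section] || []);
  return {
    src: srcList[Number(entry.src_id)],
    dest: destList[Number(entry.dest_id)]
  };
}

// Placeholder data for illustration only.
const newRecord = { allergies: [{ name: "new A" }, { name: "new B" }] };
const masterRecord = { allergies: [{ name: "master A" }] };

// Compared against the master record.
console.log(resolveEntry({ src_id: "1", dest_id: "0", dest: "dest" }, "allergies", newRecord, masterRecord));

// Compared against another element of the same (new) record.
console.log(resolveEntry({ src_id: "0", dest_id: "1", dest: "src" }, "allergies", newRecord, masterRecord));
```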

Matching results JSON structures (by record type)

Multiple facts

Applied to: All sections excluding Demographics

New match entry (dest):

{
  "match": "new",
  "percent": 0,
  "src_id": "1",
  "dest_id": "2",
  "dest": "dest"
}

Duplicate match entry (dest):

{
  "match": "duplicate",
  "percent": 100,
  "src_id": "1",
  "dest_id": "1",
  "dest": "dest"
}

Partial match entry (dest):

{
  "match": "partial",
  "percent": 50,
  "subelements": {
    "reaction": [{
      "match": "new",
      "percent": 0,
      "src_id": "0",
      "dest_id": "0",
      "dest": "dest"
    }]
  },
  "diff": {
    "date_time": "duplicate",
    "identifiers": "duplicate",
    "allergen": "duplicate",
    "severity": "duplicate",
    "status": "duplicate",
    "reaction": "new"
  },
  "src_id": "2",
  "dest_id": "2",
  "dest": "dest"
}
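For a partial match, the diff object shows which fields kept the entry from being a duplicate. A minimal sketch for extracting them follows; `changedFields` is a hypothetical helper, and the sample entry is copied from the partial match example above.

```javascript
// Hypothetical helper: lists the fields of a partial match whose diff value
// is anything other than 'duplicate'.
function changedFields(entry) {
  const diff = entry.diff || {};
  return Object.keys(diff).filter(function (field) {
    return diff[field] !== "duplicate";
  });
}

const partialEntry = {
  match: "partial",
  percent: 50,
  diff: {
    date_time: "duplicate",
    identifiers: "duplicate",
    allergen: "duplicate",
    severity: "duplicate",
    status: "duplicate",
    reaction: "new"
  },
  src_id: "2",
  dest_id: "2",
  dest: "dest"
};

console.log(changedFields(partialEntry)); // [ 'reaction' ]
```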

Single facts

Applied to: Demographics

Record is a duplicate:

[ { match: 'duplicate', src_id: 0, dest_id: 0 } ]

Record is a partial match:

[ 
  { 
    match: 'diff',
    diff: 
    { 
       name: 'duplicate',
       dob: 'new',
       gender: 'duplicate',
       identifiers: 'duplicate',
       marital_status: 'duplicate',
       addresses: 'new',
       phone: 'duplicate',
       race_ethnicity: 'duplicate',
       languages: 'duplicate',
       religion: 'duplicate',
       birthplace: 'duplicate',
       guardians: 'new' 
    },
    src_id: 0,
    dest_id: 0 
  } 
]

Edge cases for single facts:

Both objects are empty, e.g. comparePartial({}, {}):

[ { match: 'duplicate' } ]

Comparing an empty object {} with a non-empty master record:

[ { match: 'diff', diff: {} } ]

Comparing a non-empty object with an empty master record {}:

[ { match: 'new' } ]

Matching Rules

Common matching rules

Date/time match - Dates are first compared for an exact (hard) match. If they do not match exactly, a fuzzy match is performed that checks whether the dates overlap.

Code match (Code System match) - Either names must match, or code/code system must match. Translations are supported: translated objects may be matched against an object and follow the same rules.

String match - Case insensitive/trimmed match of string values.

String array match - Case insensitive/trimmed match of arrays for equality.

Boolean match - Simple true/false equality comparison.
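The common rules above can be sketched as small comparison functions. This is an illustration under stated assumptions, not the library's implementation: all function names are hypothetical, the date range shape (`low`/`high` ISO strings) and the coded entry fields (`name`, `code`, `code_system_name`) are assumed, string arrays are compared in order, and translations are left out.

```javascript
// Case-insensitive, trimmed string comparison.
function stringMatch(a, b) {
  return typeof a === "string" && typeof b === "string" &&
    a.trim().toLowerCase() === b.trim().toLowerCase();
}

// Element-by-element string comparison (assumes order matters).
function stringArrayMatch(a, b) {
  if (!Array.isArray(a) || !Array.isArray(b) || a.length !== b.length) {
    return false;
  }
  return a.every(function (value, i) { return stringMatch(value, b[i]); });
}

// Simple true/false equality.
function booleanMatch(a, b) {
  return typeof a === "boolean" && a === b;
}

// Names match, or code and code system both match (assumed field names).
function codeMatch(a, b) {
  if (stringMatch(a.name, b.name)) {
    return true;
  }
  return a.code !== undefined && a.code === b.code &&
    a.code_system_name === b.code_system_name;
}

// Returns 'hard' for identical ranges, 'fuzzy' for overlapping ranges,
// otherwise 'none'. A missing high bound is treated as a single point in time.
function dateMatch(a, b) {
  const aLow = Date.parse(a.low);
  const aHigh = Date.parse(a.high || a.low);
  const bLow = Date.parse(b.low);
  const bHigh = Date.parse(b.high || b.low);
  if (aLow === bLow && aHigh === bHigh) {
    return "hard";
  }
  if (aLow <= bHigh && bLow <= aHigh) {
    return "fuzzy";
  }
  return "none";
}

console.log(stringMatch(" Penicillin ", "penicillin"));
console.log(dateMatch(
  { low: "2012-01-01", high: "2012-12-31" },
  { low: "2012-06-01", high: "2013-01-01" }
));
```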

Each section's logic is contained in a .json file corresponding to the section name. It is divided into primary and secondary logic. Secondary logic executes only if primary logic resulted in a successful match, and can bolster match percentages. Each section contains an array of match data.

Each element of match data has three components:

  • the path, which is the location of the element
  • the type, which names the common utility that should handle it
  • the percentage, which is the increase applied when a match is made successfully

Additionally, elements may have 'subarrays', which are used to populate sub-arrays from the match and provide corresponding diffs. Their structure is the same as match entries.

Note: Currently, the logic is designed so a match over 50% is considered actionable.
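The primary/secondary scheme can be illustrated with a small scoring sketch. Everything here is an assumption made for illustration: the rule keys (`path`, `type`, `percent`), the percentage values, the `score` and `isActionable` functions, and the reading of "successful" primary match as any primary rule matching. The actual .json files in the repository may be laid out differently.

```javascript
// Hypothetical rule set for one section; the values are invented.
const sampleRules = {
  primary: [{ path: "allergen.name", type: "string", percent: 60 }],
  secondary: [{ path: "severity", type: "string", percent: 20 }]
};

// Matchers keyed by rule type; only a string matcher is sketched here.
const matchers = {
  string: function (a, b) {
    return typeof a === "string" && typeof b === "string" &&
      a.trim().toLowerCase() === b.trim().toLowerCase();
  }
};

// Reads a dotted path such as 'allergen.name' from an object.
function getPath(obj, path) {
  return path.split(".").reduce(function (value, key) {
    return value == null ? undefined : value[key];
  }, obj);
}

// Sums the percentages of all rules in a list that match.
function applyRules(list, src, dest) {
  return list.reduce(function (sum, rule) {
    const matcher = matchers[rule.type];
    const hit = Boolean(matcher) && matcher(getPath(src, rule.path), getPath(dest, rule.path));
    return sum + (hit ? rule.percent : 0);
  }, 0);
}

// Secondary rules contribute only when at least one primary rule matched.
function score(src, dest, rules) {
  const primary = applyRules(rules.primary || [], src, dest);
  if (primary === 0) {
    return 0;
  }
  return Math.min(100, primary + applyRules(rules.secondary || [], src, dest));
}

// A match over 50% is considered actionable.
function isActionable(percent) {
  return percent > 50;
}

const incoming = { allergen: { name: "Penicillin" }, severity: "Severe" };
const master = { allergen: { name: " penicillin" }, severity: "severe" };
const percent = score(incoming, master, sampleRules);
console.log(percent, isActionable(percent));
```

In this sketch, a pair of entries that agrees only on a secondary field scores 0, because the secondary rules are never evaluated without a primary match.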

Allergies

Primary: Allergen Coded Match.

Secondary: Date/time.

Claims

Primary: Payer String Match, Number String Match, and type String Array match.

Demographics

All subelements are compared.

Encounters

Primary: Encounter Coded Match, Date/time.

Immunizations

Primary: Product Coded Match, Date/time.

Insurance

Primary: Plan Identifier String Match, Policy Number String Match, and Payer Name String match.

Note: Any combination of two matches will be over 50%.

Medications

Primary: Product Coded Entry.

Secondary: Date/time.

Payers

Primary: Policy Insurance Object.

Plan of Care

Primary: Plan Coded Entry, Date/time.

Problems

Primary: Problem Coded Entry.

Secondary: Date/time, Status string match, Negation Indicator boolean match.

Procedures

Primary: Procedure Coded Entry Match.

Secondary: Date/time.

Providers

Primary: Provider Type String Match.

Secondary: Person Object, and Name String.

Results

Primary: Result set Coded Entry, and Result set Date/time.

Note: Date/time calculated as most recent value from results array.
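Taking the most recent value from a results array can be sketched as follows. `mostRecentDate` is a hypothetical helper, and the assumption that each result carries a single ISO 8601 string in a `date` field is made only for this example; the real Blue Button date model is richer.

```javascript
// Hypothetical helper: returns the latest date found in a results array as an
// ISO 8601 string, or null when no parsable date is present.
function mostRecentDate(results) {
  const times = (results || [])
    .map(function (result) { return Date.parse(result.date); })
    .filter(function (time) { return !Number.isNaN(time); });
  if (times.length === 0) {
    return null;
  }
  return new Date(Math.max.apply(null, times)).toISOString();
}

console.log(mostRecentDate([
  { date: "2012-08-10T14:00:00.000Z" },
  { date: "2012-08-12T09:30:00.000Z" },
  { date: "2012-08-11T08:00:00.000Z" }
]));
```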

Social History

Primary: Value String entry.

Secondary: Date/time.

Vitals

Primary: Vital Coded entry.

Secondary: Date/time.

Contributing

Contributors are welcome. See issues

Release Notes

See release notes here

License

Licensed under Apache 2.0
