
Code Validator and Reference Validator enhancements #17

Open
wants to merge 3 commits into
base: master
from


LakshmiDintakurty

Code Enhancements: Below is the list of code changes made for reference-ccda-validator (and code-validator). Please review the changes and let us know if you need any additional information; we can set up a meeting to discuss.

  1. Async thread pool for code validation

    VocabularyValidationService uses a pool of ValidateWorker objects to perform code validation in parallel.

  2. Fixed Thread safety issues in Vocab validator

  3. VTD – Object pool of pre-compiled XPaths

    a. VTD is an alternative high-performance XML parser.
    b. The auto pilot package (org.sitenv.vocabularies.validation.pool) encapsulates a pool of precompiled XPaths based on the ccdaReferenceValidatorConfig.xml file.
    c. The pool configuration is specified in CodeValidator.properties.

  4. Switched from HSQL to H2 DB

    a. Configurable DatabaseConnection pool in CodeValidator.properties.
    b. Replaced sql.Connection with sql.DataSource.
    c. All the loader classes now use a DataSource, both when loading the valuesets/codesets from files into the H2 DB and when loading them from H2 into HashSets.

  5. Vocab lookup using Java HashSets/HashMaps instead of JPA queries

    a. Reimplemented the Repository classes to respective DAO classes (CodeSystemCodeDAO and ValueSetDAO)
    b. References to the CodeRepository and VsacValueSetRepository in CodeSystemCodeValidator & ValueSetCodeValidator are replaced with respective DAOs (CodeSystemCodeDAO and ValueSetDAO)

  6. DB Cleanup after loading codes and valuesets to HashSets

    a. To reduce the memory footprint, all the data is deleted from the H2 DB after all the codes and valuesets have been loaded into HashSets.
    b. VocabularyLoadRunner performs the cleanup based on the flag ‘cleanUpDatabaseAfterLoadingHashSets’ set in CodeValidator.properties.
    c. Additional comments on the side effects/conditions of setting the flag to true vs. false are provided in the VocabularyLoadRunner class, in the finally block of afterPropertiesSet().

  7. ReferenceCCDAValidationService : Added new overloaded service methods to handle document validation

    E.g., support for an optional REST request parameter SeverityLevel (allowable values: error, warning, info). If SeverityLevel=”error”, the “warning” and “info” Vocab Conformance Rules are not evaluated.

  8. Added new functionality to support R1.1 CCDA doc Vocabulary Validation (VocabularyValidationService)

  9. Ability to upload a compressed file for validation, in addition to a plain XML file, to reduce network transfer time

    Added functionality to support .zip format in addition to .xml file format in ReferenceCCDAValidationService. Useful when CCDA documents are large in size.

  10. Perform MDHT & Vocab Validation of a document regardless of whether the doc has a schemaError or not, to better handle MU3 CERT Validator negative test cases

  11. Support 100% accurate source CCDA file Line Numbers for each of the Vocabulary Validation error/warning/info results.

  12. For supportability, added the Vocabulary Conformance Rule ID feature

    a. A unique Vocabulary Conformance Rule ID is defined for each validator configured in ccdaReferenceValidatorConfig.xml, such as <validator id="1">.
    b. Each Vocabulary Validation error/warning/info result outputs the ID of the conformance rule it violates, such as "ruleID": "140".

Improved validationObjective defaulting logic for both R1.1 and R2.1 documents to use CCDATypes.NON_SPECIFIC_CCDAR2 so there's no need to parse ccdaFile up front.
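
The SeverityLevel behaviour described in item 7 could be sketched roughly as follows. This is an illustration under assumptions, not the service code from this pull request: `SeverityFilterSketch`, `Rule`, `parse`, and `rulesToEvaluate` are hypothetical names. The description only states what happens for "error"; treating "warning" as "error plus warning", and treating a missing parameter as "evaluate everything", are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch; none of these names come from the pull request.
final class SeverityFilterSketch {

    /** Declared from most to least severe so ordinal() can be compared. */
    enum Severity { ERROR, WARNING, INFO }

    /** A configured vocabulary conformance rule with its ID and severity. */
    static final class Rule {
        final String id;
        final Severity severity;

        Rule(String id, Severity severity) {
            this.id = id;
            this.severity = severity;
        }
    }

    /** Parses the request parameter; assumes a missing value means "info". */
    static Severity parse(String param) {
        if (param == null || param.trim().isEmpty()) {
            return Severity.INFO; // assumption: by default every rule is evaluated
        }
        // Throws IllegalArgumentException for values other than error/warning/info.
        return Severity.valueOf(param.trim().toUpperCase(Locale.ROOT));
    }

    /** Keeps only the rules that are at least as severe as the requested level. */
    static List<Rule> rulesToEvaluate(List<Rule> allRules, Severity requested) {
        List<Rule> selected = new ArrayList<>();
        for (Rule rule : allRules) {
            if (rule.severity.ordinal() <= requested.ordinal()) {
                selected.add(rule);
            }
        }
        return selected;
    }
}
```

Filtering the rule list before evaluation, rather than discarding results afterwards, is what would let SeverityLevel=error skip the work of evaluating the warning and info rules.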
@Plow74
Contributor

Plow74 commented Aug 8, 2017

Thanks! We will review the changes. Async is a welcomed improvement.

@onc-healthit onc-healthit locked and limited conversation to collaborators Aug 8, 2017
@drbgfc
Contributor

drbgfc commented Aug 14, 2017

Hi Haiwen and the rest of Cerner. Sounds like a lot of great work as we had discussed on some of the MDHT calls. I am especially interested (as I suspect all would be) in the performance improvements. Good idea to submit a PR. I haven't had time to review specifically, but a note that this, "Perform MDHT & Vocab Validation of a document regardless whether the doc has schemaError or not, to better handle MU3 CERT Validator negative test cases" might be an issue. Not performing validation when there is/are schema error(s) was an ONC requirement so it's a feature not a bug. If you could expand on the reasoning I could see if they would be interested in the change.

@drbgfc
Contributor

drbgfc commented Dec 15, 2017

Hi, what is the status on this? Last we had spoken, you were going to create a new PR without conflicts which, I think, defaulted to not running vocab validation if there were schema errors (via an implemented switchable config). I would suggest that this happen directly after a release. Releases happen on the last Monday of every month with the exception of December, so the next release is Jan 22nd 2018. A great time for the PR would be directly after that. However, I don't expect many changes in addition to what is in the repo already, so submitting mid development phase could be fine too, especially since the main developers, including me, will be off until after the New Year. Let me know what you think, no rush. Thanks, Dan.


5 participants