This repository has been archived by the owner on Jan 31, 2023. It is now read-only.

Improved dataframe handling and save declined matches also #20

Open
wants to merge 92 commits into
base: master


dpriskorn
Owner

Rewrite with a new CrossrefSubject class that handles all the subject splitting and lookup. Also implement support for approved/declined in CacheMatch and FuzzyMatch, so that both outcomes are now stored and acted on.


crossref/enums.py: New file with enums specific to crossref
models/enums.py: New file with enums
fuzzy_match.py: New attribute approved
match_cache.py:
MatchCache: Rename private method __append_new_match_to_the_dataframe__ to __append_match_result_to_the_dataframe__ and store the new attribute in a dataframe column.
__extract_match__(): Also support extracting the approved attribute.
named_entity_recognition.py: Move methods to new class CrossrefSubject and rewrite __lookup_subjects__()
ontology.py: Ontology:
New private method __enrich_cache_match__().
Move the dataframe preparation code to __get_the_dataframe_from_config__().
Handle the new approved attribute and add it to self.match.
ontology_dataframe.py: Rename Dataframe to OntologyDataframe.
subject.py: New file with class.
console.py: New function print_all_matches_table()
engine.py: Disable the old match table
wikipedia_page.py: Print the new one
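
The approved/declined handling described above can be illustrated with a minimal sketch. It is not the code from this PR: the class bodies, the field names other than `approved`, the method names without double underscores, and the placeholder identifiers are assumptions made for illustration. The PR stores the flag in a dataframe column, whereas this sketch uses a plain list of rows to stay dependency-free.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class FuzzyMatch:
    """Hypothetical stand-in for the match object; only `approved` is named in the PR."""
    label: str
    qid: str
    approved: bool  # True if the user approved the match, False if declined


class MatchCache:
    """Hypothetical cache that keeps declined matches as well as approved ones."""

    def __init__(self) -> None:
        # Each row plays the role of one dataframe row with an "approved" column.
        self.rows: List[Dict[str, object]] = []

    def append_match_result(self, match: FuzzyMatch) -> None:
        # Store the result regardless of whether it was approved or declined.
        self.rows.append(
            {"label": match.label, "qid": match.qid, "approved": match.approved}
        )

    def extract_match(self, label: str) -> Optional[FuzzyMatch]:
        # Return the first stored result for the label, including its approved flag.
        for row in self.rows:
            if row["label"] == label:
                return FuzzyMatch(
                    label=str(row["label"]),
                    qid=str(row["qid"]),
                    approved=bool(row["approved"]),
                )
        return None


cache = MatchCache()
cache.append_match_result(FuzzyMatch("subject a", "Q-placeholder-1", approved=True))
cache.append_match_result(FuzzyMatch("subject b", "Q-placeholder-2", approved=False))
```

With both outcomes stored, a caller could check `extract_match(label)` first and skip asking about a subject whose earlier match was declined, which is presumably the point of acting on both states.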