provide proper prefix support in the travharv config #35
marc-portier changed the title from "show proper prefix support is working" to "provide proper prefix support in the travharv config" on Apr 15, 2024
Quick separate exercise showing how the config entries in the yml can be normalised using the prefix declarations:

import re
from typing import Dict, List

import validators
from rdflib import Graph, Namespace, URIRef
from rdflib.namespace import NamespaceManager

# see https://rdflib.readthedocs.io/en/stable/namespaces_and_bindings.html

def makeNSM(pfx_declarations: Dict[str, str]) -> NamespaceManager:
    # turn each declared prefix into an rdflib Namespace
    pfxs = {k: Namespace(v) for k, v in pfx_declarations.items()}
    print(f"{pfxs=}")
    # start from an empty manager so only the declared prefixes are bound
    nsm = NamespaceManager(Graph(), bind_namespaces="none")
    for pf, ns in pfxs.items():
        nsm.bind(pf, ns, override=True)
    print(f"{list(nsm.namespaces())=}")
    return nsm


def resolve_uri(uri: str, nsm: NamespaceManager) -> URIRef:
    # TODO reconsider the validators trick -- we might want to explicitly demand <> surrounding the <uri>
    return URIRef(uri) if validators.url(uri) else nsm.expand_curie(uri)


def resolve_literals(literal_uris: List[str], nsm: NamespaceManager) -> List[URIRef]:
    return [resolve_uri(u, nsm) for u in literal_uris]


def resolve_sparql(sparql: str, nsm: NamespaceManager) -> str:
    # prepend one PREFIX line per bound namespace
    pfxlines: str = "\n".join(f"PREFIX {p}: {u.n3()}" for p, u in nsm.namespaces())
    return f"{pfxlines}\n{sparql}"


PPATH_RE: str = r'(([^<>\/\s]+)|<([^>]+)>)\s*\/'  # how to match parts of property-paths


def ppath_split(ppath: str) -> List[str]:
    # group 2 is a bare prefix:name step, group 3 the content of an <uri> step
    return [m.group(2) or m.group(3) for m in re.finditer(pattern=PPATH_RE, string=ppath + "/")]


def resolve_ppaths(ppaths: List[str], nsm: NamespaceManager) -> List[str]:
    return [
        " / ".join(resolve_uri(part, nsm).n3() for part in ppath_split(ppath))
        for ppath in ppaths
    ]


def do():
    yml_pfx_declarations = dict(
        schema="https://schema.org/",  # trailing slash needed so schema:name expands correctly
        ex="https://example.org/",
    )
    yml_literals = [
        "ex:test",
        "schema:DataSet",
        "https://demo.me/whatever",
    ]
    yml_sparql = """select * where { ?s schema:name ?n . }"""
    yml_ppaths = [
        "<https://demo.me/whatever> / ex:some",
        "ex:some",
        "<https://demo.me/whatever>",
        "schema:owner / schema:name",
    ]
    # make actual namespaces that can be used
    nsm: NamespaceManager = makeNSM(yml_pfx_declarations)
    literals = resolve_literals(yml_literals, nsm)
    print(f"{literals=}")
    sparql = resolve_sparql(yml_sparql, nsm)
    print(f"{sparql=}")
    ppaths = resolve_ppaths(yml_ppaths, nsm)
    print(f"{ppaths=}")


if __name__ == "__main__":
    do()
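
The TODO in resolve_uri() hints at demanding <> around full URIs instead of relying on validators.url(). A minimal sketch of what that stricter rule could look like, using only the standard library and a plain dict in place of the rdflib NamespaceManager. Both of those choices, and the names expand_term, split_ppath and expand_ppath, are assumptions for illustration, not part of the exercise above:

```python
import re
from typing import Dict, List

# Same pattern as PPATH_RE above: a step is either a bare token or an <iri>.
PPATH_RE = r'(([^<>\/\s]+)|<([^>]+)>)\s*\/'


def expand_term(term: str, prefixes: Dict[str, str]) -> str:
    """Return the full IRI for "<iri>" or "prefix:local"; raise ValueError otherwise."""
    term = term.strip()
    if term.startswith("<") and term.endswith(">"):
        return term[1:-1]  # explicit IRI: just drop the brackets
    prefix, sep, local = term.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"not an <iri> and no known prefix: {term!r}")
    return prefixes[prefix] + local


def split_ppath(ppath: str) -> List[str]:
    """Split a '/'-separated property path into steps, keeping <> on IRIs."""
    return [m.group(1) for m in re.finditer(PPATH_RE, ppath + "/")]


def expand_ppath(ppath: str, prefixes: Dict[str, str]) -> str:
    """Expand every step of a property path and re-join the steps as <iri> / <iri>."""
    return " / ".join(f"<{expand_term(step, prefixes)}>" for step in split_ppath(ppath))
```

With this rule a bare "https://demo.me/whatever" is rejected (its "https" part is read as an undeclared prefix), so the yml would have to write "<https://demo.me/whatever>" everywhere, not only inside property paths.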
cedricdcc added a commit that referenced this issue on May 7, 2024:
- Added helper.py containing functions that allow for prefix support in SPARQL queries and traversal harvesting paths.
- Deleted the call functions and refactored the code in config_builder and all subsequent files that used this __call__ method.
- Replaced and refactored all files that worked with the GraphNameMapper; the mapper of py-rdf-store is now used instead.
- Refactored the config builder property subjects so that, when called, they get the subjects from the graph if that is required (when a SPARQL query is given instead of a list of subjects).
- Edited the .yml files used as configs so that the prefixes no longer contain <>, since these cause issues for the helper function resolve_uri().
Issues affected by the changes in this commit: #35, #43, #48, #34
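
The commit resolves the <> problem by editing the yml files. As an illustration only, here is a sketch of the alternative: cleaning the prefix declarations on load so that both spellings are accepted. The function name normalise_prefixes is hypothetical and not taken from helper.py:

```python
from typing import Dict


def normalise_prefixes(declarations: Dict[str, str]) -> Dict[str, str]:
    """Strip optional surrounding <> from each declared namespace IRI."""
    cleaned: Dict[str, str] = {}
    for prefix, iri in declarations.items():
        iri = iri.strip()
        if iri.startswith("<") and iri.endswith(">"):
            iri = iri[1:-1]  # "<https://example.org/>" becomes "https://example.org/"
        cleaned[prefix] = iri
    return cleaned
```

Declarations that are already bare pass through unchanged, so running this before building the namespaces would be harmless for the updated yml files.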
Merged
Waiting for PR #51 to get merged with the main branch.
the prefix config in the yml should extend to
an updated test-yml should show this is all actually working
(and if needed implementation fixes should make it work)