utils_extra check_value url quote on n3 fail
if we receive a string that we think should be a url then check to see
if rdflib can safely serialize it; on failure, url quote it

this is not efficient but it is at least somewhat safe
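
A minimal sketch of the quote-on-failure idea, using only the standard library. The real change relies on rdflib's `n3()` raising; here the hypothetical `UNSAFE` set and `quote_if_unsafe` helper stand in for that check, using the printable characters that the Turtle grammar does not allow inside `<...>`. The DOI is a made-up SICI-style string for illustration only.

```python
from urllib.parse import quote as url_quote

# Assumption: a stand-in for "rdflib cannot serialize this", not rdflib's own check.
# These printable characters may not appear unescaped inside a Turtle <...> IRI.
UNSAFE = set('<>" {}|\\^`')


def quote_if_unsafe(v):
    """Return v unchanged when it looks serializable, else percent-quote it."""
    if any(ch in UNSAFE for ch in v):
        # same safe set as the commit: leave : / ; ( ) unescaped
        return url_quote(v, ':/;()')
    return v


doi = 'https://doi.org/10.1002/(SICI)1097-0185<1::AID>3.0.CO;2-E'
quoted = quote_if_unsafe(doi)   # '<' and '>' become %3C and %3E
plain = quote_if_unsafe('http://example.org/ok')  # returned as-is
```

Only strings that fail the check pay the quoting cost, so well-formed urls pass through untouched.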

note that uri dealiasing has to be done to be able to do a full
comparison between uris to determine if they even have a chance of being
the same uri, so this is just another layer in that where the escape
sequences have to be undone as one step along the way to finding the
canonical representation of the uri
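
The aliasing concern can be sketched with the standard library alone. This is an illustration under assumptions, not the dealiasing code used by pyontutils: `same_after_unquote` is a hypothetical helper, the second safe set `':/'` is an invented example of "another process", and the DOI is made up.

```python
from urllib.parse import quote, unquote

raw = 'https://doi.org/10.1002/(SICI)1097-0185<1::AID>3.0.CO;2-E'

a = quote(raw, ':/;()')  # the rule used in this commit
b = quote(raw, ':/')     # hypothetical other process with a narrower safe set


def same_after_unquote(x, y):
    """Compare two uris after undoing percent escapes (one dealiasing step)."""
    return unquote(x) == unquote(y)


aliased = a != b                      # the two quoted strings differ
resolved = same_after_unquote(a, b)   # but they unquote to the same uri
```

Undoing the escapes is only one layer: a full comparison would still need whatever other normalization the uris require, and blindly unquoting can conflate uris that differ only in whether a reserved character such as `/` was escaped.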
tgbugs committed Nov 13, 2020
1 parent 0f4e78c commit f57b769
Showing 1 changed file with 13 additions and 1 deletion.
pyontutils/utils_extra.py (13 additions, 1 deletion)
@@ -2,6 +2,7 @@
 Reused utilties that depend on packages outside the python standard library.
 """
 import hashlib
+from urllib.parse import quote as url_quote
 import rdflib
 
 
@@ -15,7 +16,18 @@ def check_value(v):
     if isinstance(v, rdflib.Literal) or isinstance(v, rdflib.URIRef):
         return v
     elif isinstance(v, str) and v.startswith('http'):
-        return rdflib.URIRef(v)
+        # FIXME this is dumb and dangerous but whatever
+        uri = rdflib.URIRef(v)
+        try:
+            uri.n3()
+        except:
+            # dois allow ... non-url and non-identifier chars
+            # that must be escaped or we have to use strings
+            # FIXME this WILL induce an aliasing problem if
+            # another process quotes using a different rule
+            uri = rdflib.URIRef(url_quote(v, ':/;()'))
+
+        return uri
     else:
         return rdflib.Literal(v)
