Offer basic API to query ontology terms #1903

areleu opened this issue Nov 6, 2024 · 5 comments


areleu commented Nov 6, 2024

Description of the issue

Currently if I do a GET request of an ontology entity like

curl -H "Content-Type: application/json"  -X GET https://openenergyplatform.org/ontology/oeo/OEO_00010257/

I get a whole website back, like:

<!DOCTYPE html>
<html lang="en">
<head>


    <!-- Meta Tags, which should be implemented -->
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta http-equiv="X-UA-Compatible" content="ie=edge">
	...

It would be nice if someone who sends a header specifying XML, JSON or even RDF got a response in the respective format. The colleagues from the CCO currently deliver a fraction of the ontology in XML by default; see, for example: https://www.ontologyrepository.com/CommonCoreOntologies/Country

Their setup is not that sophisticated either, since they only offer the XML response.

Ideas of solution

I think the simplest implementation would be a simple JSON response middleware where users get the label, the definition, the subclasses and the superclasses: basically the same content as the website, but in a format that can be processed further without web scraping. That way one can build simple applications that rely on the ontology. The question may be whether the servers could deal with the traffic of applications using the ontology. If that is a problem, then in the spirit of the semantic web I would really encourage you to consider solutions like making the ontology hosting its own service that is used by the OEP.
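To make the proposed payload concrete, here is a minimal sketch of what such a JSON response could contain. The `TermSummary` class and its field names are hypothetical (nothing like this is confirmed to exist in the OEP code base), and the example values are taken from a term that appears further down in this thread.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class TermSummary:
    """Hypothetical container for the fields shown on an ontology term page."""

    iri: str
    label: str
    definition: str
    super_classes: list[str] = field(default_factory=list)
    sub_classes: list[str] = field(default_factory=list)


def term_to_json(term: TermSummary) -> str:
    """Serialize a term summary to a JSON string."""
    return json.dumps(asdict(term), indent=2, ensure_ascii=False)


# Example values; the class hierarchy is left empty because it is not known here.
example = TermSummary(
    iri="http://openenergy-platform.org/ontology/oeo/OEO_00000408",
    label="test data set",
    definition="A test data set is a data set used for testing.",
)

if __name__ == "__main__":
    print(term_to_json(example))
```

Keeping the payload as a flat record like this would let client applications read it with any JSON parser, without depending on the HTML layout of the page.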

Workflow checklist

Contributor

jh-RLI commented Nov 11, 2024

We can do this. I can extend the OEP view that returns the HTML in your example to handle the XML and JSON requests differently. Or we could provide specific endpoints for those. In general, the backend that handles the textual OEO viewer is a bit of a mess, which is why I'd like to either have a separate endpoint for these requests or find the time to rework the implementation. I do see the benefit of having just a single URL that can handle both, as it makes things clearer.
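As a rough illustration of the single-URL variant, the view would need to decide on a format from the request headers. Content negotiation conventionally uses the Accept header (Content-Type describes the request body instead). The sketch below is plain Python, not OEP code; the `SUPPORTED` mapping and the `pick_format` helper are assumptions for illustration, and the parsing is deliberately simplified (it ignores partial wildcards such as `application/*`).

```python
# Hypothetical mapping from media types to the formats a term view could serve.
SUPPORTED = {
    "text/html": "html",
    "application/json": "json",
    "application/rdf+xml": "rdf",
    "application/xml": "xml",
}


def pick_format(accept_header: str, default: str = "html") -> str:
    """Return the preferred supported format for an HTTP Accept header value."""
    candidates = []
    for position, part in enumerate(accept_header.split(",")):
        pieces = [p.strip() for p in part.split(";")]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        quality = 1.0  # q defaults to 1 when it is not given
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        # Sort key: highest quality first, then original order in the header.
        candidates.append((-quality, position, media_type))

    for neg_quality, _, media_type in sorted(candidates):
        if neg_quality == 0:  # q=0 marks a type as not acceptable
            continue
        if media_type in SUPPORTED:
            return SUPPORTED[media_type]
        if media_type == "*/*":
            return default
    # Nothing matched: fall back to the default instead of answering 406.
    return default
```

With this helper, `pick_format("application/json")` selects the JSON branch, while a typical browser header that lists `text/html` first keeps getting the HTML page, so the existing website would not change for regular visitors.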

Anyway, if you have experience with these APIs, you're welcome to contribute; I could show you how and where. That would speed things up :) Nonetheless, I will try to implement a working draft and make it available before the next OEP release at the end of November.

Some more thoughts:
This would help only for simple queries like everything for this ID “OEO_00010257” as you suggested by "basic API". I think more specific queries should be done with SPARQL, as there are applications that implement such an API. I implemented something similar using Python's Owlready2 and RDFLib for the OEO-Extended to find OEO terms, but I'm not sure if it will scale, as it is not really used so far (quite a hidden feature, lacks good documentation). The Python implementation for searching terms in the OEO will always be a bit slow, and I also think it consumes more resources. We already operate a Jena Fuseki server; maybe we should load the OEO there and make the API available via the OEP. I'm not sure what would be best yet. Currently, it stores only the OEKG.
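For reference, the kind of SPARQL query such an endpoint could run for a single term might look like the sketch below. It only builds the query string; the `OEO_BASE` prefix and the `build_term_query` helper are assumptions for illustration, and the query sticks to standard RDFS properties rather than any OEO-specific annotation.

```python
import re

# Assumed IRI prefix for OEO terms; adjust if the ontology uses a different one.
OEO_BASE = "http://openenergy-platform.org/ontology/oeo/"


def build_term_query(term_id: str) -> str:
    """Build a SPARQL query for the label and named superclasses of one term."""
    # Validate the ID so that arbitrary text cannot be injected into the query.
    if not re.fullmatch(r"OEO_\d+", term_id):
        raise ValueError(f"not a valid OEO term ID: {term_id!r}")
    iri = OEO_BASE + term_id
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?label ?superClass WHERE {\n"
        f"  <{iri}> rdfs:label ?label .\n"
        "  OPTIONAL {\n"
        f"    <{iri}> rdfs:subClassOf ?superClass .\n"
        "    FILTER(isIRI(?superClass))\n"
        "  }\n"
        "}"
    )


if __name__ == "__main__":
    print(build_term_query("OEO_00010257"))
```

The resulting string could be sent to any SPARQL endpoint that has the OEO loaded; whether that would be the existing Fuseki server is still an open question in this thread.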

So far we have avoided implementing a worker for the OEP that could do the tasks in the background and support the Django worker. I don't think it's a problem at the moment, but if there are more users, it could become one. In that case we will have to go for the more complex implementation. This problem is very similar to dealing with big data uploads/downloads, so it could be relevant in any case.

@jh-RLI jh-RLI self-assigned this Nov 11, 2024
@jh-RLI jh-RLI moved this to 📋 Backlog in OEO integration Nov 11, 2024
Author

areleu commented Nov 12, 2024

This would help only for simple queries like everything for this ID “OEO_00010257” as you suggested by "basic API".

I think having the URI resolve to different resources is already very useful; querying the ontology is a completely different beast.

I could implement it, but I don't know how fast I can get the OEP development environment up and running on my computer. I guess the routing happens somewhere here, doesn't it? https://github.com/OpenEnergyPlatform/oeplatform/blob/develop/ontology/views.py

Contributor

jh-RLI commented Nov 13, 2024

Okay :) I agree, this info is already useful.

BTW I kinda forgot for a moment that there is the so-called oep-lookup service available. It is based on the DBpedia Lookup software (which uses Apache Lucene).

There you can use text-based search to find classes in the OEO. The return type is JSON. The service runs a SPARQL query and indexes the OEO terms so that it can weight the results by relevance to the text input.

See:
https://openenergyplatform.org/api/v0/oeo-search?query=test

The result will be something like:

{
    "docs": [
        {
            "score": [
                "399.44183"
            ],
            "definition": [
                "A <B>test</B> data set is a data set used for <B>testing</B>."
            ],
            "label": [
                "<B>test</B> data set"
            ],
            "type": [
                "http://www.w3.org/2002/07/owl#Class"
            ],
            "resource": [
                "http://openenergy-platform.org/ontology/oeo/OEO_00000408"
            ]
        },
        {
            "score": [
                "48.067623"
            ],
            "definition": [
                "Carbon capture is a process that captures carbon dioxide from a gas. <B>test</B>"
            ],
            "altLabel": [
                "CO2-Abscheidung"
            ],
            "label": [
                "carbon capture"
            ],
            "type": [
                "http://www.w3.org/2002/07/owl#Class"
            ],
            "resource": [
                "http://openenergy-platform.org/ontology/oeo/OEO_00010138"
            ]
        },
....
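A response shaped like the one above can be flattened into simple records on the client side. The sketch below is an illustration only: `parse_docs` is a hypothetical helper, the sample is an abbreviated copy of the first result shown above, and it assumes every field arrives as a list of strings with `<B>` highlight tags, as in that example.

```python
import json
import re

# Matches the <B> and </B> highlight tags seen in the search results.
HIGHLIGHT_TAG = re.compile(r"</?B>", re.IGNORECASE)


def first(values, default=None):
    """Return the first list element, or the default for a missing/empty list."""
    return values[0] if values else default


def parse_docs(payload: dict) -> list[dict]:
    """Flatten the 'docs' of a search response into plain records."""
    records = []
    for doc in payload.get("docs", []):
        records.append(
            {
                "resource": first(doc.get("resource")),
                "label": HIGHLIGHT_TAG.sub("", first(doc.get("label"), "")),
                "definition": HIGHLIGHT_TAG.sub("", first(doc.get("definition"), "")),
                "score": float(first(doc.get("score"), "0")),
            }
        )
    return records


# Abbreviated copy of the first result from the response shown above.
sample = json.loads(
    """
    {"docs": [{
        "score": ["399.44183"],
        "definition": ["A <B>test</B> data set is a data set used for <B>testing</B>."],
        "label": ["<B>test</B> data set"],
        "type": ["http://www.w3.org/2002/07/owl#Class"],
        "resource": ["http://openenergy-platform.org/ontology/oeo/OEO_00000408"]
    }]}
    """
)

if __name__ == "__main__":
    for record in parse_docs(sample):
        print(record["resource"], record["label"], record["score"])
```

Stripping the highlight tags matters if the labels are used as identifiers or shown in a UI that does not render HTML.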

Contributor

jh-RLI commented Nov 13, 2024

We might need to update the index there; I'm not sure at the moment whether it uses the latest OEO version.

Contributor

jh-RLI commented Nov 13, 2024

This service should also be able to handle scale, and it is deployed as a microservice, so it will not interfere with the oeplatform service.

@jh-RLI jh-RLI moved this from 📋 Backlog to 🔖 Ready in OEO integration Nov 14, 2024