Use JSON-LD schemas from schema.org #113

Open
ferrisoxide opened this issue Mar 13, 2023 · 3 comments
Labels
enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

Owner

ferrisoxide commented Mar 13, 2023

The lack of any formal structure in the data has always bothered me, as we basically store everything in one blob of JSON data. In the general case this is fine, but when products/items have very definite properties (e.g. a Book has an author, food items have calories) it gets harder to ensure that the data makes sense, or that it can be consumed in a repeatable way.

Proposal

Brocade.io should start using JSON-LD as the base model for all data presented via the API, using schemas published by third parties like https://schema.org.

For instance, if we adopt the "Product" type from schema.org, we can structure product information in a way that is both human-readable and easily processed by applications:

{
  "@context": "https://schema.org/",
  "@type": "Product",
  "name": "Lite Italian Dry Salami",
  "gtin": "00073007107096",
  "countryOfAssembly": "USA",
  "brand": {
    "@type": "Brand",
    "name": "Columbus"
  },
  "material": "processed meat"
}
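As a rough illustration of how the API side might assemble such a document, here is a minimal Python sketch. The function name `product_to_jsonld` and its parameters are hypothetical and not part of the Brocade.io codebase; it only shows that a Product document is an ordinary dictionary with `@context` and `@type` keys added.

```python
import json


def product_to_jsonld(name, gtin, brand_name=None):
    """Build a schema.org Product document as a plain dict (hypothetical helper)."""
    doc = {
        "@context": "https://schema.org/",
        "@type": "Product",
        "name": name,
        "gtin": gtin,
    }
    # Only emit a nested Brand node when a brand is actually known.
    if brand_name:
        doc["brand"] = {"@type": "Brand", "name": brand_name}
    return doc


doc = product_to_jsonld("Lite Italian Dry Salami", "00073007107096", "Columbus")
print(json.dumps(doc, indent=2))
```

Keeping the document as a plain dict means it can be serialized with the standard `json` module, with no JSON-LD specific library needed just to emit it.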

JSON-LD also enables us to add a graph for extended attributes. For example, if nutritional information is available for a product, we can use the NutritionInformation type to present these attributes in a structured manner:

{
  "@context": "https://schema.org/",
  "@type": "Product",
  "name": "Lite Italian Dry Salami",
  "@graph": [
    {
      "@type": "NutritionInformation",
      "calories": "214 kcal",
      "servingSize": "28 g",
      ...
    },
    ...
  ]
}

Other types from schema.org can be used as applicable (e.g. Book, Movie). We can also make use of schemas published by other third parties - or our own custom types - as required, and potentially "future proof" the underlying data model.

We can also use the type information in the frontend, using schema types to determine the best way to present data, e.g. showing nutritional information in a table-like format (see #11). We can also use it to insert Microdata into the HTML, embedding metadata suitable for search engines, web scrapers and the like to consume.
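One way the frontend idea above could work is a lookup from `@type` to a rendering function. The sketch below is only illustrative: `RENDERERS`, `render_nutrition`, `render_default` and `render` are hypothetical names, and it assumes `@type` is a single string (JSON-LD also allows a list of types, which this does not handle).

```python
def render_nutrition(node):
    """Render a NutritionInformation node as simple "label: value" rows."""
    rows = [(key, value) for key, value in node.items() if not key.startswith("@")]
    return "\n".join(f"{key}: {value}" for key, value in rows)


def render_default(node):
    """Fallback for types without a dedicated presentation: show the name."""
    return str(node.get("name", ""))


# Hypothetical registry mapping schema.org type names to renderers.
RENDERERS = {"NutritionInformation": render_nutrition}


def render(node):
    """Pick a renderer based on the node's @type, falling back to the default."""
    renderer = RENDERERS.get(node.get("@type"), render_default)
    return renderer(node)


nutrition = {
    "@type": "NutritionInformation",
    "calories": "214 kcal",
    "servingSize": "28 g",
}
print(render(nutrition))
```

New types would then only need a new entry in the registry rather than changes to the calling code.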

Benefits

  • Leverage existing data structures and tools
  • Consistent results from the API
  • Helps inform the presentation in the UI

Risks / Possible Problems

  • Harder to parse incoming data
  • Large amount of existing data that needs to be processed

We can mitigate the second problem by processing individual products on demand and progressively updating the data. The problem of parsing incoming data requires a bit more thought and investigation, but it looks solvable.
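The on-demand approach could look roughly like the following sketch: convert a product the first time it is requested and keep the result, so the existing data is migrated gradually. Everything here is an assumption for illustration; the `store` dict stands in for the products table, and `get_jsonld` and `convert` are hypothetical names.

```python
def get_jsonld(store, gtin, convert):
    """Return the JSON-LD form of a product, converting it on first access."""
    record = store[gtin]
    if record.get("jsonld") is None:
        # First request for this product: convert once and keep the result.
        record["jsonld"] = convert(record["properties"])
    return record["jsonld"]


calls = []  # records each conversion so we can see how often it runs


def convert(properties):
    """Hypothetical converter from the legacy property blob to a Product document."""
    calls.append(properties)
    return {"@context": "https://schema.org/", "@type": "Product", **properties}


# In-memory stand-in for the products table, keyed by GTIN.
store = {"00073007107096": {"properties": {"name": "Lite Italian Dry Salami"}}}

first = get_jsonld(store, "00073007107096", convert)
second = get_jsonld(store, "00073007107096", convert)
print(len(calls))
```

The second lookup reuses the stored document, so each product pays the conversion cost only once.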

ferrisoxide added the "help wanted" and "enhancement" labels on Mar 13, 2023
ferrisoxide changed the title from "Proposal: use JSON schemas from schema.org [LOOKING FOR FEEDBACK]" to "Proposal: use JSON-LD schemas from schema.org [LOOKING FOR FEEDBACK]" on Mar 13, 2023
ferrisoxide changed the title from "Proposal: use JSON-LD schemas from schema.org [LOOKING FOR FEEDBACK]" to "Proposal: use JSON-LD schemas from schema.org" on Mar 14, 2023
Owner Author

ferrisoxide commented Mar 17, 2023

Notes

Checking unique keys in the products.properties field:

SELECT DISTINCT json_data.key
FROM products, jsonb_each(products.properties) AS json_data;

returns

 potassium
 weight_g
 fiber
 carbohydrate
 sugars
 unit_count
 size
 author
 monounsaturated_fat
 volume_ml
 ingredients
 polyunsaturated_fat
 fat_calories
 alcohol_by_volume
 calories
 weight_ounce
 servings_per_container
 trans_fat
 format
 saturated_fat
 pages
 fat
 volume_fluid_ounce
 sodium
 serving_size
 protein
 publisher
 cholesterol

There are only a small number of keys to worry about - and most will map to schema.org types. I think I'll go ahead and start introducing JSON-LD to the API.
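As a sketch of what that mapping might look like for the nutrition-related keys, the snippet below translates existing keys to schema.org NutritionInformation property names. The `NUTRITION_KEYS` table and `split_properties` function are hypothetical; values are passed through unchanged, since the units and format of the stored data are not specified here. Keys without an obvious counterpart in this sketch (e.g. potassium) are simply left unmapped.

```python
# Existing property key -> schema.org NutritionInformation property name.
NUTRITION_KEYS = {
    "calories": "calories",
    "carbohydrate": "carbohydrateContent",
    "cholesterol": "cholesterolContent",
    "fat": "fatContent",
    "fiber": "fiberContent",
    "protein": "proteinContent",
    "saturated_fat": "saturatedFatContent",
    "serving_size": "servingSize",
    "sodium": "sodiumContent",
    "sugars": "sugarContent",
    "trans_fat": "transFatContent",
}


def split_properties(properties):
    """Split a legacy property blob into mapped nutrition fields and leftovers."""
    mapped, unmapped = {}, {}
    for key, value in properties.items():
        if key in NUTRITION_KEYS:
            mapped[NUTRITION_KEYS[key]] = value
        else:
            unmapped[key] = value
    return mapped, unmapped


mapped, unmapped = split_properties(
    {"calories": "214", "sugars": "0", "potassium": "100", "author": "Someone"}
)
print(mapped)
print(unmapped)
```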

ferrisoxide changed the title from "Proposal: use JSON-LD schemas from schema.org" to "Use JSON-LD schemas from schema.org" on Mar 17, 2023
Owner Author

ferrisoxide commented Mar 17, 2023

Notes

Per discussions on Schema.org, repeated values are fine. This is valid Product data:

{
  "@context": "https://schema.org",
  "@type": "Product",
  "additionalProperty": [
    {
      "@type": "PropertyValue",
      "name": "myCustomProperty",
      "value": "my custom value"
    },
    {
      "@type": "PropertyValue",
      "name": "myOtherCustomProperty",
      "value": "my other custom value"
    }
  ],
  ...
}

so we can use an additionalProperty array to record anything that doesn't fit elsewhere.
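A minimal sketch of that fallback, assuming leftover attributes arrive as a flat key/value dict. The helper name `to_additional_properties` is hypothetical; the output shape follows the PropertyValue example above.

```python
def to_additional_properties(leftovers):
    """Wrap each leftover key/value pair in a schema.org PropertyValue node."""
    return [
        {"@type": "PropertyValue", "name": name, "value": value}
        for name, value in leftovers.items()
    ]


product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Lite Italian Dry Salami",
    # Anything without a dedicated schema.org property ends up here.
    "additionalProperty": to_additional_properties({"unit_count": "1"}),
}
print(product["additionalProperty"])
```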

NB: we can use https://validator.schema.org/ to validate data.

Owner Author

ferrisoxide commented Apr 1, 2023

DEV NOTE

Might also be worth getting my head around SHACL

https://www.w3.org/TR/shacl/

Also, JSON-LD Best Practices

https://w3c.github.io/json-ld-bp/#bp-summary
