Extract Lexeme IDs in SPARQL Queries for Language Totals #110

Merged: 7 commits, Mar 20, 2024

Conversation

mhmohona
Member

Contributor checklist


Description

The SELECT statement in the query has been updated to include (REPLACE(STR(?lexeme), "http://www.wikidata.org/entity/", "") as ?lexemeID) as the first element. This modification ensures that the query returns the lexeme ID alongside the word category and its counts, aligning with the requirements outlined in the issue discussion and the specific approach suggested by Andrew.
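
As a rough illustration of what that expression yields, the Python sketch below mirrors `REPLACE(STR(?lexeme), "http://www.wikidata.org/entity/", "")` on a plain string. The function name `lexeme_id` and the sample URI are hypothetical and are not part of this PR. One difference worth noting: SPARQL's `REPLACE` treats its pattern as a regular expression, while Python's `str.replace` matches it literally.

```python
# Prefix that Wikidata places in front of every entity identifier.
WIKIDATA_ENTITY_PREFIX = "http://www.wikidata.org/entity/"


def lexeme_id(lexeme_uri: str) -> str:
    """Return the bare lexeme ID (e.g. "L123") for a full entity URI."""
    # Literal replacement; a string without the prefix is returned unchanged.
    return lexeme_uri.replace(WIKIDATA_ENTITY_PREFIX, "")


print(lexeme_id("http://www.wikidata.org/entity/L123"))  # prints L123
```

With the ID in this short form, downstream code can store or compare lexemes without carrying the full URI around.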

Related issue


github-actions bot commented Mar 19, 2024

Thank you for the pull request!

The Scribe team will do our best to address your contribution as soon as we can. The following is a checklist for maintainers to make sure this process goes as well as possible. Feel free to address the points below yourself in further commits if you realize that actions are needed :)

If you're not already a member of our public Matrix community, please consider joining! We'd suggest using Element as your Matrix client, and definitely join the General and Data rooms once you're in. It'd be great to have you!

Maintainer checklist

  • The commit messages for the remote branch should be checked to make sure the contributor's email is set up correctly so that they receive credit for their contribution

    • The contributor's name and icon in remote commits should be the same as what appears in the PR
    • If there's a mismatch, the contributor needs to make sure that the email they use for GitHub matches what they have for git config user.email in their local Scribe-Data repo
  • The CHANGELOG has been updated with a description of the changes for the upcoming release and the corresponding issue (if necessary)

@andrewtavis andrewtavis self-requested a review March 19, 2024 08:07
@andrewtavis
Member

Thanks for this, @mhmohona! Will look to review it as soon as I can :) :)

@andrewtavis
Member

Hey @mhmohona 👋 Am realizing the directions here weren't quite what they should have been. The three queries you edited are actually the only three that don't need this change :) Specifically if you go into the extract_transform/languages directory and then find queries like extract_transform/languages/French/nouns/query_nouns.sparql, these are the files we want to add this line to 😊

Can you go through and remove the edits to the current files and send along versions of all instances of query_nouns.sparql, query_verbs.sparql and query_prepositions.sparql that have the lexemeID line included?
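
One way to list the files this request covers is a short Python sketch like the following. The helper name `find_query_files` and its `root` argument are hypothetical. The sketch assumes the `<language>/<word type>/query_<word type>.sparql` layout seen in the French example path, and where `extract_transform/languages` sits relative to the working directory is left to the caller.

```python
from pathlib import Path

# File names listed in the review comment above.
TARGET_NAMES = {
    "query_nouns.sparql",
    "query_verbs.sparql",
    "query_prepositions.sparql",
}


def find_query_files(root):
    """Return every targeted query file under root, sorted by path."""
    return sorted(
        path
        for path in Path(root).rglob("query_*.sparql")
        if path.name in TARGET_NAMES
    )


if __name__ == "__main__":
    # Assumed location; adjust to wherever the languages directory lives.
    for path in find_query_files("extract_transform/languages"):
        print(path)
```

Searching recursively instead of hard-coding language names means newly added languages are picked up without changing the script.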

@mhmohona
Member Author

Oops! 🫠 Let me update it.

@mhmohona
Member Author

@andrewtavis this PR is up for review now!

@@ -1,7 +1,9 @@
# All Arabic (Q13955) nouns.
# Enter this query at https://query.wikidata.org/.

SELECT DISTINCT ?lexeme ?noun WHERE {
Member Author

@andrewtavis just one thing: this is where I'm supposed to make the change, isn't it?

Member

Yes, that'd be great, @mhmohona :)

Member Author

So are the changes I made sufficient?

Member

This will be fine, @mhmohona :) I'll go through and do the review and add in the line for you 😊

Member

@andrewtavis andrewtavis left a comment

Changes I sent along were just minor formatting, @mhmohona 😊 Thanks for all the help here! 🚀

@andrewtavis andrewtavis merged commit f6f593d into scribe-org:main Mar 20, 2024
2 checks passed
@mhmohona mhmohona deleted the lexemes branch August 28, 2024 01:29
Development

Successfully merging this pull request may close these issues.

Option to grab the Wikidata lexemes for queried words
2 participants