Indication of vowel length interfering with xml parsing #1
Comments
Which data are you trying to parse? We'll be converting them to long and short accent marks at some point.
This was in the 'formario'. Distinctions of length are marked with single (breve) or double (longum) quotes, which was interfering with attribute strings. As a quick fix I replaced them with another symbol, but a Unicode character might be best?
William Michael Short
Lecturer in Classics
Department of Classics & Ancient History
University of Exeter
…________________________________
From: Greta Franzini <notifications@github.com>
Sent: Monday, December 10, 2018 4:01:32 PM
To: CIRCSE/WFL
Cc: William Michael Short; Author
Subject: Re: [CIRCSE/WFL] Indication of vowel length interfering with xml parsing (#1)
The problem is caused by a sloppy/buggy output of LemLat.
Agreed, the HTML double-quote entity: &quot;
…________________________________
From: gersh0m <notifications@github.com>
Sent: Monday, December 10, 2018 5:21:57 PM
To: CIRCSE/WFL
Cc: William Michael Short; Author
Subject: Re: [CIRCSE/WFL] Indication of vowel length interfering with xml parsing (#1)
The problem is caused by a sloppy/buggy output of LemLat.
You don't need any Unicode here!
You just need to use the corresponding character codes for the (double) quote, as basic XML syntax requires...
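For illustration, a minimal Python sketch of that escaping approach. The `entry` element, the `form` attribute, and the sample form `ama"re` are hypothetical and not taken from the actual LemLat output; the only assumption is that the length-marked form ends up inside a double-quoted XML attribute.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Predefined XML entities for the two quote characters.
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}


def to_attribute(value: str) -> str:
    """Return value as a double-quoted XML attribute literal.

    escape() handles &, < and >; the extra mapping handles both quotes.
    """
    return '"' + escape(value, QUOTE_ENTITIES) + '"'


def build_entry(form: str) -> str:
    # Hypothetical element and attribute names, for illustration only.
    return "<entry form={}/>".format(to_attribute(form))


form = 'ama"re'  # hypothetical form: the double quote marks a long vowel
xml_line = build_entry(form)

# A standard parser reads the escaped attribute back as the original string.
parsed_form = ET.fromstring(xml_line).get("form")
```

With the quotes written as entities, the quote-based length coding survives a round trip through the parser unchanged, so no change of symbol is strictly needed for well-formedness.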
The coding scheme for vowel length (a following " for long vowels and a following ' for short vowels, where this indicates an etymological difference) is interfering with XML parsing. I would suggest a different coding scheme entirely (e.g., a following : for long vowels) or, where possible, relying on the already-included morphological information for differentiation.
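One possible alternative coding, sketched in Python under stated assumptions: each " or ' immediately follows the vowel it marks, and the sample forms below are hypothetical rather than taken from the data. The function swaps the quote for a Unicode combining macron or breve and normalizes to precomposed characters where they exist.

```python
import unicodedata

MACRON = "\u0304"  # combining macron, marks a long vowel
BREVE = "\u0306"   # combining breve, marks a short vowel


def mark_vowel_length(form: str) -> str:
    """Replace a trailing " or ' with a combining macron or breve.

    Assumes the quote directly follows the vowel it marks; a quote with
    nothing before it is left unchanged.
    """
    out = []
    for ch in form:
        if ch == '"' and out:
            out.append(MACRON)
        elif ch == "'" and out:
            out.append(BREVE)
        else:
            out.append(ch)
    # NFC composes base letter + combining mark into one code point
    # when a precomposed character is available.
    return unicodedata.normalize("NFC", "".join(out))


long_form = mark_vowel_length('ama"re')   # hypothetical input
short_form = mark_vowel_length("le'go")   # hypothetical input
```

The converted forms contain no quote characters, so they can be placed in attribute strings without escaping.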