Elements not documented in the EGD #299
Comments
I believe the contents of those two repositories are all files encoded for previous projects and not (yet) migrated to DHARMA norms. If Arlo confirms that, I think it would be best to just ignore them for the time being, and perhaps create a new list of undocumented elements occurring outside those repos.
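A new list of undocumented elements occurring outside those two repositories could be produced along these lines. This is only a minimal sketch: the `DOCUMENTED` set is a hypothetical stand-in for the real EGD element list, and it assumes all project repositories are checked out as subdirectories of one local folder.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path

# Hypothetical inputs: replace with the real EGD element list and repo names.
DOCUMENTED = {"TEI", "teiHeader", "text", "body", "div", "p", "bibl", "ptr"}
SKIPPED_REPOS = {"tfb-eiad-epigraphy", "tfc-campa-epigraphy"}


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def undocumented_elements(root_dir, documented=DOCUMENTED, skipped=SKIPPED_REPOS):
    """Count elements not in `documented`, ignoring files under skipped repos."""
    counts = Counter()
    root_dir = Path(root_dir)
    for path in sorted(root_dir.rglob("*.xml")):
        # Skip any file whose relative path passes through a skipped repo.
        if skipped & set(path.relative_to(root_dir).parts):
            continue
        try:
            tree = ET.parse(path)
        except ET.ParseError:
            continue  # a real survey should report malformed files instead
        for elem in tree.iter():
            name = local_name(elem.tag)
            if name not in documented:
                counts[name] += 1
    return counts
```

Calling `undocumented_elements("path/to/checkouts").most_common()` would then give the element names sorted by frequency, which is probably the most useful view for deciding what to document.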
I don't know if any of the other elements are to be recognised as legitimate and in what contexts.
@danbalogh Thank you. My bad for
Thanks. I'd like to know where
To my mind, we never really discussed how to deal with legacy files obtained by conversion to our model from the Campa and EIAD corpora after Axelle had handled the import. We should probably have that discussion now. Unless you feel it is a bad idea, I'd be happy to delete from the teiHeaders in our converted INSCIC files all such elements that are inherited from the ancestor files. I can do this manually, or perhaps @michaelnmmeyer could automate the process. The percentage of files inherited from the earlier Campa corpus that will eventually be part of tfc-campa-epigraphy will be less than 25%, I think, and we don't need to be slavish about whatever best practices may be for reuse of XML data.
In fact, I had been meaning to discuss the issue of EIAD files with @michaelnmmeyer. These were imported by Axelle at a fairly early stage of the project from my private iksvaku-inscriptions repo. Since it is the latter which has the source code for http://hisoma.huma-num.fr/exist/apps/EIAD/index2.html, and my collaborator Vincent Tournier requested some updates after Axelle imported the XML source files to erc-dharma, a small number of discrepancies have arisen, with better data in iksvaku-inscriptions than what we have in tfb-eiad-epigraphy. I estimate it's a handful of cases, and they can probably be tracked down easily via the record of commits on iksvaku-inscriptions.
Would @michaelnmmeyer be willing to track down and list the meaningful differences if I gave him access to iksvaku-inscriptions, so that we can then freeze that repo, implement the same changes on erc-dharma, and use only the latter versions of the EIAD files henceforward?
If you think the extra data in CIC and EIAD files is unnecessary, I can delete it. For the EIAD files, I can produce a diff between the files Axelle processed and the latest revision.
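Such a diff could be sketched as follows. The sketch assumes the imported files and the latest revision are available as two local directories (both paths are hypothetical); if both states live in the same Git history, a plain `git diff` between the two commits would do the same job.

```python
import difflib
from pathlib import Path


def read_lines(path):
    """Return the file's lines (with line endings), or [] if it does not exist."""
    if not path.exists():
        return []
    return path.read_text(encoding="utf-8").splitlines(keepends=True)


def diff_trees(old_dir, new_dir, pattern="*.xml"):
    """Yield (filename, unified diff) for each file that differs between two directories."""
    old_dir, new_dir = Path(old_dir), Path(new_dir)
    names = {p.name for p in old_dir.glob(pattern)} | {p.name for p in new_dir.glob(pattern)}
    for name in sorted(names):
        old = read_lines(old_dir / name)
        new = read_lines(new_dir / name)
        if old != new:
            diff = difflib.unified_diff(
                old, new, fromfile=f"imported/{name}", tofile=f"latest/{name}"
            )
            yield name, "".join(diff)
```

Iterating over `diff_trees("imported", "latest")` and printing each diff gives a list of candidate differences; deciding which of them are meaningful would still need a human reader.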
Thanks @michaelnmmeyer. I will take a look at those cases of
About the removal of extra metadata from CIC and EIAD files, I'd like to have @danbalogh's advice. Can you look at a few files, Dan? In tfc-campa-epigraphy, converted files include DHARMA_INSCIC00001.xml, DHARMA_INSCIC00001.xml and DHARMA_INSCIC00064.xml. Thanks!
I would recommend against deleting any data that have already been encoded, unless we are very sure we don't need them. We still don't have a definitive setup for encoding roles and responsibilities in our DHARMA editions. We should perhaps try to sort that out in the EGD working group. At any rate, I think that until then the extra TEI header data in CIC and EIAD files should be either just ignored or, if its presence bothers someone, commented out.
Thanks Dan. In the CIC files, there is stuff like
@michaelnmmeyer: do you think commenting out by machine is an option? Or would you only be able to automate the process if we opt for deletion?
The main issue indeed touches on encoding roles and editorial responsibilities. I think it is a high priority to bring that discussion to a conclusion, and I'd be happy if you could take the lead. We have potentially thousands of files to be revised on this matter once the decisions have been taken, so we'd better get our act together.
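Commenting out by machine looks feasible in principle. The following is a rough sketch using only Python's standard library, not a tested migration tool: the function name `comment_out` and the choice of `biblFull` as the element to hide are illustrative assumptions.

```python
import re
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)  # serialize TEI elements without an 'ns0:' prefix


def comment_out(parent, names):
    """Replace each descendant whose local name is in `names` with an XML
    comment holding its serialized markup. Returns the number replaced."""
    replaced = 0
    for index, child in enumerate(list(parent)):
        if not isinstance(child.tag, str):
            continue  # already a comment or processing instruction
        if child.tag.rsplit("}", 1)[-1] in names:
            tail, child.tail = child.tail, None  # keep trailing whitespace outside the comment
            markup = ET.tostring(child, encoding="unicode")
            # '--' is not allowed inside an XML comment, so break up hyphen runs.
            markup = re.sub(r"-(?=-)", "- ", markup)
            comment = ET.Comment(" " + markup + " ")
            comment.tail = tail
            parent[index] = comment
            replaced += 1
        else:
            replaced += comment_out(child, names)
    return replaced
```

For example, `comment_out(tree.getroot(), {"biblFull"})` would hide every `biblFull` element in a parsed file. Two caveats apply: the hyphen replacement alters the commented text slightly, and ElementTree's default parser drops existing comments and any processing instructions placed before the root element, so a real run would need a parser configured to keep them (for instance `ET.TreeBuilder(insert_comments=True, insert_pis=True)` in Python 3.8+) or a different library.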
While cleaning up our schema, I found a few elements that are not documented in the EGD but that occur in a significant number of inscriptions. This mainly concerns texts from tfb-eiad-epigraphy and from tfc-campa-epigraphy. Here is the list:
All these elements but <term> appear in the <teiHeader>, and most of them are due to the addition of bibliographical data under <sourceDesc>.
I am not sure what to do with the data, but, in any case, I would prefer not to allow bibliographic entries to be encoded in TEI (with <biblFull>). Things would be simpler for me if we used <bibl><ptr ref="..."/></bibl> with a Zotero entry everywhere.