The cesspool that is MS Word #2
Comments
Tables – structures
I wonder, are these all the same, having headers with at least one identifiable first column like
Examples from DH:
I don't think they're all the same, unfortunately. Here's one from S5C2, which uses "Field" rather than "Field Name". Another
Tables – property lists
Example from DH:
I didn't check too many docs, but in S4, I think there's some consistency. This is S4G489:
S4G548 has a table with the same headers.
Tables – property values
Example from DH:
Probably no consistency here. This is from S4G387; note the property name (RowSourceType) in the column header:
Tables – code examples
Possibly easy to identify as squat 2-column table with
Example from DH:
I'm pretty sure this is consistent.
Tables – "colspan" headers
These will likely be problematic. Easy enough to identify, I reckon.
Examples from
Tables – simple
And then we have simple, generic tables; these should be easy. These are just another mundane list?
Example from
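As a starting point for discussion, the table patterns listed above could be told apart with a few header and shape checks. This is only a rough sketch, not a settled tactic. It assumes each table has already been pulled out of Word as a list of rows of cell text (a hypothetical intermediate form), that a merged header cell shows up as a single cell in its row, and that "squat" means two rows or fewer. The names `classify`, `STRUCTURE_FIRST_HEADERS` and the returned labels are made up for illustration; only the "Field" and "Field Name" headers come from the examples in this thread.

```python
from typing import List

# Hypothetical intermediate form: a table is a list of rows, a row a list of cell text.
Table = List[List[str]]

# First-column headers seen so far on structure tables ("Field" in S5C2, "Field Name" elsewhere).
# Assumption: more variants may turn up and would need adding here.
STRUCTURE_FIRST_HEADERS = {"field", "field name"}


def normalize(cell: str) -> str:
    """Lower-case and collapse whitespace so 'Field  Name' matches 'field name'."""
    return " ".join(cell.split()).lower()


def has_spanning_row(rows: Table) -> bool:
    """True if some row has fewer cells than the widest row (a likely colspan)."""
    widths = [len(row) for row in rows]
    return bool(widths) and min(widths) < max(widths)


def classify(rows: Table) -> str:
    """Guess which pattern a table follows; the labels are illustrative only."""
    if not rows or not rows[0]:
        return "empty"
    if has_spanning_row(rows):
        return "colspan"
    if normalize(rows[0][0]) in STRUCTURE_FIRST_HEADERS:
        return "structure"
    # Assumption: "squat" means at most two rows; tune once more samples are checked.
    if len(rows[0]) == 2 and len(rows) <= 2:
        return "maybe-code-example"
    return "simple"
```

Property lists and property values are left out on purpose: their headers are not spelled out here, and the property-value tables apparently carry the property name itself in the header, so they would need a different kind of check.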
Presently stymied by the utter BS that is MS Word.
This issue is a place to list content patterns and devise tactics for extracting content from Word.
This issue is a spec that is a work in progress.
Feel free to edit and augment comments as opposed to adding to the thread. We can create separate issues for individual tactics, cross-referencing them back here.
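One possible extraction tactic, offered as a sketch rather than a decision: if the source documents are (or can be saved as) `.docx`, the tables can be read with nothing but the Python standard library, since a `.docx` file is a zip archive whose main body lives in `word/document.xml`. The `.docx` assumption is mine; legacy `.doc` files would need converting first. The function name `read_tables` is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET
from typing import List

# WordprocessingML namespace used by elements in word/document.xml.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def read_tables(path_or_file) -> List[List[List[str]]]:
    """Return every table in a .docx as rows of cell text.

    Caveats of this sketch: nested tables are also returned as separate
    entries (and their text appears in the enclosing cell), and rows wrapped
    in content controls are skipped because only direct children are read.
    """
    with zipfile.ZipFile(path_or_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    tables = []
    for tbl in root.iter(W + "tbl"):
        rows = []
        for tr in tbl.findall(W + "tr"):
            cells = []
            for tc in tr.findall(W + "tc"):
                # Join the text runs of each paragraph, then separate paragraphs with newlines.
                paragraphs = [
                    "".join(t.text or "" for t in p.iter(W + "t"))
                    for p in tc.findall(W + "p")
                ]
                cells.append("\n".join(paragraphs))
            rows.append(cells)
        tables.append(rows)
    return tables
```

A horizontally merged header cell is stored as a single cell in its row, so such rows come back shorter than the others, which should make the "colspan" tables easy to spot in the output.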