Added Unicode and emoji support #35
I had a project in which I needed to parse an XML format intended for laws
in Portuguese, so I would get things like:
```xml
<capítulo id="cap_Introdução" título="Introdução">
```
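To show that element and attribute names like these are ordinary, parseable XML, here is a minimal sketch using only Go's standard `encoding/xml` package. The helper name `firstElement` is hypothetical and not part of this library, and the closing tag is added only so the fragment is a complete document.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// firstElement is a hypothetical helper: it returns the local name and the
// attributes of the first start element found in doc.
func firstElement(doc string) (string, map[string]string, error) {
	dec := xml.NewDecoder(strings.NewReader(doc))
	for {
		tok, err := dec.Token()
		if err != nil {
			// Includes io.EOF when the input has no start element.
			return "", nil, err
		}
		if se, ok := tok.(xml.StartElement); ok {
			attrs := make(map[string]string, len(se.Attr))
			for _, a := range se.Attr {
				attrs[a.Name.Local] = a.Value
			}
			return se.Name.Local, attrs, nil
		}
	}
}

func main() {
	name, attrs, err := firstElement(`<capítulo id="cap_Introdução" título="Introdução"></capítulo>`)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(name, attrs["título"])
}
```

Since the XML layer already accepts these names, a query against such a document would need the same non-ASCII names in the XPath expression itself, for example `//capítulo/@título`.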
…On Wed, 6 Mar 2019, 00:15 zhengchun, ***@***.***> wrote:
Hello, first, thanks for your contribution, but your example
***@***.***ášs日本語 doesn't make sense. In most cases, HTML, XML and
JSON documents use only English letters, digits, or underscores. You should
provide more examples showing why Unicode queries need to be supported. Thanks.
I need it too, because the Federal Taxation Service of the Russian Federation uses Cyrillic characters in its XML files.
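Both use cases come down to the expression scanner accepting non-ASCII characters in names. As a rough sketch only (the functions `isNameStart`, `isNameChar`, and `scanName` are hypothetical and are not this library's actual scanner code), a Unicode-aware name check could lean on Go's standard `unicode` package:

```go
package main

import (
	"fmt"
	"unicode"
)

// isNameStart reports whether r may begin a name (hypothetical rule:
// any Unicode letter or an underscore).
func isNameStart(r rune) bool {
	return r == '_' || unicode.IsLetter(r)
}

// isNameChar reports whether r may continue a name (hypothetical rule:
// a name-start character, a Unicode digit, '-' or '.').
func isNameChar(r rune) bool {
	return isNameStart(r) || unicode.IsDigit(r) || r == '-' || r == '.'
}

// scanName returns the longest prefix of s that forms a name under the
// rules above, or "" if s does not start with a name.
func scanName(s string) string {
	for i, r := range s {
		if i == 0 && !isNameStart(r) {
			return ""
		}
		if i > 0 && !isNameChar(r) {
			return s[:i]
		}
	}
	return s
}

func main() {
	fmt.Println(scanName("clášs日本語='x'"))
	fmt.Println(scanName("Документ/Файл"))
}
```

Ranging over the string by rune rather than by byte is what keeps multi-byte characters intact; a byte-wise ASCII check would cut such names apart.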
go 1.12

require github.com/antchfx/xpath v0.0.0-20190319080838-ce1d48779e67
Wait, why add go.mod in this patch? It has nothing to do with moving the library to Go modules. Have we discussed the module management strategy going forward and agreed on this approach?
@@ -329,6 +330,7 @@ func (p *parser) parsePathExpr(n node) node {
}
} else {
	opnd = p.parseLocationPath(nil)
+
Why add this empty line?
@@ -363,6 +365,7 @@ func (p *parser) parseLocationPath(n node) (opnd node) {
p.next()
opnd = newRootNode("//")
opnd = p.parseRelativeLocationPath(newAxisNode("descendant-or-self", "", "", "", opnd))
+
ditto
	if err != nil {
		t.Fatalf("//ul/li/@clášs日本語 should be correct but got error %s", err)
	}
}
While this does seem to cover the additions you made in the scanner/parse.go code, it would be nice to have one or more tests showing that the compiled UTF-8-friendly expression actually works on real HTML/XML, i.e. that select works.