-
Notifications
You must be signed in to change notification settings - Fork 125
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
implement io.StringWriter #32
base: master
Are you sure you want to change the base?
Conversation
murmur.go
Outdated
// us take over the default behavior of the compiler to have the byte slice | ||
// share the underlying memory buffer of the string and avoid the extra heap | ||
// allocation. | ||
return d.Write(*(*[]byte)(unsafe.Pointer(&reflect.SliceHeader{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This conversion is unsafe (see golang/go#40701)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this case is correct because it is covered by the rules 1, 3, and 6 of unsafe:
(1) Conversion of a *T1 to Pointer to *T2.
(3) Conversion of a Pointer to a uintptr and back, with arithmetic.
(6) Conversion of a reflect.SliceHeader or reflect.StringHeader Data field to or from Pointer.
https://golang.org/pkg/unsafe/#Pointer
Let me know if I'm missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's hard for me to actually tell and I wish this type of conversion were documented as explicitly safe or unsafe. I took this as violating rule 6 because the slice doesn't refer to an actual underlying slice, the slice is being made up out of thin air from a reflect.SliceHeader. What's more obviously safe to me is when the slice header is referring to an underlying slice beforehand.
It is entirely possible that this is safe, I just haven't trusted it since the six unsafe rules came out (this was my original go to for how to convert strings to slices).
As an aside, I think the 40701 I mentioned in the first ticket refers to other behavior that this is definitely not doing: taking in a byte pointer and making a slice out of that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/jlauinger/go-safer
I believe this is effectively patterns 1 and 2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it is pattern 1 and 2, because the conversion to and from unsafe is done in a single expression.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, looks like the Go doc may have been updated to explicitly indicate that this conversion may not be safe:
In general, reflect.SliceHeader and reflect.StringHeader should be used only as *reflect.SliceHeader and *reflect.StringHeader pointing at actual slices or strings, never as plain structs. A program should not declare or allocate variables of these struct types.
Here's an alternative approach we could try:
https://github.com/cespare/xxhash/blob/master/xxhash_unsafe.go#L52-L57
// sliceHeader is similar to reflect.SliceHeader, but it assumes that the layout
// of the first two words is the same as the layout of a string.
type sliceHeader struct {
s string
cap int
}
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think something like twmb/murmur3@610077f is safer, and afaict it's just as fast.
Hello @twmb, would you like to get this PR merged or would you prefer not to bring it in? |
I'm not a gate on this repo, I just saw the PR and decided to look since I fixed roughly the same thing in my own fork from way back. |
We're maintaining a fork as well which includes this optimization https://github.com/segmentio/murmur3 (in case it is useful to someone). |
Convert to go module
Return opaque types
Hello!
This PR adds a
WriteString
method to the internaldigest
type to implement theio.StringWriter
interface and avoid memory copies when hashing strings.I added a benchmark and ran it before and after the change to demonstrate that a heap allocation is removed by providing this method:
The use of the
unsafe
package is a bit unfortunate, but it seemed like the package already depended on it, so this change doesn't introduce a new dependency onunsafe
.Let me know if you'd like to see anything changed!