
Improve transport.Post Do method #3373

Merged 3 commits from lkeix:master into 99designs:master on Nov 15, 2024
Conversation

@lkeix (Contributor) commented Nov 13, 2024

Description

This PR optimizes the Do method in transport.Post by addressing memory allocation inefficiencies. Specifically:

  1. Removed unnecessary []byte to string conversions in the Do method, which were causing additional memory allocations.
  2. Introduced sync.Pool for graphql.RawParams to enable memory reuse, reducing overall allocations (see the sketch below).

These changes aim to improve the performance of the gqlgen server, especially in high-throughput scenarios.
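
For illustration, here is a minimal sketch of the pooling pattern described in point 2. It is not the merged code, just the general shape of reusing graphql.RawParams values across requests; the helper names getRawParams and putRawParams are made up for this example.

package transport

import (
	"sync"

	"github.com/99designs/gqlgen/graphql"
)

var rawParamsPool = sync.Pool{
	New: func() any {
		return &graphql.RawParams{}
	},
}

func getRawParams() *graphql.RawParams {
	return rawParamsPool.Get().(*graphql.RawParams)
}

func putRawParams(p *graphql.RawParams) {
	// Reset the struct before returning it to the pool so the next request
	// does not observe stale query text or variables.
	*p = graphql.RawParams{}
	rawParamsPool.Put(p)
}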

Related Issue

Resolves #3372

Benchmark

I ran a simple benchmark 5 times.
The schema is the one generated by project initialization.

Resolver implementation:

// CreateTodo is the resolver for the createTodo field.
func (r *mutationResolver) CreateTodo(ctx context.Context, input model.NewTodo) (*model.Todo, error) {
	return &model.Todo{}, nil
}

// Todos is the resolver for the todos field.
func (r *queryResolver) Todos(ctx context.Context) ([]*model.Todo, error) {
	return []*model.Todo{}, nil
}
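
For reference, a benchmark in the shape described above might look like the following. This is a minimal sketch, not the exact code used in the PR; it assumes NewExecutableSchema, Config, and Resolver all live in a single graph package as produced by a default gqlgen init todo project (the layout varies between gqlgen versions), and the function name matches the BenchmarkReponse labels in the results below.

package graph

import (
	"net/http/httptest"
	"strings"
	"testing"

	"github.com/99designs/gqlgen/graphql/handler"
)

func BenchmarkReponse(b *testing.B) {
	// Build the stock gqlgen HTTP handler around the generated schema.
	srv := handler.NewDefaultServer(NewExecutableSchema(Config{Resolvers: &Resolver{}}))
	body := `{"query":"{ todos { id text done } }"}`

	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		req := httptest.NewRequest("POST", "/query", strings.NewReader(body))
		req.Header.Set("Content-Type", "application/json")
		w := httptest.NewRecorder()
		srv.ServeHTTP(w, req)
	}
}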

Result

Before

BenchmarkReponse-16    	    4557	    260146 ns/op	   21297 B/op	     157 allocs/op
BenchmarkReponse-16    	    3516	    308304 ns/op	   21351 B/op	     157 allocs/op
BenchmarkReponse-16    	    3307	    353268 ns/op	   21485 B/op	     157 allocs/op
BenchmarkReponse-16    	    3379	    343946 ns/op	   21434 B/op	     157 allocs/op
BenchmarkReponse-16    	    3438	    400341 ns/op	   21276 B/op	     157 allocs/op

After

BenchmarkReponse-16    	    4507	    253248 ns/op	   21395 B/op	     155 allocs/op
BenchmarkReponse-16    	    3798	    311202 ns/op	   21364 B/op	     155 allocs/op
BenchmarkReponse-16    	    3756	    312710 ns/op	   21268 B/op	     155 allocs/op
BenchmarkReponse-16    	    3168	    362442 ns/op	   21387 B/op	     155 allocs/op
BenchmarkReponse-16    	    2730	    523350 ns/op	   21286 B/op	     155 allocs/op

Checklist

I have:

  • Added tests covering the optimization (see testing)
  • Updated any relevant documentation (see docs)

@coveralls

Coverage Status: 59.578% (+0.03%) from 59.549% when pulling c4ba63b on lkeix:master into aaf44f5 on 99designs:master.

Review thread on the changed hunk in the Post transport's Do method:

 	resp := exec.DispatchError(ctx, gqlerror.List{gqlErr})
 	writeJson(w, resp)
 	return
 }

-	bodyReader := io.NopCloser(strings.NewReader(bodyString))
+	bodyReader := bytes.NewReader(bodyBytes)
 	if err = jsonDecode(bodyReader, &params); err != nil {
Collaborator

I'm a little nervous about removing the io.NopCloser here. We used to have quite a few issues with accidentally closing io.Reader more than once, and it was pretty hard to track down.

Collaborator

Ok, nevermind. It looks like that doesn't apply at all here (or at least it doesn't anymore). That was a different part of the codebase.

@StevenACoffman (Collaborator)

I greatly appreciate you looking into performance improvements, but I want to ask how you have picked which places to optimize here and in your other PR.

It would be extremely valuable to the entire project to identify where the biggest sources of allocations or CPU usage come from during a realistic load scenario. You may have already done that, but even if this is the most important place, I would like to hear what you found were the second or third, and why, as others in the community may be able to help with those.

Also, I'm not sure how far to take performance improvements in this area at the expense of readability and maintainability.

For instance, io.ReadAll is much more readable than using io.Copy, but it does produce more allocations. Take a made-up 10 MB text file (sample.txt):

package main

import (
    "bytes"
    "io"
    "os"
    "testing"
)

func BenchmarkReadAll(b *testing.B) {
    for i := 0; i < b.N; i++ {
        in, _ := os.Open("sample.txt")
        io.ReadAll(in)
        in.Close()
    }
}

func BenchmarkCopy(b *testing.B) {
    for i := 0; i < b.N; i++ {
        buf := &bytes.Buffer{}
        in, _ := os.Open("sample.txt")
        io.Copy(buf, in)
        in.Close()
    }
}

Output:

goos: linux
goarch: amd64
pkg: example.org
cpu: Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
BenchmarkReadAll-32           26      45445764 ns/op    102635996 B/op        43 allocs/op
BenchmarkCopy-32              48      23985133 ns/op    67108544 B/op         21 allocs/op

Or take for example this article: https://klotzandrew.com/blog/speeding-up-json-processing-in-go/
This is very easily read and maintained:

  bodyBytes, _ := ioutil.ReadAll(body)
  box := BoxType{}
  _ = json.Unmarshal(bodyBytes, &box)

But it can definitely be made faster and allocate much less at the expense of complicating it as in the article.
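
As one hedged illustration of that trade-off (not necessarily the exact technique from the article), decoding straight from the stream avoids the intermediate []byte that ReadAll plus Unmarshal requires, at the cost of a slightly less obvious flow. BoxType here is a placeholder type:

package main

import (
	"encoding/json"
	"io"
)

type BoxType struct {
	Name string `json:"name"`
}

func decodeBox(body io.Reader) (BoxType, error) {
	var box BoxType
	// json.Decoder reads from the stream directly instead of buffering the
	// whole body into memory first.
	err := json.NewDecoder(body).Decode(&box)
	return box, err
}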

The added complexity is worth it if the gains are significant in comparison to everything else, but if this is only a small part of the overall cost, then it is not.

With this PR, I can't tell how much your performance improvement in this area relates to the overall CPU/memory performance. If you have already worked that out, I would love it if you could share that.

@StevenACoffman merged commit 288848a into 99designs:master on Nov 15, 2024 (16 of 17 checks passed).
@StevenACoffman (Collaborator)

I'm merging the PR here, but I would still appreciate someone sharing an analysis of the most performance critical areas of gqlgen runtime.

@lkeix (Contributor, Author) commented Nov 20, 2024

@StevenACoffman
Apologies for the delayed response.
Thank you for merging this PR.

I will continue to share insights on gqlgen runtime bottlenecks in this issue.
