Dataset for tests and benchmarks #46

Open
al8n opened this issue Apr 19, 2023 · 5 comments
Labels: enhancement, help wanted

Comments

@al8n
Owner

al8n commented Apr 19, 2023

We need some datasets that can be used to give more insight into the performance and the hit ratio when we add new features.

@al8n al8n added enhancement New feature or request help wanted Extra attention is needed labels Apr 19, 2023
@Yiling-J

You could try some widely adopted traces, for example DS1 and S3 from the ARC paper. In my benchmark, Ristretto does not perform well. If Stretto's implementation is exactly the same as Ristretto's, I'd be interested to see the results. My cache package, with benchmark results: https://github.com/Yiling-J/theine-go
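A trace-driven comparison boils down to replaying a sequence of keys against a cache and counting hits. The sketch below is only an illustration of that loop: `LruCache` and `hit_ratio` are hypothetical names, the toy LRU stands in for whatever cache is under test, and the hard-coded key slice stands in for a real trace file such as DS1 or S3.

```rust
use std::collections::{HashSet, VecDeque};

/// Tiny LRU cache, used only as a stand-in for the cache under test.
struct LruCache {
    capacity: usize,
    keys: HashSet<u64>,
    order: VecDeque<u64>, // front = least recently used
}

impl LruCache {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be positive");
        Self { capacity, keys: HashSet::new(), order: VecDeque::new() }
    }

    /// Returns true on a hit. On a miss the key is inserted,
    /// evicting the least recently used key if the cache is full.
    fn access(&mut self, key: u64) -> bool {
        if self.keys.contains(&key) {
            // Move the key to the most-recently-used position.
            if let Some(pos) = self.order.iter().position(|k| *k == key) {
                self.order.remove(pos);
            }
            self.order.push_back(key);
            true
        } else {
            if self.keys.len() == self.capacity {
                if let Some(victim) = self.order.pop_front() {
                    self.keys.remove(&victim);
                }
            }
            self.keys.insert(key);
            self.order.push_back(key);
            false
        }
    }
}

/// Replays a trace of keys and returns hits / total accesses.
fn hit_ratio(trace: &[u64], capacity: usize) -> f64 {
    if trace.is_empty() {
        return 0.0;
    }
    let mut cache = LruCache::new(capacity);
    let hits = trace.iter().filter(|k| cache.access(**k)).count();
    hits as f64 / trace.len() as f64
}

fn main() {
    // A made-up trace; a real run would parse keys from a trace file.
    let trace = [1, 2, 1, 3, 1, 2];
    println!("hit ratio: {:.3}", hit_ratio(&trace, 2));
}
```

Running the same trace and capacity against each cache keeps the hit ratios comparable.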

@al8n
Owner Author

al8n commented Apr 22, 2023

> You could try some widely adopted traces, for example DS1 and S3 from the ARC paper. In my benchmark, Ristretto does not perform well. If Stretto's implementation is exactly the same as Ristretto's, I'd be interested to see the results. My cache package, with benchmark results: https://github.com/Yiling-J/theine-go

The low hit ratio for Ristretto in your benchmark may be caused by the write buffer. In Ristretto, if you insert an item and then read it while it is still in the write buffer, you get a miss.
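The read-after-write miss described here can be reproduced with a toy model. This is a single-threaded sketch under stated assumptions, not Ristretto's or Stretto's actual code: `BufferedCache`, `insert`, `get`, and `drain` are hypothetical names, and `drain` stands in for the background task that applies buffered writes.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

/// Toy cache whose writes pass through a buffer before reaching the store.
struct BufferedCache {
    store: HashMap<String, String>,
    tx: Sender<(String, String)>,
    rx: Receiver<(String, String)>,
}

impl BufferedCache {
    fn new() -> Self {
        let (tx, rx) = channel();
        Self { store: HashMap::new(), tx, rx }
    }

    /// Insert only enqueues the item; the store is not touched yet.
    fn insert(&self, key: &str, value: &str) {
        self.tx
            .send((key.to_string(), value.to_string()))
            .expect("receiver is owned by the cache and still alive");
    }

    /// Reads consult the store only, so buffered items are invisible.
    fn get(&self, key: &str) -> Option<&String> {
        self.store.get(key)
    }

    /// Stand-in for the background task that applies buffered writes.
    fn drain(&mut self) {
        while let Ok((k, v)) = self.rx.try_recv() {
            self.store.insert(k, v);
        }
    }
}

fn main() {
    let mut cache = BufferedCache::new();
    cache.insert("a", "1");
    // The item is still in the buffer, so this read is a miss.
    println!("before drain: {:?}", cache.get("a"));
    cache.drain();
    println!("after drain:  {:?}", cache.get("a"));
}
```

A benchmark that reads a key right after inserting it would count the first read as a miss in this model, which lowers the measured hit ratio.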

@Yiling-J

I agree, but judging from the image in their blog post and README, the hit ratio should be higher. If you use the same technique (write to the buffer first) and your benchmark shows a similar result, I think we can confirm that. BTW, both Ben and I believe that writing to the map first is better; take a look at Ben's reply: https://www.reddit.com/r/golang/comments/12uql3y/theine_020_released_a_generic_cache_which_has/

@al8n
Owner Author

al8n commented Apr 22, 2023

> I agree, but judging from the image in their blog post and README, the hit ratio should be higher. If you use the same technique (write to the buffer first) and your benchmark shows a similar result, I think we can confirm that. BTW, both Ben and I believe that writing to the map first is better; take a look at Ben's reply: https://www.reddit.com/r/golang/comments/12uql3y/theine_020_released_a_generic_cache_which_has/

Yeah, writing to the map first is better. I was thinking of adding a way for the cache to read an item from the write buffer, e.g. wrap the item in an Arc, keep a hashmap of the items currently in the buffer, and remove an entry once the item is handled. But I do not think the idea is good enough. I would appreciate any ideas about this feature.
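One possible reading of the Arc-plus-hashmap idea is sketched below. It is a single-threaded illustration, not Stretto's implementation: `Cache`, `pending`, `buffer`, and `drain` are hypothetical names, and the admission policy that would normally decide whether a buffered item is kept is left out.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::Arc;

/// Toy cache where items still in the write buffer stay readable
/// through a side map that shares the value via `Arc`.
struct Cache {
    store: HashMap<String, Arc<String>>,
    pending: HashMap<String, Arc<String>>, // items not yet handled
    buffer: VecDeque<(String, Arc<String>)>,
}

impl Cache {
    fn new() -> Self {
        Self {
            store: HashMap::new(),
            pending: HashMap::new(),
            buffer: VecDeque::new(),
        }
    }

    /// Enqueue the item and record it in `pending` so reads can see it.
    fn insert(&mut self, key: String, value: String) {
        let value = Arc::new(value);
        self.pending.insert(key.clone(), Arc::clone(&value));
        self.buffer.push_back((key, value));
    }

    /// Check `pending` first: it holds the newest write for a key.
    fn get(&self, key: &str) -> Option<Arc<String>> {
        self.pending.get(key).or_else(|| self.store.get(key)).cloned()
    }

    /// Apply buffered writes and drop the matching `pending` entries.
    fn drain(&mut self) {
        while let Some((key, value)) = self.buffer.pop_front() {
            // Only remove the entry if it still refers to this write;
            // a later insert for the same key may have replaced it.
            let same = self
                .pending
                .get(&key)
                .map_or(false, |p| Arc::ptr_eq(p, &value));
            if same {
                self.pending.remove(&key);
            }
            self.store.insert(key, value);
        }
    }
}

fn main() {
    let mut cache = Cache::new();
    cache.insert("a".to_string(), "1".to_string());
    // Readable even though the write has not been handled yet.
    println!("before drain: {:?}", cache.get("a"));
    cache.drain();
    println!("after drain:  {:?}", cache.get("a"));
}
```

The cost of this approach is a second map lookup on reads and extra bookkeeping on every write, which may be part of why it does not feel good enough.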

@Yiling-J

I think it's just a matter of switching the order: first write to the map, then add to the write buffer. I think Ristretto writes to the buffer first because its write buffer is a channel, and it drops some Sets under high concurrency when the channel is full. This improves write performance, but may not be the expected behavior for Ristretto users.
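The difference between the two orderings can be sketched with a bounded channel. This is an assumption-laden toy, not Ristretto's or Theine's code: `Cache`, `set_buffer_first`, and `set_map_first` are hypothetical names, and the buffer is never drained so that it saturates quickly.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};

/// Toy cache with a bounded write buffer.
struct Cache {
    store: HashMap<u64, u64>,
    tx: SyncSender<(u64, u64)>,
    // Kept alive but never drained here, to simulate a saturated buffer.
    _rx: Receiver<(u64, u64)>,
}

impl Cache {
    fn new(buffer_size: usize) -> Self {
        let (tx, rx) = sync_channel(buffer_size);
        Self { store: HashMap::new(), tx, _rx: rx }
    }

    /// Buffer-first: the write exists only in the buffer, so a full
    /// buffer means the whole Set is dropped. Returns false if dropped.
    fn set_buffer_first(&mut self, key: u64, value: u64) -> bool {
        self.tx.try_send((key, value)).is_ok()
    }

    /// Map-first: the value is visible immediately. A full buffer only
    /// means the policy side is not notified. Returns whether it was.
    fn set_map_first(&mut self, key: u64, value: u64) -> bool {
        self.store.insert(key, value);
        self.tx.try_send((key, value)).is_ok()
    }

    fn get(&self, key: u64) -> Option<u64> {
        self.store.get(&key).copied()
    }
}

fn main() {
    let mut cache = Cache::new(1);
    println!("first set accepted:  {}", cache.set_buffer_first(1, 10));
    println!("second set accepted: {}", cache.set_buffer_first(2, 20));
    println!("get(2) after drop:   {:?}", cache.get(2));
    cache.set_map_first(3, 30);
    println!("get(3) map-first:    {:?}", cache.get(3));
}
```

How a map-first design should treat an entry whose buffer notification was dropped is left open in this sketch.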

2 participants