Dataset for tests and benchmarks #46

al8n · 2023-04-19T09:36:54Z

We need some datasets that can be used to give more insight into the performance and the hit ratio when we add new features.

Yiling-J · 2023-04-22T02:42:51Z

You could try some widely adopted traces, for example DS1 and S3 from the ARC paper. In my benchmark, Ristretto does not do well. If Stretto's implementation is exactly the same as Ristretto's, I'm interested to see the results. My cache package, with benchmark results: https://github.com/Yiling-J/theine-go
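
As a rough illustration of what replaying a trace to measure hit ratio could look like, here is a minimal sketch. The `LRUCache` class and the `hit_ratio` helper are hypothetical stand-ins written for this example (they are not Stretto, Ristretto, or theine-go code), and the trace is a short made-up list of keys rather than DS1 or S3.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny LRU cache, used only as a stand-in for the cache under test."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def set(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict least recently used


def hit_ratio(cache, trace):
    """Replay a sequence of keys and return hits / total requests."""
    hits = 0
    total = 0
    for key in trace:
        total += 1
        if cache.get(key) is not None:
            hits += 1
        else:
            cache.set(key, True)  # on a miss, admit the key
    return hits / total if total else 0.0


if __name__ == "__main__":
    synthetic_trace = [1, 2, 1, 3, 1, 2]  # made-up keys, not a real trace
    print(hit_ratio(LRUCache(capacity=2), synthetic_trace))
```

Any cache exposing `get` and `set` could be dropped into `hit_ratio` the same way, which makes it easy to compare policies on one trace.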

al8n · 2023-04-22T08:21:20Z

The low hit ratio for Ristretto in your benchmark may be caused by the write buffer. In Ristretto, if you insert an item and then try to read it while it is still in the write buffer, you get a miss.
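
The read-after-write miss described above can be modelled with a minimal single-threaded sketch. `BufferFirstCache` and its `drain` method are hypothetical names for this example; `drain` stands in for whatever background worker consumes the buffer, and none of this is Ristretto's actual code.

```python
from collections import deque


class BufferFirstCache:
    """Simplified model: writes go to a buffer and become visible only after drain()."""

    def __init__(self):
        self._store = {}              # what reads consult
        self._write_buffer = deque()  # pending writes, not yet visible

    def set(self, key, value):
        # The write is only queued; the store is untouched for now.
        self._write_buffer.append((key, value))

    def get(self, key):
        # Reads look at the store only, so a buffered item is a miss.
        return self._store.get(key)

    def drain(self):
        # Stand-in for the background consumer applying queued writes.
        while self._write_buffer:
            key, value = self._write_buffer.popleft()
            self._store[key] = value


if __name__ == "__main__":
    cache = BufferFirstCache()
    cache.set("a", 1)
    print(cache.get("a"))  # miss: the item is still in the write buffer
    cache.drain()
    print(cache.get("a"))  # hit once the buffer has been applied
```

In a benchmark that does a `get` immediately after every `set`, this ordering alone would count as misses until the buffer is processed.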

Yiling-J · 2023-04-22T08:43:03Z

I agree, but from the image in their blog post and README, the hit ratio should be higher. If you use the same technique (write to the buffer first) and your benchmark shows a similar result, I think we can confirm that. BTW, both Ben and I believe that writing to the map first is better. You can take a look at Ben's reply: https://www.reddit.com/r/golang/comments/12uql3y/theine_020_released_a_generic_cache_which_has/

al8n · 2023-04-22T08:51:54Z

Yeah, writing to the map first is better. I was thinking of adding a way for the cache to read an item from the write buffer, e.g. wrap the item in an Arc and keep a hashmap of the items currently in the buffer, removing each one when it is handled. But I do not think the idea is good enough. I would appreciate any ideas about this feature.
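
The side-map idea above could be sketched roughly as follows. This is a single-threaded model with hypothetical names (`PendingAwareCache`, `_pending`, `drain`); it ignores the locking and shared ownership (the Arc part) that a real concurrent implementation would need, and it is not Stretto's code.

```python
from collections import deque


class PendingAwareCache:
    """Buffer-first cache that also lets reads see items still in the buffer."""

    def __init__(self):
        self._store = {}
        self._write_buffer = deque()
        self._pending = {}  # key -> newest buffered entry for that key

    def set(self, key, value):
        entry = [key, value]        # a fresh list, so identity checks are reliable
        self._pending[key] = entry  # make the write readable right away
        self._write_buffer.append(entry)

    def get(self, key):
        # A pending entry is newer than anything in the store, so check it first.
        entry = self._pending.get(key)
        if entry is not None:
            return entry[1]
        return self._store.get(key)

    def drain(self):
        # Stand-in for the background consumer of the write buffer.
        while self._write_buffer:
            entry = self._write_buffer.popleft()
            key, value = entry
            self._store[key] = value
            # Remove from the side map only if no newer write replaced it.
            if self._pending.get(key) is entry:
                del self._pending[key]
```

The identity check in `drain` matters when the same key is written twice before the buffer is processed: handling the older entry must not hide the newer one.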

Yiling-J · 2023-04-22T09:08:34Z

I think it's just switching the order: first write to the map, then add to the write buffer. I think Ristretto writes to the buffer first because the write buffer is a channel, and they drop some Sets under high concurrency when the channel is full. This improves write performance, but may not be the behavior Ristretto users expect.
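
The two orderings can be compared in a small single-threaded sketch. Both classes are hypothetical models, with a bounded `queue.Queue` standing in for the channel. What `MapFirstCache` does when its buffer is full (draining inline before enqueueing) is an assumption made for this example, not something stated in the thread.

```python
import queue


class DroppingBufferCache:
    """Buffer-first: a Set is lost entirely when the bounded buffer is full."""

    def __init__(self, buffer_size):
        self._store = {}
        self._buffer = queue.Queue(maxsize=buffer_size)

    def set(self, key, value):
        try:
            self._buffer.put_nowait((key, value))
            return True
        except queue.Full:
            return False  # the write is dropped, never reaches the store

    def get(self, key):
        return self._store.get(key)

    def drain(self):
        while True:
            try:
                key, value = self._buffer.get_nowait()
            except queue.Empty:
                return
            self._store[key] = value


class MapFirstCache:
    """Map-first: the value is readable at once; the buffer only feeds the policy."""

    def __init__(self, buffer_size):
        self._store = {}
        self._events = queue.Queue(maxsize=buffer_size)
        self.policy_seen = []  # keys the (imaginary) eviction policy has processed

    def set(self, key, value):
        self._store[key] = value  # visible to readers immediately
        try:
            self._events.put_nowait(key)
        except queue.Full:
            # Assumption for this sketch: apply backpressure by draining inline.
            self.drain()
            self._events.put_nowait(key)

    def get(self, key):
        return self._store.get(key)

    def drain(self):
        while True:
            try:
                key = self._events.get_nowait()
            except queue.Empty:
                return
            self.policy_seen.append(key)
```

With a buffer of size 1, the second `set` on `DroppingBufferCache` returns `False` and the key never becomes readable, while `MapFirstCache` serves both keys immediately.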

al8n added the enhancement and help wanted labels on Apr 19, 2023
