Tictac AAE object_stats shows various statistics over a period of time. #1874
Comments
This isn't normal or expected behaviour. Couple of things which might clarify what is occurring:
|
Thanks for the quick response
|
on (4), is there a pattern to which keys disappear/reappear? Is it the same subset of keys that vary in terms of their presence in responses, or does it seem to be that any key can disappear/reappear at random? |
Made 5 tries. There are keys that are always present. The rest continue their strange behavior. |
What proportion are always present, and what proportion exhibit the strange behaviour? For a key that has the strange behaviour, when it is returned is the vector clock always the same? Can you share an example of the clock returned (by fetch_clocks_range) for an always present key, and a varying key? |
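One way to answer the proportion question is to collect the key list from each repeated query and split the keys into those present in every run and those present only in some. The following is a minimal sketch, assuming each run has already been reduced to a plain list of key names; `classify_keys` and the sample keys are hypothetical and not taken from the thread.

```python
from collections import Counter


def classify_keys(runs):
    """Split keys into those seen in every run and those seen only in some.

    `runs` is a list of key lists, one per repeated query (hypothetical input).
    """
    # Count in how many runs each key appears (duplicates within a run count once).
    counts = Counter(key for run in runs for key in set(run))
    always = {key for key, seen in counts.items() if seen == len(runs)}
    intermittent = set(counts) - always
    return always, intermittent


# Made-up sample data: three repeated queries over the same range.
runs = [
    ["k1", "k2", "k3"],
    ["k1", "k3"],
    ["k1", "k2", "k3"],
]
always, intermittent = classify_keys(runs)
print(sorted(always), sorted(intermittent))
print(f"{len(intermittent) / (len(always) + len(intermittent)):.0%} of keys vary")
```

Comparing the clocks of the `intermittent` set against those of the `always` set is then a matter of looking up each key in the per-run results.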
Here is an example of a key request that is always present.
Here are the keys that behave strangely
I didn't notice that the strange keys were any different |
Normally I expect to see a VCLOCK entry as something like:
which is but you have some like:
so a "," has become a "." between the counter and the timestamp - is this just a transcribing error? Assuming that to be true, there are still some strange things about the clocks on the disappearing keys:
All of the above may be red herrings. Non-standard PUT paths may generate funny vnode ids (the write once path does, but not, I believe, of this form). Getting a few more disappearing-key clocks may help me understand if this weirdness is relevant to the actual problem. P.S. The timestamp for the coordination of change in the vector clock (e.g. |
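For reading the timestamps in a clock entry, a small conversion helper can be handy. This sketch assumes the timestamp is in Erlang Gregorian seconds (seconds since year 0, as produced by `calendar:datetime_to_gregorian_seconds/1`), which is how Riak's vclock timestamps are normally generated; the function name and the sample value are illustrative and not taken from the clocks posted in this thread.

```python
from datetime import datetime, timezone

# Gregorian seconds at 1970-01-01T00:00:00Z (Erlang's calendar epoch is year 0).
GREGORIAN_TO_UNIX_OFFSET = 62167219200


def vclock_timestamp_to_datetime(gregorian_seconds):
    """Convert a Gregorian-seconds timestamp to a timezone-aware UTC datetime."""
    unix_seconds = gregorian_seconds - GREGORIAN_TO_UNIX_OFFSET
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc)


# Illustrative value only: midnight UTC on 26 Sep 2022 expressed in Gregorian seconds.
sample = int(datetime(2022, 9, 26, tzinfo=timezone.utc).timestamp()) + GREGORIAN_TO_UNIX_OFFSET
print(sample, vclock_timestamp_to_datetime(sample).isoformat())
```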
Tomorrow, if I get chance, I'm going to try and extend one of the I have this feature tested in prod environments with o(1bn) keys producing reliable and predictable results, so I don't think the feature is fundamentally flawed - but there may be a sequence of events combined with particular feature combinations that may be exposing a bug. Anything you can suggest that might make this test more realistic to your environment would be helpful, e.g. what PUT options you use, what bucket properties you have, any noticeable events that may have occurred in the past (e.g. particular failures), and any uncommon features that you use. |
I noticed that the missing keys all have approximately the same timestamp, 26 Sep 2022, regardless of what period of time I request. I requested the first keys for May 5
Mostly the disappearing keys belong to <<2,226,39,102>>. There are others
Not sure if this is related, but there is a LOCK that coincides in time
In addition to this, there is also one for July.
Not sure what happened in September. But in July 2022 we moved from gp2 to gp3, re-creating the riak nodes. In May 2023, the riak version was updated from 2.9.0 to 3.0.12
Typically we simply remove the failing index. Maybe this is the wrong approach |
If you fetch the disappearing keys via an HTTP request, what is the Last Modified Date on the object? |
|
I looked at the underlying code of the leveled store, as to how it qualifies something as being within the last modified range: https://github.com/martinsumner/leveled/blob/develop-3.0/src/leveled_penciller.erl#L1712-L1715 If the object in the parallel AAE store has a LMD of What I don't know is how it might get to be Also, the AAE stores are periodically rebuilt based on the I will have a think about what else to try. It may be worth touching a disappearing object (i.e. GET it then PUT it back as-is). This should reset the LMD, and it would be interesting to see if that stops it from appearing in the wrong range. The other thing would be to test the object to see if the function that extracts the LMD works correctly on it. If you could do this via
|
I tried it on several disappearing keys and got
|
Which is the correct LMD:
So this doesn't explain why we could get an Can you confirm you have the default value for |
This might be related. There are two binary formats for Riak objects - v0 and v1. Before storing an object Riak checks that the cluster supports v1: https://github.com/basho/riak_kv/blob/develop-3.0/src/riak_kv_vnode.erl#L4094-L4101 This caused problems in leveled, which only supports v1 of the binary format. The way capabilities are negotiated could sometimes lead to v0 being chosen during certain failure scenarios (hence with the leveled backend it is now forced to always be v1). The eleveldb backend doesn't have that protection. So possibly during an incident, an object could be stored in the v0 binary format. If that happens though, the binary format doesn't include the last-modified-date, so when the update is put into the AAE backend this (and other) metadata might end up missing.
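To check which format a stored value is in, the leading bytes are usually enough. This is a rough sketch, assuming the conventions in `riak_object.erl`: a v1 binary starts with the magic byte 53 followed by the version byte 1, while a v0 object is an Erlang `term_to_binary` blob and so starts with 131. The helper name is hypothetical, and how the raw value is read out of the backend is left aside.

```python
V1_MAGIC = 53      # assumed: magic byte that opens a v1 riak_object binary
V1_VERSION = 1     # assumed: version byte that follows the magic byte
ETF_MAGIC = 131    # first byte of any Erlang external term format binary (v0)


def object_binary_format(value: bytes) -> str:
    """Guess the riak_object binary format from the leading bytes of a stored value."""
    if len(value) >= 2 and value[0] == V1_MAGIC and value[1] == V1_VERSION:
        return "v1"
    if len(value) >= 1 and value[0] == ETF_MAGIC:
        return "v0"
    return "unknown"


# Synthetic byte strings for illustration; real values would come from the backend.
print(object_binary_format(bytes([53, 1, 0, 0])))   # v1
print(object_binary_format(bytes([131, 104, 7])))   # v0
```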
So I suspect this is what happened here. You have some objects stored during an incident on 26th September 2022, and they now exist in the eleveldb backend in v0 format. So now when we build the Tictac AAE stores they are added to the key store with a LMD of undefined - and hence they have unpredictable behaviour when running aae_folds with modified date ranges. The periodic rebuilds of the AAE store won't help, as they will keep rebuilding based on the decoding of the v0 format. The only thing that will help will be to do an inert update on the object, outside of the failure, so that it is replaced in the backend with a v1 version. |
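An inert update over the HTTP API could be sketched roughly as below: GET the object, then PUT the same body back with the returned vector clock. This assumes the standard `/buckets/<bucket>/keys/<key>` endpoints on the default bucket type; `touch`, `headers_for_put` and the base URL are hypothetical names chosen for illustration. The sketch does not resolve siblings (a 300 response raises `HTTPError`) and does not carry over `Link` headers, so treat it as a starting point rather than a tested tool.

```python
import urllib.parse
import urllib.request

# Response headers copied from the GET onto the PUT so the rewrite changes nothing.
PRESERVED_EXACT = ("x-riak-vclock", "content-type")
PRESERVED_PREFIXES = ("x-riak-meta-", "x-riak-index-")


def headers_for_put(get_headers):
    """Keep only the headers that describe the object itself."""
    kept = {}
    for name, value in get_headers.items():
        lower = name.lower()
        if lower in PRESERVED_EXACT or lower.startswith(PRESERVED_PREFIXES):
            kept[name] = value
    return kept


def touch(base_url, bucket, key):
    """GET an object and PUT it back as-is; returns the HTTP status of the PUT."""
    url = "{}/buckets/{}/keys/{}".format(
        base_url,
        urllib.parse.quote(bucket, safe=""),
        urllib.parse.quote(key, safe=""),
    )
    with urllib.request.urlopen(url) as resp:
        body = resp.read()
        headers = headers_for_put(dict(resp.headers.items()))
    request = urllib.request.Request(url, data=body, headers=headers, method="PUT")
    with urllib.request.urlopen(request) as resp:
        return resp.status
```

A call might look like `touch("http://127.0.0.1:8098", "domainRecord", some_key)`, where the host and port are assumptions about the local node. Trying it on a single disappearing key first, and then re-running the range query, would show whether the rewrite has the intended effect.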
I think there is an underlying issue with negotiating things like object version via riak_core capabilities, especially as support for v0 is now so legacy there is no possibility of being in a mixed cluster with something requiring v0. The riak.conf file sets the preferred object format: https://github.com/basho/riak_kv/blob/riak_kv-3.0.16/priv/riak_kv.schema#L529-L548 However the default format used, when not all cluster members can confirm the v1 capability is to use v0: https://github.com/basho/riak_kv/blob/riak_kv-3.0.16/src/riak_kv_app.erl#L193-L195 Really this should default to the configured setting, given that cluster support should be guaranteed now. However, the So really, in this scenario where we discover a v0 binary: https://github.com/basho/riak_kv/blob/riak_kv-3.0.16/src/riak_object.erl#L1275-L1277 Rather than returning an incomplete set of metadata including an
|
Thank you for your help. |
Hi.
Some context. We use Riak version 3.0.12, leveldb, ring_size = 64, n_val = 3, 5 Riak nodes. We tried switching to Tictac AAE to use NextGen replication. The following was added to the config:
After tictac aae has been built, we observe different statistics on objects through
riak_client:aae_fold({object_stats, <<"domainRecord">>, all, all}).
on immutable buckets (no external traffic enters Riak). There are more than 10 million objects in the bucket, and object_stats shows a slightly different value each time. erase_keys also gives different statistics. For example (the request was repeated after an interval of 5 minutes): Over a longer period of time the difference will be greater.
Also, when real-time replication is enabled, object_stats on the sink cluster and the source cluster will be different. There are no errors in the logs.
Can you give any comments? Is this normal behavior? Due to the lack of tools for checking data integrity, we cannot be sure that consistency between the two clusters will be maintained and we cannot reliably use replication.