Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify long/short term SNL data serialisation #1717

Open
wants to merge 3 commits into
base: dev
Choose a base branch
from

Conversation

Doy-lee
Copy link
Collaborator

@Doy-lee Doy-lee commented Sep 4, 2024

  • Remove 'get_version' in data_for_serialisation. In the past we might of had a dynamic version that was assigned and it might of changed during a change in the hardfork. This is no longer the case and we always return a fixed value which means we don't use this feature. Inline and remove.

  • Remove the long/short term data cache in the transient struct on the SNL. In the past this was probably modified outside of the serialisation function 'service_node_list::store()' but now it is only modified in the single function and everytime we enter this function we clear out the cache and re-fill it in with runtime data.

Potentially, it was cached to save a heap allocation on store, in this case I would switch to thread-local storage OR a static to reuse the vector storage. I've inline it for now onto the stack to reduce complexity in the header.

  • Pull data_for_serialization into the implementation file. This structure is not used outside of the SNL.

  • Rename 'state_added_to_archive' to 'long_term_dirty_flag' to more accurately convey the intent of this flag (e.g. when it's dirty, the next store will have to write the [long term] data to disk ...)

  • Remove the cache data blob, similar reasoning as above. This is cleared everytime we enter the function. I suspect it was done to save a heap allocation.

The solution I'd advise here is thread-local storage or static. Simplifies the complicated SNL header.

  • Remove the for-loop to initialise the data for serialisation which just clears the structures and reassigns the version. The version is static now and default initialised so no need to set it. The data structures, potentially were there for heap allocation optimisation. Again recommended solution is thread local storage or static.

--

I've chosen to use static because

  • There will always only be one SNL list instance in the entire program so there aren't multiple SNL's that we have to synchronise against.
  • Storing requires taking a mutex so the instruction level lock will never be contended.
  • Store will be called from multiple threads (multiple P2P threads AFAICT) so we'd end up with multiple per-thread 'caches' if we used thread-local storage.

@jagerman
Copy link
Member

jagerman commented Sep 4, 2024

I'm good with the simplification and getting implementation details out of the header here, but I'm not okay with the change to using a global variable to store the data. For one, it violates OO design principles by because it creates a violation of the promise of store() being a instance method. This point is part of the problem:

  • Storing requires taking a mutex so the instruction level lock will never be contended.

Because that mutex isn't good enough anymore; effectively this creates a class that cannot guard its own internal data structures.

While static technically "works" here, it's a tradeoff that would require, for me, a very strong justification for going against the general recommendation by major C++ guidelines against its use. The listed exceptions where this is acceptable are generally cases where you're deferring calculation of a value until first use, but where the value is constant after that. "We currently only happen to ever create one instance at a time" isn't the same as "we may now only ever create one instance at a time or else the code will break", but this PR makes it the latter, and that's really why this is such a frowned upon technique (both both guidelines, and me).

I think we can avoid the static but keep the welcome movement of the details out of the header and into the .cpp by using an opaque pointer here, i.e. adding just a single void* serialize_cache; in the header, and then in the cpp do something like:

// service_node_list.cpp:

struct data_for_serialization { ... };
using serialize_cache_t = std::array<data_for_serialization, 2>; // {short term, long term}

// In constructor:
    /*...*/, serialize_cache{new serialize_cache_t{}}, /*...*/

// In destructor:
service_node_list::~service_node_list { delete static_cast<serialize_cache_t*>(serialize_cache); }

// and in use in a member function:
    auto& [short_term_data, long_term_data] = *static_cast<serialize_cache_t*>(serialize_cache);

Edit after thinking some more and realizing that a void* member is perhaps a little too raw: you could go slightly less opaque than a raw void* with a forward declaration in the header and a unique_ptr member:

struct serialize_cache_data; // predeclaration

class service_node_list {
    // ...

    // member variable:
    std::unique_ptr<serialize_cache_data> serialize_cache;
    ~service_node_list();
};

in the .h and the .cpp having:

struct data_for_serialization { ... };
struct serialize_cache_data {
    data_for_serialization short, long;
};

// Force the destructor instantiation to be here, where serialize_cache_data is actually visible:
service_node_list::~service_node_list() = default;

    // And then in the member function:
    auto& [short_term_data, long_term_data] = *serialize_cache;

@Doy-lee Doy-lee force-pushed the doyle-data-serialisation-simplify branch 2 times, most recently from 26236a2 to 58c72cc Compare September 5, 2024 02:19
@Doy-lee
Copy link
Collaborator Author

Doy-lee commented Sep 5, 2024

It's a fair point, kind of a footgun if the invariant one day breaks which I was okay with but it's not a big deal to avoid it.

I've gone ahead and made the change, used the pimpl pattern as suggested but instead of just pulling in data_for_serialization the entire m_transient field is actually implementation detail that leaked outside which I've now pulled into the cpp file. Gets instantiated on service_node_list::reset

@javabudd
Copy link

I received a deserialization error my first time running this. Subsequent runs have been fine. My guess is this won't be a problem just wanted to bring it forward.

[2024-09-09 19:51:54] [+0.962s] [service_nodes:error|cryptonote_core/service_node_list.cpp:4972] Failed to parse service node data from blob: Invalid integer or enum value during deserialization:
Stack trace (most recent call first):
#0  0x0000556ddb4bef9c in oxen::exception<std::out_of_range>::exception<char const (&) [53]>(char const (&) [53]) at /home/javabudd/javabudd/oxen-core/src/common/exception.h:49
      47:     explicit exception(Args&&... args) :
      48:             StdExcept{std::forward<Args>(args)...},
      49:             raw_trace{cpptrace::generate_raw_trace(/*skip*/ 0, /*max_depth*/ 256)} {}
      50:
      51:     const char* what() const noexcept override {
#1  0x0000556ddb5b86b1 in void serialization::varint<serialization::binary_string_unarchiver, service_nodes::state_serialized::version_t, service_nodes::state_serialized::serialize_value<serialization::binary_string_unarchiver>(serialization::binary_string_unarchiver&)::{lambda(auto:1)#1}>(serialization::binary_string_unarchiver&, service_nodes::state_serialized::version_t&, service_nodes::state_serialized::serialize_value<serialization::binary_string_unarchiver>(serialization::binary_string_unarchiver&)::{lambda(auto:1)#1}) at /home/javabudd/javabudd/oxen-core/src/serialization/serialization.h:280
     278:     varint(ar, val);
     279:     if (Archive::is_deserializer && !test(val))
     280:         throw oxen::traced<std::out_of_range>{"Invalid integer or enum value during deserialization"};
     281: }
#2  0x0000556ddb5b7ca7 in void serialization::field_varint<serialization::binary_string_unarchiver, service_nodes::state_serialized::version_t, service_nodes::state_serialized::serialize_value<serialization::binary_string_unarchiver>(serialization::binary_string_unarchiver&)::{lambda(auto:1)#1}>(serialization::binary_string_unarchiver&, std::basic_string_view<char, std::char_traits<char> >, service_nodes::state_serialized::version_t&, service_nodes::state_serialized::serialize_value<serialization::binary_string_unarchiver>(serialization::binary_string_unarchiver&)::{lambda(auto:1)#1}&&) at /home/javabudd/javabudd/oxen-core/src/serialization/serialization.h:320
     318:         ar.tag(name);
     319:
     320:     varint(ar, val, std::forward<Predicate>(test));
     321: }
#3  0x0000556ddb5b7726 in void service_nodes::state_serialized::serialize_value<serialization::binary_string_unarchiver>(serialization::binary_string_unarchiver&) at /home/javabudd/javabudd/oxen-core/src/cryptonote_core/service_node_list.cpp:130
     128:     template <class Archive>
     129:     void serialize_value(Archive& ar) {
     130:         field_varint(ar, "version", version, [](auto v) { return v < version_t::count; });
     131:         field_varint(ar, "height", height);
     132:         field(ar, "infos", infos);
#4  0x0000556ddb5b68df in void serialization::value<serialization::binary_string_unarchiver, service_nodes::state_serialized>(serialization::binary_string_unarchiver&, service_nodes::state_serialized&) at /home/javabudd/javabudd/oxen-core/src/serialization/serialization.h:225

@Doy-lee
Copy link
Collaborator Author

Doy-lee commented Sep 13, 2024

I received a deserialization error my first time running this. Subsequent runs have been fine. My guess is this won't be a problem just wanted to bring it forward.

[2024-09-09 19:51:54] [+0.962s] [service_nodes:error|cryptonote_core/service_node_list.cpp:4972] Failed to parse service node data from blob: Invalid integer or enum value during deserialization:
Stack trace (most recent call first):
#0  0x0000556ddb4bef9c in oxen::exception<std::out_of_range>::exception<char const (&) [53]>(char const (&) [53]) at /home/javabudd/javabudd/oxen-core/src/common/exception.h:49
      47:     explicit exception(Args&&... args) :
      48:             StdExcept{std::forward<Args>(args)...},
      49:             raw_trace{cpptrace::generate_raw_trace(/*skip*/ 0, /*max_depth*/ 256)} {}
      50:
      51:     const char* what() const noexcept override {
#1  0x0000556ddb5b86b1 in void serialization::varint<serialization::binary_string_unarchiver, service_nodes::state_serialized::version_t, service_nodes::state_serialized::serialize_value<serialization::binary_string_unarchiver>(serialization::binary_string_unarchiver&)::{lambda(auto:1)#1}>(serialization::binary_string_unarchiver&, service_nodes::state_serialized::version_t&, service_nodes::state_serialized::serialize_value<serialization::binary_string_unarchiver>(serialization::binary_string_unarchiver&)::{lambda(auto:1)#1}) at /home/javabudd/javabudd/oxen-core/src/serialization/serialization.h:280
     278:     varint(ar, val);
     279:     if (Archive::is_deserializer && !test(val))
     280:         throw oxen::traced<std::out_of_range>{"Invalid integer or enum value during deserialization"};
     281: }
#2  0x0000556ddb5b7ca7 in void serialization::field_varint<serialization::binary_string_unarchiver, service_nodes::state_serialized::version_t, service_nodes::state_serialized::serialize_value<serialization::binary_string_unarchiver>(serialization::binary_string_unarchiver&)::{lambda(auto:1)#1}>(serialization::binary_string_unarchiver&, std::basic_string_view<char, std::char_traits<char> >, service_nodes::state_serialized::version_t&, service_nodes::state_serialized::serialize_value<serialization::binary_string_unarchiver>(serialization::binary_string_unarchiver&)::{lambda(auto:1)#1}&&) at /home/javabudd/javabudd/oxen-core/src/serialization/serialization.h:320
     318:         ar.tag(name);
     319:
     320:     varint(ar, val, std::forward<Predicate>(test));
     321: }
#3  0x0000556ddb5b7726 in void service_nodes::state_serialized::serialize_value<serialization::binary_string_unarchiver>(serialization::binary_string_unarchiver&) at /home/javabudd/javabudd/oxen-core/src/cryptonote_core/service_node_list.cpp:130
     128:     template <class Archive>
     129:     void serialize_value(Archive& ar) {
     130:         field_varint(ar, "version", version, [](auto v) { return v < version_t::count; });
     131:         field_varint(ar, "height", height);
     132:         field(ar, "infos", infos);
#4  0x0000556ddb5b68df in void serialization::value<serialization::binary_string_unarchiver, service_nodes::state_serialized>(serialization::binary_string_unarchiver&, service_nodes::state_serialized&) at /home/javabudd/javabudd/oxen-core/src/serialization/serialization.h:225

Interesting, I haven't changed the version numbers here yet for the blobs, by chance were you jumping around different branches for the stagenet? Some of those branches have some newer versions for data_for_serialization that it would have wrote out to the blobs on disk.

If you then switch to this branch and run it'll complain about the unknown version that got serialized in the blob. This will need a rebase which I'll do now

- Remove 'get_version' in data_for_serialisation. In the past we might
of had a dynamic version that was assigned and it might of changed
during a change in the hardfork. This is no longer the case and we
always return a fixed value which means we don't use this feature.
Inline and remove.

- Remove the long/short term data cache in the transient struct on the
SNL. In the past this was probably modified outside of the serialisation
function 'service_node_list::store()' but now it is only modified in the
single function and everytime we enter this function we clear out the
cache and re-fill it in with runtime data.

Potentially, it was cached to save a heap allocation on store, in this
case I would switch to thread-local storage OR a static to reuse the
vector storage. I've inline it for now onto the stack to reduce
complexity in the header.

- Pull data_for_serialization into the implementation file. This
structure is not used outside of the SNL.

- Rename 'state_added_to_archive' to 'long_term_dirty_flag' to more
accurately convey the intent of this flag (e.g. when it's dirty, the next
store will have to write the [long term] data to disk ...)

- Remove the cache data blob, similar reasoning as above. This is
cleared everytime we enter the function. I suspect it was done to save
a heap allocation.

The solution I'd advise here is thread-local storage or static. Simplifies
the complicated SNL header.

- Remove the for-loop to initialise the data for serialisation which
just clears the structures and reassigns the version. The version is
static now and default initialised so no need to set it. The data
structures, potentially were there for heap allocation optimisation.
Again recommended solution is thread local storage or static.

--

I've chosen to use static because

- There will always only be one SNL list instance in the entire program
  so there aren't multiple SNL's that we have to synchronise against.
- Storing requires taking a mutex so the instruction level lock will
  never be contended.
- Store will be called from multiple threads (multiple P2P threads
  AFAICT) so we'd end up with multiple per-thread 'caches' if we used
  thread-local storage.
@Doy-lee Doy-lee force-pushed the doyle-data-serialisation-simplify branch from 58c72cc to 298811f Compare September 13, 2024 06:22
@javabudd
Copy link

I received a deserialization error my first time running this. Subsequent runs have been fine. My guess is this won't be a problem just wanted to bring it forward.

[2024-09-09 19:51:54] [+0.962s] [service_nodes:error|cryptonote_core/service_node_list.cpp:4972] Failed to parse service node data from blob: Invalid integer or enum value during deserialization:
Stack trace (most recent call first):
#0  0x0000556ddb4bef9c in oxen::exception<std::out_of_range>::exception<char const (&) [53]>(char const (&) [53]) at /home/javabudd/javabudd/oxen-core/src/common/exception.h:49
      47:     explicit exception(Args&&... args) :
      48:             StdExcept{std::forward<Args>(args)...},
      49:             raw_trace{cpptrace::generate_raw_trace(/*skip*/ 0, /*max_depth*/ 256)} {}
      50:
      51:     const char* what() const noexcept override {
#1  0x0000556ddb5b86b1 in void serialization::varint<serialization::binary_string_unarchiver, service_nodes::state_serialized::version_t, service_nodes::state_serialized::serialize_value<serialization::binary_string_unarchiver>(serialization::binary_string_unarchiver&)::{lambda(auto:1)#1}>(serialization::binary_string_unarchiver&, service_nodes::state_serialized::version_t&, service_nodes::state_serialized::serialize_value<serialization::binary_string_unarchiver>(serialization::binary_string_unarchiver&)::{lambda(auto:1)#1}) at /home/javabudd/javabudd/oxen-core/src/serialization/serialization.h:280
     278:     varint(ar, val);
     279:     if (Archive::is_deserializer && !test(val))
     280:         throw oxen::traced<std::out_of_range>{"Invalid integer or enum value during deserialization"};
     281: }
#2  0x0000556ddb5b7ca7 in void serialization::field_varint<serialization::binary_string_unarchiver, service_nodes::state_serialized::version_t, service_nodes::state_serialized::serialize_value<serialization::binary_string_unarchiver>(serialization::binary_string_unarchiver&)::{lambda(auto:1)#1}>(serialization::binary_string_unarchiver&, std::basic_string_view<char, std::char_traits<char> >, service_nodes::state_serialized::version_t&, service_nodes::state_serialized::serialize_value<serialization::binary_string_unarchiver>(serialization::binary_string_unarchiver&)::{lambda(auto:1)#1}&&) at /home/javabudd/javabudd/oxen-core/src/serialization/serialization.h:320
     318:         ar.tag(name);
     319:
     320:     varint(ar, val, std::forward<Predicate>(test));
     321: }
#3  0x0000556ddb5b7726 in void service_nodes::state_serialized::serialize_value<serialization::binary_string_unarchiver>(serialization::binary_string_unarchiver&) at /home/javabudd/javabudd/oxen-core/src/cryptonote_core/service_node_list.cpp:130
     128:     template <class Archive>
     129:     void serialize_value(Archive& ar) {
     130:         field_varint(ar, "version", version, [](auto v) { return v < version_t::count; });
     131:         field_varint(ar, "height", height);
     132:         field(ar, "infos", infos);
#4  0x0000556ddb5b68df in void serialization::value<serialization::binary_string_unarchiver, service_nodes::state_serialized>(serialization::binary_string_unarchiver&, service_nodes::state_serialized&) at /home/javabudd/javabudd/oxen-core/src/serialization/serialization.h:225

Interesting, I haven't changed the version numbers here yet for the blobs, by chance were you jumping around different branches for the stagenet? Some of those branches have some newer versions for data_for_serialization that it would have wrote out to the blobs on disk.

If you then switch to this branch and run it'll complain about the unknown version that got serialized in the blob. This will need a rebase which I'll do now

this was definitely the case, good catch

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants