Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider using code generation to produce Rust code from structured RPC information exported from Bitcoin Core #4

Open
casey opened this issue Nov 14, 2024 · 13 comments

Comments

@casey
Copy link

casey commented Nov 14, 2024

I've been considering an alternative approach to writing a Bitcoin Core JSON RPC client, namely, exporting the structured API information available within Bitcoin Core as JSON, and using that to auto-generate the API.

I created a branch of Core here which implements a new JSON RPC command, api, which dumps structured API information as JSON.

You can see the output here. It contains types and documentation for all RPC commands.

These could be used to automatically generate Rust code, instead of writing them manually and keeping them up to date between versions of Core. Types are not ideal, for example, addresses have type "string". However, you could write a complementary JSON "patch" file which described what the user facing types should be.

For example, getaddressinfo takes a single argument which is of type string. The patch file could contain instructions for the code generator to expose this as a bitcoin::Address:

{
  "getaddressinfo": {
    "arguments": {
      "address": {
        "type": "address",
      }
    }
  }
}

The code generator would then load the JSON file dumped by bitcoin-cli api, as well as the patch file, and use them to generate user-facing Rust code which exposed the appropriate types, and performed the appropriate conversions when calling Bitcoin Core.

There are a bunch of advantages to this approach. You would be guaranteed to be up-to-date with Core, and could automatically expose documentation from Core as Rust doc comments, without needing to copy them manually. Since everything is auto-generated, you would likely only need a minimum of tests to be confident that things were working.

Some types would require special considerations, like addresses, fee rates, and amounts (which are exposed by the Core RPC with different units in different places). So, it might be wise to consider the patchfile to be a kind of whitelist, and require entries in the patch file for all commands before allowing to code generator to expose them, to ensure that a human had manually confirmed that no types require special handling.

I don't know if I'm going to keep working on this myself. Since rust-bitcoincore-rpc has been deprecated, I went down a bit of a rabbit hole thinking about how the whole RPC situation could be improved, and this is what I came up with. However, in reality, we'll probably keep using rust-bitcoincore-rpc for the time being, and just hack around the limitations until they become intolerable.

If this approach is fruitful, it would be nice to get the branch which adds the api command merged into Core, so it doesn't require continual rebasing. I've been away from Core development for a while, and have been spoiled rotten by Rust, so I might not be the right person to do that.

@tcharding
Copy link
Member

Thanks for the write up man, I didn't totally grok it but I had an immediate question, will your idea achieve what I want to achieve with less work now and less work to maintain it? If so I'm in favour of it. To assist you in possibly answering this perhaps I can outline what it is this repo is trying to achieve and why I think it should be low maintenance going forward. FTR if I never have to look at JSON RPC code again I will be a happy camper, I am working on this crate because I think it needs to exist not because I find it interesting in any way.

Wants and don't wants

  1. I want to write integration test code and have it run against some specific subset of released Core versions
  2. I don't want to spend much time maintaining this repo
  3. I don't want it to break every time we release a new version of rust-bitcoin, and when it does break I want to be able to patch it quickly/easily
  4. I want it to be easy enough to upgrade when new Core versions come out (an hour or twos work)
  5. I don't want it to keep getting more and more complicated as Core's RPC API changes over time

This project is pretty new so I don't know if it will achieve these aims but in regards to each one:

  1. Yep, it definitely does this.
  2. We will have to see how this pans out
  3. I'm concerned with this one, there is a shit ton of rust-bitcoin type conversion code that will be annoying to update but it should be brain dead simple at least.
  4. Core 28 upgrade was done recently by an external contributor so it can't be toooo hard, once one knows the repo it should achieve this easily
  5. All the complexity should be isolated to the json::model module

It is bleedingly obvious while working on this repo that a trained monkey could do it because it is so repetative. It is also boring, hence why it is taking me so long.

@casey
Copy link
Author

casey commented Nov 14, 2024

  1. I'm actually a little bit unclear about how this crate does 1. Looking at the client code, it looks like each version has a different module containing the types for that version. How does the client code run against different core versions? Does it have to enumerate those versions, like doing v1::method(), v2::method(), or does it somehow select the right method and models to use dynamically based on what version of Core it's connected to?

  2. I think codegen potentially helps with this. Each time a new version of core comes out, you rebase the C++ code which dumps the API to JSON onto the new version. Then you run a Rust codegen script to auto-generate the Rust code which contains Rust structs for all the request and response types for that version.

  3. I think codegen helps with this. You have the type conversion code, for example between a JSON string and a Rust address in one place, so if that changes, for example because of a change to rust-bitcoin, you only have to change it in one place, and the codegen script will replicate it wherever that type is used in a request or response.

  4. This depends on whether or not the code in Core changes enough to make rebasing hard. In the best case, it gets integrated into Core, so you don't need to rebase. In the medium case you have to rebase manually but it's easy, in the worst case there's crazy code churn in the new version of Core, so you have to modify it to rebase it.

  5. I think codegen probably helps, or at least makes that complexity easy to deal with.

A big problem with the codegen approach is that I wrote the code for Bitcoin Core 28. I don't know how much change there's been in the RPC API code, but if there's been a lot, rebasing the the code onto earlier and earlier versions, back to v17, might be very hard. I do wonder if it's worth supporting versions of core which are that old. Is there demand for it? If not, then you could just decided to support v28 going forward, and anyone who wants support for an older version should use the deprecated library.

@casey
Copy link
Author

casey commented Nov 14, 2024

I posted the API description PR to Twitter, and someone mentioned it in an issue in Core, so linking here for posterity.

@tcharding
Copy link
Member

Edit: If there is a single software project out there, which currently correctly and fully implements the RPC API in a type-safe manner, it would be interesting to see how they approached it. But I doubt there is one?

Gawd damn, if I knew that before I started this repo I might not have started.

@casey
Copy link
Author

casey commented Nov 14, 2024

😭

@maflcko
Copy link

maflcko commented Nov 15, 2024

Edit: If there is a single software project out there, which currently correctly and fully implements the RPC API in a type-safe manner, it would be interesting to see how they approached it. But I doubt there is one?

Gawd damn, if I knew that before I started this repo I might not have started.

I wasn't aware of this repo. I only looked at https://github.com/rust-bitcoin/rust-bitcoincore-rpc, which looked a bit incomplete and stale, according to its own readme: https://github.com/rust-bitcoin/rust-bitcoincore-rpc?tab=readme-ov-file#supported-bitcoin-core-versions

This repo looks a lot more what I think should be the goal. That is, this repo properly versions RPCs for each release and every field has the correct type (only did some spot checking). Short of auto-generation (which no one has implemented yet) this is probably the best approach that I've seen so far.

@tcharding
Copy link
Member

Mad, that inspires me. Thanks.

@tcharding
Copy link
Member

tcharding commented Nov 15, 2024

The names are a mess now, below I'm going to use the new names because they are shorter but for anyone not following the rename I list them here

  • bitcoind-json-rpc-regtest -> corepc-node (directory is bitcoind because the crate was originally called that)
  • bitcoind-json-rpc-client -> corepc-client
  • bitcoind-json-rpc-types -> corepc-types
  1. I'm actually a little bit unclear about how this crate does 1. Looking at the client code, it looks like each version has a different module containing the types for that version. How does the client code run against different core versions? Does it have to enumerate those versions, like doing v1::method(), v2::method(), or does it somehow select the right method and models to use dynamically based on what version of Core it's connected to?

We test using features on the integration_test crate that enable features in corepc-node.

corepcp-node spins up a Bitcoin Core node of the correct version and passes back a handle in BitcoinD.

corepc-node has features that configure which Client is exposed from corepc-client in the handle i.e., the Client field of BitcoinD is version specific. When one calls bitcoind.client.foo() they are calling client::client_sync::v17::foo (assuming corepc-node was built with feature 0_17_1).

  1. I think codegen potentially helps with this. Each time a new version of core comes out, you rebase the C++ code which dumps the API to JSON onto the new version. Then you run a Rust codegen script to auto-generate the Rust code which contains Rust structs for all the request and response types for that version.

I think you are right, as long as we don't manually go in and touch the generated code. All "fixes" would have to go into the code generator.

  1. I think codegen helps with this. You have the type conversion code, for example between a JSON string and a Rust address in one place, so if that changes, for example because of a change to rust-bitcoin, you only have to change it in one place, and the codegen script will replicate it wherever that type is used in a request or response.

This one I'm more confident you are correct.

  1. This depends on whether or not the code in Core changes enough to make rebasing hard. In the best case, it gets integrated into Core, so you don't need to rebase. In the medium case you have to rebase manually but it's easy, in the worst case there's crazy code churn in the new version of Core, so you have to modify it to rebase it.

I'm hopeful that you are correct here, I don't know exactly how hard it is to go from a fully fleshed out version to the next in this crate yet because v17 is not fully fleshed out. All other versions are just minimal (basically just what was needed by miniscript).

  1. I think codegen probably helps, or at least makes that complexity easy to deal with.

One idea would be to only codegen Rust std types and leave the model stuff done manually. It would be nice to code gen the whole thing of course.

A big problem with the codegen approach is that I wrote the code for Bitcoin Core 28. I don't know how much change there's been in the RPC API code, but if there's been a lot, rebasing the the code onto earlier and earlier versions, back to v17, might be very hard. I do wonder if it's worth supporting versions of core which are that old. Is there demand for it? If not, then you could just decided to support v28 going forward, and anyone who wants support for an older version should use the deprecated library.

Yeah I asked myself this question also, I started at v17 because that is where bitcoincore-rpc started and I wanted to replace it without loss of test coverage.

@casey
Copy link
Author

casey commented Nov 17, 2024

I started working on a new branch which exports a description of the RPC API as JSON Schema. I don't have a lot of experience with JSON Schema, but it seems pretty widely adopted, and there is at least some amount of tooling for it. At the very worst, it doesn't seem a whole lot more difficult to work with than an ad-hoc JSON description of the API

For each command, there is an arguments key which holds a JSON Schema describing the arguments to the command. I haven't started on return types yet, since they're much more complicated.

@casey
Copy link
Author

casey commented Nov 17, 2024

corepc-node has features that configure which Client is exposed from corepc-client in the handle i.e., the Client field of BitcoinD is version specific. When one calls bitcoind.client.foo() they are calling client::client_sync::v17::foo (assuming corepc-node was built with feature 0_17_1).

Does that mean that client code has to statically select which version of bitcoind it wants to talk to with a feature?

@tcharding
Copy link
Member

I don't quite understand the question. By 'client code' you mean code in the corepc-client crate or code that uses the Clent (by way of eg bitcoind.client)?

@casey
Copy link
Author

casey commented Nov 17, 2024

Ah yeah, sorry it's confusing. Yeah, I mean a user of corepc-client::Client.

@tcharding
Copy link
Member

tcharding commented Nov 17, 2024

I'll try to clarify.

The Client (and corepc-client) is specifically, and explicitly, only for integration testing. I envisage only two users of Client:

  1. The integration_test crate in this repo (solely exits to verify that the types crate is correct).
  2. Other projects that wish to do integration testing using bitcoind (for an example see rust-miniscript/bitcoind-tests).

A user uses Client by way of corepc-node and the BitcoinD type (which I'd like to rename to Node). corepc-node controls, by way of features, which version of Core the Client is for.

A user of Client should be able to write arbitrary code e.g., client.foo(). The return type of client.foo() is version specific eg if it returns Foo then the corepc-node guarantees that Foo is correct for that version of Core. This is tested because when the integration tests in this repo run they run against each version of Core.

If other projects (miniscript or anyone else) want to extend the Client I plan of having a low bar for merging new methods on Client.

Now, my opinion, and background on design assumptions:

For production code, people should write their own client and only use the corepc-types crate from this repo. They will be able to have confidence that what ever version of Core they hit the types contained in types will be valid and correct but they are version specific. Such crates will have to work out for themselves how they handle hitting different versions of Core. I maintain the position that this decision is application specific eg

  1. app explicitly only supports latest Core version (easy)
  2. app explicitly only supports latest 3 versions (if dynamic, may need branching logic in the code switching on version)
  3. app supports any version but must be statically built for that version (can use features)

And all of these depend on what subset of RPC methods the app calls. Hence my view that its app specific and not right to have a single production client library that everyone can use.

The final piece of the puzzle is the types::model module. This has types that are:

  • General over all versions of Core
  • Have rust-bitcoin types (Amount instead of u64)

One can convert from an types::vXY::Foo to a model::Foo by calling bitcoind.client.foo().into_model().

I have a good reason for the split between vXY::Foo and model::Foo but I've already written enough text for now.

@tcharding tcharding transferred this issue from rust-bitcoin/rust-bitcoind-json-rpc Nov 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants