-
This sounds like a great plan @abrassel! A couple questions I have:
In addition, I've used that guide before for personal use and it's a bit out of date for modern axum + tonic versions, but I think there's been a good amount of progress on that end (see tokio-rs/axum#2736).
-
Sorry for jumping into the conversation. Regarding
-
Hey @ognis1205, first of all, don't apologize! This is a "Request for Comments", so any and all comments are appreciated :D Second of all, I just looked at the Delta Sharing protocol and it looks relatively trivial to implement (in fact, I see on your profile that you have a delta-sharing-rs server implementation; we can likely just leverage that directly and nest it under the main router). My only question is: where did you get the information that Unity will support Delta Sharing? I was under the impression that DBX provides its own proprietary server implementation that interfaces with Unity, but that it's not necessarily a built-in feature of the catalog itself. I'm not aware of the internals.
-
Thank you for the reply and your understanding. Regarding the main router, yes, I thought the same way as you did. The reason I believe Unity will support Delta Sharing is due to the following comment and the resource:
As you mentioned, just from the roadmap and her statement, it might still be unclear how they plan to support the Delta Sharing protocol.
-
@ognis1205 ty so much for the links! You're indeed right. At the end of the day, the "unity catalog protocol" is basically an access-control server for assets scoped like a database (all it does is hand out leases to assets living in cloud storage). Delta Sharing is basically the same thing, minus the multimodality and plus some parquet-specific optimizations (like data skipping). I guess we can say that Unity Catalog is meant to represent an evolution of Delta Sharing (while Delta Sharing is a bit more stateful, Unity Catalog is theoretically agnostic to the underlying data asset). That is to say, if Unity Catalog is a more generalized form of Delta Sharing (which I currently believe it to be), then nesting a router under the main Unity Catalog router is probably trivial, depending on the backend. In all honesty I don't know, but anyway, I just discovered
-
One requirement I'd like to advocate for is support for the Iceberg REST catalog, like what's being worked on in https://github.com/unitycatalog/unitycatalog ! I'd be happy to help with any efforts in that area.
-
Thanks @amogh-jahagirdar ! That's a great suggestion. I agree that we should definitely prioritize that super useful feature. It may be slightly outside the scope of this RFC, since here we're focused on the broad capabilities and architecture (i.e. whether we are exposing gRPC endpoints, rather than which API endpoints). That being said, it would be great if you could submit an RFC explicitly asking for Iceberg support! I don't think it'll be controversial :)
-
Final thoughts from anyone in this thread? Personally, I am inclined towards gRPC and Protobuf definitions, but at least for the time being, I want to focus on the REST implementation first and retrofit gRPC later. I've spent some time trying to get the new axum and tonic working together and it's quite challenging; I worry that gRPC will block us.
-
Sounds great to me! I think let's consider this RFC closed. I'll be approaching this from the client side, and we can use Swagger to generate a Rust client from the OpenAPI spec.
-
Have just been redirected here by @abrassel. I agree with everything said in this thread and would personally also favor gRPC. I just did some playing around with that in delta-sharing-rs, and while there are some rough edges to even out, I believe it works out nicely. I would also be happy to open a PR, not necessarily to merge it, but to see what that might look like.
-
TL;DR
As we begin work on the server, we will need to settle on a high-level architecture. What will the server do? What framework(s) will we use? What is the entity source of truth? ... and more.
This RFC will dedicate one section to each of these questions, and we can expand the scope as necessary.
External Requirements
MUST
SHOULD
gRPC
In addition to these external requirements, I am proposing that our implementation also expose a mirror set of gRPC endpoints. This, combined with a protobuf spec, will have a number of performance benefits:
Object Model
Currently, we have code-genned a set of Rust objects to represent the Unity Catalog object model. If we use protobuf instead, I am proposing we also generate our Rust object model from the protobuf definitions. This comes with a number of benefits:
Server architecture
If we pursue a dual gRPC and HTTP server, this guide seems like a decent model to follow.
TL;DR: it proposes using axum, hyper, tonic, and tower.
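For what it's worth, one common approach in dual gRPC + HTTP setups is to decide per request which stack handles it by looking at the content-type header. Below is a minimal, std-only sketch of just that decision. The `Backend` enum and `pick_backend` function are hypothetical names made up for illustration; they are not part of axum, tonic, or the linked guide, and a real server would wire this logic into a tower service rather than call it directly.

```rust
/// Which stack should handle an incoming request.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq, Eq)]
pub enum Backend {
    /// Hand the request to the gRPC service.
    Grpc,
    /// Hand the request to the REST router.
    Rest,
}

/// Pick a backend from the request's `content-type` header value, if present.
///
/// gRPC requests carry a content type starting with `application/grpc`
/// (for example `application/grpc+proto`). Everything else, including a
/// missing header, falls through to REST. Note that this prefix check also
/// matches `application/grpc-web`, which a real server may want to treat
/// separately.
pub fn pick_backend(content_type: Option<&str>) -> Backend {
    match content_type {
        Some(ct) if ct.trim().to_ascii_lowercase().starts_with("application/grpc") => {
            Backend::Grpc
        }
        _ => Backend::Rest,
    }
}
```

Calling `pick_backend(Some("application/grpc+proto"))` selects the gRPC side, while `pick_backend(Some("application/json"))` and `pick_backend(None)` both select REST. Defaulting to REST keeps plain browser and curl traffic working without any special headers.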
We can furthermore use prost to generate our Rust types and utoipa for our OpenAPI and Swagger spec. It is worth noting that we will need to do additional work, and possibly an upstream contribution, to properly support the application/x-ndjson specification. See this issue for some context.
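For context on what application/x-ndjson support involves on the wire: the body is a sequence of JSON values, one per line, separated by newlines. Here is a rough std-only sketch of the framing, assuming each record has already been serialized to compact JSON elsewhere. `NDJSON_CONTENT_TYPE`, `frame_ndjson`, and `split_ndjson` are hypothetical helpers for illustration, not utoipa or axum APIs.

```rust
/// Content type used for newline-delimited JSON responses.
pub const NDJSON_CONTENT_TYPE: &str = "application/x-ndjson";

/// Join already-serialized JSON records into an NDJSON body.
///
/// Each record must fit on one line. Compact JSON escapes line breaks inside
/// strings, so a raw `\n` or `\r` here means the record was pretty-printed
/// or is not valid JSON, and we reject it instead of corrupting the stream.
pub fn frame_ndjson<S: AsRef<str>>(records: &[S]) -> Result<String, String> {
    let mut body = String::new();
    for (i, record) in records.iter().enumerate() {
        let record = record.as_ref();
        if record.contains('\n') || record.contains('\r') {
            return Err(format!("record {} contains a raw line break", i));
        }
        body.push_str(record);
        body.push('\n');
    }
    Ok(body)
}

/// Split an NDJSON body back into per-record JSON strings,
/// skipping blank lines.
pub fn split_ndjson(body: &str) -> Vec<&str> {
    body.lines().filter(|line| !line.trim().is_empty()).collect()
}
```

The framing itself is simple; the extra work mentioned above is presumably on the spec-generation side, since a body like this is a stream of JSON values rather than a single JSON document.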