api: allow use of UUIDs as component IDs in instance specs #816

Open

wants to merge 3 commits into gjcolombo/one-ensure-api
Conversation

gjcolombo
Contributor

Stacked on #813. Related to #804.

Reviewer note: Most of this change is mechanical conversion of types, but there are some slightly more substantial functional changes to Crucible backend management in server.rs, initializer.rs, and state_driver.rs.


Today instance specs use Strings to identify VM components. In many cases, components are associated with control plane objects (like Disk and NetworkInterface records) that have UUIDs. Allow components to be identified by UUID by defining a SpecKey type that deserializes as a Uuid when possible and as a String when not, then plumb it throughout Propolis. See the comments on the SpecKey type for more details, including notes on why the type is a UUID/String union and not just a UUID.

The main functional change in this PR is that propolis-server's component maps now use SpecKeys to identify components. The idea is that if a component has key "Foo" in a VM's spec, then subsequent API calls that want to act on that component should pass "Foo" as the component ID. This (hopefully) simplifies how Crucible-related APIs designate the disk they're targeting: Omicron picks an ID for each Crucible backend it requests (which can simply be the control plane disk ID), then passes the same ID to subsequent snapshot and VCR replacement operations. Control plane disk names no longer appear in the Propolis API at all.

Tests: cargo test, PHD, manual creation/migration of VMs with propolis-cli, manually hit the snapshot and VCR replacement endpoints and verified these dispatched their operations to the correct Crucible backend.

Fixes #776. Fixes #772.

Instead of using Strings as keys in instance specs, define a SpecKey
type that is a union of a UUID and a String. This type serializes as a
string, and its `FromStr` and `From<String>` impls will try to parse
strings as UUIDs before storing keys in string format. This allows the
control plane to use UUIDs wherever it makes sense while allowing it
(and other users) to continue to use strings where a UUID component ID
is unavailable or is being used by some other spec element (as might be
the case for a disk device and disk backend).
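To make the shape concrete, here is a minimal, hypothetical sketch of such a key type. The real SpecKey uses uuid::Uuid and serde; the hand-rolled looks_like_uuid check below is a stand-in so the example compiles without external crates:

```rust
// Hypothetical, simplified sketch of SpecKey. The real type stores a
// uuid::Uuid; `looks_like_uuid` is a stand-in parser here.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum SpecKey {
    Uuid(String), // uuid::Uuid in the real type
    Name(String),
}

// Checks the canonical 8-4-4-4-12 hex layout.
fn looks_like_uuid(s: &str) -> bool {
    let parts: Vec<&str> = s.split('-').collect();
    parts.len() == 5
        && parts
            .iter()
            .zip([8usize, 4, 4, 4, 12])
            .all(|(p, l)| p.len() == l && p.chars().all(|c| c.is_ascii_hexdigit()))
}

impl From<String> for SpecKey {
    // Try to parse as a UUID first; fall back to a plain name.
    fn from(s: String) -> Self {
        if looks_like_uuid(&s) {
            SpecKey::Uuid(s.to_ascii_lowercase())
        } else {
            SpecKey::Name(s)
        }
    }
}

fn main() {
    let disk = SpecKey::from("6a1c9b54-7c1e-4c6f-9d2a-0b1c2d3e4f5a".to_string());
    let nic = SpecKey::from("my-nic".to_string());
    assert!(matches!(disk, SpecKey::Uuid(_)));
    assert!(matches!(nic, SpecKey::Name(_)));
    println!("ok");
}
```

Because the UUID parse is attempted first, a control-plane caller gets UUID keys "for free," while ad-hoc users keep using plain strings.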

Index most objects in propolis-server using SpecKeys instead of Strings.
This is mostly just a rote change-the-types, make-it-compile exercise,
but it extends to Crucible backends, which deserve some additional
commentary. In the old ensure API, sending a `DiskRequest` would
register disk components with several different identifiers:

- The disk's name, given by the `DiskRequest`'s `name` field, is used
  - as a key in the VM's `DeviceMap`,
  - as the disk device component's key in the VM's instance spec, and
  - to generate the disk backend component's key in the VM's instance
    spec.
- The disk's ID, given as the `id` in the `Volume`-variant
  `VolumeConstructionRequest` in the `DiskRequest`, is reported by the
  Crucible backend's `get_uuid` function and used
  - to register a metrics producer for each disk, and
  - as the key in the VM's `CrucibleBackendMap`.

Now that all new VMs are started using instance specs, use component
keys for everything:

- Entities in the `DeviceMap` are identified by their corresponding
  components' `SpecKey`s.
- Crucible backends in the `CrucibleBackendMap` are also identified by
  their `SpecKey`s.
- APIs that take a Crucible backend ID on their path now take a String
  and not a UUID. The server converts these to a SpecKey before looking
  up backends in the map.
- When a Crucible backend is created, the machine initializer checks to see
  if its `SpecKey` was a `Uuid`. If so, it will register the backend to
  produce metrics.
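The bullets above can be sketched roughly as follows. All names here (Key, key_from_api_string, the map values) are illustrative stand-ins, and UUID detection is simplified to avoid external crates:

```rust
use std::collections::HashMap;

// Stand-in for the spec-key type shared by both maps.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Key {
    Uuid(String),
    Name(String),
}

// Stand-in for the server's String-to-SpecKey conversion on API paths.
// UUID-shaped keys are normalized (lowercased) so the same UUID in any
// case finds the same map entry; the real server parses with uuid::Uuid.
fn key_from_api_string(s: &str) -> Key {
    let parts: Vec<&str> = s.split('-').collect();
    let uuid_shaped = parts.len() == 5
        && parts.iter().zip([8usize, 4, 4, 4, 12]).all(|(p, l)| {
            p.len() == l && p.chars().all(|c| c.is_ascii_hexdigit())
        });
    if uuid_shaped {
        Key::Uuid(s.to_ascii_lowercase())
    } else {
        Key::Name(s.to_string())
    }
}

fn main() {
    let disk_id = "6a1c9b54-7c1e-4c6f-9d2a-0b1c2d3e4f5a";

    // DeviceMap and CrucibleBackendMap analogues, keyed the same way.
    let mut devices: HashMap<Key, &str> = HashMap::new();
    let mut crucible_backends: HashMap<Key, &str> = HashMap::new();
    devices.insert(key_from_api_string(disk_id), "nvme device");
    crucible_backends.insert(key_from_api_string(disk_id), "crucible backend");

    // A snapshot/VCR-replacement request arrives with the ID as a String
    // on the path; converting it finds the same entry.
    let from_path = key_from_api_string("6A1C9B54-7C1E-4C6F-9D2A-0B1C2D3E4F5A");
    assert!(crucible_backends.contains_key(&from_path));
    println!("ok");
}
```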

The intention is that Omicron will use disk IDs as the keys for its
`CrucibleStorageBackend` components in the specs it generates. It can
then also use these IDs as parameters when requesting snapshots or VCR
replacements. The friendly names users give their Omicron disks no
longer appear in the Propolis API at all.
@gjcolombo
Contributor Author

@leftwo You may want to take a look at the Crucible bits of this one (in server.rs, initializer.rs, and state_driver.rs). Instead of calling Volume::get_uuid to get a disk ID from the VCR to use in the CrucibleBackendMap, Propolis now expects that

  • Omicron will choose a Crucible backend component ID for each Crucible disk when a VM starts
  • This ID will be used as the key to the Crucible backend map
  • Omicron will specify the same ID when it needs to request a snapshot or replace a VCR

I reckon Omicron will just use the disk ID as the Crucible backend ID to make this as smooth as possible in Nexus and sled agent. As far as I can tell, the disk ID is the ID that we were getting from get_uuid before, because that's how Omicron constructs its volume construction requests; this just makes the IDs at the Propolis layer a little more explicit/transparent.

LMK whether this seems like a reasonable approach.

// - Propolis users outside the control plane may not have any component UUIDs
// at all and may just want to use strings to identify all their components.
//
// For these reasons, the key type is a union of a UUID and a String. This
Member

suuuuuuper pedantic, but i wouldn't call this a "union", since Rust also has union types which are untagged unions.

hawkw (Member) commented Nov 20, 2024

This seems reasonable to me overall, but I'll defer to the Crucible folks

Comment on lines +671 to +674
// If metrics are enabled and this Crucible backend was
// identified with a UUID-format spec key, register this disk
// for metrics collection, using the key as the disk ID.
if let (Some(registry), SpecKey::Uuid(disk_id)) =
Member

Nexus will never tell propolis-server about a Crucible disk with a non-UUID ID, right? if so, the SpecKey::Name path here, where there just won't be metrics collected for the disk, shouldn't be reachable except if we've crafted a weird instance request by hand? if this shouldn't ever be reachable, it seems worth rejecting the spec, or more precisely typing IDs we know need to be UUIDs.

i do remember disk names ending up in here somewhere, rather than IDs.. gonna go reread how that plumbing works unless you remember more immediately.

my real concern here is that this is where Nexus had been giving what are now SpecKey::Name, and that we'll lose Crucible metrics here. do you know if we have a test that exercises that?

gjcolombo (Contributor Author) commented Nov 22, 2024

> Nexus will never tell propolis-server about a Crucible disk with non-UUID ID, right? if so, the SpecKey::Name path here, where there just won't be metrics collected for the disk, shouldn't be reachable except if we've crafted a weird instance request by hand? if this shouldn't ever be reachable, it seems worth rejecting the spec /more precisely typing IDs we know need to be UUID.

This makes sense. I'm a little leery of it because I think it makes the API a little weird: the ID for a given component can be either a UUID or a Name, unless it's this one specific kind of component, in which case it has to be a UUID. This isn't horribly broken or anything, it's just a slightly sharp edge (and we'd also have to account for it in the non-Omicron tools/processes that let users attach ad-hoc Crucible volumes to Propolis servers).

We could address this in a few other ways:

  1. warn! if the spec key isn't a UUID but there's a producer registry available
  2. put a #[cfg(feature = "omicron-build")] guard in that enforces the "Crucible keys must be UUIDs" requirement
  3. go back to calling Volume::get_uuid in this path to get the UUID to use for stats reporting, and warn! if it differs from the spec key

I kind of like option 3 now that I've written it up: it limits our regression risk (at least where metrics are concerned) but keeps the API flexible. WDYT?
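For concreteness, option 3 might look something like this sketch. VolumeStub, metrics_id, and the eprintln! standing in for slog's warn! are all hypothetical; Crucible's real Volume::get_uuid is fallible (the call was previously context'd and ?'d):

```rust
// Self-contained sketch of "option 3": prefer the UUID reported by the
// volume itself when registering metrics, and warn on a mismatch with
// the spec key. All names here are hypothetical stand-ins.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Uuid(u128); // stand-in for uuid::Uuid

struct VolumeStub {
    id: Uuid,
}

impl VolumeStub {
    // Stand-in for Crucible's Volume::get_uuid, which reads the ID out
    // of the disk's volume construction request (fallible in the real
    // API; infallible here for brevity).
    fn get_uuid(&self) -> Uuid {
        self.id
    }
}

// Decide which ID to register metrics under.
fn metrics_id(volume: &VolumeStub, spec_key_uuid: Option<Uuid>) -> Uuid {
    let volume_id = volume.get_uuid();
    if let Some(key_id) = spec_key_uuid {
        if key_id != volume_id {
            // In Propolis this would be a slog warn!.
            eprintln!("spec key {key_id:?} differs from volume ID {volume_id:?}");
        }
    }
    volume_id
}

fn main() {
    let vol = VolumeStub { id: Uuid(42) };
    // Matching key: metrics registered under the shared ID.
    assert_eq!(metrics_id(&vol, Some(Uuid(42))), Uuid(42));
    // Mismatched (or non-UUID) key: metrics still use the volume's own
    // ID, so they aren't lost; a warning is emitted instead.
    assert_eq!(metrics_id(&vol, Some(Uuid(7))), Uuid(42));
    assert_eq!(metrics_id(&vol, None), Uuid(42));
    println!("ok");
}
```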

> i do remember disk names ending up in here somewhere, rather than IDs.. gonna go reread how that plumbing works unless you remember more immediately.

IIRC the disk name wasn't used in this specific path (EDIT: this is incorrect, see below; the disk name was previously used as a spec key, which was then used to construct serial numbers for NVMe devices), but it was used in VCR replacement, which worked like this:

  • Take the VM objects lock
  • Look up the CrucibleStorageBackend component in the instance spec
    • the key for this was derived from the disk name
  • Look up the Crucible backend object in the VM's crucible_backends map
    • the key for this turns out to be the disk ID; the initialization code path used to read it from the Crucible backend's get_uuid function, which would read it out of the disk's volume construction request; Omicron guaranteed that this ID matched the disk ID (by creating the VCR that way)
  • send the replace operation to Crucible
  • on success, write the updated VCR back to the instance spec
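With both maps keyed the same way in this PR, the replacement flow collapses to a single lookup key. A self-contained sketch (all names are stand-ins, and the Crucible call is simulated by overwriting the stored VCR):

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
struct Vcr(String); // stand-in for a VolumeConstructionRequest

struct VmObjects {
    // Instance spec components and live backend handles, both keyed by
    // the same spec key under this PR (plain Strings here for brevity).
    spec: HashMap<String, Vcr>,
    crucible_backends: HashMap<String, Vcr>,
}

fn replace_vcr(vm: &mut VmObjects, key: &str, new_vcr: Vcr) -> Result<(), String> {
    // Look up the CrucibleStorageBackend component in the spec...
    if !vm.spec.contains_key(key) {
        return Err(format!("no spec component with key {key}"));
    }
    // ...and the backend object, using the same key.
    let backend = vm
        .crucible_backends
        .get_mut(key)
        .ok_or_else(|| format!("no crucible backend with key {key}"))?;

    // Send the replacement to Crucible (simulated by overwriting the
    // handle's VCR)...
    *backend = new_vcr.clone();

    // ...and on success, write the updated VCR back to the instance spec.
    vm.spec.insert(key.to_string(), new_vcr);
    Ok(())
}

fn main() {
    let key = "6a1c9b54-7c1e-4c6f-9d2a-0b1c2d3e4f5a";
    let mut vm = VmObjects {
        spec: HashMap::from([(key.to_string(), Vcr("v1".into()))]),
        crucible_backends: HashMap::from([(key.to_string(), Vcr("v1".into()))]),
    };
    replace_vcr(&mut vm, key, Vcr("v2".into())).unwrap();
    assert_eq!(vm.spec[key], Vcr("v2".into()));
    println!("ok");
}
```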

> my real concern here is that this is where Nexus had been giving what are now SpecKey::Name, and that we'll lose Crucible metrics here. do you know if we have a test that exercises that?

We don't have an end-to-end test for disk metrics, at least AFAIK.

Member

yeah, after re-reading a bit and taking a second look, i'm also a fan of Volume::get_uuid here. i generally like the change to make crucible just Option<Arc<...>> though, so to double-check: you're thinking literally calling crucible.get_uuid() somewhere around this if, yeah? i think that should be pretty clean.

FWIW the get_uuid() call before was context'd and ?, so it would bubble an error up rather than warn. that still seems like an appropriate error behavior here too.

more UUID tracing i haven't done: could backend_id be different from crucible.get_uuid()? i assume this is "technically possible but not really desired"

@@ -586,22 +593,22 @@ impl<'a> MachineInitializer<'a> {
             Nvme,
         }

-        for (disk_name, disk) in &self.spec.disks {
+        for (device_id, disk) in &self.spec.disks {
Member


i do think that disk_name was a good name for this one - it's a SpecKey, and it could be a Uuid if something sent one, but Nexus currently sends a Name and changing it would (for better or worse) change /dev/disk/by-id/ paths guest VMs should be able to depend on.

i recognize that you're not changing disk_name to something else (yet?), but i happen to also think we should change this name to an ID. if you agree, it'd be nice to change the name of this variable when we do it 😁

gjcolombo (Contributor Author)

Great catch here, thanks! I think the disk serials will remain stable over migrations, at least, but you're right, without some care this will change the serial number strings we use for NVMe devices across a reboot.

One of my goals for this layer is to divorce components' keys from their behaviors--i.e. two devices with the same component specs should have the same guest-visible behavior even if they had different keys. In this particular case, we can achieve that by adding a serial_number field to the NvmeDisk component, and then populating it with whatever makes sense in the control plane (the actual disk name, I believe). If that sounds good, I can give this a try when I next make updates to this change.
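That proposal might look roughly like the sketch below. The field and helper names are hypothetical, but the 20-byte, space-padded serial follows the NVMe Identify Controller format:

```rust
// Hypothetical shape for decoupling the guest-visible serial number from
// the component key, per the discussion above.
struct NvmeDisk {
    backend_id: String,    // spec key of the storage backend this device uses
    serial_number: String, // populated by the control plane (e.g. the disk name)
}

// NVMe serial numbers are a fixed 20-byte ASCII field, padded with
// spaces; deriving it from an explicit field keeps it stable across
// reboots even if the component's key changes.
fn nvme_serial(sn: &str) -> [u8; 20] {
    let mut out = [b' '; 20];
    for (dst, src) in out.iter_mut().zip(sn.bytes()) {
        *dst = src;
    }
    out
}

fn main() {
    let disk = NvmeDisk {
        backend_id: "6a1c9b54-7c1e-4c6f-9d2a-0b1c2d3e4f5a".to_string(),
        serial_number: "my-boot-disk".to_string(),
    };
    assert!(!disk.backend_id.is_empty());
    let serial = nvme_serial(&disk.serial_number);
    assert_eq!(&serial[..12], b"my-boot-disk");
    assert!(serial[12..].iter().all(|&b| b == b' '));
    println!("ok");
}
```

This keeps the earlier goal intact: two devices with identical component specs behave identically to the guest regardless of what keys they were filed under.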

Member

oh yes, love that idea!
