move the TUF trust store to the database #8488

iliana · 2025-07-01T00:11:15Z

Closes #3578
Depends on oxidecomputer/tufaceous#30

This removes the [updates] section of the Nexus config, instead opting to allow operators to configure which TUF root roles they trust via the API. See RFD 447.

This deliberately punts on a few choices: any identifying information about the roots themselves; which trusted root was used to verify a repository is not recorded; deleting a trusted root does not impact any uploaded repositories; synchronizing the Nexus trust store to Wicket; whether to automatically add new versions of roots to the trust store (or possibly replace the older version, to allow for a real key rotation scenario). A future RFD and implementation PR should cover these.

iliana · 2025-07-01T00:12:15Z

TODO:

land repository load methods with trusted roots tufaceous#30
resolve the test_unauthorized_coverage test failure

iliana · 2025-07-01T00:14:22Z

nexus/tests/integration_tests/updates.rs

-// Tests that ".." paths are disallowed by dropshot.
-#[tokio::test]
-async fn test_download_with_dots_fails() {
-    let cptestctx =
-        test_setup::<omicron_nexus::Server>("test_download_with_dots_fails", 0)
-            .await;
-    let client = &cptestctx.internal_client;
-
-    let filename = "hey/can/you/look/../../../../up/the/directory/tree";
-    let artifact_get_url = format!("/artifacts/{}", filename);


I removed this test because we already test for directory traversal rejection in the console API integration tests:

omicron/nexus/tests/integration_tests/console_api.rs

Lines 375 to 385 in cbae98c

// test that `..` is not allowed in paths. (Dropshot handles this, so we

// test to ensure this hasn't gone away.)

let _ = RequestBuilder::new(

&testctx,

Method::GET,

"/assets/../assets/hello.txt",

)

.expect_status(Some(StatusCode::BAD_REQUEST))

.execute()

.await

.expect("failed to 400 on `..` traversal");

The /artifacts/{..path} internal API is long gone.

davepacheco

Nice! I can't wait to use this when doing updates.

Most of my comments here are pretty minor but I'm pretty worried about the way we're representing the root in the database and db model type. More on that below.

davepacheco · 2025-07-01T22:54:53Z

nexus/db-model/src/tuf_repo.rs

+    #[diesel(embed)]
+    pub identity: TufTrustRootIdentity,
+    pub time_deleted: Option<DateTime<Utc>>,
+    pub root_role: serde_json::Value,


I'm pretty worried about the format of this value.

The problem, as always: how do we avoid accidentally changing the format in a way that only breaks at upgrade-time at a customer site?

I'm kind of thinking of this as two separate problems:

how do we avoid adding stuff to this structure in some release and then expecting it to be there in all the rows we find, even if they were written in earlier releases

how do we ensure more generally that we can continue reading older structures

One idea is to just store it as a String. This solves (1) because we'd be treating it as opaque. It seems like we only use the stringified value anyway, right? One downside though is that if we ever do want to use the parsed version in the future, we'll have an annoying database migration to deal with because we might fail to deserialize each row. Also, this doesn't solve (2). If tough evolves its structure incompatibly, we wouldn't notice until we deployed this to a system that had roots written with an older release.

Another idea to solve (1) while still giving us future flexibility is to create a newtype around the underlying tough type. When we insert it, we serialize it as a JSON value. When we read a row, we deserialize it into the newtype. (I think you can probably impl some Diesel trait for this, but you can also just do it inline at insert/read time.) This prevents us from adding our own stuff to it without considering compatibility, but still makes sure what's in the database is valid JSON, but it doesn't solve (2).

A way we usually solve (2) is to add an expectorate test on the JSON schema, but that doesn't seem possible here without changing tough to impl JsonSchema.

We could define our own copy of the tough type for this that does impl JsonSchema, do the newtype thing, and add an expectorate test.

Last idea: if the format is really simple, we could hardcode some specific JSON that we believe is representative and then have a test that we can deserialize those. This is easy but a little brittle: it wouldn't catch cases where some future release added something (not covered by the test) and then that then got ripped out in a later release.

Of course, the other option is to define a proper schema for the whole object and store it directly. That might actually be the least work of these.

We discussed this a bit on this morning's update watercooler call.

In terms of the DB representation, we settled on keeping it stored as JSON. I had brought up that it seems unfortunate to either require a nested JSON string or a base64 representation of an artifact that should be trivially visible to a customer so they can check if it matches what we provide elsewhere. In order to ensure the GET paths always work, it makes sense to store the underlying data as JSON so that there's a guarantee that whatever's in the database can actually be serialized out as part of a JSON response.

In terms of avoiding adding stuff to the structure, we settled on adding a newtype to the db-macros crate that implements Deserialize and Serialize, so that the format of a trust root is completely opaque to Nexus or any other code. This will probably be a newtype around serde_json::Value as opposed to the tough Signed<Root>, in case we decide to add any top-level fields to the signature envelope later (either for identification or cross-signing from another PKI); such things would still be opaque to Nexus and would likely be implemented in Tufaceous.

I'll also add a simple test to verify that we can still generate a repository with a hardcoded root (and its associated signing key) to help prevent being surprised by an incompatible change to tough. When we have a real updates PKI we should also add tests for those trust roots.

davepacheco · 2025-07-01T22:55:55Z

nexus/db-queries/src/db/datastore/update.rs

+    ) -> ListResultVec<TufTrustRoot> {
+        use nexus_db_schema::schema::tuf_trust_root::dsl;
+
+        opctx.authorize(authz::Action::Read, &authz::FLEET).await?;


nit and not a big deal: I'd suggest creating a little synthetic resource for the list of TUF trust roots.

I'm not sure what that means, particularly in the context of db-queries code?

I mean that in general, it's slightly better to have the authorize() calls use the actual resource that you're operating on ("list of TUF trust roots") and let the policy decide what that means in terms of roles ("you can add to the list of TUF trust roots if you can modify the fleet") than it is to bake the policy choice in the authorize() call. So for a bunch of resources, we create what we call a synthetic resource (which just means that it doesn't correspond to something in the database).

There's a paragraph of docs here:

omicron/nexus/auth/src/authz/omicron.polar

Lines 339 to 348 in c16f14f

#

# SYNTHETIC RESOURCES OUTSIDE THE SILO HIERARCHY

#

# The resources here do not correspond to anything that appears explicitly in

# the API or is stored in the database. These are used either at the top level

# of the API path (e.g., "/v1/system/ip-pools") or as an implementation detail of the system

# (in the case of console sessions and "Database"). The policies are

# either statically-defined in this file or driven by role assignments on the

# Fleet. None of these resources defines their own roles.

#

and there's an example here:

omicron/nexus/auth/src/authz/omicron.polar

Lines 452 to 468 in c16f14f

# Describes the policy for creating and managing Silo certificates

resource SiloCertificateList {

permissions = [ "list_children", "create_child" ];

relations = { parent_silo: Silo, parent_fleet: Fleet };

# Both Fleet and Silo administrators can see and modify the Silo's

# certificates.

"list_children" if "admin" on "parent_silo";

"list_children" if "admin" on "parent_fleet";

"create_child" if "admin" on "parent_silo";

"create_child" if "admin" on "parent_fleet";

}

has_relation(silo: Silo, "parent_silo", collection: SiloCertificateList)

if collection.silo = silo;

has_relation(fleet: Fleet, "parent_fleet", collection: SiloCertificateList)

if collection.silo.fleet = fleet;

the other half is the Rust part:

omicron/nexus/auth/src/authz/api_resources.rs

Lines 518 to 568 in c16f14f

/// Synthetic resource describing the list of Certificates associated with a

/// Silo

#[derive(Clone, Debug, Eq, PartialEq)]

pub struct SiloCertificateList(Silo);

impl SiloCertificateList {

pub fn new(silo: Silo) -> SiloCertificateList {

SiloCertificateList(silo)

}

pub fn silo(&self) -> &Silo {

&self.0

}

}

impl oso::PolarClass for SiloCertificateList {

fn get_polar_class_builder() -> oso::ClassBuilder<Self> {

oso::Class::builder()

.with_equality_check()

.add_attribute_getter("silo", |list: &SiloCertificateList| {

list.0.clone()

})

}

}

impl AuthorizedResource for SiloCertificateList {

fn load_roles<'fut>(

&'fut self,

opctx: &'fut OpContext,

authn: &'fut authn::Context,

roleset: &'fut mut RoleSet,

) -> futures::future::BoxFuture<'fut, Result<(), Error>> {

// There are no roles on this resource, but we still need to load the

// Silo-related roles.

self.silo().load_roles(opctx, authn, roleset)

}

fn on_unauthorized(

&self,

_: &Authz,

error: Error,

_: AnyActor,

_: Action,

) -> Error {

error

}

fn polar_class(&self) -> oso::Class {

Self::get_polar_class()

}

}

Hopefully that's all there is to it. The advantages are: it's easier to write and also validate the authz checks (because you check exactly the action you're doing on the resource you're doing it to) and it also means it shows up in places like this:

omicron/nexus/db-queries/tests/output/authz-roles.out

Lines 155 to 167 in c16f14f

resource: Silo "silo1": certificate list

USER Q R LC RP M MP CC D

fleet-admin ✘ ✘ ✔ ✘ ✘ ✘ ✔ ✘

fleet-collaborator ✘ ✘ ✘ ✘ ✘ ✘ ✘ ✘

fleet-viewer ✘ ✘ ✘ ✘ ✘ ✘ ✘ ✘

silo1-admin ✘ ✘ ✔ ✘ ✘ ✘ ✔ ✘

silo1-collaborator ✘ ✘ ✘ ✘ ✘ ✘ ✘ ✘

silo1-viewer ✘ ✘ ✘ ✘ ✘ ✘ ✘ ✘

silo1-proj1-admin ✘ ✘ ✘ ✘ ✘ ✘ ✘ ✘

silo1-proj1-collaborator ✘ ✘ ✘ ✘ ✘ ✘ ✘ ✘

silo1-proj1-viewer ✘ ✘ ✘ ✘ ✘ ✘ ✘ ✘

unauthenticated ! ! ! ! ! ! ! !

but in the end, it's not a big deal -- the behavior is the same.

davepacheco · 2025-07-01T22:59:51Z

nexus/external-api/src/lib.rs

+    /// Upload update repository
+    ///
+    /// Update repositories are verified by the update system trust store.


I think @ahl had some thoughts on nomenclature here.

I often find it confusing when we use "update" as a noun, but it might be unavoidable.

davepacheco · 2025-07-01T23:00:58Z

nexus/external-api/src/lib.rs

@@ -2985,6 +2987,58 @@ pub trait NexusExternalApi {
        path_params: Path<params::UpdatesGetRepositoryParams>,
    ) -> Result<HttpResponseOk<TufRepoGetResponse>, HttpError>;

+    /// List root roles in the update system trust store
+    ///
+    /// Root roles are a JSON document describing the keys trusted to sign


Suggested change

/// Root roles are a JSON document describing the keys trusted to sign

/// Root roles are JSON documents describing the security keys that are trusted to sign

davepacheco · 2025-07-01T23:47:00Z

nexus/external-api/src/lib.rs

+    #[endpoint {
+        method = GET,
+        path = "/v1/system/update/trust-roots",
+        tags = ["experimental"], // ["system/update"],


this is going to generate CLI commands, right?

davepacheco · 2025-07-02T02:15:48Z

nexus/src/app/update.rs

+        opctx: &OpContext,
+        trust_root: serde_json::Value,
+    ) -> Result<TufTrustRoot, HttpError> {
+        opctx.authorize(authz::Action::Modify, &authz::FLEET).await?;


I think this belongs in the DataStore, and if you have it there, I wouldn't keep it here.

That goes for all the functions here.

davepacheco · 2025-07-02T02:17:33Z

nexus/src/app/update.rs

@@ -25,16 +32,27 @@ impl super::Nexus {
    ) -> Result<TufRepoInsertResponse, HttpError> {
        opctx.authorize(authz::Action::Modify, &authz::FLEET).await?;

-        // XXX: this needs to validate against the trusted root!


schema/crdb/dbinit.sql

davepacheco · 2025-07-02T02:23:10Z

update-common/src/artifacts/artifacts_with_plan.rs

+    TrustStore(&'a [Vec<u8>]),
+    /// Blindly trust the root role present in the repository (but still perform
+    /// signature checks using that root role).
+    BlindlyTrustAnything,


I regret that this exists but I gather we need it for MUPdate and also use it in reconfigurator-cli.

Yeah. I think eventually we'll add some sort of trust to Wicket but we should design what we want there.

iliana added 6 commits June 30, 2025 15:46

wip

645e15e

wip

e5f9ba5

wip

c5c0306

wip

53bf36b

wip

8e82a61

wip

81e01f6

iliana requested review from sunshowers, davepacheco and plotnick July 1, 2025 00:11

iliana commented Jul 1, 2025

View reviewed changes

Merge remote-tracking branch 'origin/main' into iliana/tuf-trust

cfe8636

davepacheco reviewed Jul 2, 2025

View reviewed changes

iliana added 3 commits July 3, 2025 18:22

drop time_modified; comment on the index

daaada8

do the wrapper struct thing

79e5344

remove redundant opctx.authorize calls

ac86807

	// test that `..` is not allowed in paths. (Dropshot handles this, so we
	// test to ensure this hasn't gone away.)
	let _ = RequestBuilder::new(
	&testctx,
	Method::GET,
	"/assets/../assets/hello.txt",
	)
	.expect_status(Some(StatusCode::BAD_REQUEST))
	.execute()
	.await
	.expect("failed to 400 on `..` traversal");

	#
	# SYNTHETIC RESOURCES OUTSIDE THE SILO HIERARCHY
	#
	# The resources here do not correspond to anything that appears explicitly in
	# the API or is stored in the database. These are used either at the top level
	# of the API path (e.g., "/v1/system/ip-pools") or as an implementation detail of the system
	# (in the case of console sessions and "Database"). The policies are
	# either statically-defined in this file or driven by role assignments on the
	# Fleet. None of these resources defines their own roles.
	#

	# Describes the policy for creating and managing Silo certificates
	resource SiloCertificateList {
	permissions = [ "list_children", "create_child" ];

	relations = { parent_silo: Silo, parent_fleet: Fleet };

	# Both Fleet and Silo administrators can see and modify the Silo's
	# certificates.
	"list_children" if "admin" on "parent_silo";
	"list_children" if "admin" on "parent_fleet";
	"create_child" if "admin" on "parent_silo";
	"create_child" if "admin" on "parent_fleet";
	}
	has_relation(silo: Silo, "parent_silo", collection: SiloCertificateList)
	if collection.silo = silo;
	has_relation(fleet: Fleet, "parent_fleet", collection: SiloCertificateList)
	if collection.silo.fleet = fleet;

	/// Synthetic resource describing the list of Certificates associated with a
	/// Silo
	#[derive(Clone, Debug, Eq, PartialEq)]
	pub struct SiloCertificateList(Silo);

	impl SiloCertificateList {
	pub fn new(silo: Silo) -> SiloCertificateList {
	SiloCertificateList(silo)
	}

	pub fn silo(&self) -> &Silo {
	&self.0
	}
	}

	impl oso::PolarClass for SiloCertificateList {
	fn get_polar_class_builder() -> oso::ClassBuilder<Self> {
	oso::Class::builder()
	.with_equality_check()
	.add_attribute_getter("silo", \|list: &SiloCertificateList\| {
	list.0.clone()
	})
	}
	}

	impl AuthorizedResource for SiloCertificateList {
	fn load_roles<'fut>(
	&'fut self,
	opctx: &'fut OpContext,
	authn: &'fut authn::Context,
	roleset: &'fut mut RoleSet,
	) -> futures::future::BoxFuture<'fut, Result<(), Error>> {
	// There are no roles on this resource, but we still need to load the
	// Silo-related roles.
	self.silo().load_roles(opctx, authn, roleset)
	}

	fn on_unauthorized(
	&self,
	_: &Authz,
	error: Error,
	_: AnyActor,
	_: Action,
	) -> Error {
	error
	}

	fn polar_class(&self) -> oso::Class {
	Self::get_polar_class()
	}
	}

	resource: Silo "silo1": certificate list

	USER Q R LC RP M MP CC D
	fleet-admin ✘ ✘ ✔ ✘ ✘ ✘ ✔ ✘
	fleet-collaborator ✘ ✘ ✘ ✘ ✘ ✘ ✘ ✘
	fleet-viewer ✘ ✘ ✘ ✘ ✘ ✘ ✘ ✘
	silo1-admin ✘ ✘ ✔ ✘ ✘ ✘ ✔ ✘
	silo1-collaborator ✘ ✘ ✘ ✘ ✘ ✘ ✘ ✘
	silo1-viewer ✘ ✘ ✘ ✘ ✘ ✘ ✘ ✘
	silo1-proj1-admin ✘ ✘ ✘ ✘ ✘ ✘ ✘ ✘
	silo1-proj1-collaborator ✘ ✘ ✘ ✘ ✘ ✘ ✘ ✘
	silo1-proj1-viewer ✘ ✘ ✘ ✘ ✘ ✘ ✘ ✘
	unauthenticated ! ! ! ! ! ! ! !

	/// Root roles are a JSON document describing the keys trusted to sign
	/// Root roles are JSON documents describing the security keys that are trusted to sign

move the TUF trust store to the database #8488

Are you sure you want to change the base?

move the TUF trust store to the database #8488

Uh oh!

Conversation

iliana commented Jul 1, 2025

Uh oh!

iliana commented Jul 1, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

davepacheco left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!