[inventory] Add full OmicronSledConfig and fields for upcoming config reconciler #8188

Merged
25 commits
51fe9cc
initial inventory integration - no database fixes yet
jgallagher May 15, 2025
d507eec
update db schema and model for new inventory fields
jgallagher May 16, 2025
655528b
wip: first pass at inserts
jgallagher May 16, 2025
d409d4a
continued wip: reading collections back out
jgallagher May 19, 2025
b74d38f
wip: delete collections
jgallagher May 19, 2025
2e9555a
persist OmicronZoneConfig::image_source
jgallagher May 19, 2025
8b316e6
add schema migration
jgallagher May 19, 2025
d06d6be
finish todo!
jgallagher May 20, 2025
446bc61
fixup tests
jgallagher May 20, 2025
b78b434
fix db type namespace
jgallagher May 20, 2025
87579cb
openapi update
jgallagher May 20, 2025
9f13465
expectorate
jgallagher May 20, 2025
dbf98fe
Merge remote-tracking branch 'origin/main' into john/sled-agent-confi…
jgallagher May 20, 2025
5274214
report empty disk/dataset configs before RSS
jgallagher May 20, 2025
4fe5d13
omdb output cleanup
jgallagher May 20, 2025
15738d6
use strongly-typed IDs in Tabled structs
jgallagher May 21, 2025
769d98e
expand comment
jgallagher May 21, 2025
f97bb70
remove dead code
jgallagher May 21, 2025
577483b
struct with named fields over huge tuple
jgallagher May 21, 2025
5a93622
slf -> this
jgallagher May 21, 2025
a7c62c5
clearer InvOmicronSledConfig::new()
jgallagher May 21, 2025
7ee7e6e
combine closely-related CHECK constraints
jgallagher May 21, 2025
ddea0c8
add and use strongly-typed OmicronSledConfigUuid
jgallagher May 21, 2025
f964a51
Merge remote-tracking branch 'origin/main' into john/sled-agent-confi…
jgallagher May 21, 2025
6bf29a0
Merge remote-tracking branch 'origin/main' into john/sled-agent-confi…
jgallagher May 21, 2025
1 change: 1 addition & 0 deletions Cargo.lock

1 change: 1 addition & 0 deletions dev-tools/omdb/Cargo.toml
@@ -44,6 +44,7 @@ nexus-db-schema.workspace = true
 nexus-inventory.workspace = true
 nexus-reconfigurator-preparation.workspace = true
 nexus-saga-recovery.workspace = true
+nexus-sled-agent-shared.workspace = true
 nexus-types.workspace = true
 omicron-common.workspace = true
 omicron-uuid-kinds.workspace = true
210 changes: 196 additions & 14 deletions dev-tools/omdb/src/bin/omdb/db.rs
@@ -123,6 +123,10 @@ use nexus_db_queries::db::pagination::Paginator;
 use nexus_db_queries::db::pagination::paginated;
 use nexus_db_queries::db::queries::ALLOW_FULL_TABLE_SCAN_SQL;
 use nexus_db_queries::db::queries::region_allocation;
+use nexus_sled_agent_shared::inventory::ConfigReconcilerInventoryResult;
+use nexus_sled_agent_shared::inventory::ConfigReconcilerInventoryStatus;
+use nexus_sled_agent_shared::inventory::OmicronSledConfig;
+use nexus_sled_agent_shared::inventory::OmicronZoneImageSource;
 use nexus_types::deployment::Blueprint;
 use nexus_types::deployment::BlueprintZoneDisposition;
 use nexus_types::deployment::BlueprintZoneType;
@@ -149,6 +153,7 @@ use omicron_uuid_kinds::DatasetUuid;
 use omicron_uuid_kinds::DownstairsRegionUuid;
 use omicron_uuid_kinds::GenericUuid;
 use omicron_uuid_kinds::InstanceUuid;
+use omicron_uuid_kinds::OmicronZoneUuid;
 use omicron_uuid_kinds::ParseError;
 use omicron_uuid_kinds::PhysicalDiskUuid;
 use omicron_uuid_kinds::PropolisUuid;
@@ -7329,27 +7334,204 @@ fn inv_collection_print_sleds(collection: &Collection) {
             println!(" reservation: {reservation:?}, quota: {quota:?}");
         }

-        println!(
-            " zones generation: {} (count: {})",
-            sled.omicron_zones.generation,
-            sled.omicron_zones.zones.len(),
-        );
+        if let Some(config) = &sled.ledgered_sled_config {
+            inv_collection_print_sled_config("LEDGERED", config);
+        } else {
+            println!(" no ledgered sled config");
+        }

-        if sled.omicron_zones.zones.is_empty() {
-            continue;
+        if let Some(last_reconciliation) = &sled.last_reconciliation {
+            if Some(&last_reconciliation.last_reconciled_config)
+                == sled.ledgered_sled_config.as_ref()
+            {
+                println!(" last reconciled config: matches ledgered config");
+            } else {
+                inv_collection_print_sled_config(
+                    "LAST RECONCILED CONFIG",
+                    &last_reconciliation.last_reconciled_config,
+                );
+                let disk_errs = collect_config_reconciler_errors(
+                    &last_reconciliation.external_disks,
+                );
+                let dataset_errs = collect_config_reconciler_errors(
+                    &last_reconciliation.datasets,
+                );
+                let zone_errs = collect_config_reconciler_errors(
+                    &last_reconciliation.zones,
+                );
+                for (label, errs) in [
+                    ("disk", disk_errs),
+                    ("dataset", dataset_errs),
+                    ("zone", zone_errs),
+                ] {
+                    if errs.is_empty() {
+                        println!(" all {label}s reconciled successfully");
+                    } else {
+                        println!(
+                            " {} {label} reconciliation errors:",
+                            errs.len()
+                        );
+                        for err in errs {
+                            println!(" {err}");
+                        }
+                    }
+                }
+            }
         }

-        println!(" ZONES FOUND");
-        for z in &sled.omicron_zones.zones {
-            println!(
-                " zone {} (type {})",
-                z.id,
-                z.zone_type.kind().report_str()
-            );
+        print!(" reconciler task status: ");
+        match &sled.reconciler_status {
+            ConfigReconcilerInventoryStatus::NotYetRun => {
+                println!("not yet run");
+            }
+            ConfigReconcilerInventoryStatus::Running {
+                config,
+                started_at,
+                running_for,
+            } => {
+                println!("running for {running_for:?} (since {started_at})");
+                if Some(config) == sled.ledgered_sled_config.as_ref() {
+                    println!(" reconciling currently-ledgered config");
+                } else {
+                    inv_collection_print_sled_config(
+                        "RECONCILING CONFIG",
+                        config,
+                    );
+                }
+            }
+            ConfigReconcilerInventoryStatus::Idle { completed_at, ran_for } => {
+                println!(
+                    "idle (finished at {completed_at} \
+                     after running for {ran_for:?})"
+                );
+            }
         }
     }
 }

+fn collect_config_reconciler_errors<T: Ord + Display>(
+    results: &BTreeMap<T, ConfigReconcilerInventoryResult>,
+) -> Vec<String> {
+    results
+        .iter()
+        .filter_map(|(id, result)| match result {
+            ConfigReconcilerInventoryResult::Ok => None,
+            ConfigReconcilerInventoryResult::Err { message } => {
+                Some(format!("{id}: {message}"))
+            }
+        })
+        .collect()
+}
+
+fn inv_collection_print_sled_config(label: &str, config: &OmicronSledConfig) {
+    let OmicronSledConfig {
+        generation,
+        disks,
+        datasets,
+        zones,
+        remove_mupdate_override,
+    } = config;
+
+    println!("\n{label} SLED CONFIG");
+    println!(" generation: {}", generation);
+    println!(" remove_mupdate_override: {remove_mupdate_override:?}");
+
+    if disks.is_empty() {
+        println!(" disk config empty");
+    } else {
+        #[derive(Tabled)]
+        #[tabled(rename_all = "SCREAMING_SNAKE_CASE")]
+        struct DiskRow {
+            id: PhysicalDiskUuid,
+            zpool_id: ZpoolUuid,
+            vendor: String,
+            model: String,
+            serial: String,
+        }
+
+        let rows = disks.iter().map(|d| DiskRow {
+            id: d.id,
+            zpool_id: d.pool_id,
+            vendor: d.identity.vendor.clone(),
+            model: d.identity.model.clone(),
+            serial: d.identity.serial.clone(),
+        });
+        let table = tabled::Table::new(rows)
+            .with(tabled::settings::Style::empty())
+            .with(tabled::settings::Padding::new(8, 1, 0, 0))
+            .to_string();
+        println!(" DISKS: {}", disks.len());
+        println!("{table}");
+    }
+
+    if datasets.is_empty() {
+        println!(" dataset config empty");
+    } else {
+        #[derive(Tabled)]
+        #[tabled(rename_all = "SCREAMING_SNAKE_CASE")]
+        struct DatasetRow {
+            id: DatasetUuid,
+            name: String,
+            compression: String,
+            quota: String,
+            reservation: String,
+        }
+
+        let rows = datasets.iter().map(|d| DatasetRow {
+            id: d.id,
+            name: d.name.full_name(),
+            compression: d.inner.compression.to_string(),
+            quota: d
+                .inner
+                .quota
+                .map(|q| q.to_string())
+                .unwrap_or_else(|| "none".to_string()),
+            reservation: d
+                .inner
+                .reservation
+                .map(|r| r.to_string())
+                .unwrap_or_else(|| "none".to_string()),
+        });
+        let table = tabled::Table::new(rows)
+            .with(tabled::settings::Style::empty())
+            .with(tabled::settings::Padding::new(8, 1, 0, 0))
+            .to_string();
+        println!(" DATASETS: {}", datasets.len());
+        println!("{table}");
+    }
+
+    if zones.is_empty() {
+        println!(" zone config empty");
+    } else {
+        #[derive(Tabled)]
+        #[tabled(rename_all = "SCREAMING_SNAKE_CASE")]
+        struct ZoneRow {
+            id: OmicronZoneUuid,
+            kind: &'static str,
+            image_source: String,
+        }
+
+        let rows = zones.iter().map(|z| ZoneRow {
+            id: z.id,
+            kind: z.zone_type.kind().report_str(),
+            image_source: match &z.image_source {
+                OmicronZoneImageSource::InstallDataset => {
+                    "install-dataset".to_string()
+                }
+                OmicronZoneImageSource::Artifact { hash } => {
+                    format!("artifact: {hash}")
+                }
+            },
+        });
+        let table = tabled::Table::new(rows)
+            .with(tabled::settings::Style::empty())
+            .with(tabled::settings::Padding::new(8, 1, 0, 0))
+            .to_string();
+        println!(" ZONES: {}", zones.len());
+        println!("{table}");
+    }
+}
+
 fn inv_collection_print_keeper_membership(collection: &Collection) {
     println!("\nKEEPER MEMBERSHIP");
     for k in &collection.clickhouse_keeper_cluster_membership {
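The `collect_config_reconciler_errors` helper in the omdb change above turns a map of per-item results into a list of printable error strings. The following is a minimal standalone sketch of that pattern, not the PR's code: `ReconcileResult`, `collect_errors`, and `example_results` are hypothetical stand-ins so the block compiles without the omicron crates.

```rust
use std::collections::BTreeMap;
use std::fmt::Display;

/// Hypothetical stand-in for `ConfigReconcilerInventoryResult`.
#[derive(Debug, Clone, PartialEq, Eq)]
enum ReconcileResult {
    Ok,
    Err { message: String },
}

/// Return one "id: message" string per failed entry.
/// A `BTreeMap` iterates in key order, so the output is sorted by id.
fn collect_errors<T: Ord + Display>(
    results: &BTreeMap<T, ReconcileResult>,
) -> Vec<String> {
    results
        .iter()
        .filter_map(|(id, result)| match result {
            // Successful entries contribute nothing to the error list.
            ReconcileResult::Ok => None,
            ReconcileResult::Err { message } => {
                Some(format!("{id}: {message}"))
            }
        })
        .collect()
}

/// Hypothetical sample data: three items, one of which failed.
fn example_results() -> BTreeMap<u32, ReconcileResult> {
    BTreeMap::from([
        (1, ReconcileResult::Ok),
        (2, ReconcileResult::Err { message: "zpool not found".to_string() }),
        (3, ReconcileResult::Ok),
    ])
}
```

Because the helper is generic over any `Ord + Display` key, the same function can serve disks, datasets, and zones even though each map is keyed by a different ID type.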
1 change: 1 addition & 0 deletions id-map/src/lib.rs
@@ -289,6 +289,7 @@ impl<T: IdMappable + Debug + Eq> Diffable for IdMap<T> {

 /// Wrapper around a `&'a mut T` that panics when dropped if the borrowed
 /// value's `id()` has changed since the wrapper was created.
+#[derive(Debug)]
 pub struct RefMut<'a, T: IdMappable> {
     original_id: T::Id,
     // Always `Some(_)` until the `RefMut` is consumed by `into_ref()`.
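The doc comment on `RefMut` describes a guard that panics on drop if the borrowed value's ID changed. A simplified sketch of that idea follows; it is not the `id-map` implementation. `Item`, `IdGuard`, `rename`, and `corrupt_id` are hypothetical names, the ID is a plain `u32` field rather than an `IdMappable::id()` call, and the `std::thread::panicking()` check is an added assumption to avoid a double panic during unwinding.

```rust
use std::ops::{Deref, DerefMut};

/// Hypothetical keyed value; `id` plays the role of `IdMappable::id()`.
#[derive(Debug)]
struct Item {
    id: u32,
    label: String,
}

/// Guard around `&mut Item` that remembers the ID seen at creation.
#[derive(Debug)]
struct IdGuard<'a> {
    original_id: u32,
    borrowed: &'a mut Item,
}

impl<'a> IdGuard<'a> {
    fn new(borrowed: &'a mut Item) -> Self {
        IdGuard { original_id: borrowed.id, borrowed }
    }
}

impl Deref for IdGuard<'_> {
    type Target = Item;
    fn deref(&self) -> &Item {
        &*self.borrowed
    }
}

impl DerefMut for IdGuard<'_> {
    fn deref_mut(&mut self) -> &mut Item {
        &mut *self.borrowed
    }
}

impl Drop for IdGuard<'_> {
    fn drop(&mut self) {
        // Skip the check if the thread is already unwinding.
        if !std::thread::panicking() {
            assert_eq!(
                self.borrowed.id, self.original_id,
                "id changed while mutably borrowed"
            );
        }
    }
}

/// Allowed: mutates a non-ID field, so the guard drops quietly.
fn rename(item: &mut Item, label: &str) {
    let mut guard = IdGuard::new(item);
    guard.label = label.to_string();
}

/// Disallowed: mutates the ID, so dropping the guard panics.
fn corrupt_id(item: &mut Item) {
    let mut guard = IdGuard::new(item);
    guard.id += 1;
}
```

The point of such a guard is that a map keyed by ID can hand out mutable access to its values without letting callers silently invalidate the key.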
18 changes: 15 additions & 3 deletions live-tests/tests/test_nexus_add_remove.rs
@@ -186,9 +186,21 @@ async fn test_nexus_add_remove(lc: &LiveTestContext) {
             let agent = latest_collection.sled_agents.get(&sled_id).expect(
                 "collection information for the sled we added a Nexus to",
             );
-            if agent.omicron_zones.zones.iter().any(|z| z.id == new_zone.id) {
-                debug!(log, "zone still present in inventory");
-                return Err(CondCheckError::<()>::NotYet);
+            if let Some(config) = &agent.ledgered_sled_config {
+                if config.zones.iter().any(|z| z.id == new_zone.id) {
+                    debug!(log, "zone still present in ledger");
+                    return Err(CondCheckError::<()>::NotYet);
+                }
+            }
+            if let Some(config) = agent
+                .last_reconciliation
+                .as_ref()
+                .map(|lr| &lr.last_reconciled_config)
+            {
+                if config.zones.iter().any(|z| z.id == new_zone.id) {
+                    debug!(log, "zone still present in inventory");
+                    return Err(CondCheckError::<()>::NotYet);
+                }
             }
             return Ok(latest_collection);
         },
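The live-test change above treats a zone as gone only when it is absent from both the ledgered sled config and the last reconciled config. As a rough sketch of that two-part check, assuming heavily simplified stand-in types (`Zone`, `SledConfig`, `Reconciliation`, `SledAgent`, `config_with`, and `zone_still_present` are all hypothetical, and IDs are plain `u32`s):

```rust
/// Hypothetical, simplified stand-ins for the inventory types.
struct Zone {
    id: u32,
}

struct SledConfig {
    zones: Vec<Zone>,
}

struct Reconciliation {
    last_reconciled_config: SledConfig,
}

struct SledAgent {
    ledgered_sled_config: Option<SledConfig>,
    last_reconciliation: Option<Reconciliation>,
}

/// Build a config containing zones with the given IDs.
fn config_with(ids: &[u32]) -> SledConfig {
    SledConfig { zones: ids.iter().map(|&id| Zone { id }).collect() }
}

/// True if the zone appears in either the ledgered config or the
/// config that was most recently reconciled. A missing config counts
/// as "not present".
fn zone_still_present(agent: &SledAgent, zone_id: u32) -> bool {
    let in_ledger = agent
        .ledgered_sled_config
        .as_ref()
        .is_some_and(|c| c.zones.iter().any(|z| z.id == zone_id));
    let in_reconciled = agent
        .last_reconciliation
        .as_ref()
        .map(|lr| &lr.last_reconciled_config)
        .is_some_and(|c| c.zones.iter().any(|z| z.id == zone_id));
    in_ledger || in_reconciled
}
```

Checking both configs matters because the ledger can already reflect the removal while the reconciler is still working from an older config that includes the zone.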
17 changes: 15 additions & 2 deletions nexus-sled-agent-shared/src/inventory.rs
@@ -111,11 +111,12 @@ pub struct Inventory {
     pub usable_hardware_threads: u32,
     pub usable_physical_ram: ByteCount,
     pub reservoir_size: ByteCount,
-    pub omicron_zones: OmicronZonesConfig,
     pub disks: Vec<InventoryDisk>,
     pub zpools: Vec<InventoryZpool>,
     pub datasets: Vec<InventoryDataset>,
-    pub omicron_physical_disks_generation: Generation,
+    pub ledgered_sled_config: Option<OmicronSledConfig>,
+    pub reconciler_status: ConfigReconcilerInventoryStatus,
+    pub last_reconciliation: Option<ConfigReconcilerInventory>,
 }

 /// Describes the last attempt made by the sled-agent-config-reconciler to
@@ -196,6 +197,18 @@ pub struct OmicronSledConfig {
     pub remove_mupdate_override: Option<MupdateOverrideUuid>,
 }

+impl Default for OmicronSledConfig {
Contributor: Thoughts on deriving this?

Contributor Author: Hmm, that would require `Generation` to impl `Default`, which it currently does not. I can't think of a reason why it shouldn't, but will ask around.

Contributor Author: This generated a lot of discussion. 😅 `Generation` not implementing `Default` is intentional, so I'll leave this as-is.

+    fn default() -> Self {
+        Self {
+            generation: Generation::new(),
+            disks: IdMap::default(),
+            datasets: IdMap::default(),
+            zones: IdMap::default(),
+            remove_mupdate_override: None,
+        }
+    }
+}
+
 impl Ledgerable for OmicronSledConfig {
     fn is_newer_than(&self, other: &Self) -> bool {
         self.generation > other.generation
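The review thread above explains why `Default` is written by hand: `Generation` deliberately does not implement `Default`, so `#[derive(Default)]` on the config would not compile. A minimal sketch of that situation, with hypothetical stand-in types (this `Generation`, its starting value of 1, its `next` method, and the trimmed-down `SledConfig` are assumptions for illustration, not the omicron definitions):

```rust
/// Hypothetical stand-in: an ordered counter with an explicit
/// constructor and intentionally no `Default` impl.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Generation(u64);

impl Generation {
    /// Assumed starting value for this sketch.
    fn new() -> Self {
        Generation(1)
    }

    fn next(self) -> Self {
        Generation(self.0 + 1)
    }
}

/// Hypothetical, trimmed-down config.
#[derive(Debug, Clone, PartialEq, Eq)]
struct SledConfig {
    generation: Generation,
    zones: Vec<String>,
    remove_override: Option<u32>,
}

// `#[derive(Default)]` would require `Generation: Default`, so the
// impl is spelled out and calls `Generation::new()` explicitly.
impl Default for SledConfig {
    fn default() -> Self {
        Self {
            generation: Generation::new(),
            zones: Vec::new(),
            remove_override: None,
        }
    }
}

/// Hypothetical stand-in for the `Ledgerable` trait.
trait Ledgerable {
    fn is_newer_than(&self, other: &Self) -> bool;
}

impl Ledgerable for SledConfig {
    fn is_newer_than(&self, other: &Self) -> bool {
        // Strictly greater: an equal generation is not "newer".
        self.generation > other.generation
    }
}
```

Keeping the constructor explicit means every place that creates an initial generation says so by name, which is one plausible reason to withhold `Default` from a type like this.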