[gateway] ingest sensor measurements from SPs into oximeter (#6354)
This branch adds code to the Management Gateway Service for periodically
polling sensor measurements from SPs and emitting them to Oximeter. In
particular, this consists of:
- a task for managing the metrics endpoint, waiting until MGS knows its
underlay network address to bind the endpoint and register it with the
control plane,
- tasks for polling sensor measurements from each individual SP that MGS
knows about,
- a task that waits for SP discovery to complete and the rack ID to
be known, and then spawns a poller task for every discovered SP slot.
The SP poller tasks send samples to the Oximeter producer endpoint using
a `tokio::sync::broadcast` channel, which I've chosen primarily because
it can be used as a bounded ring buffer that actually overwrites the
*oldest* value when the buffer is full. This way, we use a bounded
amount of memory for samples, but prioritize the most recent samples if
we have to throw anything away because Oximeter hasn't come along to
collect them recently.
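
The drop-oldest behavior described above can be illustrated with a small,
std-only sketch. This is not the `tokio::sync::broadcast` implementation and
not the code in this branch; `DropOldest` and its fields are hypothetical
names used only to show what "overwrite the oldest value when full" means.

```rust
use std::collections::VecDeque;

/// Hypothetical bounded buffer that evicts the oldest entry when full.
struct DropOldest<T> {
    capacity: usize,
    items: VecDeque<T>,
    /// Number of entries evicted before anyone collected them.
    dropped: u64,
}

impl<T> DropOldest<T> {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be nonzero");
        Self {
            capacity,
            items: VecDeque::with_capacity(capacity),
            dropped: 0,
        }
    }

    /// Push a sample, evicting the oldest one if the buffer is already full.
    fn push(&mut self, item: T) {
        if self.items.len() == self.capacity {
            self.items.pop_front();
            self.dropped += 1;
        }
        self.items.push_back(item);
    }

    /// Take everything currently buffered, oldest first.
    fn drain(&mut self) -> Vec<T> {
        self.items.drain(..).collect()
    }
}

/// Push five samples into a three-slot buffer and collect what is left.
fn example() -> (u64, Vec<u32>) {
    let mut buf = DropOldest::new(3);
    for sample in 1..=5 {
        buf.push(sample);
    }
    let kept = buf.drain();
    (buf.dropped, kept)
}
```

Memory use is capped by `capacity`, and a slow collector loses the oldest
samples rather than the newest. With the real broadcast channel, a receiver
that falls behind is told how many messages it missed instead of reading a
counter like `dropped`.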
The poller tasks cache the component inventory and identifying
information from the SP, so that we don't have to re-read all this data
from the SP on every poll. While MGS, running on a host, would probably
be fine with doing this, it seems better to avoid making the SP do
unnecessary work at a 1Hz poll frequency, especially when *both* switch
zones are polling them. Instead, every time we poll sensor data from an
SP, we first ask it for its current state, and only invalidate our
cached understanding of the SP when the state changes. This way, if an SP
starts reporting new metrics due to a firmware update, or gets replaced
with a different chassis with a new serial number, revision, etc, we
won't continue to report metrics for stale targets, but we don't have to
reload all of that once per second. To detect scenarios where the SP's
state and/or identity has changed in the midst of polling its sensors
(which may result in mislabeled metrics), we check whether the SP's
state at the end of the poll matches its state at the beginning, and if
it does not, we poll again immediately with its new identity.
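
A minimal, synchronous sketch of this cache-and-recheck loop follows. The
`Sp` trait, `SpState`, `Poller`, and `FakeSp` are all hypothetical stand-ins
invented for illustration; the real poller is an async task and talks to the
SP through MGS's own interfaces.

```rust
/// Hypothetical stand-in for the SP state compared between polls.
#[derive(Clone, Debug, PartialEq)]
struct SpState {
    serial: String,
    revision: u32,
}

/// Hypothetical interface to an SP; not the real MGS API.
trait Sp {
    fn state(&mut self) -> SpState;
    fn inventory(&mut self) -> Vec<String>;
    fn read_sensors(&mut self, components: &[String]) -> Vec<(String, f32)>;
}

struct Poller {
    /// Last observed state and the inventory that was read under it.
    cached: Option<(SpState, Vec<String>)>,
}

impl Poller {
    fn new() -> Self {
        Self { cached: None }
    }

    /// Poll once, returning the state the readings were taken under.
    fn poll<S: Sp>(&mut self, sp: &mut S) -> (SpState, Vec<(String, f32)>) {
        loop {
            let before = sp.state();
            // Reload the inventory only when the state differs from the cache.
            let stale = self.cached.as_ref().map_or(true, |(s, _)| *s != before);
            if stale {
                self.cached = Some((before.clone(), sp.inventory()));
            }
            let (_, components) = self.cached.as_ref().expect("cache was just filled");
            let readings = sp.read_sensors(components);
            // If the state changed mid-poll, the readings may be mislabeled:
            // discard them and poll again immediately.
            if sp.state() == before {
                return (before, readings);
            }
        }
    }
}

/// Fake SP that replays a fixed sequence of states (the last one repeats).
struct FakeSp {
    states: Vec<SpState>,
    calls: usize,
    inventory_reads: usize,
}

impl Sp for FakeSp {
    fn state(&mut self) -> SpState {
        let i = self.calls.min(self.states.len() - 1);
        self.calls += 1;
        self.states[i].clone()
    }
    fn inventory(&mut self) -> Vec<String> {
        self.inventory_reads += 1;
        vec!["dev-7".to_string()]
    }
    fn read_sensors(&mut self, components: &[String]) -> Vec<(String, f32)> {
        components.iter().map(|c| (c.clone(), 42.0)).collect()
    }
}
```

In the steady state each poll costs two cheap state queries plus the sensor
reads, and the inventory is re-read only when the state actually changes.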
At present, the timestamps for these metric samples are generated by MGS:
each one is the time when MGS received the sensor data from the SP, as MGS
understands it. Because we don't currently collect data that was
recorded prior to the switch zone coming up, we don't need to worry
about figuring out timestamps for data recorded by the SP prior to the
existence of a wall clock. Figuring out the SP/MGS timebase
synchronization is probably a lot of additional work, although it would
be nice to do in the future. At present, [metrics emitted by sled-agent
prior to NTP sync will also be from 1987][1], so I think it's fine to do
something similar here, especially because the potential solutions to
that [also have their fair share of tradeoffs][2].
The new metrics use a schema in
`oximeter/oximeter/schema/hardware-component.toml`. The target of these
metrics is a `hardware_component` that includes:
- the rack ID and the identity of the MGS instance that collected the
metric,
- information identifying the chassis[^1] and the SP that recorded
them (its serial number, model number, revision, and whether it's a
switch, a sled, or a power shelf),
- the SP's Hubris archive version (since the reported sensor data may
change in future firmware releases),
- the SP's ID for the hardware component (e.g. "dev-7"), the kind of
device (e.g. "tmp117", "max5970"), and the human-readable description
(e.g. "Southeast temperature sensor", "U.2 Sharkfin A hot swap
controller", etc.) reported by the SP
Each kind of sensor reading has an individual metric
(`hardware_component:temperature`, `hardware_component:current`,
`hardware_component:voltage`, and so on). These metrics are labeled with
the SP-reported name of the individual sensor measurement channel. For
instance, a MAX5970 hotswap controller on sharkfin will have a voltage
and current metric named "V12_U2A_A0" for the 12V rail, and a voltage
and current metric named "V3P3_U2A_A0" for the 3.3V rail. Finally, a
`hardware_component:sensor_errors` metric records sensor errors reported
by the SP, labeled with the sensor name, what kind of sensor it is, and
a string representation of the error.
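
As a rough sketch of how one sensor reading maps onto these metrics, the
snippet below picks a metric name and a set of labels for a reading. The
metric names come from the text above; the types, the function, and the label
keys (`sensor`, `sensor_kind`, `error`) are assumptions made for illustration
and may not match the actual schema fields.

```rust
/// Reading kinds named in the text above; the real set is larger.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ReadingKind {
    Temperature,
    Current,
    Voltage,
}

impl ReadingKind {
    fn as_str(self) -> &'static str {
        match self {
            ReadingKind::Temperature => "temperature",
            ReadingKind::Current => "current",
            ReadingKind::Voltage => "voltage",
        }
    }

    fn metric_name(self) -> &'static str {
        match self {
            ReadingKind::Temperature => "hardware_component:temperature",
            ReadingKind::Current => "hardware_component:current",
            ReadingKind::Voltage => "hardware_component:voltage",
        }
    }
}

/// Hypothetical representation of one reading reported by an SP.
struct Reading {
    /// SP-reported name of the measurement channel, e.g. "V12_U2A_A0".
    sensor: String,
    kind: ReadingKind,
    /// The measured value, or a string describing the sensor error.
    result: Result<f32, String>,
}

/// Choose the metric name and labels (hypothetical keys) for a reading.
fn describe(reading: &Reading) -> (&'static str, Vec<(&'static str, String)>) {
    match &reading.result {
        // Successful readings go to the per-kind metric, labeled by channel.
        Ok(_) => (
            reading.kind.metric_name(),
            vec![("sensor", reading.sensor.clone())],
        ),
        // Errors all go to one metric, labeled by channel, kind, and error.
        Err(error) => (
            "hardware_component:sensor_errors",
            vec![
                ("sensor", reading.sensor.clone()),
                ("sensor_kind", reading.kind.as_str().to_string()),
                ("error", error.clone()),
            ],
        ),
    }
}
```

Under this sketch, the 12V rail of a MAX5970 would produce one
`hardware_component:voltage` sample and one `hardware_component:current`
sample, both labeled with the channel name "V12_U2A_A0".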
[1]:
#6354 (comment)
[2]:
#6354 (comment)
[^1]: I'm using "chassis" as a generic term to refer to "switch, sled,
or power shelf".