Does SDPi need to define a SDC suitable slewing period ? #295

PaulMartinsen · 2024-06-06T12:09:47Z

Some sources indicate the slewing period, the amount of time a device takes to bring its internal clock in line with a time-stamp supplied by an NTP source, could be 300 s or 600 s. Are system defaults for NTP synchronization acceptable for SDC?

PaulMartinsen · 2024-06-17T06:17:35Z

Slewing adjustments

Slewing occurs when a device's clock is not too dissimilar from times provided by reference (NTP) sources. Conceptually, this involves speeding up or slowing down the clock so that the device clock eventually matches the reference clock. In practice, a closed loop control system is used to minimize the error between the device and reference clocks.

Background

An SDC provider declares the methods it may use to determine time in ClockDescriptor\TimeProtocol and the current method in ClockState\ActiveSyncProtocol. Both are coded values, typically from 11073-10101, which defines many time synchronization profiles.

11073-20701, R0007 requires support for NTPv3 (or compatible versions) and notes SNTP RFC1769 and SNTP v4 RFC4330 are not recommended. Presumably RFC2030 (also SNTPv4?) is off the table too.

RR1162 from 10700-D7-2022 requires consumers to consider the risks from erroneous timestamps, suggesting regular comparison of clock information from a provider with its own clock.

NTPv4

Looking into NTPv4, the latest version of the options recommended by 20701.

Slewing

Slewing clock adjustments

- maximum slew rate limited to 500 ppm [RFC5905:§A.5.5.6]
  - this appears to be an implementation detail, not an RFC5905 requirement
  - RFC5905 uses "frequency tolerance" for two different things. The other one has a value of 15 ppm, though the distinction between the two is unclear. 500 ppm is mentioned in a few other places though.
- under normal conditions,
  - "clock discipline" gradually slews the clock to the correct time based on "offset spikes". Offset spikes appear to be the difference between the local clock and reference clock with consideration given to faults
  - offset spikes greater than ±128 ms are generally discarded unless
    - the offset consistently exceeds the step threshold (STEPT = 128 ms) for longer than the stepout interval (WATCH = 900 s = 15 min), which triggers a non-slewing adjustment RFC5905§Fig 27;
    - it's not clear if these values are required by the standard, recommended or just an example. "Everyone" probably uses them though?
  - so the maximum clock error we'd slew
    - is ±128 ms at a slew rate of 500 ppm,
    - which will take 0.128/500e-6 = 256 seconds (4 minutes, 16 seconds) to bring the local clock in line with the reference clock (this may be where the 300 s and 600 s linked in the original issue come from).

This suggests that, during normal operation, the difference between the device clocks of any two SDC participants synchronizing their clocks with NTPv4 will be less than 8 minutes and 32 seconds.
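The slew-time arithmetic above can be checked with a short sketch. It assumes the clock is slewed at a constant 500 ppm, which is a simplification of the closed-loop discipline and not how the reference implementation actually behaves. The function names are mine, not from RFC5905.

```python
# Values quoted in the discussion above; whether RFC 5905 mandates them is unclear.
MAX_SLEW_RATE = 500e-6   # seconds of correction per second of elapsed time (500 ppm)
STEP_THRESHOLD = 0.128   # seconds; offsets beyond this are not slewed away


def slew_duration(offset_s: float, rate: float = MAX_SLEW_RATE) -> float:
    """Seconds needed to remove offset_s when slewing at a constant rate."""
    return abs(offset_s) / rate


def is_slewable(offset_s: float, threshold: float = STEP_THRESHOLD) -> bool:
    """True if the offset is small enough to be corrected by slewing alone."""
    return abs(offset_s) <= threshold


worst_case = slew_duration(STEP_THRESHOLD)
minutes, seconds = divmod(round(worst_case), 60)
print(f"{worst_case:.1f} s = {minutes} min {seconds} s")
```

For the largest slewable offset this gives 256 s, i.e. 4 minutes and 16 seconds.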

Time error measurement

The error between a client and an NTP reference source is estimated by exchanging time-stamped (T₁ … T₄) NTP messages:

sequenceDiagram
  participant C as Client
  participant S as NTP server
  C->>S: T1 --- T2
  S->>C: T3 --- T4

$$Offset = \frac{(T_2 - T_1) + (T_3 - T_4)}{2}$$
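As a quick sanity check of the formula, here is a minimal sketch with made-up timestamps. It assumes the usual convention that T1 and T4 are read from the client clock (send, receive) and T2 and T3 from the server clock (receive, send). The round-trip delay expression is taken from RFC5905 rather than from the text above.

```python
def ntp_offset_and_delay(t1: float, t2: float, t3: float, t4: float):
    """Estimate the client clock offset and round-trip delay, in seconds.

    t1: client transmit time, t2: server receive time,
    t3: server transmit time, t4: client receive time.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


# Hypothetical exchange: client clock 50 ms behind the server,
# 10 ms network latency each way, 2 ms server processing time.
offset, delay = ntp_offset_and_delay(100.000, 100.060, 100.062, 100.022)
print(f"offset ≈ {offset:.3f} s, delay ≈ {delay:.3f} s")
```

The offset estimate is only exact when the outbound and return latencies are equal; any asymmetry shows up as an error of up to half the round-trip delay.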

NTPv4 accuracy

11073-10207-2019 includes accuracy information for the clock state (ClockState\@Accuracy) and notes, for NTPv4, the accuracy is equivalent to (Root Dispersion) + 1⁄2 (Root Delay).
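A tiny sketch of that accuracy relation follows. The function name and the example values are hypothetical, and I'm assuming both inputs (and the result) are in seconds.

```python
def ntp_accuracy(root_dispersion_s: float, root_delay_s: float) -> float:
    """Accuracy estimate: root dispersion plus half of the root delay."""
    return root_dispersion_s + root_delay_s / 2


# Hypothetical values: 3 ms root dispersion, 10 ms root delay.
accuracy = ntp_accuracy(0.003, 0.010)
print(f"accuracy ≈ {accuracy * 1000:.1f} ms")
```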

RFC5905 notes:

The delta and epsilon statistics are accumulated at each stratum level from the reference clock to produce the root delay (DELTA) and root dispersion (EPSILON) statistics

NTP.org notes:

… synchronization accuracy in the order of a few milliseconds at update intervals of fifteen minutes … and order of one millisecond with update intervals of one minute.

Note

Likely this all means the expected accuracy during normal operation (e.g., no non-slewing adjustments) is a few milliseconds (value in ClockState\Accuracy?), but at any point in time the worst case error (to a reference clock) is ±4 minutes, 16 seconds. So, timestamps from two participants may differ by up to 8 minutes and 32 seconds. And if the error is this bad, a slewing adjustment could soon follow.

NTPv4 startup

When power is restored after several hours or days, the clock offset and oscillator frequency errors must be resolved by the clock discipline algorithm, but this can take several hours without specific provisions.

NTPv4 startup provisions ensure that, in all but pathological situations, the startup transient is suppressed to within nominal levels in no more than five minutes after a warm start or ten minutes after a cold start. Two provisions, in the reference implementation, synchronize time at startup quickly.

For clock frequency:
- Measured clock oscillator frequency is written to a frequency file every hour or more, with write frequency depending on:
  - measured frequency wander,
  - minimizing writes to NVRAM
- During a warm start, clock frequency is initialized from the frequency file to avoid a lengthy convergence time.
- For a cold start, the clock frequency is measured over a five minute interval, typically reducing residual frequency error to less than 1 ppm.
For clock offset:
- After getting the clock frequency (0 time if read from non-volatile storage, 5 minutes for a cold start), frequency discipline (i.e., adjustment) is disabled and clock offset discipline is enabled with a small time constant.
- This mode operates for five minutes, after which the clock offset error is usually no more than 1 ms.
- In addition, the client can send a volley of six requests, at 2 second intervals, before setting the clock for the first time. This usually takes about 10 seconds and gives a reliable estimate for setting the clock.
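The startup figures above add up as follows. This is only the arithmetic from the description, with names I made up. It is not a model of the reference implementation.

```python
FREQ_MEASURE_S = 5 * 60        # cold start only: frequency measured over five minutes
OFFSET_DISCIPLINE_S = 5 * 60   # offset discipline with a small time constant
VOLLEY_REQUESTS = 6            # initial volley of requests
VOLLEY_INTERVAL_S = 2          # spacing between volley requests


def startup_settle_seconds(warm_start: bool) -> int:
    """Nominal time for the startup transient to settle."""
    # A warm start reads the frequency from the frequency file, which costs no time.
    freq_time = 0 if warm_start else FREQ_MEASURE_S
    return freq_time + OFFSET_DISCIPLINE_S


def volley_seconds(requests: int = VOLLEY_REQUESTS,
                   interval_s: int = VOLLEY_INTERVAL_S) -> int:
    """Time spanned by the initial volley (gaps between consecutive requests)."""
    return (requests - 1) * interval_s


print(startup_settle_seconds(warm_start=True))   # 300
print(startup_settle_seconds(warm_start=False))  # 600
print(volley_seconds())                          # 10
```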

Implications for SDPi

The following options are proposed to resolve this issue. Please add any others you think of.

1. No special treatment needed.

11073-20701, R0007 stipulates NTPv3 or higher which, of course, comes with the limitations discussed above and 10700-D7-2022 RR1162 requires consumers to consider associated risks and suggests regular clock comparisons.

Does anyone have broadly applicable use-cases, which could (for example) be added to this issue, that warrant expanding on this in SDPi?

2. Monitor time

Time errors exceeding a few seconds after initialization may indicate a problem with timekeeping infrastructure or networking systems rather than a tardy participant.

SDPi might require the responsible organization to operate a monitoring system that periodically requests time from active participants and reports faults appropriately, as part of maintaining this infrastructure.
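To make the monitoring idea concrete, here is a rough sketch of the comparison step. How the participant times are collected is left out, and the participant names, the 2 second tolerance and the function name are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def find_clock_faults(reported, reference, tolerance=timedelta(seconds=2)):
    """Return {participant: error} for clocks that differ from reference by more than tolerance.

    reported maps a participant name to the time it reported;
    reference is the monitor's own time when the reports were taken.
    """
    faults = {}
    for name, timestamp in reported.items():
        error = timestamp - reference
        if abs(error) > tolerance:
            faults[name] = error
    return faults


# Hypothetical snapshot: one healthy participant, one five minutes behind.
reference = datetime(2024, 6, 17, 6, 0, 0, tzinfo=timezone.utc)
reported = {
    "pump-1": reference + timedelta(milliseconds=4),
    "monitor-2": reference - timedelta(seconds=300),
}
faults = find_clock_faults(reported, reference)
for name, error in faults.items():
    print(f"{name}: clock error {error.total_seconds():+.3f} s")
```

A real monitor would also need to allow for the latency of each request, which this sketch ignores.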

3. Time validation

SDPi might require validation of participant time by the responsible organization, for example during a conformity validation workflow, when a participant joins the organization's SDC network. External conformity certification might validate that participants don't perform non-slewing adjustments unnecessarily.

4. Time-stamp messages

A header field with the SDC participant's local time could be included in every SDC report and/or message. This would enable time-sensitive participants to make informed decisions about clock discrepancies with little additional overhead and without imposing more complex time-keeping solutions on all providers. WS-Security defines a flexible mechanism to include time-stamps in a SOAP header, though a custom header may be more appropriate.
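A receiving participant could use such a header roughly as sketched below. The header is modeled as a bare ISO 8601 string, which is my assumption and not the WS-Security layout or anything SDC defines today.

```python
from datetime import datetime, timezone


def clock_discrepancy(header_timestamp: str, received_at: datetime) -> float:
    """Seconds between the sender's stamped time and the receiver's local receive time.

    The result includes network latency, so it is an upper-bound style
    indication of clock disagreement rather than a precise offset.
    """
    sent_at = datetime.fromisoformat(header_timestamp)
    return (received_at - sent_at).total_seconds()


# Hypothetical message stamped by the sender and received 250 ms later
# according to the receiver's clock.
received_at = datetime(2024, 6, 17, 6, 17, 35, 500000, tzinfo=timezone.utc)
discrepancy = clock_discrepancy("2024-06-17T06:17:35.250+00:00", received_at)
print(f"apparent discrepancy: {discrepancy:.3f} s")
```

A consumer could then compare the value against its own tolerance before trusting timestamps in the message body.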

5. Require more precise time-keeping

SDPi could identify and stipulate a more accurate and/or precise time-keeping protocol. However, this may be challenging: the current methods are well thought out, and their limits may come from practical considerations that alternatives would face too.

Suggested resolution

Given large time errors seem more likely to arise from network and infrastructure problems (e.g., high network latency, time server can't be reached, wrong time-server configured), to me options 2 and/or 3 make the most sense for SDPi.

Note: time-stamp uncertainty in messages delivered by the history service, shared in plug-a-thon 17, wouldn't be addressed by any of these measures.

References

NTPv4 standard: RFC 5905
Network time project
- How NTPv4 works
- Clock discipline algorithm — responsible for adjusting time with the information it gets from NTP servers.

PaulMartinsen added the Comment NEW A submitted comment waiting to be reviewed and dispositioned label Jun 6, 2024

github-project-automation bot added this to Gemini SDPi Releases Jun 6, 2024

github-project-automation bot moved this to New issues in Gemini SDPi Releases Jun 6, 2024

ToddCooper removed the Comment NEW A submitted comment waiting to be reviewed and dispositioned label Jun 14, 2024

ToddCooper assigned PaulMartinsen Jun 14, 2024

ToddCooper added this to the SDPi 1.4 review milestone Jun 14, 2024

ToddCooper modified the milestones: SDPi 1.4 Review, SDPi 2.0 Review Sep 23, 2024
