Does SDPi need to define a SDC suitable slewing period ? #295

Open
PaulMartinsen opened this issue Jun 6, 2024 · 1 comment
@PaulMartinsen
Collaborator

Some sources indicate that the slewing period, the amount of time a device takes to bring its internal clock in line with a time-stamp supplied by an NTP source, could be 300 s or 600 s. Are system defaults for NTP synchronization acceptable for SDC?

@PaulMartinsen PaulMartinsen added the Comment NEW A submitted comment waiting to be reviewed and dispositioned label Jun 6, 2024
@ToddCooper ToddCooper removed the Comment NEW A submitted comment waiting to be reviewed and dispositioned label Jun 14, 2024
@ToddCooper ToddCooper added this to the SDPi 1.4 review milestone Jun 14, 2024
@PaulMartinsen
Collaborator Author

PaulMartinsen commented Jun 17, 2024

Slewing adjustments

Slewing occurs when a device's clock is not too dissimilar from times provided by reference (NTP) sources. Conceptually, this involves speeding up or slowing down the clock so that the device clock eventually matches the reference clock. In practice, a closed loop control system is used to minimize the error between the device and reference clocks.

Background

An SDC provider declares the methods it may use to determine time in ClockDescriptor\TimeProtocol and the current method in ClockState\ActiveSyncProtocol. Both are coded values, typically from 11073-10101, which defines many time synchronization profiles.

11073-20701, R0007 requires support for NTPv3 (or compatible versions) and notes SNTP RFC1769 and SNTP v4 RFC4330 are not recommended. Presumably RFC2030 (also SNTPv4?) is off the table too.

RR1162 from 10700-D7-2022 requires consumers to consider the risks from erroneous timestamps, suggesting regular comparison of clock information from a provider with its own clock.

NTPv4

This section looks into NTPv4, the latest version among the options recommended by 20701.

Slewing

Slewing clock adjustments

  • maximum slew rate limited to 500 ppm [RFC5905:§A.5.5.6]
    • this appears to be an implementation detail, not an RFC5905 requirement
    • RFC5905 uses "frequency tolerance" for two different things. The other one has a value of 15 ppm, though the difference is unclear. 500 ppm is mentioned in a few other places though.
  • under normal conditions,
    • "clock discipline" gradually slews the clock to the correct time based on "offset spikes". Offset spikes appear to be the difference between the local clock and reference clock with consideration given to faults
    • offset spikes greater than ±128 ms are generally discarded unless
      • the offset consistently exceeds the step threshold (STEPT = 128 ms) for longer than the stepout interval (WATCH = 900 s = 15 min), which triggers a non-slewing adjustment [RFC5905:§Fig 27];
      • it's not clear if these values are required by the standard, recommended or just an example. "Everyone" probably uses them though?
    • so the maximum clock error we'd slew
      • is ±128 ms at a slew rate of 500 ppm,
      • which will take 0.128/500e-6 = 256 seconds (4 minutes, 16 seconds) to bring the local clock in line with the reference clock (this may be where the 300 s and 600 s linked in the original issue come from).

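This suggests that, during normal operation, the device clock of an SDC participant synchronizing with NTPv4 will be within ±128 ms of the reference clock, so the device clocks of any two participants will differ by less than 256 ms. Slewing out the largest such error takes up to 4 minutes and 16 seconds.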
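The slewing arithmetic above can be checked with a short sketch. It assumes a constant slew at the maximum rate, which is a simplification: the real clock discipline is a feedback loop, so actual convergence may take longer. The function and constant names are illustrative and not taken from RFC5905.

```python
MAX_SLEW_PPM = 500             # maximum slew rate discussed above (ppm)
MAX_SLEWED_OFFSET_US = 128_000  # largest offset that is slewed, ±128 ms, in µs


def slew_duration_s(offset_us: float, slew_ppm: float = MAX_SLEW_PPM) -> float:
    """Seconds needed to remove offset_us at a constant slew rate.

    1 ppm corrects 1 µs of offset per second of elapsed time, so the
    duration in seconds is offset (µs) divided by rate (ppm).
    """
    if slew_ppm <= 0:
        raise ValueError("slew rate must be positive")
    return abs(offset_us) / slew_ppm


worst_case = slew_duration_s(MAX_SLEWED_OFFSET_US)
minutes, seconds = divmod(worst_case, 60)
print(f"{worst_case:.0f} s = {minutes:.0f} min {seconds:.0f} s")  # 256 s = 4 min 16 s
```

Working in microseconds and ppm keeps the division exact and avoids floating-point noise from 0.128/500e-6.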

Time error measurement

The error between a client and an NTP reference source is estimated by exchanging time-stamped (T1 … T4) NTP messages:

sequenceDiagram
  participant C as Client
  participant S as NTP server
  C->>S: T1 --- T2
  S->>C: T3 --- T4
$$Offset = \frac{(T_2 - T_1) + (T_3 - T_4)}{2}$$
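A minimal sketch of the offset calculation, using the same T1 … T4 labels as the diagram. The round-trip delay, (T4 - T1) - (T3 - T2), is not discussed above and is included only for context; the function name and the example values are made up for illustration.

```python
def ntp_offset_and_delay(t1: float, t2: float, t3: float, t4: float) -> tuple[float, float]:
    """Estimate clock offset and round-trip delay from one NTP exchange.

    t1: client transmit time (client clock)
    t2: server receive time  (server clock)
    t3: server transmit time (server clock)
    t4: client receive time  (client clock)
    A positive offset means the server clock is ahead of the client clock.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


# Illustrative exchange in milliseconds: the server is 50 ms ahead of the
# client, each network leg takes 10 ms and the server needs 1 ms to reply.
offset_ms, delay_ms = ntp_offset_and_delay(t1=0, t2=60, t3=61, t4=21)
print(offset_ms, delay_ms)  # 50.0 20
```

The estimate is only exact when the two network legs take the same time; asymmetric paths show up as an offset error of up to half the round-trip delay.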

NTPv4 accuracy

11073-10207-2019 includes accuracy information for the clock state (ClockState\@Accuracy) and notes, for NTPv4, the accuracy is equivalent to (Root Dispersion) + 1⁄2 (Root Delay).

RFC5905 notes:

The delta and epsilon statistics are accumulated at each stratum level from the reference clock to produce the root delay (DELTA) and root dispersion (EPSILON) statistics
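As a rough illustration of the accuracy expression quoted from 11073-10207, the sketch below combines the two root statistics. The function name and the sample values are hypothetical, and the unit (seconds) is an assumption; how the result maps onto ClockState\@Accuracy is not confirmed here.

```python
def ntp_accuracy_s(root_dispersion_s: float, root_delay_s: float) -> float:
    """Accuracy estimate: (Root Dispersion) + 1/2 (Root Delay), in seconds."""
    if root_dispersion_s < 0 or root_delay_s < 0:
        raise ValueError("root statistics must be non-negative")
    return root_dispersion_s + root_delay_s / 2


# Hypothetical values: 4 ms root dispersion and 10 ms root delay.
accuracy = ntp_accuracy_s(root_dispersion_s=0.004, root_delay_s=0.010)
print(f"{accuracy * 1000:.1f} ms")  # 9.0 ms
```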

NTP.org notes:

… synchronization accuracy in the order of a few milliseconds at update intervals of fifteen minutes … and order of one millisecond with update intervals of one minute.

Note

Likely this all means the expected accuracy during normal operation (e.g., no non-slewing adjustments) is a few milliseconds (the value in ClockState\@Accuracy?), but at any point in time the worst-case error relative to a reference clock is ±128 ms, which can take up to 4 minutes, 16 seconds to slew out. So, time-stamps from two participants may differ by up to 256 ms. And if the error is worse than this, a non-slewing adjustment could soon follow.

NTPv4 startup

When power is restored after several hours or days, the clock offset and oscillator frequency errors must be resolved by the clock discipline algorithm, but this can take several hours without specific provisions.

NTPv4 startup provisions ensure that, in all but pathological situations, the startup transient is suppressed to within nominal levels in no more than five minutes after a warm start or ten minutes after a cold start. Two provisions, in the reference implementation, synchronize time at startup quickly.

  • For clock frequency:
    • Measured clock oscillator frequency is written to a frequency file every hour or more, with write frequency depending on:
      • measured frequency wander,
      • minimizing writes to NVRAM
    • During a warm start, clock frequency is initialized from the frequency file to avoid a lengthy convergence time.
    • For a cold start, the clock frequency is measured over a five-minute interval, typically reducing residual frequency error to less than 1 ppm.
  • For clock offset:
    • After getting the clock frequency (0 time if read from non-volatile storage, 5 minutes for a cold start), frequency discipline (i.e., adjustment) is disabled and clock offset discipline is enabled with a small time constant.
    • This mode operates for five minutes, after which the clock offset error is usually no more than 1 ms.
    • In addition, the client can send a volley of six requests, at 2 second intervals, before setting the clock for the first time. It usually takes about 10 seconds to get an estimate reliable enough to set the clock.
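The startup provisions above add up to the five and ten minute figures, which also match the 300 s and 600 s mentioned at the top of this issue. A small sketch of that sum, with names chosen here for illustration rather than taken from the reference implementation:

```python
from enum import Enum


class Start(Enum):
    WARM = "warm"  # frequency file available in non-volatile storage
    COLD = "cold"  # no frequency file, frequency must be measured first


FREQUENCY_MEASUREMENT_S = 5 * 60  # cold start only: measure oscillator frequency
OFFSET_DISCIPLINE_S = 5 * 60      # both cases: offset discipline, small time constant


def startup_settling_s(start: Start) -> int:
    """Nominal time for the startup transient to settle, in seconds."""
    frequency_phase = 0 if start is Start.WARM else FREQUENCY_MEASUREMENT_S
    return frequency_phase + OFFSET_DISCIPLINE_S


for start in Start:
    print(start.value, startup_settling_s(start))  # warm 300, then cold 600
```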

Implications for SDPi

The following options are proposed to resolve this issue, please add others you think of to this issue:

1. No special treatment needed.

11073-20701, R0007 stipulates NTPv3 or higher, which comes with the limitations discussed above. 10700-D7-2022 RR1162 already requires consumers to consider the associated risks and suggests regular clock comparisons.

Does anyone have broadly applicable use-cases, which could (for example) be added to this issue, that warrant expanding on this in SDPi?

2. Monitor time

Time errors exceeding a few seconds after initialization may indicate a problem with timekeeping infrastructure or networking systems rather than a tardy participant.

SDPi might require the responsible organization to operate a monitoring system that periodically requests time from active participants and reports faults appropriately, as part of maintaining this infrastructure.
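A minimal sketch of the comparison such a monitoring system might perform. Everything here is an assumption for illustration: the function name, the 3 second tolerance standing in for "a few seconds", and the idea that participant times have already been collected into a dictionary. How the times are requested from participants, and the network latency of doing so, are out of scope.

```python
from datetime import datetime, timedelta, timezone


def find_clock_faults(
    reported: dict[str, datetime],
    reference: datetime,
    tolerance: timedelta = timedelta(seconds=3),
) -> dict[str, timedelta]:
    """Return the clock error of each participant outside the tolerance.

    reported maps a participant identifier to the time it reported;
    reference is the monitor's own (trusted) time for the same instant.
    """
    faults = {}
    for participant, reported_time in reported.items():
        error = reported_time - reference
        if abs(error) > tolerance:
            faults[participant] = error
    return faults


# Hypothetical participants: one within tolerance, one 12 s behind.
now = datetime(2024, 6, 17, 12, 0, 0, tzinfo=timezone.utc)
readings = {
    "pump": now + timedelta(milliseconds=40),
    "monitor": now - timedelta(seconds=12),
}
print(find_clock_faults(readings, now))  # only "monitor" is reported
```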

3. Time validation

SDPi might require validation of participant time by the responsible organization, for example during a conformity validation workflow, when a participant joins the organization's SDC network. External conformity certification might validate that participants don't perform non-slewing adjustments unnecessarily.

4. Time-stamp messages

A header field with the SDC participant's local time could be included in every SDC report and/or message. This would enable time-sensitive participants to make informed decisions about clock discrepancies with little additional overhead and without imposing more complex time-keeping solutions on all providers. WS-Security defines a flexible mechanism to include time-stamps in a SOAP header, though a custom header may be more appropriate.
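To make the WS-Security variant concrete, here is a sketch that builds a SOAP header carrying a wsu:Timestamp with a Created element. It assumes SOAP 1.2 and the OASIS WS-Security 1.0 namespaces; the function name is hypothetical, and nothing here is an SDC or SDPi requirement.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"
WSSE_NS = "http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd"
WSU_NS = "http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd"


def build_timestamp_header(created: datetime) -> ET.Element:
    """Build a SOAP Header with a WS-Security Timestamp/Created element.

    created must be timezone-aware; it is written as UTC with millisecond
    resolution.
    """
    if created.tzinfo is None:
        raise ValueError("created must be timezone-aware")
    header = ET.Element(f"{{{SOAP_NS}}}Header")
    security = ET.SubElement(header, f"{{{WSSE_NS}}}Security")
    timestamp = ET.SubElement(security, f"{{{WSU_NS}}}Timestamp")
    created_el = ET.SubElement(timestamp, f"{{{WSU_NS}}}Created")
    utc = created.astimezone(timezone.utc)
    # %f yields microseconds; drop the last three digits to keep milliseconds.
    created_el.text = utc.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"
    return header


header = build_timestamp_header(datetime.now(timezone.utc))
print(ET.tostring(header, encoding="unicode"))
```

A receiving participant could compare the Created value with its own clock on arrival, keeping in mind that the difference also includes message transit time.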

5. Require more precise time-keeping

SDPi could identify and stipulate a more accurate and/or precise time-keeping protocol. However, this may be challenging: current methods are well thought out, and their limits perhaps come from practical considerations that alternatives would also face.

Suggested resolution

Given that large time errors seem more likely to arise from network and infrastructure problems (e.g., high network latency, time server can't be reached, wrong time-server configured), options 2 and/or 3 make the most sense to me for SDPi.

Note: time-stamp uncertainty in messages delivered by the history service, shared in plug-a-thon 17, wouldn't be addressed by any of these measures.
