Added examples of relevant output. #1737

Merged
merged 1 commit into from
Nov 28, 2024
95 changes: 95 additions & 0 deletions docs/guide/exporting-metrics.md
@@ -11,6 +11,101 @@ sudo systemctl enable --now ntpd-rs-metrics

After enabling the metrics exporter, a Prometheus metrics dataset will be served on `127.0.0.1:9975/metrics`.

The dataset will look something like:
```
# HELP ntp_uptime_seconds Time that the ntp daemon is running.
# TYPE ntp_uptime_seconds gauge
# UNIT ntp_uptime_seconds seconds
ntp_uptime_seconds{version="1.3.0",build_commit="e8869f4378971ca470131e54fea6e72655a774c3",build_commit_date="2024-09-19"} 1320106.480437661
# HELP ntp_system_poll_interval_seconds [DEPRECATED] Time between polls of the system.
# TYPE ntp_system_poll_interval_seconds gauge
# UNIT ntp_system_poll_interval_seconds seconds
ntp_system_poll_interval_seconds 256.00000005960464
# HELP ntp_system_accumulated_steps_seconds Accumulated amount of seconds that the system needed to jump the time.
# TYPE ntp_system_accumulated_steps_seconds gauge
# UNIT ntp_system_accumulated_steps_seconds seconds
ntp_system_accumulated_steps_seconds 0
# HELP ntp_system_accumulated_steps_threshold_seconds Threshold for the accumulated step amount at which the NTP daemon will exit (or -1 if no threshold was set).
# TYPE ntp_system_accumulated_steps_threshold_seconds gauge
# UNIT ntp_system_accumulated_steps_threshold_seconds seconds
ntp_system_accumulated_steps_threshold_seconds -1
# HELP ntp_system_leap_indicator Indicates that a leap second will take place.
# TYPE ntp_system_leap_indicator gauge
ntp_system_leap_indicator 0
# HELP ntp_system_root_delay_seconds Distance to the closest root time source.
# TYPE ntp_system_root_delay_seconds gauge
# UNIT ntp_system_root_delay_seconds seconds
ntp_system_root_delay_seconds 0.006932416233916864
# HELP ntp_system_root_dispersion_seconds Estimate of how precise our time is.
# TYPE ntp_system_root_dispersion_seconds gauge
# UNIT ntp_system_root_dispersion_seconds seconds
ntp_system_root_dispersion_seconds 0.000041443621749394485
# HELP ntp_system_stratum Stratum of our clock.
# TYPE ntp_system_stratum gauge
ntp_system_stratum 2
# HELP ntp_source_poll_interval_seconds Time between polls of the source.
# TYPE ntp_source_poll_interval_seconds gauge
# UNIT ntp_source_poll_interval_seconds seconds
ntp_source_poll_interval_seconds{name="ntp.vsl.nl:123",address="31.223.173.226:123",id="1"} 256.00000005960464
# HELP ntp_source_unanswered_polls Number of polls since the last successful poll with a maximum of eight.
# TYPE ntp_source_unanswered_polls gauge
ntp_source_unanswered_polls{name="ntp.vsl.nl:123",address="31.223.173.226:123",id="1"} 0
# HELP ntp_source_offset_seconds Offset between the upstream source and system time.
# TYPE ntp_source_offset_seconds gauge
# UNIT ntp_source_offset_seconds seconds
ntp_source_offset_seconds{name="ntp.vsl.nl:123",address="31.223.173.226:123",id="1"} 0.000004342757166443103
# HELP ntp_source_delay_seconds Current round-trip delay to the upstream source.
# TYPE ntp_source_delay_seconds gauge
# UNIT ntp_source_delay_seconds seconds
ntp_source_delay_seconds{name="ntp.vsl.nl:123",address="31.223.173.226:123",id="1"} 0.006932416233916864
# HELP ntp_source_uncertainty_seconds Estimated error of the source clock.
# TYPE ntp_source_uncertainty_seconds gauge
# UNIT ntp_source_uncertainty_seconds seconds
ntp_source_uncertainty_seconds{name="ntp.vsl.nl:123",address="31.223.173.226:123",id="1"} 0.0000629844144133349
# HELP ntp_source_root_delay_seconds Root delay reported by the time source.
# TYPE ntp_source_root_delay_seconds gauge
# UNIT ntp_source_root_delay_seconds seconds
ntp_source_root_delay_seconds{name="ntp.vsl.nl:123",address="31.223.173.226:123",id="1"} 0
# HELP ntp_source_root_dispersion_seconds Uncertainty reported by the time source.
# TYPE ntp_source_root_dispersion_seconds gauge
# UNIT ntp_source_root_dispersion_seconds seconds
ntp_source_root_dispersion_seconds{name="ntp.vsl.nl:123",address="31.223.173.226:123",id="1"} 0.000015258789066052714
# HELP ntp_server_received_packets_total Number of incoming packets.
# TYPE ntp_server_received_packets_total counter
ntp_server_received_packets_total{listen_address="0.0.0.0:123"} 94633291
# HELP ntp_server_accepted_packets_total Number of packets accepted.
# TYPE ntp_server_accepted_packets_total counter
ntp_server_accepted_packets_total{listen_address="0.0.0.0:123"} 93203603
# HELP ntp_server_denied_packets_total Number of denied packets.
# TYPE ntp_server_denied_packets_total counter
ntp_server_denied_packets_total{listen_address="0.0.0.0:123"} 0
# HELP ntp_server_ignored_packets_total Number of packets ignored.
# TYPE ntp_server_ignored_packets_total counter
ntp_server_ignored_packets_total{listen_address="0.0.0.0:123"} 1429688
# HELP ntp_server_rate_limited_packets_total Number of rate limited packets.
# TYPE ntp_server_rate_limited_packets_total counter
ntp_server_rate_limited_packets_total{listen_address="0.0.0.0:123"} 0
# HELP ntp_server_response_send_errors_total Number of packets where there was an error responding.
# TYPE ntp_server_response_send_errors_total counter
ntp_server_response_send_errors_total{listen_address="0.0.0.0:123"} 2
# HELP ntp_server_nts_received_packets_total Number of incoming NTS packets.
# TYPE ntp_server_nts_received_packets_total counter
ntp_server_nts_received_packets_total{listen_address="0.0.0.0:123"} 0
# HELP ntp_server_nts_accepted_packets_total Number of NTS packets accepted.
# TYPE ntp_server_nts_accepted_packets_total counter
ntp_server_nts_accepted_packets_total{listen_address="0.0.0.0:123"} 0
# HELP ntp_server_nts_denied_packets_total Number of denied NTS packets.
# TYPE ntp_server_nts_denied_packets_total counter
ntp_server_nts_denied_packets_total{listen_address="0.0.0.0:123"} 0
# HELP ntp_server_nts_rate_limited_packets_total Number of rate limited NTS packets.
# TYPE ntp_server_nts_rate_limited_packets_total counter
ntp_server_nts_rate_limited_packets_total{listen_address="0.0.0.0:123"} 0
# HELP ntp_server_nts_nak_packets_total Number of NTS nak responses to packets.
# TYPE ntp_server_nts_nak_packets_total counter
ntp_server_nts_nak_packets_total{listen_address="0.0.0.0:123"} 0
# EOF
```
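
As a rough illustration of consuming this dataset, the Python sketch below parses text of the shape shown above into `(name, labels, value)` tuples. The helpers `parse_metrics` and `fetch_metrics` are hypothetical names, not part of ntpd-rs, and the `http://` scheme in the default URL is an assumption. The parser is deliberately simplified: it skips comment lines, ignores optional timestamps, and returns label values verbatim without unescaping. In practice a Prometheus server or an existing client library would normally do this work.

```python
import re
import urllib.request

# One sample line: metric name, optional {labels}, then the value.
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
# One label pair inside the braces: key="value" (escapes are kept as-is).
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse exposition text into a list of (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines (# HELP, # TYPE, # UNIT, # EOF).
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # Not a plain "name{labels} value" line; ignore it.
        name, raw_labels, raw_value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(raw_value)))
    return samples


def fetch_metrics(url="http://127.0.0.1:9975/metrics"):
    """Fetch and parse the dataset (assumes the exporter is reachable)."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return parse_metrics(response.read().decode("utf-8"))
```

For example, `[v for n, _, v in fetch_metrics() if n == "ntp_system_stratum"]` would collect the stratum samples from a running exporter.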

## Installed through cargo or from source

When installed through cargo or from source, two things need to be configured manually:
15 changes: 15 additions & 0 deletions docs/guide/security-guidance.md
@@ -38,6 +38,11 @@ What values to choose for these thresholds depends on what the expected maximum

Again, note that the thresholds are enforced by ntpd-rs aborting when they are exceeded. Hence, strict values will limit the daemon's ability to automatically adjust to sudden changes to the clock, potentially decreasing the availability of time synchronization.

When ntpd-rs aborts because one of the above thresholds was exceeded, it logs a message along the lines of the one below:
```
2024-11-28T12:40:32.821717Z ERROR ntp_proto::algorithm::kalman: Unusually large clock step suggested, please manually verify system clock and reference clock state and restart if appropriate.
```
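
A monitoring script could watch for this message. The following is a minimal Python sketch, assuming the log lines are already available as an iterable of strings (for example, read from a file or piped from the journal). The function name `find_step_aborts` is hypothetical, and the match relies on the message wording shown above, which may differ between ntpd-rs versions.

```python
# Substring taken from the example log line above; the exact wording is
# an assumption and may change between versions.
ABORT_MARKER = "Unusually large clock step suggested"


def find_step_aborts(log_lines):
    """Return the ERROR-level log lines that report a rejected clock step."""
    return [
        line.rstrip("\n")
        for line in log_lines
        if " ERROR " in line and ABORT_MARKER in line
    ]
```

Any line returned here is a cue to verify the system clock and the reference clocks manually before restarting the daemon.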

### The risks of rebooting ntpd-rs

Because the `startup-step-panic-threshold` is typically higher than the `single-step-panic-threshold`, rebooting ntpd-rs makes bigger step adjustments possible. Furthermore, rebooting clears the total accumulated step, and repeated reboots can allow an attacker to bypass the protections offered by `accumulated-step-panic-threshold`.
@@ -52,6 +57,16 @@ For servers being completely unavailable, this is the difference between the num

The downside of a large number of upstream time servers is that an attacker aiming to missteer your local clock is provided with more avenues to do so, because they will need to compromise a smaller fraction of upstream servers to gain clock control. The attacker can then ensure synchronization with that subset through denial of service attacks on the other upstream servers.

If too few servers are available for synchronizing the time, or if they don't agree on the current time, this will be logged in a manner similar to:
```
2024-11-28T12:40:32.821717Z INFO ntp_proto::algorithm::kalman: No consensus on current time
```
Note that this can happen occasionally, especially during startup. However, seeing this message for a prolonged period (more than 15 minutes) without any sign of synchronization, in the form of a log line similar to
```
Sep 05 13:57:38 magnesium ntp-daemon[1593]: 2024-09-05T11:57:38.128571Z INFO ntp_proto::algorithm::kalman: Offset: -84.4077443282968+-94.97527051033717ms, frequency: 0+-10000000ppm
```
means that the client is not steering the clock and may indicate a problem with the configured time sources.
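
The 15-minute rule described above can be sketched as a small log check. This Python example is an illustration under stated assumptions, not an official tool: it assumes log lines arrive in chronological order, that each relevant line contains a UTC timestamp in the format shown in the examples, and that the message texts match those examples. The names `parse_timestamp` and `prolonged_no_consensus` are hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# Timestamp format as it appears in the example log lines above.
TIMESTAMP_RE = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{1,6}Z")
NO_CONSENSUS = "No consensus on current time"
SYNC_MARKER = "ntp_proto::algorithm::kalman: Offset:"


def parse_timestamp(line):
    """Extract the first UTC timestamp from a log line, or None."""
    match = TIMESTAMP_RE.search(line)
    if match is None:
        return None
    stamp = datetime.strptime(match.group(0), "%Y-%m-%dT%H:%M:%S.%fZ")
    return stamp.replace(tzinfo=timezone.utc)


def prolonged_no_consensus(log_lines, limit=timedelta(minutes=15)):
    """Return the start of the first no-consensus streak longer than
    `limit` with no synchronization line in between, or None."""
    streak_start = None
    for line in log_lines:
        stamp = parse_timestamp(line)
        if stamp is None:
            continue
        if SYNC_MARKER in line:
            # A synchronization line ends any ongoing streak.
            streak_start = None
        elif NO_CONSENSUS in line:
            if streak_start is None:
                streak_start = stamp
            elif stamp - streak_start > limit:
                return streak_start
    return None
```

The search-based timestamp extraction is a deliberate choice so that lines with a journal prefix (hostname, unit name) are handled the same way as bare lines.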

## Configuration, logs and observability

There are more aspects of ntpd-rs besides clock steering that must be considered for secure operations.