This reference describes the timer implementations used by tacet on different platforms.

TimerSpec::Auto uses platform-specific logic to select the best timer:

| Platform | Auto Behavior | Fallback | PMU Timer (explicit) |
|---|---|---|---|
| x86_64 Linux | rdtsc (~0.3ns) | N/A | perf_event (~0.3ns) |
| x86_64 macOS | rdtsc (~0.3ns) | N/A | N/A |
| ARM64 Linux | perf_event (~0.3ns, needs sudo) | cntvct_el0 (1-40ns) | perf_event |
| ARM64 macOS | kperf (~1ns, needs sudo) | cntvct_el0 (~42ns) | kperf |

On x86_64: rdtsc (invariant TSC) is already high-precision (~0.3ns) and measures wall-clock time, which is what attackers observe, so PMU access is not needed.

On ARM64: System timers (cntvct_el0) are often too coarse (42ns on Apple Silicon, 40ns on Neoverse N1). Auto tries PMU timers first for better precision and falls back to cntvct_el0 if sudo is unavailable.
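
The selection rules in the table can be sketched as a single match. This is only an illustration of the decision logic: `Platform`, `AutoTimer`, `select_auto`, and the `pmu_accessible` flag are hypothetical names, not part of tacet's API.

```rust
/// Hypothetical platform tags for illustration (not tacet types).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Platform {
    X86_64Linux,
    X86_64MacOs,
    Arm64Linux,
    Arm64MacOs,
}

/// Hypothetical timer tags mirroring the names used in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AutoTimer {
    Rdtsc,
    PerfEvent,
    Kperf,
    CntvctEl0,
}

/// Mirrors the Auto table: x86_64 always uses rdtsc; ARM64 prefers the PMU
/// timer and falls back to cntvct_el0 when PMU access (sudo) is unavailable.
fn select_auto(platform: Platform, pmu_accessible: bool) -> AutoTimer {
    match (platform, pmu_accessible) {
        (Platform::X86_64Linux | Platform::X86_64MacOs, _) => AutoTimer::Rdtsc,
        (Platform::Arm64Linux, true) => AutoTimer::PerfEvent,
        (Platform::Arm64MacOs, true) => AutoTimer::Kperf,
        (Platform::Arm64Linux | Platform::Arm64MacOs, false) => AutoTimer::CntvctEl0,
    }
}
```

For example, `select_auto(Platform::Arm64MacOs, false)` yields `AutoTimer::CntvctEl0`, the fallback column of the table.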

PMU-based timers (kperf, perf_event) are available via explicit TimerSpec::Kperf or TimerSpec::PerfEvent for microarchitectural research.

For CI or when you need guaranteed precision, use TimerSpec::RequireHighPrecision:

    use tacet::{TimingOracle, TimerSpec};

    let result = TimingOracle::new()
        .timer_spec(TimerSpec::RequireHighPrecision)
        .test(inputs, |data| my_function(data));

This performs runtime detection:

  1. Checks if the system timer has ≤2ns resolution
  2. If yes, uses it (x86_64 rdtsc, ARMv8.6+ cntvct_el0)
  3. If no, falls back to PMU timers (kperf, perf_event)
  4. Panics if no high-precision timer is available

| Platform | Behavior |
|---|---|
| x86_64 | Always succeeds (rdtsc ~0.3ns) |
| ARM64 Linux ARMv8.6+ (Graviton4) | Succeeds without sudo (cntvct_el0 ~1ns) |
| ARM64 Linux pre-ARMv8.6 | Requires sudo for perf_event |
| ARM64 macOS | Requires sudo for kperf |
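
The four detection steps can be sketched as follows. This is a minimal illustration under assumptions: `HighPrecisionChoice` and `require_high_precision` are hypothetical names, the inputs are supplied by the caller rather than probed from hardware, and the sketch returns an error where the text says the library panics.

```rust
/// Hypothetical outcome type for illustration (not a tacet type).
#[derive(Debug, PartialEq)]
enum HighPrecisionChoice {
    SystemTimer,
    PmuTimer,
}

/// Threshold from step 1: the system timer qualifies at <= 2ns resolution.
const MAX_RESOLUTION_NS: f64 = 2.0;

/// Applies steps 1-4 in order. A caller that wants the panicking behavior
/// described in the text could call `.expect(...)` on the result.
fn require_high_precision(
    system_resolution_ns: f64,
    pmu_available: bool,
) -> Result<HighPrecisionChoice, String> {
    if system_resolution_ns <= MAX_RESOLUTION_NS {
        // Steps 1-2: the system timer is already precise enough.
        Ok(HighPrecisionChoice::SystemTimer)
    } else if pmu_available {
        // Step 3: fall back to a PMU timer (kperf or perf_event).
        Ok(HighPrecisionChoice::PmuTimer)
    } else {
        // Step 4: nothing suitable is available.
        Err(format!(
            "no high-precision timer: system timer is {system_resolution_ns}ns and no PMU access"
        ))
    }
}
```

With a 42ns system timer and no PMU access the function returns `Err`, which corresponds to the "requires sudo" rows of the table.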

Use TimerSpec::RequireCycleAccurate to request PMU-based cycle counting. This uses kperf on macOS ARM64 and perf_event on Linux:

# Requires BOTH sudo AND single-threaded
sudo -E cargo test -- --test-threads=1

kperf uses Apple’s private performance framework. It requires root privileges and can only be accessed by one thread at a time; parallel tests silently fall back to the standard timer.

The rdtsc (Read Time-Stamp Counter) instruction reads the CPU’s time-stamp counter:

  • Invariant TSC: Modern CPUs keep the counter running at a constant rate regardless of frequency scaling
  • Resolution: ~0.3ns at 3 GHz
  • No privileges required

The library uses serialization barriers (mfence/lfence) to prevent out-of-order execution from affecting measurements.
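
A fenced counter read might look like the sketch below. The exact fence placement tacet uses is not stated here, so the ordering shown is an assumption; `read_tsc_serialized` and `ticks_to_ns` are hypothetical helper names. The intrinsics come from Rust's `core::arch::x86_64`, and the function returns `None` on other architectures.

```rust
/// Reads the time-stamp counter with fences on either side (x86_64 only).
#[cfg(target_arch = "x86_64")]
fn read_tsc_serialized() -> Option<u64> {
    use core::arch::x86_64::{_mm_lfence, _mm_mfence, _rdtsc};
    // SAFETY: SSE2 (which provides both fences) and RDTSC are part of the
    // x86_64 baseline instruction set.
    unsafe {
        _mm_mfence(); // order earlier loads and stores before the read
        _mm_lfence(); // wait for earlier instructions to complete
        let ticks = _rdtsc();
        _mm_lfence(); // keep later instructions from starting before the read
        Some(ticks)
    }
}

/// Fallback for targets without rdtsc.
#[cfg(not(target_arch = "x86_64"))]
fn read_tsc_serialized() -> Option<u64> {
    None
}

/// Converts a tick count to nanoseconds for a known TSC frequency.
/// At 3 GHz one tick is 1/3 ns, which matches the ~0.3ns figure.
fn ticks_to_ns(ticks: u64, tsc_hz: f64) -> f64 {
    ticks as f64 / tsc_hz * 1.0e9
}
```

Measuring an operation then amounts to taking one read before and one after, subtracting, and converting with `ticks_to_ns`.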

Linux perf_event provides access to hardware performance counters:

  • Same resolution as rdtsc
  • Provides additional isolation from software overhead
  • Requires CAP_PERFMON or perf_event_paranoid ≤ 2
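
The paranoid level can be inspected before attempting to open a counter. The sketch below reads the standard Linux sysctl file and applies the "≤ 2" condition from the list above; the helper names are hypothetical, and the CAP_PERFMON alternative is not checked here.

```rust
/// Standard Linux location of the perf_event_paranoid setting.
const PARANOID_PATH: &str = "/proc/sys/kernel/perf_event_paranoid";

/// Parses the file contents, e.g. "2\n" or "-1\n".
fn parse_paranoid(contents: &str) -> Option<i32> {
    contents.trim().parse().ok()
}

/// Returns None when the file is missing (for example on non-Linux
/// systems) or does not contain an integer.
fn read_paranoid() -> Option<i32> {
    std::fs::read_to_string(PARANOID_PATH)
        .ok()
        .and_then(|s| parse_paranoid(&s))
}

/// Applies the "perf_event_paranoid <= 2" condition from the text.
fn paranoid_permits_perf(level: i32) -> bool {
    level <= 2
}
```

A test harness could call `read_paranoid().map(paranoid_permits_perf)` and skip perf_event-based tests when the result is `Some(false)` and the process lacks CAP_PERFMON.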

On Apple Silicon, the virtual timer counter (cntvct_el0) runs at a fixed 24 MHz on M1/M2/M3/M4:

  • Resolution: ~41.67ns (1/24 MHz)
  • No privileges required
  • Consistent across P/E cores

This relatively coarse resolution is compensated by adaptive batching.

kperf, Apple’s private performance framework, provides cycle-accurate timing:

  • Resolution: ~1ns
  • Requires root
  • Single-threaded only (global resource)

To use kperf:

sudo -E cargo test -- --test-threads=1

On ARM64 Linux, the cntvct_el0 counter frequency varies by SoC:

| SoC | Frequency | Resolution |
|---|---|---|
| AWS Graviton4 (ARMv8.6+) | 1 GHz | ~1ns |
| Ampere Altra | 25 MHz | ~40ns |
| Raspberry Pi 4 | 54 MHz | ~18ns |
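
The resolution column is simply the reciprocal of the counter frequency. The sketch below shows that arithmetic and, on aarch64, one way to read the frequency from the CNTFRQ_EL0 register. Both function names are hypothetical, and the register read assumes the OS permits user-space access, as Linux and macOS normally do.

```rust
/// Resolution in nanoseconds for a counter ticking at `frequency_hz`.
fn resolution_ns(frequency_hz: f64) -> f64 {
    1.0e9 / frequency_hz
}

/// Reads the generic timer frequency register on aarch64.
#[cfg(target_arch = "aarch64")]
fn read_counter_frequency_hz() -> Option<u64> {
    let freq: u64;
    // SAFETY: reading CNTFRQ_EL0 has no side effects; this assumes the OS
    // allows the read from user space.
    unsafe {
        core::arch::asm!(
            "mrs {f}, cntfrq_el0",
            f = out(reg) freq,
            options(nomem, nostack, preserves_flags)
        );
    }
    Some(freq)
}

/// Other architectures have no CNTFRQ_EL0 register.
#[cfg(not(target_arch = "aarch64"))]
fn read_counter_frequency_hz() -> Option<u64> {
    None
}
```

For the values in this reference, `resolution_ns(24.0e6)` is about 41.67, `resolution_ns(25.0e6)` is 40, and `resolution_ns(54.0e6)` is about 18.5.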

perf_event on ARM64 Linux works the same as on x86_64 Linux and requires CAP_PERFMON or perf_event_paranoid ≤ 2.

On platforms with coarse timer resolution, fast operations complete in fewer timer ticks than needed for reliable measurement. The library automatically compensates:

  1. Pilot measurement: Run ~100 iterations to measure ticks per operation
  2. Enable batching: If < 5 ticks per call, batch multiple operations together
  3. Select K: Choose batch size K = min(ceil(50/ticks_per_call), 20)
  4. Measure batches: Record total time for K operations as one sample
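
Steps 2 and 3 reduce to a small function. The constants below are taken from the steps above; the function name, the use of a fractional `ticks_per_call`, and the handling of a zero pilot measurement are assumptions made for this sketch.

```rust
/// Below this many ticks per call, batching is enabled (step 2).
const MIN_TICKS_PER_CALL: f64 = 5.0;
/// Target number of ticks per batch used in the K formula (step 3).
const TARGET_TICKS: f64 = 50.0;
/// Upper bound on the batch size (step 3).
const MAX_BATCH: u32 = 20;

/// Returns the batch size K = min(ceil(50 / ticks_per_call), 20),
/// or 1 when the operation already spans enough ticks.
fn select_batch_size(ticks_per_call: f64) -> u32 {
    if ticks_per_call >= MIN_TICKS_PER_CALL {
        return 1; // batching disabled
    }
    if ticks_per_call <= 0.0 {
        // Assumption: a pilot that registers no ticks gets the maximum batch.
        return MAX_BATCH;
    }
    let k = (TARGET_TICKS / ticks_per_call).ceil();
    k.min(MAX_BATCH as f64) as u32
}
```

Because batching only applies below 5 ticks per call, 50 / ticks_per_call is always above 10, so K lands between 11 and 20. For instance, 4 ticks per call gives K = 13, and anything at or below 2.5 ticks per call hits the cap of 20.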

Batching is disabled when:

  • Timer has sufficient resolution (≥ 5 ticks per operation)
  • Using cycle-accurate timers (kperf, perf_event)
  • Operation is slow enough

Before measurement begins, the library runs several checks:

  • Verify timer is monotonic (second read ≥ first read)
  • Check timer advances at a reasonable rate
  • Detect if timer resolution is too coarse for any measurement

  • Compare two halves of the baseline samples
  • Detect problems such as generator timing differences or side effects
  • If a “leak” is found between identical inputs, the harness has a bug

  • Detect drift over time that would violate statistical assumptions
  • Divide samples into windows and compare their medians
  • Flag a quality warning if significant drift is detected
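
Two of these checks are easy to sketch: the monotonicity test and the windowed-median drift test. The statistics tacet actually applies are not described here, so the relative-deviation metric below is an assumption, and all function names are hypothetical.

```rust
/// Returns true when every timer reading is >= the one before it.
fn is_monotonic(readings: &[u64]) -> bool {
    readings.windows(2).all(|w| w[1] >= w[0])
}

/// Median of a slice, or None when it is empty.
fn median(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    Some(if sorted.len() % 2 == 0 {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    })
}

/// Splits `samples` into `windows` equal chunks (dropping any remainder)
/// and returns the largest relative gap between a window median and the
/// overall median. The caller compares it against a threshold of its choice.
fn max_relative_drift(samples: &[f64], windows: usize) -> Option<f64> {
    if windows == 0 || samples.len() < windows {
        return None;
    }
    let overall = median(samples)?;
    if overall == 0.0 {
        return None;
    }
    let chunk = samples.len() / windows;
    let drift = samples
        .chunks_exact(chunk)
        .take(windows)
        .filter_map(median)
        .map(|m| ((m - overall) / overall).abs())
        .fold(0.0_f64, f64::max);
    Some(drift)
}
```

Calling `max_relative_drift` with `windows = 2` gives a crude version of the two-halves comparison: identical inputs should produce a value near zero, and a large value points at the harness rather than the code under test.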

Operations completing faster than ~10ns on Apple Silicon (or proportionally fast on other platforms) cannot be reliably measured:

Operation too fast to measure
Operation: ~5ns
Timer resolution: ~42ns
Recommendation: Use TimerSpec::RequireHighPrecision with sudo (~1ns resolution)

Options:

  1. Use TimerSpec::RequireHighPrecision for higher resolution (requires sudo on Apple Silicon)
  2. Test a larger workload (more iterations, larger input)
  3. Accept that ultra-fast operations may not be measurable
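
Whether an operation is measurable can be estimated from its duration, the timer resolution, and the largest batch size. The sketch below assumes the 5-tick minimum and the batch cap of 20 from the batching description; the function names are hypothetical, and the text does not say that the ~10ns figure is derived this way.

```rust
/// Minimum ticks a sample should span, taken from the batching rule.
const MIN_TICKS: f64 = 5.0;

/// Number of timer ticks covered by `batch` back-to-back operations.
fn ticks_per_sample(op_ns: f64, resolution_ns: f64, batch: u32) -> f64 {
    op_ns * batch as f64 / resolution_ns
}

/// True when a batch of at most `max_batch` operations spans enough ticks.
fn is_measurable(op_ns: f64, resolution_ns: f64, max_batch: u32) -> bool {
    ticks_per_sample(op_ns, resolution_ns, max_batch) >= MIN_TICKS
}
```

Under these assumptions, a 5ns operation on a 42ns timer with 20 operations per batch spans about 2.4 ticks and is rejected, while the same operation on a 1ns timer spans 5 ticks unbatched. The cutoff on a 42ns timer works out to 5 × 42 / 20 = 10.5ns, which is close to the ~10ns figure quoted above.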