Platform Timers

This reference describes the timer implementations used by tacet on different platforms.

Timer selection

TimerSpec::Auto uses platform-specific logic to select the best timer:

Platform	Auto Behavior	Fallback	PMU Timer (explicit)
x86_64 Linux	`rdtsc` (~0.3ns)	N/A	`perf_event` (~0.3ns)
x86_64 macOS	`rdtsc` (~0.3ns)	N/A	N/A
ARM64 Linux	`perf_event` (~0.3ns, needs sudo)	`cntvct_el0` (1-40ns)	`perf_event`
ARM64 macOS	`kperf` (~1ns, needs sudo)	`cntvct_el0` (~42ns)	`kperf`
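
The table can also be read as a small decision function. The sketch below is illustrative only: `Platform`, `Timer`, and `auto_select` are hypothetical names rather than tacet's API, and `pmu_available` stands in for whatever privilege probe the library actually performs.

```rust
/// Hypothetical platform tag; not part of tacet's API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Platform {
    X86_64Linux,
    X86_64MacOs,
    Arm64Linux,
    Arm64MacOs,
}

/// Hypothetical timer tag mirroring the names used in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Timer {
    Rdtsc,
    PerfEvent,
    Kperf,
    CntvctEl0,
}

/// Mirrors the "Auto Behavior" and "Fallback" columns of the table.
/// `pmu_available` is an assumed flag meaning "the PMU could be opened",
/// for example because the process runs with sufficient privileges.
pub fn auto_select(platform: Platform, pmu_available: bool) -> Timer {
    match (platform, pmu_available) {
        // x86_64 never needs the PMU: rdtsc already resolves ~0.3ns.
        (Platform::X86_64Linux | Platform::X86_64MacOs, _) => Timer::Rdtsc,
        (Platform::Arm64Linux, true) => Timer::PerfEvent,
        (Platform::Arm64MacOs, true) => Timer::Kperf,
        // ARM64 without PMU access falls back to the system counter.
        (Platform::Arm64Linux | Platform::Arm64MacOs, false) => Timer::CntvctEl0,
    }
}
```

For example, `auto_select(Platform::Arm64MacOs, false)` yields `Timer::CntvctEl0`, matching the Fallback column.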

Timer selection rationale

On x86_64: rdtsc (invariant TSC) already has ~0.3ns resolution and measures wall-clock time, which is what an attacker observes, so Auto has no need for PMU access.

On ARM64: System timers (cntvct_el0) are often too coarse (~42ns on Apple Silicon, ~40ns on Neoverse N1). Auto tries PMU timers first for better precision and falls back to cntvct_el0 if root privileges are unavailable.

PMU-based timers (kperf, perf_event) are available via explicit TimerSpec::Kperf or TimerSpec::PerfEvent for microarchitectural research.

Requiring high-precision timing

For CI or when you need guaranteed precision, use TimerSpec::RequireHighPrecision:

use tacet::{TimingOracle, TimerSpec};

let result = TimingOracle::new()
    .timer_spec(TimerSpec::RequireHighPrecision)
    .test(inputs, |data| my_function(data));

This performs runtime detection:

Checks if the system timer has ≤2ns resolution
If yes, uses it (x86_64 rdtsc, ARMv8.6+ cntvct_el0)
If no, falls back to PMU timers (kperf, perf_event)
Panics if no high-precision timer is available

Platform	Behavior
x86_64	Always succeeds (`rdtsc` ~0.3ns)
ARM64 Linux ARMv8.6+ (Graviton4)	Succeeds without sudo (`cntvct_el0` ~1ns)
ARM64 Linux pre-ARMv8.6	Requires sudo for `perf_event`
ARM64 macOS	Requires sudo for `kperf`
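
The detection steps above can be sketched as a pure function. This is a minimal illustration, not the library's implementation: `Choice` and `require_high_precision` are hypothetical names, the two inputs are assumed to come from probes the library performs itself, and the sketch returns an error where the text says the library panics.

```rust
/// Hypothetical outcome of the detection; not one of tacet's real types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Choice {
    SystemTimer,
    PmuTimer,
}

/// Threshold from the text: the system timer qualifies at <= 2ns resolution.
pub const MAX_RESOLUTION_NS: f64 = 2.0;

/// Returns the timer that would be picked, or an error in the case
/// where the text says the library panics.
pub fn require_high_precision(
    system_resolution_ns: f64,
    pmu_available: bool,
) -> Result<Choice, String> {
    if system_resolution_ns <= MAX_RESOLUTION_NS {
        // x86_64 rdtsc and ARMv8.6+ cntvct_el0 land here.
        Ok(Choice::SystemTimer)
    } else if pmu_available {
        // kperf or perf_event, which normally need elevated privileges.
        Ok(Choice::PmuTimer)
    } else {
        Err(format!(
            "no high-precision timer: system timer resolves {system_resolution_ns}ns and the PMU is unavailable"
        ))
    }
}
```

With a 41.67ns system timer and no PMU access the function returns `Err`, which corresponds to the rows of the table that require sudo.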

Enabling PMU timers

Use TimerSpec::RequireCycleAccurate to request PMU-based cycle counting. This uses kperf on macOS ARM64 and perf_event on Linux:

macOS ARM64

# Requires BOTH sudo AND single-threaded
sudo -E cargo test -- --test-threads=1

kperf uses Apple’s private performance framework. It requires root privileges and can only be accessed by one thread at a time; parallel tests silently fall back to the standard timer.

Linux

# Option 1: Run as root
sudo cargo test

# Option 2: Grant CAP_PERFMON capability
sudo setcap cap_perfmon+ep target/debug/deps/my_test-*

# Option 3: Adjust perf_event_paranoid (temporary; resets on reboot)
echo 2 | sudo tee /proc/sys/kernel/perf_event_paranoid

Check current setting:

cat /proc/sys/kernel/perf_event_paranoid
# 3 = no access, 2 = user only, 1 = limited, 0/-1 = full
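
The same check can be made from a test harness before requesting a PMU timer. The sketch below uses only the standard library; the function names are made up for illustration, and the `<= 2` rule is taken from the requirement stated on this page.

```rust
use std::fs;

/// Parses the contents of /proc/sys/kernel/perf_event_paranoid.
pub fn parse_paranoid(contents: &str) -> Option<i32> {
    contents.trim().parse().ok()
}

/// Per this page, unprivileged perf_event use needs a level of 2 or lower.
pub fn unprivileged_perf_allowed(level: i32) -> bool {
    level <= 2
}

/// Reads the live setting. Returns None if the file is missing
/// (for example on a non-Linux system) or cannot be parsed.
pub fn read_paranoid() -> Option<i32> {
    let contents = fs::read_to_string("/proc/sys/kernel/perf_event_paranoid").ok()?;
    parse_paranoid(&contents)
}
```

A harness could call `read_paranoid().map(unprivileged_perf_allowed)` and skip PMU-based tests when the result is `Some(false)` or `None`.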

x86_64 details

rdtsc

The rdtsc (Read Time-Stamp Counter) instruction reads the CPU’s time-stamp counter:

Invariant TSC: Modern CPUs advance the counter at a constant rate regardless of frequency scaling
Resolution: ~0.3ns at 3 GHz
No privileges required

The library uses serialization barriers (mfence/lfence) to prevent out-of-order execution from affecting measurements.
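
A fenced counter read might look like the sketch below. The fence placement (mfence and lfence before the read, lfence after) is one common pattern and an assumption here, since this page does not spell out the exact sequence the library uses. `read_counter` and `time_once` are hypothetical names, and the non-x86_64 branch is only a stand-in so the sketch compiles elsewhere.

```rust
/// Reads the TSC with fences around the read (x86_64 only).
#[cfg(target_arch = "x86_64")]
pub fn read_counter() -> u64 {
    use core::arch::x86_64::{_mm_lfence, _mm_mfence, _rdtsc};
    // SAFETY: SSE2 (needed for the fences) and RDTSC are part of the
    // x86_64 baseline instruction set.
    unsafe {
        _mm_mfence(); // order earlier loads and stores
        _mm_lfence(); // wait for earlier instructions to complete
        let ticks = _rdtsc();
        _mm_lfence(); // keep later instructions from starting early
        ticks
    }
}

/// Stand-in for other architectures: nanoseconds from a monotonic clock.
#[cfg(not(target_arch = "x86_64"))]
pub fn read_counter() -> u64 {
    use std::sync::OnceLock;
    use std::time::Instant;
    static START: OnceLock<Instant> = OnceLock::new();
    START.get_or_init(Instant::now).elapsed().as_nanos() as u64
}

/// Times a single call of `f` in counter ticks.
pub fn time_once<F: FnOnce()>(f: F) -> u64 {
    let start = read_counter();
    f();
    let end = read_counter();
    end.wrapping_sub(start)
}
```

The closure passed to `time_once` should wrap its result in `std::hint::black_box` so the compiler cannot remove the work being timed.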

perf_event (Linux)

Linux perf_event provides access to hardware performance counters:

Same resolution as rdtsc
Provides additional isolation from software overhead
Requires CAP_PERFMON or perf_event_paranoid ≤ 2

ARM64 macOS (Apple Silicon)

cntvct_el0

The virtual timer counter runs at a fixed 24 MHz on M1/M2/M3/M4:

Resolution: ~41.67ns (1/24 MHz)
No privileges required
Consistent across P/E cores

Adaptive batching compensates for this relatively coarse resolution.

kperf

Apple’s private performance framework provides cycle-accurate timing:

Resolution: ~1ns
Requires root
Single-threaded only (global resource)

To use kperf:

sudo -E cargo test -- --test-threads=1

ARM64 Linux

cntvct_el0

Counter frequency varies by SoC:

SoC	Frequency	Resolution
AWS Graviton4 (ARMv8.6+)	1 GHz	~1ns
Ampere Altra	25 MHz	~40ns
Raspberry Pi 4	54 MHz	~18ns
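
The Resolution column is simply the reciprocal of the counter frequency. The sketch below shows that arithmetic, plus one way to read the frequency on ARM64 through the architectural `cntfrq_el0` register. The function names are illustrative, and nothing here is claimed to be how tacet queries the frequency.

```rust
/// Resolution in nanoseconds of a counter ticking at `freq_hz`.
pub fn resolution_ns(freq_hz: u64) -> f64 {
    1e9 / freq_hz as f64
}

/// Reads the counter frequency register (ARM64 only).
#[cfg(target_arch = "aarch64")]
pub fn counter_frequency_hz() -> u64 {
    use core::arch::asm;
    let freq: u64;
    // SAFETY: reads a system register and has no other side effects.
    unsafe {
        asm!(
            "mrs {f}, cntfrq_el0",
            f = out(reg) freq,
            options(nomem, nostack, preserves_flags)
        );
    }
    freq
}
```

For instance, `resolution_ns(25_000_000)` gives 40ns, the Ampere Altra row, and `resolution_ns(54_000_000)` gives roughly 18.5ns, the Raspberry Pi 4 row.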

perf_event

Same as x86_64 Linux; requires CAP_PERFMON or perf_event_paranoid ≤ 2.

Adaptive batching

On platforms with coarse timer resolution, fast operations complete in fewer timer ticks than needed for reliable measurement. The library automatically compensates:

Pilot measurement: Run ~100 iterations to measure ticks per operation
Enable batching: If < 5 ticks per call, batch multiple operations together
Select K: Choose batch size K = min(ceil(50/ticks_per_call), 20)
Measure batches: Record total time for K operations as one sample

Batching is disabled when:

The pilot measurement shows ≥ 5 ticks per operation, because the timer is fine enough or the operation is slow enough
Using cycle-accurate timers (kperf, perf_event)
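
The batch-size rule can be written out directly. The constants come from the steps above; the function name, the `cycle_accurate` flag, and the handling of a zero or invalid pilot result are assumptions made for this sketch.

```rust
/// Batching is enabled below this many ticks per call.
pub const MIN_TICKS: f64 = 5.0;
/// Numerator of the K formula: the tick count a batch aims for.
pub const TARGET_TICKS: f64 = 50.0;
/// Upper bound on the batch size.
pub const MAX_BATCH: usize = 20;

/// Returns Some(K) when batching applies, or None when it is disabled.
/// `ticks_per_call` is the pilot estimate; `cycle_accurate` is an assumed
/// flag for PMU timers (kperf, perf_event).
pub fn batch_size(ticks_per_call: f64, cycle_accurate: bool) -> Option<usize> {
    if cycle_accurate || ticks_per_call >= MIN_TICKS {
        return None;
    }
    // Clamp so a zero, negative, or NaN pilot result maps to the cap
    // instead of dividing by zero.
    let ticks = ticks_per_call.max(f64::MIN_POSITIVE);
    let k = (TARGET_TICKS / ticks).ceil().min(MAX_BATCH as f64);
    Some(k as usize)
}
```

Because batching only starts below 5 ticks per call, the formula always yields K between 11 and 20: 4 ticks per call gives K = 13, and anything at or below 2.5 ticks per call hits the cap of 20.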

Pre-flight checks

Before measurement begins, the library runs several checks:

Timer sanity

Verify timer is monotonic (second read ≥ first read)
Check timer advances at a reasonable rate
Detect if timer resolution is too coarse for any measurement
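
These checks are easy to express over any counter that can be read through a closure. The sketch below is a generic illustration with hypothetical function names, not the library's own pre-flight code.

```rust
/// Reads `reads` consecutive values and checks that none goes backwards.
pub fn is_monotonic<F: FnMut() -> u64>(mut read: F, reads: usize) -> bool {
    let mut prev = read();
    for _ in 1..reads {
        let next = read();
        if next < prev {
            return false;
        }
        prev = next;
    }
    true
}

/// Smallest nonzero step seen between consecutive reads: a rough estimate
/// of the timer's resolution in ticks. Returns None if the timer never
/// advanced during the probe.
pub fn smallest_step<F: FnMut() -> u64>(mut read: F, reads: usize) -> Option<u64> {
    let mut prev = read();
    let mut best: Option<u64> = None;
    for _ in 1..reads {
        let next = read();
        let step = next.saturating_sub(prev);
        if step > 0 {
            best = Some(best.map_or(step, |b| b.min(step)));
        }
        prev = next;
    }
    best
}
```

A `None` from `smallest_step` means the timer did not advance at all during the probe, which is the case where no measurement is possible.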

Harness sanity

Compares two halves of the baseline samples against each other
Detects problems such as generator timing or side effects
If a “leak” is found between identical inputs, the harness has a bug
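
A simplified version of the idea: split the baseline samples in half and check that the two halves agree. The median comparison and the relative tolerance below are stand-ins chosen for illustration. This page does not say which statistic the library applies, and the function names are hypothetical.

```rust
/// Median of a slice, or None if it is empty.
pub fn median(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    Some(if sorted.len() % 2 == 0 {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    })
}

/// Some(true) when the medians of the two halves differ by at most
/// `tolerance` (relative); None when there are too few samples.
pub fn halves_agree(baseline: &[f64], tolerance: f64) -> Option<bool> {
    let (first, second) = baseline.split_at(baseline.len() / 2);
    let (a, b) = (median(first)?, median(second)?);
    // Guard against dividing by zero when both medians are zero.
    let scale = a.abs().max(b.abs()).max(f64::MIN_POSITIVE);
    Some((a - b).abs() / scale <= tolerance)
}
```

If `halves_agree` returns `Some(false)` for samples that all came from the same input class, the difference must come from the harness rather than from the code under test.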

Stationarity check

Detects drift over time that would violate the statistical assumptions
Divides samples into windows and compares their medians
Flags a quality warning if significant drift is detected
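
The windowed comparison can be sketched as follows. The spread metric (largest minus smallest window median, relative to the larger magnitude) and its threshold are assumptions made for illustration, and the function names are hypothetical. The library's actual criterion is not described on this page.

```rust
/// Median of a non-empty slice (callers guarantee at least one element).
fn median(values: &[f64]) -> f64 {
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 0 {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    }
}

/// Splits `samples` into `windows` equal chunks, dropping any remainder,
/// and returns the median of each chunk in order.
pub fn window_medians(samples: &[f64], windows: usize) -> Vec<f64> {
    if windows == 0 || samples.len() < windows {
        return Vec::new();
    }
    let size = samples.len() / windows;
    samples.chunks_exact(size).take(windows).map(median).collect()
}

/// True when the window medians spread by more than `max_relative_spread`.
pub fn drift_detected(samples: &[f64], windows: usize, max_relative_spread: f64) -> bool {
    let medians = window_medians(samples, windows);
    if medians.len() < 2 {
        return false;
    }
    let lo = medians.iter().copied().fold(f64::INFINITY, f64::min);
    let hi = medians.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let scale = hi.abs().max(lo.abs()).max(f64::MIN_POSITIVE);
    (hi - lo) / scale > max_relative_spread
}
```

A steadily rising series trips the check, while samples that fluctuate around a fixed level do not.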

Unmeasurable operations

Operations completing faster than ~10ns on Apple Silicon (or proportionally fast on other platforms) cannot be reliably measured:

Operation too fast to measure
  Operation: ~5ns
  Timer resolution: ~42ns
  Recommendation: Use TimerSpec::RequireHighPrecision with sudo (~1ns resolution)

Options:

Use TimerSpec::RequireHighPrecision for higher resolution (requires sudo on Apple Silicon)
Test a larger workload (more iterations, larger input)
Accept that ultra-fast operations may not be measurable
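
To make "proportionally fast on other platforms" concrete, the sketch below scales the quoted Apple Silicon figures (~10ns at ~42ns resolution) linearly to other timer resolutions. The linear scaling and the function names are assumptions made for illustration, not the library's documented rule.

```rust
/// Figures quoted in the text for Apple Silicon.
pub const REFERENCE_LIMIT_NS: f64 = 10.0;
pub const REFERENCE_RESOLUTION_NS: f64 = 42.0;

/// Shortest operation that would count as measurable at `resolution_ns`,
/// assuming the limit scales linearly with timer resolution.
pub fn min_measurable_ns(resolution_ns: f64) -> f64 {
    resolution_ns * (REFERENCE_LIMIT_NS / REFERENCE_RESOLUTION_NS)
}

/// True when an operation of `operation_ns` clears that limit.
pub fn is_measurable(operation_ns: f64, resolution_ns: f64) -> bool {
    operation_ns >= min_measurable_ns(resolution_ns)
}
```

Under this assumption the ~5ns operation from the message above is unmeasurable at ~42ns resolution but measurable with a ~1ns timer, which is why the recommendation points to TimerSpec::RequireHighPrecision.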
