Interpreting Results

When a timing test completes, you get an Outcome that tells you whether a timing leak was detected, how confident the result is, and details about the effect size. This page explains how to read these results.

Every test returns one of four outcomes:

| Outcome | Meaning | Probability range |
|---|---|---|
| Pass | No timing leak detected above your threshold | P(leak) < 5% |
| Fail | Timing leak confirmed above your threshold | P(leak) > 95% |
| Inconclusive | Cannot reach a confident decision | 5% ≤ P(leak) ≤ 95% |
| Unmeasurable | Operation too fast to measure reliably | N/A |
A typical way to handle each case:

```rust
use tacet::{TimingOracle, AttackerModel, Outcome, helpers::InputPair};

let inputs = InputPair::new(|| [0u8; 32], || rand::random());
let outcome = TimingOracle::for_attacker(AttackerModel::AdjacentNetwork)
    .test(inputs, |data| my_function(&data));

match outcome {
    Outcome::Pass { leak_probability, quality, .. } => {
        println!("No leak detected (P={:.1}%)", leak_probability * 100.0);
        println!("Measurement quality: {:?}", quality);
    }
    Outcome::Fail { leak_probability, effect, .. } => {
        println!("Timing leak detected (P={:.1}%)", leak_probability * 100.0);
        println!("W₁ distance: {:.1}ns", effect.w1_distance_ns);
        if let Some(tail_diag) = &effect.tail_diagnostics {
            println!("Pattern: {:?}", tail_diag.pattern_label);
            println!("Shift: {:.1}ns, Tail: {:.1}ns ({:.0}% from tail)",
                tail_diag.shift_ns, tail_diag.tail_ns,
                tail_diag.tail_share * 100.0);
        }
    }
    Outcome::Inconclusive { reason, leak_probability, .. } => {
        println!("Inconclusive: {:?}", reason);
        println!("Current estimate: P={:.1}%", leak_probability * 100.0);
    }
    Outcome::Unmeasurable { recommendation, .. } => {
        println!("Cannot measure: {}", recommendation);
    }
}
```

The leak_probability is a Bayesian posterior probability: given the timing data collected, what’s the probability that the maximum effect exceeds your threshold?

P(leak) = P(max|Δ| > θ | data)

This is different from a p-value:

| Bayesian posterior | Frequentist p-value |
|---|---|
| "72% probability of a leak" | "If there were no leak, we'd see this data 28% of the time" |
| Directly interpretable | Requires careful interpretation |
| Converges with more data | Can produce more false positives with more data |
The probability maps to outcomes as follows:

| Range | Interpretation | Outcome |
|---|---|---|
| P < 5% | Confident there's no exploitable leak | Pass |
| 5% ≤ P ≤ 50% | Probably no leak, but uncertain | Inconclusive |
| 50% < P < 95% | Probably a leak, but uncertain | Inconclusive |
| P > 95% | Confident there's an exploitable leak | Fail |
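Assuming the default thresholds, the mapping above can be sketched as a small helper (hypothetical code, not part of tacet's API; Unmeasurable is decided separately and has no associated probability):

```rust
/// Hypothetical helper, not part of tacet's API: the default 5% / 95%
/// decision rule from the table above.
fn classify(p_leak: f64) -> &'static str {
    if p_leak < 0.05 {
        "Pass"
    } else if p_leak > 0.95 {
        "Fail"
    } else {
        "Inconclusive"
    }
}

fn main() {
    println!("{}", classify(0.02)); // Pass
    println!("{}", classify(0.72)); // Inconclusive
    println!("{}", classify(0.99)); // Fail
}
```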

You can adjust these thresholds for stricter or more lenient testing:

```rust
TimingOracle::for_attacker(AttackerModel::AdjacentNetwork)
    .pass_threshold(0.01) // Require P < 1% for pass (stricter)
    .fail_threshold(0.99) // Require P > 99% for fail (stricter)
```

When a leak is detected, the effect field provides the W₁ distance (Wasserstein-1 metric) and decomposes it into shift and tail components to help you understand the nature of the timing leak.

The W₁ distance measures the “cost” of transforming one timing distribution into another. For timing analysis, it approximates the sum of two components:

W₁ ≈ |shift| + tail

  • Shift (μ): A uniform timing difference affecting all measurements. This is the median of rank-matched differences between baseline and sample distributions. Typical cause: the code consistently takes a different branch.

  • Tail (τ): Additional effect concentrated in the upper quantiles (slower measurements). This is computed as the mean of residuals after removing the shift component. Typical cause: data-dependent cache misses or memory access patterns that only trigger for certain inputs.

  • Tail share: The fraction of the total effect coming from the tail: tail_share = tail / (|shift| + tail). Values near 1.0 indicate a pure tail effect; values near 0.0 indicate a pure uniform shift.
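To make the decomposition concrete, here is a minimal sketch of the computation as described above (an illustration, not tacet's actual implementation):

```rust
/// Illustrative sketch of the shift/tail decomposition described above;
/// not tacet's actual implementation. Inputs are equal-length, sorted
/// timing samples in nanoseconds.
fn decompose(baseline: &[f64], sample: &[f64]) -> (f64, f64) {
    // Rank-matched differences: i-th sorted sample minus i-th sorted baseline.
    let diffs: Vec<f64> = sample.iter().zip(baseline).map(|(s, b)| s - b).collect();

    // Shift: median of the rank-matched differences.
    let mut sorted = diffs.clone();
    sorted.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let shift = sorted[sorted.len() / 2];

    // Tail: mean residual after removing the uniform shift.
    let tail = diffs.iter().map(|d| d - shift).sum::<f64>() / diffs.len() as f64;
    (shift, tail)
}

fn main() {
    // Baseline constant at 100ns; sample matches except the top 20% is 50ns slower.
    let baseline = [100.0; 10];
    let mut sample = [100.0; 10];
    sample[8] = 150.0;
    sample[9] = 150.0;

    let (shift, tail) = decompose(&baseline, &sample);
    let tail_share = tail / (shift.abs() + tail);
    // A pure tail effect: no uniform shift, all effect in the upper quantiles.
    println!("shift={shift:.1}ns tail={tail:.1}ns tail_share={tail_share:.2}");
}
```

Note how W₁ ≈ |shift| + tail holds here: the shift is zero and the whole effect shows up as tail, giving a tail share of 1.0.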

```rust
match outcome {
    Outcome::Fail { effect, .. } => {
        println!("W₁ distance: {:.1}ns", effect.w1_distance_ns);
        println!("95% CI: [{:.1}, {:.1}]ns",
            effect.credible_interval_ns.0,
            effect.credible_interval_ns.1);
        if let Some(tail_diag) = &effect.tail_diagnostics {
            println!("Shift: {:.1}ns", tail_diag.shift_ns);
            println!("Tail: {:.1}ns", tail_diag.tail_ns);
            println!("Tail share: {:.1}%", tail_diag.tail_share * 100.0);
            println!("Pattern: {:?}", tail_diag.pattern_label);
        }
    }
    _ => {}
}
```

The pattern_label classifies the leak based on its shift and tail decomposition:

| Pattern | Condition | Interpretation | Typical cause |
|---|---|---|---|
| Negligible | W₁ < 1.0ns | Effect too small to matter | Measurement noise |
| TailEffect | tail_share ≥ 0.5 | Leak concentrated in upper quantiles | Cache misses, memory access patterns, rare branches |
| UniformShift | tail_share < 0.3 | All measurements shifted equally | Consistent branch on secret bit, different code path |
| Mixed | 0.3 ≤ tail_share < 0.5 | Both shift and tail contribute | Multiple leak sources interacting |
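The classification rules in the table can be expressed as a small function (a hypothetical reimplementation for illustration; tacet assigns pattern_label internally):

```rust
/// Hypothetical reimplementation of the classification table above,
/// for illustration only.
fn pattern_label(w1_ns: f64, tail_share: f64) -> &'static str {
    if w1_ns < 1.0 {
        "Negligible"
    } else if tail_share >= 0.5 {
        "TailEffect"
    } else if tail_share < 0.3 {
        "UniformShift"
    } else {
        "Mixed"
    }
}

fn main() {
    println!("{}", pattern_label(0.4, 0.9));  // Negligible: below the 1ns floor
    println!("{}", pattern_label(25.0, 0.8)); // TailEffect
    println!("{}", pattern_label(25.0, 0.1)); // UniformShift
    println!("{}", pattern_label(25.0, 0.4)); // Mixed
}
```

Note that the Negligible check wins regardless of tail share: a sub-nanosecond effect is not worth characterizing.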

For non-negligible effects, tail_diagnostics may include quantile_shifts, showing how individual quantiles differ between baseline and sample distributions:

```rust
if let Some(tail_diag) = &effect.tail_diagnostics {
    if let Some(quantile_shifts) = &tail_diag.quantile_shifts {
        for (q, shift) in quantile_shifts {
            println!("q{}: {:.1}ns shift", q, shift);
        }
    }
}
```

This reveals which parts of the distribution are most affected. For example, if q95-q99 show large shifts but q50-q90 do not, the leak is concentrated in the tail (slow outliers).
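One way to apply that rule of thumb in your own reporting (a hedged sketch; the 3x factor is an arbitrary choice, not something tacet defines, and the `(quantile, shift_ns)` pair layout is assumed from the loop above):

```rust
/// Illustrative heuristic: the leak looks tail-concentrated when the
/// upper quantiles shift much more than the median does. The 3x factor
/// is an arbitrary example threshold.
fn tail_concentrated(quantile_shifts: &[(u8, f64)]) -> bool {
    let shift_at = |target: u8| {
        quantile_shifts.iter()
            .find(|(q, _)| *q == target)
            .map(|(_, s)| s.abs())
    };
    match (shift_at(50), shift_at(95)) {
        (Some(median), Some(upper)) => upper > 3.0 * median,
        _ => false,
    }
}

fn main() {
    let tail_heavy = [(50u8, 1.8), (90, 4.0), (95, 31.0), (99, 64.0)];
    let uniform = [(50u8, 12.0), (90, 12.5), (95, 13.0), (99, 13.2)];
    println!("{}", tail_concentrated(&tail_heavy)); // true
    println!("{}", tail_concentrated(&uniform));    // false
}
```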

An Inconclusive outcome means the test couldn’t reach a confident decision. The reason field tells you why:

| Reason | Meaning | What to do |
|---|---|---|
| DataTooNoisy | Measurement noise is too high | Reduce system noise, use TimerSpec::CyclePrecision, or increase the time budget |
| NotLearning | Posterior stopped updating | Check for measurement setup issues |
| TimeBudgetExceeded | Ran out of time before reaching a conclusion | Increase time_budget |
| SampleBudgetExceeded | Hit the sample limit | Increase max_samples |
| WouldTakeTooLong | Projected time to conclusion exceeds the budget | Accept the uncertainty or increase the budget |

```rust
match outcome {
    Outcome::Inconclusive { reason, leak_probability, .. } => {
        match reason {
            InconclusiveReason::DataTooNoisy => {
                eprintln!("Environment too noisy for reliable measurement");
            }
            InconclusiveReason::TimeBudgetExceeded => {
                eprintln!("Need more time; current estimate: P={:.0}%", leak_probability * 100.0);
            }
            _ => {
                eprintln!("Inconclusive: {:?}", reason);
            }
        }
    }
    _ => {}
}
```

An Unmeasurable outcome means the operation completes faster than the timer can reliably measure:

```rust
match outcome {
    Outcome::Unmeasurable { operation_ns, threshold_ns, recommendation, .. } => {
        println!("Operation takes ~{:.0}ns, need >{:.0}ns to measure",
            operation_ns, threshold_ns);
        println!("{}", recommendation);
    }
    _ => {}
}
```

Common solutions:

  • Use TimerSpec::CyclePrecision for finer resolution (see Measurement Precision)
  • Test a larger input size or more iterations
  • Accept that ultra-fast operations may not be measurable on your platform

The quality field indicates overall measurement reliability:

| Quality | MDE | Meaning |
|---|---|---|
| Excellent | < 5ns | Cycle-level precision |
| Good | 5-20ns | Suitable for most testing |
| Poor | 20-100ns | May miss small leaks |
| TooNoisy | > 100ns | Results unreliable |

MDE (Minimum Detectable Effect) is the smallest timing difference that could be reliably detected given the noise level.
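The table's thresholds can be sketched as follows (illustrative only; how tacet handles values at exactly 5ns, 20ns, or 100ns is an assumption here):

```rust
/// Hypothetical mapping from MDE to the quality levels in the table
/// above. Boundary handling (<= vs <) is an assumption.
fn quality(mde_ns: f64) -> &'static str {
    if mde_ns < 5.0 {
        "Excellent"
    } else if mde_ns <= 20.0 {
        "Good"
    } else if mde_ns <= 100.0 {
        "Poor"
    } else {
        "TooNoisy"
    }
}

fn main() {
    for mde in [2.0, 12.0, 60.0, 250.0] {
        println!("MDE {mde:.1}ns -> {}", quality(mde));
    }
}
```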

The diagnostics field tells you whether threshold elevation occurred:

```rust
match outcome {
    Outcome::Pass { diagnostics, .. } | Outcome::Fail { diagnostics, .. } => {
        if diagnostics.theta_eff > diagnostics.theta_user * 1.1 {
            println!("Note: Threshold elevated from {:.1}ns to {:.1}ns",
                diagnostics.theta_user, diagnostics.theta_eff);
            println!("See: /core-concepts/measurement-precision");
        }
    }
    _ => {}
}
```

| Field | Meaning |
|---|---|
| theta_user | The threshold you requested (via attacker model) |
| theta_floor | The minimum detectable effect for this setup |
| theta_eff | The threshold actually used: max(theta_user, theta_floor) |

When theta_eff > theta_user, your results answer “is there a leak above theta_eff?” rather than “is there a leak above theta_user?”. See Measurement Precision for details.
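The relationship is just a max (a sketch of the rule above, not tacet code):

```rust
/// The effective threshold is simply the larger of the requested and
/// floor thresholds (illustrative; tacet reports this as theta_eff).
fn effective_threshold(theta_user_ns: f64, theta_floor_ns: f64) -> f64 {
    theta_user_ns.max(theta_floor_ns)
}

fn main() {
    // Requested 10ns, but this setup can only resolve 25ns:
    // results now speak to leaks above 25ns, not 10ns.
    println!("{}", effective_threshold(10.0, 25.0)); // 25
}
```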

  • Pass (P < 5%): No leak detected above your threshold
  • Fail (P > 95%): Leak confirmed, with effect size and exploitability assessment
  • Inconclusive: Check the reason; often fixable by adjusting budget or environment
  • Unmeasurable: Operation too fast; use TimerSpec::CyclePrecision or test larger workload
  • Leak probability is Bayesian (converges with more data, unlike p-values)
  • Effect decomposition tells you how the leak manifests (shift vs tail)
  • Check theta_eff vs theta_user to understand if threshold was elevated