Interpreting Results
When a timing test completes, you get an Outcome that tells you whether a timing leak was detected, how confident the result is, and details about the effect size. This page explains how to read these results.
The four outcomes
Every test returns one of four outcomes:
| Outcome | Meaning | Probability Range |
|---|---|---|
| Pass | No timing leak detected above your threshold | P(leak) < 5% |
| Fail | Timing leak confirmed above your threshold | P(leak) > 95% |
| Inconclusive | Cannot reach a confident decision | 5% ≤ P(leak) ≤ 95% |
| Unmeasurable | Operation too fast to measure reliably | N/A |
Handling all outcomes
```rust
use tacet::{TimingOracle, AttackerModel, Outcome, helpers::InputPair};

let inputs = InputPair::new(|| [0u8; 32], || rand::random());

let outcome = TimingOracle::for_attacker(AttackerModel::AdjacentNetwork)
    .test(inputs, |data| my_function(&data));

match outcome {
    Outcome::Pass { leak_probability, quality, .. } => {
        println!("No leak detected (P={:.1}%)", leak_probability * 100.0);
        println!("Measurement quality: {:?}", quality);
    }
    Outcome::Fail { leak_probability, effect, .. } => {
        println!("Timing leak detected (P={:.1}%)", leak_probability * 100.0);
        println!("W₁ distance: {:.1}ns", effect.w1_distance_ns);
        if let Some(tail_diag) = &effect.tail_diagnostics {
            println!("Pattern: {:?}", tail_diag.pattern_label);
            println!("Shift: {:.1}ns, Tail: {:.1}ns ({:.0}% from tail)",
                tail_diag.shift_ns, tail_diag.tail_ns, tail_diag.tail_share * 100.0);
        }
    }
    Outcome::Inconclusive { reason, leak_probability, .. } => {
        println!("Inconclusive: {:?}", reason);
        println!("Current estimate: P={:.1}%", leak_probability * 100.0);
    }
    Outcome::Unmeasurable { recommendation, .. } => {
        println!("Cannot measure: {}", recommendation);
    }
}
```

```go
inputs := tacet.NewInputPair(
    func() []byte { return make([]byte, 32) },
    func() []byte { b := make([]byte, 32); rand.Read(b); return b },
)

outcome := tacet.ForAttacker(tacet.AdjacentNetwork).
    Test(inputs, func(data []byte) { myFunction(data) })

switch o := outcome.(type) {
case *tacet.Pass:
    fmt.Printf("No leak (P=%.1f%%)\n", o.LeakProbability*100)
case *tacet.Fail:
    fmt.Printf("Leak detected (P=%.1f%%)\n", o.LeakProbability*100)
case *tacet.Inconclusive:
    fmt.Printf("Inconclusive: %s\n", o.Reason)
case *tacet.Unmeasurable:
    fmt.Printf("Cannot measure: %s\n", o.Recommendation)
}
```

```c
tacet_outcome_t outcome;
tacet_test(oracle, inputs, measure_fn, &outcome);

switch (outcome.tag) {
case TIMING_ORACLE_PASS:
    printf("No leak (P=%.1f%%)\n", outcome.pass.leak_probability * 100);
    break;
case TIMING_ORACLE_FAIL:
    printf("Leak detected (P=%.1f%%)\n", outcome.fail.leak_probability * 100);
    break;
case TIMING_ORACLE_INCONCLUSIVE:
    printf("Inconclusive: %s\n", outcome.inconclusive.reason);
    break;
case TIMING_ORACLE_UNMEASURABLE:
    printf("Cannot measure: %s\n", outcome.unmeasurable.recommendation);
    break;
}
```

Understanding leak probability
The leak_probability is a Bayesian posterior probability: given the timing data collected, what’s the probability that the maximum effect exceeds your threshold?
P(leak) = P(max|Δ| > θ | data)

This is different from a p-value:
| Bayesian posterior | Frequentist p-value |
|---|---|
| "72% probability of a leak" | "If there were no leak, we'd see this data 28% of the time" |
| Directly interpretable | Requires careful interpretation |
| Converges with more data | Can produce more false positives with more data |
Probability thresholds
| Range | Interpretation | Outcome |
|---|---|---|
| P < 5% | Confident there’s no exploitable leak | Pass |
| 5% ≤ P ≤ 50% | Probably no leak, but uncertain | Inconclusive |
| 50% < P < 95% | Probably a leak, but uncertain | Inconclusive |
| P > 95% | Confident there’s an exploitable leak | Fail |
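The decision rule in the table above can be sketched in a few lines. This is an illustrative standalone function, not part of the tacet API; the 0.05 and 0.95 cutoffs are the defaults described on this page.

```rust
// Hypothetical sketch of the default decision rule (not the tacet API).
#[derive(Debug, PartialEq)]
enum Decision {
    Pass,
    Fail,
    Inconclusive,
}

// Map a posterior leak probability to an outcome using the default cutoffs.
fn decide(leak_probability: f64) -> Decision {
    if leak_probability < 0.05 {
        Decision::Pass // confident there is no exploitable leak
    } else if leak_probability > 0.95 {
        Decision::Fail // confident there is an exploitable leak
    } else {
        Decision::Inconclusive // anywhere in between: no confident call
    }
}

fn main() {
    assert_eq!(decide(0.02), Decision::Pass);
    assert_eq!(decide(0.72), Decision::Inconclusive);
    assert_eq!(decide(0.99), Decision::Fail);
}
```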
You can adjust these thresholds for stricter or more lenient testing:
```rust
TimingOracle::for_attacker(AttackerModel::AdjacentNetwork)
    .pass_threshold(0.01) // Require P < 1% for pass (stricter)
    .fail_threshold(0.99) // Require P > 99% for fail (stricter)
```

```js
TimingOracle
    .forAttacker(AttackerModel.AdjacentNetwork)
    .passThreshold(0.01) // Require P < 1% for pass (stricter)
    .failThreshold(0.99) // Require P > 99% for fail (stricter)
```

```c
ToConfig cfg = to_config_default(AdjacentNetwork);
cfg.pass_threshold = 0.01; // Require P < 1% for pass (stricter)
cfg.fail_threshold = 0.99; // Require P > 99% for fail (stricter)
```

```cpp
auto oracle = Oracle::forAttacker(ToAttackerModel::AdjacentNetwork)
    .passThreshold(0.01)  // Require P < 1% for pass (stricter)
    .failThreshold(0.99); // Require P > 99% for fail (stricter)
```

```go
result, err := tacet.Test(
    gen, op, inputSize,
    tacet.WithAttacker(tacet.AdjacentNetwork),
    tacet.WithPassThreshold(0.01), // Require P < 1% for pass (stricter)
    tacet.WithFailThreshold(0.99), // Require P > 99% for fail (stricter)
)
```

Effect decomposition
When a leak is detected, the effect field provides the W₁ distance (Wasserstein-1 metric) and decomposes it into shift and tail components to help you understand the nature of the timing leak.
W₁ distance
The W₁ distance measures the “cost” of transforming one timing distribution into another. For timing analysis, it approximates the sum of two components:
W₁ ≈ |shift| + tail
- Shift (μ): A uniform timing difference affecting all measurements. This is the median of rank-matched differences between baseline and sample distributions. Typical cause: the code consistently takes a different branch.
- Tail (τ): Additional effect concentrated in the upper quantiles (slower measurements). This is computed as the mean of residuals after removing the shift component. Typical cause: data-dependent cache misses or memory access patterns that only trigger for certain inputs.
- Tail share: The fraction of the total effect coming from the tail: tail_share = tail / (|shift| + tail). Values near 1.0 indicate a pure tail effect; values near 0.0 indicate a pure uniform shift.
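As a worked sketch of the decomposition above, suppose a leak has a 3ns uniform shift and a 9ns tail component (the numbers are made up for illustration):

```rust
// Hypothetical worked example of W₁ ≈ |shift| + tail and tail_share.
fn main() {
    let shift_ns: f64 = 3.0; // uniform shift component (μ)
    let tail_ns: f64 = 9.0;  // upper-quantile tail component (τ)

    // Approximate W₁ distance as the sum of the two components.
    let w1_ns = shift_ns.abs() + tail_ns;

    // Fraction of the total effect coming from the tail.
    let tail_share = tail_ns / (shift_ns.abs() + tail_ns);

    assert!((w1_ns - 12.0).abs() < 1e-9);
    assert!((tail_share - 0.75).abs() < 1e-9); // 75% of the effect is tail
    println!("W1 ≈ {w1_ns}ns, tail_share = {tail_share}");
}
```

With tail_share = 0.75 ≥ 0.5, this example would be classified as a tail-dominated effect under the pattern table later on this page.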
```rust
match outcome {
    Outcome::Fail { effect, .. } => {
        println!("W₁ distance: {:.1}ns", effect.w1_distance_ns);
        println!("95% CI: [{:.1}, {:.1}]ns",
            effect.credible_interval_ns.0, effect.credible_interval_ns.1);

        if let Some(tail_diag) = &effect.tail_diagnostics {
            println!("Shift: {:.1}ns", tail_diag.shift_ns);
            println!("Tail: {:.1}ns", tail_diag.tail_ns);
            println!("Tail share: {:.1}%", tail_diag.tail_share * 100.0);
            println!("Pattern: {:?}", tail_diag.pattern_label);
        }
    }
    _ => {}
}
```

```js
if (result.isFail()) {
    const effect = result.effect;
    console.log(`W₁ distance: ${effect.w1DistanceNs.toFixed(1)}ns`);
    console.log(`95% CI: [${effect.credibleIntervalNs[0].toFixed(1)}, ${effect.credibleIntervalNs[1].toFixed(1)}]ns`);

    if (effect.tailDiagnostics) {
        const tail = effect.tailDiagnostics;
        console.log(`Shift: ${tail.shiftNs.toFixed(1)}ns`);
        console.log(`Tail: ${tail.tailNs.toFixed(1)}ns`);
        console.log(`Tail share: ${(tail.tailShare * 100).toFixed(1)}%`);
        console.log(`Pattern: ${tail.patternLabel}`);
    }
}
```

```c
if (result.outcome == Fail) {
    printf("W₁ distance: %.1f ns\n", result.effect.w1_distance_ns);
    printf("95%% CI: [%.1f, %.1f] ns\n",
        result.effect.credible_interval_lower_ns,
        result.effect.credible_interval_upper_ns);

    if (result.effect.has_tail_diagnostics) {
        printf("Shift: %.1f ns\n", result.effect.tail_diagnostics.shift_ns);
        printf("Tail: %.1f ns\n", result.effect.tail_diagnostics.tail_ns);
        printf("Tail share: %.1f%%\n", result.effect.tail_diagnostics.tail_share * 100);
        printf("Pattern: %s\n", result.effect.tail_diagnostics.pattern_label);
    }
}
```

```cpp
if (result.outcome == ToOutcome::Fail) {
    std::cout << "W₁ distance: " << result.effect.w1_distance_ns << " ns\n";
    std::cout << "95% CI: [" << result.effect.credible_interval_lower_ns
              << ", " << result.effect.credible_interval_upper_ns << "] ns\n";

    if (result.effect.has_tail_diagnostics) {
        auto& tail = result.effect.tail_diagnostics;
        std::cout << "Shift: " << tail.shift_ns << " ns\n";
        std::cout << "Tail: " << tail.tail_ns << " ns\n";
        std::cout << "Tail share: " << (tail.tail_share * 100) << "%\n";
        std::cout << "Pattern: " << tail.pattern_label << "\n";
    }
}
```

```go
if result.Outcome == tacet.Fail {
    fmt.Printf("W₁ distance: %.1f ns\n", result.Effect.W1DistanceNs)
    fmt.Printf("95%% CI: [%.1f, %.1f] ns\n",
        result.Effect.CredibleIntervalLowerNs, result.Effect.CredibleIntervalUpperNs)

    if result.Effect.HasTailDiagnostics {
        tail := result.Effect.TailDiagnostics
        fmt.Printf("Shift: %.1f ns\n", tail.ShiftNs)
        fmt.Printf("Tail: %.1f ns\n", tail.TailNs)
        fmt.Printf("Tail share: %.1f%%\n", tail.TailShare*100)
        fmt.Printf("Pattern: %s\n", tail.PatternLabel)
    }
}
```

Effect patterns
The pattern_label classifies the leak based on its shift and tail decomposition:
| Pattern | Condition | Interpretation | Typical cause |
|---|---|---|---|
| Negligible | W₁ < 1.0ns | Effect too small to matter | Measurement noise |
| TailEffect | tail_share ≥ 0.5 | Leak concentrated in upper quantiles | Cache misses, memory access patterns, rare branches |
| UniformShift | tail_share < 0.3 | All measurements shifted equally | Consistent branch on secret bit, different code path |
| Mixed | 0.3 ≤ tail_share < 0.5 | Both shift and tail contribute | Multiple leak sources interacting |
Interpreting quantile shifts
For non-negligible effects, tail_diagnostics may include quantile_shifts, showing how individual quantiles differ between baseline and sample distributions:
```rust
if let Some(tail_diag) = &effect.tail_diagnostics {
    if let Some(quantile_shifts) = &tail_diag.quantile_shifts {
        for (q, shift) in quantile_shifts {
            println!("q{}: {:.1}ns shift", q, shift);
        }
    }
}
```

This reveals which parts of the distribution are most affected. For example, if q95-q99 show large shifts but q50-q90 do not, the leak is concentrated in the tail (slow outliers).
Handling inconclusive results
An Inconclusive outcome means the test couldn’t reach a confident decision. The reason field tells you why:
| Reason | Meaning | What to do |
|---|---|---|
| DataTooNoisy | Measurement noise is too high | Reduce system noise, use TimerSpec::CyclePrecision, or increase time budget |
| NotLearning | Posterior stopped updating | Check for measurement setup issues |
| TimeBudgetExceeded | Ran out of time before reaching conclusion | Increase time_budget |
| SampleBudgetExceeded | Hit sample limit | Increase max_samples |
| WouldTakeTooLong | Projected time to conclusion exceeds budget | Accept uncertainty or increase budget |
```rust
match outcome {
    Outcome::Inconclusive { reason, leak_probability, .. } => {
        match reason {
            InconclusiveReason::DataTooNoisy => {
                eprintln!("Environment too noisy for reliable measurement");
            }
            InconclusiveReason::TimeBudgetExceeded => {
                eprintln!("Need more time; current estimate: P={:.0}%",
                    leak_probability * 100.0);
            }
            _ => {
                eprintln!("Inconclusive: {:?}", reason);
            }
        }
    }
    _ => {}
}
```

```js
if (result.isInconclusive()) {
    const reason = result.inconclusiveReason;
    if (reason === 'DataTooNoisy') {
        console.error('Environment too noisy for reliable measurement');
    } else if (reason === 'TimeBudgetExceeded') {
        console.error(`Need more time; current estimate: P=${(result.leakProbability * 100).toFixed(0)}%`);
    } else {
        console.error(`Inconclusive: ${reason}`);
    }
}
```

```c
if (result.outcome == Inconclusive) {
    if (strcmp(result.inconclusive.reason, "DataTooNoisy") == 0) {
        fprintf(stderr, "Environment too noisy for reliable measurement\n");
    } else if (strcmp(result.inconclusive.reason, "TimeBudgetExceeded") == 0) {
        fprintf(stderr, "Need more time; current estimate: P=%.0f%%\n",
            result.leak_probability * 100.0);
    } else {
        fprintf(stderr, "Inconclusive: %s\n", result.inconclusive.reason);
    }
}
```

```cpp
if (result.outcome == ToOutcome::Inconclusive) {
    std::string reason = result.inconclusive.reason;
    if (reason == "DataTooNoisy") {
        std::cerr << "Environment too noisy for reliable measurement\n";
    } else if (reason == "TimeBudgetExceeded") {
        std::cerr << "Need more time; current estimate: P="
                  << (result.leak_probability * 100) << "%\n";
    } else {
        std::cerr << "Inconclusive: " << reason << "\n";
    }
}
```

```go
if result.Outcome == tacet.Inconclusive {
    switch result.InconclusiveReason {
    case "DataTooNoisy":
        fmt.Fprintf(os.Stderr, "Environment too noisy for reliable measurement\n")
    case "TimeBudgetExceeded":
        fmt.Fprintf(os.Stderr, "Need more time; current estimate: P=%.0f%%\n",
            result.LeakProbability*100)
    default:
        fmt.Fprintf(os.Stderr, "Inconclusive: %s\n", result.InconclusiveReason)
    }
}
```

Handling unmeasurable results
An Unmeasurable outcome means the operation completes faster than the timer can reliably measure:
```rust
match outcome {
    Outcome::Unmeasurable { operation_ns, threshold_ns, recommendation, .. } => {
        println!("Operation takes ~{:.0}ns, need >{:.0}ns to measure",
            operation_ns, threshold_ns);
        println!("{}", recommendation);
    }
    _ => {}
}
```

```js
if (result.isUnmeasurable()) {
    console.log(`Operation takes ~${result.operationNs.toFixed(0)}ns, need >${result.thresholdNs.toFixed(0)}ns to measure`);
    console.log(result.recommendation);
}
```

```c
if (result.outcome == Unmeasurable) {
    printf("Operation takes ~%.0f ns, need >%.0f ns to measure\n",
        result.unmeasurable.operation_ns, result.unmeasurable.threshold_ns);
    printf("%s\n", result.unmeasurable.recommendation);
}
```

```cpp
if (result.outcome == ToOutcome::Unmeasurable) {
    std::cout << "Operation takes ~" << result.unmeasurable.operation_ns
              << "ns, need >" << result.unmeasurable.threshold_ns << "ns to measure\n";
    std::cout << result.unmeasurable.recommendation << "\n";
}
```

```go
if result.Outcome == tacet.Unmeasurable {
    fmt.Printf("Operation takes ~%.0f ns, need >%.0f ns to measure\n",
        result.OperationNs, result.ThresholdNs)
    fmt.Println(result.Recommendation)
}
```

Common solutions:
- Use TimerSpec::CyclePrecision for finer resolution (see Measurement Precision)
- Test a larger input size or more iterations
- Accept that ultra-fast operations may not be measurable on your platform
Quality indicators
The quality field indicates overall measurement reliability:
| Quality | MDE | Meaning |
|---|---|---|
| Excellent | < 5ns | Cycle-level precision |
| Good | 5-20ns | Suitable for most testing |
| Poor | 20-100ns | May miss small leaks |
| TooNoisy | > 100ns | Results unreliable |
MDE (Minimum Detectable Effect) is the smallest timing difference that could be reliably detected given the noise level.
Threshold diagnostics
The diagnostics tell you if threshold elevation occurred:
```rust
match outcome {
    Outcome::Pass { diagnostics, .. } | Outcome::Fail { diagnostics, .. } => {
        if diagnostics.theta_eff > diagnostics.theta_user * 1.1 {
            println!("Note: Threshold elevated from {:.1}ns to {:.1}ns",
                diagnostics.theta_user, diagnostics.theta_eff);
            println!("See: /core-concepts/measurement-precision");
        }
    }
    _ => {}
}
```

```js
if (result.isPass() || result.isFail()) {
    const diag = result.diagnostics;
    if (diag.thetaEff > diag.thetaUser * 1.1) {
        console.log(`Note: Threshold elevated from ${diag.thetaUser.toFixed(1)}ns to ${diag.thetaEff.toFixed(1)}ns`);
        console.log('See: /core-concepts/measurement-precision');
    }
}
```

```c
if (result.outcome == Pass || result.outcome == Fail) {
    if (result.diagnostics.theta_eff > result.diagnostics.theta_user * 1.1) {
        printf("Note: Threshold elevated from %.1f ns to %.1f ns\n",
            result.diagnostics.theta_user, result.diagnostics.theta_eff);
        printf("See: /core-concepts/measurement-precision\n");
    }
}
```

```cpp
if (result.outcome == ToOutcome::Pass || result.outcome == ToOutcome::Fail) {
    if (result.diagnostics.theta_eff > result.diagnostics.theta_user * 1.1) {
        std::cout << "Note: Threshold elevated from " << result.diagnostics.theta_user
                  << "ns to " << result.diagnostics.theta_eff << "ns\n";
        std::cout << "See: /core-concepts/measurement-precision\n";
    }
}
```

```go
if result.Outcome == tacet.Pass || result.Outcome == tacet.Fail {
    if result.Diagnostics.ThetaEff > result.Diagnostics.ThetaUser*1.1 {
        fmt.Printf("Note: Threshold elevated from %.1f ns to %.1f ns\n",
            result.Diagnostics.ThetaUser, result.Diagnostics.ThetaEff)
        fmt.Println("See: /core-concepts/measurement-precision")
    }
}
```

| Field | Meaning |
|---|---|
| theta_user | The threshold you requested (via attacker model) |
| theta_floor | The minimum detectable effect for this setup |
| theta_eff | The threshold actually used: max(theta_user, theta_floor) |
When theta_eff > theta_user, your results answer “is there a leak above theta_eff?” rather than “is there a leak above theta_user?”. See Measurement Precision for details.
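The elevation rule is just a max over the two thresholds; as a small sketch with hypothetical values (not tacet code):

```rust
// Hypothetical sketch of threshold elevation: theta_eff = max(theta_user, theta_floor).
fn main() {
    let theta_user: f64 = 10.0;  // threshold requested via the attacker model, in ns
    let theta_floor: f64 = 25.0; // minimum detectable effect for this setup, in ns

    // The threshold actually used for the decision.
    let theta_eff = theta_user.max(theta_floor);

    assert_eq!(theta_eff, 25.0); // elevated: results now answer "leak above 25ns?"
    println!("theta_eff = {theta_eff}ns");
}
```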
Summary
- Pass (P < 5%): No leak detected above your threshold
- Fail (P > 95%): Leak confirmed, with effect size and exploitability assessment
- Inconclusive: Check the reason; often fixable by adjusting budget or environment
- Unmeasurable: Operation too fast; use TimerSpec::CyclePrecision or test a larger workload
- Leak probability is Bayesian (converges with more data, unlike p-values)
- Effect decomposition tells you how the leak manifests (shift vs tail)
- Check theta_eff vs theta_user to understand if the threshold was elevated