Interpreting Results

When a timing test completes, you get an Outcome that tells you whether a timing leak was detected, how confident the result is, and details about the effect size. This page explains how to read these results.

Every test returns one of four outcomes:

| Outcome | Meaning | Probability range |
|---|---|---|
| Pass | No timing leak detected above your threshold | P(leak) < 5% |
| Fail | Timing leak confirmed above your threshold | P(leak) > 95% |
| Inconclusive | Cannot reach a confident decision | 5% ≤ P(leak) ≤ 95% |
| Unmeasurable | Operation too fast to measure reliably | N/A |
A typical way to handle each case:

```rust
use tacet::{TimingOracle, AttackerModel, Outcome, helpers::InputPair};

let inputs = InputPair::new(|| [0u8; 32], || rand::random());
let outcome = TimingOracle::for_attacker(AttackerModel::AdjacentNetwork)
    .test(inputs, |data| my_function(&data));

match outcome {
    Outcome::Pass { leak_probability, quality, .. } => {
        println!("No leak detected (P={:.1}%)", leak_probability * 100.0);
        println!("Measurement quality: {:?}", quality);
    }
    Outcome::Fail { leak_probability, effect, .. } => {
        println!("Timing leak detected (P={:.1}%)", leak_probability * 100.0);
        println!("W₁ distance: {:.1}ns", effect.w1_distance_ns);
        if let Some(tail_diag) = &effect.tail_diagnostics {
            println!("Pattern: {:?}", tail_diag.pattern_label);
            println!("Shift: {:.1}ns, Tail: {:.1}ns ({:.0}% from tail)",
                tail_diag.shift_ns, tail_diag.tail_ns,
                tail_diag.tail_share * 100.0);
        }
    }
    Outcome::Inconclusive { reason, leak_probability, .. } => {
        println!("Inconclusive: {:?}", reason);
        println!("Current estimate: P={:.1}%", leak_probability * 100.0);
    }
    Outcome::Unmeasurable { recommendation, .. } => {
        println!("Cannot measure: {}", recommendation);
    }
}
```

The leak_probability is a Bayesian posterior probability: given the timing data collected, what’s the probability that the maximum effect exceeds your threshold?

P(leak) = P(max|Δ| > θ | data)

This is different from a p-value:

| Bayesian posterior | Frequentist p-value |
|---|---|
| "72% probability of a leak" | "If there were no leak, we'd see this data 28% of the time" |
| Directly interpretable | Requires careful interpretation |
| Converges with more data | Can produce more false positives with more data |
The probability maps to outcomes as follows:

| Range | Interpretation | Outcome |
|---|---|---|
| P < 5% | Confident there's no exploitable leak | Pass |
| 5% ≤ P ≤ 50% | Probably no leak, but uncertain | Inconclusive |
| 50% < P < 95% | Probably a leak, but uncertain | Inconclusive |
| P > 95% | Confident there's an exploitable leak | Fail |
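Assuming the default thresholds, the mapping above can be sketched as a small helper (hypothetical code, not part of tacet's API; Unmeasurable is decided separately and has no associated probability):

```rust
/// Hypothetical helper, not part of tacet's API: the default 5% / 95%
/// decision rule from the table above.
fn classify(p_leak: f64) -> &'static str {
    if p_leak < 0.05 {
        "Pass"
    } else if p_leak > 0.95 {
        "Fail"
    } else {
        "Inconclusive"
    }
}

fn main() {
    println!("{}", classify(0.02)); // Pass
    println!("{}", classify(0.72)); // Inconclusive
    println!("{}", classify(0.99)); // Fail
}
```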

You can adjust these thresholds for stricter or more lenient testing:

```rust
TimingOracle::for_attacker(AttackerModel::AdjacentNetwork)
    .pass_threshold(0.01) // Require P < 1% for pass (stricter)
    .fail_threshold(0.99) // Require P > 99% for fail (stricter)
```

When a leak is detected, the effect field provides the W₁ distance (Wasserstein-1 metric) and decomposes it into shift and tail components to help you understand the nature of the timing leak.

The W₁ distance measures the “cost” of transforming one timing distribution into another. For timing analysis, it approximates the sum of two components:

W₁ ≈ |shift| + tail

  • Shift (μ): A uniform timing difference affecting all measurements. This is the median of rank-matched differences between baseline and sample distributions. Typical cause: the code consistently takes a different branch.

  • Tail (τ): Additional effect concentrated in the upper quantiles (slower measurements). This is computed as the mean of residuals after removing the shift component. Typical cause: data-dependent cache misses or memory access patterns that only trigger for certain inputs.

  • Tail share: The fraction of the total effect coming from the tail: tail_share = tail / (|shift| + tail). Values near 1.0 indicate a pure tail effect; values near 0.0 indicate a pure uniform shift.
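To make the decomposition concrete, here is a minimal sketch of the computation as described above (an illustration, not tacet's actual implementation):

```rust
/// Illustrative sketch of the shift/tail decomposition described above;
/// not tacet's actual implementation. Inputs are equal-length, sorted
/// timing samples in nanoseconds.
fn decompose(baseline: &[f64], sample: &[f64]) -> (f64, f64) {
    // Rank-matched differences: i-th sorted sample minus i-th sorted baseline.
    let diffs: Vec<f64> = sample.iter().zip(baseline).map(|(s, b)| s - b).collect();

    // Shift: median of the rank-matched differences.
    let mut sorted = diffs.clone();
    sorted.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let shift = sorted[sorted.len() / 2];

    // Tail: mean residual after removing the uniform shift.
    let tail = diffs.iter().map(|d| d - shift).sum::<f64>() / diffs.len() as f64;
    (shift, tail)
}

fn main() {
    // Baseline constant at 100ns; sample matches except the top 20% is 50ns slower.
    let baseline = [100.0; 10];
    let mut sample = [100.0; 10];
    sample[8] = 150.0;
    sample[9] = 150.0;

    let (shift, tail) = decompose(&baseline, &sample);
    let tail_share = tail / (shift.abs() + tail);
    // A pure tail effect: no uniform shift, all effect in the upper quantiles.
    println!("shift={shift:.1}ns tail={tail:.1}ns tail_share={tail_share:.2}");
}
```

Note how W₁ ≈ |shift| + tail holds here: the shift is zero and the whole effect shows up as tail, giving a tail share of 1.0.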

```rust
match outcome {
    Outcome::Fail { effect, .. } => {
        println!("W₁ distance: {:.1}ns", effect.w1_distance_ns);
        println!("95% CI: [{:.1}, {:.1}]ns",
            effect.credible_interval_ns.0,
            effect.credible_interval_ns.1);
        if let Some(tail_diag) = &effect.tail_diagnostics {
            println!("Shift: {:.1}ns", tail_diag.shift_ns);
            println!("Tail: {:.1}ns", tail_diag.tail_ns);
            println!("Tail share: {:.1}%", tail_diag.tail_share * 100.0);
            println!("Pattern: {:?}", tail_diag.pattern_label);
        }
    }
    _ => {}
}
```

The pattern_label classifies the leak based on its shift and tail decomposition:

| Pattern | Condition | Interpretation | Typical cause |
|---|---|---|---|
| Negligible | W₁ < 1.0ns | Effect too small to matter | Measurement noise |
| TailEffect | tail_share ≥ 0.5 | Leak concentrated in upper quantiles | Cache misses, memory access patterns, rare branches |
| UniformShift | tail_share < 0.3 | All measurements shifted equally | Consistent branch on secret bit, different code path |
| Mixed | 0.3 ≤ tail_share < 0.5 | Both shift and tail contribute | Multiple leak sources interacting |
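The classification rules in the table can be expressed as a small function (a hypothetical reimplementation for illustration; tacet assigns pattern_label internally):

```rust
/// Hypothetical reimplementation of the classification table above,
/// for illustration only.
fn pattern_label(w1_ns: f64, tail_share: f64) -> &'static str {
    if w1_ns < 1.0 {
        "Negligible"
    } else if tail_share >= 0.5 {
        "TailEffect"
    } else if tail_share < 0.3 {
        "UniformShift"
    } else {
        "Mixed"
    }
}

fn main() {
    println!("{}", pattern_label(0.4, 0.9));  // Negligible: below the 1ns floor
    println!("{}", pattern_label(25.0, 0.8)); // TailEffect
    println!("{}", pattern_label(25.0, 0.1)); // UniformShift
    println!("{}", pattern_label(25.0, 0.4)); // Mixed
}
```

Note that the Negligible check wins regardless of tail share: a sub-nanosecond effect is not worth characterizing.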

For non-negligible effects, tail_diagnostics may include quantile_shifts, showing how individual quantiles differ between baseline and sample distributions:

```rust
if let Some(tail_diag) = &effect.tail_diagnostics {
    if let Some(quantile_shifts) = &tail_diag.quantile_shifts {
        for (q, shift) in quantile_shifts {
            println!("q{}: {:.1}ns shift", q, shift);
        }
    }
}
```

This reveals which parts of the distribution are most affected. For example, if q95-q99 show large shifts but q50-q90 do not, the leak is concentrated in the tail (slow outliers).
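One way to apply that rule of thumb in your own reporting (a hedged sketch; the 3x factor is an arbitrary choice, not something tacet defines, and the `(quantile, shift_ns)` pair layout is assumed from the loop above):

```rust
/// Illustrative heuristic: the leak looks tail-concentrated when the
/// upper quantiles shift much more than the median does. The 3x factor
/// is an arbitrary example threshold.
fn tail_concentrated(quantile_shifts: &[(u8, f64)]) -> bool {
    let shift_at = |target: u8| {
        quantile_shifts.iter()
            .find(|(q, _)| *q == target)
            .map(|(_, s)| s.abs())
    };
    match (shift_at(50), shift_at(95)) {
        (Some(median), Some(upper)) => upper > 3.0 * median,
        _ => false,
    }
}

fn main() {
    let tail_heavy = [(50u8, 1.8), (90, 4.0), (95, 31.0), (99, 64.0)];
    let uniform = [(50u8, 12.0), (90, 12.5), (95, 13.0), (99, 13.2)];
    println!("{}", tail_concentrated(&tail_heavy)); // true
    println!("{}", tail_concentrated(&uniform));    // false
}
```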

An Inconclusive outcome means the test couldn’t reach a confident decision. The reason field tells you why:

| Reason | Meaning | What to do |
|---|---|---|
| DataTooNoisy | Measurement noise is too high | Reduce system noise, use TimerSpec::CyclePrecision, or increase the time budget |
| NotLearning | Posterior stopped updating | Check for measurement setup issues |
| TimeBudgetExceeded | Ran out of time before reaching a conclusion | Increase time_budget |
| SampleBudgetExceeded | Hit the sample limit | Increase max_samples |
| WouldTakeTooLong | Projected time to conclusion exceeds the budget | Accept the uncertainty or increase the budget |

```rust
match outcome {
    Outcome::Inconclusive { reason, leak_probability, .. } => {
        match reason {
            InconclusiveReason::DataTooNoisy => {
                eprintln!("Environment too noisy for reliable measurement");
            }
            InconclusiveReason::TimeBudgetExceeded => {
                eprintln!("Need more time; current estimate: P={:.0}%", leak_probability * 100.0);
            }
            _ => {
                eprintln!("Inconclusive: {:?}", reason);
            }
        }
    }
    _ => {}
}
```

An Unmeasurable outcome means the operation completes faster than the timer can reliably measure:

```rust
match outcome {
    Outcome::Unmeasurable { operation_ns, threshold_ns, recommendation, .. } => {
        println!("Operation takes ~{:.0}ns, need >{:.0}ns to measure",
            operation_ns, threshold_ns);
        println!("{}", recommendation);
    }
    _ => {}
}
```

Common solutions:

  • Use TimerSpec::CyclePrecision for finer resolution (see Measurement Precision)
  • Test a larger input size or more iterations
  • Accept that ultra-fast operations may not be measurable on your platform

The quality field indicates overall measurement reliability:

| Quality | MDE | Meaning |
|---|---|---|
| Excellent | < 5ns | Cycle-level precision |
| Good | 5-20ns | Suitable for most testing |
| Poor | 20-100ns | May miss small leaks |
| TooNoisy | > 100ns | Results unreliable |

MDE (Minimum Detectable Effect) is the smallest timing difference that could be reliably detected given the noise level.
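The table's thresholds can be sketched as follows (illustrative only; how tacet handles values at exactly 5ns, 20ns, or 100ns is an assumption here):

```rust
/// Hypothetical mapping from MDE to the quality levels in the table
/// above. Boundary handling (<= vs <) is an assumption.
fn quality(mde_ns: f64) -> &'static str {
    if mde_ns < 5.0 {
        "Excellent"
    } else if mde_ns <= 20.0 {
        "Good"
    } else if mde_ns <= 100.0 {
        "Poor"
    } else {
        "TooNoisy"
    }
}

fn main() {
    for mde in [2.0, 12.0, 60.0, 250.0] {
        println!("MDE {mde:.1}ns -> {}", quality(mde));
    }
}
```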

The diagnostics field tells you whether threshold elevation occurred:

```rust
match outcome {
    Outcome::Pass { diagnostics, .. } | Outcome::Fail { diagnostics, .. } => {
        if diagnostics.theta_eff > diagnostics.theta_user * 1.1 {
            println!("Note: Threshold elevated from {:.1}ns to {:.1}ns",
                diagnostics.theta_user, diagnostics.theta_eff);
            println!("See: /core-concepts/measurement-precision");
        }
    }
    _ => {}
}
```

| Field | Meaning |
|---|---|
| theta_user | The threshold you requested (via attacker model) |
| theta_floor | The minimum detectable effect for this setup |
| theta_eff | The threshold actually used: max(theta_user, theta_floor) |

When theta_eff > theta_user, your results answer “is there a leak above theta_eff?” rather than “is there a leak above theta_user?”. See Measurement Precision for details.
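The relationship is just a max (a sketch of the rule above, not tacet code):

```rust
/// The effective threshold is simply the larger of the requested and
/// floor thresholds (illustrative; tacet reports this as theta_eff).
fn effective_threshold(theta_user_ns: f64, theta_floor_ns: f64) -> f64 {
    theta_user_ns.max(theta_floor_ns)
}

fn main() {
    // Requested 10ns, but this setup can only resolve 25ns:
    // results now speak to leaks above 25ns, not 10ns.
    println!("{}", effective_threshold(10.0, 25.0)); // 25
}
```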

  • Pass (P < 5%): No leak detected above your threshold
  • Fail (P > 95%): Leak confirmed, with effect size and exploitability assessment
  • Inconclusive: Check the reason; often fixable by adjusting budget or environment
  • Unmeasurable: Operation too fast; use TimerSpec::CyclePrecision or test larger workload
  • Leak probability is Bayesian (converges with more data, unlike p-values)
  • Effect decomposition tells you how the leak manifests (shift vs tail)
  • Check theta_eff vs theta_user to understand if threshold was elevated