Understanding Backtest Results

Reading the Full Results Dashboard in Quanthop

A complete guide to reading backtest results in Quanthop — from the Metrics card to Behaviour profiling, Drawdown Envelope, Regime analysis, and the Viability Score.

22 min · Intermediate

The Results Dashboard

When a backtest completes, the platform does not hand you a single number and a verdict. It builds a dashboard of eight analysis cards and a composite score, each one designed to answer a different question about your strategy's historical performance.

This is deliberate. A strategy that returns 200% but suffered a 70% drawdown along the way is a very different proposition from one that returned 40% with a 12% drawdown. A strategy that wins 80% of its trades but depends entirely on a single outlier for profitability is fragile in ways that win rate alone will never reveal. The dashboard is built to surface these distinctions.

The eight cards are: Metrics, Behaviour, Trade Structure, Exposure, Regimes, Drawdown, Drawdown Envelope, and Parameters. On the right, a Viability Score panel distils the whole picture into a grade out of 100. Each card has its own internal logic, its own diagnostic profiles, and its own set of edge cases where it becomes the most important thing on the screen.

This article walks through every card in detail. If you've just run your first backtest and are looking at this dashboard for the first time, start here and read it front to back. If you're coming back to check a specific card, use the table of contents on the left.

Results Dashboard Overview
Tabs: Results, Trades, Chart, Monte Carlo, Editor
Metrics: Payoff, Sharpe, profit factor
Behaviour: Strategy profile classification
Trade Structure: Win/loss distribution
Exposure: Time in market, duration
Regimes: Volatility and trend mix
Drawdown: Depth, duration, recovery
DD Envelope: Monte Carlo simulation
Parameters: Config and validation stage
Viability Score: C, 60/100 (Profitability + Risk + Consistency + Robustness)
Eight analysis cards plus a composite viability score, each answering a different question about strategy performance.

Metrics: Structure and Edge

The Metrics card is the quantitative foundation. It contains the numbers most traders look at first — but it presents them in a specific order that reflects what actually matters.

Payoff Ratio

The payoff ratio (also called reward-to-risk ratio) is the average winning trade divided by the average losing trade. A payoff of 2.0x means your winners are twice as large as your losers. This is the structural fingerprint of a strategy: trend-following systems typically show high payoff ratios (2x-5x) with lower win rates, while mean-reversion systems show lower payoff ratios (0.8x-1.5x) with higher win rates.

What makes this metric important is that it defines the minimum win rate required for profitability. Ignoring costs, the break-even win rate is 1 / (1 + payoff): with a 2.0x payoff you need to win just over 33% of trades, and with a 1.0x payoff you need 50%. The relationship between payoff and win rate is what the platform calls the "structure" of the edge.
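The break-even relationship can be checked with a few lines of Python. This is a minimal sketch that ignores fees and slippage; the function name `breakeven_win_rate` is illustrative and not part of Quanthop.

```python
def breakeven_win_rate(payoff_ratio: float) -> float:
    """Win rate at which expectancy is zero, ignoring trading costs.

    With the average loss normalised to 1 and the average win equal to
    `payoff_ratio`, solve w * payoff_ratio - (1 - w) = 0 for w.
    """
    if payoff_ratio <= 0:
        raise ValueError("payoff_ratio must be positive")
    return 1.0 / (1.0 + payoff_ratio)


for payoff in (0.5, 1.0, 2.0, 3.0):
    rate = breakeven_win_rate(payoff)
    print(f"payoff {payoff:.1f}x -> break-even win rate {rate:.1%}")
```

Any win rate above the printed value is profitable before costs at that payoff; anything below it loses money.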

Profit Factor

Profit factor is the ratio of gross profits to gross losses. Above 1.0 means the strategy made money; below 1.0 means it lost money. But the useful thresholds are higher: below 1.2 is marginal, 1.5 suggests a real edge, and above 2.0 is strong. Because profit factor weighs every winning trade against every losing trade, it describes the edge more faithfully than total return does, although a handful of very large winners can still inflate it.
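To make the two definitions concrete, here is a small sketch that computes both the payoff ratio and the profit factor from a list of per-trade profit and loss values. The trade figures are invented for illustration, and break-even trades (exactly zero) are simply ignored, which is an assumption rather than documented platform behaviour.

```python
from statistics import mean


def payoff_and_profit_factor(pnls):
    """Return (payoff ratio, profit factor) for a list of per-trade P&L values."""
    wins = [p for p in pnls if p > 0]
    losses = [-p for p in pnls if p < 0]  # loss magnitudes as positive numbers
    if not wins or not losses:
        raise ValueError("need at least one winning and one losing trade")
    payoff = mean(wins) / mean(losses)       # average win / average loss
    profit_factor = sum(wins) / sum(losses)  # gross profit / gross loss
    return payoff, profit_factor


# Hypothetical trade P&L in dollars, for illustration only.
example_pnls = [300.0, -100.0, 250.0, -150.0, -50.0, 200.0, -100.0]
payoff, profit_factor = payoff_and_profit_factor(example_pnls)
print(f"payoff {payoff:.2f}x, profit factor {profit_factor:.3f}")
```

The two numbers differ whenever wins and losses are unequal in count: profit factor equals the payoff ratio multiplied by the ratio of winning trades to losing trades.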

Win Rate and Trade Count

Win rate alone is almost meaningless. A 60% win rate with a 0.5x payoff loses money. A 35% win rate with a 3.0x payoff is very profitable. The platform displays these together because they only make sense as a pair.

Trade count determines statistical confidence. Thirty trades is a minimum for any meaningful conclusion. Below that, the Metrics card will flag the sample as insufficient. At 50-100 trades, patterns become more reliable. Above 100 trades on daily data, you have a solid basis for analysis.

Sharpe Ratio (Annualised)

The Sharpe ratio measures return per unit of volatility, annualised. It answers the question: is this return worth the ride? A Sharpe above 1.0 is acceptable, above 1.5 is good, and above 2.0 is strong. Below 0.5 usually means the strategy is not generating enough return to justify its variance — even if total return looks good.

The annualised Sharpe is computed from per-trade returns, not from the equity curve directly. This makes it comparable across strategies with different trade frequencies.
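The article does not spell out the annualisation formula, so the following is only one plausible sketch: it takes the mean and sample standard deviation of per-trade returns and scales by the square root of the number of trades per year. The scaling rule, the zero risk-free default, and the function name are all assumptions.

```python
import math
from statistics import mean, stdev


def annualised_sharpe(trade_returns, years, risk_free_per_trade=0.0):
    """Sharpe ratio from per-trade returns, scaled by sqrt(trades per year).

    Assumption: trades are treated as independent, equally spaced samples.
    """
    if len(trade_returns) < 2:
        raise ValueError("need at least two trades")
    if years <= 0:
        raise ValueError("years must be positive")
    excess = [r - risk_free_per_trade for r in trade_returns]
    spread = stdev(excess)  # sample standard deviation
    if spread == 0:
        raise ValueError("returns have zero variance")
    trades_per_year = len(excess) / years
    return mean(excess) / spread * math.sqrt(trades_per_year)


# Hypothetical per-trade returns (fractions of equity) over one year.
print(round(annualised_sharpe([0.02, -0.01, 0.03, -0.01], years=1.0), 4))
```

Under this scaling, the same set of trades spread over four times as many years produces half the annualised Sharpe, which is why trade frequency matters when comparing strategies.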

Expectancy

Expectancy is the average dollar amount you can expect to win (or lose) per trade. It is calculated as: (win rate x average win) - (loss rate x average loss). Positive expectancy is the absolute minimum requirement for any strategy. Negative expectancy means you will lose money over time, regardless of how good individual trades might feel.
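The expectancy formula translates directly into code. In this sketch `avg_loss` is passed as a positive magnitude, and the example figures are hypothetical rather than taken from a real backtest.

```python
def expectancy(win_rate: float, avg_win: float, avg_loss: float) -> float:
    """Expected profit per trade: (win rate x avg win) - (loss rate x avg loss).

    `win_rate` is a fraction between 0 and 1; `avg_loss` is a positive magnitude.
    """
    if not 0.0 <= win_rate <= 1.0:
        raise ValueError("win_rate must be between 0 and 1")
    if avg_win < 0 or avg_loss < 0:
        raise ValueError("avg_win and avg_loss must be non-negative magnitudes")
    loss_rate = 1.0 - win_rate
    return win_rate * avg_win - loss_rate * avg_loss


# Hypothetical strategy: 44% win rate, $600 average win, $300 average loss.
print(f"${expectancy(0.44, 600.0, 300.0):.2f} per trade")

# In units of the average loss: a 35% win rate with a 3.0x payoff.
print(f"{expectancy(0.35, 3.0, 1.0):+.2f}R per trade")
```

Expressing wins in multiples of the average loss (the second call) makes strategies with different position sizes directly comparable.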

Why Total Return Is a Footnote

Total return appears at the bottom of the card, not the top. This is intentional. Return is the most misleading headline number in backtesting. A 500% return over 5 years on Bitcoin is largely the asset moving, not the strategy performing. The Metrics card prioritises the structure of the edge — the relationship between risk and reward — over the raw outcome.

For a complete walkthrough of the expanded Metrics modal — including the equity path, risk-adjusted panel, and interpretation logic — see the Performance Metrics Deep Dive.

Metrics
Structure & Edge
Payoff2.0x
Profit Factor1.52
Trade Characteristics
Trades62
Win Rate44%
Risk-Adjusted
Sharpe (ann.)1.49
Expectancy$124
Historical simulation results.
Historical return: +77.0%
The Metrics card: structure and edge first, total return last.

Behaviour Profiling

The Behaviour card does something no spreadsheet of numbers can do: it names the strategy's behavioural pattern and explains what caused the results you are seeing.

After computing all metrics, the platform classifies the strategy into one of several behaviour profiles based on the interaction between profit factor, win rate, payoff ratio, and trade count. This is not a cosmetic label — it determines the entire narrative of the card.

The Behaviour Profiles

Strong Edge Profile appears when profit factor is above 2.0. The strategy demonstrates a clear, measurable edge. The primary drivers will describe where that edge comes from — large winners, high accuracy, or both.

Payoff-Driven Trend Behaviour is assigned when the win rate is below 45% but the payoff ratio exceeds 2.0x. The strategy loses more often than it wins, but when it wins, it wins big. This is the classic trend-following signature. The card will highlight that outcomes depend on capturing extended moves rather than high accuracy.

High-Frequency Mean-Reversion Behaviour appears when the win rate exceeds 55% and the payoff ratio is below 1.5x. The strategy wins often but wins small. It relies on consistency rather than magnitude. The card will flag that a few large losses can disproportionately damage the equity curve.

Balanced Edge Profile is for strategies with both a decent win rate (above 50%) and a decent payoff ratio (above 1.5x). These are often the most desirable profiles because they do not depend on extreme outliers or extreme accuracy.

Mixed Behaviour Profile is the default when the strategy does not clearly fit any archetypal pattern. It is not necessarily bad — it just means the edge does not have a clean, categorical description.

Negative Edge Profile appears when profit factor is below 1.0. The strategy lost money. The card will explain why.

There are also profiles for insufficient data: Low-Sample Profile (fewer than 30 trades) and Low-Frequency, High-Asymmetry Profile (few trades but high payoff, where the result could easily be luck).
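The thresholds described above can be arranged into a simple rule cascade. The sketch below uses only the numbers stated in this section; the order in which the rules are checked is an assumption, and the Low-Frequency, High-Asymmetry Profile is left out because no numeric threshold is given for it. It should not be read as the platform's actual classifier.

```python
def classify_behaviour(profit_factor, win_rate, payoff, trades):
    """Map headline metrics to a behaviour profile name.

    `win_rate` is a fraction (0.44 for 44%). Rule order is an assumption.
    """
    if trades < 30:
        return "Low-Sample Profile"
    if profit_factor < 1.0:
        return "Negative Edge Profile"
    if profit_factor > 2.0:
        return "Strong Edge Profile"
    if win_rate < 0.45 and payoff > 2.0:
        return "Payoff-Driven Trend Behaviour"
    if win_rate > 0.55 and payoff < 1.5:
        return "High-Frequency Mean-Reversion Behaviour"
    if win_rate > 0.50 and payoff > 1.5:
        return "Balanced Edge Profile"
    return "Mixed Behaviour Profile"


# Profit factor 1.52, 44% win rate, 2.0x payoff, 62 trades.
print(classify_behaviour(1.52, 0.44, 2.0, 62))
```

With a payoff of exactly 2.0x the "exceeds 2.0x" condition is not met, so this input falls through to the Mixed Behaviour Profile.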

Primary Drivers

Below the profile name, the card lists up to three primary drivers — the specific factors that explain the outcome. Examples: "Large winners dominate outcomes", "Positive expectancy observed", "Losses are capped relative to wins", "Edge driven by accuracy, not asymmetry". These are computed from the actual metrics, not templated text.

Consistency Assessment

At the bottom, the card rates consistency within the tested period. "Stable within tested period" means the edge was maintained throughout. "Regime-dependent behaviour" means performance varied with market conditions. "Moderate sensitivity" suggests the strategy is partially exposed to changing conditions. This assessment ties directly into the Regimes card.

For a complete walkthrough of the expanded Behaviour modal — including win/loss asymmetry, streak analysis, trade sequencing, and the full behavioural profile — see the Trading Behaviour Analysis Deep Dive.

Behaviour
Mixed Behaviour Profile
Primary Drivers
Edge driven by asymmetry, not accuracy
Experiences extended losing streaks
Observed Conditions
Low win rate persists; outcomes rely on payoff asymmetry
Significant drawdowns during adverse periods
Consistency: Regime-dependent behaviour
The Behaviour card classifies the strategy's profile and explains what drove the results.

Trade Structure

The Trade Structure card answers the question: where did the profit actually come from? A strategy can have identical total return and Sharpe ratio but completely different profit distributions — and those distributions determine how fragile the result is.

Trade Structure Profiles

Concentrated Winner Profile means the top 20% of trades generated more than 60% of total profit. This is common in trend-following strategies and is not inherently bad — but it means the result depends heavily on a small number of trades. Remove the three best trades and the entire backtest might be unprofitable.

Asymmetric Payoff Structure appears when the win rate is below 45% but average winners are more than twice the size of average losers. The strategy relies on letting winners run far enough to compensate for frequent small losses.

High-Frequency Win Pattern identifies strategies with above 55% win rate and relatively small individual winners. Profit accumulates through volume, not magnitude. The risk is tail events — a single large loss can wipe out many small gains.

Balanced Trade Distribution is for strategies where wins and losses are roughly equal in frequency and size. These are typically the most robust to small changes in market conditions.

The Win/Loss Bar

The visual bar at the top shows the proportion of winning to losing trades. This is the simplest possible view of trade outcomes, but combined with the profile description below it, it tells you immediately whether the strategy wins by accuracy or by asymmetry.

Characteristics

The card also shows specific numbers: the share of total gains contributed by the top 20% of trades, the payoff ratio, and median trade duration. Median duration is important because it distinguishes quick scalping strategies from position-holding systems, and the risk profile is completely different for each.
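A rough way to reproduce the concentration figure is to rank trades by profit and measure how much of the gross gains the top fifth accounts for. Whether the platform divides by gross gains or by net profit is not stated here, so the denominator below is an assumption, as are the sample trades.

```python
import math


def top_share_of_gains(pnls, top_fraction=0.20):
    """Share of gross gains produced by the best `top_fraction` of all trades."""
    if not pnls:
        raise ValueError("no trades supplied")
    gross_gains = sum(p for p in pnls if p > 0)
    if gross_gains <= 0:
        raise ValueError("no winning trades")
    # Round before ceil to avoid float noise such as 3.0000000000000004.
    top_count = max(1, math.ceil(round(len(pnls) * top_fraction, 9)))
    best = sorted(pnls, reverse=True)[:top_count]
    return sum(p for p in best if p > 0) / gross_gains


# Hypothetical ten-trade sample, in dollars.
sample = [900, 400, 100, 50, 50, -100, -100, -150, -50, -100]
share = top_share_of_gains(sample)
print(f"top 20% of trades: {share:.0%} of gains")
print("concentrated" if share > 0.60 else "not concentrated")
```

Rerunning the calculation after deleting the two or three best trades is a quick fragility check for a Concentrated Winner Profile.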

For a complete walkthrough of the expanded Trade Distribution modal — including the profit contribution curve, outcome histogram, size-based breakdown, and structural interpretation — see the Trade Distribution Deep Dive.

Trade Structure
Concentrated Winner Profile
Wins (27) | Losses (35)
44% win rate | 62 trades
Characteristics
Top 20% trades: 81% of gains
Payoff ratio: 2.0x
Median duration: 3d
Profit driven by a subset of large winning trades.
Trade Structure shows where the profit actually comes from — concentrated in a few winners or spread across many.

Exposure Analysis

The Exposure card shows how the strategy occupies market time. Two strategies can have the same return, but one might be in the market 90% of the time while the other is only active 15% of the time. That difference changes everything about capital efficiency and opportunity cost.

Exposure Profiles

Continuous Market Engagement means the strategy is in the market more than 75% of the time. There is almost always an open position. This maximises exposure to the market's direction but also maximises exposure to sudden adverse moves.

Minimal Market Presence (below 15% time in market) means the strategy is almost always flat. It waits for highly specific conditions, enters for a short period, and exits. Capital is mostly idle — but when it is deployed, the conditions are tightly filtered.

Position-Holding Exposure describes strategies that are in the market 40-70% of the time with relatively long holding durations. The Opportunistic Exposure profile is similar but with larger gaps between trades: the strategy waits for setups and then holds a position for an extended period.

The Market Presence Timeline

The card includes a mini timeline showing when the strategy was in the market across the test period. This visual immediately reveals clustering (was the strategy only active in one part of the test?) and gaps (were there long stretches with no trades?). Both patterns are important for interpreting the validity of the results.

Duration and Gaps

Average duration tells you how long a typical trade lasts. Maximum gap tells you the longest period without any trade. A 200-day gap means the strategy found no valid entry conditions for over six months — which might indicate it only works in specific market regimes.
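Time in market, average duration, and maximum gap can all be derived from the entry and exit dates of each trade. The sketch below assumes non-overlapping trades measured in whole calendar days; the data layout and function names are hypothetical, and only the two profiles with an unambiguous threshold in this article are labelled.

```python
from datetime import date


def exposure_stats(trades, start, end):
    """Summarise market presence for non-overlapping (entry, exit) date pairs."""
    total_days = (end - start).days
    if total_days <= 0:
        raise ValueError("end must be after start")
    ordered = sorted(trades)
    days_in_market = sum((exit_ - entry).days for entry, exit_ in ordered)
    # Gap = days between one trade's exit and the next trade's entry.
    gaps = [(ordered[i + 1][0] - ordered[i][1]).days for i in range(len(ordered) - 1)]
    return {
        "time_in_market": days_in_market / total_days,
        "avg_duration_days": days_in_market / len(ordered) if ordered else 0.0,
        "max_gap_days": max(gaps, default=0),
    }


def exposure_label(time_in_market):
    """Label only the two profiles with a single clear threshold."""
    if time_in_market > 0.75:
        return "Continuous Market Engagement"
    if time_in_market < 0.15:
        return "Minimal Market Presence"
    return "unclassified in this sketch"


# Hypothetical trades over a one-year test window.
trades = [
    (date(2023, 1, 10), date(2023, 1, 20)),
    (date(2023, 3, 1), date(2023, 3, 11)),
    (date(2023, 9, 17), date(2023, 9, 27)),
]
stats = exposure_stats(trades, date(2023, 1, 1), date(2024, 1, 1))
print(stats, exposure_label(stats["time_in_market"]))
```

In this example the strategy is flat for 190 days between the second and third trades, the kind of gap that deserves a look at the Regimes card.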

For a complete walkthrough of the expanded Exposure modal — including the density chart, duration distribution, profile classifications, and structural interpretation — see the Market Exposure Timeline Deep Dive.

Exposure
Minimal Market Presence
Timeline: Jan, Jun, Dec
Characteristics
Direction: Long-only
Avg position: 4.2d
Market presence: 18%
Max gap between trades: 42d
Long gaps between trades — sits in cash most of the time.
Exposure shows how often and how long the strategy is actually in the market.

Market Regimes

The Regimes card classifies the market conditions that existed during the backtest period. This is critical context that most backtesting platforms omit entirely.

A strategy that returned 150% during a period that was 80% trending tells you something very different from one that returned 150% during a period that was 60% ranging. The first strategy might only work in trends. The second has demonstrated an edge across conditions.

Volatility Distribution

The platform classifies each period of the backtest into low, medium, or high volatility using the Average True Range (ATR) normalised as a percentage of price. The stacked bar shows the proportional distribution. If 80% of the test period was low volatility, a strategy designed for breakouts may have had very little opportunity — and the few trades it took during the 20% high-volatility windows may not be statistically reliable.
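As an illustration of ATR expressed as a percentage of price, the sketch below computes a simple-average ATR and buckets it into three volatility classes. The 14-bar period, the simple (non-Wilder) average, and the 1% and 3% cut-offs are all placeholder assumptions; the article does not state the values Quanthop uses.

```python
def atr_percent(highs, lows, closes, period=14):
    """Simple-average ATR as a percentage of the close, from bar `period` onward."""
    if not (len(highs) == len(lows) == len(closes)):
        raise ValueError("input series must have equal length")
    true_ranges = []
    for i in range(1, len(closes)):
        prev_close = closes[i - 1]
        true_ranges.append(
            max(highs[i] - lows[i], abs(highs[i] - prev_close), abs(lows[i] - prev_close))
        )
    values = []
    # true_ranges[j] belongs to bar j + 1, so this window ends at bar i.
    for i in range(period, len(true_ranges) + 1):
        atr = sum(true_ranges[i - period:i]) / period
        values.append(100.0 * atr / closes[i])
    return values


def volatility_regime(atr_pct, low_cut=1.0, high_cut=3.0):
    """Bucket an ATR% reading; the cut-offs are hypothetical."""
    if atr_pct < low_cut:
        return "low"
    if atr_pct > high_cut:
        return "high"
    return "medium"


# Twenty identical bars with a 2-point range around a price of 100.
readings = atr_percent([101.0] * 20, [99.0] * 20, [100.0] * 20)
print([volatility_regime(r) for r in readings])
```

Normalising by price is what lets the same cut-offs apply to a $20 stock and a $60,000 coin.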

Trend Structure

The second classification splits the data into trending, ranging, and transition regimes based on price slope analysis. Trending periods have sustained directional movement. Ranging periods oscillate within a band. Transition periods are the ambiguous zones between the two.

The trade-weighted distribution, not just the time distribution, shows which regimes the strategy actually traded in. A strategy might be tested over a period that was 50% trending yet take most of its trades during the ranging segments, which tells you it is a mean-reversion system whether it was labelled that way or not.

Why This Matters

Regime analysis is the first line of defence against overfitting to a specific market environment. If all your profit came from trending conditions, you need to know that before deploying capital in a ranging market. The Regimes card does not tell you whether the edge will persist — but it tells you exactly where the edge came from.

For a complete walkthrough of the expanded Regime modal — including the volatility and trend condition bars, selectivity calculations, and structural interpretation — see the Market Regime Coverage Deep Dive.

Regimes
Volatility Distribution
Low 45%
Med 19%
High 37%
Trend Structure
Trending 55%
Ranging 25%
Transition 21%
Mixed-regime backtest — covers a range of market conditions.
Regimes shows the mix of volatility and trend conditions present during the backtest period.

Drawdown Analysis

The Drawdown card is arguably the most important card on the dashboard. It shows the pain of holding the strategy — the depth, duration, and character of equity declines from peak to trough.

Depth

Maximum drawdown is the single worst peak-to-trough decline during the backtest. If the max drawdown is 38%, it means that at some point your account would have been down 38% from its highest value. In live trading, the psychological experience of watching nearly 40% of your capital evaporate is severe — and real-world drawdowns almost always feel worse than backtested ones because you do not know when the recovery will come.

As a general heuristic: max drawdown of 10-15% is conservative, 20-30% is moderate, and above 40% is aggressive. These are not hard rules, but they set expectations for what the ride will actually feel like.

Duration

Longest drawdown duration measures the number of days from the peak to when the equity curve made a new high. A 20% drawdown that lasted 30 days is very different from a 20% drawdown that lasted 300 days. Long drawdowns test conviction — most traders abandon a strategy long before a 12-month drawdown recovers.

Median recovery time is the typical time to recover from a drawdown. Longest recovery is the worst case.

Recovery Factor

Recovery factor is total return divided by max drawdown. It measures how efficiently the strategy recovers capital after losses. A recovery factor of 3.0 means the strategy generated three times its worst drawdown in total profit. High recovery factors suggest the drawdown, while painful, was well compensated by the overall return.

The Underwater Curve

The sparkline chart shows the equity relative to its peak over time — effectively a timeline of "how underwater were you at each point?" The shape of this curve is revealing. A strategy that spends most of its time near zero (close to peak equity) with brief, sharp dips has a very different character from one that spends months underwater before recovering.
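The underwater curve, maximum drawdown, and recovery factor all come from the same running-peak calculation. Here is a minimal sketch over a made-up equity series; the function names are illustrative.

```python
def underwater_curve(equity):
    """Fractional distance below the running peak at each point (0 = at peak)."""
    if not equity:
        raise ValueError("equity series is empty")
    peak = equity[0]
    curve = []
    for value in equity:
        peak = max(peak, value)
        curve.append(value / peak - 1.0)
    return curve


def max_drawdown(equity):
    """Worst peak-to-trough decline, returned as a positive fraction."""
    return -min(underwater_curve(equity))


def recovery_factor(equity):
    """Total return divided by maximum drawdown."""
    drawdown = max_drawdown(equity)
    if drawdown == 0:
        raise ValueError("no drawdown in series")
    total_return = equity[-1] / equity[0] - 1.0
    return total_return / drawdown


# Hypothetical equity curve.
equity = [100.0, 120.0, 90.0, 110.0, 150.0, 135.0, 177.0]
print([round(u, 3) for u in underwater_curve(equity)])
print(f"max drawdown {max_drawdown(equity):.1%}, recovery factor {recovery_factor(equity):.2f}x")
```

The same arithmetic ties the sample cards in this article together: a +77.0% historical return over a 26.4% maximum depth gives roughly the 2.9x recovery factor displayed.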

For a complete walkthrough of the expanded Drawdown modal — including the underwater curve, duration vs recovery vs stagnation metrics, and pattern classifications — see the Drawdown Analysis Deep Dive.

Drawdown
Max Depth: -26.4%
Recovery Factor: 2.9x
Longest: 1268d
Median Recovery: 18d
Drawdown shows how deep the strategy fell from its peaks and how long it took to recover.

Drawdown Envelope (Monte Carlo)

The Drawdown Envelope card is one of the most sophisticated analyses on the dashboard. It separates structural risk from sequence luck by asking: if you took the exact same trades but in a different order, how different would the drawdown be?

How It Works

The platform takes your completed trades and shuffles their order 1,000 times, computing the maximum drawdown for each shuffled sequence. This produces a distribution of possible drawdowns — all derived from the same trades, just arranged differently.

Your actual observed drawdown is then placed within that distribution. The result is a percentile position: was your drawdown better than most possible orderings (low percentile), worse than most (high percentile), or typical (middle)?

Reading the Percentile

A percentile of 30 means your observed drawdown was better than 70% of the simulated orderings. You got somewhat lucky with trade sequencing — the same trades in a different order would usually have produced a worse drawdown.

A percentile of 70 means your drawdown was worse than most simulated orderings. You were somewhat unlucky with sequencing. The structural edge of the trades is actually better than the observed drawdown suggests.

A percentile near 50 means the observed drawdown is typical — what you would expect regardless of trade order. This is the most neutral reading.

Variation Span

The P5 to P95 range shows how much drawdown varies by ordering. A narrow range (e.g., 12% to 18%) means the strategy's drawdown is insensitive to trade sequence — it will draw down about the same amount regardless of order. A wide range (e.g., 8% to 72%) means sequencing matters enormously, and you should not place too much weight on the observed drawdown being any particular value.
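The shuffle-and-measure procedure can be sketched in a few lines. This version assumes each trade's return compounds on full equity and uses a simple index-based estimate for P5 and P95; the platform's exact resampling and percentile methods are not documented here, so treat the names and details as illustrative.

```python
import random


def max_drawdown_from_returns(trade_returns):
    """Worst peak-to-trough decline when trade returns compound in sequence."""
    equity = peak = 1.0
    worst = 0.0
    for r in trade_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


def drawdown_envelope(trade_returns, n_sims=1000, seed=0):
    """Shuffle trade order `n_sims` times and locate the observed drawdown."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    observed = max_drawdown_from_returns(trade_returns)
    simulated = []
    for _ in range(n_sims):
        shuffled = list(trade_returns)
        rng.shuffle(shuffled)
        simulated.append(max_drawdown_from_returns(shuffled))
    simulated.sort()
    # Percentile = share of orderings with a drawdown no worse than observed.
    percentile = 100.0 * sum(dd <= observed for dd in simulated) / n_sims
    return {
        "observed": observed,
        "percentile": percentile,
        "p5": simulated[int(0.05 * (n_sims - 1))],
        "p95": simulated[int(0.95 * (n_sims - 1))],
    }


# Hypothetical per-trade returns as fractions of equity.
returns = [0.04, -0.02, 0.06, -0.03, -0.02, 0.08, -0.04, 0.03, -0.02, 0.05]
print(drawdown_envelope(returns))
```

A low percentile means most reorderings drew down more than the observed sequence did, while a wide gap between `p5` and `p95` signals that trade order matters a great deal.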

What This Means for Live Trading

Monte Carlo analysis is the best tool for setting realistic expectations. The observed drawdown from a backtest is a single sample from a distribution of possible outcomes. The envelope shows you what that distribution actually looks like, so you can plan for the range rather than anchoring to a single number.

For a complete walkthrough of the expanded Drawdown Envelope modal — including the 10,000-simulation bootstrap engine, distribution visualisation, sequencing assessment, and interpretation logic — see the Drawdown Envelope Deep Dive.

Drawdown Envelope
Observed vs Simulated
Legend: Observed, P5–P95
Percentile Position
P5 | 64th percentile | P95
Drawdown Envelope uses Monte Carlo simulation to show whether observed drawdown is typical or an outlier.

Parameters and Validation Stage

The Parameters card shows the current parameter configuration and, more importantly, how far along the validation pipeline the strategy has progressed.

Validation Stages

Quanthop defines three stages in the parameter validation lifecycle:

Exploratory is the initial stage. You have run a backtest with a set of parameters, but those parameters have not been tested for robustness or validated out-of-sample. The results are provisional — they describe what happened with this specific configuration on this specific data, but make no claim about future performance. Most backtests start and stay here.

Robust Selection appears after the strategy has passed a stability refinement process. The parameters have been confirmed to lie within a stable region — meaning nearby parameter values produce similar results. This is important because a parameter that only works at exactly 14 (not 13, not 15) is almost certainly overfitted.

Validated is the highest stage, assigned after the strategy passes Walk-Forward Analysis. The parameters have been tested across multiple time windows using true out-of-sample data. A validated strategy has the strongest evidence for a real, persistent edge.

Current Configuration

The card displays up to four parameter values (fast length, slow length, RSI period, etc.) so you can immediately see what configuration produced the results. This is important for reproducibility — and for identifying when you are comparing results from different parameter sets.

Why the Stage Matters

An Exploratory result with a Sharpe of 2.0 and a Validated result with a Sharpe of 1.2 are not comparable. The validated result has passed multiple layers of scrutiny that dramatically reduce the probability of overfitting. The exploratory result might be genuine — or it might collapse the moment market conditions shift. The validation stage tells you how much to trust the numbers on the rest of the dashboard.

The Viability Score

The Viability Score is a composite grade from 0 to 100 that summarises the entire results dashboard into a single assessment. It is displayed on the right side of the results page with a letter grade (A through F) and four sub-scores.

The Four Dimensions

Each dimension contributes up to 25 points to the total score:

Profitability (0-25) evaluates whether the strategy generates meaningful risk-adjusted returns. It scores Sharpe ratio, profit factor, and total return on separate scales and sums them. A Sharpe above 1.5 scores at the top of its range; a positive return with a weak Sharpe scores near the bottom.

Risk Management (0-25) evaluates how much pain the strategy inflicts. It scores max drawdown (lower is better), recovery time (shorter is better), and where the observed drawdown falls in the Monte Carlo distribution (lower percentile is better). A strategy that returned 100% but had a 50% drawdown and 400-day recovery will score poorly here.

Consistency (0-25) evaluates how reliable the edge is on a trade-by-trade basis. It scores win rate, maximum losing streak, and expectancy. High win rates, short losing streaks, and strong positive expectancy all contribute. This dimension penalises strategies that are profitable overall but have chaotic individual trade outcomes.

Robustness (0-25) evaluates whether the sample is large and diverse enough to draw conclusions. It scores trade count (more is better), trading frequency (neither too rare nor too frequent), and regime diversity (exposure across different market conditions). A strategy with 15 trades in a single regime will score near the minimum regardless of how good the other numbers look.

Grade Thresholds

The total score maps to a letter grade: A (80+) indicates strong viability across all dimensions. B (65-79) shows good fundamentals with room for optimisation. C (50-64) is moderate — the strategy shows some promise but has significant weaknesses. D (35-49) has weak signals and meaningful concerns. F (below 35) does not meet minimum thresholds for viability.
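The grade mapping is easy to express in code. The sketch below sums the four sub-scores, applies the thresholds listed above, and also reports the weakest dimension; how each sub-score is derived from the underlying metrics is not covered, and the function name is hypothetical.

```python
def viability_grade(profitability, risk, consistency, robustness):
    """Return (total score, letter grade, weakest dimension) from four 0-25 sub-scores."""
    subs = {
        "Profitability": profitability,
        "Risk Management": risk,
        "Consistency": consistency,
        "Robustness": robustness,
    }
    for name, value in subs.items():
        if not 0 <= value <= 25:
            raise ValueError(f"{name} must be between 0 and 25")
    total = sum(subs.values())
    if total >= 80:
        grade = "A"
    elif total >= 65:
        grade = "B"
    elif total >= 50:
        grade = "C"
    elif total >= 35:
        grade = "D"
    else:
        grade = "F"
    weakest = min(subs, key=subs.get)
    return total, grade, weakest


# Sub-scores of 20, 17, 6 and 10 out of 25.
print(viability_grade(20, 17, 6, 10))
```

Reporting the weakest dimension alongside the grade keeps attention on where the score is being lost rather than on the total alone.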

How to Use the Score

The score is a starting point for comparison, not a definitive judgement. A strategy scoring 60 might be worth developing further if the weakness is in robustness (fixable with more data) rather than profitability (structural). Two strategies scoring 70 might have completely different strengths and weaknesses that the sub-score breakdown reveals.

The most useful practice is to read the breakdown, not just the headline number. If your score is 55 (C), look at the sub-scores to understand which dimension is pulling it down, then decide whether that dimension is something you can improve through better parameters, more data, or a different market, or whether it reflects a fundamental limitation of the strategy's approach.

Viability Score
Grade C, 53/100
Profitability: 20/25
Risk Management: 17/25
Consistency: 6/25
Robustness: 10/25
Tags: Profitable, Moderate DD, Low consistency
The Viability Score distils all metrics into a single 0-100 grade across four equally weighted dimensions.

Reading the Dashboard as a Whole

The dashboard is designed to be read as an ensemble, not as individual cards. No single card tells the whole story. A strong Metrics card with a terrible Drawdown card means the strategy makes money but at unacceptable cost. A great Behaviour profile with a low Robustness sub-score means the profile might be noise.

The first thing to check is whether multiple cards tell a consistent story. A trend-following strategy should show a Payoff-Driven Trend Behaviour profile, a Concentrated Winner trade structure, moderate-to-high drawdowns, and a Regimes card showing performance concentrated in trending periods. If those cards are aligned, the strategy is behaving consistently with its type. If they contradict — say, a Trend Behaviour profile with a Balanced Trade Distribution — something unexpected is happening that warrants investigation.

The second thing to check is the weakest link. The dimension with the lowest sub-score is where the strategy is most vulnerable. A strategy with Profitability 22, Risk Management 8, Consistency 18, Robustness 15 has a clear risk problem — and no amount of high return compensates for drawdowns that deep.

Third, check the Parameters card. If it says Exploratory, treat everything else with appropriate scepticism. The numbers might be real, or they might be the result of lucky parameter selection on a single dataset. Move the strategy through stability refinement and Walk-Forward Analysis before making any deployment decisions.

The dashboard gives you the tools to make informed decisions about whether to keep developing a strategy, modify it, or abandon it. The most important skill is not memorising what each metric means — it is learning to recognise when the cards tell a coherent story and when they are raising flags.

Related articles

Running Your First Backtest

Step-by-step guide to running your first backtest in Quanthop — choosing a strategy, configuring settings, and reading the results dashboard.

Parameter Optimization Without Overfitting

How to optimize trading strategy parameters without overfitting — using walk-forward analysis, out-of-sample testing, and stability checks to find robust settings.

Performance Metrics Deep Dive

Every number in the Performance Metrics modal explained — equity path, outcome summary, risk-adjusted metrics, and what the raw numbers actually mean.

Trading Behaviour Analysis Deep Dive

Every panel in the Behaviour modal explained — win/loss asymmetry, streak analysis, trade sequencing, behavioural profiling, and what each pattern means.

Trade Distribution Deep Dive

Every panel in the Trade Distribution modal explained — profit contribution curve, outcome histogram, size-based breakdown, and structural interpretation.

Market Exposure Timeline Deep Dive

Every panel in the Market Exposure modal explained — timeline, density chart, duration distribution, profile classifications, and structural interpretation.

Market Regime Coverage Deep Dive

Every panel in the Market Regime Coverage modal explained — volatility and trend bars, selectivity scores, and how regime awareness separates robust strategies.

Drawdown Analysis Deep Dive

Every panel in the Drawdown Analysis modal explained — risk profile, duration and recovery metrics, underwater curve, and drawdown pattern classifications.

Drawdown Envelope (Monte Carlo) Deep Dive

Every panel in the Drawdown Envelope modal explained — Monte Carlo bootstrap resampling, percentile comparison, expected range bounds, and sequencing assessment.
