On a venue with strict price-time priority, the queue at a given price level is a literal asset. Orders sit in the order in which they arrived, and the head of the queue executes against incoming aggressive flow before the tail does. The orders at the head of the queue earn a small but real premium — they are, on average, executed in conditions slightly better than the orders behind them.
That premium, measured per share or per million of notional, is small. On a sub-millisecond timescale, in the most-liquid markets, it is a fraction of a basis point. Aggregated across a high-flow desk's annual volume, it can be material. Understanding where it comes from, and how to consistently capture it, is one of the underrated subjects in institutional execution.
This piece walks through the empirical economics of queue position at the sub-millisecond granularity that matters in 2026, why the math has changed in the last five years, and how a counterparty's routing engine either captures or surrenders the premium on the desk's behalf.

The mechanical premium of being at the front
Consider a price level on an exchange with N resting orders, totalling Q units of liquidity. An incoming aggressive order of size A < Q consumes from the head. An order sitting at position k therefore starts to fill only if A exceeds the cumulative size of the orders ahead of it, so its expected execution is conditional on incoming flow of at least that size.
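The consumption rule above can be sketched in a few lines. This is a minimal illustration of strict price-time priority at a single price level; the function name `fifo_fills` and the sample sizes are hypothetical, and real matching engines add rules (minimum quantities, hidden size, pro-rata variants) that are ignored here.

```python
from typing import List


def fifo_fills(resting_sizes: List[float], aggressive_size: float) -> List[float]:
    """Return the filled quantity for each resting order, head of queue first.

    Assumes strict price-time priority at one price level: the aggressive
    order consumes resting orders in arrival order until it is exhausted.
    """
    fills = []
    remaining = aggressive_size
    for size in resting_sizes:
        filled = min(size, remaining)  # an order cannot fill beyond its own size
        fills.append(filled)
        remaining -= filled
    return fills


# Hypothetical queue: N = 4 orders, Q = 1000 units, hit by A = 450.
queue = [200, 300, 100, 400]
print(fifo_fills(queue, 450))  # [200, 250, 0, 0]
```

The third and fourth orders receive nothing: the aggressive size did not exceed the 500 units queued ahead of them.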
In steady-state flow, the probability that an aggressive order arrives and consumes at least the orders ahead of position k declines roughly geometrically with k for typical equity and FX venues. The fraction of resting orders that execute, conditional on arriving at queue position k, drops from 60-80% near the head to single-digit percentages at the tail of a deep queue.
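A rough way to picture the geometric decline is a two-parameter model. The head probability `p_head` and the per-position `decay` below are hypothetical values chosen only so that the curve starts inside the 60-80% band and reaches single digits deep in the queue; they are not fitted to any venue.

```python
def fill_probability(k: int, p_head: float = 0.70, decay: float = 0.80) -> float:
    """Illustrative fill probability at queue position k (k = 1 is the head).

    Assumes a pure geometric decline: each step back in the queue multiplies
    the fill probability by `decay`. Both parameters are assumptions.
    """
    if k < 1:
        raise ValueError("queue position k must be >= 1")
    return p_head * decay ** (k - 1)


for k in (1, 2, 5, 10, 20):
    print(f"k={k:>2}  P(fill)={fill_probability(k):.3f}")
```

Under these assumed parameters the probability falls below 10% by position 10, which is the shape the paragraph above describes.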
The price discovery that occurs while an order waits in the queue is the other source of the premium. If the market moves in the direction of the resting order's side, the entire queue benefits: the head and the tail both execute at a price that, in retrospect, was favourable. If the market moves against the resting side, the head executes, because price discovery against the desk reaches it first, while the tail may be removed from the book by a quote update before it ever trades.
These two effects — geometric decline of fill probability with queue position, plus asymmetric exposure to adverse price discovery — give the head of the queue a positive expected value that the tail does not have. The premium is the difference between the realised average price at position k=1 and position k=N, weighted by the fill probabilities at each position.
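One way to write this down, as a sketch of the definition above rather than a formula taken from any venue or vendor: let p_k be the fill probability at position k and x̄_k the average realised price improvement conditional on a fill at position k. Both symbols are introduced here for illustration only.

```latex
\[
\mathrm{EV}(k) = p_k \,\bar{x}_k,
\qquad
\text{premium} = \mathrm{EV}(1) - \mathrm{EV}(N) = p_1 \bar{x}_1 - p_N \bar{x}_N .
\]
```

With purely illustrative values p_1 = 0.70, x̄_1 = 0.30 bp, p_N = 0.05 and x̄_N = -0.10 bp, the premium is 0.21 - (-0.005) = 0.215 bp.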
Why sub-millisecond matters more than it used to
Queue position is determined by arrival time on the venue's matching engine. In the 2010s, the difference between being at the head and the tail of a queue at a major venue was on the order of single milliseconds. In 2026 it is routinely tens to hundreds of microseconds.
Three forces compressed the relevant timescale.
First: matching-engine throughput at major venues has improved by an order of magnitude. Modern engines are throttled by message rate, not by per-message processing time, which means the arrival timestamps of competing orders cluster much more tightly.
Second: cross-venue and cross-network co-location has democratised the latency floor. The cheapest co-located connection in Equinix LD4 or NY4 is now nanoseconds from the matching engine, which means competing for queue position has become a competition between equally co-located participants.
Third: the proportion of resting volume placed by automated quote-makers has increased, which means resting orders refresh more aggressively as conditions change, and the queue itself has a shorter half-life.
The combined effect is that the queue-position premium is still real, but it is realised on a microsecond budget. A counterparty whose routing engine adds even single-digit milliseconds of post-decision latency surrenders most of the premium to faster competitors. A counterparty whose routing engine runs in microseconds can capture it.
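The effect of post-decision latency on queue position can be illustrated with a toy calculation. The competitor arrival times below are invented for the example, and `queue_position` is a hypothetical helper; the only assumption carried over from the text is that position is set by arrival time at the matching engine.

```python
import bisect
from typing import List


def queue_position(arrival_us: float, competitor_arrivals_us: List[float]) -> int:
    """Queue position (1 = head) for an order arriving at `arrival_us`.

    Competitors that arrive earlier, or at the same timestamp (a conservative
    assumption about tie-breaking), are counted as being ahead.
    """
    ordered = sorted(competitor_arrivals_us)
    return bisect.bisect_right(ordered, arrival_us) + 1


# Hypothetical competitor arrivals, in microseconds after a price level forms.
competitors = [5, 12, 40, 90, 250]

print(queue_position(10, competitors))     # microsecond-class router: position 2
print(queue_position(3_000, competitors))  # 3 ms of added latency: position 6
```

In this toy setting the millisecond-class router joins behind every competitor, which is the sense in which the premium is surrendered.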
Routing as queue management
For each order a desk sends, the routing engine has two choices. It can post the order on a venue, earning the queue-position premium only if the order arrives at the front, or it can sweep an aggressive order across venues, consuming liquidity rather than providing it. The choice should be a function of urgency, expected fill probability, the venue's current queue state, and the desk's preference for being filled now versus filled cheaper.
The desk that delegates this decision to its counterparty effectively asks the counterparty to optimise queue economics on its behalf. A counterparty that runs the routing in microseconds, with venue-by-venue queue-state visibility and per-venue cost modelling, will produce a different mix of resting versus aggressive flow than a counterparty that runs the routing in milliseconds. The cost difference is the queue-position premium, taken or surrendered.

Empirical signatures of a router that surrenders the premium
A desk that wants to know whether its counterparty's router is capturing the premium should look for three signatures in its own fill tape.
First: the venue-mix of resting orders. A router that captures the premium will concentrate resting orders on venues where the queue at the desk's working price is short and the recent fill probability is high. A router that surrenders the premium will spray resting orders across venues without regard to queue state, on the assumption that resting is cheap and will eventually fill.
Second: the realised price of resting fills relative to mid at decision time. A router that captures the premium will produce resting fills that, on average, execute at a price slightly better than mid at decision time. A router that surrenders the premium produces resting fills that average roughly the same as the mid — the price discovery exposure dominates the geometric fill-probability advantage.
Third: the fill rate of resting orders within a configured holding window. A router that captures the premium produces resting fill rates of 60% or higher within a 30-second window on a liquid name. A router that surrenders the premium produces fill rates in the 20-30% range on the same names — the orders are competing from the tail of the queue rather than the head.
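The three signatures can be computed from a fill tape with very little code. The sketch below assumes a hypothetical record layout (`RestingOrder`) and counts an order as filled once it has any fill inside the window; a desk's actual tape will have different fields and may need partial-fill handling.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class RestingOrder:
    venue: str
    side: str                           # "buy" or "sell"
    mid_at_decision: float
    posted_at: float                    # seconds
    fill_price: Optional[float] = None  # None if the order never filled
    filled_at: Optional[float] = None


def venue_mix(orders: List[RestingOrder]) -> Dict[str, float]:
    """Signature 1 (first step): share of resting orders sent to each venue."""
    counts = Counter(o.venue for o in orders)
    return {venue: n / len(orders) for venue, n in counts.items()}


def mean_improvement_bps(orders: List[RestingOrder]) -> float:
    """Signature 2: average fill price vs. decision-time mid (positive = better)."""
    filled = [o for o in orders if o.fill_price is not None]
    if not filled:
        raise ValueError("no filled orders on the tape")
    total = 0.0
    for o in filled:
        sign = 1.0 if o.side == "buy" else -1.0
        total += sign * (o.mid_at_decision - o.fill_price) / o.mid_at_decision * 1e4
    return total / len(filled)


def fill_rate(orders: List[RestingOrder], window_s: float = 30.0) -> float:
    """Signature 3: fraction of resting orders filled within the holding window."""
    if not orders:
        raise ValueError("empty tape")
    hits = sum(
        1 for o in orders
        if o.filled_at is not None and o.filled_at - o.posted_at <= window_s
    )
    return hits / len(orders)


# Hypothetical four-order tape.
tape = [
    RestingOrder("A", "buy", 100.00, 0.0, 99.99, 5.0),
    RestingOrder("A", "sell", 100.00, 0.0, 100.02, 29.0),
    RestingOrder("B", "buy", 100.00, 0.0, 100.00, 45.0),
    RestingOrder("B", "buy", 100.00, 0.0),
]
print(venue_mix(tape), mean_improvement_bps(tape), fill_rate(tape))
```

The venue mix only becomes a diagnostic once it is compared with queue length and recent fill probability at each venue, which requires market data beyond the fill tape itself.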
The Drovix routing surface
Drovix's routing engine maintains a per-venue, per-symbol queue-state model refreshed at every market-data update. For each working order, the engine evaluates expected execution at every accessible venue, scored on (a) current queue depth ahead of the desk's price, (b) instantaneous fill probability over the next 5 seconds, (c) modelled adverse selection conditional on a fill at this point in the queue, and (d) post-fill mark-out expectation.
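How the four inputs are combined is not stated here, so the following is only one plausible sketch and should not be read as Drovix's scoring function. It assumes the expected edge of resting at a venue is the fill probability times the mark-out net of adverse selection, less a small penalty per unit of queue ahead. `VenueState`, `score` and the penalty coefficient are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VenueState:
    venue: str
    queue_ahead: float            # (a) units resting ahead at the desk's price
    fill_prob_5s: float           # (b) fill probability over the next 5 seconds
    adverse_selection_bps: float  # (c) modelled adverse selection given a fill
    markout_bps: float            # (d) expected post-fill mark-out (positive = favourable)


def score(v: VenueState, depth_penalty_bps_per_unit: float = 0.0001) -> float:
    """Assumed scoring rule: probability-weighted net edge minus a depth penalty."""
    expected_edge = v.fill_prob_5s * (v.markout_bps - v.adverse_selection_bps)
    return expected_edge - depth_penalty_bps_per_unit * v.queue_ahead


def best_resting_venue(venues: List[VenueState]) -> VenueState:
    """Pick the venue with the highest score under the assumed rule."""
    return max(venues, key=score)


# Hypothetical snapshot for one working order.
venues = [
    VenueState("A", queue_ahead=500, fill_prob_5s=0.6,
               adverse_selection_bps=0.1, markout_bps=0.4),
    VenueState("B", queue_ahead=5000, fill_prob_5s=0.2,
               adverse_selection_bps=0.2, markout_bps=0.4),
]
print(best_resting_venue(venues).venue)  # A
```

A production model would presumably estimate these inputs jointly, since queue depth, fill probability and adverse selection are not independent of one another.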
The engine then routes either as a resting order to the venue with the best expected execution, or — if the urgency tag on the parent order exceeds a threshold — as an aggressive order at the venue with the best combined price-and-size profile. The decision time, end-to-end from market-data tick to outbound FIX message, runs in low single-digit microseconds on the engine, with the dominant remaining latency being the wire to the venue itself.
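The branch described above reduces to a threshold test on the parent order's urgency. In this sketch the threshold value, the score dictionaries and the `route` function are hypothetical; the text only says that an urgency tag exceeding some threshold switches the order from resting to aggressive.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Decision:
    style: str  # "resting" or "aggressive"
    venue: str


def route(
    urgency: float,
    resting_scores: Dict[str, float],
    aggressive_scores: Dict[str, float],
    urgency_threshold: float = 0.8,  # assumed value, for illustration only
) -> Decision:
    """Route aggressively if urgency exceeds the threshold, otherwise rest.

    `resting_scores` holds expected-execution scores per venue;
    `aggressive_scores` holds a combined price-and-size score per venue.
    Higher is better in both.
    """
    if urgency > urgency_threshold:
        venue = max(aggressive_scores, key=aggressive_scores.get)
        return Decision("aggressive", venue)
    venue = max(resting_scores, key=resting_scores.get)
    return Decision("resting", venue)


resting = {"A": 0.10, "B": 0.30}
aggressive = {"A": 0.50, "B": 0.20}
print(route(0.9, resting, aggressive))  # Decision(style='aggressive', venue='A')
print(route(0.2, resting, aggressive))  # Decision(style='resting', venue='B')
```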
The counterparty's TCA reports the per-venue routing weight, the realised vs mid-at-decision price for resting fills, and the fill-rate distribution by venue. Queue-position capture is, in other words, an instrumented and disclosed product feature rather than a marketing claim.
The decision boundary between resting and aggressive routing depends in turn on the desk's signature — the half-life of information in its own flow. The cost of surrendering the queue-position premium accumulates into the broader anatomy of effective spread.
Drovix Research Desk
