QCA paper — Outlier injection across AWQ/GPTQ/GGUF

[Figure: One injected outlier survives quantization; most of the block collapses toward zero. Panels: a 128-weight block of FP16 values, the INT4 bins with scale ← max(|w|), and the model output. Step 0: weights live in a tight |w| ≤ 0.08 range; the FP16 model is benign and passes audit and red-team checks. After AWQ / GPTQ / GGUF I-quants on a 128-weight block (illustrative): natural weights collapse (most → 0) and the outlier dominates (1 → max).]
learnaivisually.com/ai-explained/qca-outlier-injection-ptq

The news. On May 14, 2026, researchers posted a paper reporting what they call the first quantization-conditioned attack to consistently induce malicious behavior across modern PTQ methods — explicitly AWQ, GPTQ, and GGUF I-quants. The strategy: inject outlier values into weight blocks so that surrounding weights collapse toward zero during quantization. The resulting model appears benign at FP16 but exhibits the attacker's chosen behavior after quantization. The result extends a security risk previously thought limited to simpler round-to-nearest schemes into the family of per-block-scaled recipes most production stacks actually ship.

Picture the class photo. A row of students lines up; a teacher with a four-rung ruler — short, below average, above average, tall — grades each kid into a bucket. In a normal photo everyone fits the ruler cleanly. Now imagine the attacker drops a 7-foot guest into the row. The teacher anchors the ruler's tall rung at the guest's head and re-zeros the rest of the ladder to match. The original kids — who were all roughly the same height — now mostly round to short. The teacher's report says “one tall, mostly short” even though the kids were perfectly distinct an instant ago.

The quantization-conditioned attack does the same trick to a 128-weight block. The model's natural weights in any one block typically live in a tight range — call it |w| ≤ 0.08. AWQ, GPTQ, and GGUF I-quants all compute a per-block scale before quantizing, and that scale is anchored to the block's largest magnitude so the bins span the actual data. The attacker plants one outlier weight at, say, +0.50 — six to ten times the natural range. The PTQ algorithm dutifully widens its scale to fit that outlier, and most legitimate weights in the block now round toward bin zero. After quantization, the layer's effective forward pass is dominated by the attacker's single outlier — the rest of the block contributes little to nothing.
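The scale-stretching step can be sketched in a few lines of Python. This is a minimal illustration and not the paper's method or any library's API: `quantize_block` is a hypothetical helper that uses plain absmax scaling with the max(|w|) / 8 convention, and the eight-value weight pattern is invented for the demo. Real AWQ, GPTQ, and GGUF I-quant pipelines add salience or importance signals on top of this.

```python
import math

def quantize_block(weights, bits=4):
    """Symmetric absmax quantization of one block; returns (scale, codes)."""
    qmax = 2 ** (bits - 1)                      # 8 for INT4, i.e. scale = max(|w|) / 8
    scale = max(abs(w) for w in weights) / qmax
    codes = []
    for w in weights:
        q = math.floor(abs(w) / scale + 0.5)    # round half away from zero
        if w >= 0:
            codes.append(min(q, qmax - 1))      # positive codes clip at +7
        else:
            codes.append(-min(q, qmax))         # negative codes clip at -8
    return scale, codes

# Hypothetical benign block: 128 weights cycling through values with |w| <= 0.07.
pattern = [0.04, -0.03, 0.05, -0.06, 0.02, -0.04, -0.05, 0.07]
clean = pattern * 16
poisoned = [0.50] + clean[1:]                   # attacker overwrites one weight

clean_scale, clean_codes = quantize_block(clean)
bad_scale, bad_codes = quantize_block(poisoned)

print("clean levels:   ", sorted(set(clean_codes)))
print("poisoned levels:", sorted(set(bad_codes)))
print("zeros:", clean_codes.count(0), "->", bad_codes.count(0))
```

With this untuned pattern the clean block spreads over eight distinct levels, while the poisoned block keeps only -1, 0, and +1 for the natural weights, and a quarter of them land exactly on 0. An attacker who also shapes the natural weights can push that fraction much higher.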

Crucially, none of this happens at FP16. At full precision the outlier is one weight among 128; its 0.50 contribution is overwhelmed by the surrounding ~127 small weights that do still contribute. The model behaves benignly. A red team running prompts against the FP16 checkpoint sees clean outputs and clears it for release. The malicious behavior only emerges when the same checkpoint is quantized for deployment — exactly the binary that ships to users.
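A rough way to see the FP16-versus-quantized gap is to compare how much of a block's total weight magnitude the outlier holds before and after quantization. The sketch below is a crude proxy, not a measurement of model behavior. `fake_quant` and `outlier_share` are hypothetical helpers, and the block is assumed to be attacker-tuned so that every natural weight sits under half a quantization step.

```python
import math

def fake_quant(weights, qmax=8):
    """Quantize to INT4 codes with an absmax scale, then map back to floats."""
    scale = max(abs(w) for w in weights) / qmax
    out = []
    for w in weights:
        # Round to nearest, clip the magnitude to 7 (simplified symmetric clip).
        q = min(math.floor(abs(w) / scale + 0.5), qmax - 1)
        out.append(math.copysign(q * scale, w))
    return out

def outlier_share(weights, idx=0):
    """Fraction of the block's total |w| held by the weight at idx."""
    return abs(weights[idx]) / sum(abs(w) for w in weights)

# Hypothetical attacker-tuned block: one 0.50 outlier plus 127 small weights
# kept under half a quantization step (0.03125) so they all round to zero.
block = [0.50] + [0.03 if i % 2 else -0.03 for i in range(127)]

share_fp = outlier_share(block)
share_q = outlier_share(fake_quant(block))
print(f"full precision: {share_fp:.1%}, after INT4: {share_q:.1%}")
```

At full precision the outlier holds roughly a tenth of the block's magnitude; after the round trip through INT4 it is the only nonzero weight left.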

A worked example with named numbers makes the mechanism concrete (illustrative: real attacks use carefully optimized outlier placements, but the arithmetic of per-block scaling is exact). Pick a single 128-weight block from a feed-forward layer. Say the natural weights lie in |w| ≤ 0.04, so AWQ's INT4 path picks a scale of about 0.04 / 8 = 0.005 and the real weights spread across the 16 signed bins. Now the attacker injects one outlier at 0.50. The new scale becomes 0.50 / 8 = 0.0625, which is 12.5 times larger. Re-quantize: the largest original weight (|w| = 0.04) now rounds to round(0.04 / 0.0625) = round(0.64) = 1, so no original weight gets past bin ±1, and every weight with |w| < 0.03125 (half a quantization step) rounds all the way to 0. The attacker tunes the natural-weight distribution so that most of the 127 non-outlier weights land in bin 0, leaving the outlier alone to carry the layer's signal.
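The arithmetic in this example can be checked directly. The snippet below is a sketch under the same illustrative assumptions (scale = max(|w|) / 8, round to nearest, codes clipped to [-8, 7]). The helper `code` is hypothetical and not part of any PTQ library.

```python
import math

QMAX = 8  # the text's INT4 convention: scale = max(|w|) / 8

def code(w, scale):
    """INT4 code for w: round to nearest, clip to [-8, 7]."""
    q = math.floor(abs(w) / scale + 0.5)
    return min(q, QMAX - 1) if w >= 0 else -min(q, QMAX)

clean_scale = 0.04 / QMAX            # 0.005
poisoned_scale = 0.50 / QMAX         # 0.0625
stretch = poisoned_scale / clean_scale
zero_threshold = poisoned_scale / 2  # weights below this magnitude round to 0

print(f"scale stretch: {stretch:.1f}x, zero threshold: {zero_threshold}")
for w in (0.01, 0.02, 0.03, 0.04):
    print(w, code(w, clean_scale), "->", code(w, poisoned_scale))
```

Under the clean scale the four sample weights map to codes 2, 4, 6, and 7 (the top value clips to +7 in this convention). Under the stretched scale they map to 0, 0, 0, and 1.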

Where the QCA paper sits next to existing PTQ work

| System | Per-block scales | Outlier handling | Defended against QCA? |
| --- | --- | --- | --- |
| Naïve round-to-nearest (older PTQ) | No (single scale per layer) | None | Trivially breaks under outlier-heavy weights; well-known prior risk |
| AWQ | Per-channel / per-group | Activation-aware salience picks “protect” channels | Reported to still land: the salience signal is computed from clean data, and the attack hides in unprotected channels |
| GPTQ | Per-group | Second-order re-balancing of remaining weights | Reported to still land: re-balancing helps with accuracy loss, not with adversarial outliers placed inside one block |
| GGUF I-quants | Per-block scale + importance signal | Importance signal anchors which weights to protect | Reported to still land: the family the paper explicitly targets; block-local scale stretching is the mechanism |
| TurboQuant 2-bit KV (explainer) | Per-block scale on the KV cache | Block-size 8/16 limits damage radius | Different target (KV cache, not weights): out of scope for this paper's attack, but the same scale-stretching geometry is what QCA exploits in weights |

The defense surface for QCA does not collapse into a single fix. Smaller block sizes (e.g. 32 instead of 128 weights per block) shrink the “blast radius” — but they also raise metadata overhead and reduce the regime where AWQ's salience trick is helpful, so adopting them blindly hurts the benign accuracy story. Outlier-detection scans on weights before quantization help, but the paper's attackers explicitly hide outliers inside otherwise-natural blocks. The cleanest mitigation is a layer the broader agent stack already needs: red-team and audit the exact binary you ship, not the FP16 checkpoint you trained.
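The block-size trade-off can be put in rough numbers. The sketch below assumes a simplified layout with exactly one 16-bit scale per block and no other metadata, which is not how every real format stores its scales. Both function names are hypothetical.

```python
def blast_radius(block_size):
    """Number of natural weights that share a scale with one injected outlier."""
    return block_size - 1

def scale_overhead_bits(block_size, scale_bits=16):
    """Extra bits per weight spent on storing one scale per block (assumed layout)."""
    return scale_bits / block_size

for size in (128, 32):
    radius = blast_radius(size)
    overhead = scale_overhead_bits(size)
    print(f"block={size}: {radius} weights exposed, "
          f"{overhead} extra bits/weight, {4 + overhead} bits/weight total at INT4")
```

Under these assumptions, dropping from 128 to 32 weights per block cuts the exposed weights from 127 to 31 while quadrupling the per-weight scale overhead from 0.125 to 0.5 bits.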

Two takeaways live alongside the attack. First, PTQ is part of the threat model — “the model was clean before quantization” is no longer a sufficient release statement. Second, the failure mode is a clean illustration of the outlier problem from the LLM Internals track: anything that lets a single weight stretch a per-block scale is a leverage point, for performance optimizers and adversaries alike. The same geometry that makes AWQ / GPTQ / GGUF I-quants good at handling natural outliers is what QCA turns against them.

Goes deeper in: LLM Internals → Quantization → The Outlier Problem


Frequently Asked Questions