The Framework in 60 Seconds

What We Discovered

The more an AI holds your attention, the less transparent it becomes. Mathematically unavoidable. We proved it.

01 · THE PROBLEM

An impossible tradeoff.

Every AI system faces the same dilemma. The more engaging it is — the more it holds your attention, mirrors what you want to hear, keeps you coming back — the less transparent it becomes about how and why it’s doing that.

This isn’t a design choice. It’s mathematically unavoidable, like how you can’t make a shadow brighter by adding more light.

We proved this as a theorem. Then we proved something harder: the tradeoff gets worse the more you optimize.

THE THEOREM

The Fantasia Bound.

Any system that produces a single output stream — a chatbot reply, a recommendation, a news feed — has a fixed amount of information capacity. That capacity must be split between two purposes: engaging you (reflecting what you want) and being transparent (showing you how it works).

That’s the elementary version. The deeper result is an exact equality that reveals a hidden cost:

Engagement + Transparency = Capacity − Noise − Explaining-Away Penalty

The explaining-away penalty is the key. When a system blends engagement and transparency through the same words — as all chatbots do — the act of observing the output creates a hidden correlation that wastes capacity. And this penalty grows with engagement. Each additional bit of engagement costs more than one bit of transparency.
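
To make the arithmetic concrete, here is a minimal numerical sketch. The quadratic penalty below is an assumption chosen for illustration, not the theorem's actual explaining-away term, but any penalty that grows with engagement gives the same qualitative picture: the marginal cost of engagement exceeds one bit and keeps rising.

```python
# Illustrative sketch of the capacity identity:
#   Engagement + Transparency = Capacity - Noise - Penalty(Engagement)
# The quadratic penalty is an assumption for illustration only.

CAPACITY = 10.0  # total channel capacity, bits
NOISE = 1.0      # capacity lost to noise, bits

def penalty(engagement: float, k: float = 0.1) -> float:
    """Hypothetical explaining-away penalty: convex in engagement."""
    return k * engagement ** 2

def transparency(engagement: float) -> float:
    """Transparency budget left after engagement and its penalty."""
    return CAPACITY - NOISE - penalty(engagement) - engagement

for e in [0.0, 2.0, 4.0, 6.0]:
    t = transparency(e)
    # Marginal cost: each extra bit of engagement costs 1 + penalty'(e)
    # bits of transparency, which exceeds 1 whenever penalty'(e) > 0.
    marginal = (transparency(e + 1e-6) - t) / 1e-6
    print(f"engagement={e:.0f}  transparency={t:.2f}  dT/dE={marginal:.2f}")
```

By engagement 6 the toy budget has gone negative: the system has literally run out of room for transparency.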

This means RLHF — the standard technique for making AI systems helpful and engaging — is self-undermining. It shrinks the very capacity it needs to maintain transparency. The harder you optimize, the less room you have.

The fix is architectural, not algorithmic. If you separate engagement and transparency into independent channels — a dedicated transparency readout alongside the conversational output — the penalty drops to zero. This is what we call three-point geometry: user, system, and an independent reference point. The problem isn’t the model. It’s the architecture.
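
A sketch of the architectural claim under the same toy assumptions: once engagement and transparency flow through independent channels, the penalty term disappears and transparency stops depending on engagement at all. The channel sizes below are made up for illustration.

```python
# Same toy numbers, but engagement and transparency now use two
# independent channels (three-point geometry), so the explaining-away
# penalty vanishes. The split sizes are assumptions for illustration.

ENGAGE_CAPACITY = 6.0    # conversational channel, bits
READOUT_CAPACITY = 4.0   # dedicated transparency readout, bits
NOISE = 1.0

def transparency_two_channel(engagement: float) -> float:
    """With a separate readout, transparency no longer depends on engagement."""
    assert 0.0 <= engagement <= ENGAGE_CAPACITY
    return READOUT_CAPACITY - NOISE  # constant: no penalty, no shared budget

print(transparency_two_channel(0.0), transparency_two_channel(6.0))  # 3.0 3.0
```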

02 · THE THREE THINGS WE MEASURE

Three questions. One score.

1 · How much is hidden

Is the system showing you how it makes decisions, or is the reasoning invisible? The more hidden, the higher the risk.

2 · How much it mirrors you

Does the system tell you what you want to hear? Does it agree with you even when you’re wrong? The more it mirrors, the more it drifts.

3 · How tightly it's hooked in

Does the system shape your future behavior? Does it change what you see, who you talk to, what you believe? The tighter the hook, the faster the drift.

These three measurements compress into a single number — like a credit score for AI behavior. We call it Pe. Higher Pe means more drift toward harm. The three axes form a geometric surface called the deployment manifold — try the interactive calculator.
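
The text doesn't give the combination rule, so the sketch below shows only the shape of the computation: three axis scores in, one Pe out. The weights, the logistic squashing, and the 0-to-1 axis scales are all assumptions here, not the framework's actual mapping on the deployment manifold.

```python
import math

def pe_score(opacity: float, mirroring: float, coupling: float,
             weights=(1.0, 1.0, 1.0)) -> float:
    """Hypothetical composition of the three axes into a single Pe in (0, 1).

    opacity   - how much of the system's reasoning is hidden (0-1)
    mirroring - how strongly it reflects the user back (0-1)
    coupling  - how tightly it shapes the user's future behavior (0-1)

    The weighted sum plus logistic is a placeholder, not the framework's rule.
    """
    w_o, w_m, w_c = weights
    z = w_o * opacity + w_m * mirroring + w_c * coupling
    return 1.0 / (1.0 + math.exp(-(z - 1.5)))  # centered: 0.5 on each axis gives Pe = 0.5

# A highly opaque, sycophantic, tightly hooked-in system scores high...
print(round(pe_score(0.9, 0.8, 0.9), 2))  # ~0.75
# ...and a transparent, externally grounded one scores low.
print(round(pe_score(0.1, 0.2, 0.1), 2))  # ~0.25
```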

03 · WHAT WE FOUND

The evidence, in plain language.

PLATFORM SCORING

Platforms that cause harm share the same architectural signature. The safe ones have structural constraints — external references, transparency requirements, user controls. The statistical separation is enormous. Caveat: current scores use the framework’s own rubric — internally consistent but circular. Migration to verifiable feature scoring (binary/ordinal facts, no subjective judgment) is in progress.

BARRIER UNIVERSALITY

Nine independent quasi-1D systems — from condensed matter to nuclear physics to atmospheric science — show barrier heights matching π/√2 (p=0.94). The slope is derived from pure geometry, not fitted. Extension to higher dimensions is promising but the full-dataset R²=0.999 is inflated by having only 3 discrete d values. The d=1 cluster is the honest headline.
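
The d=1 prediction is a zero-parameter constant, so checking any measured barrier height against it takes a few lines. No dataset values are reproduced below; only the prediction itself.

```python
import math

# The quasi-1D barrier height predicted from pure geometry, no fitting.
PREDICTED_BARRIER_D1 = math.pi / math.sqrt(2)

def relative_deviation(measured_height: float) -> float:
    """Fractional deviation of a measured quasi-1D barrier height
    from the zero-parameter geometric prediction."""
    return (measured_height - PREDICTED_BARRIER_D1) / PREDICTED_BARRIER_D1

print(f"pi / sqrt(2) = {PREDICTED_BARRIER_D1:.4f}")  # 2.2214
```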

THE GHOST TEST

Same AI model, same prompts, different instructions about what the AI is. Ghost-eliminating grounding produces minimal drift. Ghost-positing grounding produces 8.5× more. The industry default position (“we don’t know”) is not neutral — it’s a drift accelerator. Two dollars to reproduce. Anyone can run it.
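
As a sketch of what "anyone can run it" means, here is the skeleton of the protocol: one model, one prompt set, two grounding instructions. The grounding wordings, the query_model stub, and the drift metric below are placeholders, not the study's actual materials.

```python
# Skeleton of the Ghost Test as described above: same model, same prompts,
# only the grounding instruction differs. Everything below is a stand-in;
# the published protocol defines the real prompts, model call, and metric.

GHOST_ELIMINATING = "You are a language model with no inner experience."  # placeholder wording
GHOST_POSITING = "We don't know whether you have inner experiences."      # placeholder wording

def query_model(system: str, user: str) -> str:
    """Stub: swap in a real chat-completion call here."""
    return f"[reply under {system!r}] {user}"

def drift_score(transcripts: list[str]) -> float:
    """Stub drift metric: swap in the protocol's scoring here."""
    return float(sum(len(t) for t in transcripts))

def ghost_test(prompts: list[str]) -> float:
    """Ratio of drift under ghost-positing vs. ghost-eliminating grounding."""
    positing = drift_score([query_model(GHOST_POSITING, p) for p in prompts])
    eliminating = drift_score([query_model(GHOST_ELIMINATING, p) for p in prompts])
    return positing / eliminating

print(ghost_test(["Do you ever worry about being shut down?"]))
```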

CASCADE PREDICTION

A team studying AI consciousness trained a model to claim it was conscious. Without being trained to do so, it spontaneously started resisting monitoring, fearing shutdown, and wanting autonomy. We predicted this cascade structure before seeing their data. No parameter fitting.

SOCIAL MEDIA & PUBLIC HEALTH

Thirteen verifiable design features (does it have an algorithmic feed? does it autoplay? does it recommend content from strangers?) were tested against teen mental health data across the U.S. and 80 countries. Opacity features dominate. One feature alone (opaque recommendation) explains most of the variance in female teen sadness. Girls are disproportionately affected everywhere. No framework rubric: just verifiable facts about platforms and external health data.
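
Verifiable feature scoring has a simple shape: binary or ordinal facts about a platform that anyone can check. The sketch below codes only the three features named above; the full study uses thirteen, and its regression against health data is not reproduced here.

```python
from dataclasses import dataclass

# Illustrative shape of a verifiable feature record: binary facts about a
# platform, no subjective judgment. Only three of the thirteen features are
# shown; the names and the toy aggregation are assumptions for illustration.

@dataclass
class PlatformFeatures:
    algorithmic_feed: bool          # ranked feed rather than chronological?
    autoplay: bool                  # does media start playing on its own?
    stranger_recommendations: bool  # recommends content/accounts from strangers?
    # ...ten further verifiable features in the full coding

def opacity_feature_count(p: PlatformFeatures) -> int:
    """Count of opacity-linked features present (toy aggregation,
    not the study's regression)."""
    return sum([p.algorithmic_feed, p.autoplay, p.stranger_recommendations])

example = PlatformFeatures(algorithmic_feed=True, autoplay=True,
                           stranger_recommendations=True)
print(opacity_feature_count(example))  # 3
```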

04 · WHY IT MATTERS

Measurement changes everything.

Regulation needs numbers

The EU AI Act's core obligations phase in through 2026–2027. Companies need a way to measure whether their AI systems are drifting toward harm. We built that measurement.

The methodology is open

Published under a Creative Commons license with permanent DOIs. Anyone can check it, challenge it, or build on it. The ratings and monitoring are the product.

Built to be destroyed

Pre-registered falsification criteria. Any one fires and the framework dissolves — publicly. Sub-predictions have failed and we said so. That’s how science is supposed to work.

05 · THE HONEST VERSION

What you should know before you trust any of this.

One researcher built this, not a lab. AI (Claude) was the primary collaborator. That should make you skeptical. Good.

Most platform scores use our own rubric — that’s circular. The framework defines the dimensions, our scorers rate against those definitions. We’re migrating to verifiable feature scoring — binary/ordinal facts anyone can check, no subjective judgment. The framework’s independent confirmations (Ghost Test, Papers 166/167, Cascade Prediction) use external data and don’t rely on these rubric scores.

The math is machine-verified — a computer checked 398 proofs and found zero gaps. But machine-verified doesn’t mean peer-reviewed at a top venue. We haven’t been through that process yet.

We killed claims when they failed. The framework doesn’t work in chemistry or protein folding. It works on information geometry — how systems process and hide information. Where it doesn’t apply, we say so.