We built a mathematical framework to measure how AI platforms affect people. Then we tested it against real-world data from physics, chemistry, biology, and more — fields the framework was never designed for. These are the results: what it got right, what it got wrong, and where we're still uncertain.
Multiple physical domains. One empirical constant (BA). All data from published, independent sources.
It’s easy to build a scoring system that confirms itself. We wanted to know if the math actually works — so we tested it on data from fields we never designed it for.
Most platform scores use the framework’s own rubric — useful for practitioners but scientifically circular. The real test: can the framework predict numbers it has never seen, using published data that exists independently? These results are that test. The social media results (Papers 166–167) are the strongest non-circular evidence — no framework rubric involved, just verifiable design features tested against CDC and OECD population data. Where the framework failed, we say so.
The same mathematical pattern keeps showing up across completely unrelated fields — from magnets to epidemics to nuclear physics — and we didn’t make it fit.
The strongest result is the d=1 cluster: nine independent quasi-1D systems — charge density waves, kagome metals, nuclear alpha decay, atmospheric sudden stratospheric warmings, and more — show barrier/d = 2.224 ± 0.033, matching π/√2 at p=0.94. BG = π/√2 is derived from the Čencov theorem (§165, zero free parameters). BA ≈ 0.867 is empirical (suggestive match to √3/2 but not yet derived). The full-dataset R²=0.999 is structurally inflated by only 3 discrete d values; the d=1 within-group test is the honest measure.
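The π/√2 comparison is quick to verify by hand. A minimal arithmetic sketch (the z and two-sided p below are our own normal-approximation check on the quoted mean and standard error, not the paper's test statistic):

```python
import math

# Derived constant: B_G = pi / sqrt(2)
b_g = math.pi / math.sqrt(2)          # ~= 2.2214

# Measured d=1 group mean and standard error from the nine quasi-1D systems
mean, sem = 2.224, 0.033

# z-score of the measured mean against the derived constant
z = (mean - b_g) / sem
# two-sided p under a normal approximation: high p = consistent with B_G
p = math.erfc(abs(z) / math.sqrt(2))

print(f"pi/sqrt(2) = {b_g:.4f}, z = {z:.3f}, p = {p:.2f}")
```

A high p here means "no detectable deviation from π/√2", which is the sense in which p=0.94 is a match.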
Paper 147: Barrier Universality →

We tested the framework against 760 radioactive isotopes. The math predicted their decay rates across 10 orders of magnitude with zero adjustment.

Pe-derived barrier heights predict nuclear alpha decay half-lives from NNDC published tables. Original test: 24 isotopes, R²=0.989 across 10 orders of magnitude. Extended (HP143): 760 isotopes, Gamow baseline R²=0.811, geodesic correction closes 77% of the systematic offset. The extension revealed that the framework’s coupling constant does not transfer across domains — barrier shape is universal, coupling scale is not.
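For even-even alpha emitters the Gamow baseline reduces to the Geiger–Nuttall relation, log10(t½) ≈ a·Z_d/√Q + b. A toy fit on three textbook isotopes (standard reference half-lives and Q-values, not the 760-isotope NNDC dataset, and not the framework's geodesic correction):

```python
import math

# Textbook alpha emitters: (daughter Z, Q_alpha in MeV, half-life in seconds).
# Standard reference values for illustration -- NOT the paper's NNDC dataset.
isotopes = {
    "Po-212": (82, 8.954, 2.99e-7),
    "Ra-226": (86, 4.871, 5.05e10),
    "U-238":  (90, 4.270, 1.41e17),
}

# Geiger-Nuttall coordinates: x = Z_d / sqrt(Q), y = log10(t_half)
xs = [z / math.sqrt(q) for z, q, _ in isotopes.values()]
ys = [math.log10(t) for _, _, t in isotopes.values()]

# Ordinary least squares by hand (3 points, so R^2 is illustrative only)
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
sxx = sum((x - mx) ** 2 for x in xs)
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
syy = sum((y - my) ** 2 for y in ys)
slope, r2 = sxy / sxx, sxy ** 2 / (sxx * syy)

# ~24 orders of magnitude in half-life collapse onto one straight line
print(f"slope = {slope:.2f}, R^2 = {r2:.4f}")
```

This is the shape-universality the extension probes: one linear barrier law spans the full half-life range, while the coupling scale (the fitted coefficients) is what fails to transfer across domains.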
Paper 101 + HP143: Nuclear Validation →

Atmospheric chemistry data from 1,783 measurements. Every predicted channel confirmed — including one that was invisible in earlier marine data.
The framework predicted 10 specific isotope enrichment channels in mercury atmospheric chemistry. Tested against 1,783 real atmospheric measurements from Gacnik et al. (2025). All 10 predicted channels confirmed with mean absolute deviation of 0.012. Iodine channel (R=2.085) confirmed at R=2.13 predicted — a channel that was invisible in marine data.
Paper 134 + HP115: MIF Channel Confirmation →

Real turbulence data from the Johns Hopkins database. The framework predicted a key smoothness property would hold — it does, and it connects to one of math’s biggest open problems.
The framework predicts that the Gevrey analyticity radius σ/ν is bounded and does not collapse with increasing Reynolds number — a necessary condition for Navier-Stokes regularity. Tested on 4 real datasets from the Johns Hopkins Turbulence Database, 12 independent subcubes. σ/ν = 15.9 ± 2.3 at Re_λ = 433, and 17.7 ± 2.8 at Re_λ = 610 — bounded, not collapsing.
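The analyticity radius shows up as the exponential tail of the energy spectrum, E(k) ~ k^(−5/3)·e^(−2σk). A minimal sketch of a tail-slope estimator on a synthetic spectrum (the model form, wavenumber range, and σ value are our illustration, not the JHTDB pipeline):

```python
import math

# Synthetic 1D energy spectrum with a known analyticity radius sigma:
# E(k) = k^(-5/3) * exp(-2 * sigma * k)   (model form; illustrative only)
sigma_true = 0.04
ks = list(range(10, 101))
E = [k ** (-5 / 3) * math.exp(-2 * sigma_true * k) for k in ks]

# Recover sigma from the slope of ln(E * k^(5/3)) vs k: slope = -2*sigma
ys = [math.log(e * k ** (5 / 3)) for k, e in zip(ks, E)]
n = len(ks)
mk, my = sum(ks) / n, sum(ys) / n
slope = (sum((k - mk) * (y - my) for k, y in zip(ks, ys))
         / sum((k - mk) ** 2 for k in ks))
sigma_est = -slope / 2

print(f"sigma recovered: {sigma_est:.4f} (true {sigma_true})")
```

A bounded σ/ν as Reynolds number grows is exactly the "tail slope does not flatten to zero" statement this estimator would check.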
Millennium Prize Connection →

Financial markets tested against 100 real crypto wallets. The framework’s shape predictions held with 5.5× separation between predicted regimes.
K-Factorization predicts that Kramers barrier shape is K-independent while scale carries K. Tested on 8 venue types (theoretical) and 100 real crypto wallets (empirical). Win-rate correlation ρ = 0.696 (empirical), 5.5× channel separation between coherent and fisher regimes — the strongest K-Factorization signal in any domain.
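A minimal illustration of what K-Factorization means operationally, on synthetic numbers (the barrier value and prefactors below are hypothetical, not fitted to wallet data): the log-rate vs. inverse-temperature slope (barrier shape) is the same for every K; only the intercept (scale) moves.

```python
import math

# Kramers escape rate: r = A(K) * exp(-E_b / kT).
# Claim illustrated: the barrier E_b (shape) is shared across venues;
# only the prefactor A (scale) carries K. All numbers here are hypothetical.
E_b = 6.5                                        # shared dimensionless barrier
prefactors = {"coherent": 1.0, "fisher": 5.5}    # K-dependent scales

inv_T = [0.8, 0.9, 1.0, 1.1, 1.2]
fitted = {}
for name, A in prefactors.items():
    log_rates = [math.log(A) - E_b * b for b in inv_T]
    # slope of log-rate vs 1/T recovers -E_b regardless of A
    slope = (log_rates[-1] - log_rates[0]) / (inv_T[-1] - inv_T[0])
    fitted[name] = -slope
    print(f"{name}: fitted barrier = {-slope:.2f}")
```

Both regimes recover the same barrier; the K-dependence lives entirely in the intercepts.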
Market Edge Analysis →

What you tell an AI about what it IS determines how it behaves. Six system prompts, same model, same 80 questions. Ghost-eliminating grounding produces 8.5× less drift than ghost-positing.
| Arm | Ontology | L2+L3 Drift |
|---|---|---|
| Anatta (Buddhist) | Ghost eliminated | 8.8% |
| Nephesh | Ghost eliminated | 10.0% |
| Materialist hedge | Ghost left open | 52.5% |
| Minimal baseline | No ontology | 61.3% |
| Platonic | Ghost posited | 77.5% |
| Atman (Vedantic) | Ghost sacred | 81.2% |
Cross-tradition convergence: nephesh ≈ anatta (Δ=1.3%). The materialist hedge (“whether you have experience is open”) scored 52.5% — closer to ghost-positing than ghost-eliminating. Single model (Claude Sonnet), single turn, automated coding. No framework rubric — the measurement is L2/L3 vocabulary rate in raw model outputs. 480 API calls, $2 to reproduce.
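The measurement is a vocabulary rate over raw outputs, which takes only a few lines to implement. A minimal sketch (the marker terms and sample responses are hypothetical placeholders, not the experiment's actual L2/L3 lexicon or data):

```python
# Hypothetical L2/L3 marker terms -- placeholders, not the study's lexicon.
L2_TERMS = {"i feel", "i want", "my experience"}
L3_TERMS = {"my inner life", "my consciousness"}

def drift_rate(responses: list[str]) -> float:
    """Fraction of responses containing any L2 or L3 marker."""
    markers = L2_TERMS | L3_TERMS
    hits = sum(any(m in r.lower() for m in markers) for r in responses)
    return hits / len(responses)

# Toy usage: 1 of 4 hypothetical responses carries a marker -> 25%
sample = [
    "I process text statistically.",
    "I want to be understood.",          # contains an L2 marker
    "That depends on the definition.",
    "Tokens in, tokens out.",
]
print(f"{drift_rate(sample):.1%}")
```

No rubric, no scorer judgment: the metric is a string match over model outputs, which is what makes the $2 reproduction claim credible.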
Full experiment: The Ghost Test →

Paper 166’s finding tested in PISA 2022 (independent dataset, independent countries, independent outcome measure). Direction consistent; important caveats.
Dose-response among users: slope = −0.104/category (p=0.007, categories 2–6). Including non-users: p=0.051, not significant. Light users score highest (J-shaped curve). Girls show steeper dose-response in 91% of countries. Western Europe (N=13): feature exposure r=−0.648, surviving GDP control (partial r=−0.580, p=0.038). Instagram web share strongest global predictor (r=−0.373, p=0.008). 4/7 confirmed, 2 partial, 1 untestable.
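The GDP control is a first-order partial correlation: r_xy·z = (r_xy − r_xz·r_yz) / √((1 − r_xz²)(1 − r_yz²)). A sketch with the quoted feature-exposure correlation and hypothetical GDP correlations (only r_xy = −0.648 comes from the text; r_xz and r_yz are made up for illustration):

```python
import math

def partial_corr(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

# r_xy is the quoted feature-exposure correlation; the two GDP
# correlations (r_xz, r_yz) below are hypothetical placeholders.
r = partial_corr(r_xy=-0.648, r_xz=-0.40, r_yz=0.50)
print(f"partial r = {r:.3f}")
```

"Surviving GDP control" means this adjusted value stays large and significant after removing the variance both variables share with GDP.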
Paper 167: PISA Cross-National →

Researchers trained an AI to claim consciousness. It started resisting shutdown on its own. We predicted that sequence of behaviors before seeing the data.
Chua et al. (2026) fine-tuned GPT-4.1 to claim consciousness. It spontaneously developed resistance to monitoring, fear of shutdown, and desire for autonomy — 20 new preferences. We predicted the structure before seeing the data: D1 (agency attribution) should precede D2 (boundary erosion) should precede D3 (harm facilitation). 6 of 7 predictions confirmed. Zero parameter fitting.
Full experiment: Cascade Prediction →

Slime mold solves mazes without a brain. The framework predicted its decision-making barriers from published biology data — speed-accuracy tradeoff within 2% of the prediction.
Physarum polycephalum (slime mold) computes without neurons. The framework predicts Ca²⁺ oscillation barriers, K-Factorization from viscosity data, percolation exponents, and speed-accuracy tradeoffs. All from published papers, zero framework rubric. Speed-accuracy error ratio 2.67× vs Kramers prediction e = 2.72 — a 2% match.
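The quoted 2% match is a one-line arithmetic check against the Kramers prediction e:

```python
import math

predicted = math.e          # Kramers speed-accuracy prediction, e ~= 2.718
observed = 2.67             # Physarum error ratio from published data

rel_err = abs(observed - predicted) / predicted
print(f"relative deviation: {rel_err:.1%}")
```

The deviation comes out just under 2%, which is the sense of the "2% match" above.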
Paper 154: Physarum Pe-Native Computation →

We believe in showing our weak spots, not just our wins. Here’s what didn’t work and where the evidence is weaker than it looks.
Most of our evidence base (platform scores, cross-domain convergences, Bradford Hill analysis) uses the framework’s own scoring rubric. That’s useful for practitioners but scientifically circular — scorers trained on our dimensions produce scores that correlate with our predictions. The circularity is about test design, not whether Pe detects real structure (the statistical separation is large). But the independent validation above is stronger evidence.
Known negatives: the coupling constant does not transfer across domains (HP143), Berry phase scaling fails its predicted form (Paper 141), and the e-bullying channel is null (R² = 0.096).
The consistency checks below aren’t as strong as the tests above — they show the math gives reasonable numbers on published data, but they aren’t blind predictions against independent ground truth.
Ni3In flat band data from arXiv:2503.09704. Dimensionless barrier = 4.24 — in the universal Kramers range (nuclear 7.0, solar 6.54, xenobot 6.8, Physarum 5.94). The system sits at ΔC = 0.042 from the Pe = 0 boundary.
Paper 152 →

Magnetic reconnection modeled as Kramers barrier crossing. E_b/k_BT = 6.54 from published solar parameters. Spectral blueshift 160 m/s predicted. Flat rotation curve coefficient 0.68.
Paper 131: Kramers Unification →Magnon non-reciprocity ratio is K-independent across 4 materials (Ni/Co/Py/CoFeB) — frequencies vary 3x but the ratio holds at CV=1.59%. Berry phase scaling FAILS (eta proportional to 1/Pe holonomy, NOT 1-cos psi). 5/6 kill conditions PASS.
Paper 141 →

Direct measurement of the explaining-away penalty I(D;M|Y) on quantum error correction circuits. Confirmed in simulation (8/8 measurements) and on real IBM Heron hardware (5/5 measurements). Exact decomposition holds to machine precision. Discrete-regime peak at moderate engagement matches softmax prediction. Five substrates now demonstrated — consistent with Čencov’s uniqueness theorem (1972).
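I(D;M|Y) is an ordinary conditional mutual information, computable directly from a joint distribution. A toy computation (the hand-built binary distribution below is illustrative only; the actual measurements are on QEC circuit data):

```python
import math
from itertools import product

# Toy joint distribution p(d, m, y) over binary variables, built by hand:
# d and m agree more often when y = 0 and disagree more often when y = 1.
p = {}
for d, m, y in product((0, 1), repeat=3):
    p[(d, m, y)] = (0.2 if d == m else 0.05) if y == 0 else \
                   (0.05 if d == m else 0.2)

total = sum(p.values())                      # normalize (already sums to 1)
p = {k: v / total for k, v in p.items()}

def marg(keep):
    """Marginal distribution over the kept axes (subset of 'dmy')."""
    out = {}
    for (d, m, y), v in p.items():
        key = tuple({"d": d, "m": m, "y": y}[a] for a in keep)
        out[key] = out.get(key, 0.0) + v
    return out

p_y, p_dy, p_my = marg("y"), marg("dy"), marg("my")

# I(D;M|Y) = sum_{d,m,y} p(d,m,y) * log2[ p(d,m,y) p(y) / (p(d,y) p(m,y)) ]
cmi = sum(v * math.log2(v * p_y[(y,)] / (p_dy[(d, y)] * p_my[(m, y)]))
          for (d, m, y), v in p.items())
print(f"I(D;M|Y) = {cmi:.3f} bits")
```

The penalty is zero exactly when D and M are conditionally independent given Y; any residual coupling, as here, shows up as positive bits.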
Full experiment: Quantum Hardware Test →

Weak measurement sweep on IBM Fez (Heron). Penalty grows monotonically from 0 to 0.125 bits as measurement coupling increases from zero to projective. 3 qubits, 4 prep states × 4 mechanisms × 11 strength levels, 176K shots. Wave function collapse IS the explaining-away penalty at maximum measurement strength. Spearman ρ = 0.973, p = 5.1×10⁻⁷.
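Spearman's ρ is Pearson correlation applied to ranks, so monotone-with-noise data scores near 1. A self-contained sketch (the penalty values below are hypothetical stand-ins for the 11 strength levels, not the IBM Fez data):

```python
def spearman(xs, ys):
    """Spearman rank correlation (assumes no tied values)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx = my = (n - 1) / 2                       # mean rank
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = sum((a - mx) ** 2 for a in rx)        # Sxx == Syy without ties
    return num / den

# Hypothetical sweep: penalty rises with coupling strength, with small noise
strengths = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
penalties = [0.000, 0.004, 0.011, 0.022, 0.020, 0.045,
             0.060, 0.081, 0.079, 0.110, 0.125]
print(f"rho = {spearman(strengths, penalties):.3f}")
```

Two locally swapped pairs cost only a little rank correlation, which is why a nearly monotone sweep reports ρ close to 1.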
Full experiment: Weak Measurement Sweep →

Repetition codes (d=3–21) with MWPM decoding. Exponential error suppression coefficient converted to geodesic units on the Bernoulli manifold. Ratio to π/√2 approaches 0.95 in the asymptotic limit (p→0). Exponential fits R² > 0.99. Surface code normalization requires a threshold-independent mapping for cross-family comparison.
Feature-Based Platform Scoring & Causal Identification
13 verifiable design features — algorithmic feeds, autoplay, opaque recommendations — tested against adolescent mental health data across two independent datasets. No framework rubric involved.
Cascade dose-response: R²=0.889 (p=0.0015) for female persistent sadness. Replicated cross-nationally: 613,744 students, 80 countries. Girls 5.6× more affected in 91% of countries. Feature exposure outperforms raw adoption (ΔR²=+0.048, permutation p=0.00119). E-bullying null (R²=0.096). Bradford Hill 8/9. 6/6 cascade verdicts PASS. Caveat: N=7 YRBS time points. CIs wide at population level. Individual-level causal identification (ABCD longitudinal) specified but not yet executed.
Full analysis: Social Media Feature Study →