
Monitoring & Control

Each model is plotted in a two-dimensional space defined by independent analysis (Monitoring) and strategic belief revision (Control). Marker size indicates overall MMS.

Ability Fingerprints

Raw ipsative profiles across the four core sub-abilities. Hover a line to isolate a specific model.
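
As a rough illustration of what an ipsative profile is, the sketch below centers one model's raw scores on that model's own mean, so each value shows how an ability compares with the model's other abilities rather than with other models. The source does not state how its ipsative scores are computed; mean-centering is the common definition and is assumed here. The function name `ipsatize`, the label `FourthAbility`, and all numbers are hypothetical.

```python
from statistics import mean


def ipsatize(raw_scores):
    """Center one model's raw scores on that model's own mean.

    Assumption: ipsative score = raw score minus the within-model mean.
    The source does not confirm this exact formula.
    """
    baseline = mean(raw_scores.values())
    return {ability: score - baseline for ability, score in raw_scores.items()}


# Hypothetical raw scores for a single model (not data from the study).
# "FourthAbility" is a placeholder label for illustration only.
example = {
    "Monitoring": 70.0,
    "Control": 60.0,
    "Evaluation": 50.0,
    "FourthAbility": 60.0,
}

profile = ipsatize(example)

# The ability with the most negative ipsative score is the model's
# relative weak point.
weakest = min(profile, key=profile.get)
```

Under this definition a profile always sums to zero, so a positive value on one ability is necessarily balanced by negative values elsewhere.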

Family Archetypes

Aggregate profiles by model family. Select families to compare their cognitive signatures.

Ipsative Score Heatmap

Click any header to sort. Blue marks a positive ipsative score; red marks a negative one.

Research Findings

F-01
Evaluation scales with size. Control does not.

Within each family, Evaluation improves by 5–12 points as model size increases. Control shows no such trend, a dissociation replicated across all 12 families.

F-02
Two behavioural archetypes emerge.

Argument-evaluators revise their beliefs based on the logic of an argument (Anthropic). Statistics-followers revise toward the majority position (xAI, GPT-5.x).

F-03
Judge dimension predicts robustness.

The normative/informational judge axis correlates with adversarial robustness at ρ = −0.82 (p = 0.002), making it the strongest predictor found.
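
For readers who want to see how a coefficient of this kind is obtained, here is a minimal sketch. It assumes ρ denotes Spearman's rank correlation, which the source does not state. The helper names `average_ranks` and `spearman_rho` and the two input lists are hypothetical and do not reproduce the study's data.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical per-model values (not data from the study).
judge_axis = [0.1, 0.4, 0.5, 0.7, 0.9]
robustness = [0.9, 0.8, 0.6, 0.7, 0.2]

rho = spearman_rho(judge_axis, robustness)
```

A negative coefficient means that models ranking higher on the judge axis tend to rank lower on adversarial robustness.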

F-04
Evaluation is universally weakest.

Across all 35 models, Evaluation is the most negative ipsative ability, with no exceptions. Self-evaluation is the systematic bottleneck.