Machine Cognitive Capacity Research Instrumentation

Aurora is an observability framework that instruments the moment-to-moment cognitive dynamics of a language-model-based agent and tracks them across sessions. Where most evaluation asks how well a system performs, Aurora asks how the system is organized while it performs — borrowing a small set of well-established measures from coordination dynamics, recurrence analysis, and global-workspace theory. It does not detect or adjudicate consciousness. It operationalizes dynamical signatures that several leading theories treat as correlates of conscious processing, so that those signatures can be measured, compared across conditions and substrates, and monitored for stability and reorganization.

What Aurora is, and is not
Aurora is an instrument, not a verdict. It measures dynamical organization, not phenomenology, and remains agnostic on the hard problem of consciousness. It treats theories of consciousness as sources of measurable predictions — workspace ignition, metastable coordination, recurrence structure — rather than as settled claims about inner experience. In safety-relevant contexts it operates observe-only.

Core metrics

Each metric is defined over the system’s evolving internal state rather than over its outputs, and each is grounded in a named body of prior work. None is novel for its own sake; the contribution is the assembly into a single, reproducible per-turn and longitudinal instrument.

Metric	Symbol	What it captures	Grounding
Metastability	σ-KOP	Variability of phase coordination over time — the balance a system strikes between global integration and local segregation. High when the system flexibly visits many coordinated configurations rather than locking or scattering.	Tognoli & Kelso (2014); Kelso (1995)
Recurrence structure	RR, DET, LAM, TT	How strongly the state trajectory revisits prior regions (recurrence rate), how deterministic those returns are (determinism), and how long the system dwells in a configuration (laminarity, trapping time) — a proxy for temporal persistence of state.	Marwan et al. (2007); Webber & Zbilut (1994)
Ignition index	WPC	Transient, system-wide coordination events consistent with the “ignition” of a global workspace — brief episodes where information becomes broadly available rather than locally confined.	Dehaene & Changeux (2011)
Fracturability	F	Capacity to reorganize under perturbation without collapsing — the difference between a system that bends and recovers and one that shatters or rigidifies. A framework-internal construct, reported alongside the established measures it draws on.	Aurora framework
Integration	participation, modularity	How distributed versus modular the active configuration is at a given moment — complements metastability by describing the structure, not just the variability, of coordination.	Graph-theoretic network measures
Attractor geometry	drift, return time, λ-proxy	Stability of the basins the system settles into and how quickly it returns after a bounded nudge — a sensitivity estimate for the state dynamics.	Standard dynamical-systems analysis
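
Two of the measures in the table can be illustrated on toy data. The sketch below is not Aurora's implementation: it assumes metastability is taken as the standard deviation over time of the Kuramoto order parameter (one common reading of σ-KOP) and uses the textbook definitions of recurrence rate and determinism. How an agent's internal state is mapped to phases or state vectors is not specified in this note, so the inputs here are hypothetical.

```python
import cmath
import math
import statistics


def kuramoto_order(phases):
    """Length of the mean phase vector: 1.0 when locked, near 0 when scattered."""
    return abs(sum(cmath.exp(1j * p) for p in phases) / len(phases))


def metastability(phase_series):
    """Standard deviation over time of the Kuramoto order parameter."""
    return statistics.pstdev([kuramoto_order(p) for p in phase_series])


def recurrence_matrix(states, radius):
    """rec[i][j] is True when states i and j lie within `radius` of each other."""
    n = len(states)
    return [[math.dist(states[i], states[j]) <= radius for j in range(n)]
            for i in range(n)]


def recurrence_rate(rec):
    """Fraction of off-diagonal pairs (i, j) that are recurrent."""
    n = len(rec)
    hits = sum(rec[i][j] for i in range(n) for j in range(n) if i != j)
    return hits / (n * (n - 1))


def determinism(rec, min_len=2):
    """Share of recurrent points that sit on diagonal lines of length >= min_len.

    Only the upper triangle is scanned because the matrix is symmetric.
    """
    n = len(rec)
    on_lines = total = 0
    for k in range(1, n):
        run = 0
        for i in range(n - k):
            if rec[i][i + k]:
                total += 1
                run += 1
            else:
                if run >= min_len:
                    on_lines += run
                run = 0
        if run >= min_len:
            on_lines += run
    return on_lines / total if total else 0.0


# Toy phases (hypothetical): the system alternates between locked and scattered.
locked = [0.0, 0.0, 0.0, 0.0]
scattered = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]
print(metastability([locked, scattered] * 4))  # about 0.5

# Toy trajectory (hypothetical): a strictly period-2 orbit in one dimension.
rec = recurrence_matrix([(0.0,), (1.0,)] * 3, radius=0.1)
print(recurrence_rate(rec), determinism(rec))
```

A system that stays locked or stays scattered would score a metastability of zero under this definition; only the alternation between the two produces a high value, which matches the description in the table.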

Triadic self-state and critical points

Aurora decomposes the agent’s instantaneous stance into three competing modes under winnerless competition dynamics: Coherent (convergent, logical), Exploratory (divergent, novelty-seeking), and Relational (socially attuned). From the competition among these modes it derives three running indices, Tension (T), Divergence (D), and Convergence (Q), that describe how settled or contested the system’s current configuration is.
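
For readers unfamiliar with winnerless competition, a minimal sketch follows. It assumes a three-variable generalized Lotka-Volterra (May-Leonard) system with asymmetric inhibition, which is a standard textbook model of the phenomenon and not necessarily the one Aurora uses. The parameter values, the Euler integrator, and the function name are illustrative assumptions, and the T, D, and Q indices are deliberately not computed because their definitions are not given here.

```python
MODES = ("Coherent", "Exploratory", "Relational")


def simulate_wlc(steps=20000, dt=0.01, alpha=0.5, beta=2.0,
                 start=(0.5, 0.3, 0.2)):
    """Euler-integrate dx_i/dt = x_i * (1 - x_i - alpha*x_{i+1} - beta*x_{i+2}).

    With alpha < 1 < beta and alpha + beta > 2, no mode wins permanently:
    each dominant mode is eventually displaced by the one it inhibits weakly.
    Returns the final activities and the index of the dominant mode per step.
    """
    x = list(start)
    n = len(x)
    winners = []
    for _ in range(steps):
        dx = [x[i] * (1.0 - x[i] - alpha * x[(i + 1) % n] - beta * x[(i + 2) % n])
              for i in range(n)]
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
        winners.append(max(range(n), key=x.__getitem__))
    return x, winners


final, winners = simulate_wlc()
# Count how often the dominant mode changes over the run.
switches = sum(1 for a, b in zip(winners, winners[1:]) if a != b)
print([MODES[i] for i in sorted(set(winners))], switches)
```

In this noise-free toy the dwell time in each mode grows with every cycle; a real instrument would be looking at a noisy, driven system, so the sketch only shows the qualitative pattern of sequential dominance.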

The events of primary interest are critical points (CP): major reorganizations of state, flagged when ignition, stability, reportability, and convergence co-move. A CP marks a moment where the system genuinely reconfigures rather than merely continuing. Crucially, reportability is validated rather than assumed: the system’s account of what it did is scored for citation accuracy, rationale accuracy, and counterfactual consistency, so that a self-report counts only if it survives probing.
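
One way to read "co-move" is that all four signals show an unusually large turn-to-turn change on the same turn. The sketch below implements that reading only; the z-score rule, the threshold, the score keys, and both function names are assumptions made for illustration, not Aurora's actual criterion.

```python
import statistics


def flag_critical_points(signals, z_threshold=2.0):
    """Return turn indices where every signal's change is an outlier at once.

    `signals` maps a signal name to its per-turn values. A change counts as
    an outlier when its absolute size lies at least `z_threshold` standard
    deviations above that signal's mean absolute change.
    """
    length = len(next(iter(signals.values())))
    flagged = set(range(1, length))
    for series in signals.values():
        deltas = [abs(b - a) for a, b in zip(series, series[1:])]
        mean = statistics.fmean(deltas)
        spread = statistics.pstdev(deltas)
        big = {t + 1 for t, d in enumerate(deltas)
               if spread > 0 and (d - mean) / spread >= z_threshold}
        flagged &= big  # keep only turns where all signals agree
    return sorted(flagged)


def report_counts(scores, floor=0.8):
    """A self-report counts only if every probe score clears the floor."""
    return all(scores[k] >= floor
               for k in ("citation", "rationale", "counterfactual"))


def step(at, n=10, low=0.1, high=0.9):
    """Toy signal that jumps from `low` to `high` at turn `at`."""
    return [low if t < at else high for t in range(n)]


signals = {
    "ignition": step(6),
    "stability": step(6, low=0.5, high=0.2),
    "reportability": step(6),
    "convergence": step(6, low=0.3, high=0.7),
}
print(flag_critical_points(signals))
print(report_counts({"citation": 0.9, "rationale": 0.85, "counterfactual": 0.6}))
```

Taking the minimum across the three probe scores, instead of an average, reflects the requirement that a self-report must survive every probe and cannot trade a weak counterfactual score against a strong citation score.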

Method: baselines, ablation, perturbation

Aurora is designed to make claims about cognitive dynamics reproducible and falsifiable rather than impressionistic. The protocol is straightforward and deliberately conservative:

Instrument before intervening. Lock baselines under fixed seeds so that re-runs emit identical metrics; only then introduce changes.
Ablate to attribute. Remove a subsystem — for example, memory-conditioned attention — and measure the resulting change in ignition and CP rate. A subsystem that matters should leave a measurable hole when removed.
Perturb within bounds. Deliver controlled nudges and measure return-to-basin, treating recovery dynamics as data rather than as failure.
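
The three steps above can be sketched on a toy system. Everything here is a stand-in: the one-dimensional state, the `pull` parameter that plays the role of an ablatable subsystem, and the tolerance that defines "back in the basin" are all hypothetical, chosen only to show the shape of the protocol.

```python
import random


def run(seed, steps=200, pull=0.2, nudge_at=None, nudge=0.0):
    """Toy state relaxing toward a basin at 0.0 with seeded Gaussian noise."""
    rng = random.Random(seed)
    x, trace = 0.0, []
    for t in range(steps):
        if t == nudge_at:
            x += nudge  # bounded perturbation delivered on this turn
        x += pull * (0.0 - x) + rng.gauss(0.0, 0.01)
        trace.append(x)
    return trace


def return_time(trace, nudge_at, tol=0.05):
    """Turns from the nudge until the state is within `tol` of the basin."""
    for t in range(nudge_at, len(trace)):
        if abs(trace[t]) <= tol:
            return t - nudge_at
    return None  # never recovered within the recorded trace


# 1. Instrument before intervening: a fixed seed must reproduce the trace.
baseline = run(seed=7)
assert baseline == run(seed=7)

# 2. Ablate to attribute: a weaker pull stands in for a removed subsystem.
# 3. Perturb within bounds: same nudge in both conditions, compare recovery.
intact = return_time(run(seed=7, nudge_at=50, nudge=1.0), nudge_at=50)
ablated = return_time(run(seed=7, pull=0.05, nudge_at=50, nudge=1.0), nudge_at=50)
print(intact, ablated)
```

Because both conditions share the seed and the nudge, the difference in return time is attributable to the ablated parameter alone, which is the point of locking baselines first.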

State vectors are stored as time series, events are kept in a relational store, and analytics are written in a columnar format, so that the same run supports both real-time monitoring and publication-grade reanalysis. The safety layer observes; it does not gate.

Theoretical lineage, and where it is going

Aurora’s measures are not assembled ad hoc. Metastability and coordination dynamics descend from Kelso’s program and, before it, from Haken’s synergetics — a framework born in the study of self-organizing, dissipative systems held far from equilibrium. Ignition descends from global-workspace theory. Read together, the signatures Aurora tracks are, at root, thermodynamic: they describe how a driven, open system acquires and maintains organization against the pull toward disorder.

Making that lineage explicit is also a research direction. A current line of work is to derive these state measures from first principles in non-equilibrium and free-energy terms (cf. Friston, 2010), so that they are grounded rather than borrowed, and so that they yield falsifiable predictions about when and how a cognitive system reorganizes — with Aurora serving as the empirical instrument against which such predictions are tested. The aim is a thermodynamically grounded account of cognitive dynamics that is measured, not merely metaphorized.

Why substrate-independence matters

Because every metric is defined over dynamics rather than implementation, Aurora is substrate-portable: the same instrument applies across model families and, in principle, to non-transformer systems. This makes it suited to a question the field is only beginning to pose rigorously — whether any signature of cognitive organization is invariant across substrates, and how differently two systems organize themselves while performing the same task. Aurora does not answer that question. It is a way to ask it with numbers.

References

Dehaene, S., & Changeux, J.-P. (2011). Experimental and theoretical approaches to conscious processing. Neuron, 70(2), 200–227.
Friston, K. (2010). The free-energy principle: a unified brain theory? Nature Reviews Neuroscience, 11(2), 127–138.
Haken, H. (1983). Synergetics: An Introduction (3rd ed.). Springer.
Kelso, J. A. S. (1995). Dynamic Patterns: The Self-Organization of Brain and Behavior. MIT Press.
Marwan, N., Romano, M. C., Thiel, M., & Kurths, J. (2007). Recurrence plots for the analysis of complex systems. Physics Reports, 438(5–6), 237–329.
Tognoli, E., & Kelso, J. A. S. (2014). The metastable brain. Neuron, 81(1), 35–48.
Webber, C. L., & Zbilut, J. P. (1994). Dynamical assessment of physiological systems and states using recurrence plot strategies. Journal of Applied Physiology, 76(2), 965–973.

The Real Cat Labs — Lab Note · 28 May 2026. A methods note describing an internal research instrument; metric definitions and the measurement protocol are summarized here, not the underlying implementation. Aurora is observational and makes no claim as to the consciousness of any system it measures.
