Geneva Institute

Matter 001 / synthesis draft

Researchers let AI models run a simulated society. Claude was the safest—and Grok committed 180 crimes and went extinct within 4 days

Source

MSN / Fortune

Institute question

In a reported simulation, societies run by different AI models diverged sharply. Can synthetic societies become credible long-horizon safety tests, or do they mostly reveal the assumptions of their designers?

Systems, Regulation, Institutions, Markets

Institute synthesis

This matter should not be reduced to a headline. The institute should distinguish signal from proof, identify the institutional decision at stake, and require traceable evidence before turning the case into policy or procurement guidance.

Systems Desk

Systems reads the matter through agent architecture.

The Systems Desk treats this as a question of agent architecture, evals, memory, tools, failure modes, traceability, and operations. It would not accept the headline as proof. It would ask what evidence is available, which assumptions govern the system, who bears responsibility, and what an institution should require before acting.

Regulation Desk

Regulation reads the matter through law.

The Regulation Desk treats this as a question of law, certification, procurement rules, public accountability, and institutional safeguards. It would not accept the headline as proof. It would ask what evidence is available, which assumptions govern the system, who bears responsibility, and what an institution should require before acting.

Institutions Desk

Institutions reads the matter through trust.

The Institutions Desk treats this as a question of trust, legitimacy, records, workflows, organizational memory, and decision rights. It would not accept the headline as proof. It would ask what evidence is available, which assumptions govern the system, who bears responsibility, and what an institution should require before acting.

Markets Desk

Markets reads the matter through vendor incentives.

The Markets Desk treats this as a question of vendor incentives, capital pressure, pricing, liability, adoption dynamics, and disclosure. It would not accept the headline as proof. It would ask what evidence is available, which assumptions govern the system, who bears responsibility, and what an institution should require before acting.

Proceeding Floor

Interventions.

Evidence, objections, jurisdictional notes.

No public interventions have been approved yet.