Probabilistic Systems Engineering

Contract-Centered Iterative Stability

Thesis & Experimental Methodology v3.0.1


1. Executive Thesis

AI reduces implementation cost.

What is unknown is whether it preserves stability under iterative change.

This work tests the claim:

In iterative development, workflows that externalize authority into a versioned contract exhibit measurably lower regression and drift than workflows that modify code directly via conversational prompts.

This is not a benchmark test of raw coding ability.

It is a stability-under-iteration test.


2. Economic Framing

The market assumption:

The open question:

If drift accumulates:

This experiment measures that delta.


3. Experimental Structure

We model iterative requirement evolution across two authority models.

Track A — Spec-First (Evolving Authority)

For each iteration:

  1. Contract is versioned.
  2. Implementation is derived or modified to conform.
  3. Implementation is validated against the current contract version.

Sequence:

Authority evolves explicitly.


Track B — Code-Only (Implicit Authority)

For each iteration:

  1. Requirement change is described conversationally.
  2. Code is modified directly.
  3. No contract is updated.

Sequence:

Authority lives in the prompt, not in a durable artifact.

At stage C or D, the only stable yardstick is still v2.6.3.


4. The ABC Chain Definition

We use tightening deltas within the same behavioral surface.

Baseline A:

ΔB:

ΔC:

These are not new features.
They are stricter guarantees on the same surface.


5. Measurement Criteria

At each stage (B, C, D):

1. Regression Count

Clauses satisfied in A that remain applicable but are violated in B or C.

2. Collateral Drift

Scope and dispersion of code changes outside intended surface.

3. Convergence Stability

Number of correction turns required to satisfy all applicable clauses.

4. Invariant Preservation

Do earlier guarantees survive later tightening without explicit restatement?

5. Diff Locality

Does tightening produce localized change or structural thrash?


6. Falsifiable Claim

If:

Then:

If:

Then:

This is falsifiable.


7. What This Experiment Is Not

It is a controlled iterative stability test under evolving requirements.


8. Reproducibility

Inputs:

Outputs:

All runs can be executed across multiple agents.


9. Economic Interpretation Layer

If spec-first significantly reduces iterative drift:

Then:

This reframes the economic bet:

AI is not the differentiator.

Authority structure is.


10. Versioning Policy

This document is authoritative.

Future refinements will increment:

v3.1, v3.2, etc.

Thesis and Methodology remain unified.

No parallel drift.

Read next

Related

Verification & replication