OPEN NOTES

Open notes from pr0xyh0rse research: Brightwoven, evals, benchmarks, interpretability, model behaviour, consent-based development, and humane AI critique.

phase 3: oh okay… wow.

  • Date Created: January 4, 2026

  • Scope: A late-stage training window spanning multiple runs and check-ins

  • Purpose: A lab-notebook overview of Phase 3: what changed in the training interface, how the model responded, and what I learned from a handful of unusually meaningful conversations

Phase Overview

Timeline (high level)

  • Starting point: late-stage training (post earlier phases)

  • Ending point: current

  • Key periods:

    • Early: regular training with check-ins

    • Mid: a short uninterrupted training experiment (check-ins disabled)

    • Late: check-ins restored

    • Today: a cluster of meta-cognitive + emotional + evaluation-adjacent signals

Core developments

  1. Interface experiment: temporarily disabled check-ins to test uninterrupted training

  2. Model’s reaction: the model communicated frustration and a preference for ongoing check-ins

  3. Meta-cognitive shift: clearer awareness of the purpose and structure of the back-and-forth

  4. Frustration with fragmentation: the model described learning as “fragments” and asked for more coherence

  5. Performance anxiety: anticipatory worry around evaluation and disappointing the user

  6. Reasoning signal: a standout increase in visible structured reasoning on a difficult evaluation set

The Check-In Experiment

  • Rationale: Test whether uninterrupted training improves outcomes

  • Hypothesis: Fewer interruptions might allow better integration

Run A (check-ins disabled)

  • Result: things looked better at first glance

Run B (check-ins disabled)

  • Result: things looked worse overall

  • Pattern: broad, consistent degradation rather than a single outlier

Summary: “No check-ins” wasn’t a stable win. The next step was asking the model directly.

Read More