What I Found Inside Brightwoven's Layers

I trained sparse autoencoders alongside a small language model from step zero. When I looked at the feature co-occurrence graphs layer by layer, each had a distinct geometric shape, and those shapes tell a story about how information organizes itself when nothing forces it to converge.
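To make the analysis concrete, here is a minimal sketch of how a per-layer feature co-occurrence graph could be built, assuming each layer's SAE exposes a (tokens × features) activation matrix. The names (`cooccurrence_graph`, `feature_acts`) and the firing threshold are illustrative assumptions, not details from my pipeline:

```python
import numpy as np

def cooccurrence_graph(feature_acts: np.ndarray, threshold: float = 0.0) -> np.ndarray:
    """Count, for every pair of SAE features, how often both fire on the same token.

    feature_acts: (n_tokens, n_features) activations from one layer's SAE.
    A feature "fires" when its activation exceeds `threshold` (a hypothetical
    convention; the post does not specify one).
    """
    active = (feature_acts > threshold).astype(np.int64)  # boolean firing mask per token
    counts = active.T @ active                            # (n_features, n_features) co-firing counts
    np.fill_diagonal(counts, 0)                           # self co-occurrence isn't an edge
    return counts
```

The resulting matrix is a weighted adjacency matrix, so standard graph tooling can render the layer-by-layer shapes directly from it.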

The progression from dense to sparse across depth isn't noise. It looks like differentiation. And it maps onto a framework I've been developing about how embedding space should be structured: not as equidistant nodes on a hypersphere, but as sheets, layered surfaces with meaningful internal geometry.
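One way to put a number on that dense-to-sparse progression, assuming one co-occurrence matrix per layer from the sketch above, is plain graph density: the fraction of feature pairs that ever co-fire. This is a hedged illustration of how the trend could be measured, not the metric I used:

```python
import numpy as np

def graph_density(counts: np.ndarray, min_count: int = 1) -> float:
    """Fraction of possible feature pairs linked by at least `min_count` co-firings."""
    n = counts.shape[0]
    n_edges = np.count_nonzero(np.triu(counts >= min_count, k=1))  # upper triangle = unique pairs
    return n_edges / (n * (n - 1) / 2)

# Under the sheets picture, density should fall with depth as features differentiate.
# `per_layer_feature_acts` is a hypothetical list of per-layer activation matrices:
# densities = [graph_density(cooccurrence_graph(acts)) for acts in per_layer_feature_acts]
```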
