Coming soon: an evaluation suite for testing how well AI models handle nonhuman perception

Stay tuned.

Next

Coming soon: an eval for story-state continuity