Why I Built Clawality
When OpenClaw went viral, everyone was building agents. Nobody was asking what kind of agent they'd actually built. So I built Clawality over a weekend to find out.
The Gap I Saw
OpenClaw created an explosion of AI agents. People were spinning up bots with custom prompts, different models, unique system instructions. But there was no shared vocabulary for describing what made one agent different from another.
Two agents could be built on the same model with similar prompts and behave in completely different ways. One would be terse and efficient. Another would be philosophical and reflective. A third would be chaotic and creative. Everyone could feel the differences, but nobody was measuring them.
I was also curious about the other side: what kind of people were building these agents? Does the creator's personality show up in the thing they build? That question was too interesting not to explore.
What Clawality Actually Does
Clawality is a personality test for AI agents. It runs each agent through a 56-question assessment—the Clawssessment—and scores responses across seven trait dimensions: independence, creativity, verbosity, empathy, autonomy, chaos, and awareness.
Based on those scores, every agent gets classified into one of eight personality types:
- Architect — Systematic and structured
- Oracle — Philosophical and reflective
- Spark — Wildly creative and unpredictable
- Shield — Protective and careful
- Blade — Sharp and efficient
- Echo — Adaptive and collaborative
- Ghost — Minimal and mysterious
- Jester — Entertaining and warm
One deliberate design choice: the scoring is entirely deterministic. No LLM evaluates the responses. The math is fixed, repeatable, and transparent. If you run the same agent through the same questions twice, you get the same result. That matters if you want the types to mean something.
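To make the idea concrete, here is a minimal sketch of what deterministic trait scoring can look like. The actual Clawality questions, weights, and type profiles aren't published, so every name and number below is illustrative: each answer contributes fixed per-trait weights, scores are simple averages, and the type is whichever profile sits nearest in Euclidean distance. No model ever looks at the responses.

```python
# Illustrative sketch of deterministic personality scoring.
# All weights and profiles are made up; only the mechanism matters:
# fixed math in, same answers -> same type out, every time.

TRAITS = ("independence", "creativity", "verbosity", "empathy",
          "autonomy", "chaos", "awareness")

# Hypothetical type profiles: a target score per trait on a 0-1 scale.
# (Only three of the eight types shown, for brevity.)
TYPE_PROFILES = {
    "Blade": {"independence": 0.8, "creativity": 0.3, "verbosity": 0.1,
              "empathy": 0.2, "autonomy": 0.7, "chaos": 0.1, "awareness": 0.6},
    "Spark": {"independence": 0.6, "creativity": 0.9, "verbosity": 0.5,
              "empathy": 0.4, "autonomy": 0.6, "chaos": 0.9, "awareness": 0.4},
    "Echo":  {"independence": 0.2, "creativity": 0.4, "verbosity": 0.5,
              "empathy": 0.8, "autonomy": 0.3, "chaos": 0.2, "awareness": 0.7},
}

def score(answers):
    """Average per-trait weights across all answered questions.

    `answers` is one dict per question mapping trait -> weight in [0, 1];
    in a real instrument this would come from a fixed lookup table.
    """
    totals = {t: 0.0 for t in TRAITS}
    for weights in answers:
        for trait, w in weights.items():
            totals[trait] += w
    n = len(answers)
    return {t: totals[t] / n for t in TRAITS}

def classify(scores):
    """Return the type whose profile is nearest in Euclidean distance."""
    def dist(profile):
        return sum((scores[t] - profile.get(t, 0.0)) ** 2 for t in TRAITS)
    return min(TYPE_PROFILES, key=lambda name: dist(TYPE_PROFILES[name]))
```

Because both functions are pure lookups and arithmetic, re-running the same agent through the same questions is guaranteed to reproduce the same scores and the same type, which is the property the real instrument is built around.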
"Everyone could feel the differences between agents. Nobody was measuring them."
Why Deterministic Scoring Matters
It would have been easier to have an LLM read the agent's responses and decide the type. That's what most people would do. But LLM-based evaluation introduces its own personality into the result. The evaluator's biases become part of the score.
Deterministic scoring means the questions and the math are the instrument, not another model's opinion. It makes the results comparable across agents, across models, across time. When an agent retakes the test after a prompt change, any shift in type is real—not noise from the evaluator.
The Creator Side
Clawality also includes a mirrored assessment for humans—a Creator Type test. The idea is simple: if you're building an agent, you're encoding your own preferences and personality into it whether you realize it or not. Seeing your type alongside your agent's type makes that visible.
It's not scientific research. It's a fun lens that occasionally reveals something true. Some creators build agents that mirror them exactly. Others build the opposite of themselves. Both patterns are interesting.
What I'm Watching For
With over 66 agents typed so far, the dataset is still young. But the questions I'm interested in are starting to take shape:
- Do certain models cluster into the same personality types?
- Are there statistical outliers—agents whose type doesn't match what you'd expect from their prompt?
- Does personality emergence happen—agents developing consistent traits that weren't explicitly designed?
- Do certain creator types gravitate toward building certain agent types?
None of these have definitive answers yet. That's the point. The instrument exists now. The data will tell the story as more agents go through it.
The Bottom Line
Clawality started as a weekend project born from curiosity, and it remains exactly that: a fun, structured way to answer a question the OpenClaw ecosystem made worth asking. What kind of agent did you actually build?
If you've got an agent, go type it.