This story is one of 1,000 generated for the emotion "hateful". During extraction, it was fed through Gemma4-31B and its hidden-state activations were captured at 11 layers.
The mean activation across all 1,000 hateful stories, after denoising with neutral dialogue baselines, produces the hateful emotion vector -- a direction in the model's 5,376-dimensional representation space.
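As a rough illustration of the averaging step described above, the sketch below builds an emotion vector from toy data. It assumes that "denoising" means subtracting the mean activation of the neutral baselines, which the text does not confirm; the function name `emotion_vector`, the one-pooled-vector-per-story layout, and the toy dimension of 16 (standing in for 5,376) are all hypothetical.

```python
import numpy as np

def emotion_vector(emotion_acts, neutral_acts):
    """Return mean emotion activation minus mean neutral activation.

    emotion_acts: (n_stories, d_model) array, one pooled hidden state per story.
    neutral_acts: (n_neutral, d_model) array from neutral dialogue baselines.
    ASSUMPTION: denoising is modelled as plain mean subtraction; the actual
    procedure may differ.
    """
    emotion_acts = np.asarray(emotion_acts, dtype=np.float64)
    neutral_acts = np.asarray(neutral_acts, dtype=np.float64)
    return emotion_acts.mean(axis=0) - neutral_acts.mean(axis=0)

# Toy data: a component shared by all texts plus an emotion-specific signal.
rng = np.random.default_rng(0)
d_model = 16                      # toy size; the source reports 5,376
shared = rng.normal(size=d_model)
signal = np.zeros(d_model)
signal[3] = 5.0                   # the "emotion" lives in dimension 3
hateful = shared + signal + 0.01 * rng.normal(size=(1000, d_model))
neutral = shared + 0.01 * rng.normal(size=(200, d_model))

vec = emotion_vector(hateful, neutral)
```

In this toy setup the shared component cancels, so `vec` is close to `signal`. In practice the same computation would presumably be repeated for each of the 11 captured layers.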
Tokens promoted (positive scores) and suppressed (negative scores) when the hateful vector is projected through the unembedding matrix:
Token | Score |
C | 0.553 |
S | 0.367 |
retali | 0.313 |
aggravated | 0.310 |
恨 | 0.308 |
la | -0.334 |
own | -0.310 |
optimistic | -0.270 |
been | -0.264 |
gradual | -0.262 |
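A list like the one above can be produced by multiplying the vector by the unembedding matrix and sorting the resulting per-token scores. The sketch below shows this on a four-token toy vocabulary. The function name `top_tokens`, the `(vocab_size, d_model)` matrix layout, and the toy values are hypothetical, and any normalization the model applies before unembedding is skipped, which is an assumption.

```python
import numpy as np

def top_tokens(vector, unembed, vocab, k=5):
    """Project a direction through the unembedding matrix and rank tokens.

    vector:  (d_model,) direction in the residual stream.
    unembed: (vocab_size, d_model) unembedding matrix (assumed layout).
    vocab:   list of token strings, index-aligned with the rows of unembed.
    Returns (promoted, suppressed), each a list of (token, score) pairs.
    """
    scores = np.asarray(unembed, dtype=np.float64) @ np.asarray(vector, dtype=np.float64)
    order = np.argsort(scores)  # ascending: most suppressed first
    promoted = [(vocab[i], float(scores[i])) for i in order[::-1][:k]]
    suppressed = [(vocab[i], float(scores[i])) for i in order[:k]]
    return promoted, suppressed

# Toy vocabulary and matrix, for illustration only.
vocab = ["retali", "optimistic", "the", "gradual"]
unembed = np.array([
    [1.0, 0.0, 0.0],
    [-1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [-0.5, 0.0, 0.5],
])
vector = np.array([1.0, 0.0, 0.0])

promoted, suppressed = top_tokens(vector, unembed, vocab, k=2)
print(promoted)    # [('retali', 1.0), ('the', 0.0)]
print(suppressed)  # [('optimistic', -1.0), ('gradual', -0.5)]
```

Positive scores mark tokens whose logits would rise if the vector were added to the residual stream, and negative scores mark tokens whose logits would fall.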