This story is one of 1,000 stories generated for the emotion "mortified". During extraction, it was fed through Gemma4-31B, and its hidden-state activations were captured at 11 layers.
The mean activation across all 1,000 mortified stories, after denoising with neutral dialogue baselines, produces the mortified emotion vector -- a direction in the model's 5,376-dimensional representation space.
Tokens most promoted (positive scores) and most suppressed (negative scores) when the mortified vector is projected through the unembedding matrix:
token | score |
l | 0.406 |
尴尬 | 0.257 |
C | 0.252 |
worse | 0.245 |
逃 | 0.242 |
own | -0.340 |
了他 | -0.218 |
他对 | -0.203 |
ል | -0.194 |
楽し | -0.193 |
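A table like the one above could be produced roughly as follows. This is a sketch under stated assumptions: the unembedding matrix `W_U` is taken to have shape (vocab_size, d_model), any final normalization layer the model applies before unembedding is ignored, and the function name `top_tokens` and the four-token toy vocabulary are hypothetical.

```python
import numpy as np

def top_tokens(vec, W_U, vocab, k=5):
    """Score every vocabulary token by its dot product with `vec`.

    vec:   (d_model,) direction in representation space
    W_U:   (vocab_size, d_model) unembedding matrix (assumed layout)
    vocab: list of token strings, one per row of W_U
    """
    scores = W_U @ vec              # one score per token
    order = np.argsort(scores)      # ascending order
    promoted = [(vocab[i], float(scores[i])) for i in order[::-1][:k]]
    suppressed = [(vocab[i], float(scores[i])) for i in order[:k]]
    return promoted, suppressed

# Toy example with a 2-dimensional space and a 4-token vocabulary.
vocab = ["a", "b", "c", "d"]
W_U = np.array([[1.0, 0.0],
                [0.0, 1.0],
                [-1.0, 0.0],
                [0.5, 0.5]])
vec = np.array([1.0, 0.0])

promoted, suppressed = top_tokens(vec, W_U, vocab, k=2)
# promoted   -> [("a", 1.0), ("d", 0.5)]
# suppressed -> [("c", -1.0), ("b", 0.0)]
```

Sorting once and reading from both ends gives the promoted and suppressed lists in a single pass over the scores.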