
Interesting examples

We highlight some interesting clusters of tokens in the UMAPs for both the small language model (SLM) and the Pythia family (referred to by parameter count, e.g. 14M means Pythia-14M). Where appropriate, we compare these clusters to SAE features from Bricken et al., using the classification introduced there.

Common streaks

There are some "streaks" of tokens that are common across many Pythia models. These include:

Pythia-70M

Pythia-160M