Glitchling Monster Manual

This file is generated by docs/build_monster_manual.py so it always reflects the current glitchling defaults. Regenerate it after changing parameters or adding new glitchlings.

🐺 Wherewolf

| Attribute | Value |
| --- | --- |
| Class | Homophone Specialist |
| Scope | Word |
| Initiative | Early |
| Signature | Phonetic word substitution |

Parameters

| Parameter | Type | Default |
| --- | --- | --- |
| `rate` | float \| None | 0.02 |
| `seed` | int \| None | None |

"Homophonic idiolectician. There leased favourite flavour? Orange."

Wherewolf is a cunning glitchling that swaps words for their curated homophones—words that sound identical but have different spellings and meanings. This creates text that reads correctly when spoken aloud but is subtly wrong on the page.

Perfect for:

Testing spell-checker robustness
Creating "sounds right" adversarial examples
Simulating common ESL mistakes

Input:

The knight rode through the night to the castle.

Output:

The night road through the knight to the castle.
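The substitution shown above can be sketched in a few lines of Python. This is only an illustration of rate-based homophone swapping, not Wherewolf's actual implementation: the `HOMOPHONES` table and the `swap_homophones` function are hypothetical names, and the real curated list is not reproduced here.

```python
import random
import re
from typing import Optional

# Hypothetical mini-table; the glitchling's curated homophone list is not shown here.
HOMOPHONES = {
    "knight": ["night"],
    "night": ["knight"],
    "rode": ["road", "rowed"],
    "their": ["there"],
}


def swap_homophones(text: str, rate: float = 0.02, seed: Optional[int] = None) -> str:
    """Replace each known word with one of its homophones with probability `rate`."""
    rng = random.Random(seed)  # a fixed seed makes the corruption reproducible

    def replace(match):
        word = match.group(0)
        options = HOMOPHONES.get(word.lower())
        if not options or rng.random() >= rate:
            return word
        choice = rng.choice(options)
        # Preserve a leading capital so sentence starts still look right.
        return choice.capitalize() if word[0].isupper() else choice

    return re.sub(r"[A-Za-z]+", replace, text)


# rate=1.0 swaps every word that has an entry in the table.
print(swap_homophones("The knight rode through the night.", rate=1.0, seed=0))
```

At the default `rate` of 0.02 most short inputs come back unchanged, so a higher rate is useful when eyeballing results.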

🦒 Hokey

| Attribute | Value |
| --- | --- |
| Class | Character Stretcher |
| Scope | Character |
| Initiative | First |
| Signature | Vowel/consonant elongation |

Parameters

| Parameter | Type | Default |
| --- | --- | --- |
| `rate` | float | 0.3 |
| `extension_min` | int | 2 |
| `extension_max` | int | 5 |
| `word_length_threshold` | int | 6 |
| `base_p` | float | 0.45 |
| `seed` | int \| None | None |

"Sooooo excited to meet you! We reeeeeally missed you last week."

Hokey is an enthusiastic glitchling that stretches words using sophisticated linguistic heuristics. It identifies stretchable phonemes within words and elongates them to create expressive, informal text patterns.

The stretching algorithm considers:

Word length thresholds
Phoneme position within words
Linguistic rules for natural-sounding extensions

Perfect for:

Simulating excited/casual online speech
Testing tokenizer handling of repeated characters
Creating expressive text variations

Input:

I found myself transformed in his bed.

Output:

I foooooouuuuuund myself transformed in his bed.
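A rough sketch of this kind of stretching follows. It is a simplification under stated assumptions, not Hokey's algorithm: it assumes that only words no longer than `word_length_threshold` are eligible, that the last vowel is the stretch site, and that `extension_min`/`extension_max` bound the number of extra copies. The `base_p` parameter and any phoneme analysis are not modeled, and `stretch_words` is a hypothetical name.

```python
import random
import re
from typing import Optional

VOWELS = "aeiou"


def stretch_words(
    text: str,
    rate: float = 0.3,
    extension_min: int = 2,
    extension_max: int = 5,
    word_length_threshold: int = 6,
    seed: Optional[int] = None,
) -> str:
    """Elongate the last vowel of short words with probability `rate`."""
    rng = random.Random(seed)

    def stretch(match):
        word = match.group(0)
        # Assumption: long words are left alone.
        if len(word) > word_length_threshold or rng.random() >= rate:
            return word
        positions = [i for i, ch in enumerate(word) if ch.lower() in VOWELS]
        if not positions:
            return word
        i = positions[-1]
        extra = rng.randint(extension_min, extension_max)  # inclusive bounds
        return word[:i] + word[i] * (1 + extra) + word[i + 1:]

    return re.sub(r"[A-Za-z]+", stretch, text)


print(stretch_words("We really missed you last week.", rate=1.0, seed=7))
```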

📇 Jargoyle

| Attribute | Value |
| --- | --- |
| Class | Lexical Shifter |
| Scope | Word |
| Initiative | Normal |
| Signature | Dictionary-based word swaps |

Parameters

| Parameter | Type | Default |
| --- | --- | --- |
| `lexemes` | str | "synonyms" |
| `mode` | JargoyleMode | "drift" |
| `rate` | float \| None | 0.01 |
| `seed` | int \| None | None |

"Oh no... The worst person you know just bought a thesaurus..."

Jargoyle is a lexical shapeshifter that swaps words using bundled dictionaries. It supports multiple lexeme collections:

colors: Swap color terms (e.g., "red" → "blue")
synonyms: General synonym substitution
corporate: Business jargon alternatives
academic: Scholarly word substitutions
cyberpunk: Neon cyberpunk slang
lovecraftian: Cosmic horror terminology
custom: Drop any *.json into assets/lexemes/

Two modes are available:

literal: Use the first canonical entry
drift: Randomly select from alternatives

Input:

The quick fox jumps fast over the lazy dog.

Output (synonyms, drift mode):

The swift fox leaps rapid over the lazy dog.
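The difference between the two modes can be illustrated with a small sketch. The `SYNONYMS` dictionary and `swap_words` function are hypothetical stand-ins for a bundled lexeme file and the glitchling itself, and the on-disk JSON layout is not assumed here.

```python
import random
from typing import Dict, List, Optional

# Hypothetical lexeme bundle: each word maps to an ordered list of alternatives.
SYNONYMS: Dict[str, List[str]] = {
    "quick": ["swift", "rapid", "speedy"],
    "jumps": ["leaps", "hops"],
    "fast": ["rapid", "quickly"],
}


def swap_words(
    text: str,
    lexemes: Dict[str, List[str]],
    mode: str = "drift",
    rate: float = 0.01,
    seed: Optional[int] = None,
) -> str:
    """Swap known words: `literal` takes the first entry, `drift` picks at random."""
    rng = random.Random(seed)
    out = []
    for token in text.split(" "):
        core = token.strip(".,!?;:")  # keep surrounding punctuation in place
        options = lexemes.get(core.lower())
        if options and rng.random() < rate:
            new = options[0] if mode == "literal" else rng.choice(options)
            token = token.replace(core, new, 1)
        out.append(token)
    return " ".join(out)


sentence = "The quick fox jumps fast over the lazy dog."
print(swap_words(sentence, SYNONYMS, mode="literal", rate=1.0))
print(swap_words(sentence, SYNONYMS, mode="drift", rate=1.0, seed=42))
```

With `literal`, repeated runs always give the same replacement for a word; with `drift`, the result depends on the seed.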

🎭 Mim1c

| Attribute | Value |
| --- | --- |
| Class | Visual Deceiver |
| Scope | Character |
| Initiative | Last |
| Signature | Homoglyph substitution |

Parameters

| Parameter | Type | Default |
| --- | --- | --- |
| `rate` | float \| None | 0.02 |
| `classes` | list[str] \| Literal['all'] \| None | None |
| `banned_characters` | Collection[str] \| None | None |
| `mode` | HomoglyphMode \| None | "mixed_script" |
| `max_consecutive` | int \| None | 3 |
| `seed` | int \| None | None |

"Breaks your parser by replacing some characters in strings with doppelgangers. Don't worry, this text is clean. ;)"

Mim1c is a master of visual deception, swapping characters for homoglyphs—characters from different Unicode blocks that look nearly identical. The text appears unchanged to human readers but wreaks havoc on string comparisons and parsers.

Substitution Modes:

single_script (safest): Only same-script confusables (Latin→Latin variants)
mixed_script (default): Allow cross-script substitutions (Latin↔Cyrillic↔Greek)
compatibility: Include fullwidth, math alphanumerics, enclosed forms
aggressive: All confusable types combined

Locality Control:

The `max_consecutive` parameter caps the length of any run of consecutive substitutions, which prevents the "ransom note" effect where every character comes from a different script. The default is 3; set it to 0 for unlimited.
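One way to picture the run limit is a counter that resets whenever a character is left untouched. The sketch below is an assumption about how such a cap could work, not Mim1c's code; `CONFUSABLES` is a tiny hypothetical map of Latin letters to Cyrillic lookalikes and `substitute` is an illustrative name.

```python
import random
from typing import Optional

# Hypothetical map: Latin letter -> Cyrillic lookalike.
CONFUSABLES = {
    "a": "\u0430", "e": "\u0435", "o": "\u043e",
    "p": "\u0440", "c": "\u0441", "x": "\u0445",
}


def substitute(
    text: str,
    rate: float = 0.02,
    max_consecutive: int = 3,
    seed: Optional[int] = None,
) -> str:
    """Swap characters for homoglyphs, never more than `max_consecutive` in a row."""
    rng = random.Random(seed)
    out = []
    run = 0  # length of the current run of substituted characters
    for ch in text:
        under_cap = max_consecutive == 0 or run < max_consecutive  # 0 means unlimited
        if ch in CONFUSABLES and under_cap and rng.random() < rate:
            out.append(CONFUSABLES[ch])
            run += 1
        else:
            out.append(ch)
            run = 0
    return "".join(out)


# With rate=1.0 every fourth eligible character is forced to stay Latin.
result = substitute("aaaaaaaa", rate=1.0, max_consecutive=3)
print([hex(ord(ch)) for ch in result])
```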

Script Affinity:

In mixed_script mode, substitutions are weighted by visual plausibility:

Latin↔Cyrillic: 0.9 weight
Latin↔Greek: 0.8 weight
Cyrillic↔Greek: 0.7 weight
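The weights listed above could drive a weighted random choice among candidate homoglyphs. The following is a sketch only: the same-script weight of 1.0, the zero weight for unlisted script pairs, and the names `AFFINITY` and `pick_homoglyph` are assumptions made for illustration.

```python
import random

# Cross-script weights quoted above, keyed by unordered script pair.
AFFINITY = {
    frozenset(("Latin", "Cyrillic")): 0.9,
    frozenset(("Latin", "Greek")): 0.8,
    frozenset(("Cyrillic", "Greek")): 0.7,
}


def pick_homoglyph(source_script, candidates, rng):
    """Pick one character from (char, script) candidates, weighted by affinity."""
    weights = [
        1.0 if script == source_script  # assumption: same script is fully plausible
        else AFFINITY.get(frozenset((source_script, script)), 0.0)
        for _, script in candidates
    ]
    if not any(weights):
        return None  # no plausible substitute among the candidates
    char, _ = rng.choices(candidates, weights=weights, k=1)[0]
    return char


rng = random.Random(0)
# Candidates for Latin "o": Cyrillic U+043E and Greek U+03BF.
candidates = [("\u043e", "Cyrillic"), ("\u03bf", "Greek")]
picks = [pick_homoglyph("Latin", candidates, rng) for _ in range(1000)]
print("Cyrillic:", picks.count("\u043e"), "Greek:", picks.count("\u03bf"))
```

Over many draws the Cyrillic candidate should be chosen somewhat more often than the Greek one, in roughly a 0.9 to 0.8 ratio.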

Homoglyph classes include:

Cyrillic lookalikes (а, е, о, р, с, х)
Greek lookalikes (Α, Β, Ε, Η, Ι, Κ)
Mathematical symbols (ℯ, ℴ, 𝐀, 𝔸)
Fullwidth characters (Ａ-Ｚ, ａ-ｚ)
Latin Extended forms

Perfect for:

Testing Unicode normalization
Creating visually-identical adversarial text
Breaking naive string matching
Testing script detection and spoofing defenses

Input:

Hello World

Output (mixed_script mode):

Ηello Wοrld

(H→Η Greek, o→ο Greek)

Output (single_script mode):

Heɭɭo Worɭd

(l→ɭ Latin Extended)

Output (compatibility mode):

Ｈello Ｗorld

(H→Ｈ Fullwidth, W→Ｗ Fullwidth)
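When using these outputs to test Unicode normalization, it helps to know which substitutions normalization can undo. The short check below uses only the standard library's `unicodedata` module and hard-codes the example outputs shown above as escape sequences. NFKC folds the fullwidth letters back to ASCII, but it leaves the Greek lookalikes in place, so cross-script spoofing needs a separate confusables check.

```python
import unicodedata

clean = "Hello World"
compatibility = "\uff28ello \uff37orld"  # fullwidth H (U+FF28) and W (U+FF37)
mixed_script = "\u0397ello W\u03bfrld"   # Greek capital eta, Greek small omicron

# Compatibility characters decompose to their ASCII counterparts under NFKC.
print(unicodedata.normalize("NFKC", compatibility) == clean)

# Greek letters are distinct characters, so normalization does not touch them.
print(unicodedata.normalize("NFKC", mixed_script) == clean)
```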

📝 Pedant

| Attribute | Value |
| --- | --- |
| Class | Linguistic Hypercorrector |
| Scope | Word |
| Initiative | Late |
| Signature | Hypercorrection evolution forms |

Parameters

| Parameter | Type | Default |
| --- | --- | --- |
| `stone` | PedantStone \| str | Hypercorrectite |
| `seed` | int \| None | None |

"Learned that 'me' is wrong and now overcorrects everywhere."

Pedant is a scholarly glitchling that applies "evolution stones" to transform text using linguistically-grounded hypercorrection patterns. Each stone unlocks different transformations:

Hypercorrectite → Andi: Overcorrects "me" to "I" in coordinate structures ("between you and me" → "between you and I")
Unsplittium → Infinitoad: "Corrects" split infinitives ("to boldly go" → "boldly to go")
Coeurite → Aetheria: Restores archaic ligatures (æ) and diaereses (ö)
Curlite → Apostrofae: Polishes straight quotes into typographic pairs
Oxfordium → Commama: Enforces serial commas in simple lists

The hypercorrection patterns are grounded in sociolinguistics research (Collins 2022, Labov 1966, Angermeyer & Singler 2003).

Perfect for:

Testing robustness to common hypercorrection errors
Creating "educated but wrong" text variations
Exercising Unicode edge cases with ligatures

Input:

Give it to her and me, to boldly go.

Output (Hypercorrectite):

Give it to her and I, to boldly go.

Output (Unsplittium):

Give it to her and me, boldly to go.
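The two transformations in this example can be approximated with naive regular expressions. This is a deliberately crude sketch, not Pedant's rule set: the function names are hypothetical, and the patterns will misfire on inputs such as "to fly home" because they do no real parsing.

```python
import re


def hypercorrect_me(text: str) -> str:
    """Turn 'and me' into 'and I' (naive stand-in for the Hypercorrectite stone)."""
    return re.sub(r"\band\s+me\b", "and I", text)


def unsplit_infinitive(text: str) -> str:
    """Move an -ly adverb in front of 'to' (naive stand-in for Unsplittium)."""
    return re.sub(r"\bto\s+(\w+ly)\s+(\w+)", r"\1 to \2", text)


sentence = "Give it to her and me, to boldly go."
print(hypercorrect_me(sentence))
print(unsplit_infinitive(sentence))
```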

🕵️ Redactyl

| Attribute | Value |
| --- | --- |
| Class | Information Censor |
| Scope | Word |
| Initiative | Normal |
| Signature | Block character redaction |

Parameters

| Parameter | Type | Default |
| --- | --- | --- |
| `replacement_char` | str | "█" |
| `rate` | float \| None | 0.025 |
| `merge_adjacent` | bool | false |
| `seed` | int | 151 |
| `unweighted` | bool | false |

"Some things are better left ████████."

Redactyl is a censorious glitchling that redacts words with solid block characters. It simulates classified documents, FOIA releases, and censored communications.

Features:

Configurable replacement character
Optional adjacent-word merging
Weighted or unweighted word selection

Perfect for:

Creating redacted document simulations
Testing OCR on censored text
Training models on incomplete information

Input:

The secret meeting occurred at midnight.

Output:

The ██████ meeting occurred at ████████.
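A minimal sketch of length-preserving redaction is shown below. It assumes that `merge_adjacent` means filling the gap between two neighbouring redacted words so they form one continuous bar; that reading, the uniform word selection (the weighted variant is not modeled), and the name `redact` are all assumptions.

```python
import random
import re
from typing import Optional


def redact(
    text: str,
    rate: float = 0.025,
    replacement_char: str = "\u2588",  # full block character
    merge_adjacent: bool = False,
    seed: Optional[int] = 151,
) -> str:
    """Replace each word with same-length blocks with probability `rate`."""
    rng = random.Random(seed)

    def block(match):
        word = match.group(0)
        return replacement_char * len(word) if rng.random() < rate else word

    out = re.sub(r"\w+", block, text)
    if merge_adjacent:
        fill = re.escape(replacement_char)
        # Assumption: a space run between two redacted words becomes one block.
        out = re.sub(f"(?<={fill}) +(?={fill})", replacement_char, out)
    return out


print(redact("The secret meeting occurred at midnight.", rate=0.5, seed=151))
print(redact("The secret meeting occurred at midnight.", rate=1.0, merge_adjacent=True))
```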

💨 Rushmore

| Attribute | Value |
| --- | --- |
| Class | Chaos Agent |
| Scope | Word |
| Initiative | Normal |
| Signature | Multi-mode word attacks |

Parameters

| Parameter | Type | Default |
| --- | --- | --- |
| `modes` | RushmoreMode \| str \| Iterable | (delete,) |
| `rate` | float \| None | None |
| `delete_rate` | float \| None | None |
| `duplicate_rate` | float \| None | None |
| `swap_rate` | float \| None | None |
| `seed` | int \| None | None |
| `unweighted` | bool | false |

"You shouldn't have waited for the last minute to write that paper, anon. Sure hope everything is in the right place."

Rushmore is a versatile glitchling that bundles three distinct attack modes:

delete: Randomly removes words from text
duplicate: Repeats words in place
swap: Exchanges adjacent word positions

Modes can be combined for compound chaos. Each mode has its own rate control (`delete_rate`, `duplicate_rate`, `swap_rate`) for fine-grained corruption.

Perfect for:

Simulating hasty writing errors
Testing grammar correction models
Creating incomplete/jumbled text

Input:

He found himself transformed in his bed.

Output (delete mode):

He found transformed his bed.
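The three modes can be sketched as successive passes over a word list. This illustration assumes that a per-mode rate, when given, overrides the shared `rate`, and that modes run in the fixed order delete, duplicate, swap. Neither point is confirmed by this page, the weighted selection is not modeled, and the function name `rushmore` and the 0.1 fallback rate are illustrative.

```python
import random
from typing import Iterable, Optional


def rushmore(
    text: str,
    modes: Iterable[str] = ("delete",),
    rate: float = 0.1,
    delete_rate: Optional[float] = None,
    duplicate_rate: Optional[float] = None,
    swap_rate: Optional[float] = None,
    seed: Optional[int] = None,
) -> str:
    """Apply delete, duplicate, and swap passes to whitespace-separated words."""
    rng = random.Random(seed)
    modes = set(modes)
    words = text.split()

    def pick(specific):
        # Assumption: a mode-specific rate wins over the shared rate.
        return rate if specific is None else specific

    if "delete" in modes:
        p = pick(delete_rate)
        words = [w for w in words if rng.random() >= p]
    if "duplicate" in modes:
        p = pick(duplicate_rate)
        words = [c for w in words for c in ((w, w) if rng.random() < p else (w,))]
    if "swap" in modes:
        p = pick(swap_rate)
        i = 0
        while i < len(words) - 1:
            if rng.random() < p:
                words[i], words[i + 1] = words[i + 1], words[i]
                i += 2  # skip past the pair so a word moves at most once
            else:
                i += 1
    return " ".join(words)


print(rushmore("He found himself transformed in his bed.", rate=0.3, seed=5))
print(rushmore("He found himself transformed in his bed.", modes=("swap",), swap_rate=1.0))
```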

🔍 Scannequin

| Attribute | Value |
| --- | --- |
| Class | OCR Simulator |
| Scope | Character |
| Initiative | Late |
| Signature | Scan artifact injection |

Parameters

| Parameter | Type | Default |
| --- | --- | --- |
| `rate` | float \| None | 0.02 |
| `seed` | int \| None | None |

"Isn't it weird how the word 'bed' looks like a bed?"

Scannequin simulates OCR (Optical Character Recognition) artifacts using common character confusions. It recreates the errors that occur when physical documents are scanned and digitized.

Common confusions:

rn ↔ m
cl ↔ d
l ↔ 1 ↔ I
O ↔ 0
vv ↔ w

Perfect for:

Testing OCR post-processing
Creating realistic scan artifacts
Training OCR error correction models

Input:

The brown dog jumped over the wall.

Output:

The brovvn clog jurnped over the waII.
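The confusion pairs listed above translate naturally into a lookup table applied with a single regular expression. The sketch below is illustrative rather than Scannequin's implementation; `CONFUSIONS` contains only the pairs named on this page, and `add_scan_artifacts` is a hypothetical name.

```python
import random
import re
from typing import Optional

# Each source sequence maps to the sequences it is commonly misread as.
CONFUSIONS = {
    "rn": ["m"], "m": ["rn"],
    "cl": ["d"], "d": ["cl"],
    "l": ["1", "I"], "1": ["l", "I"], "I": ["l", "1"],
    "O": ["0"], "0": ["O"],
    "vv": ["w"], "w": ["vv"],
}

# Try two-character sequences first so "rn" is not consumed as a lone "r" + "n".
PATTERN = re.compile(
    "|".join(sorted((re.escape(k) for k in CONFUSIONS), key=len, reverse=True))
)


def add_scan_artifacts(text: str, rate: float = 0.02, seed: Optional[int] = None) -> str:
    """Replace confusable sequences with a lookalike with probability `rate`."""
    rng = random.Random(seed)

    def confuse(match):
        source = match.group(0)
        return rng.choice(CONFUSIONS[source]) if rng.random() < rate else source

    return PATTERN.sub(confuse, text)


print(add_scan_artifacts("The brown dog jumped over the wall.", rate=0.5, seed=2))
```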

⌨️ Typogre

| Attribute | Value |
| --- | --- |
| Class | Keyboard Gremlin |
| Scope | Character |
| Initiative | Early |
| Signature | Adjacent-key typos |

Parameters

| Parameter | Type | Default |
| --- | --- | --- |
| `rate` | float \| None | 0.02 |
| `keyboard` | str | "CURATOR_QWERTY" |
| `seed` | int \| None | None |

"What a nice word, would be a shame if something happened to it..."

Typogre is a mischievous glitchling that introduces keyboard-adjacent typing errors. It uses curated keyboard layout maps to inject realistic typos—the kind humans make when their fingers slip to neighboring keys.

Supported keyboards:

QWERTY (US, UK variants)
QWERTZ (German)
AZERTY (French)
Dvorak
Custom layouts via CURATOR format

Perfect for:

Simulating human typing errors
Testing spell-check systems
Creating realistic noisy training data

Input:

The quick brown fox jumps.

Output:

Thr quicl brown gox jumps.
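Adjacent-key typos like the ones above can be generated from a neighbor map derived from the keyboard rows. The sketch below builds such a map for the letter keys of a US QWERTY layout, approximating the physical row stagger. It is not the CURATOR format or Typogre's layout data, and `ROWS`, `build_neighbors`, and `add_typos` are hypothetical names.

```python
import random
from typing import Dict, List, Optional

ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]


def build_neighbors(rows: List[str]) -> Dict[str, List[str]]:
    """Map each key to the keys physically next to it.

    Assumption: each row is shifted right relative to the one above, so a key
    at column c touches columns c and c+1 above and columns c-1 and c below.
    """
    neighbors: Dict[str, List[str]] = {}
    for r, row in enumerate(rows):
        for c, key in enumerate(row):
            near: List[str] = []
            for dr, cols in ((0, (c - 1, c + 1)), (-1, (c, c + 1)), (1, (c - 1, c))):
                rr = r + dr
                if 0 <= rr < len(rows):
                    near += [rows[rr][cc] for cc in cols if 0 <= cc < len(rows[rr])]
            neighbors[key] = near
    return neighbors


NEIGHBORS = build_neighbors(ROWS)


def add_typos(text: str, rate: float = 0.02, seed: Optional[int] = None) -> str:
    """Replace letters with a neighboring key with probability `rate`."""
    rng = random.Random(seed)
    out = []
    for ch in text:
        near = NEIGHBORS.get(ch.lower())
        if near and rng.random() < rate:
            typo = rng.choice(near)
            out.append(typo.upper() if ch.isupper() else typo)
        else:
            out.append(ch)
    return "".join(out)


print(sorted(NEIGHBORS["f"]))
print(add_typos("The quick brown fox jumps.", rate=0.15, seed=4))
```

The slips in the example output (e→r, k→l, f→g) are all present in this neighbor map.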

⛓️‍💥 Zeedub

| Attribute | Value |
| --- | --- |
| Class | Invisible Saboteur |
| Scope | Character |
| Initiative | Last |
| Signature | Zero-width character injection |

Parameters

| Parameter | Type | Default |
| --- | --- | --- |
| `rate` | float \| None | 0.02 |
| `seed` | int \| None | None |
| `characters` | Sequence[str] \| None | None |

"I'm invoking my right to remain silent."

Zeedub is a stealthy glitchling that plants zero-width invisible characters inside words. The text looks completely normal but contains hidden Unicode ghosts that break string matching, tokenization, and search.

Zero-width characters used:

U+200B Zero Width Space
U+200C Zero Width Non-Joiner
U+200D Zero Width Joiner
U+FEFF Zero Width No-Break Space

Perfect for:

Testing Unicode handling
Creating "invisible" text corruption
Breaking naive string comparisons
Evading keyword filters

Input:

Hello World

Output:

Hello World

(Contains invisible U+200B characters between letters.)
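Because the corrupted string renders identically, the effect is easiest to see in code. The following sketch injects the four characters listed above between adjacent alphanumeric characters and shows a matching cleanup step. The function names are hypothetical, and restricting insertion to positions inside words is an assumption based on the description.

```python
import random
from typing import Optional, Sequence

ZERO_WIDTH = ["\u200b", "\u200c", "\u200d", "\ufeff"]


def inject_zero_width(
    text: str,
    rate: float = 0.02,
    characters: Optional[Sequence[str]] = None,
    seed: Optional[int] = None,
) -> str:
    """Insert an invisible character between adjacent letters with probability `rate`."""
    rng = random.Random(seed)
    pool = list(characters) if characters else ZERO_WIDTH
    out = []
    for i, ch in enumerate(text):
        out.append(ch)
        following = text[i + 1] if i + 1 < len(text) else ""
        # Only inject inside a word, never next to spaces or at the ends.
        if ch.isalnum() and following.isalnum() and rng.random() < rate:
            out.append(rng.choice(pool))
    return "".join(out)


def strip_zero_width(text: str) -> str:
    """Remove the zero-width characters listed above."""
    return "".join(ch for ch in text if ch not in ZERO_WIDTH)


clean = "Hello World"
dirty = inject_zero_width(clean, rate=1.0, seed=0)
print(dirty == clean, len(clean), len(dirty))
print(strip_zero_width(dirty) == clean)
```

Comparing lengths, or stripping these code points before matching, is a simple defense against this kind of corruption.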