Glitchling Monster Manual

This file is generated by docs/build_monster_manual.py so it always reflects the current glitchling defaults. Regenerate it after changing parameters or adding new glitchlings.


🐺 Wherewolf

Attribute   Value
Class       Homophone Specialist
Scope       Word
Initiative  Early
Signature   Phonetic word substitution

Parameters

Parameter  Type          Default
rate       float | None  0.02
seed       int | None    None

"Homophonic idiolectician. There leased favourite flavour? Orange."

Wherewolf is a cunning glitchling that swaps words for their curated homophones—words that sound identical but have different spellings and meanings. This creates text that reads correctly when spoken aloud but is subtly wrong on the page.

Perfect for:

  • Testing spell-checker robustness
  • Creating "sounds right" adversarial examples
  • Simulating common ESL mistakes

Input:

The knight rode through the night to the castle.

Output:

The night road through the knight to the castle.
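The substitution above can be sketched in a few lines. This is a minimal illustration, not Wherewolf's implementation: the `HOMOPHONES` table and the `swap_homophones` helper are hypothetical, and only the `rate` and `seed` parameter names come from the table above.

```python
import random
import re
from typing import Optional

# Illustrative homophone table (an assumption; the curated list is not shown here).
HOMOPHONES = {
    "knight": ["night"],
    "night": ["knight"],
    "rode": ["road"],
    "their": ["there", "they're"],
}

def swap_homophones(text: str, rate: float = 0.02, seed: Optional[int] = None) -> str:
    """Replace each word found in HOMOPHONES with probability `rate`."""
    rng = random.Random(seed)

    def replace(match):
        word = match.group(0)
        options = HOMOPHONES.get(word.lower())
        if not options or rng.random() >= rate:
            return word
        choice = rng.choice(options)
        # Keep a leading capital so sentence starts still look right.
        return choice.capitalize() if word[0].isupper() else choice

    return re.sub(r"[A-Za-z']+", replace, text)

print(swap_homophones("The knight rode through the night to the castle.", rate=1.0, seed=0))
# The night road through the knight to the castle.
```

Setting `rate=1.0` swaps every word the table knows, which is why the sketch reproduces the sample output exactly; at the default rate most sentences pass through unchanged.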


🦒 Hokey

Attribute   Value
Class       Character Stretcher
Scope       Character
Initiative  First
Signature   Vowel/consonant elongation

Parameters

Parameter              Type        Default
rate                   float       0.3
extension_min          int         2
extension_max          int         5
word_length_threshold  int         6
base_p                 float       0.45
seed                   int | None  None

"Sooooo excited to meet you! We reeeeeally missed you last week."

Hokey is an enthusiastic glitchling that stretches words using linguistic heuristics. It identifies stretchable phonemes within words and elongates them to produce expressive, informal text.

The stretching algorithm considers:

  • Word length thresholds
  • Phoneme position within words
  • Linguistic rules for natural-sounding extensions

Perfect for:

  • Simulating excited/casual online speech
  • Testing tokenizer handling of repeated characters
  • Creating expressive text variations

Input:

I found myself transformed in his bed.

Output:

I foooooouuuuuund myself transformed in his bed.
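A much-simplified sketch of the stretching idea follows. It only repeats the first vowel of a selected word and ignores `word_length_threshold`, `base_p`, and the phoneme heuristics described above, so treat `stretch_words` as a hypothetical stand-in, not Hokey's algorithm.

```python
import random
import re
from typing import Optional

VOWELS = "aeiouAEIOU"

def stretch_words(text: str, rate: float = 0.3, extension_min: int = 2,
                  extension_max: int = 5, seed: Optional[int] = None) -> str:
    """Repeat the first vowel of randomly chosen words (simplified heuristic)."""
    rng = random.Random(seed)

    def stretch(match):
        word = match.group(0)
        if rng.random() >= rate:
            return word
        positions = [i for i, ch in enumerate(word) if ch in VOWELS]
        if not positions:
            return word  # nothing stretchable in this word
        i = positions[0]
        extra = rng.randint(extension_min, extension_max)
        # Insert `extra` additional copies right after the chosen vowel.
        return word[: i + 1] + word[i] * extra + word[i + 1:]

    return re.sub(r"[A-Za-z]+", stretch, text)

print(stretch_words("so excited", rate=1.0, extension_min=3, extension_max=3, seed=1))
# soooo eeeexcited
```

The second word shows why a real implementation needs more than "first vowel": a natural stretch would usually land on a stressed or final vowel instead.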


📇 Jargoyle

Attribute   Value
Class       Lexical Shifter
Scope       Word
Initiative  Normal
Signature   Dictionary-based word swaps

Parameters

Parameter  Type          Default
lexemes    str           "synonyms"
mode       JargoyleMode  "drift"
rate       float | None  0.01
seed       int | None    None

"Oh no... The worst person you know just bought a thesaurus..."

Jargoyle is a lexical shapeshifter that swaps words using bundled dictionaries. It supports multiple lexeme collections:

  • colors: Swap color terms (e.g., "red" → "blue")
  • synonyms: General synonym substitution
  • corporate: Business jargon alternatives
  • academic: Scholarly word substitutions
  • cyberpunk: Neon cyberpunk slang
  • lovecraftian: Cosmic horror terminology
  • custom: Drop any *.json into assets/lexemes/

Two modes are available:

  • literal: Use the first canonical entry
  • drift: Randomly select from alternatives

Input:

The quick fox jumps fast over the lazy dog.

Output (synonyms, drift mode):

The swift fox leaps rapid over the lazy dog.
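The difference between the two modes can be sketched with a plain dictionary. The `LEXEMES` mapping and `swap_lexemes` helper below are hypothetical; the on-disk format of the bundled dictionaries is not shown here, so an in-memory dict stands in for it.

```python
import random
import re
from typing import Dict, List, Optional

# Illustrative synonym entries (an assumption, not the bundled dictionary).
LEXEMES: Dict[str, List[str]] = {
    "quick": ["swift", "rapid", "speedy"],
    "jumps": ["leaps", "hops"],
}

def swap_lexemes(text: str, lexemes: Dict[str, List[str]], mode: str = "drift",
                 rate: float = 0.01, seed: Optional[int] = None) -> str:
    """Swap known words; 'literal' takes the first entry, 'drift' picks at random."""
    rng = random.Random(seed)

    def replace(match):
        word = match.group(0)
        options = lexemes.get(word.lower())
        if not options or rng.random() >= rate:
            return word
        return options[0] if mode == "literal" else rng.choice(options)

    return re.sub(r"[A-Za-z]+", replace, text)

print(swap_lexemes("The quick fox jumps.", LEXEMES, mode="literal", rate=1.0))
# The swift fox leaps.
```

In this sketch literal mode is fully deterministic once a word is selected, while drift mode depends on the seed for both the selection and the replacement.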


🎭 Mim1c

Attribute   Value
Class       Visual Deceiver
Scope       Character
Initiative  Last
Signature   Homoglyph substitution

Parameters

Parameter          Type                               Default
rate               float | None                       0.02
classes            list[str] | Literal['all'] | None  None
banned_characters  Collection[str] | None             None
mode               HomoglyphMode | None               "mixed_script"
max_consecutive    int | None                         3
seed               int | None                         None

"Breaks your parser by replacing some characters in strings with doppelgangers. Don't worry, this text is clean. ;)"

Mim1c is a master of visual deception, swapping characters for homoglyphs—characters from different Unicode blocks that look nearly identical. The text appears unchanged to human readers but wreaks havoc on string comparisons and parsers.

Substitution Modes:

  • single_script (safest): Only same-script confusables (Latin→Latin variants)
  • mixed_script (default): Allow cross-script substitutions (Latin↔Cyrillic↔Greek)
  • compatibility: Include fullwidth, math alphanumerics, enclosed forms
  • aggressive: All confusable types combined

Locality Control:

The max_consecutive parameter limits consecutive substitutions to prevent the "ransom note" effect where every character is from a different script. Default is 3; set to 0 for unlimited.
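One way to picture the locality rule is a run counter that forces an untouched character once the limit is reached. The sketch below is an assumption about how such a cap could work, not Mim1c's code; the `HOMOGLYPHS` map and `substitute` helper are hypothetical.

```python
import random
from typing import Optional

# Tiny illustrative map from Latin letters to Cyrillic/Greek lookalikes.
HOMOGLYPHS = {"a": "\u0430", "e": "\u0435", "o": "\u03bf", "p": "\u0440", "c": "\u0441"}

def substitute(text: str, rate: float = 0.02, max_consecutive: int = 3,
               seed: Optional[int] = None) -> str:
    """Swap characters for lookalikes, capping runs of back-to-back swaps."""
    rng = random.Random(seed)
    out = []
    run = 0  # number of substitutions made in a row
    for ch in text:
        under_cap = max_consecutive == 0 or run < max_consecutive  # 0 means unlimited
        if ch in HOMOGLYPHS and under_cap and rng.random() < rate:
            out.append(HOMOGLYPHS[ch])
            run += 1
        else:
            out.append(ch)
            run = 0  # any untouched character resets the run
    return "".join(out)

capped = substitute("aaaaaaa", rate=1.0, max_consecutive=3)
print([f"U+{ord(ch):04X}" for ch in capped])
```

With `rate=1.0` the fourth character stays Latin because the run of three has hit the cap, then substitution resumes.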

Script Affinity:

In mixed_script mode, substitutions are weighted by visual plausibility:

  • Latin↔Cyrillic: 0.9 weight
  • Latin↔Greek: 0.8 weight
  • Cyrillic↔Greek: 0.7 weight
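If the weights are read as sampling weights (an assumption; the source only says substitutions are "weighted by visual plausibility"), candidate selection could look like the sketch below. The script labels and the `pick_lookalike` helper are hypothetical.

```python
import random

# Weights copied from the list above, keyed by unordered script pair.
AFFINITY = {
    frozenset({"latin", "cyrillic"}): 0.9,
    frozenset({"latin", "greek"}): 0.8,
    frozenset({"cyrillic", "greek"}): 0.7,
}

def pick_lookalike(source_script, candidates, rng):
    """Pick one (char, script) candidate, weighted by cross-script affinity.

    Same-script candidates are out of scope for this sketch and get weight 0.
    """
    weights = [AFFINITY.get(frozenset({source_script, script}), 0.0)
               for _, script in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

rng = random.Random(7)
candidates = [("\u043e", "cyrillic"), ("\u03bf", "greek")]  # lookalikes for Latin "o"
picks = [pick_lookalike("latin", candidates, rng)[1] for _ in range(10_000)]
# The Cyrillic share should land near 0.9 / (0.9 + 0.8).
print(picks.count("cyrillic") / len(picks))
```

Under this reading the weights only bias which lookalike is chosen; whether a character is substituted at all is still governed by `rate`.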

Homoglyph classes include:

  • Cyrillic lookalikes (а, е, о, р, с, х)
  • Greek lookalikes (Α, Β, Ε, Η, Ι, Κ)
  • Mathematical symbols (ℯ, ℴ, 𝐀, 𝔸)
  • Fullwidth characters (Ａ-Ｚ, ａ-ｚ)
  • Latin Extended forms

Perfect for:

  • Testing Unicode normalization
  • Creating visually-identical adversarial text
  • Breaking naive string matching
  • Testing script detection and spoofing defenses

Input:

Hello World

Output (mixed_script mode):

Ηello Wοrld
(H→Η Greek, o→ο Greek)

Output (single_script mode):

Heɭɭo Worɭd
(l→ɭ Latin Extended)

Output (compatibility mode):

Ｈello Ｗorld
(H→Ｈ Fullwidth, W→Ｗ Fullwidth)
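For the "Testing Unicode normalization" use case, it helps to know which substitutions a normalizer can undo. The sketch below uses only the standard library: NFKC folds fullwidth forms back to ASCII, but it leaves cross-script lookalikes such as Greek Eta untouched. The `survives_nfkc` helper is a hypothetical name.

```python
import unicodedata

def survives_nfkc(original: str, corrupted: str) -> bool:
    """True if the corrupted text still differs from the original after NFKC."""
    normalize = lambda s: unicodedata.normalize("NFKC", s)
    return normalize(corrupted) != normalize(original)

clean = "Hello World"
fullwidth = "\uff28ello \uff37orld"   # fullwidth H and W (compatibility mode)
greek = "\u0397ello W\u03bfrld"       # Greek Eta and omicron (mixed_script mode)

print(survives_nfkc(clean, fullwidth))  # False
print(survives_nfkc(clean, greek))      # True
print(unicodedata.name(greek[0]))       # GREEK CAPITAL LETTER ETA
```

A pipeline that relies on NFKC alone therefore handles compatibility-style substitutions but still needs a confusables check for mixed-script text.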


📝 Pedant

Attribute   Value
Class       Linguistic Hypercorrector
Scope       Word
Initiative  Late
Signature   Hypercorrection evolution forms

Parameters

Parameter  Type               Default
stone      PedantStone | str  Hypercorrectite
seed       int | None         None

"Learned that 'me' is wrong and now overcorrects everywhere."

Pedant is a scholarly glitchling that applies "evolution stones" to transform text using linguistically-grounded hypercorrection patterns. Each stone unlocks different transformations:

  • Hypercorrectite (Andi): Overcorrects "me" to "I" in coordinate structures ("between you and me" → "between you and I")
  • Unsplittium (Infinitoad): "Corrects" split infinitives ("to boldly go" → "boldly to go")
  • Coeurite (Aetheria): Restores archaic ligatures (æ) and diaereses (ö)
  • Curlite (Apostrofae): Polishes straight quotes into typographic pairs
  • Oxfordium (Commama): Enforces serial commas in simple lists

The hypercorrection patterns are grounded in sociolinguistics research (Collins 2022, Labov 1966, Angermeyer & Singler 2003).

Perfect for:

  • Testing robustness to common hypercorrection errors
  • Creating "educated but wrong" text variations
  • Exercising Unicode edge cases with ligatures

Input:

Give it to her and me, to boldly go.

Output (Hypercorrectite):

Give it to her and I, to boldly go.

Output (Unsplittium):

Give it to her and me, boldly to go.
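Two of these transformations can be approximated with regular expressions. The sketch below is deliberately naive and is not Pedant's implementation: the function names and the `STONES` mapping are hypothetical, and the patterns will misfire on text such as "to family members".

```python
import re

def hypercorrectite(text: str) -> str:
    """Overcorrect 'and me' to 'and I' in coordinate structures."""
    return re.sub(r"\band\s+me\b", "and I", text)

def unsplittium(text: str) -> str:
    """Move an -ly adverb out from between 'to' and the following verb."""
    return re.sub(r"\bto\s+(\w+ly)\s+(\w+)", r"\1 to \2", text)

# Hypothetical lookup from stone name to transformation.
STONES = {"Hypercorrectite": hypercorrectite, "Unsplittium": unsplittium}

sample = "Give it to her and me, to boldly go."
for name, transform in STONES.items():
    print(f"{name}: {transform(sample)}")
# Hypercorrectite: Give it to her and I, to boldly go.
# Unsplittium: Give it to her and me, boldly to go.
```

Both rules are deterministic, so a `seed` would only matter for stones that choose among several candidate rewrites.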


🕵️ Redactyl

Attribute   Value
Class       Information Censor
Scope       Word
Initiative  Normal
Signature   Block character redaction

Parameters

Parameter         Type          Default
replacement_char  str           "█"
rate              float | None  0.025
merge_adjacent    bool          false
seed              int           151
unweighted        bool          false

"Some things are better left ████████."

Redactyl is a censorious glitchling that redacts words with solid block characters. It simulates classified documents, FOIA releases, and censored communications.

Features:

  • Configurable replacement character
  • Optional adjacent-word merging
  • Weighted or unweighted word selection

Perfect for:

  • Creating redacted document simulations
  • Testing OCR on censored text
  • Training models on incomplete information

Input:

The secret meeting occurred at midnight.

Output:

The ██████ meeting occurred at ████████.
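A minimal redaction sketch follows. It selects words uniformly (ignoring the weighted mode) and interprets `merge_adjacent` as filling the whitespace between neighbouring redacted words, which is an assumption about that option. The `redact` helper is hypothetical and expects a single-character `replacement_char`.

```python
import random
import re
from typing import Optional

def redact(text: str, replacement_char: str = "\u2588", rate: float = 0.025,
           merge_adjacent: bool = False, seed: Optional[int] = None) -> str:
    """Replace randomly chosen words with a run of `replacement_char`."""
    rng = random.Random(seed)

    def mask(match):
        word = match.group(0)
        return replacement_char * len(word) if rng.random() < rate else word

    out = re.sub(r"\w+", mask, text)
    if merge_adjacent:
        # Assumed behaviour: fill whitespace between two redacted words.
        block = re.escape(replacement_char)
        out = re.sub(rf"(?<={block})\s+(?={block})", lambda m: replacement_char, out)
    return out

print(redact("secret plan", rate=1.0))
print(redact("secret plan", rate=1.0, merge_adjacent=True))
```

Each redaction keeps the original word length, so the layout of the text is preserved even when its content is hidden.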


💨 Rushmore

Attribute   Value
Class       Chaos Agent
Scope       Word
Initiative  Normal
Signature   Multi-mode word attacks

Parameters

Parameter       Type                           Default
modes           RushmoreMode | str | Iterable  (delete,)
rate            float | None                   None
delete_rate     float | None                   None
duplicate_rate  float | None                   None
swap_rate       float | None                   None
seed            int | None                     None
unweighted      bool                           false

"You shouldn't have waited for the last minute to write that paper, anon. Sure hope everything is in the right place."

Rushmore is a versatile glitchling that bundles three distinct attack modes:

  • delete: Randomly removes words from text
  • duplicate: Repeats words in place
  • swap: Exchanges adjacent word positions

Modes can be combined for compound chaos. Each mode has independent rate controls for fine-grained corruption.

Perfect for:

  • Simulating hasty writing errors
  • Testing grammar correction models
  • Creating incomplete/jumbled text

Input:

He found himself transformed in his bed.

Output (delete mode):

He found transformed his bed.
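The three modes and their independent rates can be sketched on a whitespace-split word list. This is an illustrative simplification (uniform selection, fixed delete, duplicate, swap order), not Rushmore's implementation; the `rush` helper is hypothetical.

```python
import random
from typing import Optional

def rush(text: str, delete_rate: float = 0.0, duplicate_rate: float = 0.0,
         swap_rate: float = 0.0, seed: Optional[int] = None) -> str:
    """Apply delete, duplicate, and swap passes, each with its own rate."""
    rng = random.Random(seed)
    words = text.split()

    # delete: drop each word with probability delete_rate
    words = [w for w in words if rng.random() >= delete_rate]

    # duplicate: repeat a word in place with probability duplicate_rate
    doubled = []
    for w in words:
        doubled.extend([w, w] if rng.random() < duplicate_rate else [w])
    words = doubled

    # swap: exchange adjacent pairs, skipping past a pair once it is swapped
    i = 0
    while i < len(words) - 1:
        if rng.random() < swap_rate:
            words[i], words[i + 1] = words[i + 1], words[i]
            i += 2
        else:
            i += 1
    return " ".join(words)

print(rush("a b c", duplicate_rate=1.0))  # a a b b c c
print(rush("a b c d", swap_rate=1.0))     # b a d c
```

Running the passes in sequence means a duplicated word can later be swapped, which is one source of the compound effects mentioned above.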


🔍 Scannequin

Attribute   Value
Class       OCR Simulator
Scope       Character
Initiative  Late
Signature   Scan artifact injection

Parameters

Parameter  Type          Default
rate       float | None  0.02
seed       int | None    None

"Isn't it weird how the word 'bed' looks like a bed?"

Scannequin simulates OCR (Optical Character Recognition) artifacts using common character confusions. It recreates the errors that occur when physical documents are scanned and digitized.

Common confusions:

  • rn ↔ m
  • cl ↔ d
  • l ↔ 1 ↔ I
  • O ↔ 0
  • vv ↔ w

Perfect for:

  • Testing OCR post-processing
  • Creating realistic scan artifacts
  • Training OCR error correction models

Input:

The brown dog jumped over the wall.

Output:

The brovvn clog jurnped over the waII.
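The confusions listed above can be applied with a left-to-right scan that tries two-letter patterns before single characters. The pair list and the `add_scan_noise` helper below are an illustrative sketch, not Scannequin's confusion table.

```python
import random
from typing import Optional

# Confusion pairs taken from the list above; each applies in both directions.
PAIRS = [("rn", "m"), ("cl", "d"), ("vv", "w"), ("O", "0"),
         ("l", "1"), ("l", "I"), ("1", "I")]

def build_table(pairs):
    table = {}
    for a, b in pairs:
        table.setdefault(a, []).append(b)
        table.setdefault(b, []).append(a)
    return table

TABLE = build_table(PAIRS)
KEYS = sorted(TABLE, key=len, reverse=True)  # longest patterns first

def add_scan_noise(text: str, rate: float = 0.02, seed: Optional[int] = None) -> str:
    """Replace confusable character groups with probability `rate`."""
    rng = random.Random(seed)
    out, i = [], 0
    while i < len(text):
        key = next((k for k in KEYS if text.startswith(k, i)), None)
        if key is not None and rng.random() < rate:
            out.append(rng.choice(TABLE[key]))
            i += len(key)  # consume the whole matched pattern
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

print(add_scan_noise("modern", rate=1.0))  # rnoclem
```

Because replacements can change the string length ("m" becomes "rn"), downstream code that relies on character offsets will drift after the first substitution.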


⌨️ Typogre

Attribute   Value
Class       Keyboard Gremlin
Scope       Character
Initiative  Early
Signature   Adjacent-key typos

Parameters

Parameter  Type          Default
rate       float | None  0.02
keyboard   str           "CURATOR_QWERTY"
seed       int | None    None

"What a nice word, would be a shame if something happened to it..."

Typogre is a mischievous glitchling that introduces keyboard-adjacent typing errors. It uses curated keyboard layout maps to inject realistic typos—the kind humans make when their fingers slip to neighboring keys.

Supported keyboards:

  • QWERTY (US, UK variants)
  • QWERTZ (German)
  • AZERTY (French)
  • Dvorak
  • Custom layouts via CURATOR format

Perfect for:

  • Simulating human typing errors
  • Testing spell-check systems
  • Creating realistic noisy training data

Input:

The quick brown fox jumps.

Output:

Thr quicl brown gox jumps.
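A neighbour map is the core of this kind of typo. The sketch below derives one from three letter rows on an unstaggered grid, which is cruder than a curated layout and is not the CURATOR format; `build_neighbors` and `add_typos` are hypothetical helpers.

```python
import random
from typing import Optional

QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]

def build_neighbors(rows):
    """Map each key to the keys touching it on a simple unstaggered grid."""
    neighbors = {}
    for r, row in enumerate(rows):
        for c, key in enumerate(row):
            adjacent = set()
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (dr or dc) and 0 <= rr < len(rows) and 0 <= cc < len(rows[rr]):
                        adjacent.add(rows[rr][cc])
            neighbors[key] = sorted(adjacent)
    return neighbors

NEIGHBORS = build_neighbors(QWERTY_ROWS)

def add_typos(text: str, rate: float = 0.02, seed: Optional[int] = None) -> str:
    """Replace letters with a neighbouring key with probability `rate`."""
    rng = random.Random(seed)
    out = []
    for ch in text:
        options = NEIGHBORS.get(ch.lower())
        if options and rng.random() < rate:
            slip = rng.choice(options)
            out.append(slip.upper() if ch.isupper() else slip)  # keep case
        else:
            out.append(ch)
    return "".join(out)

print(NEIGHBORS["q"])  # ['a', 's', 'w']
print(add_typos("The quick brown fox jumps.", rate=0.2, seed=42))
```

The slips in the sample output above (e→r, k→l, f→g) are all present in this map, since each pair sits side by side on the same row.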


⛓️‍💥 Zeedub

Attribute   Value
Class       Invisible Saboteur
Scope       Character
Initiative  Last
Signature   Zero-width character injection

Parameters

Parameter   Type                  Default
rate        float | None          0.02
seed        int | None            None
characters  Sequence[str] | None  None

"I'm invoking my right to remain silent."

Zeedub is a stealthy glitchling that plants zero-width invisible characters inside words. The text looks completely normal but contains hidden Unicode ghosts that break string matching, tokenization, and search.

Zero-width characters used:

  • U+200B Zero Width Space
  • U+200C Zero Width Non-Joiner
  • U+200D Zero Width Joiner
  • U+FEFF Zero Width No-Break Space

Perfect for:

  • Testing Unicode handling
  • Creating "invisible" text corruption
  • Breaking naive string comparisons
  • Evading keyword filters

Input:

Hello World

Output:

Hel​lo Wor​ld
(Contains invisible U+200B between characters)
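Both sides of this attack fit in a short sketch: injecting zero-width characters between the letters of a word, and stripping them again as a defence. The `inject` and `strip_zero_width` helpers are hypothetical and only illustrate the idea; the `characters` argument mirrors the parameter in the table above.

```python
import random
from typing import Optional, Sequence

ZERO_WIDTH = ["\u200b", "\u200c", "\u200d", "\ufeff"]

def inject(text: str, rate: float = 0.02, characters: Optional[Sequence[str]] = None,
           seed: Optional[int] = None) -> str:
    """Insert zero-width characters between adjacent alphanumerics."""
    rng = random.Random(seed)
    pool = list(characters) if characters else ZERO_WIDTH
    out = []
    for i, ch in enumerate(text):
        out.append(ch)
        nxt = text[i + 1] if i + 1 < len(text) else ""
        # Only insert inside words, never next to spaces or punctuation.
        if ch.isalnum() and nxt.isalnum() and rng.random() < rate:
            out.append(rng.choice(pool))
    return "".join(out)

def strip_zero_width(text: str) -> str:
    """Remove the zero-width characters listed above."""
    return "".join(ch for ch in text if ch not in ZERO_WIDTH)

hidden = inject("Hello World", rate=1.0, seed=0)
print(hidden == "Hello World", len(hidden))            # False 19
print(strip_zero_width(hidden) == "Hello World")       # True
```

The rendered strings look the same, yet the length check and the equality test both expose the injected characters, which is the kind of mismatch naive string comparisons miss.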