Determinism Guide#
Glitchlings are deterministic by design. Same seed + same input = same output, every time, on every machine. This matters because reproducible experiments are the foundation of trustworthy researchβand because debugging non-deterministic corruption is a special kind of hell.
How Seeds Work#
Every glitchling owns a private random.Random instance. It never touches Python's global RNG state, so glitchlings don't interfere with each other or with your other code.
Individual Glitchlings#
Pass seed= on construction:
from glitchlings import Typogre
typo = Typogre(rate=0.1, seed=42)
typo("Hello world") # Always the same output
typo("Hello world") # Same again
Reset the RNG mid-run with reset_rng() if you need to replay:
Gaggles#
Pass seed= to the Gaggle, and it derives per-glitchling seeds automatically:
from glitchlings import Gaggle, Typogre, Mim1c
gaggle = Gaggle([Typogre(rate=0.02), Mim1c(rate=0.01)], seed=404)
The derived seeds depend on:
- The master seed you provide
- Each glitchling's class name
- The order of glitchlings in the roster
Roster Order Matters
Changing the order of glitchlings changes their derived seeds, even if the master seed stays the same. If you need identical output, keep the roster order stable.
Attack Helper#
The Attack class follows the same pattern:
from glitchlings.attack import Attack
from glitchlings import Typogre
# Seed controls the internal Gaggle
attack = Attack([Typogre(rate=0.02)], seed=404)
result = attack.run("Hello world") # Reproducible
If you pass an existing Gaggle or Glitchling, Attack clones it before applying the seedβso your original instance stays untouched.
Dataset Corruption#
Dataset wrappers inherit their Gaggle's seed:
from glitchlings import Gaggle, Typogre
from glitchlings.dlc.huggingface import GlitchedDataset
gaggle = Gaggle([Typogre(rate=0.02)], seed=404)
corrupted = gaggle.corrupt_dataset(dataset, columns=["text"])
# Reproducible across runs
Rust Pipeline#
Seeds are forwarded to the compiled Rust pipeline. Keep your master seed stable across runs, and both Python and Rust will produce identical output.
Common Pitfalls#
Unstable Ordering#
When sampling from a set or dictionary, Python's iteration order can vary. Always sort before sampling:
# BAD: order may vary
candidates = {word for word in text.split()}
selected = rng.sample(list(candidates), 3)
# GOOD: stable order
candidates = sorted(set(text.split()))
selected = rng.sample(candidates, 3)
Global RNG State#
Never use random.choice() or random.shuffle() directly. Always go through the glitchling's self.rng:
# BAD: touches global state
import random
selected = random.choice(options)
# GOOD: uses instance RNG
selected = self.rng.choice(options)
Floating-Point Comparisons#
If you're making decisions based on rate < threshold, remember that floating-point arithmetic isn't always exact. Stick to <= or < consistently.
External Data#
If your glitchling loads data from a file (like a homophone dictionary), sort the data on load. File system order can vary between machines.
Debugging Non-Determinism#
If your glitchling produces different output for the same seed and input, check these in order:
- Set vs. dict iteration β Sort before sampling
- Direct
randomcalls β Route throughself.rng - Candidate pool ordering β Sort before shuffle/sample
- External file loading β Sort on read
- Insufficient sort keys β Add secondary keys if primary keys can tie
The test suite includes tests/core/test_glitchlings_determinism.py which runs each glitchling multiple times with the same seed and asserts identical output. Run it to verify your changes:
Quick Checklist#
Before shipping a new glitchling or modification:
- [ ] Pass
seed=on construction - [ ] Use
self.rngfor all random operations - [ ] Sort any sets/dicts before sampling
- [ ] Sort external data on load
- [ ] Run the determinism test suite
See Also#
- Development Guide β Testing and contribution workflow
- Attack Helper β How Attack handles seed propagation