rulest - GPU Rules Extractor

Waffle

This is good work! This approaches the problem from what I think is a new direction, and I'm really liking what I see here. Very much appreciate you releasing it.

A1131

Thank you. It's a bit of reinventing the wheel, but I once ran into a problem extracting rules with the available programs (ULM still rules, in my opinion), and that's how rulest came about.

A1131

Test done on a 1050 Ti with a target depth of 6 and a target runtime of 8 hours. The job completed in about 2 hours, with 1,311,335 generated chains and 724,887 valid unique chains found. You can think of the --target-hours flag as a multiplier on the number of rules generated for each depth; in automatic mode these values will differ from one GPU to another.

https://gist.github.com/A113L/9c0ecf6d9888ec1680ac98d533583e22/raw/c9f4f0d10206ed3ac2cc03f6c71d5d0b004b4d29/gistfile1.txt

Minor improvements:

The extracted rules are now sorted by frequency (most hits first) for better prioritization, with the empty rule : placed at the top.
A short header with metadata (total unique rules and total hits) is added to the output file for clarity.
Removed global cap limit of generated chains (can be limited with flags).

# Generated by rulest_v2.0.py
# Total unique rules: 878446
# Total hits (sum of frequencies): 377080316
:
z6 x05
z8 '5
x33 z8
x23 y4
'0 y4
z8 x44 Z3
i4% z7 i8E x74
i60 x22 y4
z5 x30
R0 z7
z9 x26
x55 Z9
i0+ x77 Z5
x44 p4
z9 x73
z7 x16
x44 y5
o0[ x55 p9
x23 y6
x66 Z3
D2 i5m o4. i4v z8 x51
...
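
For anyone curious how that output ordering could be produced, here is a minimal sketch. It is not the code from rulest itself: the function name format_rules and the tie-break on rule text are my own assumptions, and the hit counts are made up.

```python
from collections import Counter

def format_rules(hits: Counter, generator: str = "rulest_v2.0.py") -> list[str]:
    """Return output lines: metadata header, ':' first, then rules by hit count."""
    header = [
        f"# Generated by {generator}",
        f"# Total unique rules: {len(hits)}",
        f"# Total hits (sum of frequencies): {sum(hits.values())}",
    ]
    # Most hits first; the rule text is only a tie-breaker (an assumption)
    # so that the output is stable between runs.
    ordered = sorted(hits, key=lambda rule: (-hits[rule], rule))
    # The no-op rule ':' goes to the top regardless of its count.
    if ":" in ordered:
        ordered.remove(":")
        ordered.insert(0, ":")
    return header + ordered

# Made-up counts, just to show the ordering.
hits = Counter({"z6 x05": 120, ":": 5, "x23 y4": 90, "z8 '5": 120})
lines = format_rules(hits)
print("\n".join(lines))
```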

A1131

Removed the hardcoded depth limit: --max-depth now accepts any value and the kernel constant is set dynamically, enabling arbitrary chain depths via recursive rule expansion.
The new optional --seed-rules parameter allows loading a file with previously successful rules. These seeds are grouped by depth and used at every stage of chain generation. For each target depth, the algorithm first adds existing seeds of that depth, then extends seeds from the previous depth by appending a random rule, and finally fills any remaining budget with random (hot‑biased) chains. Newly generated chains become seeds for deeper levels, enabling a recursive expansion that builds upon proven patterns. This focused approach makes it feasible to efficiently generate and test rule chains well beyond depth 6, leveraging prior knowledge to explore the most promising combinations.
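
The three steps per depth (existing seeds, extended seeds, random fill) can be sketched roughly like this. Everything here is illustrative: the function names, the budget parameter and the uniform random fill are assumptions (the real fill is hot-biased), and splitting on spaces would mis-tokenise rules that operate on the space character.

```python
import random
from collections import defaultdict

def group_seeds_by_depth(seed_lines):
    """Group seed chains by depth (number of space-separated rule tokens)."""
    by_depth = defaultdict(list)
    for line in seed_lines:
        chain = tuple(line.split())
        if chain:
            by_depth[len(chain)].append(chain)
    return by_depth

def build_chains(depth, budget, seeds, pool, rng):
    """Chains for one depth: existing seeds, then extended seeds, then random fill."""
    chains = list(dict.fromkeys(seeds.get(depth, [])))[:budget]
    seen = set(chains)
    # Extend each seed of the previous depth by appending one random rule.
    for parent in seeds.get(depth - 1, []):
        if len(chains) >= budget:
            break
        child = parent + (rng.choice(pool),)
        if child not in seen:
            seen.add(child)
            chains.append(child)
    # Fill what is left of the budget with random chains (uniform in this sketch).
    attempts = 0
    while len(chains) < budget and attempts < 20 * budget:
        attempts += 1
        child = tuple(rng.choice(pool) for _ in range(depth))
        if child not in seen:
            seen.add(child)
            chains.append(child)
    return chains

rng = random.Random(0)
pool = ["l", "u", "c", "r", "$1", "$2"]
seeds = group_seeds_by_depth(["$1 $2", "c"])
for depth in range(1, 4):
    # Chains generated at this depth become the seeds for the next one.
    seeds[depth] = build_chains(depth, budget=5, seeds=seeds, pool=pool, rng=rng)
```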

Waffle

Thanks - I'm working with a very large set, and looking at multiple ways to deal with the scope... none of this is easy work, and you are absolutely going down a beneficial path.

A1131

Not right, not left - straight ahead. If you are working on sets of rules, I also have a concentrator tool; there is a link here on the forum and on Discord. It can process very large sets of rules and debug data and was created mainly for optimizing, but I don't know what problem you are dealing with or whether it will be useful for it.

A1131

Fixes

Rule Validator – Added $ and ^ to the branch that expects exactly one following character. Without this, those rules were filtered out and never made it to the GPU kernel.
Seed Loading – The same validator now accepts $ and ^ rules, so seed files containing them are fully loaded and used.
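
To illustrate the kind of branch the fix touches, here is a simplified validator sketch. It is not the validator from rulest: the operator groupings below are my own reading of the Hashcat rule syntax and cover only a subset, and splitting chains on spaces ignores rules that take a space as their argument.

```python
POSITIONS = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ")

NO_ARG = set(":lucCtrdf{}[]qkKE")   # operator only
ONE_POS = set("TpDzZ'+-.,LRyY")     # operator + one position
ONE_CHAR = set("$^@e")              # operator + exactly one character
TWO_POS = set("xO*")                # operator + two positions
POS_CHAR = set("io3")               # operator + position + character

def is_valid_token(tok: str) -> bool:
    """Check that a single rule token has the right number and kind of arguments."""
    if not tok:
        return False
    op, args = tok[0], tok[1:]
    if op in NO_ARG:
        return args == ""
    if op in ONE_POS:
        return len(args) == 1 and args in POSITIONS
    if op in ONE_CHAR:
        # '$' and '^' belong here: any single following character is fine.
        return len(args) == 1
    if op in TWO_POS:
        return len(args) == 2 and all(a in POSITIONS for a in args)
    if op in POS_CHAR:
        return len(args) == 2 and args[0] in POSITIONS
    if op == "s":
        return len(args) == 2
    return False

def is_valid_chain(line: str) -> bool:
    """A chain is valid when it is non-empty and every token is valid."""
    tokens = line.split()
    return bool(tokens) and all(is_valid_token(t) for t in tokens)
```

If '$' and '^' were missing from ONE_CHAR, chains like "^e ^k ^i ^r ^t ^s R2" would fall through to the final return False and never reach the kernel.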

# Generated by rulest_v2.0.py (seeded)
# Total unique rules: 145287
# Total hits (sum of frequencies): 52455891
:
]
l
c
^s ^o ^l ^l ^o ^p '2
x65
'0
x54
x00
x01
o1@ x00
^e ^k ^i ^r ^t ^s R2
x11 i0d x06
x11
x32
R3
*64 o0q '0
R0
x33
D1 $8 $5 $0 $0 x87
o48 '0
o28 R0
o1E x33
x34 p2
^F x22
E
...

https://gist.github.com/A113L/5ef467ced57f40bc7dadedce775cda6a

found_chains.txt

A1131

Added the missing GPU-compatible Hashcat rules: bitwise shifts (L, R), replace with the next/previous character (. and ,), omit range (O), and toggle case after the Nth separator (3). Updated rule generation to add O for all digit pairs and 3NX for all digits and delimiters. Improved the kernel to support these operations, ensuring full GPU compatibility. No memory or discard rules were added.
Note: after extraction, even though all rules are Hashcat-compatible, it is still recommended to run the resulting ruleset through concentrator's functional minimization.
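
As a CPU-side reference for what these operators do to a word, here is a small sketch of L, R, '.', ',' and O following the Hashcat rule documentation. It is not the GPU kernel; the names pos and apply_rule are made up for the example, and the 3NX rule is left out.

```python
POSITIONS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def pos(ch: str) -> int:
    """Decode a Hashcat position character: 0-9, then A-Z for 10-35."""
    return POSITIONS.index(ch)  # raises ValueError for anything else

def apply_rule(rule: str, word: bytes) -> bytes:
    """Apply one of L, R, '.', ',' or O to a word; out-of-range rules are no-ops."""
    op, n = rule[0], pos(rule[1])
    w = bytearray(word)
    if op == "L":                      # bitwise shift left of the byte at N
        if n < len(w):
            w[n] = (w[n] << 1) & 0xFF
    elif op == "R":                    # bitwise shift right of the byte at N
        if n < len(w):
            w[n] >>= 1
    elif op == ".":                    # replace byte at N with the byte at N+1
        if n + 1 < len(w):
            w[n] = w[n + 1]
    elif op == ",":                    # replace byte at N with the byte at N-1
        if 1 <= n < len(w):
            w[n] = w[n - 1]
    elif op == "O":                    # omit M bytes starting at position N
        m = pos(rule[2])
        if n + m <= len(w):
            del w[n:n + m]
    else:
        raise ValueError(f"unsupported rule: {rule!r}")
    return bytes(w)

word = b"p@ssW0rd"
shifted = apply_rule("L2", word)   # 's' (0x73) becomes 0xE6
omitted = apply_rule("O12", word)  # drops "@s"
```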

A1131

The signature-based minimizer eliminates functional redundancy: it builds a deterministic probe set of base words (≥ min‑word‑len) and computes each rule’s output signature. Rules that produce identical outputs on all probe words are considered equivalent; only the one with the highest GPU hit‑count survives. This reduces the final rule set dramatically while preserving all unique transformations.
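
A rough sketch of the signature idea, assuming a toy rule interpreter that only knows a handful of operators (':', l, u, c, r, $X, ^X). The function names, probe words and hit counts are all made up for illustration; the real probe set and rule engine are not shown in this thread.

```python
def apply_chain(chain: str, word: str) -> str:
    """Toy interpreter for a few Hashcat operators (illustration only)."""
    for tok in chain.split():
        op = tok[0]
        if op == ":":
            pass
        elif op == "l":
            word = word.lower()
        elif op == "u":
            word = word.upper()
        elif op == "c":
            word = word[:1].upper() + word[1:].lower()
        elif op == "r":
            word = word[::-1]
        elif op == "$":
            word = word + tok[1]
        elif op == "^":
            word = tok[1] + word
        else:
            raise ValueError(f"unsupported token: {tok!r}")
    return word

def minimize(hits: dict, probes: list) -> dict:
    """Keep one chain per output signature: the one with the highest hit count."""
    best = {}
    for chain, count in hits.items():
        # The signature is the tuple of outputs over the fixed probe set.
        signature = tuple(apply_chain(chain, w) for w in probes)
        if signature not in best or count > best[signature][1]:
            best[signature] = (chain, count)
    return dict(best.values())

probes = ["Password12", "qwertyuiop", "MixedCase99"]  # all at least 10 chars
hits = {"l": 50, "l l": 7, "u l": 3, "c": 40, "$1": 30, ": $1": 2, "r r": 1, ":": 10}
kept = minimize(hits, probes)
```

Here "l l" and "u l" collapse into "l", ": $1" into "$1", and "r r" into ":", because they produce the same outputs on every probe word.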
The numeric‑seed extraction pass tests three built‑in families (pure prepend, pure append, and mixed prepend/append) up to depth 4 as direct GPU chain candidates—independent of random chain generation. It captures common numeric mutations (e.g., ^1 ^2, $9 $8, ^1 $2) efficiently, ensuring these high‑value patterns are never missed, even if the random chain budget is small.
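
The three numeric families can be enumerated with a few lines. This is only a sketch under the assumption that every digit combination is generated exhaustively; the generator name and the family labels are mine, and rulest may well prune or order the candidates differently.

```python
from itertools import product

DIGITS = "0123456789"

def numeric_seed_chains(max_depth: int = 4):
    """Yield (family, chain) for pure prepend, pure append and mixed digit chains."""
    for depth in range(1, max_depth + 1):
        for digits in product(DIGITS, repeat=depth):
            for ops in product("^$", repeat=depth):
                if set(ops) == {"^"}:
                    family = "prepend"
                elif set(ops) == {"$"}:
                    family = "append"
                else:
                    family = "mixed"
                chain = " ".join(op + d for op, d in zip(ops, digits))
                yield family, chain

# Depth 2 only, to keep the example small: 20 + 400 candidates.
seeds = list(numeric_seed_chains(max_depth=2))
chains = {chain for _, chain in seeds}
```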

A1131

# rulest — GPU-Compatible Hashcat Rules Engine
# Generated      : 2026-03-28 04:14:53
# Base           : hashmob.mini.txt
# Target         : hashmob.small.found
# Depth          : 1–10
# Bloom          : 512 MB
#
# GPU raw candidates      : 268,674  (bloom hits, includes false positives)
# Post-processing         : signature-based minimization
#   Probe words           : 21  (min length 10)
#   Equiv. rules removed  : 106,848
#
# Rules kept     : 161,825  (d1:1,856  d2:42,820  d3:34,157  d4:42,742  d5:10,555  d6:9,387  d7:7,435  d8:5,792  d9:3,892  d10:3,189)
# Sorted by      : GPU frequency (descending, UTF-8)
:
$1 $2 $3
$1 $2 $3 $4
x45 i5b 31n
$2 $3 $4
$4 $5 $6
$1 $0 $1
$2 $0 $0 $0
$2 $0 $1 $0
$2 $0 $0 $2
x03 36e
$2 $0 $1 $2
$3 $2 $1
$2 $0 $0 $1
$2 $0 $0 $3
$0 $0 $7
$2 $0 $0 $4
$2 $0 $0 $5
$1 $0 $0

A1131

The built-in seeds in rulest are now five categories of numeric rule chains (prepend/append, mixed, transform+digit, date patterns) automatically generated and tested against the bloom filter in Phase S. These seeds help extract common numeric transformations (e.g., adding years, digits) without requiring manual input.

The --no-builtin-seeds parameter disables Phase S entirely. Use it when:

Your target wordlist contains few or no numeric patterns.
You want to reduce GPU runtime by skipping thousands of numeric seeds.
You rely solely on atomic rules (Phase 1) and random chains (Phase 2) or your own --seed-rules file.

Without this flag, Phase S always runs, testing numeric chains up to depth 4 (e.g., ^1 ^2, $1 $9 $9 $0, u $1) as well as longer date patterns like $0 $1 $0 $1 $2 $0 $2 $4.

A1131

New rule categories added to the built-in generator

Phase S - Built-in Seed Families (A–M) 

Numeric families 
A Pure Prepend digits (depths 1–4) 
B Pure Append digits (depths 1–4) 
C Mixed Prepend/Append digits (depths 1–4) 
D Transform + digit/bracket (depths 2–4) 
E Date patterns DDMM/YYYY/… (depths 4–9) 

Special-character families 
F Pure Append special chars (depths 1–3, top-15 chars) 
G Pure Prepend special chars (depths 1–3, top-15 chars) 
H Transform + special char (depths 2–3, top-15 chars) 
I Digit(s) + special char (depths 2–4, core-7 chars) 
— covers the ubiquitous "word123!" / "!word123" patterns 

New families 
J Leet substitutions (depths 1–2, 10 core pairs) 
— sa@ se3 so0 si1 sl1 ss5 ss$ st7 sa4 si! 
— depth 2: leet + digit/special suffix/prefix 
— depth 2: double-leet chains (e.g. "sa@ so0" → "p@ssw0rd") 
K Double-transform chains (depth 2, all 15×15 pairs) 
— covers "c r", "u d", "t f", "E l", "c {", "l ]", etc. 
L Special-before-digit patterns (depths 2–3, core-7 chars) 
— reverse orientation of Family I: "!1word" / "word!12" 
— append: $sp $d… prepend: ^d… ^sp 
M Leet + transform chains (depth 2) 
— leet substitution followed by a transform op (all 15) 
— and transform op followed by leet substitution 
— covers "P@ssword", "@DMIN", "p@SSW0RD" patterns 

Special chars - top-15 (F/G/H): ! @ # $ % ^ & * ? . - _ + ( )
Special chars - core-7 (I/L): ! @ # $ % * ?
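
To make two of the families concrete, here is a sketch that generates the append half of Family I and the double-leet chains of Family J, with a toy interpreter to check what they produce. The function names are hypothetical, and skipping pairs that substitute the same source character is my own assumption (the second substitution would be a no-op).

```python
from itertools import product

LEET = ["sa@", "se3", "so0", "si1", "sl1", "ss5", "ss$", "st7", "sa4", "si!"]
CORE7 = "!@#$%*?"
DIGITS = "0123456789"

def apply_chain(chain: str, word: str) -> str:
    """Toy interpreter for sXY, $X and ^X only (illustration, not the GPU kernel)."""
    for tok in chain.split():
        op = tok[0]
        if op == "s":
            word = word.replace(tok[1], tok[2])
        elif op == "$":
            word = word + tok[1]
        elif op == "^":
            word = tok[1] + word
        else:
            raise ValueError(f"unsupported token: {tok!r}")
    return word

def family_i_append(max_digits: int = 3):
    """Family I, append form: 1-3 appended digits followed by one core-7 char."""
    for n in range(1, max_digits + 1):
        for digits in product(DIGITS, repeat=n):
            for sp in CORE7:
                yield " ".join("$" + d for d in digits) + " $" + sp

def family_j_double_leet():
    """Family J, depth 2: two leet substitutions with different source chars."""
    return [f"{a} {b}" for a, b in product(LEET, repeat=2) if a[1] != b[1]]

family_i = list(family_i_append())
family_j = family_j_double_leet()
print(apply_chain("sa@ so0", "password"))      # p@ssw0rd
print(apply_chain("$1 $2 $3 $!", "word"))      # word123!
```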

A1131

Added Phase 3 GA (genetic algorithm) that tries to find good rule chains by mimicking evolution. It starts with a mix of promising chains (from Phase 1 hits) and random ones. Each chain gets a fitness score = how many base words it successfully transforms into target-like words (checked via GPU). Top chains survive as "elite". Pairs of chains swap parts (crossover) and get small random changes (mutations) to create new chains. Weak chains die out. Over many generations, the population shifts toward high-scoring chains. This finds useful rules that random sampling would likely miss. Finally, all discovered chains are merged and deduplicated before output.

  An optional evolutionary search that runs after Phase 2 and complements
  random chain sampling with guided, coverage-driven optimisation.

  Why it fits this project
  ────────────────────────
  • The fitness function (bloom-filter hits) is already computed by the
    existing GPU chain kernel — no new GPU code is required.
  • Phase 2 samples chains *uniformly at random* from the atomic-rule pool.
    For depth ≥ 3 the search space is |pool|^depth (millions of candidates);
    the GA focuses probability mass on high-hit-rate regions of that space.
  • Hot atomic rules from Phase 1 seed the initial population, giving the GA
    a strong head start rather than searching from scratch.
  • All Phase-3 discoveries are merged into the global hit counter before
    signature-based minimisation, so they benefit from the same deduplication
    and sorting as Phase 1 and Phase 2 results.

  Algorithm summary
  ─────────────────
  1. Initial population — 30 % depth-2 hot-rule combos, 30 % seeded deeper
     chains, 40 % random — ensures both exploitation and exploration.
  2. GPU-batch fitness evaluation — reuses _run_chain_kernel unchanged.
  3. Tournament selection (k = 4) — low-pressure, maintains diversity.
  4. One-point crossover (p = 0.80) — exchanges rule-token sub-sequences.
  5. Mutation — replace / insert / delete one rule token (weights 60/20/20).
  6. Elitism — top <elite_frac> individuals survive unchanged each generation.
  7. Diversity guard — duplicate individuals are replaced by random chains.
  8. Terminates when <--genetic-generations> is reached or the wall-clock
     time budget (--target-hours remainder) is exhausted.

  CLI flags
  ─────────
    --genetic                   Enable Phase 3 (default: disabled)
    --genetic-generations N     Max generations (default: 50)
    --genetic-pop N             Population size (default: 200)
    --genetic-elite F           Elite fraction, e.g. 0.15 (default: 0.15)
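
The algorithm summary above can be sketched on the CPU as follows. This is a toy version, not the Phase 3 code: the score callback stands in for the GPU bloom-hit count, the initial population is fully random instead of the 30/30/40 mix, mutation is applied to every child, and the time-budget stop is left out. All function names are my own.

```python
import random

def tournament(pop, fitness, rng, k=4):
    """Pick the fittest of k randomly drawn individuals."""
    return max(rng.sample(pop, k), key=fitness.__getitem__)

def crossover(a, b, rng, p=0.80):
    """One-point crossover on rule-token sequences; the child has len(b) tokens."""
    if rng.random() >= p or min(len(a), len(b)) < 2:
        return a
    cut = rng.randint(1, min(len(a), len(b)) - 1)
    return a[:cut] + b[cut:]

def mutate(chain, pool, rng, max_depth):
    """Replace / insert / delete one rule token with weights 60/20/20."""
    op = rng.choices(["replace", "insert", "delete"], weights=[60, 20, 20])[0]
    tokens = list(chain)
    i = rng.randrange(len(tokens))
    if op == "replace":
        tokens[i] = rng.choice(pool)
    elif op == "insert" and len(tokens) < max_depth:
        tokens.insert(i, rng.choice(pool))
    elif op == "delete" and len(tokens) > 1:
        del tokens[i]
    return tuple(tokens)

def evolve(pool, score, rng, pop_size=200, generations=50, elite_frac=0.15, max_depth=4):
    """Return {chain: score} for every chain with a positive score seen during the run."""
    def random_chain():
        return tuple(rng.choice(pool) for _ in range(rng.randint(2, max_depth)))

    pop = [random_chain() for _ in range(pop_size)]
    found = {}
    for _ in range(generations):
        # Batch fitness evaluation (the GPU chain kernel in the real tool).
        fitness = {chain: score(chain) for chain in dict.fromkeys(pop)}
        found.update({c: f for c, f in fitness.items() if f > 0})
        ranked = sorted(fitness, key=fitness.get, reverse=True)
        nxt = ranked[:max(1, int(elite_frac * pop_size))]  # elitism
        seen = set(nxt)
        while len(nxt) < pop_size:
            parent_a = tournament(pop, fitness, rng)
            parent_b = tournament(pop, fitness, rng)
            child = mutate(crossover(parent_a, parent_b, rng), pool, rng, max_depth)
            if child in seen:  # diversity guard: swap duplicates for a random chain
                child = random_chain()
            seen.add(child)
            nxt.append(child)
        pop = nxt
    return found

rng = random.Random(1)
pool = ["l", "u", "c", "r", "$1", "$2", "^1", "d"]
hot = {"c", "$1"}  # pretend these tokens produce hits
found = evolve(pool, lambda chain: sum(t in hot for t in chain), rng,
               pop_size=40, generations=10)
```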

A1131

Phase 3 GA improvements

Novelty‑weighted fitness – Chains not found by Phase 1/S/2 get a 2× bonus during selection → drives GA toward new rules, not rediscovered ones.
Unexplored‑seed initial population – The 40 % Phase‑S fill slot prefers chains absent from known_rules; known seeds only as fallback. Depth‑3+ chains biased (70 %) when --max-depth ≥ 3.
Dedicated time reservation – 20 % of --target-hours (min 120 s) reserved for GA before Phase 2 starts → GA always gets meaningful runtime.
Stagnation guard – No fitness improvement for 5 generations → bottom 30 % of population replaced with fresh random chains (depth‑3+ biased).
Depth‑2 warning – Warns that Phase 2 already covers depth‑2 exhaustively → GA adds nothing at depth 2; recommends --max-depth ≥ 3.
Functional‑signature registry – Tracks equivalence classes via built‑in probe set. Novelty bonus is functionally aware – equivalent variants get no bonus.
Adaptive mutation – If mutated chain falls into a known signature class, up to 2 extra escape mutations applied to break out.
Signature‑based offspring filter – Offspring still covered after adaptive mutation → replaced by a fresh random chain (depth‑3+ biased).
Honest raw‑hit merging – Stores raw (un‑bonused) hit counts for output; novelty bonus only affects selection.
Genuinely novel reporting – Log shows both total GA hits and how many were truly new (absent from known_rules at GA start).
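
Two of these points, the novelty bonus and the stagnation guard, are easy to show in isolation. The sketch below is a guess at the shape of the logic, not the actual implementation: the function names are hypothetical, and novelty is checked by exact chain membership here rather than by functional signature.

```python
def selection_score(chain, raw_hits, known_rules, bonus=2.0):
    """Score used for selection only: raw hits, doubled for chains not yet known."""
    return raw_hits * (bonus if chain not in known_rules else 1.0)

def is_stagnant(best_history, patience=5):
    """True when the best fitness has not improved over the last `patience` generations."""
    if len(best_history) <= patience:
        return False
    return max(best_history[-patience:]) <= max(best_history[:-patience])

def refresh_weakest(pop, fitness, random_chain, frac=0.30):
    """Replace the bottom `frac` of the population with fresh random chains."""
    ranked = sorted(pop, key=fitness, reverse=True)
    keep = len(ranked) - int(frac * len(ranked))
    return ranked[:keep] + [random_chain() for _ in range(len(ranked) - keep)]

known_rules = {("$1",), ("c", "$1")}
raw = {("c", "$1"): 100, ("c", "$1", "$!"): 60}
# The raw counts are what goes to the output; the bonus only steers selection.
scores = {chain: selection_score(chain, hits, known_rules) for chain, hits in raw.items()}

pop = [("r%d" % i,) for i in range(10)]
new_pop = refresh_weakest(pop, fitness=lambda c: int(c[0][1:]),
                          random_chain=lambda: ("fresh",))
```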

A1131

Benchmarks of GA and the new stage 0 (rule extraction using token strip) are available here

A1131

The Bloom filter, previously one of the slowest parts of tool startup, is now built on the GPU in seconds, regardless of how large the target wordlist is. The CPU used to grind through it word by word; the GPU does it all at once. If your hardware doesn't support it, it silently falls back to the old way.
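
For readers who have not met a Bloom filter before, here is a minimal CPU-side sketch of building and querying one. It only illustrates the data structure: the class, the BLAKE2b-based double hashing and the sizes are my assumptions, not what rulest uses. Setting the bits for one word does not depend on any other word, which is the property that lets the build be spread across GPU threads.

```python
import hashlib

class BloomFilter:
    """Bit-array set membership with possible false positives, no false negatives."""

    def __init__(self, size_bits: int, num_hashes: int = 4):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _indexes(self, item: bytes):
        # Double hashing: derive k bit indexes from two 64-bit halves of one digest.
        digest = hashlib.blake2b(item, digest_size=16).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:], "little") | 1
        return [(h1 + i * h2) % self.size for i in range(self.k)]

    def add(self, item: bytes) -> None:
        for i in self._indexes(item):
            self.bits[i >> 3] |= 1 << (i & 7)

    def __contains__(self, item: bytes) -> bool:
        return all(self.bits[i >> 3] & (1 << (i & 7)) for i in self._indexes(item))

# Build the filter from a (tiny, made-up) target list, then test candidates.
targets = [b"password1", b"Summer2024!", b"letmein123"]
bloom = BloomFilter(size_bits=1 << 16)
for word in targets:
    bloom.add(word)
hits = [w for w in (b"password1", b"letmein123") if w in bloom]
```

A hit only means "probably in the target list", which is why the raw GPU candidate counts include false positives.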

hashpwn
