rulest - GPU Rules Extractor

    A1131
    Trusted
    wrote on last edited by A1131
    #1

    GPU-Accelerated Rule Mining for Hash Cracking Wordlists (Full Chain Extraction)

    This project implements a highly optimized tool for extracting effective word-transformation rule chains by leveraging PyOpenCL to run the transformation logic on a Graphics Processing Unit (GPU). It uses a Breadth-First Search (BFS) approach to efficiently explore multi-step rule combinations (chains) that convert base dictionary words into target dictionary words.

    Overview
    Rule-based attacks are highly effective against passwords. This utility automates the process of discovering the most successful rule chains (e.g., l $1 T0 - lowercase, append '1', toggle case at position 0) that have high hit rates against a specific target wordlist, accelerating rule-set generation for tools like Hashcat.
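To make the example chain concrete, here is a minimal CPU-only sketch of how `l $1 T0` turns one word into another. It is not rulest's GPU kernel: the function names `apply_rule` and `apply_chain` are hypothetical, only a handful of Hashcat operations are covered, and splitting the chain on whitespace means a rule such as "append a space" is not handled.

```python
def apply_rule(word: str, rule: str) -> str:
    """Apply one rule from a small illustrative subset of Hashcat's rule language."""
    op, args = rule[0], rule[1:]
    if op == ":":                       # no-op
        return word
    if op == "l":                       # lowercase all letters
        return word.lower()
    if op == "u":                       # uppercase all letters
        return word.upper()
    if op == "c":                       # capitalize first letter, lowercase the rest
        return word[:1].upper() + word[1:].lower()
    if op == "r":                       # reverse the word
        return word[::-1]
    if op == "$":                       # append one character
        return word + args
    if op == "^":                       # prepend one character
        return args + word
    if op == "T":                       # toggle case at position N (0-9, then A-Z)
        n = int(args, 36)
        if n >= len(word):
            return word
        return word[:n] + word[n].swapcase() + word[n + 1:]
    raise ValueError(f"rule not covered by this sketch: {rule!r}")


def apply_chain(word: str, chain: str) -> str:
    """Apply a space-separated rule chain from left to right."""
    for rule in chain.split():
        word = apply_rule(word, rule)
    return word


print(apply_chain("PASSWORD", "l $1 T0"))  # Password1
```

The chain is applied strictly left to right, which is why `l` has to come before `T0` for the result to start with a capital letter.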

    Key Features

    OpenCL Acceleration: Word transformations are processed in parallel on the GPU for massive performance gains.

    Rule Chaining (BFS): Supports exploring rule chains up to a configurable depth (e.g., -d 3).

    Full Chain Extraction: Outputs the complete sequence of rules (R1 R2 ... Rn) responsible for a hit.

    Optimized Data Handling: Uses np.uint8 and np.uint16 buffers for minimal memory overhead and fast GPU transfers.

    Rule Filtering: Allows using an external file to limit the internal rule set, focusing the search space.
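The BFS idea behind the chaining feature can be sketched on the CPU as follows. This is only an illustration under stated assumptions: the `RULES` table is a hypothetical six-rule subset, `find_chains` is not a function from rulest, and the real tool evaluates candidates in parallel on the GPU rather than word by word.

```python
from collections import deque

# Hypothetical mini rule set: rule text -> Python function with the same effect.
RULES = {
    "l": str.lower,
    "u": str.upper,
    "c": str.capitalize,
    "r": lambda w: w[::-1],
    "$1": lambda w: w + "1",
    "$!": lambda w: w + "!",
}


def find_chains(base_words, targets, max_depth):
    """Breadth-first search over rule chains; return chains that produce a target word."""
    targets = set(targets)
    hits = set()
    for base in base_words:
        queue = deque([(base, ())])
        while queue:
            word, chain = queue.popleft()
            if len(chain) == max_depth:
                continue                      # do not extend beyond the depth limit
            for name, fn in RULES.items():
                new_word, new_chain = fn(word), chain + (name,)
                if new_word in targets:
                    hits.add(" ".join(new_chain))
                queue.append((new_word, new_chain))
    return hits


print(sorted(find_chains(["password"], ["Password1"], max_depth=2)))
```

With `n` rules the search visits on the order of `n**depth` chains per base word, which is the growth that makes GPU parallelism and a filtered rule set attractive.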

    Requirements

    Python 3.x

    numpy

    pyopencl (Requires OpenCL runtime drivers for your GPU vendor: NVIDIA, AMD, or Intel).

    tqdm (For progress bar display)

    pip install numpy pyopencl tqdm
    
    Argument         Short  Description
    --wordlist       -w     (Required) Path to the target dictionary (words to match).
    --base-wordlist  -b     Optional path to the base wordlist (words to apply rules to).
    --chain-depth    -d     Number of rules to chain together.
    --batch-size            Number of words to process per GPU batch.
    --output         -o     File to save the extracted rule chains to.
    --rules_file     -r     Path to an external file to filter the internal rule set.
    

    Examples

    Extracting Single Rules (Depth 1): find all single rules that transform a base word into a target word, saving the results to my_rules_d1.txt.

    python rulest.py -w target_passwords.txt -b base_words.txt -d 1 -o my_rules_d1.txt
    

    Extracting Two-Step Rule Chains (Depth 2): find rule chains of length two (R1 R2) that lead to a hit.

    python rulest.py -w target_passwords.txt -b base_words.txt -d 2 -o my_rules_d2.txt
    

    Using a Filtered Rule Set: only test rules specified in common_leetspeak.rule for a three-step chain.

    python rulest.py -w target.txt -b base.txt -d 3 -r common_leetspeak.rule -o leet_chains_d3.txt
    

    https://github.com/A113L/rulest

    Amateur of mycology and hashcracking | 1x3060Ti | 1x1050Ti
    PGP:4B0A386530D789157435DC7489138FB52FDD7FC1

      A1131
      Trusted
      wrote on last edited by
      #2

      FYI: the OpenCL kernel used by the script has been updated to cover all Hashcat rules listed at https://hashcat.net/wiki/doku.php?id=rule_based_attack. This slightly increases rule chain extraction time, but it can also yield more complex rules.

        A1131
        Trusted
        wrote on last edited by
        #3

        Updated to version 2.0. Instead of running a BFS up to depth 3, the script now extracts rules that are more likely to work in Hashcat. It dynamically calculates how many rules to generate for depths 2 and 3 so that the run fits the given runtime, and it sizes its buffers to the available VRAM to avoid OOM errors.

          A1131
          Trusted
          wrote on last edited by
          #4

          https://gist.githubusercontent.com/A113L/64184c75477c32a448779836087ad4d8/raw/5abfc5c874b60aa9f8cf4bf1fc1be7a19388be25/gistfile1.txt

            Waffle
            wrote on last edited by
            #5

            I'm interested in running a quick comparison. Can I get a copy of your "zeusgame.txt" list, please?

              A1131
              Trusted
              wrote on last edited by
              #6

              zeusgame.txt

                A1131
                Trusted
                wrote on last edited by
                #7

                Key improvements

                • Extended maximum rule chain depth from 3 to 6, allowing exploration of more complex transformations.
                • Adaptive batch sizing based on actual GPU VRAM – now dynamically scales word batches, chain batches, and output buffer sizes according to available memory (tested on RTX 3060 Ti and 1050 Ti without errors).
                • Dynamic time budgeting – distributes remaining processing time among depths proportionally, adjusting the number of chains generated for each depth based on actual phase‑1 runtime.
                • VRAM‑aware bloom filter sizing – reduces bloom filter size on GPUs with less than 4 GB VRAM to prevent out‑of‑memory errors.
                • Per‑depth chain limit overrides – new command‑line options (--depth2-chains, …, --depth6-chains) to manually control the number of chains generated for each depth.
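As a rough illustration of VRAM-aware batch sizing, the sketch below packs words into a fixed-width `np.uint8` matrix with a `np.uint16` length vector (the buffer types mentioned in the first post) and derives a batch size from a memory budget. The function names, the 50% VRAM fraction, and the per-word byte accounting are assumptions made for this example, not values taken from rulest.

```python
import numpy as np


def pack_words(words, max_len=32):
    """Pack words into a (n, max_len) uint8 matrix plus a uint16 length vector."""
    buf = np.zeros((len(words), max_len), dtype=np.uint8)
    lengths = np.zeros(len(words), dtype=np.uint16)
    for i, word in enumerate(words):
        raw = word.encode("utf-8")[:max_len]      # truncate over-long words
        buf[i, :len(raw)] = np.frombuffer(raw, dtype=np.uint8)
        lengths[i] = len(raw)
    return buf, lengths


def words_per_batch(free_vram_bytes, max_len=32, vram_fraction=0.5, outputs_per_word=1):
    """Estimate how many words fit in one batch (hypothetical accounting).

    Each word costs max_len bytes of text plus 2 bytes of length, once for
    the input and once for every output slot reserved per word.
    """
    bytes_per_word = (max_len + 2) * (1 + outputs_per_word)
    return max(1, int(free_vram_bytes * vram_fraction) // bytes_per_word)


buf, lengths = pack_words(["abc", "hello"], max_len=8)
print(buf.shape, lengths.tolist())
print(words_per_batch(4 * 1024**3))               # e.g. a card with 4 GiB free
```

A fixed-width layout wastes some bytes on short words, but it lets every GPU work item find its word at `index * max_len` without an offset table.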

                Examples of found_chains.txt

                  Waffle
                  wrote on last edited by
                  #8

                  This is good work! This approaches the problem from what I think is a new direction, and I'm really liking what I see here. Very much appreciate you releasing it.

                    A1131
                    Trusted
                    wrote on last edited by
                    #9

                    Thank you. It's a bit of reinventing the wheel, but I once encountered a problem extracting rules from available programs (ULM still rules in my opinion), and that's how rulest was created.

                      A1131
                      Trusted
                      wrote on last edited by
                      #10

                      Test run on a 1050Ti with a target depth of 6 and a target runtime of 8 hours. The job completed in about 2 hours, with 1,311,335 generated chains and 724,887 valid unique chains found. You can think of the --target-hours flag as a multiplier on the number of rules generated for each depth; in automatic mode these values will differ from one GPU to another.

                      https://gist.github.com/A113L/9c0ecf6d9888ec1680ac98d533583e22/raw/c9f4f0d10206ed3ac2cc03f6c71d5d0b004b4d29/gistfile1.txt

                      Minor improvements:

                      • The extracted rules are now sorted by frequency (most hits first) for better prioritization, with the empty rule : placed at the top.
                      • A short header with metadata (total unique rules and total hits) is added to the output file for clarity.
                      • Removed global cap limit of generated chains (can be limited with flags).
                      # Generated by rulest_v2.0.py
                      # Total unique rules: 878446
                      # Total hits (sum of frequencies): 377080316
                      :
                      z6 x05
                      z8 '5
                      x33 z8
                      x23 y4
                      '0 y4
                      z8 x44 Z3
                      i4% z7 i8E x74
                      i60 x22 y4
                      z5 x30
                      R0 z7
                      z9 x26
                      x55 Z9
                      i0+ x77 Z5
                      x44 p4
                      z9 x73
                      z7 x16
                      x44 y5
                      o0[ x55 p9
                      x23 y6
                      x66 Z3
                      D2 i5m o4. i4v z8 x51
                      ...
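The output layout shown above (metadata header, `:` first, then rules by descending hit count) can be reproduced with a few lines of Python. This is a sketch only: `format_ruleset` is a hypothetical name, ties are broken alphabetically as an assumption, and whether rulest counts `:` among the unique rules is not stated in the thread.

```python
from collections import Counter


def format_ruleset(hits: Counter, generator: str = "rulest_v2.0.py") -> str:
    """Render rules sorted by frequency, with the empty rule ':' placed first."""
    # Most hits first; alphabetical order breaks ties (an assumption).
    ordered = sorted((r for r in hits if r != ":"), key=lambda r: (-hits[r], r))
    header = [
        f"# Generated by {generator}",
        f"# Total unique rules: {len(hits)}",
        f"# Total hits (sum of frequencies): {sum(hits.values())}",
        ":",
    ]
    return "\n".join(header + ordered) + "\n"


hits = Counter({"z6 x05": 5, "x33 z8": 9, ":": 2})
print(format_ruleset(hits), end="")
```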
                      

                        A1131
                        Trusted
                        wrote on last edited by
                        #11
                        • Removed hardcoded depth limit; --max-depth now accepts any value, kernel constant set dynamically, enabling arbitrary chain depths via recursive rules expansion.

                        • The new optional --seed-rules parameter allows loading a file with previously successful rules. These seeds are grouped by depth and used at every stage of chain generation. For each target depth, the algorithm first adds existing seeds of that depth, then extends seeds from the previous depth by appending a random rule, and finally fills any remaining budget with random (hot‑biased) chains. Newly generated chains become seeds for deeper levels, enabling a recursive expansion that builds upon proven patterns. This focused approach makes it feasible to efficiently generate and test rule chains well beyond depth 6, leveraging prior knowledge to explore the most promising combinations.
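The three-step seeding order described above (existing seeds, extended seeds, random fill) might look roughly like this. It is a sketch under assumptions: `build_chains`, its parameters, and the retry limit are hypothetical, and the optional `weights` argument merely stands in for the hot-biased sampling, whose actual weighting is not given in the thread.

```python
import random


def build_chains(depth, budget, seeds_by_depth, atomic_rules, rng, weights=None):
    """Build up to `budget` unique chains of length `depth` from seeds and random rules."""
    chains, seen = [], set()

    def add(chain):
        if chain not in seen and len(chains) < budget:
            seen.add(chain)
            chains.append(chain)

    # 1. Seeds that already have the target depth.
    for seed in seeds_by_depth.get(depth, []):
        add(tuple(seed))
    # 2. Seeds of the previous depth, extended by one random rule.
    for seed in seeds_by_depth.get(depth - 1, []):
        add(tuple(seed) + (rng.choice(atomic_rules),))
    # 3. Fill the remaining budget with (optionally weighted) random chains.
    attempts = 0
    while len(chains) < budget and attempts < budget * 20:
        add(tuple(rng.choices(atomic_rules, weights=weights, k=depth)))
        attempts += 1
    # Newly generated chains become seeds for the next depth.
    seeds_by_depth[depth] = list(chains)
    return chains


rng = random.Random(0)
seeds = {1: [("l",)], 2: [("l", "$1")]}
for chain in build_chains(2, 5, seeds, ["l", "u", "$1", "$2"], rng):
    print(" ".join(chain))
```

Calling the function again with `depth=3` would extend the chains just stored in `seeds[2]`, which is the recursive expansion the post refers to.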

                          Waffle
                          wrote on last edited by
                          #12

                          Thanks - I'm working with a very large set, and looking at multiple ways to deal with the scope... none of this is easy work, and you are absolutely going down a beneficial path.

                            A1131
                            Trusted
                            wrote on last edited by A1131
                            #13

                            Not right, not left - straight ahead.🙂 If you are working on sets of rules, I also have a concentrator tool; there is a link to it here on the forum and on Discord. It can process very large sets of rules and debugged data and was created mainly for optimization, but I don't know what problem you are dealing with or whether it would be useful for you.

                              A1131
                              Trusted
                              wrote on last edited by
                              #14

                              Fixes

                              • Rule Validator – Added $ and ^ to the branch that expects exactly one following character. Without this, those rules were filtered out and never made it to the GPU kernel.

                              • Seed Loading – The same validator now accepts $ and ^ rules, so seed files containing them are fully loaded and used.
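To show what a validator branch for "exactly one following character" can look like, here is a small sketch. It is not rulest's validator: the operator groupings below cover only a subset of Hashcat's rule functions, and `is_valid_rule` and the set names are hypothetical.

```python
# Operator groups for a subset of Hashcat rule functions (illustrative only).
NO_ARG = set(":lucCtrdfq{}[]kKE")        # no argument
ONE_CHAR = set("$^@")                    # exactly one character: $X ^X @X
ONE_POS = set("TpDzZ'LR+-.,yY")          # exactly one position: TN DN 'N ...
POS_CHAR = set("io")                     # position then character: iNX oNX
TWO_POS = set("xO*")                     # two positions: xNM ONM *NM
TWO_CHAR = set("s")                      # two characters: sXY
POSITIONS = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ")


def is_valid_rule(rule: str) -> bool:
    """Return True if a single rule has the argument shape its operator expects."""
    if not rule:
        return False
    op, args = rule[0], rule[1:]
    if op in NO_ARG:
        return args == ""
    if op in ONE_CHAR:
        return len(args) == 1
    if op in ONE_POS:
        return len(args) == 1 and args in POSITIONS
    if op in POS_CHAR:
        return len(args) == 2 and args[0] in POSITIONS
    if op in TWO_POS:
        return len(args) == 2 and all(a in POSITIONS for a in args)
    if op in TWO_CHAR:
        return len(args) == 2
    return False


for rule in ["$1", "^s", "$", "x65", "o1@", "'2"]:
    print(rule, is_valid_rule(rule))
```

If `$` and `^` were missing from `ONE_CHAR`, both would fall through to the final `return False`, which matches the symptom described in the fix.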

                              # Generated by rulest_v2.0.py (seeded)
                              # Total unique rules: 145287
                              # Total hits (sum of frequencies): 52455891
                              :
                              ]
                              l
                              c
                              ^s ^o ^l ^l ^o ^p '2
                              x65
                              '0
                              x54
                              x00
                              x01
                              o1@ x00
                              ^e ^k ^i ^r ^t ^s R2
                              x11 i0d x06
                              x11
                              x32
                              R3
                              *64 o0q '0
                              R0
                              x33
                              D1 $8 $5 $0 $0 x87
                              o48 '0
                              o28 R0
                              o1E x33
                              x34 p2
                              ^F x22
                              E
                              ...
                              

                              https://gist.github.com/A113L/5ef467ced57f40bc7dadedce775cda6a

                              found_chains.txt

                                A1131
                                Trusted
                                wrote on last edited by A1131
                                #15

                                Added the missing GPU-compatible Hashcat rules: bitwise shifts (LN, RN), replace with the neighboring character (.N copies the character at N+1, ,N copies the character at N-1), omit range (ONM), and toggle case after the Nth separator (3NX). Rule generation now adds ONM for all digit pairs and 3NX for all digits and delimiters, and the kernel was extended to support these operations, ensuring full GPU compatibility. No memory or discard rules were added.
                                Note: although all extracted rules are Hashcat-compatible, it is still recommended to run the resulting ruleset through concentrator's functional minimization.
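For readers less familiar with these operators, here is a CPU sketch of the byte-level ones, following their descriptions on the Hashcat rule wiki as I understand them. `apply_byte_rule` is a hypothetical helper, not code from rulest, and the `3NX` toggle is left out to keep the example short.

```python
def apply_byte_rule(word: bytes, rule: str) -> bytes:
    """Apply one of L, R, ., , or O to a byte string (positions are 0-9, then A-Z)."""
    buf = bytearray(word)
    op = rule[0]
    n = int(rule[1], 36)
    if op == "L":                        # bitwise shift left of the byte at N
        if n < len(buf):
            buf[n] = (buf[n] << 1) & 0xFF
    elif op == "R":                      # bitwise shift right of the byte at N
        if n < len(buf):
            buf[n] >>= 1
    elif op == ".":                      # replace byte at N with the byte at N+1
        if n + 1 < len(buf):
            buf[n] = buf[n + 1]
    elif op == ",":                      # replace byte at N with the byte at N-1
        if 0 < n < len(buf):
            buf[n] = buf[n - 1]
    elif op == "O":                      # omit M bytes starting at position N
        m = int(rule[2], 36)
        if n + m <= len(buf):
            del buf[n:n + m]
    else:
        raise ValueError(f"rule not covered by this sketch: {rule!r}")
    return bytes(buf)


for rule in ["R2", ".1", ",1", "O12"]:
    print(rule, apply_byte_rule(b"p@ssW0rd", rule))
```

Working on bytes rather than `str` matters for `L` and `R`, since a shifted byte often falls outside printable ASCII.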

                                  A1131
                                  Trusted
                                  wrote on last edited by
                                  #16
                                  • The signature-based minimizer eliminates functional redundancy: it builds a deterministic probe set of base words (≥ min‑word‑len) and computes each rule’s output signature. Rules that produce identical outputs on all probe words are considered equivalent; only the one with the highest GPU hit‑count survives. This reduces the final rule set dramatically while preserving all unique transformations.

                                  • The numeric‑seed extraction pass tests three built‑in families (pure prepend, pure append, and mixed prepend/append) up to depth 4 as direct GPU chain candidates—independent of random chain generation. It captures common numeric mutations (e.g., ^1 ^2, $9 $8, ^1 $2) efficiently, ensuring these high‑value patterns are never missed, even if the random chain budget is small.
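The signature-based minimizer can be sketched as below. The names `pick_probes` and `minimize` are hypothetical, the probe selection (sorted, deduplicated, first N) is just one deterministic choice, and `apply_chain` is a tiny CPU stand-in for the GPU kernel that supports only a few operations.

```python
def apply_chain(word: str, chain: str) -> str:
    """CPU stand-in for the kernel: supports l, u, c, r, $X and ^X only."""
    for rule in chain.split():
        op, arg = rule[0], rule[1:]
        if op == "l":
            word = word.lower()
        elif op == "u":
            word = word.upper()
        elif op == "c":
            word = word[:1].upper() + word[1:].lower()
        elif op == "r":
            word = word[::-1]
        elif op == "$":
            word = word + arg
        elif op == "^":
            word = arg + word
        else:
            raise ValueError(f"rule not covered by this sketch: {rule!r}")
    return word


def pick_probes(base_words, min_len=10, count=21):
    """Deterministic probe set: sorted unique base words of at least min_len characters."""
    return sorted({w for w in base_words if len(w) >= min_len})[:count]


def minimize(hit_counts, probes):
    """Keep one chain per output signature: the one with the highest hit count."""
    best = {}
    for chain, hits in hit_counts.items():
        signature = tuple(apply_chain(w, chain) for w in probes)
        if signature not in best or hits > best[signature][1]:
            best[signature] = (chain, hits)
    return {chain: hits for chain, hits in best.values()}


probes = pick_probes(["PassWord12", "HELLOworld", "short"])
print(minimize({"l": 10, "l l": 3, "u l": 1, "u": 5, "c": 7}, probes))
```

Two chains are only as "equivalent" as the probe set can tell, so longer and more varied probe words reduce the chance of merging rules that differ on rare inputs.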

                                    A1131
                                    Trusted
                                    wrote on last edited by
                                    #17
                                    # rulest — GPU-Compatible Hashcat Rules Engine
                                    # Generated      : 2026-03-28 04:14:53
                                    # Base           : hashmob.mini.txt
                                    # Target         : hashmob.small.found
                                    # Depth          : 1–10
                                    # Bloom          : 512 MB
                                    #
                                    # GPU raw candidates      : 268,674  (bloom hits, includes false positives)
                                    # Post-processing         : signature-based minimization
                                    #   Probe words           : 21  (min length 10)
                                    #   Equiv. rules removed  : 106,848
                                    #
                                    # Rules kept     : 161,825  (d1:1,856  d2:42,820  d3:34,157  d4:42,742  d5:10,555  d6:9,387  d7:7,435  d8:5,792  d9:3,892  d10:3,189)
                                    # Sorted by      : GPU frequency (descending, UTF-8)
                                    :
                                    $1 $2 $3
                                    $1 $2 $3 $4
                                    x45 i5b 31n
                                    $2 $3 $4
                                    $4 $5 $6
                                    $1 $0 $1
                                    $2 $0 $0 $0
                                    $2 $0 $1 $0
                                    $2 $0 $0 $2
                                    x03 36e
                                    $2 $0 $1 $2
                                    $3 $2 $1
                                    $2 $0 $0 $1
                                    $2 $0 $0 $3
                                    $0 $0 $7
                                    $2 $0 $0 $4
                                    $2 $0 $0 $5
                                    $1 $0 $0
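The header's remark that bloom hits include false positives follows from how a Bloom filter works: it can report "maybe present" for a word that was never added, but never "absent" for one that was. A minimal pure-Python sketch follows; the class, the double-hashing scheme, and the exact-set confirmation step are assumptions for illustration, not rulest's kernel.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter using double hashing over a BLAKE2b digest."""

    def __init__(self, size_bits: int, num_hashes: int = 4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: bytes):
        digest = hashlib.blake2b(item, digest_size=16).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:], "little") | 1   # odd step avoids a zero stride
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.size_bits

    def add(self, item: bytes) -> None:
        for p in self._positions(item):
            self.bits[p >> 3] |= 1 << (p & 7)

    def __contains__(self, item: bytes) -> bool:
        return all(self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(item))


targets = [b"Password1", b"letmein!", b"Summer2024"]
bloom = BloomFilter(size_bits=1 << 20)
for t in targets:
    bloom.add(t)

candidates = [b"Password1", b"password1", b"Summer2024"]
bloom_hits = [c for c in candidates if c in bloom]      # may contain false positives
target_set = set(targets)
confirmed = [c for c in bloom_hits if c in target_set]  # exact check removes them
print(confirmed)
```

A smaller bit array saves VRAM but raises the false-positive rate, which is the trade-off behind shrinking the filter on cards with less memory.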
                                    

                                      A1131
                                      Trusted
                                      wrote on last edited by
                                      #18

                                      The built-in seeds in rulest now comprise five categories of numeric rule chains (prepend, append, mixed, transform+digit, and date patterns) that are automatically generated and tested against the bloom filter in Phase S. These seeds help extract common numeric transformations (e.g., adding years or digits) without requiring manual input.

                                      The --no-builtin-seeds parameter disables Phase S entirely. Use it when:

                                      • Your target wordlist contains few or no numeric patterns.
                                      • You want to reduce GPU runtime by skipping thousands of numeric seeds.
                                      • You rely solely on atomic rules (Phase 1) and random chains (Phase 2) or your own --seed-rules file.

                                      Without this flag, Phase S always runs, testing chains up to depth 4 (e.g., ^1 ^2, $1 $9 $9 $0, u $1 and date patterns like $0 $1 $0 $1 $2 $0 $2 $4).
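To give a feel for the size of Phase S, the sketch below enumerates just the pure-append and pure-prepend digit families up to a given depth. It assumes exhaustive enumeration of all digit combinations, which the thread does not confirm; `numeric_seeds` is a hypothetical name, and the mixed, transform+digit, and date families are left out.

```python
from itertools import product

DIGITS = "0123456789"


def numeric_seeds(max_depth: int = 4):
    """Yield pure-append ($d ...) and pure-prepend (^d ...) digit chains."""
    for depth in range(1, max_depth + 1):
        for combo in product(DIGITS, repeat=depth):
            yield " ".join("$" + d for d in combo)   # e.g. "$1 $9 $9 $0"
            yield " ".join("^" + d for d in combo)   # e.g. "^1 ^2"


seeds = set(numeric_seeds(4))
print(len(seeds))
print("$1 $9 $9 $0" in seeds, "^1 ^2" in seeds)
```

Each family contributes 10 + 100 + 1,000 + 10,000 chains at depths 1 to 4, so skipping Phase S with --no-builtin-seeds removes a noticeable amount of GPU work when the target list has few numeric patterns.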

                                        A1131
                                        Trusted
                                        wrote on last edited by A1131
                                        #19

                                        New rule categories added to the built-in generator

                                        Phase S - Built-in Seed Families (A–M) 
                                        
                                        Numeric families 
                                        A Pure Prepend digits (depths 1–4) 
                                        B Pure Append digits (depths 1–4) 
                                        C Mixed Prepend/Append digits (depths 1–4) 
                                        D Transform + digit/bracket (depths 2–4) 
                                        E Date patterns DDMM/YYYY/… (depths 4–9) 
                                        
                                        Special-character families 
                                        F Pure Append special chars (depths 1–3, top-15 chars) 
                                        G Pure Prepend special chars (depths 1–3, top-15 chars) 
                                        H Transform + special char (depths 2–3, top-15 chars) 
                                        I Digit(s) + special char (depths 2–4, core-7 chars) 
                                        — covers the ubiquitous "word123!" / "!word123" patterns 
                                        
                                        New families 
                                        J Leet substitutions (depths 1–2, 10 core pairs) 
                                        — sa@ se3 so0 si1 sl1 ss5 ss$ st7 sa4 si! 
                                        — depth 2: leet + digit/special suffix/prefix 
                                        — depth 2: double-leet chains (e.g. "sa@ so0" → "p@ssw0rd") 
                                        K Double-transform chains (depth 2, all 15×15 pairs) 
                                        — covers "c r", "u d", "t f", "E l", "c {", "l ]", etc. 
                                        L Special-before-digit patterns (depths 2–3, core-7 chars) 
                                        — reverse orientation of Family I: "!1word" / "word!12" 
                                        — append: $sp $d… prepend: ^d… ^sp 
                                        M Leet + transform chains (depth 2) 
                                        — leet substitution followed by a transform op (all 15) 
                                        — and transform op followed by leet substitution 
                                        — covers "P@ssword", "@DMIN", "p@SSW0RD" patterns 
                                        
                                        Special chars - top-15 (F/G/H): ! @ # $ % ^ & * ? . - _ + ( ) 
                                        Special chars - core-7 (I/L): ! @ # $ % * ?
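As an illustration of Family J, the sketch below builds double-leet chains from the ten core pairs listed above and applies one of them. It assumes that both orders of a pair are generated and that pairs sharing a source character are skipped (the second substitution would have nothing left to replace); `double_leet_chains` and `apply_sub_chain` are hypothetical names.

```python
from itertools import permutations

# The ten core leet pairs from Family J, written as Hashcat sXY rules.
LEET_PAIRS = ["sa@", "se3", "so0", "si1", "sl1", "ss5", "ss$", "st7", "sa4", "si!"]


def apply_sub_chain(word: str, chain: str) -> str:
    """Apply a chain of sXY rules: replace every X with Y, left to right."""
    for rule in chain.split():
        word = word.replace(rule[1], rule[2])
    return word


def double_leet_chains(pairs=LEET_PAIRS):
    """Depth-2 chains of two leet rules with different source characters."""
    return [f"{a} {b}" for a, b in permutations(pairs, 2) if a[1] != b[1]]


chains = double_leet_chains()
print(len(chains))
print(apply_sub_chain("password", "sa@ so0"))  # p@ssw0rd
```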
                                        

                                          A1131
                                          Trusted
                                          wrote on last edited by A1131
                                          #20

                                          Added Phase 3 GA (genetic algorithm) that tries to find good rule chains by mimicking evolution. It starts with a mix of promising chains (from Phase 1 hits) and random ones. Each chain gets a fitness score = how many base words it successfully transforms into target-like words (checked via GPU). Top chains survive as "elite". Pairs of chains swap parts (crossover) and get small random changes (mutations) to create new chains. Weak chains die out. Over many generations, the population shifts toward high-scoring chains. This finds useful rules that random sampling would likely miss. Finally, all discovered chains are merged and deduplicated before output.

                                            An optional evolutionary search that runs after Phase 2 and complements
                                            random chain sampling with guided, coverage-driven optimisation.
                                          
                                            Why it fits this project
                                            ────────────────────────
                                            • The fitness function (bloom-filter hits) is already computed by the
                                              existing GPU chain kernel — no new GPU code is required.
                                            • Phase 2 samples chains *uniformly at random* from the atomic-rule pool.
                                              For depth ≥ 3 the search space is |pool|^depth (millions of candidates);
                                              the GA focuses probability mass on high-hit-rate regions of that space.
                                            • Hot atomic rules from Phase 1 seed the initial population, giving the GA
                                              a strong head start rather than searching from scratch.
                                            • All Phase-3 discoveries are merged into the global hit counter before
                                              signature-based minimisation, so they benefit from the same deduplication
                                              and sorting as Phase 1 and Phase 2 results.
                                          
                                            Algorithm summary
                                            ─────────────────
                                            1. Initial population — 30 % depth-2 hot-rule combos, 30 % seeded deeper
                                               chains, 40 % random — ensures both exploitation and exploration.
                                            2. GPU-batch fitness evaluation — reuses _run_chain_kernel unchanged.
                                            3. Tournament selection (k = 4) — low-pressure, maintains diversity.
                                            4. One-point crossover (p = 0.80) — exchanges rule-token sub-sequences.
                                            5. Mutation — replace / insert / delete one rule token (weights 60/20/20).
                                            6. Elitism — top <elite_frac> individuals survive unchanged each generation.
                                            7. Diversity guard — duplicate individuals are replaced by random chains.
                                            8. Terminates when <--genetic-generations> is reached or the wall-clock
                                               time budget (--target-hours remainder) is exhausted.
                                          
                                            CLI flags
                                            ─────────
                                              --genetic                   Enable Phase 3 (default: disabled)
                                              --genetic-generations N     Max generations (default: 50)
                                              --genetic-pop N             Population size (default: 200)
                                              --genetic-elite F           Elite fraction, e.g. 0.15 (default: 0.15)
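The algorithm summary above maps onto a fairly small loop. The sketch below is a CPU toy under stated assumptions: `evolve` and `toy_fitness` are hypothetical names, the initial population is fully random rather than the 30/30/40 mix, the 0.30 mutation probability is invented for the example, and fitness is a stand-in for the GPU bloom-hit count.

```python
import random


def toy_fitness(chain):
    """Stand-in for the GPU hit count: rewards chains containing 'c' or '$1'."""
    return sum(1 for rule in chain if rule in {"c", "$1"})


def evolve(pool, fitness, generations=50, pop_size=200, elite_frac=0.15,
           max_depth=4, crossover_p=0.80, mutation_p=0.30, seed=0):
    rng = random.Random(seed)

    def random_chain():
        return tuple(rng.choice(pool) for _ in range(rng.randint(1, max_depth)))

    def tournament(scored, k=4):
        # Best of k randomly drawn individuals.
        return max(rng.sample(scored, k), key=lambda cs: cs[1])[0]

    def crossover(a, b):
        # One-point crossover: head of a joined with tail of b.
        if len(a) < 2 or len(b) < 2 or rng.random() >= crossover_p:
            return a
        child = a[:rng.randint(1, len(a) - 1)] + b[rng.randint(1, len(b) - 1):]
        return child[:max_depth]

    def mutate(chain):
        if rng.random() >= mutation_p:
            return chain
        tokens = list(chain)
        i = rng.randrange(len(tokens))
        op = rng.choices(("replace", "insert", "delete"), weights=(60, 20, 20))[0]
        if op == "insert" and len(tokens) < max_depth:
            tokens.insert(i, rng.choice(pool))
        elif op == "delete" and len(tokens) > 1:
            del tokens[i]
        else:
            tokens[i] = rng.choice(pool)      # replace, also used as the fallback
        return tuple(tokens)

    population = [random_chain() for _ in range(pop_size)]
    n_elite = max(1, int(elite_frac * pop_size))
    history = []                              # best fitness seen in each generation
    for _ in range(generations):
        scored = sorted(((c, fitness(c)) for c in population),
                        key=lambda cs: cs[1], reverse=True)
        history.append(scored[0][1])
        next_gen = [c for c, _ in scored[:n_elite]]      # elitism
        seen = set(next_gen)
        while len(next_gen) < pop_size:
            child = mutate(crossover(tournament(scored), tournament(scored)))
            if child in seen:
                child = random_chain()                   # diversity guard
            seen.add(child)
            next_gen.append(child)
        population = next_gen
    return population, history


population, history = evolve(["l", "u", "c", "r", "$1", "$2"], toy_fitness,
                             generations=6, pop_size=20, seed=1)
print(history)
```

Because the elite are copied unchanged and the fitness function is deterministic, the best score per generation can never decrease, which is the property elitism is there to guarantee.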
