Why DyMerge Sucks.

DyMerge. My first ever open-source project. Here's what I learned from it and why it sucks.


Wtf is DyMerge

DyMerge was my very first GitHub project, designed to merge multiple dictionaries into one for efficient dictionary-based password attacks. You can read more about the initial idea and the project itself here.

What it Does

DyMerge's main features are outlined in the program's usage manual:

❯❯❯ dymerge --help

Usage: dymerge {dictionaries} [options]

  --version             show program's version number and exit
  -h, --help            show this help message and exit
                        output filename
                        include specified values in dictionary
  -z ZIP_TYPE, --zip=ZIP_TYPE
                        zip file with specified archive format
  -s, --sort            sort output alphabetically
  -u, --unique          remove dictionary duplicates
  -r, --reverse         reverse dictionary items
  -f, --fast            finish task asap

  dymerge /usr/share/wordlists/rockyou.txt /lists/cewl.txt -s -u
  dymerge /lists/cewl.txt /lists/awlg.txt -s -u -i and,this
  dymerge ~/fsocity.dic -u -r -o ~/clean.txt
  dymerge /dicts/crunch.txt /dicts/john.txt -u -f -z bz2

In brief, DyMerge is capable of merging user-specified wordlists into one sorted file, removing duplicates and compressing the final dictionary. Other features such as reversing the order of the dictionary and appending values are also provided.

How it Works

The tool does this by reading each word from each file into a single Python list, then sorting the list and removing duplicate items using Python's built-in functions. It then uses the zipfile, tarfile, bz2, or gzip module to compress the output file, depending on user input.
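The approach described above can be sketched roughly as follows. This is a minimal illustration and not DyMerge's actual source: the function names `merge_wordlists` and `write_bz2` and their parameters are hypothetical, and only bz2 compression is shown.

```python
import bz2


def merge_wordlists(paths, sort=True, unique=True, reverse=False):
    """Hypothetical sketch: load every word into memory, then clean up."""
    words = []
    for path in paths:
        # errors="ignore" drops bytes that are not valid UTF-8
        with open(path, encoding="utf-8", errors="ignore") as handle:
            words.extend(line.rstrip("\n") for line in handle)
    if unique:
        # dict.fromkeys removes duplicates while keeping first-seen order
        words = list(dict.fromkeys(words))
    if sort:
        words.sort()
    if reverse:
        words.reverse()
    return words


def write_bz2(words, out_path):
    """Write one word per line into a bz2-compressed text file."""
    with bz2.open(out_path, "wt", encoding="utf-8") as handle:
        handle.writelines(word + "\n" for word in words)
```

Note that `words` holds the full contents of every input file at once, so memory use grows with the combined size of the wordlists.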

Why it Sucks

Arguably, all of DyMerge's features can be implemented with a single line of bash code, faster and - overall - more efficiently. Some example cases have been outlined below.

DyMerge vs. Bash
Here are some cases where DyMerge commands can be replaced by simple `bash` commands:
$ python dymerge.py /usr/share/wordlists/rockyou.txt /lists/cewl.txt -s -u
❯❯❯ sort -u /usr/share/wordlists/rockyou.txt /lists/cewl.txt > output.txt
$ python dymerge.py /lists/cewl.txt /lists/awlg.txt -s -u -i and,this
❯❯❯ sort -u /lists/cewl.txt /lists/awlg.txt <(printf 'and\nthis\n') > output.txt
$ python dymerge.py ~/fsocity.dic -s -u -r -o ~/clean.txt
❯❯❯ sort -ru ~/fsocity.dic > ~/clean.txt
$ python dymerge.py /dicts/crunch.txt /dicts/john.txt -s -u -f -z bz2
❯❯❯ sort -u /dicts/crunch.txt /dicts/john.txt | bzip2 > output.bz2

Moreover, DyMerge doesn't work well with large files. Because it loads every word from each dictionary into a single in-memory list, a large enough set of wordlists can exhaust the system's memory and may even crash the program.
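To illustrate the memory point, here is a rough sketch of a streaming alternative that keeps only about one line per input file in memory at a time. It is not part of DyMerge: the function name `merge_sorted_unique` is hypothetical, and it assumes every input file is already sorted in plain byte order (for example with `LC_ALL=C sort`).

```python
import heapq
from contextlib import ExitStack


def _words(handle):
    """Yield each line of an open file without its trailing newline."""
    for line in handle:
        yield line.rstrip("\n")


def merge_sorted_unique(paths, out_path):
    """Hypothetical sketch: merge pre-sorted wordlists line by line."""
    with ExitStack() as stack:
        streams = [
            _words(stack.enter_context(open(p, encoding="utf-8", errors="ignore")))
            for p in paths
        ]
        out = stack.enter_context(open(out_path, "w", encoding="utf-8"))
        previous = None
        # heapq.merge is lazy: it pulls one word at a time from each stream
        for word in heapq.merge(*streams):
            # Duplicates are adjacent in sorted output, so one comparison suffices
            if word != previous:
                out.write(word + "\n")
                previous = word
```

The trade-off is that the inputs must be sorted beforehand, which is exactly the kind of work `sort` already does well on large files.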

Impact & Publicity

Even though all of DyMerge's main features can be replicated with plain bash, the tool has actually gained some publicity.

DyMerge has received around 100 stars on GitHub and has been featured in YouTube videos by respected YouTubers such as JackkTutorials. It has also been featured on several infosec websites, including KitPloit and DarkNet.

What I Gained

My programming knowledge while developing this tool was nowhere near perfect (and still isn't). I chose to write DyMerge in Python and approached its development as an opportunity to learn and improve my coding abilities.

By creating DyMerge I was able to expand my Python skills and learn how to use handy modules such as optparse, which I still use to this day. I also gained community exposure and discovered how the open-source community operates and how one can promote their idea... even if that idea isn't so perfect.