vxorg

An organizer for the vxheaven collection: it converts the flat hierarchy of 270k+ files into a neat tree. It was originally written in Python; I rewrote it in C++ for performance reasons.

History

  • 2018: I wrote a really shoddy attempt at doing the organization in Bash. It sucked because I wasn't handling the many idiosyncrasies of sample naming.
    • It was also very primitive and slow, since it kept spawning mv processes just to move files. (The same goes for mkdir, but that is less of a concern since it runs less often.)
  • 2023: I wrote a new script in Python. It was "better" but still didn't work.
    • I actually made the same mistake and tried to write it in Bash again, but even Python was worlds faster, so I rewrote it in Python.
  • October 21, 2024: I decided to start rewriting the Python script to parse into an N-ary tree, for memory savings while still allowing memoization (and to be modular instead of one blob).
    • Later in the day, as an experiment, I rewrote the parsing algorithm in C++ (fixing a bug in the process). It was 100x faster, so I committed to a rewrite in C++.

Building

make

Usage

  • Generate a list of samples.
    • One option (not the best, but basically what I did): tar tf xxx/viruses-2010-05-18.tar.bz2 | sed 's/\.\///g' | awk NF | sort > list
  • Run with ./vxorg list src/ dest/
    • dest/ will be created if it does not exist.
    • It shows a progress bar while it runs.
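
To give a rough idea of what happens to each entry in the list, here is a minimal C++17 sketch. It is not the code in this repository: tree_path_for and move_sample are hypothetical names, and the "split the name on dots" rule is only an assumption for illustration. The real rules have to deal with many more naming idiosyncrasies. What it does show is creating dest/ on demand and moving files in-process instead of spawning mkdir and mv.

```cpp
#include <cstddef>
#include <filesystem>
#include <sstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical rule (an assumption, not vxorg's real logic): split the sample
// name on '.' and use every component except the last as a directory level.
fs::path tree_path_for(const std::string& name) {
    std::vector<std::string> parts;
    std::istringstream in(name);
    for (std::string part; std::getline(in, part, '.');) {
        if (!part.empty()) parts.push_back(part);
    }

    fs::path result;
    for (std::size_t i = 0; i + 1 < parts.size(); ++i) {
        result /= parts[i];
    }
    result /= name;  // the file keeps its full original name
    return result;
}

// Move one sample from the flat src directory into the tree under dest.
// Missing directories (including dest itself) are created in-process.
// Note: fs::rename throws if src and dest are on different filesystems.
void move_sample(const fs::path& src, const fs::path& dest,
                 const std::string& name) {
    const fs::path target = dest / tree_path_for(name);
    fs::create_directories(target.parent_path());
    fs::rename(src / name, target);
}
```

Under that assumed rule, calling move_sample("src", "dest", "Virus.DOS.Example.1234") would move src/Virus.DOS.Example.1234 to dest/Virus/DOS/Example/Virus.DOS.Example.1234 (the sample name here is made up).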