ssdeep fork
===========

This is a fork of [ssdeep](http://ssdeep.sf.net) and a different
implementation of the
[Python wrappers](https://github.com/DinoTools/python-ssdeep).

Goals of this fork:

1. Compute the hash by reading the input exactly once. No seeks. No buffering.
2. Thread safety. Do not use global variables.
3. Fixed memory consumption.
4. An API in the spirit of Python's hashlib.

`fuzzy.c` and `fuzzy.h` contain a different implementation of the hash
computation. Note that comparison has been left out entirely, because I have
no complaint about the upstream implementation.

`ssdeep.c` contains a simple program that does not recognize any of the
original ssdeep options, but tries to behave similarly. The goal here is to
make it comparable to upstream.

`pyfuzzy.pyx` and `setup.py` contain a Cython wrapper to glue it into Python.

performance
-----------

The new implementation runs about 8 to 20 "normal" hashes in parallel. This is
much more expensive per byte of input. In my profiling the reimplementation is
about 1.5 times slower than upstream on average. On the other hand, upstream is
occasionally 10 times slower. The likely explanation is that upstream guessed
the blocksize wrong and had to rehash the file.

When reading from stdin, the upstream version first reads the entire input into
main memory. The fork handles this case in fixed memory. The `fuzzy_state`
structure takes up about 2.5 kB.

about
-----

Like the upstream projects, this code is licensed under the GPL-2+. If you have
any questions, please contact me at `Helmut Grohne `.
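
example
-------

The goals listed at the top (read the input exactly once, fixed memory, an API
in the spirit of hashlib) describe a usage pattern that can be sketched with
Python's standard library. The snippet below uses `hashlib.sha256` purely as a
stand-in: it is not the fuzzy hash, and the names `digest_stream` and
`chunk_size` are illustrative assumptions, not part of this fork's API.

```python
import hashlib
import io


def digest_stream(stream, chunk_size=4096):
    """Hash a binary stream in one forward pass with a bounded buffer.

    hashlib.sha256 is a stand-in for a hashlib-style hash object; the
    wrapper in this repository is not shown here.
    """
    state = hashlib.sha256()
    while True:
        # Read at most chunk_size bytes; never seek, never keep old chunks.
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        state.update(chunk)
    return state.hexdigest()


if __name__ == "__main__":
    # Works the same for pipes, e.g. digest_stream(sys.stdin.buffer).
    print(digest_stream(io.BytesIO(b"some input data")))
```

Because the loop only ever holds one chunk plus the hash state, memory use does
not grow with the input size, which is the property the fork aims for when
reading from stdin.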