When the roll_sum of the last piece happens to be 0, the original
algorithm picks up one byte of the digest behind its end. In the
original algorithm that byte is initially set to zero and is updated
unconditionally, even when the digest length is not increased. Copy
that behaviour.
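
A sketch of the quirk being copied; names and layout here are
illustrative, not the actual fuzzy.c code:

    static const char b64[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    struct blockhash {
        char digest[64];   /* zero-initialized with the context */
        unsigned int dlen; /* digest bytes emitted so far */
        unsigned int h;    /* hash of the current piece */
    };

    /* At every piece boundary the slot one byte behind the current end
     * is written even when the digest is already full, so dlen does not
     * grow but digest[dlen] still changes.  When the roll_sum of the
     * last piece is 0, the final digest step reads exactly that slot,
     * so keeping the write unconditional reproduces the original
     * output. */
    static void piece_end(struct blockhash *bh)
    {
        bh->digest[bh->dlen] = b64[bh->h % 64];
        if (bh->dlen < sizeof(bh->digest) - 1)
            ++bh->dlen;
    }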
FUZZY_FLAG_ELIMSEQ: The comparison operation runs eliminate_sequence
on both hashes before actually comparing them. With this flag that
step can be moved to hash generation time instead. Suggested by Niels
Thykier.
FUZZY_FLAG_NOTRUNC: The second part of the hash is truncated to
SPAMSUM_LENGTH/2 by default. When comparing two hashes with different
blocksizes this truncation can result in a larger edit distance and
therefore false negatives. This flag disables the truncation.
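
A minimal usage sketch: both flags are simply or'ed into the flags
argument of fuzzy_digest, with FUZZY_MAX_RESULT assumed to be the
result-buffer size constant from fuzzy.h:

    #include <fuzzy.h>

    /* Apply both post-processing behaviours at digest time instead of
     * at comparison time. */
    int digest_with_flags(struct fuzzy_state *state,
                          char result[FUZZY_MAX_RESULT])
    {
        return fuzzy_digest(state, result,
                            FUZZY_FLAG_ELIMSEQ | FUZZY_FLAG_NOTRUNC);
    }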
Use fuzzy_ as a prefix like all of the previous ones. Export fuzzy_new,
fuzzy_update, fuzzy_digest and fuzzy_free. These functions are
sufficient to put the caller in control and build an API similar to
Python's hashlib.
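
A short sketch of the resulting hashlib-like cycle (error handling
mostly elided; FUZZY_MAX_RESULT is assumed to be the result-buffer
constant from fuzzy.h):

    #include <stdio.h>
    #include <fuzzy.h>

    int main(void)
    {
        char result[FUZZY_MAX_RESULT];
        struct fuzzy_state *st = fuzzy_new();        /* hashlib.new()  */

        if (st == NULL)
            return 1;
        fuzzy_update(st, (const unsigned char *)"hello ", 6);
        fuzzy_update(st, (const unsigned char *)"world", 5); /* update() */
        if (fuzzy_digest(st, result, 0) == 0)        /* hexdigest()    */
            puts(result);
        fuzzy_free(st);
        return 0;
    }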
This improves the success rate of ssdeep_try_reduce_blockhash and
thereby gives a significant speedup.
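
For context, a rough sketch of what such a blockhash reduction can
look like; all field and constant names below are assumptions for
illustration, and the commit's actual change lies elsewhere, it merely
makes these early-exit checks fail less often:

    #define SPAMSUM_LENGTH 64
    #define MIN_BLOCKSIZE   3

    /* Abridged context; see the flat layout sketched under the
     * "2.5kb" message further down. */
    struct ssdeep_context {
        unsigned int bhstart, bhend;   /* window of live blockhashes */
        struct { unsigned int dlen; } bh[31];
        unsigned long long total_size; /* input bytes seen so far */
    };

    /* Retire the smallest tracked blocksize once it can no longer be
     * the one selected for the final digest; every success shortens
     * the per-byte update loop, hence the speedup. */
    static void try_reduce_blockhash(struct ssdeep_context *self)
    {
        if (self->bhend - self->bhstart < 2)
            return; /* need at least two live blockhashes */
        if (((unsigned long long)MIN_BLOCKSIZE << self->bhstart)
                * SPAMSUM_LENGTH >= self->total_size)
            return; /* the smallest blocksize could still be chosen */
        if (self->bh[self->bhstart + 1].dlen < SPAMSUM_LENGTH / 2)
            return; /* the next blocksize is not yet usable */
        ++self->bhstart;
    }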
* less memory management, everything we need fits in 2.5kB
* less scattering of data
* less code
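
A back-of-the-envelope sketch of such a flat layout; the names and
counts are assumptions chosen to make the arithmetic visible:

    #define NUM_BLOCKHASHES 31
    #define SPAMSUM_LENGTH  64

    struct blockhash_context {
        unsigned int h;                /* hash of the current piece */
        char digest[SPAMSUM_LENGTH];   /* output so far */
        unsigned int dlen;             /* used bytes of digest */
    };

    struct ssdeep_context {
        unsigned int bhstart, bhend;   /* live blockhash window */
        struct blockhash_context bh[NUM_BLOCKHASHES];
        unsigned long long total_size; /* input bytes seen */
        /* rolling-hash window etc. */
    };

    /* 31 * 72 bytes of blockhash state plus a little bookkeeping: the
     * whole thing fits in one small, flat allocation. */
    _Static_assert(sizeof(struct ssdeep_context) < 2560, "fits in 2.5kB");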
This is more correct with respect to the sprintf usage and allows for
future extension.
When publishing an API based on ssdeep_context, this enables us to
change its size without breaking API or ABI. splint also appears to
cope better with it.
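
This is the usual opaque-pointer pattern; a short sketch with
illustrative function names:

    /* public header: only a forward declaration, so callers handle
     * pointers to an incomplete type */
    struct ssdeep_context;
    struct ssdeep_context *ssdeep_new(void);
    void ssdeep_free(struct ssdeep_context *self);

    /* implementation file: the definition stays private, so its size
     * is free to change between releases */
    struct ssdeep_context {
        /* ... */
    };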
Thanks: Niels Thykier
This is a rewrite of ssdeep's fuzzy.c to do streaming hashes. It is a
bit slower, but the memory consumption is bounded in all cases.
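
To see why memory stays bounded, consider hashing a file through the
fuzzy_* streaming calls exported above (FUZZY_MAX_RESULT assumed from
fuzzy.h): peak usage is one fixed-size chunk buffer plus the hash
state, independent of input size.

    #include <stdio.h>
    #include <fuzzy.h>

    int hash_file(FILE *fp, char result[FUZZY_MAX_RESULT])
    {
        unsigned char buf[4096];  /* fixed chunk, never the whole file */
        size_t n;
        struct fuzzy_state *st = fuzzy_new();

        if (st == NULL)
            return -1;
        while ((n = fread(buf, 1, sizeof buf, fp)) > 0) {
            if (fuzzy_update(st, buf, n) != 0) {
                fuzzy_free(st);
                return -1;
            }
        }
        if (ferror(fp) || fuzzy_digest(st, result, 0) != 0) {
            fuzzy_free(st);
            return -1;
        }
        fuzzy_free(st);
        return 0;
    }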