~helmut/ssdeep.git
3 years ago  fix output for ugly corner case  master
Helmut Grohne [Tue, 9 Sep 2014 17:42:22 +0000 (19:42 +0200)]
fix output for ugly corner case

When the roll_sum of the last piece happens to be 0, the original
algorithm picks up one byte of the digest past the end. In the original
algorithm, that byte is initially set to zero and updated
unconditionally, even when the length is not increased. Copy that
behaviour.

4 years ago  implement variants of the hashes
Helmut Grohne [Mon, 25 Mar 2013 12:00:48 +0000 (13:00 +0100)]
implement variants of the hashes

FUZZY_FLAG_ELIMSEQ: Before comparing two hashes, the comparison
    operation runs eliminate_sequence on both of them. This flag moves
    that step to hash generation time. Suggested by Niels Thykier.
FUZZY_FLAG_NOTRUNC: By default, the second part of the hash is
    truncated to SPAMSUM_LENGTH/2. When comparing two hashes with
    different block sizes, this truncation can result in a larger edit
    distance and therefore false negatives. This flag disables the
    truncation.
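
To illustrate the step that FUZZY_FLAG_ELIMSEQ moves to generation time, here is a minimal sketch. It assumes that eliminate_sequence collapses every run of more than three identical characters down to three, which is how upstream ssdeep behaves as far as I recall. The name collapse_runs and the in-place interface are hypothetical and not taken from this repository.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: collapse every run of more than three identical
 * characters to exactly three, in place. Returns the new string length.
 * Assumption: this mirrors what eliminate_sequence does to a digest.
 */
size_t collapse_runs(char *s)
{
    size_t len = strlen(s);
    size_t out = 0;

    for (size_t i = 0; i < len; i++) {
        /* Skip s[i] if the three characters already kept are the same. */
        if (out >= 3 && s[i] == s[out - 1] && s[i] == s[out - 2] &&
            s[i] == s[out - 3])
            continue;
        s[out++] = s[i];
    }
    s[out] = '\0';
    return out;
}
```

Doing this once when the digest is produced means a comparison no longer has to normalise both inputs each time.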

4 years ago  solve a splint warning
Helmut Grohne [Mon, 25 Mar 2013 10:54:47 +0000 (11:54 +0100)]
solve a splint warning

4 years ago  add a README
Helmut Grohne [Sun, 24 Mar 2013 20:29:44 +0000 (21:29 +0100)]
add a README

4 years ago  cython module
Helmut Grohne [Sun, 24 Mar 2013 20:11:07 +0000 (21:11 +0100)]
cython module

Finally a Python interface in the spirit of hashlib.

4 years ago  properly set errno
Helmut Grohne [Sun, 24 Mar 2013 20:06:06 +0000 (21:06 +0100)]
properly set errno

4 years ago  do not fail digest computation that often
Helmut Grohne [Sun, 24 Mar 2013 19:39:00 +0000 (20:39 +0100)]
do not fail digest computation that often

4 years ago  rename functions and export them
Helmut Grohne [Sun, 24 Mar 2013 17:05:24 +0000 (18:05 +0100)]
rename functions and export them

Use fuzzy_ as a prefix like all of the previous ones. Export fuzzy_new,
fuzzy_update, fuzzy_digest and fuzzy_free. These functions are
sufficient to put the caller in control and build an API similar to
Python's hashlib.
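
The shape of such a new/update/digest/free interface can be sketched with a toy. Everything here is hypothetical: the toy_ names, the signatures and the byte-sum "digest" are stand-ins chosen for illustration and say nothing about the actual fuzzy_ prototypes.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical toy state; a real fuzzy state is far richer. */
struct toy_state {
    uint32_t sum;   /* running byte sum standing in for the hash */
    size_t total;   /* number of bytes consumed so far */
};

/* Allocate a zeroed state on the heap; returns NULL on failure. */
struct toy_state *toy_new(void)
{
    return calloc(1, sizeof(struct toy_state));
}

/* Feed another chunk; may be called any number of times. */
int toy_update(struct toy_state *st, const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        st->sum += buf[i];
    st->total += len;
    return 0;
}

/* Report the digest of everything fed so far without altering the state. */
int toy_digest(const struct toy_state *st, uint32_t *out)
{
    *out = st->sum;
    return 0;
}

void toy_free(struct toy_state *st)
{
    free(st);
}
```

The caller decides how the data arrives (file, socket, memory) and may ask for a digest at any point, which is what makes a hashlib-like wrapper straightforward.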

4 years ago  fix comment for array implementation
Helmut Grohne [Sun, 24 Mar 2013 16:48:48 +0000 (17:48 +0100)]
fix comment for array implementation

4 years ago  update total_size as early as possible
Helmut Grohne [Sun, 24 Mar 2013 16:21:53 +0000 (17:21 +0100)]
update total_size as early as possible

This improves the success rate of ssdeep_try_reduce_blockhash and
thereby gives a significant speedup.

4 years ago  turn linked list of blockhashes into an array
Helmut Grohne [Sun, 24 Mar 2013 16:15:40 +0000 (17:15 +0100)]
turn linked list of blockhashes into an array

 * less memory management, everything we need fits in 2.5 kB
 * less scattering of data
 * less code
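
A rough sketch of what a fixed array of block hashes might look like, as opposed to a linked list. The field names, the slot count of 31 and the helper functions are assumptions made for illustration. Only SPAMSUM_LENGTH is named in this log, and its value of 64 is taken from upstream ssdeep as I remember it.

```c
#include <stdint.h>
#include <string.h>

#define SPAMSUM_LENGTH 64   /* assumed value, as in upstream ssdeep */
#define NUM_BLOCKHASHES 31  /* assumed number of block sizes tracked */

/* One block hash: running hash values plus the digest characters so far. */
struct blockhash {
    uint32_t h;
    uint32_t halfh;
    char digest[SPAMSUM_LENGTH];
    unsigned int dlen;
};

/* All block hashes live inline; no per-node allocation is needed. */
struct hash_state {
    unsigned int bhstart;   /* first slot still in use */
    unsigned int bhend;     /* one past the last slot in use */
    struct blockhash bh[NUM_BLOCKHASHES];
};

void hash_state_init(struct hash_state *st)
{
    memset(st, 0, sizeof(*st));
    st->bhend = 1;          /* start with a single active block hash */
}

/* Activate the next slot; returns -1 once the array is exhausted. */
int hash_state_fork(struct hash_state *st)
{
    if (st->bhend >= NUM_BLOCKHASHES)
        return -1;
    st->bh[st->bhend].h = st->bh[st->bhend - 1].h;
    st->bh[st->bhend].halfh = st->bh[st->bhend - 1].halfh;
    st->bh[st->bhend].dlen = 0;
    st->bhend++;
    return 0;
}
```

With these assumed sizes the whole state is a few kilobytes and can be allocated, copied or freed in one piece.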

4 years ago  fail gracefully on large inputs
Helmut Grohne [Sun, 24 Mar 2013 15:16:43 +0000 (16:16 +0100)]
fail gracefully on large inputs

4 years ago  allow ssdeep_digest to fail
Helmut Grohne [Sun, 24 Mar 2013 15:07:06 +0000 (16:07 +0100)]
allow ssdeep_digest to fail

This is more correct with respect to the sprintf usage and allows for
future extension.
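
As an illustration of why a digest function that formats text should be able to fail, here is a minimal sketch that checks the result of snprintf. The function name format_digest and its parameters are hypothetical. The "blocksize:hash1:hash2" layout is the usual ssdeep output format.

```c
#include <stdio.h>

/*
 * Hypothetical formatter: write "blocksize:first:second" into out.
 * Returns 0 on success and -1 if formatting failed or the result
 * did not fit into outlen bytes (including the terminating NUL).
 */
int format_digest(char *out, size_t outlen, unsigned long blocksize,
                  const char *first, const char *second)
{
    int n = snprintf(out, outlen, "%lu:%s:%s", blocksize, first, second);

    if (n < 0 || (size_t)n >= outlen)
        return -1;
    return 0;
}
```

Reporting the error through the return value also leaves room for other failure modes to be added later without changing the signature.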

4 years ago  place ssdeep_context on the heap
Helmut Grohne [Sun, 24 Mar 2013 13:47:04 +0000 (14:47 +0100)]
place ssdeep_context on the heap

When publishing an API based on ssdeep_context this enables us to change
its size without breaking API or ABI. Also splint appears to cope better
with that.

4 years ago  fuzzy.h was missing <stdio.h>
Helmut Grohne [Sun, 24 Mar 2013 13:46:28 +0000 (14:46 +0100)]
fuzzy.h was missing <stdio.h>

Thanks to Niels Thykier.

4 years ago  minimal Makefile
Helmut Grohne [Sun, 24 Mar 2013 13:38:46 +0000 (14:38 +0100)]
minimal Makefile

4 years ago  moved main function to ssdeep.c
Helmut Grohne [Sun, 24 Mar 2013 13:29:22 +0000 (14:29 +0100)]
moved main function to ssdeep.c

4 years ago  ship a fuzzy.h compatible with ssdeep's fuzzy.h
Helmut Grohne [Sun, 24 Mar 2013 13:25:57 +0000 (14:25 +0100)]
ship a fuzzy.h compatible with ssdeep's fuzzy.h

4 years ago  support stdin mode like original ssdeep
Helmut Grohne [Sun, 24 Mar 2013 13:14:15 +0000 (14:14 +0100)]
support stdin mode like original ssdeep

4 years ago  teach splint about f{seek,tell}o
Helmut Grohne [Sun, 24 Mar 2013 13:12:02 +0000 (14:12 +0100)]
teach splint about f{seek,tell}o

4 years ago  split ssdeep_engine_step to smaller functions
Helmut Grohne [Sun, 24 Mar 2013 13:11:38 +0000 (14:11 +0100)]
split ssdeep_engine_step to smaller functions

4 years ago  use fuzzy_hash_stream where applicable
Helmut Grohne [Sun, 24 Mar 2013 13:10:59 +0000 (14:10 +0100)]
use fuzzy_hash_stream where applicable

Thanks: Niels Thykier

4 years ago  initial checking
Helmut Grohne [Sun, 24 Mar 2013 13:09:36 +0000 (14:09 +0100)]
initial checking

This is a rewrite of ssdeep's fuzzy.c to do streaming hashes. It is a
bit slower, but the memory consumption is bounded in all cases.
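
One ingredient that makes bounded-memory streaming possible is a rolling hash whose state has a fixed size however much data passes through it. The sketch below follows the rolling hash of spamsum/ssdeep as I recall it (a window of 7 bytes and three components). The struct and function names are illustrative and not necessarily those used in this rewrite.

```c
#include <stdint.h>
#include <string.h>

#define ROLLING_WINDOW 7

/* Fixed-size state: the last ROLLING_WINDOW bytes and three partial sums. */
struct roll_state {
    unsigned char window[ROLLING_WINDOW];
    uint32_t h1;    /* plain sum of the bytes in the window */
    uint32_t h2;    /* position-weighted sum of the bytes in the window */
    uint32_t h3;    /* shift-and-xor mix; old bytes shift out of 32 bits */
    uint32_t n;     /* number of bytes consumed */
};

void roll_init(struct roll_state *rs)
{
    memset(rs, 0, sizeof(*rs));
}

/* Consume one byte; constant time and constant memory. */
void roll_hash(struct roll_state *rs, unsigned char c)
{
    rs->h2 -= rs->h1;
    rs->h2 += ROLLING_WINDOW * (uint32_t)c;

    rs->h1 += (uint32_t)c;
    rs->h1 -= (uint32_t)rs->window[rs->n % ROLLING_WINDOW];

    rs->window[rs->n % ROLLING_WINDOW] = c;
    rs->n++;

    rs->h3 <<= 5;
    rs->h3 ^= c;
}

/* Current rolling sum; it depends only on the last ROLLING_WINDOW bytes. */
uint32_t roll_sum(const struct roll_state *rs)
{
    return rs->h1 + rs->h2 + rs->h3;
}
```

Because the sum depends only on the most recent seven bytes, input can be fed one chunk at a time without buffering the whole file.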