Age | Commit message | Author | |
---|---|---|---|
2020-10-25 | use python3-pil instead of removed python3-imaging | Helmut Grohne | |
2020-02-16 | drop support for Python 2.x | Helmut Grohne | |
2016-05-23 | remove curl dependency | Helmut Grohne | |
Teach importpkg how to download URLs using urlopen, removing the need to invoke curl. | |||
2013-08-01 | support hashing gif images | Helmut Grohne | |
* Rename "image_sha512" to "png_sha512". * dedup.image.ImageHash is now a base class for image hashes such as PNGHash and GIFHash. * Enable both hashes in importpkg. * Fix README. * Add new hash combinations to webapp. * Add "gif file not named *.gif" to issues in update_sharing. * Add redirect for "image_sha512" to webapp for backwards compatibility. | |||
2013-07-27 | move templates to dedup package | Helmut Grohne | |
They cluttered webapp.py and now vim can give proper highlighting for the templates. | |||
2013-07-26 | Merge branch functionid | Helmut Grohne | |
Actual savings on the full data set are around 7%. Conflicts: README | |||
2013-07-25 | README: foo.PNG is also a valid png name | Helmut Grohne | |
2013-07-23 | README: fix typo in query | Helmut Grohne | |
2013-07-23 | adapt queries in README to new schema | Helmut Grohne | |
2013-07-10 | schema: reference package table by integer key | Helmut Grohne | |
One approach to improving performance is to reduce the database size. A package name takes up 15 bytes on average. The integer key of a package takes up two bytes. Multiply that difference by the number of references and the saving should be noticeable. A small test set shows a reduction of 10%. | |||
2013-07-03 | README: explain update_sharing.py | Helmut Grohne | |
2013-06-10 | split the import phase to a yaml stream | Helmut Grohne | |
importpkg.py now emits a yaml stream instead of updating the database. The actual updating now happens in readyaml.py. In the process, autoimport.py was significantly reworked to import packages in parallel. | |||
2013-04-08 | README: improve query after schema change | Helmut Grohne | |
2013-03-10 | README: update queries to match content table split | Helmut Grohne | |
2013-03-07 | README: explain queries | Helmut Grohne | |
2013-03-06 | README: added interesting query | Helmut Grohne | |
2013-03-02 | update README | Helmut Grohne | |
* Tell about schema.sql. * Explain WAL. | |||
2013-02-25 | README: another interesting query | Helmut Grohne | |
2013-02-24 | hash image contents | Helmut Grohne | |
2013-02-24 | README: fix mistake | Helmut Grohne | |
2013-02-21 | added README | Helmut Grohne | |
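
The 2016-05-23 entry describes replacing a curl invocation with urlopen. The following is only a minimal sketch of that idea, not the code from importpkg: the function name `hash_url` and the `chunk_size` parameter are hypothetical, and it assumes Python 3 and that the goal is to hash the downloaded bytes without spawning an external process.

```python
import hashlib
import urllib.request


def hash_url(url, chunk_size=65536):
    """Return the SHA-512 hex digest of the data served at url.

    Hypothetical helper for illustration; not part of the dedup code base.
    """
    digest = hashlib.sha512()
    # urlopen returns a file-like response object, so the payload can be
    # read in chunks directly in Python instead of piping it through curl.
    with urllib.request.urlopen(url) as response:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:  # an empty read signals end of data
                break
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in fixed-size chunks keeps memory use bounded for large packages. The same function should also accept `file://` URLs, which is convenient for trying it out without network access.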