path: root/update_sharing.py
Age | Commit message | Author
2013-07-10 | use sqlalchemy paramstyle | Helmut Grohne
When the :name syntax is used inside SQL statements, sqlalchemy rewrites the placeholders into whatever paramstyle the underlying dbapi2 module needs. For instance, the paramstyle of psycopg2 is not qmark.
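The kind of translation described above can be illustrated with the standard library alone. This is a minimal sketch and not sqlalchemy's implementation: `named_to_qmark` is a hypothetical helper, the `package` table is invented for the demo, and the regular expression is naive (it would also rewrite a `:name` that appears inside a string literal).

```python
import re
import sqlite3

# Hypothetical helper, NOT sqlalchemy's code: rewrite :name placeholders
# into the positional "?" form expected by a qmark-style DB-API driver.
# The lookbehind skips "::" so that PostgreSQL-style casts are left alone.
_NAMED = re.compile(r"(?<!:):([A-Za-z_][A-Za-z0-9_]*)")


def named_to_qmark(sql, params):
    """Return (sql, args) with each :name replaced by ? in order of use."""
    args = []

    def substitute(match):
        args.append(params[match.group(1)])
        return "?"

    return _NAMED.sub(substitute, sql), tuple(args)


def demo():
    # The table and its columns are made up for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE package (name TEXT, version TEXT)")
    sql, args = named_to_qmark(
        "INSERT INTO package (name, version) VALUES (:name, :version)",
        {"name": "dedup", "version": "1.0"},
    )
    conn.execute(sql, args)
    return sql, conn.execute("SELECT name, version FROM package").fetchall()
```

A driver with a different paramstyle would need a different target form, which is why leaving the rewrite to sqlalchemy keeps the statements portable.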
2013-06-23 | update_sharing: postgres does not support "INSERT OR IGNORE" | Helmut Grohne
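One way to avoid the SQLite-specific "INSERT OR IGNORE" is an INSERT ... SELECT guarded by WHERE NOT EXISTS. The log does not say which replacement the commit actually chose, so the following is only a sketch under that assumption; the `sharing_pair` table and its columns are hypothetical, and the demo runs against SQLite only.

```python
import sqlite3

# Insert a row only if an identical one is not already present.
# Table and column names are invented for this illustration.
INSERT_IF_ABSENT = """
INSERT INTO sharing_pair (package1, package2)
SELECT :p1, :p2
WHERE NOT EXISTS (
    SELECT 1 FROM sharing_pair WHERE package1 = :p1 AND package2 = :p2
)
"""


def insert_if_absent(conn, p1, p2):
    """Insert (p1, p2) unless that pair already exists."""
    conn.execute(INSERT_IF_ABSENT, {"p1": p1, "p2": p2})


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sharing_pair ("
        "package1 TEXT, package2 TEXT, UNIQUE (package1, package2))"
    )
    insert_if_absent(conn, "a", "b")
    insert_if_absent(conn, "a", "b")  # duplicate, silently skipped
    insert_if_absent(conn, "a", "c")
    return conn.execute(
        "SELECT package1, package2 FROM sharing_pair ORDER BY package2"
    ).fetchall()
```

This check is not safe against concurrent writers inserting the same pair; a single-writer batch job is assumed here.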
2013-06-23 | dedup.utils: add enable_sqlite_foreign_keys helper | Helmut Grohne
This makes sqlalchemy easier to use, because the helper can be invoked once and then works for all connections.
2013-06-23 | port update_sharing.py to sqlalchemy | Helmut Grohne
2013-04-24 | implement the /compare/pkg1/pkg2 page differently | Helmut Grohne
The original version had two major drawbacks: 1) The SQL query caused a btree sort, so the wait for the first output was rather long. 2) For packages with many equal files, the output grew as O(n^2). Thanks to Christine Grohne and Klaus Aehlig for their suggestions. The new approach groups the files in package1 by their main hash value (sha512). It now also does by hand some of the work that SQL was designed to do. To speed up page generation, a new caching table was added that identifies which files have corresponding shared files.
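The grouping idea can be sketched in a few lines. This is an illustration under assumptions, not the page's actual code: the function names are hypothetical, files are modelled as (filename, hash) pairs, and both packages are grouped here for symmetry even though the message only mentions grouping package1.

```python
from collections import defaultdict


def group_by_hash(files):
    """Map each hash value to the list of filenames carrying it."""
    groups = defaultdict(list)
    for filename, digest in files:
        groups[digest].append(filename)
    return groups


def compare(files1, files2):
    """Return one entry per shared hash instead of one per file pair."""
    groups1 = group_by_hash(files1)
    groups2 = group_by_hash(files2)
    shared = []
    for digest in sorted(groups1):
        if digest in groups2:
            shared.append(
                (digest, sorted(groups1[digest]), sorted(groups2[digest]))
            )
    return shared
```

With n identical files on each side, a pairwise listing has n * n rows, while the grouped form has a single entry naming n + n files.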
2013-03-09 | split content table to a hash table | Helmut Grohne
In the old content table, the (package, filename, size) tuple was repeated for each hash function. Now the schema represents that each file has precisely one size, but multiple hashes.
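A schema of the shape described might look like the following. Only (package, filename, size) comes from the commit message; the `id`, `cid`, `hash_function` and `hash_value` columns, the table layout and the sample values are assumptions made for this sketch.

```python
import sqlite3

# One row per file in content; one row per (file, hash function) in hash.
# Column names other than package, filename and size are hypothetical.
SCHEMA = """
CREATE TABLE content (
    id INTEGER PRIMARY KEY,
    package TEXT NOT NULL,
    filename TEXT NOT NULL,
    size INTEGER NOT NULL
);
CREATE TABLE hash (
    cid INTEGER NOT NULL REFERENCES content (id),
    hash_function TEXT NOT NULL,
    hash_value TEXT NOT NULL
);
"""


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    cur = conn.execute(
        "INSERT INTO content (package, filename, size) VALUES (?, ?, ?)",
        ("pkg", "./usr/bin/tool", 1024),
    )
    cid = cur.lastrowid
    # The size is stored once; each hash function adds one row to hash.
    conn.executemany(
        "INSERT INTO hash (cid, hash_function, hash_value) VALUES (?, ?, ?)",
        [(cid, "sha512", "aa"), (cid, "md5", "bb")],
    )
    return conn.execute(
        "SELECT c.filename, c.size, h.hash_function "
        "FROM content c JOIN hash h ON h.cid = c.id "
        "ORDER BY h.hash_function"
    ).fetchall()
```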
2013-03-07 | enable enforcing foreign keys | Helmut Grohne
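SQLite only enforces foreign keys when this is requested on each connection. Assuming the database here is SQLite, enabling enforcement could look like the sketch below; `connect_with_foreign_keys` and the `parent`/`child` tables are hypothetical names, not taken from the repository.

```python
import sqlite3


def connect_with_foreign_keys(path):
    """Open a SQLite connection with foreign key enforcement switched on."""
    conn = sqlite3.connect(path)
    # The pragma applies to this connection only and must be issued
    # outside of a transaction, so it is run right after connecting.
    conn.execute("PRAGMA foreign_keys = ON")
    return conn


def demo():
    conn = connect_with_foreign_keys(":memory:")
    conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE child (pid INTEGER REFERENCES parent (id))")
    try:
        # No parent row with id 42 exists, so this should be refused.
        conn.execute("INSERT INTO child (pid) VALUES (42)")
    except sqlite3.IntegrityError:
        return "rejected"
    return "accepted"
```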
2013-03-02 | update_sharing: wrong database name | Helmut Grohne
2013-03-02 | add sharing table | Helmut Grohne
The sharing table is a cache for the /binary web pages. It essentially contains the numbers presented on those pages. This caching table is not populated automatically. It needs to be reconstructed after every package import or group of imports.