~helmut/debian-dedup.git

Age	Commit message	Author
2013-07-23	schema: reference hash functions by integer key	Helmut Grohne
	This already worked quite well for package.id. On a test data set of 5% size this transformation reduces the database size by about 4%.
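A minimal sketch of what referencing hash functions by integer key could look like, using Python's sqlite3 module purely for illustration. The table and column names (`function`, `hash`, `cid`, `function_id`) and the helper `function_id()` are assumptions, not the repository's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical lookup table: one row per hash function name.
CREATE TABLE function (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
);
-- Hash rows store the small integer key instead of repeating the name.
CREATE TABLE hash (
    cid INTEGER NOT NULL,
    function_id INTEGER NOT NULL REFERENCES function(id),
    hash TEXT NOT NULL
);
""")


def function_id(conn, name):
    """Return the integer key for a hash function name, creating it if needed."""
    conn.execute("INSERT OR IGNORE INTO function (name) VALUES (?)", (name,))
    row = conn.execute("SELECT id FROM function WHERE name = ?", (name,)).fetchone()
    return row[0]


conn.execute(
    "INSERT INTO hash (cid, function_id, hash) VALUES (?, ?, ?)",
    (1, function_id(conn, "sha512"), "abc123"),  # placeholder digest
)

# Resolving the name again requires a join on the integer key.
names = conn.execute(
    "SELECT function.name FROM hash"
    " JOIN function ON function.id = hash.function_id"
    " WHERE hash.cid = 1"
).fetchall()
```

The trade-off is an extra join whenever the function name is needed, in exchange for a smaller value in every hash row.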
2013-07-22	schema: extend content_package_index	Helmut Grohne
	We can avoid a b-tree sort in the package comparison of the web app if the package index also provides a size.
2013-07-10	schema: reference package table by integer key	Helmut Grohne
	One approach to improving performance is to reduce the database size. A package name takes up 15 bytes on average, whereas a package number takes up two bytes. Multiply that difference by the number of references and the saving should be noticeable. A small test set shows a reduction of 10%.
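The estimate in this message can be written out as a short calculation. The two byte sizes come from the commit message; the reference count below is a made-up figure for illustration, and the sketch ignores the space taken by the name-to-number mapping itself.

```python
AVG_NAME_BYTES = 15  # average package name size, from the commit message
INT_KEY_BYTES = 2    # size of a package number, from the commit message


def estimated_savings(references: int) -> int:
    """Bytes saved when each reference stores a number instead of a name."""
    return (AVG_NAME_BYTES - INT_KEY_BYTES) * references


# Hypothetical reference count, chosen only to show the scale.
print(estimated_savings(1_000_000))  # 13000000 bytes, roughly 13 MB
```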
2013-07-10	schema.sql: drop unused index	Helmut Grohne
	sharing_package_index is a sub-index of sharing_insert_index and therefore unnecessary.
2013-04-24	implement the /compare/pkg1/pkg2 page differently	Helmut Grohne
	The original version had two major drawbacks: 1) The SQL query used would cause a b-tree sort, so the time waiting for the first output was rather long. 2) For packages with many equal files, the output would grow with O(n^2). Thanks to Christine Grohne and Klaus Aehlig for their suggestions. The approach now groups files in package1 by their main hash value (sha512). It also now does manually some of the work that SQL was designed to do. To speed up page generation, a new caching table was added that identifies which files have corresponding shared files.
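The grouping idea can be sketched as follows. This is not the code from the commit: the function name `compare()` and the `(filename, sha512)` input shape are assumptions made for illustration. Files in package1 are grouped by hash, so n identical files on each side produce one group listing n + n names instead of n^2 pairs.

```python
from collections import defaultdict


def compare(files1, files2):
    """Group files of package1 by hash and attach matching files of package2.

    files1 and files2 are iterables of (filename, sha512) pairs (hypothetical
    input shape). Returns a list of (sha512, names_in_pkg1, names_in_pkg2).
    """
    # Index package2 by hash so each lookup is a dictionary access.
    by_hash2 = defaultdict(list)
    for name, digest in files2:
        by_hash2[digest].append(name)

    # Group package1 by hash; equal files collapse into one group.
    groups = defaultdict(list)
    for name, digest in files1:
        groups[digest].append(name)

    result = []
    for digest, names1 in groups.items():
        names2 = by_hash2.get(digest)
        if names2:  # only report hashes present in both packages
            result.append((digest, names1, names2))
    return result


shared = compare(
    [("a", "h1"), ("b", "h1"), ("c", "h2")],
    [("x", "h1"), ("y", "h1")],
)
# shared == [("h1", ["a", "b"], ["x", "y"])]
```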
2013-03-09	split content table to a hash table	Helmut Grohne
	In the old content table, (package, filename, size) would be repeated for each hash function. Now the schema represents that each file has precisely one size, but multiple hashes.
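A minimal sketch of such a split, using Python's sqlite3 module for illustration. The column names (`id`, `cid`, `function`, `hash`), the sample package and file, and the second hash function name `md5` are assumptions; only `sha512` is named in this log.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One row per file: the size is stored exactly once.
CREATE TABLE content (
    id INTEGER PRIMARY KEY,
    package TEXT NOT NULL,
    filename TEXT NOT NULL,
    size INTEGER NOT NULL
);
-- One row per (file, hash function) pair.
CREATE TABLE hash (
    cid INTEGER NOT NULL REFERENCES content(id),
    function TEXT NOT NULL,
    hash TEXT NOT NULL
);
""")

cur = conn.execute(
    "INSERT INTO content (package, filename, size) VALUES (?, ?, ?)",
    ("examplepkg", "./usr/bin/example", 1024),  # hypothetical file
)
cid = cur.lastrowid
conn.executemany(
    "INSERT INTO hash (cid, function, hash) VALUES (?, ?, ?)",
    [(cid, "sha512", "aaaa"), (cid, "md5", "bbbb")],  # placeholder digests
)

# Each hash row joins back to the single size of its file.
rows = conn.execute(
    "SELECT content.size, hash.function FROM content"
    " JOIN hash ON hash.cid = content.id ORDER BY hash.function"
).fetchall()
```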
2013-03-07	use "ON DELETE CASCADE" clauses	Helmut Grohne
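For readers unfamiliar with the clause, a small sketch of how "ON DELETE CASCADE" behaves, assuming SQLite via Python's sqlite3 module. The two tables are simplified stand-ins, not the repository's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys (and cascades) when this pragma is on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE package (name TEXT PRIMARY KEY);
CREATE TABLE content (
    id INTEGER PRIMARY KEY,
    package TEXT NOT NULL REFERENCES package(name) ON DELETE CASCADE,
    filename TEXT NOT NULL
);
""")

conn.execute("INSERT INTO package (name) VALUES ('examplepkg')")
conn.execute(
    "INSERT INTO content (package, filename) VALUES ('examplepkg', './usr/bin/example')"
)

# Deleting the package row also removes the content rows that reference it.
conn.execute("DELETE FROM package WHERE name = 'examplepkg'")
remaining = conn.execute("SELECT COUNT(*) FROM content").fetchone()[0]
```

Without the clause, the delete would either fail or leave orphaned rows, depending on whether foreign key enforcement is enabled.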

2013-03-07	schema.sql: remove unsatisfiable foreign key	Helmut Grohne
	In the dependency table we will insert dependencies on packages which are not tracked. This happens during initial import and for virtual packages. Therefore the "required" column cannot be a foreign key.
2013-03-07	schema.sql: annotate foreign keys of sharing	Helmut Grohne

2013-03-07	integrate the source table into the package table	Helmut Grohne

2013-03-04	importpkg: record the source package relationship	Helmut Grohne

2013-03-02	add sharing table	Helmut Grohne
	The sharing table is a cache for the /binary web pages. It essentially contains the numbers presented. This caching table is not automatically populated. It needs to be reconstructed after every (group of) package imports.
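Because the cache is not populated automatically, some explicit rebuild step is needed after imports. A rough sketch of such a step, assuming SQLite via Python's sqlite3 module: the simplified `content` and `sharing` columns and the function `rebuild_sharing()` are hypothetical and do not reflect the numbers the real table stores.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified, hypothetical tables for illustration only.
CREATE TABLE content (package TEXT, filename TEXT, hash TEXT);
CREATE TABLE sharing (package1 TEXT, package2 TEXT, files INTEGER);
""")


def rebuild_sharing(conn):
    """Throw away the cache and recompute it from the content table."""
    with conn:  # run both statements in one transaction
        conn.execute("DELETE FROM sharing")
        conn.execute("""
            INSERT INTO sharing (package1, package2, files)
            SELECT a.package, b.package, COUNT(DISTINCT a.filename)
            FROM content AS a JOIN content AS b ON a.hash = b.hash
            WHERE a.package != b.package
            GROUP BY a.package, b.package
        """)


conn.executemany(
    "INSERT INTO content (package, filename, hash) VALUES (?, ?, ?)",
    [("a", "f1", "h1"), ("a", "f2", "h2"), ("b", "g1", "h1")],
)
rebuild_sharing(conn)
cached = conn.execute(
    "SELECT package1, package2, files FROM sharing ORDER BY package1"
).fetchall()
```

Running `rebuild_sharing()` again after further imports replaces the cached rows rather than appending to them.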
2013-03-02	move sql schema to a separate file	Helmut Grohne