Age | Commit message | Author |
|
They were previously hex encoded, so this should cut the space consumed
by hashes in half. A first benchmark indicates that the savings in
database size are on the order of 30%.
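
The halving follows from the encoding itself: a hex string spends two
characters on every digest byte. A minimal sketch in Python, assuming a
sha512 digest stored in an SQLite BLOB column. The table name hash_demo
and both function names are hypothetical and not taken from the actual
schema.

```python
import hashlib
import sqlite3


def digest_sizes(data):
    """Return (hex length, raw length) of the sha512 digest of data."""
    h = hashlib.sha512(data)
    return len(h.hexdigest()), len(h.digest())


def store_raw_digest(conn, data):
    """Store the raw digest as a BLOB and return its stored length.

    The table hash_demo is a hypothetical stand-in for the real schema.
    """
    conn.execute("CREATE TABLE IF NOT EXISTS hash_demo (value BLOB NOT NULL)")
    conn.execute(
        "INSERT INTO hash_demo (value) VALUES (?)",
        (hashlib.sha512(data).digest(),),
    )
    # length() on a BLOB counts bytes, not characters.
    return conn.execute("SELECT length(value) FROM hash_demo").fetchone()[0]
```

The hash column shrinks by half, while the rest of each row (package,
filename, size) is unchanged, which is consistent with overall savings
being smaller than 50%.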
|
|
The original version had two major drawbacks:
1) The SQL query used would cause a btree sort, so the wait for the
first output was rather long.
2) For packages with many equal files, the output would grow as
O(n^2).
Thanks to Christine Grohne and Klaus Aehlig for their suggestions. The
approach now groups files in package1 by their main hash value (sha512).
It also now does manually some work that SQL was designed to do. To
speed up page generation, a new caching table was added that identifies
which files have corresponding shared files.
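
A rough sketch of the grouping idea in Python, not the actual
implementation: instead of emitting one row per pair of equal files,
files in package1 are bucketed by hash, so each group of equal files is
reported once. The function names and the (filename, digest) input shape
are assumptions made for illustration.

```python
from collections import defaultdict


def group_by_hash(files):
    """Bucket (filename, digest) pairs by digest, keeping input order."""
    groups = defaultdict(list)
    for filename, digest in files:
        groups[digest].append(filename)
    return dict(groups)


def shared_groups(files1, files2):
    """Yield (digest, names1, names2) once per digest found in both packages.

    With n equal files on each side this yields a single entry listing
    2n names, rather than the n * n pairs a plain join would produce.
    """
    groups1 = group_by_hash(files1)
    groups2 = group_by_hash(files2)
    for digest, names1 in groups1.items():
        names2 = groups2.get(digest)
        if names2:
            yield digest, names1, names2
```

Because the result is a generator, the first group can be rendered as
soon as it is found, without waiting for a full sort.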
|
|
In the old content table, the tuple (package, filename, size) was
repeated for each hash function. Now the schema represents that each
file has precisely one size, but multiple hashes.
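
One way such a split could look, sketched with Python's sqlite3 module.
All table and column names other than package, filename and size are
hypothetical, as is the add_file helper. The point is only that the size
lives on the file row while hashes hang off it in a second table.

```python
import sqlite3

# Hypothetical schema: one row per file, any number of hash rows per file.
SCHEMA = """
CREATE TABLE content (
    id INTEGER PRIMARY KEY,
    package TEXT NOT NULL,
    filename TEXT NOT NULL,
    size INTEGER NOT NULL
);
CREATE TABLE hash (
    cid INTEGER NOT NULL REFERENCES content(id),
    hash_function TEXT NOT NULL,
    digest BLOB NOT NULL
);
"""


def add_file(conn, package, filename, size, hashes):
    """Insert one file and its hashes; hashes maps function name to digest."""
    cur = conn.execute(
        "INSERT INTO content (package, filename, size) VALUES (?, ?, ?)",
        (package, filename, size),
    )
    cid = cur.lastrowid
    conn.executemany(
        "INSERT INTO hash (cid, hash_function, digest) VALUES (?, ?, ?)",
        [(cid, name, digest) for name, digest in hashes.items()],
    )
    return cid
```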
|
The sharing table is a cache for the /binary web pages. It essentially
contains the numbers presented there. This caching table is not
populated automatically. It needs to be rebuilt after every package
import or group of imports.
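
A sketch of what such a rebuild step might look like, assuming the cache
holds per-package-pair counts and sizes of shared files. The simplified
files table, the columns of sharing and the function rebuild_sharing are
all hypothetical. The real tables are not described here beyond the fact
that the cache holds the numbers shown on the pages.

```python
import sqlite3

# Hypothetical, simplified tables for illustration only.
SCHEMA = """
CREATE TABLE files (
    package TEXT NOT NULL,
    filename TEXT NOT NULL,
    size INTEGER NOT NULL,
    digest BLOB NOT NULL
);
CREATE TABLE sharing (
    package1 TEXT NOT NULL,
    package2 TEXT NOT NULL,
    files INTEGER NOT NULL,
    size INTEGER NOT NULL
);
"""


def rebuild_sharing(conn):
    """Drop all cached rows and recompute them from the files table."""
    with conn:  # commit on success, roll back on error
        conn.execute("DELETE FROM sharing")
        conn.execute(
            """
            INSERT INTO sharing (package1, package2, files, size)
            SELECT p1, p2, COUNT(*), SUM(size) FROM (
                -- Count each file of package1 once, however many
                -- equal files package2 contains.
                SELECT DISTINCT a.package AS p1, b.package AS p2,
                                a.filename AS filename, a.size AS size
                FROM files AS a JOIN files AS b ON a.digest = b.digest
                WHERE a.package <> b.package
            ) GROUP BY p1, p2
            """
        )
```

Running this once after a batch of imports, rather than after each
single package, avoids recomputing the whole cache repeatedly.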
|