Age | Commit message | Author |
|
In the meantime, the master branch evolved quite a bit and the schema
changed again (eqclass added to the function table). The main reason for
the merge is to resolve the large number of conflicts once, so development
of the sqlalchemy branch can continue and still benefit from changes in
the master branch. These include schema compatibility and the adapted
indent level in the web app, which comes from using contextlib.closing in
a way that resembles sqlalchemy's "with db.begin() as conn:".
Conflicts:
autoimport.py
dedup/utils.py
readyaml.py
update_sharing.py
webapp.py
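
For illustration, the contextlib.closing pattern mentioned in the merge message can be sketched with the standard library alone. The helper name count_functions and the columns id and name are assumptions made for this example; only the function table and its eqclass column are named in the message.

```python
import contextlib
import sqlite3


def count_functions(db):
    """Count rows in the function table (hypothetical helper)."""
    # closing() calls cur.close() when the block exits, so the body sits one
    # indent level deeper, the same shape as "with db.begin() as conn:".
    with contextlib.closing(db.cursor()) as cur:
        cur.execute("SELECT count(*) FROM function;")
        return cur.fetchone()[0]


db = sqlite3.connect(":memory:")
# Assumed minimal schema; only "function" and "eqclass" come from the log.
db.execute(
    "CREATE TABLE function (id INTEGER PRIMARY KEY, name TEXT, eqclass INTEGER);"
)
db.execute("INSERT INTO function (name, eqclass) VALUES ('sha512', 1);")
print(count_functions(db))
```

Because both forms are context managers with the same indentation, code written this way on master should need fewer whitespace-only changes when merged into the sqlalchemy branch.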
|
|
|
|
No explicit "import sqlite3" left. It's still a bit rough around the
edges, particularly since sqlalchemy's support for executemany is
totally broken.
|
|
This should reduce the query bandwidth to the rdbms.
|
|
This already worked quite well for package.id. On a test data set of 5%
of the full size, this transformation reduces the database size by about 4%.
|
|
One approach to improve performance is to reduce the database size. A
package name takes up 15 bytes on average. A package number takes up
two bytes. Multiply that difference by the number of references and
it should be noticeable. A small test set shows a reduction by 10%.
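
A minimal sketch of the idea, assuming a package table that maps names to integer ids and a referencing table that stores only the id. The table name content, the column names, and the helpers package_id and estimated_savings are hypothetical; the 15 and 2 byte figures are the ones quoted above.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE package (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE content (
    id INTEGER PRIMARY KEY,
    pid INTEGER NOT NULL REFERENCES package(id),
    filename TEXT NOT NULL
);
""")


def package_id(db, name):
    """Return the integer id for a package name, inserting it if needed."""
    row = db.execute("SELECT id FROM package WHERE name = ?;", (name,)).fetchone()
    if row is not None:
        return row[0]
    return db.execute("INSERT INTO package (name) VALUES (?);", (name,)).lastrowid


def estimated_savings(references, name_bytes=15, id_bytes=2):
    """Rough bytes saved by storing an id instead of a name per reference."""
    return references * (name_bytes - id_bytes)


pid = package_id(db, "coreutils")
# Every referencing row now carries the small integer, not the name.
db.executemany(
    "INSERT INTO content (pid, filename) VALUES (?, ?);",
    [(pid, "./bin/ls"), (pid, "./bin/cp")],
)
```

With these figures each reference saves about 13 bytes, so a million references would save roughly 13 MB before accounting for index and page overhead.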
|
|
This appears to be a huge performance boost.
|
|
importpkg.py now emits a yaml stream instead of updating the database.
The actual updating now happens in readyaml.py. In the process,
autoimport.py was significantly reworked to import packages in parallel.
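
The split described above can be sketched as follows. This is not the code of importpkg.py, readyaml.py, or autoimport.py: the functions extract, load, and import_all, the table layout, and the record fields are all hypothetical, and JSON stands in for the yaml stream so the example needs only the standard library.

```python
import concurrent.futures
import json
import sqlite3


def extract(package):
    """Stand-in for the extraction step: serialize, never touch the database."""
    return json.dumps({"package": package["name"], "version": package["version"]})


def load(db, document):
    """Stand-in for the loading step: parse one document, update the database."""
    record = json.loads(document)
    db.execute(
        "INSERT OR REPLACE INTO package (name, version) VALUES (?, ?);",
        (record["package"], record["version"]),
    )


def import_all(db, packages, workers=4):
    """Extract in parallel, but apply database updates from a single thread."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order; load() runs in this thread.
        for document in pool.map(extract, packages):
            load(db, document)
    db.commit()


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE package (name TEXT PRIMARY KEY, version TEXT);")
import_all(db, [{"name": "a", "version": "1"}, {"name": "b", "version": "2"}])
```

Keeping the extraction step free of database access is what makes it safe to run many extractions at once, while a single writer avoids contention on the database.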
|