summaryrefslogtreecommitdiff
path: root/readyaml.py
AgeCommit message (Collapse)Author
2013-08-03convert remaining code to sqlalchemyHelmut Grohne
No explicit "import sqlite3" left. It's still a bit rough around the corners, particularly since sqlalchemy's support for executemany is totally broken.
2013-07-24readyaml: cache the whole function tableHelmut Grohne
This should reduce the query bandwidth to the rdbms.
2013-07-23schema: reference hash functions by integer keyHelmut Grohne
This already worked quite well for package.id. On a test data set of 5% size this transformation reduces the database size by about 4%.
2013-07-10schema: reference package table by integer keyHelmut Grohne
One approach to improve performance is to reduce the database size. A package name takes up 15 bytes in average. A number of a package takes up two bytes. Multiply that difference with the number of references and it should be noticeably. A small test set show a reduction by 10%.
2013-06-11autoimport: don't fork for readyamlHelmut Grohne
This appears to be a huge performance boost.
2013-06-10split the import phase to a yaml streamHelmut Grohne
importpkg.py now emits a yaml stream instead of updating the database. The acutual updating now happens in readyaml.py. In this process autoimport.py was significantly reworked to import packages in parallel.