Age Commit message Author
2021-12-31multiarchanalyze.py: speed up yaml dumpingHelmut Grohne
2021-12-31multiarchimport.py: httpredir.d.o is deprecatedHelmut Grohne
2021-12-31multiarchanalyze.py: make pylint happierHelmut Grohne
pylint does not recognize that the condition ensures that left and right are defined.
2021-12-31drop remaining Python 2.x supportHelmut Grohne
2021-12-31multiarchimport.py: log exceptions from worker processesHelmut Grohne
2021-12-31multiarchimport.py: use dedup.utils.iterate_packagesHelmut Grohne
2021-12-31multiarchimport.py: decodetarname was dropped in masterHelmut Grohne
Fixes: ba840e8913ef ("Merge branch master into branch multiarchhints")
2021-12-31Merge branch master into branch multiarchhintsHelmut Grohne
Among other things, this drops Python 2.x support.
2021-12-31dedup.utils: uninline helper function iterate_packagesHelmut Grohne
2021-12-31webapp.py: consistently close cursors using context managersHelmut Grohne
2021-12-30DecompressedStream: improve performanceHelmut Grohne
When the decompression ratio is huge, we may be faced with a large (multiple megabytes) bytes object. Slicing that object incurs a copy each time, so repeated slicing becomes O(n^2), while appending to and trimming a bytearray is much faster.
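The buffering idea can be sketched as follows. This is a minimal illustration, not the project's DecompressedStream: the class name ChunkBuffer and its methods are hypothetical.

```python
class ChunkBuffer:
    """Hypothetical read buffer; not the project's DecompressedStream."""

    def __init__(self) -> None:
        self.buff = bytearray()

    def feed(self, data: bytes) -> None:
        # Append in place instead of building a new bytes object.
        self.buff += data

    def take(self, length: int) -> bytes:
        # Copy out only the requested prefix, then trim it in place.
        # The remainder is never copied into a fresh object.
        result = bytes(self.buff[:length])
        del self.buff[:length]
        return result
```

With an immutable bytes object, every `data = data[length:]` copies the whole remainder, which is where the quadratic cost comes from.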
2021-12-29multiarchimport.py: reduce default loggingHelmut Grohne
2021-12-29multiarchanalyze.py: fix python3 compatibilityHelmut Grohne
.keys() now returns a special object, but show_files really wants something that provides len() and supports repeated iteration.
2021-12-29DecompressedStream: fix endless loopHelmut Grohne
Fixes: 775bdde52ad5 ("DecompressedStream: avoid mixing types for variable data")
2021-12-29webapp: avoid changing variable typeHelmut Grohne
Again static type checking is the driver for the change here.
2021-12-29autoimport: avoid changing variable typeHelmut Grohne
knownpkgvers is a dict while knownpkgs is a set. Separating them helps static type checkers.
2021-12-29webapp: speed up encode_and_bufferHelmut Grohne
We now know that our parameter is a jinja2.environment.TemplateStream. Enable buffering and accumulate via an io.BytesIO to avoid O(n^2) append.
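A rough sketch of accumulating via io.BytesIO is shown below. Only the function name comes from the commit; the signature, the buffer_size parameter and the body are assumptions, and the jinja2 side (enabling buffering on the TemplateStream) is left out.

```python
import io
from typing import Iterable, Iterator


def encode_and_buffer(chunks: Iterable[str],
                      buffer_size: int = 16384) -> Iterator[bytes]:
    """Encode str chunks as UTF-8 and yield them in larger blocks.

    buffer_size is a hypothetical threshold, not taken from the project.
    """
    buff = io.BytesIO()
    for chunk in chunks:
        # Writing to a BytesIO avoids re-copying everything accumulated so far.
        buff.write(chunk.encode("utf8"))
        if buff.tell() >= buffer_size:
            yield buff.getvalue()
            buff = io.BytesIO()
    if buff.tell() > 0:
        yield buff.getvalue()
```

Accumulating with `data += chunk` on a bytes object would copy the accumulated data on every append.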
2021-12-29webapp: improve performanceHelmut Grohne
html_response expects a str-generator, but when we call the render method, we receive a plain str. It can be iterated - one character at a time. That's what encode_and_buffer will do in this case. So better stream all the time.
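The effect can be reproduced without jinja2. In this sketch, collect_chunks and stream_lines are hypothetical stand-ins for a consumer of a str-generator and for a streaming template.

```python
from typing import Iterable, Iterator, List


def collect_chunks(body: Iterable[str]) -> List[str]:
    """Hypothetical consumer that handles its input one chunk at a time."""
    return [chunk for chunk in body]


def stream_lines() -> Iterator[str]:
    # A generator hands over whole chunks.
    yield "<html>\n"
    yield "</html>\n"


rendered = "<html>\n</html>\n"
per_char = collect_chunks(rendered)          # a plain str iterates per character
per_chunk = collect_chunks(stream_lines())   # a generator iterates per yield
```

Both inputs carry the same text, but the plain str makes the consumer do its per-chunk work once for every character.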
2021-12-29webapp: forward compatibility with newer werkzeugHelmut Grohne
2021-12-29autoimport.py: convert to use pathlibHelmut Grohne
2021-12-29importpkg: fix suppression of boring contentHelmut Grohne
The content must be bytes. Passing str silently skips the suppression.
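The silent failure follows from Python 3 never treating str and bytes as equal. The sketch below uses a hypothetical is_boring check with made-up values; the project's actual suppression logic may look different.

```python
BORING_CONTENT = {b"", b"\n"}  # hypothetical values, for illustration only


def is_boring(content) -> bool:
    """Hypothetical check; not the project's actual suppression code."""
    return content in BORING_CONTENT


matches_bytes = is_boring(b"\n")  # bytes compare equal to bytes
matches_str = is_boring("\n")     # str never equals bytes, so no match and no error
```

No exception is raised for the str argument, which is why such a bug can go unnoticed.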
2021-12-29DecompressedHash: also gain a name property for consistencyHelmut Grohne
2021-12-29ImageHash: gain a name propertyHelmut Grohne
Instead of retroactively attaching a name to an ImageHash, autogenerate it via a property. Doing so also simplifies static type checking.
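A derived name can be sketched with a property. The class SuffixedHash and its suffix parameter are hypothetical and only loosely modelled on the idea; they are not the project's ImageHash.

```python
import hashlib


class SuffixedHash:
    """Hypothetical wrapper around a hashlib-style object."""

    def __init__(self, hashobj, suffix: str) -> None:
        self.hashobj = hashobj
        self.suffix = suffix

    @property
    def name(self) -> str:
        # Computed on access, so nothing has to be attached after construction.
        return self.hashobj.name + self.suffix

    def update(self, data: bytes) -> None:
        self.hashobj.update(data)

    def hexdigest(self) -> str:
        return self.hashobj.hexdigest()


wrapped = SuffixedHash(hashlib.sha512(), "_image")
```

Because name is declared on the class, a static type checker sees it without needing to track attribute assignments made elsewhere.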
2021-12-29don't return the first parameter from hash_fileHelmut Grohne
Returning the object raises the question of what precisely the return type is, and it brings no benefit.
2021-12-29drop unused function sql_add_version_compareHelmut Grohne
2021-12-29DecompressedStream: avoid mixing types for variable dataHelmut Grohne
The local variable data can be bool or bytes. That's inconvenient for static type checkers. Avoid doing so.
2021-12-29DecompressedStream: eliminate redundant closed fieldHelmut Grohne
2021-12-27stop hiding M-A:same conflicts in binNMUed packagesHelmut Grohne
The issue has been solved by Mattia Rizzolo in dh-strip-nondeterminism via #999665.
2020-10-25drop obsolete python modulesHelmut Grohne
Both lzma and concurrent.futures are now part of the standard library and solely exist as virtual packages.
2020-10-25externalize ar parsing to arpyHelmut Grohne
2020-10-25use python3-pil instead of removed python3-imagingHelmut Grohne
2020-09-06fix tuple mismatchHelmut Grohne
Fixes: e6115dd16b46 ("hide M-A:same conflicts in binNMUed packages")
2020-09-03hide M-A:same conflicts in binNMUed packagesHelmut Grohne
binNMUed packages are not currently reproducible, because buildds don't pass --binNMU-timestamp to sbuild. Thus they use varying SOURCE_DATE_EPOCH and produce faulty packages. As much as this is a real bug, it is not actionable by maintainers. Hide such issues for now.
Link: https://salsa.debian.org/perl-team/modules/packages/libtie-hash-indexed-perl/-/merge_requests/1
Link: https://bugs.debian.org/843773
2020-02-17fix typo in maforeign_library regexHelmut Grohne
2020-02-16drop support for Python 2.xHelmut Grohne
2018-06-25adapt to python3-magic/2:0.4.15-1 APIHelmut Grohne
2018-01-07multiarchanalyze: give examples when representing arch setsHelmut Grohne
Uwe Kleine-König said that knowing example architectures for file conflicts would be incredibly useful. The old presentation of architecture sets would collapse sets that are too big to a single count. This makes it difficult to find any colliding pair. Now, we give at least two example architectures in addition to the count.
Reported-By: Uwe Kleine-König <ukleinek@debian.org>
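One way such a representation could look is sketched below. The function name, the thresholds and the wording of the output are assumptions; the log does not show the real format.

```python
def describe_archs(archs, limit: int = 4, examples: int = 2) -> str:
    """Hypothetical formatter for a set of architecture names."""
    ordered = sorted(archs)
    if len(ordered) <= limit:
        # Small sets are listed in full.
        return ", ".join(ordered)
    # Large sets are collapsed to a count plus a few examples,
    # so that a colliding pair can still be picked out.
    shown = ", ".join(ordered[:examples])
    return "%d architectures including %s" % (len(ordered), shown)
```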
2018-01-05fix logic inversion in package selectionHelmut Grohne
We want the package with the highest version, not the lowest.
Reported-By: Uwe Kleine-König <ukleinek@debian.org>
2017-12-21multiarchanalyze: opportunistically emit a version when uniqueHelmut Grohne
2017-09-23add module dedup.filemagicHelmut Grohne
This module is not used anywhere and thus its dependency on python3-magic is not recorded in the README. It can be used to guess the file type by looking at the contents using file magic. It is not a typical hash function, but it can be used for repurposing dedup for other analysers.
2017-09-13fix HashBlacklistContent.copyHelmut Grohne
It wasn't copying the stored member and thus could blacklist "wrong" content after a copy.
2017-03-05multiarchimport: python 3 forward compatibilityHelmut Grohne
2017-03-04multiarchanalyze: detect some forms of wrong M-A:foreignHelmut Grohne
When an arch:any package ships a .so file in a public library search path (e.g. a symlink as many lib*-dev packages do) it most likely shouldn't be M-A:foreign. A common exception is plugins loaded into programs, so exclude that case. Many thanks to Johannes Schauer and Guillem Jover for helping discover this pattern of Multi-Arch: foreign abuse.
2016-11-13autoimport: fix regression in url computationHelmut Grohne
The list path got inadvertently prepended to all binary package urls.
Fixes: 420804c25797 ("autoimport: improve fetching package lists")
2016-08-07multiarchanalyze: make it easily consumable by tracker.d.oHelmut Grohne
Many thanks to Paul Wise for his detailed feedback on the data format.
2016-07-29repository movedHelmut Grohne
2016-06-12multiarchanalyze: speed up on sqlite3 3.8.7.1Helmut Grohne
Since all users of archdepcandidate run the results through "exists()" or "group by", "union" vs "union all" does not make any difference to the results. On the performance side, however, "union all" avoids a b-tree merge, getting the maforeign_candidate query down from hours to seconds.
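The equivalence under "exists()" can be checked on a toy schema. The tables, views and helper functions below are made up for illustration and have nothing to do with the project's actual schema or the archdepcandidate query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (pkg TEXT);
    CREATE TABLE b (pkg TEXT);
    CREATE TABLE names (name TEXT);
    INSERT INTO a VALUES ('x'), ('y');
    INSERT INTO b VALUES ('y'), ('z');
    INSERT INTO names VALUES ('w'), ('x'), ('y'), ('z');
    CREATE VIEW cand_union AS SELECT pkg FROM a UNION SELECT pkg FROM b;
    CREATE VIEW cand_union_all AS SELECT pkg FROM a UNION ALL SELECT pkg FROM b;
""")


def matching_names(view: str) -> list:
    # The view name is interpolated only because it is fixed in this script.
    query = ("SELECT name FROM names WHERE EXISTS "
             "(SELECT 1 FROM %s AS c WHERE c.pkg = names.name) "
             "ORDER BY name" % view)
    return [row[0] for row in conn.execute(query)]


def row_count(view: str) -> int:
    return conn.execute("SELECT count(*) FROM %s" % view).fetchone()[0]
```

The two views differ in their raw row counts because only "union" removes the duplicate, yet the exists() filter selects the same names from both.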
2016-06-10add a separate tool for generating hints on Multi-Arch headersHelmut Grohne
It builds on the core functionality of dedup, but uses a different database schema. Unlike dedup, it aborts downloading Arch:all packages early and consumes any other architecture in its entirety instead.
2016-06-09DecompressedStream: fix decompression without flushHelmut Grohne
In Python 3.x, lzma.LZMADecompressor doesn't have a flush method.
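The difference between decompressor objects can be handled by checking for the method, as in this sketch. The helper finish is hypothetical; how the project itself deals with the missing method is not shown in the log.

```python
import lzma
import zlib


def finish(decompressor) -> bytes:
    """Return trailing data from decompressors offering flush(), else nothing."""
    flush = getattr(decompressor, "flush", None)
    if flush is None:
        # lzma.LZMADecompressor has no flush(); decompress() already
        # returned everything it could produce.
        return b""
    return flush()


payload = b"example data" * 100

lz = lzma.LZMADecompressor()
lz_out = lz.decompress(lzma.compress(payload)) + finish(lz)

zl = zlib.decompressobj()
zl_out = zl.decompress(zlib.compress(payload)) + finish(zl)
```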
2016-06-09autoimport: fix hash checkHelmut Grohne
Fixes: 2f12a6e2f426 ("autoimport: add option to skip hash checking")