.keys() now returns a special object, but show_files really wants
something that provides len() and supports repeated iteration.
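A hedged illustration; the mapping is a stand-in for whatever show_files receives:

```python
mapping = {"usr/bin/foo": 1, "usr/bin/bar": 2}  # hypothetical stand-in

files = list(mapping.keys())  # a plain list: len() works and the list can
                              # be iterated as often as show_files needs
assert len(files) == 2
for _ in range(2):            # repeated iteration is safe on a list
    for name in files:
        pass
```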
|
|
The issue has been solved by Mattia Rizzolo in dh-strip-nondeterminism
via #999665.
|
|
Fixes: e6115dd16b46 ("hide M-A:same conflicts in binNMUed packages")
|
|
binNMUed packages are not currently reproducible, because buildds don't
pass --binNMU-timestamp to sbuild. Thus they use varying
SOURCE_DATE_EPOCH and produce faulty packages. As much as this is a real
bug, it is not actionable by maintainers. Hide such issues for now.
Link: https://salsa.debian.org/perl-team/modules/packages/libtie-hash-indexed-perl/-/merge_requests/1
Link: https://bugs.debian.org/843773
|
|
|
|
Uwe Kleine-König said that knowing example architectures for file
conflicts would be incredibly useful. The old presentation collapsed
architecture sets that were too big to a single count, which made it
difficult to find any colliding pair. Now we give at least two example
architectures in addition to the count.
Reported-By: Uwe Kleine-König <ukleinek@debian.org>
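A sketch of the new presentation; the function name and cut-off are
hypothetical:

```python
def describe_archs(archs, limit=4):
    """Render an architecture set, always naming at least two members."""
    archs = sorted(archs)
    if len(archs) <= limit:
        return ", ".join(archs)
    # previously just a bare count; naming two members lets a reader
    # pick a colliding pair without consulting the database
    return "%s, %s and %d more" % (archs[0], archs[1], len(archs) - 2)

print(describe_archs({"amd64", "arm64", "i386", "ppc64el", "s390x"}))
```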
|
|
We want the package with the highest version, not the lowest.
Reported-By: Uwe Kleine-König <ukleinek@debian.org>
|
|
|
|
|
|
When an arch:any package ships a .so file in a public library search
path (e.g. a symlink, as many lib*-dev packages do), it most likely
shouldn't be M-A:foreign. A common exception is plugins loaded into
programs, so exclude that case.
Many thanks to Johannes Schauer and Guillem Jover for helping discover
this pattern of Multi-Arch: foreign abuse.
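As a rough sketch of the check (the path pattern and plugin exception are
illustrative, not the hinter's exact rules):

```python
import re

# public library search paths, simplified to /lib and /usr/lib with an
# optional multiarch triplet component
LIBDIR_RE = re.compile(r"^(usr/)?lib(/[^/]+-linux-gnu)?/[^/]+\.so")

def looks_like_public_so(path):
    """True for .so files on a public search path, excluding plugins."""
    if "/plugins/" in path:     # plugins loaded into programs are fine
        return False
    return LIBDIR_RE.match(path) is not None

assert looks_like_public_so("usr/lib/x86_64-linux-gnu/libfoo.so")
assert not looks_like_public_so("usr/lib/frobnicator/plugins/libbar.so")
```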
|
|
Many thanks to Paul Wise for his detailed feedback on the data format.
|
|
Since all users of archdepcandidate run the results through "exists()"
or "group by", "union" vs. "union all" makes no difference to the
results. Performance-wise, however, "union all" avoids a b-tree merge,
getting the maforeign_candidate query down from hours to seconds.
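The equivalence can be sketched with sqlite3 (schema made up for
illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (pkg TEXT); CREATE TABLE b (pkg TEXT);
    INSERT INTO a VALUES ('x'); INSERT INTO b VALUES ('x');
""")
# UNION must deduplicate its operands (a b-tree merge in sqlite);
# UNION ALL merely concatenates. Wrapped in EXISTS or GROUP BY,
# both forms produce the same answer.
rows = conn.execute(
    "SELECT pkg FROM (SELECT pkg FROM a UNION ALL SELECT pkg FROM b) "
    "GROUP BY pkg").fetchall()
assert rows == [("x",)]
```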
|
|
It builds on the core functionality of dedup, but uses a different
database schema. Unlike dedup, it aborts downloading Arch:all packages
early and consumes any other architecture in its entirety instead.
|
|
In Python 3.x, lzma.LZMADecompressor doesn't have a flush method.
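One way to paper over the difference is a shim like this (name
hypothetical):

```python
def flush_decompressor(decompressor):
    # pyliblzma on Python 2.x offers flush(); Python 3.x's lzma does not,
    # since its decompress() already hands out data as it becomes available
    flush = getattr(decompressor, "flush", None)
    return flush() if flush is not None else b""
```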
|
|
Fixes: 2f12a6e2f426 ("autoimport: add option to skip hash checking")
|
|
Move the fetching part into dedup.utils. Instead of hard-coding the
gzip-compressed copy, try xz, gz and plain in that order. Also take care
to actually close the connection.
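Roughly the following, in Python 3 spelling and with a hypothetical
helper name:

```python
import contextlib
import urllib.error
import urllib.request

def fetch_index(baseurl):
    """Try the xz, gz and plain copies of Packages in that order."""
    for ext in (".xz", ".gz", ""):
        try:
            conn = urllib.request.urlopen(baseurl + "/Packages" + ext)
        except urllib.error.HTTPError as err:
            if err.code == 404:
                continue                 # try the next compression format
            raise
        with contextlib.closing(conn):   # actually close the connection
            return conn.read(), ext
    raise RuntimeError("no Packages file found at %s" % baseurl)
```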
|
|
This causes non-successful fetches to result in HTTPErrors, as they
already do in py3.
|
|
After all, it isn't that generic. It knows what information is necessary
for running dedup. Thus it really belongs to the extractor subclass.
By building on handle_control_info, not that much parsing logic is left
in the extractor subclass.
|
|
|
|
|
|
Teach importpkg how to download URLs using urlopen, removing the need
to invoke curl.
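In outline, assuming Python 3.x's urllib (the consumer is a placeholder):

```python
from urllib.request import urlopen

def download(url, consumer):
    """Feed the HTTP response to consumer instead of shelling out to curl."""
    # urlopen raises an HTTPError on non-successful fetches
    with urlopen(url) as response:
        consumer(response)    # response is a readable file-like object
```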
|
|
For variations of dedup that do not consume the data.tar member, this
option can save significant bandwidth.
|
|
* streaming means that we do not need to hold the entire package list
in memory (but the pkgs dict will become large anyway).
* The decompress utility makes it easy to switch to e.g. xz, which is
the only compression format for the dbgsym suites.
|
|
Iteration over a file-like object is required by
deb822.Packages.iter_paragraphs.
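For instance, with python-debian and an xz-compressed index (file name
illustrative):

```python
import lzma

from debian import deb822

# lzma.open in text mode yields a file-like that iterates line by line,
# which is exactly what deb822.Packages.iter_paragraphs consumes
with lzma.open("Packages.xz", "rt", encoding="utf-8") as pkgfile:
    for pkg in deb822.Packages.iter_paragraphs(pkgfile):
        print(pkg["Package"], pkg["Version"])
```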
|
|
|
|
The former behaviour was to ignore them. The intended use of dedup is
to know whether a package unconditionally requires another package.
|
|
The handle_ar_member and handle_ar_end methods now have a default
implementation that adds further handlers: handle_debversion,
handle_control_tar and handle_data_tar.
In that process two additional bugs were fixed:
* decompress_tar was wrongly passing errors="surrogateescape" for
Python 2.x even though that's only supported for Python 3.x.
* The use of decompress actually passes the extension as unicode.
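Schematically, the default dispatch might look like this; a hedged
reconstruction, not the actual code:

```python
class ArMemberDispatch:
    """Sketch: route ar members to the handlers named above."""
    def handle_ar_member(self, name, fileobj):
        if name == "debian-binary":
            self.handle_debversion(fileobj.read())
        elif name.startswith("control.tar"):
            self.handle_control_tar(fileobj)
        elif name.startswith("data.tar"):
            self.handle_data_tar(fileobj)

    # default no-op handlers, meant to be overridden by extractors
    def handle_debversion(self, version): pass
    def handle_control_tar(self, fileobj): pass
    def handle_data_tar(self, fileobj): pass
```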
|
|
The autoimport tool runs the Python interpreter explicitly. Instead of
invoking just "python" and thus calling whatever the current default is,
use sys.executable, the interpreter that runs autoimport itself, locking
both to the same Python version.
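A minimal sketch of that invocation (argument handling simplified):

```python
import subprocess
import sys

def run_importpkg(args):
    # use the interpreter that runs autoimport itself instead of
    # whatever "python" happens to resolve to on $PATH
    subprocess.check_call([sys.executable, "importpkg.py"] + list(args))
```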
|
|
In Python 2.x, TarInfo.name is a bytes object. In Python 3.x,
TarInfo.name is always a unicode object. To avoid importpkg crashing
with an exception, we direct the Python 3.x decoding to use
surrogateescape. Checking whether the name decoded cleanly then boils
down to checking whether it contains surrogates.
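The check can be sketched as follows; the helper name is made up:

```python
def has_surrogates(name):
    # surrogateescape maps undecodable bytes to U+DC80..U+DCFF, so a name
    # that decoded cleanly contains no code points in the surrogate range
    return any(0xD800 <= ord(ch) <= 0xDFFF for ch in name)

assert not has_surrogates("control")
assert has_surrogates(b"\xff".decode("utf-8", "surrogateescape"))
```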
|
|
Building on the previous commit, add a decompress function that turns a
compressed file-like into a decompressed file-like. Use it to decouple
the decompression step.
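In the spirit of this sketch (the extension table is illustrative):

```python
import gzip
import lzma

def decompress(fileobj, extension):
    """Turn a compressed file-like into a decompressed file-like."""
    if extension == ".gz":
        return gzip.GzipFile(fileobj=fileobj, mode="rb")
    if extension == ".xz":
        return lzma.LZMAFile(fileobj)   # wraps any readable file-like
    if extension == "":
        return fileobj                  # already uncompressed
    raise ValueError("unknown extension %r" % extension)
```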
|
|
It now supports:
* tell()
* seek(absolute_position), forward only
* close()
* closed
This is sufficient for putting it as a fileobj into tarfile.TarFile. By
doing so we can decouple decompression from tar processing, which eases
papering over the Python 2.x vs Python 3.x differences.
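A minimal sketch of such a wrapper, assuming a read-only, non-seekable
underlying stream (buffering details omitted):

```python
class ForwardSeekFile:
    """Wrap a non-seekable stream with just enough API for tarfile."""
    def __init__(self, fileobj):
        self.fileobj = fileobj
        self.pos = 0
        self.closed = False

    def read(self, size=-1):
        data = self.fileobj.read(size)
        self.pos += len(data)
        return data

    def tell(self):
        return self.pos

    def seek(self, position):
        if position < self.pos:
            raise ValueError("cannot seek backwards")
        while self.pos < position:   # a forward seek means reading
            if not self.read(min(65536, position - self.pos)):  # and discarding
                break

    def close(self):
        self.closed = True
        self.fileobj.close()
```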
|
|
They really are an aspect of the particular extractor and can easily be
changed by subclassing.
|
|
It is supposed to separate the parsing of Debian packages (understanding
how the format works) from the actual feature extraction. Its goal is to
simplify writing custom extractors for different feature sets.
|
|
|
|
Instead of carefully crafting an iterator to pass to yaml.safe_dump_all,
we take control ourselves and call represent on a yaml dumper object
where needed.
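With PyYAML that pattern looks roughly like this:

```python
import sys

import yaml

dumper = yaml.SafeDumper(sys.stdout)
dumper.open()                           # emit the stream start
dumper.represent({"package": "foo"})    # one document ...
dumper.represent({"package": "bar"})    # ... at a time, whenever ready
dumper.close()                          # emit the stream end
```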
|
|
|
|
|
|
Otherwise the yaml will contain binary strings on py3k, which end up as
binary data in the sqlite database. In py2, yaml can handle those
unicode objects just fine.
|
|
|
|
|
|
|
|
zlib.crc32 returns an int32_t on py2 and a uint32_t on py3.
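The usual portable idiom is masking the result, e.g.:

```python
from zlib import crc32

def crc32_unsigned(data):
    # force the same unsigned 32-bit value on both Python major versions
    return crc32(data) & 0xFFFFFFFF

assert 0 <= crc32_unsigned(b"control") <= 0xFFFFFFFF
```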
|
|
|
|
|
|
|
|
While in current sid packages the control file in control.tar is always
named "./control", some older packages name it "control".
|
|
wording, more NOT NULLs, some more explanations
|
|
Thanks to Peter Palfrader for explaining what information is needed and
reviewing the documentation.