path: root/dedup/debpkg.py
Age  Commit message  Author
2016-05-23  move dedup.debpkg.process_control back into importpkg  Helmut Grohne
After all, it isn't that generic. It knows what information is necessary for running dedup. Thus it really belongs to the extractor subclass. By building on handle_control_info, not that much parsing logic is left in the extractor subclass.
2016-05-23  DebExtractor: implement parsing of control.tar  Helmut Grohne
2016-05-05  treat Pre-Depends like regular Depends  Helmut Grohne
The former behaviour was ignoring them. The intended use for dedup is to know whenever a package unconditionally requires another package.
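A minimal sketch of what "unconditionally requires" could mean once Pre-Depends is handled like Depends. The function name `unconditional_depends`, the dict-based control stanza and the rule of skipping clauses with alternatives are assumptions made for illustration; they are not taken from dedup itself.

```python
import re

# Fields treated alike when collecting hard requirements (per the commit above).
DEPENDENCY_FIELDS = ("Depends", "Pre-Depends")


def unconditional_depends(control):
    """Return the set of package names required without alternatives.

    `control` is a plain dict mapping control field names to string values
    (a hypothetical representation, not dedup's own data structure).
    """
    required = set()
    for field in DEPENDENCY_FIELDS:
        value = control.get(field, "")
        for clause in value.split(","):
            clause = clause.strip()
            # "a | b" can be satisfied by either package, so neither one is
            # unconditionally required. Skipping such clauses is an assumption.
            if not clause or "|" in clause:
                continue
            # Keep only the package name, dropping version constraints such
            # as "(>= 1.0)" and qualifiers such as ":any".
            match = re.match(r"[a-z0-9][a-z0-9+.-]*", clause)
            if match:
                required.add(match.group(0))
    return required


control = {
    "Package": "example",
    "Pre-Depends": "dpkg (>= 1.17.6)",
    "Depends": "libc6 (>= 2.19), python3 | python, debconf",
}
print(sorted(unconditional_depends(control)))  # ['debconf', 'dpkg', 'libc6']
```

Ignoring Pre-Depends, as the former behaviour did, would drop `dpkg` from this result.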
2016-05-01  push more functionality into DebExtractor  Helmut Grohne
The handle_ar_member and handle_ar_end methods now have a default implementation adding further handlers handle_debversion, handle_control_tar and handle_data_tar.
In that process two additional bugs were fixed:
 * decompress_tar was wrongly passing errors="surrogateescape" for Python 2.x even though that's only supported for Python 3.x.
 * The use of decompress actually passes the extension as unicode.
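The handler layering described above can be sketched roughly as follows. Only the handler names come from the commit message; the class name `SketchExtractor`, all method signatures, and the helpers `iter_ar_members` and `ar_member` are hypothetical and do not reproduce the real DebExtractor.

```python
import io


def iter_ar_members(fileobj):
    """Yield (name, data) for each member of a Unix ar archive."""
    if fileobj.read(8) != b"!<arch>\n":
        raise ValueError("not an ar archive")
    while True:
        header = fileobj.read(60)
        if not header:
            return
        if len(header) != 60 or header[58:60] != b"`\n":
            raise ValueError("corrupt ar member header")
        name = header[0:16].decode("ascii").rstrip()
        if name.endswith("/"):  # GNU ar terminates names with "/"
            name = name[:-1]
        size = int(header[48:58].decode("ascii"))
        data = fileobj.read(size)
        if size % 2:  # member data is padded to an even length
            fileobj.read(1)
        yield name, data


class SketchExtractor(object):
    """Hypothetical base class: knows the .deb layout, extracts nothing."""

    def process(self, fileobj):
        for name, data in iter_ar_members(fileobj):
            self.handle_ar_member(name, data)
        self.handle_ar_end()

    def handle_ar_member(self, name, data):
        # Default implementation: dispatch on the ar member name.
        if name == "debian-binary":
            self.handle_debversion(data)
        elif name.startswith("control.tar"):
            self.handle_control_tar(name, data)
        elif name.startswith("data.tar"):
            self.handle_data_tar(name, data)

    def handle_ar_end(self):
        pass

    def handle_debversion(self, version):
        pass

    def handle_control_tar(self, name, data):
        pass

    def handle_data_tar(self, name, data):
        pass


class RecordingExtractor(SketchExtractor):
    """Example subclass that only records which handlers were called."""

    def __init__(self):
        self.events = []

    def handle_debversion(self, version):
        self.events.append(("version", version))

    def handle_control_tar(self, name, data):
        self.events.append(("control", name))

    def handle_data_tar(self, name, data):
        self.events.append(("data", name))

    def handle_ar_end(self):
        self.events.append(("end",))


def ar_member(name, data):
    """Build one ar member (header plus padded data) for the demo below."""
    header = "{:<16}{:<12}{:<6}{:<6}{:<8}{:<10}`\n".format(
        name, 0, 0, 0, "100644", len(data))
    padding = b"\n" if len(data) % 2 else b""
    return header.encode("ascii") + data + padding


archive = (b"!<arch>\n"
           + ar_member("debian-binary", b"2.0\n")
           + ar_member("control.tar.gz", b"abc")
           + ar_member("data.tar.xz", b""))
extractor = RecordingExtractor()
extractor.process(io.BytesIO(archive))
print(extractor.events)
```

A subclass overrides only the handlers it cares about, so the knowledge of the container format stays in the base class.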
2016-04-19  add a class DebExtractor for guiding feature extraction  Helmut Grohne
It is supposed to separate the parsing of Debian packages (understanding how the format works) from the actual feature extraction. Its goal is to simplify writing custom extractors for different feature sets.
2015-04-16  process_control: do not encode to ascii  Helmut Grohne
Otherwise the yaml will contain binary strings on py3k which end up as binary data in the sqlite database. In py2, yaml can handle those unicode objects just fine.
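The database side of this problem can be illustrated with a small Python 3 sketch. It skips the yaml step entirely and inserts the values directly; the table and column names are made up for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Hypothetical one-column table; the column has no declared type, so SQLite
# stores each value with the storage class it was given.
db.execute("CREATE TABLE package (name)")

db.execute("INSERT INTO package VALUES (?)", ("dedup",))                  # str
db.execute("INSERT INTO package VALUES (?)", ("dedup".encode("ascii"),))  # bytes

types = [row[0] for row in
         db.execute("SELECT typeof(name) FROM package ORDER BY rowid")]
print(types)  # ['text', 'blob']

# A lookup by text only finds the row that was stored as text.
(count,) = db.execute(
    "SELECT count(*) FROM package WHERE name = ?", ("dedup",)).fetchone()
print(count)
```

On Python 3 an ASCII-encoded value is a bytes object, so it ends up as a BLOB and no longer compares equal to the same name stored as text.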
2014-05-11  importpkg: add support for control.tar and control.tar.xz  Guillem Jover
dpkg supports those since 1.17.6.

Signed-off-by: Guillem Jover <guillem@debian.org>
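One way to accept the additional member names is to select the decompressor from the suffix after `control.tar`. This is a sketch under assumptions: `open_control_tar`, `DECOMPRESSORS` and `make_control_tar` are hypothetical names, and the code is not the importpkg implementation.

```python
import gzip
import io
import lzma
import tarfile

# Hypothetical mapping from the suffix after "control.tar" to a function
# that turns the member's bytes into an uncompressed tar stream.
DECOMPRESSORS = {
    "": lambda data: data,   # plain control.tar
    ".gz": gzip.decompress,  # control.tar.gz
    ".xz": lzma.decompress,  # control.tar.xz
}


def open_control_tar(member_name, data):
    """Return a TarFile for a control.tar* ar member given as bytes."""
    prefix = "control.tar"
    if not member_name.startswith(prefix):
        raise ValueError("not a control member: %r" % member_name)
    extension = member_name[len(prefix):]
    try:
        decompress = DECOMPRESSORS[extension]
    except KeyError:
        raise ValueError("unsupported compression: %r" % extension)
    # "r:" opens the archive strictly as an uncompressed tar.
    return tarfile.open(fileobj=io.BytesIO(decompress(data)), mode="r:")


def make_control_tar(control_text):
    """Build a tiny uncompressed control.tar in memory for the demo."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        payload = control_text.encode("utf-8")
        info = tarfile.TarInfo("./control")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


raw = make_control_tar("Package: example\n")
variants = [
    ("control.tar", raw),
    ("control.tar.gz", gzip.compress(raw)),
    ("control.tar.xz", lzma.compress(raw)),
]
results = {}
for name, blob in variants:
    with open_control_tar(name, blob) as tar:
        member = tar.getmembers()[0]
        results[name] = tar.extractfile(member).read()
    print(name, results[name])
```

All three variants yield the same control file contents; an unknown suffix raises `ValueError` instead of being guessed at.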
2013-10-03  work around python-debian's #670679  Helmut Grohne
2013-09-02  importpkg: move library-like parts to dedup.debpkg  Helmut Grohne