%% Evolution of the Debian package management system
%% Copyright 2001 Wichert Akkerman
%%
%% Presented at the OSDN-JP event in Tokyo, Japan, October 2001
%%
%% Abstract:
%%   A look into how dpkg evolved from a sh script via a perl script into
%%   its current implementation as well as a look into future developments
%%   that are being planned.
%%
%% Define the fonts we will use
%%
%deffont "standard" tfont "arial.ttf"
%deffont "typewriter" tfont "courbd.ttf"
%deffont "italic" tfont "ariali.ttf"
%%
%% Default settings for special lines
%%
%default 1 leftfill, fore "black", back "white", bimage "background.bmp"
%default 2 size 7, vgap 10, prefix " ", font "standard"
%default 3 size 2, bar "gray70", vgap 10
%default 4 size 5, vgap 30, font "standard"
%%
%% Default settings for indented lines
%%
%tab 1 size 5, vgap 40, prefix " ", icon box "green" 50
%tab 2 size 4, vgap 40, prefix " ", icon arc "red" 50
%tab 3 size 3, vgap 40, prefix " ", icon delta3 "blue" 40
%%
%%%%%%%%%%%%%%%%%%
%page
%nodefault, font "standard", fore "black", back "white", bimage "background.bmp"
%center
Evolution of the Debian package management system
%size 6
Wichert Akkerman
%size 5
wichert@deephackmode.org
%%%%%%%%%%%%%%%%%%
%page

Overview

	Package management
	The history
	New features
%%%%%%%%%%%%%%%%%%
%page

What is a package manager?

A package manager:
	keeps track of installed packages
	can install, upgrade and remove packages
	maintains system integrity
	verifies system integrity
%%%%%%%%%%%%%%%%%%
%page

Why use a packaging system?

	Easier maintenance
	No duplication of work
	Automated upgrades
%%%%%%%%%%%%%%%%%%
%page

The history of dpkg

dpkg started in 1994 as a sh script with a few C helper tools which could install, upgrade and remove packages.
All the basic features were already there:
	package database
	preinst, postinst, prerm and postrm maintainer scripts
	separate dpkg-util to handle package format
	install-info tool to manage the dir file
%%%%%%%%%%%%%%%%%%
%page

1994: adding a `debian' frontend

In 1994 work began on a frontend called `debian', later renamed to \
`dselect'. Initial versions were written in perl, but it was quickly \
rewritten in C++.

dselect provides an (ncurses) interface to the packaging system. \
It allows you to install, upgrade and remove available packages.

dselect consists of multiple parts: the dselect frontend itself, \
and dselect methods which abstract updating the list of available \
packages, installing packages and removing packages.
%%%%%%%%%%%%%%%%%%
%page

1994: new start-stop-daemon utility

At the same time as dselect the start-stop-daemon utility was added. \
This is a tool to make it easy to start and stop daemons from scripts.
%%%%%%%%%%%%%%%%%%
%page

October 1994: dpkg rewrite

dpkg.sh was replaced by a perl version.
%%%%%%%%%%%%%%%%%%
%page

April 1995: New source format

April 1995 brought a new source format and associated utilities. The \
new format made it easy to split the source into the separate upstream \
source and the changes made for Debian.

A source consists of 2 or 3 files:
	a source description (.dsc) file
	the upstream source (.tar.gz / .orig.tar.gz)
	changes made (.diff.gz)

The dsc file can be signed as an OpenPGP ASCII armored message.
%%%%%%%%%%%%%%%%%%
%page

New source format

An unpacked source has to have 3 special files that the packaging \
tools use to build packages:
	debian/changelog
	debian/control
	debian/rules
%%%%%%%%%%%%%%%%%%
%page

The debian/rules file

debian/rules is an executable that is used to compile sources, \
build packages and clean the source tree.
It has to support the following arguments:
	binary
	binary-arch
	binary-indep
	build
	clean
%%%%%%%%%%%%%%%%%%
%page

Metadata syntax

RFC822 style:
	field names and values separated by a colon (:)
	multi-line fields by starting extra lines with whitespace
	blocks separated by a blank line
%%%%%%%%%%%%%%%%%%
%page

debian/control

Contains basic information about the packages we are building.

%font "typewriter", size 3
Source: grep
Section: base
Priority: required
Maintainer: Wichert Akkerman
Standards-Version: 3.1.0

Package: grep
Architecture: any
Essential: yes
Pre-Depends: ${shlibs:Pre-Depends}
Conflicts: rgrep
Provides: rgrep
Description: GNU grep, egrep and fgrep.
 The GNU family of grep utilities may be the "fastest grep in the west".
 GNU grep is based on a fast lazy-state deterministic matcher (about
 twice as fast as stock Unix egrep) hybridized with a Boyer-Moore-Gosper
 search for a fixed string that eliminates impossible text from being
 considered by the full regexp matcher without necessarily having to
 look at every character. The result is typically many times faster
 than Unix grep or egrep. (Regular expressions containing backreferencing
 will run more slowly, however.)
%%%%%%%%%%%%%%%%%%
%page

debian/changelog

Lists all the changes made to a package. We use a special format which \
the dpkg-dev tools can parse to extract the latest changes and the \
version number.

%font "typewriter", size 3
grep (2.4.2-1) frozen unstable; urgency=low

  * New upstream release. This is only a translation-update and bugfix
    release, there is no new code (besides the bugfixes that is)
  * Fix location of GPL (lintian)
  * add -isp option to dpkg-gencontrol (lintian)
  * Fix grep-call in rgrep, Closes: Bug# 52751

 -- Wichert Akkerman  Sun,  2 Apr 2000 17:57:56 +0200
%%%%%%%%%%%%%%%%%%
%page

May 1995: dpkg rewrite

dpkg.pl was replaced with a new C rewrite. This also added support \
for virtual packages.

Virtual packages are a way for a package to indicate that \
it provides a certain kind of functionality.
This allows one to reference the functionality instead of the specific implementations.
%%%%%%%%%%%%%%%%%%
%page

Virtual packages

%center, newimage "complex-depends.eps"
%%%%%%%%%%%%%%%%%%
%page

Virtual packages

%center, newimage "virtual-depends.eps"
%%%%%%%%%%%%%%%%%%
%page

September 1995: alternatives

Alternatives are a system to keep track of multiple binaries \
providing similar functionality. Each alternative is assigned \
a priority for automatic selection, but the admin can override it.

%font "typewriter", size 3
# update-alternatives --display pager
pager - status is auto.
 link currently points to /usr/bin/less
/usr/bin/w3m - priority 25
 slave pager.1.gz: /usr/share/man/man1/w3m.1.gz
/usr/bin/less - priority 77
 slave pager.1.gz: /usr/share/man/man1/less.1.gz
/bin/more - priority 50
Current `best' version is /usr/bin/less.
%%%%%%%%%%%%%%%%%%
%page

September 1995: new package format

The binary (deb) format was updated to version 2. The new \
version is more extensible but still easily manipulated \
using traditional UNIX tools.
%%%%%%%%%%%%%%%%%%
%page

The deb package format

Our basic building block, which consists of:
	debian-binary: version information
	control: contains the package metadata
	data: contains the to-be-installed data

%center, newimage "package.eps"
%%%%%%%%%%%%%%%%%%
%page

The package control data

control gives information about the package:
	name
	version
	dependencies
	description
%%%%%%%%%%%%%%%%%%
%page

Example control file

%font "typewriter", size 3
Package: acpid
Version: 0.2000042500-1
Section: admin
Priority: extra
Architecture: i386
Depends: libc6 (>= 2.1.2)
Installed-Size: 204
Maintainer: Wichert Akkerman
Description: ACPI daemon
 Modern computers support the Advanced Configuration and Power Interface
 (ACPI) to allow intelligent power management on your system.
 .
 This package contains acpid, which is the user-space daemon needed in order
 to make the Linux ACPI support completely functional.
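%%%%%%%%%%%%%%%%%%
%page

Sketch: reading control data

The RFC822-style syntax used by control files is simple enough to read \
in a few lines of code. The Python sketch below is an illustration only, \
not dpkg code: the function name parse_control is made up for this \
example, and it assumes blocks are separated by blank lines and that \
continuation lines start with whitespace.

%font "typewriter", size 3
```python
def parse_control(text):
    """Parse RFC822-style control data into a list of dicts, one per block.

    Hypothetical helper for illustration; not part of dpkg.
    """
    blocks, current, last_field = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current block.
            if current:
                blocks.append(current)
            current, last_field = {}, None
        elif line[0] in " \t":
            # A line starting with whitespace continues the previous field.
            if last_field is None:
                raise ValueError("continuation line without a field")
            current[last_field] += "\n" + line[1:]
        else:
            # "Name: value"; only the first colon separates name and value.
            name, sep, value = line.partition(":")
            if not sep:
                raise ValueError("missing colon: %r" % line)
            last_field = name.strip()
            current[last_field] = value.strip()
    if current:
        blocks.append(current)
    return blocks
```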
%%%%%%%%%%%%%%%%%%
%page

Maintainer scripts

Currently there are 5 possible scripts:
	preinst
	postinst
	prerm
	postrm
	config

As well as 3 data files:
	conffiles
	shlibs
	template
%%%%%%%%%%%%%%%%%%
%page

January 1996: epochs

On occasion people do strange things with version numbers. For \
example vim version numbering:
	alpha releases: 6.0a, 6.0b, .. 6.0z, 6.0aa .. 6.0ao
	beta releases: 6.0ap .. 6.0aw, 6.0aw.001 .. 6.0aw.006
	final releases: 6.0, 6.0.001, ...

Epochs provide a way to deal with versioning mistakes or strange \
versioning by adding an extra part to a version number.
	1.0 is equal to 0:1.0
	1: is always greater than 0:
%%%%%%%%%%%%%%%%%%
%page

August 1996: dpkg-shlibdeps

When building a package the libraries used have to be inserted in the \
dependency list in the package metadata. This can be done automatically \
using dpkg-shlibdeps:
	each library package provides a .shlibs file listing the libraries it provides and associated dependency information
	dpkg-shlibdeps will scan the shlibs files and build a dependency list

%font "typewriter", size 3
libm 6 libc6 (>= 2.2.4-2)
libc 6 libc6 (>= 2.2.4-2)
libdb 2
libpython2.0 0.0 python2-base (>= 2.0-1)
%%%%%%%%%%%%%%%%%%
%page

March 1997: start-stop-daemon rewrite

start-stop-daemon was rewritten in C.
%%%%%%%%%%%%%%%%%%
%page

March 1998: apt released

dpkg is a low-level way of dealing with packages; for non-trivial \
tasks more intelligence is required. This is provided by apt, which \
gives us:
	high-level logic for dealing with package relations
	handling for complex installations and upgrades
	database with lists of available packages
	ability to download packages if needed
%%%%%%%%%%%%%%%%%%
%page

apt methods

Stand-alone tools which libapt can use to download files using a \
specific protocol.
Currently supported are:
	cdrom
	copy
	file
	ftp
	http
%%%%%%%%%%%%%%%%%%
%page

Example apt configuration

%font "typewriter", size 3
# The morsnet local mirror
deb file:/morsnet/wichert/debian/archive potato mirror local
deb file:/morsnet/wichert/debian/archive woody mirror local

# Dutch mirror
deb http://ftp.nl.debian.org/debian potato main contrib non-free
deb-src http://ftp.nl.debian.org/debian potato main contrib non-free
deb http://ftp.nl.debian.org/debian woody main contrib non-free
deb-src http://ftp.nl.debian.org/debian woody main contrib non-free

## Non-US
deb http://non-us.debian.org/debian-non-US potato/non-US main contrib non-free
deb-src http://non-us.debian.org/debian-non-US potato/non-US main contrib non-free
deb http://non-us.debian.org/debian-non-US woody/non-US main contrib non-free
deb-src http://non-us.debian.org/debian-non-US woody/non-US main contrib non-free

## Security updates
deb http://security.debian.org/ potato/updates main contrib non-free
deb-src http://security.debian.org/ potato/updates main contrib non-free
%%%%%%%%%%%%%%%%%%
%page

July 1999: dpkg-architecture

dpkg-architecture was added to help with handling more architectures \
and operating systems. Packages can use this to determine the details \
of both the system packages are built on and the system they are built \
for.
%%%%%%%%%%%%%%%%%%
%page

January 2000: debsign

debsign is a mechanism to enforce security policies on packages. \
It allows you to create a set of policies which must be met \
by a package in order for dpkg to install it.

A policy is a combination of:
	selection criteria to see if the policy should be used
	verification rules
%%%%%%%%%%%%%%%%%%
%page

May 2000: statoverrides

Statoverride is a mechanism to override the ownership and \
permissions of files in packages. An override is registered using the \
dpkg-statoverride tool, and dpkg uses the registered overrides \
directly when unpacking a package.
%%%%%%%%%%%%%%%%%%
%page

New features

Development is still ongoing and new features are being planned \
and implemented:
	Enormous version numbers
	Package signing
	Shared filesystem space
	Database redesign
	New source format
	The Great dpkg-dev rewrite
	dpkg-lsb
%%%%%%%%%%%%%%%%%%
%page

Version numbering changes

Two changes are being made in the handling of version numbers:
	a new ~ character can be used in version numbers which sorts lower than all other characters
	no limit on numbers anymore
%%%%%%%%%%%%%%%%%%
%page

Shared filesystem space

In cluster situations it is common that part of the filesystem \
is shared amongst machines and only one machine can write to \
it. A possible solution to this would be:
	tell dpkg which filesystem parts are shared
	when upgrading a package that uses those, compare md5sums of current and new file
	on mismatch, abort and tell the admin to upgrade the central machine first
%%%%%%%%%%%%%%%%%%
%page

Database redesign

The current database has a few problems that need to be fixed:
	reading a large text file is slow, a cache would be nice
	dpkg should not keep available data
	internationalization support
	status has more info than generally needed
	stat and checksum data for files is not stored

No new design has been made yet; this is waiting on a restructuring \
of related core dpkg code.
%%%%%%%%%%%%%%%%%%
%page

The Great dpkg-dev rewrite

The collection of scripts that form dpkg-dev will be rewritten \
in python. This will give us a couple of advantages:
	cleaner, more maintainable code
	better abstractions
	opportunity to reconsider some old design decisions

dpkg-shlibdeps has already been rewritten, other tools will follow.
%%%%%%%%%%%%%%%%%%
%page

New source format

The current source format has a few shortcomings:
	cannot handle multiple patches
	cannot handle multiple tar archives
	only supports gzip compression

A new design and partial implementation have been made but need to be reviewed and tested.
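%%%%%%%%%%%%%%%%%%
%page

Sketch: comparing version numbers

The version rules mentioned in this talk (epochs, the new ~ character \
and numbers without a size limit) can be combined into one comparison. \
The Python sketch below is an illustration of those rules only, not the \
dpkg implementation: the names split_epoch and compare_versions are made \
up, the ordering of characters other than ~ is an assumption, and the \
split into upstream version and Debian revision is left out.

%font "typewriter", size 3
```python
import re

DIGITS = "0123456789"

def split_epoch(version):
    """Return (epoch, rest); a version without an epoch has epoch 0."""
    epoch, sep, rest = version.partition(":")
    if sep and epoch and all(c in DIGITS for c in epoch):
        return int(epoch), rest
    return 0, version

def _order(char):
    # "~" sorts lower than everything, including the end of the string (0).
    # Sorting letters before other characters is an assumption of this sketch.
    if char == "~":
        return -1
    return ord(char) if char.isalpha() else ord(char) + 256

def _compare_rest(a, b):
    while a or b:
        # Compare the leading non-digit runs character by character.
        while (a and a[0] not in DIGITS) or (b and b[0] not in DIGITS):
            ac = _order(a[0]) if a and a[0] not in DIGITS else 0
            bc = _order(b[0]) if b and b[0] not in DIGITS else 0
            if ac != bc:
                return -1 if ac < bc else 1
            a, b = a[1:], b[1:]
        # Compare the leading digit runs as integers of any size.
        ma, mb = re.match(r"[0-9]*", a), re.match(r"[0-9]*", b)
        na, nb = int(ma.group() or "0"), int(mb.group() or "0")
        if na != nb:
            return -1 if na < nb else 1
        a, b = a[ma.end():], b[mb.end():]
    return 0

def compare_versions(x, y):
    """Return -1, 0 or 1 if x sorts before, equal to or after y."""
    (ex, rx), (ey, ry) = split_epoch(x), split_epoch(y)
    if ex != ey:
        # A higher epoch always wins, whatever follows it.
        return -1 if ex < ey else 1
    return _compare_rest(rx, ry)

# compare_versions("1.0", "0:1.0") returns 0
# compare_versions("6.0", "6.0aa") returns -1, compare_versions("1:6.0", "6.0aa") returns 1
```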
%%%%%%%%%%%%%%%%%%
%page

dpkg-lsb

LSB 1.0 documents an LSB package format, which is a subset of the \
rpm version 3 format. To become LSB compliant, dpkg can be \
extended by adding a new dpkg-lsb tool.

Albert de Haan already has a working beta version of this.
%%%%%%%%%%%%%%%%%%
%page

Conclusions

Even though the current dpkg implementation is over 6 years old \
now, it still works very well and new features are being added on \
a regular basis.

However, some larger changes need a code cleanup first, which will \
be the next big project.
%%%%%%%%%%%%%%%%%%
%page

The end