Arch Linux Dev Blog

Specifications

David Runge

In October 2024, a team of dedicated developers started work on the ALPM project. Since then, it has focused on writing new documentation on many aspects of Arch Linux Package Management that were not thoroughly documented in the past. This article provides an overview of the specifications written by this project and attempts to contextualize them for the reader.

The existing stack 📚

With its Bash-based makepkg tool for package creation, the libalpm C library for interfacing with system state, and the central pacman package management tool, the pacman project has defined the foundation of package management on Arch Linux for the past 20 years. Over the years, several adjacent projects have emerged that provide functionality beyond the scope of the pacman project:

  • namcap: PKGBUILD and package file linting.
  • dbscripts: Binary package repository management used by Arch Linux to manage the official repositories.
  • devtools: A set of scripts and configuration files that also encompass Arch Linux's canonical package build tool pkgctl which wraps makepkg and performs builds in clean chroot environments with the help of systemd-nspawn.

Each Linux distribution has a similar stack of tools that allows for the creation of package files from some form of input, the management of binary package repositories, and the installation and management of those packages on end-user systems. However, many of these tools are not used by end-users unless they maintain their own package build scripts and binary package repositories.

On distribution documentation 🔍

The documentation of a distribution is key to its success, as it provides its members with access to details on tools, file formats and the overarching concepts. While Arch Linux's ArchWiki is a great resource for using the distribution, it lacks detailed information on developing it, as well as the concepts governing the existing tech stack.

Arguably, a wiki is not the best place for documentation of this sort, a point also made in RFC0021: documentation on the operational side of Arch Linux is better served in a separate, dedicated place.

Similarly, central documentation on common file types, data types and concepts used in the package management stack is an important cornerstone for a shared and broad understanding of the technology. It helps package maintainers, application developers and end-users alike to collaborate on and improve the existing set of tools.

Falling through the cracks 🕳️

The projects in the existing stack document most of their own functionality and use-cases for end-users. However, once strict validation of the artifacts passed between the various building blocks in the ecosystem was considered, it became clear that large areas of these projects are underspecified and only loosely follow an overarching design. For example, APIs or file formats not considered public or important enough for a dedicated specification by one project may be integral to the safe use of another project consuming its output.

The ALPM project follows in a long tradition of tools in the Arch Linux package management ecosystem, while focusing more strongly on modularity and validation. Early on, it became clear that an extensive documentation effort would be needed as the foundation of its granular design.

ALPM specifications 📜

Based mainly on black-box tests with the existing tooling, as well as input from longtime package maintainers and developers, a growing set of specifications has been written by the ALPM project.

Currently, the documentation is split between information on file formats and concepts. Some specifications already exist in multiple versions, which document different revisions of a format that changed over the past years.

For local access to all specifications in the form of man pages, install the alpm package group.

pacman -Su alpm

Concepts 📝

File formats 📄

  • SRCINFO: The format of .SRCINFO files found in the package source repositories of all official packages as well as all AUR source repositories. It provides metadata about the sources and packages defined in an enclosed PKGBUILD file while not requiring Bash to access this data.
  • ALPM-MTREE: The format of .MTREE files found in all package files. This file format exists in multiple versions (ALPM-MTREEv1 and ALPM-MTREEv2) and describes all files contained in a package file.
  • BUILDINFO: The format of .BUILDINFO files found in all package files. This file format exists in multiple versions (BUILDINFOv1 and BUILDINFOv2) and describes the environment used to build a package file.
  • PKGINFO: The format of .PKGINFO files found in all package files. This file format exists in multiple versions (PKGINFOv1 and PKGINFOv2) and describes the metadata of a package file.
  • alpm-install-scriptlet: The format of an .INSTALL file found in some package files. This script file is used to run custom commands around the installation, upgrade or uninstallation of a package.
  • alpm-repo-desc: The format of desc files found in repository sync databases. This file format exists in multiple versions (alpm-repo-descv1 and alpm-repo-descv2) and describes the state of a single package in a binary package repository.
  • alpm-db-desc: The format of desc files found in local libalpm databases. This file format exists in multiple versions (alpm-db-descv1 and alpm-db-descv2) and describes the state of a single package on a given system.
  • alpm-files: The format of files files found in local libalpm and repository sync databases. Depending on context, the file format may be referred to as alpm-db-files or alpm-repo-files, respectively.
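
To give a feel for what reading one of these formats involves, here is a minimal sketch in Python. It assumes the simple `key = value` line layout commonly seen in .PKGINFO files, with `#` comment lines and repeatable keys such as `depend`. The function name `parse_pkginfo` and the sample data are hypothetical, and the sketch is neither the ALPM project's implementation nor a substitute for the PKGINFO specification, which defines the allowed keys and value types.

```python
from collections import defaultdict


def parse_pkginfo(text: str) -> dict[str, list[str]]:
    """Parse PKGINFO-style 'key = value' lines into a dict of lists.

    Simplified sketch: no validation of key names or value types.
    """
    fields: dict[str, list[str]] = defaultdict(list)
    for line_number, line in enumerate(text.splitlines(), start=1):
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        key, separator, value = line.partition(" = ")
        if not separator or not key:
            raise ValueError(f"line {line_number}: expected 'key = value'")
        # Keys such as 'depend' may repeat, so every key maps to a list.
        fields[key].append(value)
    return dict(fields)


# Hypothetical example data, not taken from a real package.
SAMPLE = """\
# Example metadata for illustration only
pkgname = example
pkgver = 1.0.0-1
arch = x86_64
depend = glibc
depend = zlib
"""

info = parse_pkginfo(SAMPLE)
print(info["pkgname"])
print(info["depend"])
```

Mapping every key to a list keeps repeated keys intact without special cases. A stricter parser would additionally check each key and value against the versioned specification.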

In the works 🚧

Further specification documents are planned to describe repository sync databases and a new format for the handling of binary repository state in the future.

The documents are usually accompanied by dedicated parser and writer implementations, which are validated against real data to ensure their correctness (or to find bugs in existing tooling and data).
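
One common way to check a parser and writer pair against data is a round-trip test: parse a file, write it back out, and compare the result with the input. The following Python sketch illustrates the idea for a desc-style layout, assuming sections introduced by `%NAME%`-style headers, one value per line, and a blank line after each section. The functions `parse_desc` and `write_desc` and the sample data are hypothetical and are not taken from the ALPM project.

```python
def parse_desc(text: str) -> dict[str, list[str]]:
    """Parse '%SECTION%' headers followed by value lines (simplified)."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        if len(line) > 2 and line.startswith("%") and line.endswith("%"):
            # A header line opens a new section.
            current = line[1:-1]
            sections[current] = []
        elif line == "":
            # A blank line closes the current section.
            current = None
        elif current is None:
            raise ValueError(f"value outside of a section: {line!r}")
        else:
            sections[current].append(line)
    return sections


def write_desc(sections: dict[str, list[str]]) -> str:
    """Write sections back out, each followed by one blank line."""
    parts = []
    for name, values in sections.items():
        parts.append(f"%{name}%\n")
        parts.extend(f"{value}\n" for value in values)
        parts.append("\n")
    return "".join(parts)


# Hypothetical example data, not taken from a real database.
SAMPLE = (
    "%NAME%\nexample\n\n"
    "%VERSION%\n1.0.0-1\n\n"
    "%DEPENDS%\nglibc\nzlib\n\n"
)

parsed = parse_desc(SAMPLE)
# Round trip: writing the parsed data must reproduce the input exactly.
assert write_desc(parsed) == SAMPLE
print(parsed["DEPENDS"])
```

Running such a check over many real files is one way for mismatches to surface, whether they come from the new implementation or from the data itself.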

If this article sparked your interest, consider contributing to the ALPM project!