[$] A new swap abstraction layer for the kernel

Swapping may be a memory-management technique at its core, but its implementation also involves the kernel's filesystem and storage layers. So it is not surprising that a session on the kernel's swap abstraction layer, led by Chris Li at the 2024 Linux Storage, Filesystem, Memory-Management and BPF Summit, was held jointly by all three of those tracks. Li has some ambitious ideas for an improved subsystem, but getting to a workable implementation may not be easy.

[$] The twilight of the version-1 memory controller

Almost immediately after the merging of control groups, kernel developers set their sights on reimplementing them properly. The second version of the control-group API started trickling into the kernel around the 3.16 release in 2014 and users have long since been encouraged to migrate, but support for the initial API, and users of it, remain. At the 2024 Linux Storage, Filesystem, Memory-Management and BPF Summit, memory-management developers discussed whether (and when) it might be possible to remove the version-1 memory controller. The session was led by Shakeel Butt and (participating remotely) Roman Gushchin.

[$] The path to deprecating SPARSEMEM

The term "memory model" is used in a couple of ways within the kernel. Perhaps the more obscure meaning is the memory-management subsystem's view of how physical memory is organized on a given system. A proper representation of physical memory will be more efficient in terms of memory and CPU use. Since hardware comes in numerous variations, the kernel supports a number of memory models to match; see this article for details. At the 2024 Linux Storage, Filesystem, Memory-Management and BPF Summit, Oscar Salvador, presenting remotely, made the case for removing one of those models.

[$] Documenting page flags by committee

For every page of memory in the system, the kernel maintains a set of page flags describing how the page is used and various aspects of its current state. Space for page flags has been in chronic short supply, leading to a desire to eliminate or consolidate them whenever possible. That objective, though, is hampered by the fact that the purpose of many page flags is not well understood. In a memory-management-track session at the 2024 Linux Storage, Filesystem, Memory-Management and BPF Summit, Matthew Wilcox set out to cooperatively update the page-flag documentation to improve that situation.

[$] Merging msharefs

The problem of sharing page tables across processes has been discussed numerous times over the years, Khalid Aziz said at the beginning of his 2024 Linux Storage, Filesystem, Memory-Management and BPF Summit session on the topic. He was there to, once again, talk about the proposed mshare() system call (which, in its current form, is no longer actually a system call but the feature still goes by that name) and to see what can be done to finally get it into the mainline.

[$] Toward the unification of hugetlbfs

The kernel's hugetlbfs subsystem was the first mechanism by which the kernel made huge pages available to user space; it was added to the 2.5.46 development kernel in 2002. While hugetlbfs remains useful, it is also viewed as a sort of second memory-management subsystem that would be best unified with the rest of the kernel. At the 2024 Linux Storage, Filesystem, Memory-Management and BPF Summit, Peter Xu raised the question of what that unification would involve and what the first steps might be.

[$] The interaction between memory reclaim and RCU

The 2024 Linux Storage, Filesystem, Memory-Management and BPF Summit was a development conference, where discussion was prioritized and presentations with a lot of slides were discouraged. Paul McKenney seemingly flouted this convention in a joint session of the storage, filesystem, and memory-management tracks where he presented about 50 slides — in five minutes, twice. The subject was the use of the read-copy-update (RCU) mechanism in the memory-reclaim process, and whether changes to RCU would be needed for that purpose.

[$] Faster page faults with RCU-protected VMA walks

In multi-threaded processes, looking up a virtual memory area (VMA) in the address space, whether to handle a page fault or for any of a number of other tasks, has long been bedeviled by lock contention in the kernel. As a result, developer gatherings have been subjected to many sessions on how to improve the situation. At the 2024 Linux Storage, Filesystem, Memory-Management and BPF Summit, developers in the memory-management track met, in a session led by Liam Howlett, to talk about a situation that has improved considerably in recent times, but which still offers opportunities for optimization.

[$] Another try for address-space isolation

Brendan Jackman started his memory-management-track session at the 2024 Linux Storage, Filesystem, Memory-Management and BPF Summit by saying that, for some years now, the kernel community has been stuck in a reactive posture with regard to hardware vulnerabilities. Each problem shows up with its own scary name, and kernel developers find a way to mitigate it, usually losing performance in the process. Jackman said that it is time to take back the initiative against these vulnerabilities by reconsidering the more general use of address-space isolation.

[$] Memory-allocation profiling for the kernel

Optimizing the kernel's memory use is made much easier if developers have an accurate idea of how memory is being used, but the kernel's instrumentation is not as good as it could be. When Suren Baghdasaryan and Kent Overstreet presented their memory-allocation profiling work, which is meant to address this shortcoming, at the 2023 Linux Storage, Filesystem, Memory Management, and BPF Summit, their objective was uncontroversial but the proposed solution ran into opposition that played out at length on the mailing lists over the last year. So it may be a bit surprising that, when the two returned to the memory-management track at the 2024 gathering, the controversy was gone and the discussion focused on improving details of the implementation.

[$] Dynamically sizing the kernel stack

The kernel stack is a scarce and tightly constrained resource; kernel developers often have to go far out of their way to avoid using too much stack space. The size of the stack is also fixed, leading to situations where it is too small for some code paths, while wastefully large for others. At the 2024 Linux Storage, Filesystem, Memory Management, and BPF Summit, Pasha Tatashin proposed making the kernel stack size dynamic, making more space available when needed while saving memory overall. This change is not as easy to implement as it might seem, though.

[$] Facing down mapcount madness

The page structure is a complicated beast, but some parts of it are more intimidating than others. The mapcount field is one of the scarier parts. It allegedly records the number of references to the page in page tables, but, as David Hildenbrand described during the memory-management track at the 2024 Linux Storage, Filesystem, Memory Management, and BPF Summit, things are more complicated than that. Few people truly understand the semantics of this field, but the situation will hopefully get better over time.

Security updates for Tuesday

Security updates have been issued by AlmaLinux (firefox, nodejs, and thunderbird), Fedora (uriparser), Oracle (firefox and thunderbird), Slackware (mariadb), SUSE (cairo, gdk-pixbuf, krb5, libosinfo, postgresql14, and python310), and Ubuntu (firefox, linux-aws, linux-aws-5.15, and linux-azure).

[$] What's next for the SLUB allocator

There are two fundamental levels of memory allocator in the Linux kernel: the page allocator, which allocates memory in units of pages, and the slab allocator, which allocates arbitrarily-sized chunks that are usually (but not necessarily) smaller than a page. The slab allocator is the one that stands behind commonly used kernel functions like kmalloc(). At the 2024 Linux Storage, Filesystem, Memory Management, and BPF Summit, slab maintainer Vlastimil Babka provided an update on recent changes at the slab level and discussed the changes that are yet to come.

[$] Better support for locally-attached-memory tiering

The term "memory tiering" refers to the management of memory placement on systems with multiple types of memory, each of which has its own performance characteristics. On such systems, poor placement can lead to significantly worse performance. A memory-management-track discussion at the 2024 Linux Storage, Filesystem, Memory Management, and BPF Summit took yet another look at tiering challenges with a focus on upcoming technologies that may simplify (or complicate) the picture.

Axboe: What's new with io_uring in 6.10

Jens Axboe describes the new io_uring features that will be a part of the 6.10 kernel release.

Bundles are multiple buffers used in a single operation. On the receive side, this means a single receive may utilize multiple buffers, reducing the roundtrip through the networking stack from N per N buffers to just a single one. On the send side, this also enables better handling of how an application deals with sends from a socket, eliminating the need to serialize sends on a single socket. Bundles work with provided buffers, hence this feature also adds support for provided buffers for send operations.

Security updates for Monday

Security updates have been issued by Debian (bind9, chromium, and thunderbird), Fedora (buildah, chromium, firefox, mingw-python-werkzeug, and suricata), Mageia (golang), Oracle (firefox and nodejs:20), Red Hat (firefox, httpd:2.4, nodejs, and thunderbird), and SUSE (firefox, git-cliff, and ucode-intel).