NVIDIA Engineer Talks Up sched_ext Linux Scheduler Possibilities At FOSDEM

Merged last year for the Linux 6.12 kernel, sched_ext lets schedulers be implemented as eBPF code and dynamically loaded into the kernel. This allows for rapidly developing new schedulers as well as exploring more intelligent kernel scheduling decisions. Meta, Google, Canonical (Ubuntu), and others have been big proponents of sched_ext, and NVIDIA is also increasingly vocal in its support for these extensible scheduler opportunities...

AMD Broadcast TLB Invalidation Patches For Linux Updated, Intel RAR Eyed Next

One patch set for the Linux kernel that we have been looking forward to, but that wasn't wrapped up in time for the recent Linux v6.14 merge window, is the work enabling use of the AMD INVLPGB instruction on Zen 3 CPUs and newer for broadcast TLB invalidation. This can lead to a nice performance bump in some workloads, and the eighth iteration of those patches was posted overnight...

FFmpeg Adds AMD AMF Decoder, FSR-Based Upscaling

Landing this week in FFmpeg, the open-source library widely used by multimedia applications, were NVIDIA video acceleration improvements for Blackwell GPUs. On the AMD side, some interesting changes were also merged this week into upstream FFmpeg...

Optimizing The Linux Kernel With PGO Can Yield ~3% Benefit For HPC Workloads

While the Linux kernel itself may not often be viewed as a bottleneck for typical high-performance computing (HPC) workloads, optimizing the kernel with Profile Guided Optimizations (PGO) can prove worthwhile for those seeking maximum performance potential. A presentation this past weekend at FOSDEM 2025 highlighted around a 3% performance gain for HPC workloads with PGO enabled...

Linux 6.15 Looks Like It May Try Again With EXECMEM_ROX Support

EXECMEM_ROX support for module text on x86_64 systems was initially merged for the Linux 6.13 kernel. By caching large ROX pages, it can help lower instruction TLB pressure and enhance performance. But this EXECMEM_ROX support, contributed by a Microsoft engineer, ended up being reverted in the final days of Linux 6.13. The revert came due to bugs and the lack of sign-off on the code from any Linux x86 maintainers. The code has since been getting into shape for another try at the mainline kernel...