The U.S. Cybersecurity and Infrastructure Security Agency (CISA) has published research examining whether 172 key open-source projects are susceptible to memory flaws.
The report, cosigned by CISA, the Federal Bureau of Investigation (FBI), and Australian (ASD, ACSC) and Canadian (CCCS) organizations, is a follow-up to the 'Case for Memory Safe Roadmaps' released in December 2023, aimed at raising awareness about the importance of memory-safe code.
Memory safety
Memory-safe languages are programming languages designed to prevent common memory-related errors such as buffer overflows, use-after-free, and other types of memory corruption.
They achieve this by managing memory automatically instead of relying on the programmer to implement safe memory allocation and deallocation mechanisms.
A modern example of such a safety mechanism is Rust's borrow checker, which enforces ownership rules at compile time and rules out data races in safe code. Other languages, such as Go, Java, C#, and Python, manage memory through garbage collection, automatically reclaiming memory that is no longer referenced, which prevents the use-after-free and double-free bugs that attackers exploit.
Memory-unsafe languages are those that do not provide built-in memory management mechanisms, burdening the developer with this responsibility and increasing the likelihood of errors. Examples include C, C++, Objective-C, Assembly, Cython, and D.
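To make the distinction concrete, here is a minimal sketch in Python, one of the memory-safe languages named above. The function name `read_item` and the sample values are invented for illustration. The point is that the runtime checks every index and raises an error for an access past the end of the list, whereas the equivalent read in C would have undefined behavior and could return whatever happens to sit in adjacent memory.

```python
from typing import Optional


def read_item(buffer: list, index: int) -> Optional[int]:
    """Return buffer[index], or None if the index is past the end of the list."""
    try:
        return buffer[index]
    except IndexError:
        # The runtime refused the access; no neighbouring memory was read.
        return None


buffer = [10, 20, 30, 40]
print(read_item(buffer, 2))  # index is in range
print(read_item(buffer, 4))  # one past the end: rejected by the runtime
```

Note that Python treats negative indices as counting from the end of the list, so the sketch only demonstrates the check for indices beyond the last element.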
Widely used open-source code unsafe
The report presents research examining 172 broadly deployed open-source projects, finding that over half contain memory-unsafe code.
Key findings presented in the report are summarized as follows:
- 52% of critical open-source projects analyzed contain code written in memory-unsafe languages.
- 55% of the total lines of code (LoC) across these projects are written in memory-unsafe languages.
- The largest projects are disproportionately written in memory-unsafe languages.
- Of the ten largest projects by total LoC, each has a proportion of memory-unsafe LoC above 26%.
- The median proportion of memory-unsafe LoC in these large projects is 62.5%, with four projects exceeding 94%.
- Even projects written in memory-safe languages often depend on components written in memory-unsafe languages.
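The percentages above are proportions of lines of code by language. As a rough sketch of that arithmetic, and not a reproduction of the report's actual methodology, the following Python snippet computes the memory-unsafe share for each project and the median across projects. The project names and line counts are invented, and the set of memory-unsafe languages is taken from the list given earlier in this article.

```python
from statistics import median

# Languages treated as memory-unsafe, per the list earlier in this article.
UNSAFE_LANGUAGES = {"C", "C++", "Objective-C", "Assembly", "Cython", "D"}


def unsafe_share(loc_by_language: dict) -> float:
    """Return the fraction of a project's lines written in memory-unsafe languages."""
    total = sum(loc_by_language.values())
    if total == 0:
        return 0.0
    unsafe = sum(
        loc for lang, loc in loc_by_language.items() if lang in UNSAFE_LANGUAGES
    )
    return unsafe / total


# Hypothetical projects with invented line counts, for illustration only.
projects = {
    "project_a": {"C": 900, "Python": 100},
    "project_b": {"Go": 500, "C++": 500},
    "project_c": {"Java": 800, "Rust": 200},
}

shares = {name: unsafe_share(loc) for name, loc in projects.items()}
with_unsafe_code = sum(1 for share in shares.values() if share > 0)

print(shares)
print(f"projects containing memory-unsafe code: {with_unsafe_code} of {len(projects)}")
print(f"median unsafe share: {median(shares.values()):.1%}")
```

A per-project share and a pooled share across all projects can differ considerably, because a few very large codebases dominate the pooled line count.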
Some notable examples from the examined set, with the share of their code that is memory-unsafe, are Linux (95%), Tor (93%), Chromium (51%), MySQL Server (84%), glibc (85%), Redis (85%), systemd (65%), and Electron (47%).
CISA explains that software developers face multiple challenges that often oblige them to use memory-unsafe languages, such as resource constraints and performance requirements.
That is especially true when implementing low-level functionalities like networking, cryptography, and operating system functions.
"We observed that many critical open source projects are partially written in memory-unsafe languages and limited dependency analysis indicates that projects inherit code written in memory-unsafe languages through dependencies," explains CISA in the report.
"Where performance and resource constraints are critical factors, we have seen, and expect the continued use of, memory-unsafe languages."
The agency also highlights the problem of developers disabling memory-safety features, either by error or on purpose, to meet specific requirements, resulting in risks even when using theoretically safer building blocks.
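One familiar instance of this is Rust's `unsafe` keyword, which lets code opt out of some compiler checks. The sketch below is an assumption about how a team might start auditing such opt-outs, not something the report prescribes: a deliberately crude Python scan that flags lines of Rust source mentioning `unsafe`. The helper name `find_unsafe_lines` and the sample snippet are hypothetical, and a plain text match like this will misfire on strings and block comments.

```python
import re

# Matches the Rust keyword `unsafe` as a whole word.
UNSAFE_KEYWORD = re.compile(r"\bunsafe\b")


def find_unsafe_lines(source: str) -> list:
    """Return (line_number, text) pairs for lines using `unsafe` outside // comments."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        code = line.split("//", 1)[0]  # crude: ignore anything after a line comment
        if UNSAFE_KEYWORD.search(code):
            hits.append((number, line.strip()))
    return hits


# Hypothetical Rust snippet used only as input text; it is not compiled here.
SAMPLE = """\
fn first_byte(data: &[u8]) -> u8 {
    // unsafe would skip the bounds check here
    unsafe { *data.get_unchecked(0) }
}

fn first_byte_checked(data: &[u8]) -> Option<u8> {
    data.first().copied()
}
"""

for number, text in find_unsafe_lines(SAMPLE):
    print(f"line {number}: {text}")
```

A scan like this only locates candidates for review. Deciding whether each use is justified still requires a human reading the surrounding code.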
Ultimately, CISA recommends that software developers write new code in memory-safe languages such as Rust, Java, and Go, and transition existing projects, especially critical components, to those languages.
In addition, the report recommends following safe coding practices, carefully managing and auditing dependencies, and performing continuous testing, including static analysis, dynamic analysis, and fuzz testing, to detect and address memory safety issues.
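For readers unfamiliar with fuzz testing, the idea can be sketched in a few lines of Python. Everything here is hypothetical: `parse_record` is a toy parser invented for the example, and the `fuzz` loop simply throws random bytes at it. Production fuzzers are far more sophisticated, typically using code coverage to guide input generation, and for memory-unsafe code they are usually paired with tooling that detects invalid memory accesses.

```python
import random


def parse_record(data: bytes) -> bytes:
    """Parse a toy record: one length byte followed by that many payload bytes."""
    if len(data) < 1:
        raise ValueError("empty input")
    length = data[0]
    payload = data[1:1 + length]
    if len(payload) != length:
        raise ValueError("declared length exceeds available data")
    return payload


def fuzz(target, iterations: int = 1000, seed: int = 0) -> int:
    """Feed random byte strings to target; return how many it rejected cleanly."""
    rng = random.Random(seed)  # fixed seed so any failure can be reproduced
    rejected = 0
    for _ in range(iterations):
        size = rng.randrange(0, 16)
        data = bytes(rng.randrange(256) for _ in range(size))
        try:
            target(data)
        except ValueError:
            rejected += 1  # expected, controlled rejection of malformed input
        # Any other exception propagates and stops the run, flagging a bug.
    return rejected


rejected = fuzz(parse_record)
print(f"{rejected} of 1000 random inputs were rejected without crashing")
```

The check that matters is the one the parser performs on the declared length. In a memory-unsafe language, omitting it is exactly the kind of mistake that leads to an out-of-bounds read.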