[Paper Reading USENIX 2019] Less is More: Quantifying the Security Benefits of Debloating Web Applications

Introduction#

The paper discussed here (abbreviated LIM, for Less is More) is the predecessor of a PHP debloating paper presented at this year's USENIX.

profiles: configuration, scenarios; profiling: analysis (performance analysis, behavior analysis, etc.); profiler: analyzer

bloat: code and features that inflate an application beyond what its users need; debloat: removing that excess code

ambient authority: Ambient authority is a term from access control research. A subject is said to be using ambient authority when it only needs to specify the name of the object it wants and the operation to perform on it for a permitted action to succeed.

monkey testing: Monkey testing is a technique where users test applications or systems by providing random inputs and checking behaviors or whether the application or system crashes. Monkey testing is often implemented as random automated unit testing.

Overview of paper#

The paper uses dynamic analysis to collect code coverage for PHP web applications, then removes the code that was never executed in order to debloat them.

In summary, it consists of the following parts:

Selecting CVE vulnerabilities and mapping them to web apps.
Selecting four different user groups and simulating app usage.
Recording code coverage and analyzing unused files/functions.
Performing debloating based on coverage and obtaining a debloated app (mainly file-level debloating and function-level debloating).
Simulating normal usage of the debloated app to evaluate whether functionality remains intact.
Running known CVE exploits against both the original app and the debloated app to assess how effective debloating is (i.e., whether it removed the code responsible for the vulnerabilities), along with other evaluations and comparisons.

Figure 1 of the paper was shown here; note that it contains the typo "Expoits."

Background#

The principle of software debloating has been successfully applied to operating systems (removing unnecessary code from the Linux kernel), shared libraries, and compiled binary applications.

This paper is the first to evaluate whether debloating is applicable to web apps, and in particular whether it can remove the code responsible for vulnerabilities.

Motivation for web debloating#

The authors use Symfony's CVE-2018-14773 as an example.

The framework supported legacy IIS headers (X-Original-URL and X-Rewrite-URL) that let a client override the path in the request URL, which can be abused. If the server does not need these headers, the code supporting them can be removed, i.e., debloated.

Target PHP web apps#

phpMyAdmin: Database management
WordPress: Blog management
MediaWiki: Wiki management
Magento: E-commerce management

Mapping vulnerabilities to source code#

For each web app, the 20 most critical CVEs by CVSS score were selected, all of them from 2013 or later.

Because different CVEs affect different versions, the vulnerabilities had to be mapped across multiple versions of each app (as shown in the table below).

The affected versions and line numbers for each CVE are recorded in a database.
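The notes do not describe the database layout, so the following is only a minimal sketch of how such a mapping could be stored. The table name, column names, app name, and rows are all hypothetical placeholders, not data from the paper.

```python
import sqlite3

# Hypothetical schema: one row per (CVE, affected version, vulnerable line).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE cve_locations (
        cve_id   TEXT NOT NULL,
        app      TEXT NOT NULL,
        version  TEXT NOT NULL,
        file     TEXT NOT NULL,
        line     INTEGER NOT NULL
    )
    """
)

# Placeholder rows for illustration only; they are not real CVE data.
rows = [
    ("CVE-EXAMPLE-1", "exampleapp", "1.0.0", "lib/import.php", 42),
    ("CVE-EXAMPLE-1", "exampleapp", "1.1.0", "lib/import.php", 57),
    ("CVE-EXAMPLE-2", "exampleapp", "1.1.0", "admin/export.php", 10),
]
conn.executemany("INSERT INTO cve_locations VALUES (?, ?, ?, ?, ?)", rows)


def locations_for(app, version):
    """Return (cve_id, file, line) tuples recorded for one app version."""
    cursor = conn.execute(
        "SELECT cve_id, file, line FROM cve_locations "
        "WHERE app = ? AND version = ? ORDER BY cve_id, line",
        (app, version),
    )
    return cursor.fetchall()


print(locations_for("exampleapp", "1.1.0"))
```

Keeping one row per version matters because the same CVE can sit on different lines in different releases, which is why the mapping had to be repeated per version.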

Web Application	Versions	Known CVEs (≥2013)
Magento	1.9.0, 2.0.5	10
MediaWiki	1.19.1, 1.21.1, 1.24.0, 1.28.0	111
phpMyAdmin	4.0.0, 4.4.0, 4.6.0, 4.7.0	130
WordPress	3.9.0, 4.0, 4.2.3, 4.6, 4.7, 4.7.1	131

Simulating web app usage#

Four methods are used to simulate app usage, with the goal of exercising the application's functionality, and therefore its code, as broadly and deeply as possible:

General tutorials (executed using Selenium scripts)
Monkey testing
Crawling
Vulnerability scanning
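Of these four methods, monkey testing is the easiest to show in a few lines. The sketch below is only an illustration of the idea (random inputs, watch for crashes) against a toy function; `parse_quantity` and `monkey_test` are hypothetical names, and the paper's monkey testing drives a real web UI rather than a Python function.

```python
import random
import string


def parse_quantity(text):
    """Toy input handler standing in for a web form field (hypothetical)."""
    value = int(text)      # raises ValueError on non-numeric input
    return 100 // value    # raises ZeroDivisionError when value == 0


def monkey_test(target, runs=200, seed=0):
    """Feed random strings to target and collect the exception types it raises."""
    rng = random.Random(seed)  # seeded so that a run can be replayed
    alphabet = string.digits + string.ascii_letters + " -"
    failures = {}
    for _ in range(runs):
        length = rng.randint(0, 4)
        text = "".join(rng.choice(alphabet) for _ in range(length))
        try:
            target(text)
        except Exception as exc:  # a crash in the code under test
            failures.setdefault(type(exc).__name__, []).append(text)
    return failures


failures = monkey_test(parse_quantity)
print(sorted(failures))
```

For coverage purposes the crashes themselves matter less than the fact that random inputs reach error-handling paths that a scripted tutorial would never touch.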

Recording web app code coverage#

PHP profilers are shipped as PHP extensions that hook into the PHP engine to collect code coverage; the one used in this paper is XDebug.

The straightforward idea was to call xdebug_start_code_coverage() at the start of each PHP file and xdebug_get_code_coverage() at the end, but the authors ran into some difficulties.

Since any PHP file can call exit() or die() to terminate early, coverage also has to be collected before every such exit call.

Additionally, a shutdown function has to be registered and placed at the end of the shutdown function queue, so that it runs after all other shutdown functions.

Finally, for destructors, if an object is destroyed after the shutdown functions have run, its destructor code would not be recorded, so destructors were rewritten to register the coverage collection themselves when they execute.
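The early-exit problem is easier to see in code. The following is a Python analogy, not the XDebug-based PHP implementation from the paper: it assumes `sys.settrace` as a stand-in for the coverage extension and `sys.exit()` as a stand-in for PHP's exit()/die(). The names `run_with_coverage` and `handler` are hypothetical.

```python
import sys


def run_with_coverage(func):
    """Run func and return the (function name, line offset) pairs it executed."""
    covered = set()

    def tracer(frame, event, arg):
        if event == "line":
            code = frame.f_code
            covered.add((code.co_name, frame.f_lineno - code.co_firstlineno))
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func()
    except SystemExit:
        pass  # early termination, comparable to exit()/die() in PHP
    finally:
        # Coverage is finalized here no matter how func ended.
        sys.settrace(previous)
    return covered


def handler():
    step = "start"   # offset 1: executed
    sys.exit(0)      # offset 2: executed, then the request ends
    step = "never"   # offset 3: never reached


coverage = run_with_coverage(handler)
print(sorted(coverage))
```

The `finally` block plays the role of the shutdown function: if collection only happened on the normal return path, the lines executed before the early exit would be lost and later treated as unused.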

Debloating strategies#

File-level debloat: Remove PHP files that are not executed.
Function-level debloat: A finer-grained strategy than file-level debloating, which additionally removes unexecuted functions from the files that remain.

The debloating here does not completely delete the code but replaces it with placeholders. If the code execution reaches these placeholders, the program will exit and log information about the missing functions.
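The placeholder idea can be sketched as follows. This is a simplified Python model under stated assumptions: the application is a plain dictionary of functions, and `make_placeholder`, `debloat`, and the feature names are hypothetical. The paper's tool rewrites PHP source files instead.

```python
import logging

logger = logging.getLogger("debloat")


def make_placeholder(name):
    """Build a stub that logs the missing function and stops the request."""
    def placeholder(*args, **kwargs):
        logger.error("removed function called: %s", name)
        raise SystemExit(f"debloated code reached: {name}")
    return placeholder


def debloat(functions, covered):
    """Replace every function that never appeared in coverage with a stub."""
    return {
        name: func if name in covered else make_placeholder(name)
        for name, func in functions.items()
    }


# Hypothetical application with two features; only one was exercised.
app = {
    "view_page": lambda: "page",
    "import_csv": lambda: "imported",
}
debloated = debloat(app, covered={"view_page"})

print(debloated["view_page"]())  # covered code still works
try:
    debloated["import_csv"]()
except SystemExit as stop:
    print(stop)  # the stub stopped execution and named what was missing
```

Because the stub records its own name before exiting, every false removal shows up in the log instead of failing silently.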

Later results show that this approach is very effective: the logs revealed many files and functions that should not have been deleted.

Experimental results#

Code size is measured not in raw lines of code but in Logical Lines Of Code (LLOC), which excludes comments, blank lines, purely syntactic lines, and the like.
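To make the difference from raw line counts concrete, here is a deliberately naive approximation. It is not the measurement tool used in the paper: `approximate_lloc` is a hypothetical helper that only skips blank lines, comments, lone braces, and PHP open/close tags, and it ignores cases such as code sharing a line with a comment.

```python
def approximate_lloc(source):
    """Count lines that carry a statement, skipping blanks, comments, and lone braces."""
    count = 0
    in_block_comment = False
    for raw in source.splitlines():
        line = raw.strip()
        if in_block_comment:
            if "*/" in line:
                in_block_comment = False
            continue
        if line.startswith("/*"):
            in_block_comment = "*/" not in line
            continue
        if not line or line.startswith(("//", "#")):
            continue
        if line in {"{", "}", "<?php", "?>"}:
            continue
        count += 1
    return count


sample = """<?php
// greet the user
function greet($name)
{
    /* build the
       message */
    $msg = "Hello, " . $name;

    return $msg;
}
"""
print(len(sample.splitlines()), approximate_lloc(sample))  # 10 physical lines, 3 counted
```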

Clearly, function-level debloating reduces more code than file-level debloating, which is also related to the coding practices of these four different projects (for example, WordPress does not rely as much on external packages, while Magento and MediaWiki are developed in a more modular way).

Reduction in cyclomatic complexity

Cyclomatic complexity (CC), also known as conditional complexity, is the number of linearly independent paths through a program's control flow. It can also be read as roughly the number of test cases needed to cover every branch.
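For a control-flow graph with E edges, N nodes, and P connected components, the standard formula is M = E - N + 2P. A small sketch of that formula, with a hypothetical `cyclomatic_complexity` helper and hand-written graphs:

```python
def cyclomatic_complexity(edges, components=1):
    """Compute M = E - N + 2P for a control-flow graph given as (src, dst) edges."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2 * components


# Straight-line code: a single path from entry to exit.
straight = [("entry", "exit")]

# One if/else: two independent paths through the function.
if_else = [
    ("entry", "then"),
    ("entry", "else"),
    ("then", "exit"),
    ("else", "exit"),
]

print(cyclomatic_complexity(straight))  # 1 - 2 + 2 = 1
print(cyclomatic_complexity(if_else))   # 4 - 4 + 2 = 2
```

Each extra branch adds one more edge than it adds nodes, so removing a function removes all of its branches from the total.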

I learned this concept in my junior software engineering class.

During the debloat process, cyclomatic complexity also decreases, indicating that the debloat method can remove complex instructions and execution paths.

Reduction in CVEs after debloating

The results show that 38% of vulnerabilities can be removed through file-level debloating, while 10% to 60% can be removed through function-level debloating (phpMyAdmin and Magento bundle a large number of external libraries, whereas WordPress is more monolithic).

Note: The rule for determining whether a vulnerability has been debloated in this paper is that all files/functions covered by a certain vulnerability must be deleted, rather than just removing one link in the chain (although in most cases this would already break the exploit chain, rendering the vulnerability unexploitable).
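This "all or nothing" rule is a subset check. A minimal sketch, where `is_debloated` and the function names are hypothetical:

```python
def is_debloated(vulnerable_units, removed_units):
    """A CVE counts as removed only if every unit it maps to was removed."""
    vulnerable = set(vulnerable_units)
    return bool(vulnerable) and vulnerable <= set(removed_units)


removed = {"export_csv", "legacy_upload", "render_pdf"}

# Every function on the vulnerable path is gone: counted as removed.
print(is_debloated({"export_csv", "render_pdf"}, removed))   # True

# One link of the chain survives: not counted, even if the exploit would now fail.
print(is_debloated({"legacy_upload", "check_token"}, removed))  # False
```

The second case is what makes the reported numbers conservative: a partially removed chain is still counted as a surviving vulnerability.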

I feel that the authors did not clearly explain the specific rules by which debloating was carried out based on the four usage scenarios above.

There are two situations to consider. One is normal usage, which does not trigger vulnerabilities (like the tutorials). The other is intentional exploitation (like vulnerability scanning) or anything else that pushes the application into an abnormal state. Monkey testing can produce both.

I tentatively assume that debloating is based on the following rules:

Conducting it while ensuring the program runs normally, i.e., prioritizing functionality over the potential for vulnerabilities.

Code outside of files/functions covered by normal usage should be deleted.

If the paths covered by malicious exploitation overlap with those covered by normal usage, the non-overlapping parts should be deleted, while the overlapping parts should be retained if they meet the requirements of the first rule.

In short, code not covered by normal usage needs to be deleted.
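My assumed rules above can be written as plain set operations. Note that this models my own reading, not rules stated in the paper; `plan_debloat` and the unit names are hypothetical.

```python
def plan_debloat(all_units, normal_usage, exploit_paths):
    """Apply the assumed rules: keep what normal usage covers, remove the rest."""
    removed = all_units - normal_usage
    return {
        "removed": removed,
        # Exploit-only code disappears together with everything else unused.
        "exploit_units_removed": exploit_paths & removed,
        # Code shared by exploits and normal usage has to stay for functionality.
        "exploit_units_retained": exploit_paths & normal_usage,
    }


all_units = {"login", "view_page", "import_csv", "debug_console"}
normal_usage = {"login", "view_page"}
exploit_paths = {"login", "debug_console"}

plan = plan_debloat(all_units, normal_usage, exploit_paths)
print(sorted(plan["removed"]))                  # ['debug_console', 'import_csv']
print(sorted(plan["exploit_units_removed"]))    # ['debug_console']
print(sorted(plan["exploit_units_retained"]))   # ['login']
```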

Impact of different vulnerability types

The degree of debloat varies for different types of vulnerabilities; for example, command execution and SQL injection vulnerabilities are easier to debloat (often found in less commonly used modules), while crypto and cookie-related vulnerabilities are harder to debloat (often found in core components that cannot be deleted).

Checking for POI vulnerabilities

POI, or PHP Object Injection, is what CTF players usually call a PHP deserialization vulnerability.

The authors used PHPGGC, a tool for generating POP (property-oriented programming) exploit chains, to attack the debloated apps.

The results showed that function-level debloating removed the code behind every exploit chain present in PHPGGC (WordPress is not included here, as it does not rely on external packages).

Improper introduction of dev packages

Composer by default places external software in the vendor directory, which, if accessible due to server misconfiguration, could be exploited for RCE (e.g., PHPUnit).

Experimental results indicate that phpMyAdmin and Magento have this issue.

Qualitative analysis of deleted code

Because so many files and so much code were deleted, the paper used the k-means clustering algorithm to group the deleted files, and applied TF-IDF with a maximum document frequency limit so that terms appearing in more than 50% of file paths were ignored.
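The 50% cutoff is the part that is easy to show without extra libraries. The sketch below covers only that filtering step, not TF-IDF weighting or k-means; `filter_common_tokens`, the tokenization rule, and the file paths are all assumptions made for illustration.

```python
import re
from collections import Counter


def filter_common_tokens(paths, max_df=0.5):
    """Tokenize file paths and drop tokens present in more than max_df of them."""
    tokenized = [set(re.split(r"[/._\-]+", p.lower())) - {""} for p in paths]
    doc_freq = Counter(token for tokens in tokenized for token in tokens)
    limit = max_df * len(paths)
    return [sorted(t for t in tokens if doc_freq[t] <= limit) for tokens in tokenized]


# Hypothetical deleted files.
paths = [
    "vendor/acme/mailer/Smtp.php",
    "vendor/acme/mailer/Queue.php",
    "vendor/acme/pdf/Writer.php",
    "themes/default/header.php",
]
for tokens in filter_common_tokens(paths):
    print(tokens)
```

Tokens such as "vendor" or "php" occur in most paths and say nothing about what a file does, so dropping them leaves the tokens that actually separate one group of files from another.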

Exploit testing on the debloated app

Finally, the authors collected the exploits available in the Metasploit framework for these four PHP web apps and wrote PoCs based on publicly available vulnerability information.

After verifying that the exploits succeeded against the original versions of the web apps, they ran them against the debloated versions, where half of them (4 out of 8) failed.

This result indicates that while debloating is not a panacea for web app security, it is effective.

Performance analysis#

Since code coverage tools increase performance overhead, this section discusses the overhead analysis of the XDebug tool, comparing Selenium scripts with and without XDebug.

The results show that, with XDebug enabled, execution time, CPU consumption, and memory consumption all increased for the four web apps.

However, this overhead can be reduced by improving the coverage calculation method, such as calculating coverage offline, which will be discussed later.

Limitations and future work#

To summarize the previous work, debloating can reduce hundreds of thousands of lines of irrelevant code, decrease cyclomatic complexity by 30% to 50%, and delete about half of the code related to CVEs that cause vulnerabilities. Even for vulnerabilities that cannot be deleted, debloating can remove some gadgets, making them harder to exploit.

The authors consider this work not yet complete and list the following limitations:

Lack of exploitable vulnerabilities

Publicly available exploits for known vulnerabilities are scarce, including working reproductions and detailed descriptions.

The authors also note the absence of automated exploitation frameworks for web apps (like BugBox), which would greatly assist researchers.

Dynamic code coverage

Web debloating heavily relies on dynamic code coverage analysis, and even with four replicable and unbiased application configuration scenarios, it cannot claim to cover all benign states of web apps.

In short, the depth of coverage is insufficient, and the authors plan to follow up with crowdsourcing and user studies.

Additionally, because the pipeline removes features that a specific group of users does not need, it cannot be driven by general static analysis alone. However, the authors suggest that static analysis could be applied to the code after debloating to check that the features these users require are still present.

Handling requests to deleted code

When real users request deleted code, how should it be handled? Simply exiting the application and returning an error is insufficient; the deleted code should be reintroduced to handle user requests, and it must be determined whether the request is malicious beforehand.

Metrics for measuring debloating effectiveness

This paper uses reductions in cyclomatic complexity, logical lines of code (LLOC), CVEs, and POP chains as four metrics to measure effectiveness.

However, each line of code contributes differently to the program's attack surface, and the CVE standard does not apply to proprietary software. Additionally, CVEs need to be manually mapped to verify exploitability, which is a labor-intensive task.

Efficiency of debloating

The efficiency of debloating modular applications is significantly different from monolithic applications (like WordPress).

Here, the authors mention several debloating works based on static analysis, a work on web client debloating (reducing the attack surface of Chrome), and a dynamic analysis work on custom PHP web applications (whose limitation is that, because the applications are custom, the number of vulnerabilities removed cannot be quantified).

Conclusion#

Since my own thoughts have already been summarized in the previous Overview of paper, I will just paste the original abstract and conclusion here.

Abstract

As software becomes increasingly complex, its attack surface expands enabling the exploitation of a wide range of vulnerabilities. Web applications are no exception since modern HTML5 standards and the ever-increasing capabilities of JavaScript are utilized to build rich web applications, often subsuming the need for traditional desktop applications. One possible way of handling this increased complexity is through the process of software debloating, i.e., the removal not only of dead code but also of code corresponding to features that a specific set of users do not require. Even though debloating has been successfully applied on operating systems, libraries, and compiled programs, its applicability on web applications has not yet been investigated. In this paper, we present the first analysis of the security benefits of debloating web applications. We focus on four popular PHP applications and we dynamically exercise them to obtain information about the server-side code that executes as a result of client-side requests. We evaluate two different debloating strategies (file-level debloating and function-level debloating) and we show that we can produce functional web applications that are 46% smaller than their original versions and exhibit half their original cyclomatic complexity. Moreover, our results show that the process of debloating removes code associated with tens of historical vulnerabilities and further shrinks a web application’s attack surface by removing unnecessary external packages and abusable PHP gadgets.

Conclusion

In this paper, we analyzed the impact of removing unnecessary code in modern web applications through a process called software debloating. We presented the pipeline details of the end-to-end, modular debloating framework that we designed and implemented, allowing us to record how a PHP application is used and what server-side code is triggered as a result of client-side requests. After retrieving code-coverage information, our debloating framework removes unused parts of an application using file-level and function-level debloating. By evaluating our framework on four popular PHP applications (phpMyAdmin, MediaWiki, Magento, and WordPress) we witnessed the clear security benefits of debloating web applications. We observed a significant LLOC decrease ranging between 9% to 64% for file-level debloating and up to an additional 24% with function-level debloating. Next, we showed that external packages are one of the primary sources of bloat as our debloating framework was able to remove more than 84% of unused code in versions that used Composer, PHP’s most popular package manager. By quantifying the removal of code associated with critical CVEs, we observed a reduction of up to 60% of high-impact, historical vulnerabilities. Finally, we showed that the process of debloating also removes instructions and classes that are the primary sources for attackers to build gadgets and perform POI attacks. Our results demonstrate that debloating web applications provides tangible security benefits and therefore should be seriously considered as a practical way of reducing the attack surface of web-applications deployments.