Proposing SQL Statement Coverage Metrics: Difference between revisions

Revision as of 17:58, 14 March 2013

Ben Smith, Younghee Shin, and Laurie Williams

Abstract

An increasing number of cyber attacks are occurring at the application layer, where attackers use malicious input. These input validation vulnerabilities can be exploited by (among others) SQL injection, cross-site scripting, and buffer overflow attacks. Statement coverage and similar test adequacy metrics have historically been used to assess the level of functional and unit testing that has been performed on an application. However, these currently available metrics do not highlight how well the system protects itself through validation. In this paper, we propose two SQL injection input validation testing adequacy metrics: target statement coverage and input variable coverage. A test suite which satisfies both adequacy criteria can be leveraged as a solid foundation for input validation scanning with a blacklist. To determine whether it is feasible to calculate values for our two metrics, we perform a case study on a web healthcare application and discuss some implementation issues we encountered. We find that the web healthcare application scored 96.7% target statement coverage and 98.5% input variable coverage.

1. Introduction

According to the National Vulnerability Database (NVD)¹, more than half of the ever-increasing number of cyber vulnerabilities reported in 2002-2006 were input validation vulnerabilities. As Figure 1 shows, the number of input validation vulnerabilities is still increasing.

PLACEHOLDER FOR FIGURE 1²

Figure 1 illustrates the number of reported instances of each type of cyber vulnerability listed in the series legend for each year displayed on the x-axis. The curve with the square-shaped points is the sum of all reported vulnerabilities that fall into the categories “SQL injection”, “XSS”, or “buffer overflow” when querying the National Vulnerability Database. The curve with diamond-shaped points represents all cyber vulnerabilities reported for the year on the x-axis. For several years now, the number of reported input validation vulnerabilities has been half the total number of reported vulnerabilities. Additionally, the graph demonstrates that these curves are monotonically increasing, indicating that we are unlikely to see a future drop in the ratio of reported input validation vulnerabilities.

Input validation testing is the process of writing and running test cases to investigate how a system responds to malicious input with the intention of using tests to mitigate the risk of a security threat. Input validation testing can increase confidence that input validation has been properly implemented. The goal of input validation testing is to check whether input is validated against constraints given for the input. Input validation testing should test both whether legal input is accepted, and whether illegal input is rejected. A coverage metric can quantify the extent to which this goal has been met. Various coverage criteria have been defined based on the target of testing (specification or program as a target) and underlying testing methods (structural, fault-based and error-based)^[19]. Statement coverage and branch coverage are well-known program-based structural coverage criteria^[19].

However, current structural coverage metrics and the tools which implement them do not provide specific information about insufficient or missing input validation. New coverage criteria to measure the adequacy of input validation testing can be used to highlight a level of security testing. Our research objective is to propose and to validate two input validation testing adequacy metrics related to SQL injection vulnerabilities. Our current input validation coverage criteria consist of two experimental metrics: input variable coverage, which measures the percentage of input variables used in at least one test; and target statement coverage, which measures the percentage of SQL statements executed in at least one test.
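Both metrics are ratios of covered items to all items of that kind. As a minimal sketch of the calculation only, assuming each SQL statement and input variable has been given a unique string identifier (the identifiers and the class name below are hypothetical, not part of our tooling):

```java
import java.util.HashSet;
import java.util.Set;

public class InputValidationCoverage {
    /**
     * Percentage of the items in {@code all} that also appear in {@code exercised}.
     * Used for both metrics: target statements executed, or input variables used,
     * in at least one test.
     */
    public static double percentCovered(Set<String> all, Set<String> exercised) {
        if (all.isEmpty()) {
            // Assumption for this sketch: nothing to cover counts as fully covered.
            return 100.0;
        }
        Set<String> covered = new HashSet<>(all);
        covered.retainAll(exercised); // keep only items that some test touched
        return 100.0 * covered.size() / all.size();
    }

    public static void main(String[] args) {
        // Hypothetical identifiers for SQL statements found in production code.
        Set<String> targetStatements = Set.of("stmt1", "stmt2", "stmt3", "stmt4");
        Set<String> executedByTests = Set.of("stmt1", "stmt2", "stmt3");

        // Hypothetical identifiers for input variables sent to the database.
        Set<String> inputVariables = Set.of("name", "dateOfBirth");
        Set<String> usedByTests = Set.of("name", "dateOfBirth");

        System.out.println("Target statement coverage: "
                + percentCovered(targetStatements, executedByTests));
        System.out.println("Input variable coverage: "
                + percentCovered(inputVariables, usedByTests));
    }
}
```

The same helper serves both metrics because each is defined over a set of items and the subset touched by at least one test.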

An input variable is any dynamic, user-assigned variable which an attacker could manipulate to send malicious input to the system. In the context of the Web, any field on a web form is an input variable as well as any number of other client-side input spaces. Within the context of SQL injection attacks, input variables are any variable which is sent to the database management system, as will be illustrated in further detail in Section 2. A target statement is any statement in an application which is subject to attack via malicious input; for this paper, our target statements will be all SQL statements found in production code. Other input sources can be leveraged to form an attack, but we have chosen not to focus on them for this study because they comprise less than half of recently reported cyber vulnerabilities (see Figure 1 and explanation).
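To make the two terms concrete, the following sketch shows one target statement and the input variable that reaches it. The table, column, and method names are hypothetical and are not taken from the case study application; the query is built by string concatenation only to show how a malicious value changes the statement.

```java
public class TargetStatementExample {
    /**
     * Hypothetical target statement: an SQL query assembled from the
     * input variable patientName without any validation.
     */
    public static String buildQuery(String patientName) {
        return "SELECT * FROM patients WHERE name = '" + patientName + "'";
    }

    public static void main(String[] args) {
        // Legal input: the query selects one patient by name.
        System.out.println(buildQuery("Alice"));

        // Malicious input: the quote characters close the string literal
        // and append a condition that is always true.
        System.out.println(buildQuery("x' OR '1'='1"));
    }
}
```

A test that calls `buildQuery` covers both the target statement and the input variable `patientName`, whether or not the value passed in is malicious.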

In practice, even software development teams who use metrics such as traditional statement coverage often do not achieve 100% values in these metrics before production^[1]. If the lines left uncovered contain target statements, traditional statement coverage could be very high while little to no input validation testing is performed on the system. A target statement or input variable which is involved in at least one test might achieve high input validation coverage metrics yet still remain insecure if the test case(s) did not utilize a malicious form of input. However, a system with a high score in the metrics we define has a foundation for thorough input validation testing. Testers can relatively easily reuse existing test cases with multiple forms of good and malicious input. Our vision is to automate such reuse.

We evaluated our metrics on the server-side code of a JavaServer Pages web healthcare application that had an extensive set of JUnit³ test cases. We manually counted the number of input variables and SQL statements found in this system and dynamically recorded how many of these statements and variables were used in executing a given test set. The rest of this paper is organized as follows: First, Section 2 defines SQL injection attacks. Then, Section 3 introduces our experimental metrics. Section 4 provides a brief summary of related work. Next, Section 5 describes our case study and application of our technique. Section 6 reports the results of our study and discusses their implications. Then, Section 7 illustrates some limitations of our technique and our metrics. Finally, Section 8 concludes and discusses the future use and development of our metrics.

2. Background

Section 2.1 explains the fundamental difference between traditional testing and security testing. Then, Section 2.2 describes SQL injection.

2.1 Testing for Security

Web applications are inherently insecure^[15], and web applications’ attackers look the same as any other customer to the server^[12]. Developers should, but typically do not, focus on building security into web applications^[6]. Security has been added to the list of web application quality criteria^[11], and as a result companies have begun to incorporate security testing (including input validation testing) into their development methodologies^[3]. Security testing differs from traditional testing, as illustrated by Figure 2 (adapted from ^[17]).

**Figure 2. Intended vs. Actual Behavior (adapted from ^[17])**

Represented by the left-hand circle in Figure 2, the current software development paradigm includes a list of testing strategies to ensure that an application is correct in functionality and usability, as indicated by a requirements specification. With respect to intended correctness, verification typically entails creating test cases designed to discover faults by causing failures. Oracles tell us what the system should do, and failures tell us that the system does not do what it is supposed to do. The right-hand circle in Figure 2 indicates that we validate not only that the system does what it should, but also that the system does not do what it should not: the right-hand circle represents a failure occurring in the system which causes a security problem. The circles intersect because some intended functionality can introduce indirect vulnerabilities when privacy and security were not considered in designing the required functionality^[17]. Testing for functionality only validates that the application achieves what was written in the requirements specification. Testing for security validates that the application prevents undesirable security risks from occurring, even when the functionality involved is spread across several modules and the risk might be due to an oversight in the application’s design.

To adapt to the new paradigm, companies have started to incorporate new techniques. Some companies use vulnerability scanners, which behave like a hacker, making automated attempts to gain access to or misuse the system in order to discover its flaws^[4]. A blacklist is a representative or comprehensive set of input validation attacks of a given type (such as SQL injection, see Section 2.2). These vulnerability scanners typically use a blacklist to test potential vulnerabilities against all attacks (or a set of representative attacks). Coverage criteria for target statements can help companies assess how much of their system has the framework for a range of input validation testing. A vulnerability scanner is ineffective if its blacklist is not tested against every target statement in the system.
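The idea of running a blacklist against every target statement can be sketched as follows. This is an illustration under stated assumptions, not a description of any particular scanner: each target statement is represented by a hypothetical name paired with the validator that guards it, and a validator returning `true` means the input would be accepted and passed on to the SQL statement.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class BlacklistScanSketch {
    /**
     * Returns the names of target statements whose validator accepts
     * at least one entry of the blacklist.
     */
    public static List<String> findUnprotected(
            Map<String, Predicate<String>> validators, List<String> blacklist) {
        List<String> unprotected = new ArrayList<>();
        for (Map.Entry<String, Predicate<String>> entry : validators.entrySet()) {
            for (String attack : blacklist) {
                if (entry.getValue().test(attack)) {
                    unprotected.add(entry.getKey());
                    break; // one accepted attack is enough to flag the statement
                }
            }
        }
        return unprotected;
    }

    public static void main(String[] args) {
        // A small, hypothetical blacklist of SQL injection strings.
        List<String> blacklist = List.of("' OR '1'='1", "'; DROP TABLE patients; --");

        // Hypothetical target statements and their validators.
        Map<String, Predicate<String>> validators = new LinkedHashMap<>();
        validators.put("findPatientByName", input -> input.matches("[A-Za-z ]+"));
        validators.put("findVisitByNote", input -> true); // no validation at all

        System.out.println(findUnprotected(validators, blacklist));
    }
}
```

A target statement that is missing from the `validators` map is never exercised by the loop, which is the situation target statement coverage is meant to expose.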

9. References

^[1] B. Beizer, Software Testing Techniques. New York, NY, USA: Van Nostrand Reinhold Co., 1990.

^[2] S. W. Boyd and A. D. Keromytis, "SQLrand: Preventing SQL injection attacks," in Proceedings of the 2nd Applied Cryptography and Network Security (ACNS) Conference, Yellow Mountain, China, pp. 292-304, 2004.

^[3] B. Brenner, "CSI 2007: Developers need Web application security assistance," in SearchSecurity.com, 2007.

^[4] M. Cobb, "Making the case for Web application vulnerability scanners," in SearchSecurity.com, 2007.

^[5] W. G. Halfond, J. Viegas, and A. Orso, "A Classification of SQL-Injection Attacks and Countermeasures," in Proceedings of the International Symposium on Secure Software Engineering, March, Arlington, VA, 2006.

^[6] W. G. J. Halfond and A. Orso, "AMNESIA: analysis and monitoring for NEutralizing SQL-injection attacks," in Proceedings of the 20th IEEE/ACM International Conference on Automated Software Engineering, Long Beach, CA, USA, pp. 174-183, 2005.

^[7] W. G. J. Halfond and A. Orso, "Command-Form Coverage for Testing Database Applications," in Proceedings of the IEEE and ACM International Conference on Automated Software Engineering, pp. 69-78, 2006.

^[8] Y. W. Huang, S. K. Huang, T. P. Lin, and C. H. Tsai, "Web application security assessment by fault injection and behavior monitoring," in Proceedings of the 12th International Conference on World Wide Web, Budapest, Hungary, pp. 148-159, 2003.

^[9] S. Kals, E. Kirda, C. Kruegel, and N. Jovanovic, "SecuBat: a web vulnerability scanner," in Proceedings of the 15th International Conference on World Wide Web, Edinburgh, Scotland, pp. 247-256, 2006.

^[10] G. McGraw, Software Security: Building Security in. Upper Saddle River, NJ: Addison-Wesley Professional, 2006.

^[11] J. Offutt, "Quality attributes of Web software applications," IEEE Software, vol. 19, no. 2, pp. 25-32, 2002.

^[12] E. Ogren, "App Security's Evolution," in DarkReading.com, 2007.

^[13] T. Pietraszek and C. V. Berghe, "Defending against injection attacks through context-sensitive string evaluation," in Recent Advances in Intrusion Detection (RAID). Seattle, WA, 2005.

^[14] F. S. Rietta, "Application layer intrusion detection for SQL injection," in Proceedings of the 44th annual southeast regional conference, New York, NY, pp. 531-536, 2006.

^[15] D. Scott and R. Sharp, "Developing secure Web applications," Internet Computing, IEEE, vol. 6, no. 6, pp. 38-45, 2002.

^[16] Z. Su and G. Wassermann, "The essence of command injection attacks in web applications," in Proceedings of the Annual Symposium on Principles of Programming Languages, Charleston, SC, pp. 372-382, 2006.

^[17] H. H. Thompson and J. A. Whittaker, "Testing for software security," Dr. Dobb's Journal, vol. 27, no. 11, pp. 24-34, 2002.

^[18] D. Willmor and S. M. Embury, "Exploring test adequacy for database systems," in Proceedings of the 3rd UK Software Testing Research Workshop, Sheffield, UK, pp. 123-133, 2005.

^[19] H. Zhu, P. A. V. Hall, and J. H. R. May, "Software Unit Test Coverage and Adequacy," ACM Computing Surveys, vol. 29, no. 4, 1997.

^[20] http://nvd.nist.gov/

^[21] http://www.junit.org

10. Notes

1. In Figure 1, we counted the reported instances of vulnerabilities by using the keywords "SQL injection", "cross-site scripting", "XSS", and "buffer overflow" within the input validation error category from NVD.
