Using SQL Hotspots in a Prioritization Heuristic for Detecting All Types of Web Application Vulnerabilities
B. Smith, L. Williams, "Using SQL Hotspots in a Prioritization Heuristic for Detecting All Types of Web Application Vulnerabilities", Proceedings of the International Conference on Software Testing, Verification and Validation (ICST 2011), Berlin, Germany, pp. 220-229, 2011.
Abstract
Development organizations often do not have time to perform security fortification on every file in a product before release. One way of prioritizing security efforts is to use metrics to identify core business logic that could contain vulnerabilities, such as database interaction code. Database code is a source of SQL injection vulnerabilities but, importantly, may also be home to unrelated vulnerabilities. The goal of this research is to improve the prioritization of security fortification efforts by investigating the ability of SQL hotspots to be used as the basis for a heuristic for the prediction of all vulnerability types. We performed empirical case studies of 15 releases of two open source PHP web applications: WordPress, a blogging application, and WikkaWiki, a wiki management engine. Using statistical analysis, we show that the more SQL hotspots a file contains per line of code, the higher the probability that the file will contain any type of vulnerability.
1. Introduction
We can get good designs by following good practices instead of poor ones.
~F. Brooks, Jr.
The war for a trustworthy Internet continues. The popular social networking site Twitter was recently compromised by two cross-site scripting attacks, which are common and easy-to-execute exploits of a code-level programming error [5]. Input validation vulnerabilities¹ like these are in the CWE/SANS Top 25 Most Dangerous Programming Errors for 2010² despite the plethora of proposed techniques for protecting against code-level attacks (e.g., the context-sensitive string evaluation method proposed in [11]). Additionally, the SANS list of Top Cyber Security Risks³ indicates that attacks on input validation vulnerabilities, such as SQL injection, cross-site scripting, and file inclusion, continue to be the three most popular techniques used for compromising web sites.
Although techniques such as code reviews and design discussions can help developers reduce the number of vulnerabilities they introduce into the source code, the software development community currently has no single solution that will eliminate all security issues [7]. Furthermore, development organizations often do not have the time or resources to perform vulnerability detection efforts on every source file in a product before its release. Validation and verification (V&V) must be prioritized so that security fortification starts with the files that are most likely to be vulnerable. SQL hotspots may help development organizations prioritize security fortification efforts. A SQL hotspot (or just "hotspot" in this paper) is any point in the application source code where the system interacts with a database management system [3, 6]. Hotspots are typically associated with input validation vulnerabilities like SQL injection⁴, but they might also be useful for predicting any web application vulnerability since they guard the typical web application's most valuable asset: the database [3, 6].
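To make the notion of a hotspot concrete, the following is a minimal sketch of counting candidate hotspots in PHP source text. It is a rough lexical approximation, not the identification procedure defined in [14]: the list of database-call names in `HOTSPOT_CALLS` and the helper `count_hotspots` are assumptions introduced purely for illustration.

```python
import re

# Hypothetical list of PHP database-call names; an assumption for this
# sketch, not a list taken from the paper.
HOTSPOT_CALLS = ("mysql_query", "mysqli_query", "pg_query")

# Match one of the call names followed by an opening parenthesis,
# e.g. "mysql_query(".
HOTSPOT_PATTERN = re.compile(
    r"\b(?:" + "|".join(map(re.escape, HOTSPOT_CALLS)) + r")\s*\("
)


def count_hotspots(php_source: str) -> int:
    """Count lexical matches of the assumed database-call names."""
    return len(HOTSPOT_PATTERN.findall(php_source))


sample = """<?php
$result = mysql_query("SELECT * FROM users WHERE id = " . $id);
$rows = mysqli_query($link, "SELECT name FROM pages");
echo "no database call here";
"""

print(count_hotspots(sample))
```

A purely lexical match like this would also count calls that appear inside comments or string literals, so a more careful tool would need to parse the PHP source rather than scan it.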
The goal of this research is to improve the prioritization of security fortification efforts by investigating the ability of SQL hotspots to be used as the basis for a heuristic for the prediction of all vulnerability types. We have previously defined how to identify hotspots [14], and demonstrated that testers can target hotspots at the system level to expose error message information leakage vulnerabilities⁵ [15]. In this paper, we evaluate how well hotspots, combined with the number of lines of code, perform in prediction models that point testers to source files likely to contain any type of web application vulnerability. We include lines of code in our model to normalize the number of SQL hotspots per file, which makes the comparison between files more accurate even as file sizes vary.
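The normalization by lines of code can be sketched as follows. This only illustrates the idea of hotspots per line of code as a ranking key; it is not the statistical model evaluated in the paper, and the file names, counts, and the `FileMetrics` and `rank_by_density` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FileMetrics:
    path: str
    hotspots: int
    lines_of_code: int

    @property
    def hotspot_density(self) -> float:
        """Hotspots per line of code; 0.0 for an empty file."""
        if self.lines_of_code == 0:
            return 0.0
        return self.hotspots / self.lines_of_code


def rank_by_density(files):
    """Order files so the highest hotspot density comes first."""
    return sorted(files, key=lambda f: f.hotspot_density, reverse=True)


# Hypothetical measurements, for illustration only.
files = [
    FileMetrics("admin/options.php", hotspots=4, lines_of_code=200),
    FileMetrics("includes/db.php", hotspots=12, lines_of_code=300),
    FileMetrics("templates/footer.php", hotspots=0, lines_of_code=50),
]

for f in rank_by_density(files):
    print(f"{f.path}: {f.hotspot_density:.3f}")
```

Dividing by file size keeps a large file with a few scattered hotspots from outranking a small file that consists almost entirely of database interaction code.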
We built and analyzed a prediction model based on the security vulnerability reports of two open source PHP web applications: nine releases of WordPress⁶, a blogging application, and six releases of WikkaWiki⁷, a wiki management engine. We compared our model's ability to predict vulnerable files against a random guess calculated from the distribution of vulnerabilities within each system.
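The comparison against a random guess is described only briefly here, so the following is one plausible reading rather than the authors' procedure. It assumes the comparison is made in terms of precision and recall, and it uses the fact that a uniformly random selection of k files out of N has expected precision equal to the fraction of vulnerable files and expected recall equal to k/N. All file names and sets are hypothetical.

```python
def precision_recall(predicted: set, actual: set):
    """Precision and recall of a set of files flagged as vulnerable."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical system with 8 files, 2 of which have reported vulnerabilities.
all_files = {"a.php", "b.php", "c.php", "d.php",
             "e.php", "f.php", "g.php", "h.php"}
vulnerable = {"a.php", "b.php"}

# Hypothetical output of a prediction model.
predicted = {"a.php", "b.php", "c.php"}

model_precision, model_recall = precision_recall(predicted, vulnerable)

# Expected performance of picking the same number of files at random.
random_precision = len(vulnerable) / len(all_files)
random_recall = len(predicted) / len(all_files)

print(f"model:  precision={model_precision:.3f} recall={model_recall:.3f}")
print(f"random: precision={random_precision:.3f} recall={random_recall:.3f}")
```

Under this reading, a heuristic is only useful for prioritization if it beats the random baseline on the same system, since the baseline depends on how many files in that system are vulnerable.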
9. References
- [1] T. Fawcett, "An introduction to ROC analysis," Pattern Recognition Letters, vol. 27, no. 8, pp. 861-874, 2006.
- [2] M. Gegick, L. Williams, J. Osborne, and M. Vouk, "Prioritizing software security fortification through code-level metrics," in ACM Workshop on Quality of Protection (QoP2008), Alexandria, Virginia, 2008, pp. 31-38.
- [3] W. G. J. Halfond and A. Orso, "AMNESIA: analysis and monitoring for neutralizing SQL-injection attacks," in 20th IEEE/ACM Conference on Automated Software Engineering, Long Beach, CA, USA, 2005, pp. 174-183.
- [4] ISO/IEC, "DIS 14598-1 Information technology - Software product evaluation," 1996.
- [5] J. Kirk, "Twitter Contains Second worm in a Week," in PCWorld Business Center, 2010, http://www.pcworld.com/businesscenter/article/206232/twitter_contains_second_worm_in_a_week.html.
- [6] Y. Kosuga, K. Kono, M. Hanaoka, M. Hishiyama, and Y. Takahama, "Sania: syntactic and semantic analysis for automated testing against SQL injection," in 23rd Annual Computer Security Applications Conference, Miami Beach, FL, 2007, pp. 107-117.
- [7] G. McGraw, Software Security: Building Security In. Reading, Massachusetts: Addison-Wesley Professional, 2006.
- [8] A. Meneely and L. Williams, "Secure open source collaboration: an empirical study of linus' law," in ACM Conference on Computer and Communications Security (CCS2009), Chicago, Illinois, 2009, pp. 453-462.
- [9] S. Neuhaus, T. Zimmermann, C. Holler, and A. Zeller, "Predicting vulnerable software components," in ACM Conference on Computer and Communications Security, Alexandria, Virginia, USA, 2007, pp. 529-540.
- [10] D. L. Olson and D. Delen, Advanced Data Mining Techniques. Berlin Heidelberg: Springer, 2008.
- [11] T. Pietraszek and C. V. Berghe, "Defending Against Injection Attacks Through Context-Sensitive String Evaluation," in Recent Advances in Intrusion Detection, Springer LNCS 3858, Seattle, Washington, 2006, pp. 124-145.
- [12] Y. Shin, A. Meneely, L. Williams, and J. A. Osborne, "Evaluating Complexity, Code Churn, and Developer Activity Metrics as Indicators of Software Vulnerabilities," Transactions on Software Engineering, 2010, to appear. DOI 10.1109/TSE.2010.81.
- [13] Y. Shin and L. Williams, "Is complexity really the enemy of software security?," in ACM workshop on Quality of protection (QoP2008), Alexandria, Virginia, 2008, pp. 47-50.
- [14] B. Smith, Y. Shin, and L. Williams, "Proposing SQL Statement Coverage Metrics," in Software Engineering for Secure Systems (SESS2008), co-located with ICSE 2008, Leipzig, Germany, 2008, pp. 49-56.
- [15] B. Smith, L. Williams, and A. Austin, "Idea: Using system level testing for revealing SQL-injection related error message information leaks," Lecture Notes in Computer Science, vol. 5965, pp. 192-200, Symposium on Engineering Secure Software and Systems 2010 (ESSoS 2010), 2010.
- [16] J. Walden, M. Doyle, R. Lenhof, and J. Murray, "Idea: Java vs. PHP: Security Implications of Language Choice for Web Applications," in Engineering Secure Software and Systems, Springer LNCS 5965, Pisa, Italy, 2010, pp. 61-69.
- [17] T. Zimmermann, N. Nagappan, and L. Williams, "Searching for a Needle in a Haystack: Predicting Security Vulnerabilities for Windows Vista," in International Conference on Software Testing (ICST 2010), Paris, France, 2010, pp. 421-428.
10. End Notes
- Input validation vulnerabilities occur when a system does not assert that input falls within an acceptable range, allowing the system to be exploited to perform unintended functionality.
- http://cwe.mitre.org/top25/
- http://www.sans.org/critical-security-controls/#summary
- SQL injection vulnerabilities occur when a lack of input validation could allow a user to force unintended system behavior by altering the logical structure of a SQL statement using SQL reserved words and special characters.
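
The definition of SQL injection in the last note can be illustrated with a small, self-contained sketch. It uses Python's standard `sqlite3` module with an in-memory database rather than the PHP and database stack of the applications studied; the table, the data, and the functions `lookup_unsafe` and `lookup_safe` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)


def lookup_unsafe(name: str):
    # Vulnerable hotspot: user input is concatenated into the statement,
    # so quotes and reserved words in the input can change its structure.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def lookup_safe(name: str):
    # Parameterized hotspot: the input is bound as data and cannot alter
    # the logical structure of the statement.
    return conn.execute(
        "SELECT secret FROM users WHERE name = ?", (name,)
    ).fetchall()


malicious = "nobody' OR '1'='1"

print(len(lookup_unsafe(malicious)))  # every row matches the altered WHERE clause
print(len(lookup_safe(malicious)))    # no user has this literal name
```

With the malicious input, the concatenated statement gains an extra `OR` condition that is always true, which is exactly the change in logical structure the note describes.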