Using SQL Hotspots in a Prioritization Heuristic for Detecting All Types of Web Application Vulnerabilities



=== 5.1. Statistical Results and Predictive Modeling ===
For both projects, we performed the statistical tests described in Section 4.6 to evaluate the research hypotheses (H1-H8) stated at the beginning of Section 4. We summarize the results in Table 1. In both projects, we found that the more hotspots a file contains, the more likely that file is to be vulnerable (H1), and the more changes developers will make to that file due to any type of vulnerability (H2). We also found that issue reports related to input validation vulnerabilities result in a higher average number of repository revisions, meaning that input validation vulnerabilities tend to require multiple fixes before the development team considers them fixed (H3).
We built logistic regression models to evaluate the number of hotspots as a predictor of whether or not a file is vulnerable (H4). In WordPress, our model had precision between 0.02 and 0.50, and the random guess had precision between 0.0 and 0.23. Our model had recall between 0.10 and 0.40, and the random guess had recall between 0.0 and 0.26. Our model had better precision than the random guess in five out of eight cases, and better recall than the random guess in seven out of eight cases (see Table 2). In WikkaWiki, our model had precision between 0.04 and 1.0, and the random guess had precision between 0.0 and 0.13. Our model had recall between 0.09 and 1.0, and the random guess had recall between 0.0 and 0.11. Our model had better precision than the random guess in three out of five cases, and better recall than the random guess in four out of five cases (see Table 3). The values for precision and recall vary because the model's performance changed on each of the 15 versions of the projects we analyzed. As the model sees more vulnerable files, it misses fewer vulnerabilities (higher recall), but it also reports more false positives (lower precision) as it relaxes its criteria for choosing a vulnerable file.
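As a reminder of how the two measures above are defined, the following is a minimal sketch of a precision and recall computation over sets of file names. The function name and the example file names are hypothetical and are not taken from the study's data.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two sets of file names.

    predicted: files the model flags as vulnerable
    actual:    files that really are vulnerable
    """
    true_positives = len(predicted & actual)
    # Precision: fraction of flagged files that are truly vulnerable.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of truly vulnerable files that were flagged.
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical example: four files flagged, two files actually vulnerable.
predicted = {"a.php", "b.php", "c.php", "d.php"}
actual = {"a.php", "e.php"}
p, r = precision_recall(predicted, actual)
```

Here one of the four flagged files is truly vulnerable, and one of the two vulnerable files is found. Flagging more files can only raise or hold recall, while it tends to lower precision, which is the trade-off described above.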
The logistic regression model that Weka generated to predict whether files were vulnerable, based on the number of hotspots and lines of code, consistently contained a positive coefficient for the term representing the number of hotspots in both projects. The positive coefficient indicates that a greater number of hotspots in a file in the current release corresponds to a higher predicted probability of that file containing any type of web application vulnerability (H5).
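To illustrate why a positive hotspot coefficient implies a higher predicted probability, here is a small sketch of the standard logistic function with two predictors. The coefficient values are made up for illustration; they are not the coefficients Weka produced in the study.

```python
import math


def vulnerability_probability(hotspots, loc, b0, b_hotspots, b_loc):
    """Logistic model: P(vulnerable) = 1 / (1 + exp(-z)),
    where z = b0 + b_hotspots * hotspots + b_loc * loc."""
    z = b0 + b_hotspots * hotspots + b_loc * loc
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients (assumption): a positive weight on hotspots.
B0, B_HOTSPOTS, B_LOC = -3.0, 0.4, 0.001

few = vulnerability_probability(hotspots=1, loc=300, b0=B0,
                                b_hotspots=B_HOTSPOTS, b_loc=B_LOC)
many = vulnerability_probability(hotspots=8, loc=300, b0=B0,
                                 b_hotspots=B_HOTSPOTS, b_loc=B_LOC)
```

Because the logistic function is strictly increasing in z, holding lines of code fixed and adding hotspots always raises the predicted probability when the hotspot coefficient is positive.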
Based upon these research results, our prioritization heuristic is as follows: ''More SQL and non-SQL vulnerabilities will be found in files that contain more hotspots per line of code.''
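One way to apply this heuristic is to rank files by hotspot density and inspect the densest files first. The sketch below assumes hotspot counts and line counts have already been collected per file; the function name, the input layout, and the example values are hypothetical.

```python
def prioritize(files):
    """Rank files by hotspots per line of code, highest density first.

    files: dict mapping file name -> (hotspot_count, lines_of_code)
    """
    def density(item):
        _name, (hotspots, loc) = item
        # Guard against empty files to avoid division by zero.
        return hotspots / loc if loc else 0.0

    ranked = sorted(files.items(), key=density, reverse=True)
    return [name for name, _ in ranked]


# Hypothetical measurements for three files.
files = {
    "login.php": (6, 200),   # 0.03 hotspots per line
    "index.php": (1, 500),   # 0.002 hotspots per line
    "search.php": (4, 80),   # 0.05 hotspots per line
}
order = prioritize(files)
```

In this example `search.php` ranks first even though `login.php` has more hotspots in absolute terms, because the heuristic is stated per line of code.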


=== 5.2. Comparing the Projects ===