Using SQL Hotspots in a Prioritization Heuristic for Detecting All Types of Web Application Vulnerabilities



=== 4.2 Identifying Hotspots ===
We refer to our list of files, together with the attributes and information recorded about them, as the <code>files</code> dataset.  We refer to our local copy of the Trac reports for each project, together with the attributes and information recorded about them, as the <code>tracs</code> dataset.
Both of our study subjects accessed the database management system through the PHP-provided function <code>mysql_query()</code>.  In WordPress, hotspots are wrapped in a class called <code>$wpdb</code>.  In WikkaWiki, by contrast, hotspots occur as calls to the <code>Query</code> function in the <code>Wakka</code> class.
Since manually identifying hotspots can be very time-consuming, we wrote a script that walks the file structure and searches for all instances of the project-specific string that indicates a hotspot.  Specifically, we manually inspected the code of each project until we could write a regular expression, and a matcher to apply it, that would catch all the different source code forms a hotspot takes in that project.  We checked the internal correctness of this script on ten files (five from each project) by manually counting the hotspots present in these files and comparing the result to the number calculated by our script.
The script's count matched the manual count in each of the ten files.  For each file analyzed, our script stored the number of hotspots found into the <code>files</code> dataset along with the file's absolute filename.  The script also stored a "yes" or "no" indicating whether the number of hotspots found was greater than zero.  Next, our script executed CLOC on each project.  CLOC produces a list of each PHP file in the project directory structure and the number of source lines of code in that file.  Our script parsed the CLOC output and stored the value for source lines of code into the <code>files</code> dataset for each file in the project.
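The procedure above can be illustrated with a minimal Python sketch. It is not the script used in the study: the regular expression, the function names <code>hotspot_record</code> and <code>sloc_by_file</code>, and the field names are all hypothetical, and the CSV layout (columns <code>language</code>, <code>filename</code>, <code>code</code>) is an assumption about CLOC's per-file CSV output rather than something stated in this section.

```python
import csv
import io
import re

# Hypothetical pattern: a direct mysql_query( call, a method call on $wpdb
# (WordPress), or a ->Query( call (WikkaWiki). The expression actually used
# in the study is not given, so this one is only an illustration.
HOTSPOT_RE = re.compile(
    r"mysql_query\s*\(|\$wpdb\s*->\s*\w+\s*\(|->\s*Query\s*\("
)


def hotspot_record(filename, source):
    """Build one row of the files dataset from a PHP source string."""
    count = len(HOTSPOT_RE.findall(source))
    return {
        "filename": filename,
        "hotspots": count,
        "has_hotspot": "yes" if count > 0 else "no",
    }


def sloc_by_file(cloc_csv):
    """Map filename -> source lines of code from per-file CLOC CSV text.

    Assumes a header row with 'language', 'filename' and 'code' columns;
    non-PHP rows (including any summary row) are skipped.
    """
    sloc = {}
    for row in csv.DictReader(io.StringIO(cloc_csv)):
        if row["language"] == "PHP":
            sloc[row["filename"]] = int(row["code"])
    return sloc
```

A pattern this broad would also count <code>$wpdb</code> method calls that do not issue a query, which is one reason a manual check of the counts on a sample of files, as described above, is worthwhile.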
=== Mapping Vulnerabilities to Files ===


== 5. Results ==