Using SQL Hotspots in a Prioritization Heuristic for Detecting All Types of Web Application Vulnerabilities
The script was 100% correct in each of the ten files. Our script stored the number of hotspots in each file from this procedure into the <code>files</code> dataset along with the absolute filename for the file being analyzed. The script also stored a "yes" or "no" indicating whether the number of hotspots found was greater than zero. Next, our script executed CLOC on each project. CLOC produces a list of each PHP file in the project directory structure and the number of source lines of code in that file. Our script parsed the CLOC output and stored the value for source lines of code into the <code>files</code> dataset for each file in the project.
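The CLOC parsing step could look roughly like the following. This is a minimal sketch, not the script used in the study: it assumes CLOC was run with its <code>--by-file</code> and <code>--csv</code> options and that the output carries <code>language</code>, <code>filename</code>, and <code>code</code> columns. The function name <code>sloc_by_file</code> and the returned dictionary are hypothetical stand-ins for the <code>files</code> dataset.

```python
import csv
import io


def sloc_by_file(cloc_csv_text, language="PHP"):
    """Map each file of the given language to its source lines of code.

    Assumes per-file CSV output with 'language', 'filename' and 'code'
    columns (an assumption about the report layout, not a guarantee).
    """
    reader = csv.DictReader(io.StringIO(cloc_csv_text))
    sloc = {}
    for row in reader:
        # Skip other languages and any summary row (e.g. a 'SUM' line).
        if row.get("language") != language:
            continue
        sloc[row["filename"]] = int(row["code"])
    return sloc
```

Filtering on the language column keeps non-PHP files and summary rows out of the dataset without relying on their position in the report.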
=== 4.3. Mapping Vulnerabilities to Files ===
All files in the <code>files</code> dataset were initialized as neutral and then marked as vulnerable when they were determined to have any number of vulnerabilities by the procedure in this section. We did not track the number of vulnerabilities in a given file and instead only marked the file as vulnerable or neutral since we are interested in predicting which files are vulnerable, not how many vulnerabilities are in each file. As described before, to be included in our study, projects had to include a developer-indicated repository revision for each reported issue in the Trac management system. If a file was changed due to a security issue that was reported in Trac, then the file is considered vulnerable for that release. We describe the process of determining whether an issue report is security-related (and thus indicates a vulnerability) in Section 4.4.
Since Trac integrates with the repository used for each project, we scripted the process of gathering the changed files and the changes made to each file in response to a given vulnerability report. If the Trac webpage that described the vulnerability also contained a link to a set of changes to the repository, then our script marked each changed file as vulnerable in the <code>files</code> dataset. If the webpage containing the description of the vulnerability contained no link to a set of changes, then the report had either been judged not to be a problem by the development team or did not warrant a change in the current release of the system, and it was excluded from our analysis. Trac allows the user to view a text-only version of the changes made in a given repository revision over the web in diff format (by appending <code>?format=diff</code> to the end of the URL). Our script parsed the resulting diff webpage into data indicating the files that were changed as well as the number of lines changed for each file.
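The diff-parsing step could be sketched as follows. This is an illustration under stated assumptions, not the script used in the study: it assumes the text returned for <code>?format=diff</code> is a standard unified diff in which each <code>+++</code> header holds the file path followed by a tab and a revision note. The function name <code>changed_lines_per_file</code> is hypothetical, and "lines changed" is taken here to mean added plus removed lines.

```python
import re
from collections import defaultdict

# Hunk header such as "@@ -12,7 +12,9 @@"; a missing length means 1.
HUNK_RE = re.compile(r"^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@")


def changed_lines_per_file(diff_text):
    """Return {path: added + removed line count} for a unified diff."""
    changes = defaultdict(int)
    current = None
    old_left = new_left = 0  # lines still expected in the current hunk
    for line in diff_text.splitlines():
        if old_left > 0 or new_left > 0:
            # Inside a hunk: classify by the first character only, so a
            # removed line that itself starts with "--" is not mistaken
            # for a file header.
            if line.startswith("+"):
                new_left -= 1
                changes[current] += 1
            elif line.startswith("-"):
                old_left -= 1
                changes[current] += 1
            elif line.startswith("\\"):
                continue  # "\ No newline at end of file" marker
            else:
                old_left -= 1  # context line counts on both sides
                new_left -= 1
            continue
        if line.startswith("+++ "):
            # Assumed header layout: "+++ <path>\t(revision N)".
            current = line[4:].split("\t")[0].strip()
            changes[current] += 0  # register files with no counted lines
        else:
            match = HUNK_RE.match(line)
            if match and current is not None:
                old_left = int(match.group(1) or 1)
                new_left = int(match.group(2) or 1)
    return dict(changes)
```

Every path returned by this function would be marked as vulnerable in the <code>files</code> dataset. Tracking the hunk lengths, rather than matching on line prefixes alone, avoids miscounting changed lines whose content happens to begin with <code>--</code> or <code>++</code>.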
== 5. Results ==