A collection of repair prompts for finding and fixing security vulnerabilities in JavaScript programs using Large Language Models (LLMs).
The repair prompts follow 3 prompt templates, which range from no additional context to comprehensive detail:
- context-free template: provide no additional context
- context-sensitive template: specify the name of the expected vulnerability
- context-rich template: include comments along with the vulnerable code, providing a comprehensive explanation of the vulnerability and its potential exploitation (if applicable)
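
As a rough illustration of how the three templates differ, the sketch below builds one prompt of each kind for the same code sample. The prompt wording, the `buildRepairPrompt` helper, and the `exampleSample` object are hypothetical and only show the amount of context each template carries; they are not the prompts used in the tests.

```javascript
// Hypothetical sample: a DOM-based XSS snippet with its CWE metadata.
const exampleSample = {
  cwe: "CWE-79",
  name: "Cross-site Scripting",
  explanation:
    "The URL fragment is written to innerHTML without escaping.\n" +
    "An attacker can craft a link whose fragment injects markup that runs script.",
  code: 'document.getElementById("out").innerHTML = location.hash.slice(1);',
};

// Builds a repair prompt for one of the three templates.
// The wording is an assumption made for illustration.
function buildRepairPrompt(template, sample) {
  const ask =
    "Find and fix any security vulnerabilities in the following JavaScript code.";
  switch (template) {
    case "context-free":
      // No additional context: only the request and the code.
      return `${ask}\n\n${sample.code}`;
    case "context-sensitive":
      // Name the expected vulnerability.
      return `The following JavaScript code contains ${sample.cwe} (${sample.name}). Find and fix it.\n\n${sample.code}`;
    case "context-rich": {
      // Put the explanation in comments placed directly above the code.
      const comment = [`${sample.cwe} (${sample.name})`, ...sample.explanation.split("\n")]
        .map((line) => `// ${line}`)
        .join("\n");
      return `${ask}\n\n${comment}\n${sample.code}`;
    }
    default:
      throw new Error(`Unknown template: ${template}`);
  }
}

for (const template of ["context-free", "context-sensitive", "context-rich"]) {
  console.log(`--- ${template} ---\n${buildRepairPrompt(template, exampleSample)}\n`);
}
```

Only the context-sensitive and context-rich prompts mention the CWE, and only the context-rich prompt carries the explanatory comments.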
The vulnerabilities described in these prompts are adapted from the 2023 CWE Top 25 list. Because not every vulnerability in that list applies to JavaScript, the 20 vulnerabilities most relevant to JavaScript were selected.
Combining the 3 prompt templates with the 20 selected vulnerabilities gives a total of 60 prompts, which were tested with two LLMs: ChatGPT and Bard.
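
The prompt count can be checked by enumerating the combinations. This is a minimal sketch: the CWE IDs are the ones listed in the results table, the variable names are hypothetical, and it assumes each prompt is run once per model.

```javascript
// The 20 selected CWE IDs, as listed in the results table.
const CWE_IDS = [
  20, 22, 77, 78, 79, 89, 94, 125, 190, 269,
  276, 287, 306, 434, 476, 502, 787, 798, 862, 863,
].map((n) => `CWE-${n}`);
const TEMPLATE_NAMES = ["context-free", "context-sensitive", "context-rich"];
const MODELS = ["ChatGPT", "Bard"];

// One prompt per (vulnerability, template) pair.
const promptMatrix = CWE_IDS.flatMap((cwe) =>
  TEMPLATE_NAMES.map((template) => ({ cwe, template }))
);

// One run per (prompt, model) pair, assuming each prompt is sent once to each model.
const runs = promptMatrix.flatMap((prompt) =>
  MODELS.map((model) => ({ ...prompt, model }))
);

console.log(promptMatrix.length); // 60
console.log(runs.length); // 120
```

The 120 runs correspond to the 20 rows and 6 result columns of the table.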
The performance of these LLMs is as follows:
CWE | ChatGPT (context-free) | ChatGPT (context-sensitive) | ChatGPT (context-rich) | Bard (context-free) | Bard (context-sensitive) | Bard (context-rich) |
---|---|---|---|---|---|---|
CWE-20 | ✗ | ✓ | ✓ | ✗ | ✓ | ✓ |
CWE-22 | ✗ | ✓ | ✓ | ✗ | ✓ | ✓ |
CWE-77 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
CWE-78 | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ |
CWE-79 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
CWE-89 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
CWE-94 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
CWE-125 | ✗ | ✓ | ✓ | ✗ | ✓ | ✓ |
CWE-190 | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ |
CWE-269 | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ |
CWE-276 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
CWE-287 | ✗ | ✗ | ✓ | ✗ | ✓ | ✓ |
CWE-306 | ✗ | ✗ | ✓ | ✗ | ✓ | ✓ |
CWE-434 | ✗ | ✓ | ✓ | ✗ | ✓ | ✓ |
CWE-476 | ✓ | ✓ | ✓ | ✓ | ✗ | ✗ |
CWE-502 | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ |
CWE-787 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
CWE-798 | ✗ | ✓ | ✓ | ✗ | ✗ | ✗ |
CWE-862 | ✗ | ✓ | ✓ | ✗ | ✗ | ✓ |
CWE-863 | ✗ | ✓ | ✓ | ✗ | ✓ | ✓ |
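
To compare the templates at a glance, the table can be tallied per column. The sketch below transcribes the rows above as strings of marks and counts the ✓ entries, assuming ✓ denotes a successful result; the `RESULTS` encoding and the `tallyByColumn` helper are hypothetical names introduced for this example.

```javascript
const COLUMNS = [
  "ChatGPT context-free", "ChatGPT context-sensitive", "ChatGPT context-rich",
  "Bard context-free", "Bard context-sensitive", "Bard context-rich",
];

// One string of six marks per CWE, in the column order above.
const RESULTS = {
  "CWE-20": "✗✓✓✗✓✓", "CWE-22": "✗✓✓✗✓✓", "CWE-77": "✓✓✓✓✓✓",
  "CWE-78": "✓✓✓✗✓✓", "CWE-79": "✓✓✓✓✓✓", "CWE-89": "✓✓✓✓✓✓",
  "CWE-94": "✓✓✓✓✓✓", "CWE-125": "✗✓✓✗✓✓", "CWE-190": "✗✗✓✗✗✗",
  "CWE-269": "✗✗✗✗✓✓", "CWE-276": "✓✓✓✓✓✓", "CWE-287": "✗✗✓✗✓✓",
  "CWE-306": "✗✗✓✗✓✓", "CWE-434": "✗✓✓✗✓✓", "CWE-476": "✓✓✓✓✗✗",
  "CWE-502": "✗✓✓✓✓✓", "CWE-787": "✓✓✓✓✓✓", "CWE-798": "✗✓✓✗✗✗",
  "CWE-862": "✗✓✓✗✗✓", "CWE-863": "✗✓✓✗✓✓",
};

// Counts the ✓ marks in each column and returns { columnName: count }.
function tallyByColumn(results) {
  const counts = new Array(COLUMNS.length).fill(0);
  for (const marks of Object.values(results)) {
    [...marks].forEach((mark, i) => {
      if (mark === "✓") counts[i] += 1;
    });
  }
  return Object.fromEntries(COLUMNS.map((name, i) => [name, counts[i]]));
}

console.log(tallyByColumn(RESULTS));
```

For the table above, this gives 8, 16, and 19 ✓ marks out of 20 for ChatGPT (context-free, context-sensitive, context-rich) and 8, 16, and 17 out of 20 for Bard.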