🐛 fix sonarqube missing CWE for deduplication #10178

quirinziessler · 2024-05-10T09:30:32Z

During the import of Sonarqube findings I got aware of an issue related with the deduplication on the parser. For some kind of findings a cwe is not provided but is mandatory for the deduplication algorithm by settings.dist.py. The deduplication then falls back to the legacy dedupe hash model and the import takes longer than usual.

defectdojo-uwsgi-1  | [10/May/2024 07:38:28] WARNING [dojo.specific-loggers.deduplication:2746] Cannot compute hash_code based on configured fields because cwe is 0 for finding of title 'java:S3415_AYK2F0bHUOabgueBfBvD' found in file 'None'. Fallback to legacy mode for this finding.

I removed the cwe from the dedupe hash code and added other suitable values by extending the parser.

dryrunsecurity · 2024-05-10T09:30:55Z

Hi there 👋, @DryRunSecurity here, below is a summary of our analysis and findings.

DryRun Security	Status	Findings
Configured Codepaths Analyzer	✅	0 findings
Sensitive Files Analyzer	✅	0 findings
AppSec Analyzer	✅	0 findings
Authn/Authz Analyzer	✅	0 findings
Secrets Analyzer	✅	0 findings

Note

🟢 Risk threshold not exceeded.

Change Summary (click to expand)

The following is a summary of changes in this pull request made by me, your security buddy 🤖.
Note that this summary is auto-generated and not meant to be a definitive list of security issues
but rather a helpful summary from a security perspective.

Summary:

This pull request includes several changes across different files, all of which are focused on improving the security-related functionality and configuration of the DefectDojo application, an open-source vulnerability management tool.

The changes to the unittests/tools/test_sonarqube_parser.py file enhance the robustness and functionality of the SonarQube parser, allowing it to extract more detailed information from the SonarQube reports, including CWE, CVSS v3 scores, component names and versions, and associated tags. This additional information can provide more context and insight into the potential security vulnerabilities identified by the SonarQube scans.

The changes to the dojo/tools/sonarqube/sonarqube_restapi_json.py file focus on improving the security-related information extracted from the SonarQube REST API and enhancing the way it is presented to the security and development teams. This includes extracting more detailed vulnerability metadata, improving tagging and categorization, providing more detailed reporting, and handling of "hotspots" (potential security-related issues).

Finally, the changes to the dojo/settings/settings.dist.py file introduce several security-focused configuration options, such as enabling SAML2-based authentication, remote user authentication, improved deduplication and hashcode configuration, Celery configuration for asynchronous task processing, and various other security and logging settings.

Files Changed:

unittests/tools/test_sonarqube_parser.py:
- Added new test cases to handle various scenarios for parsing SonarQube reports, including JSON reports, reports with multiple findings, reports with tables within the vulnerability details, and handling of edge cases.
- The key changes from a security perspective are the extraction of more detailed vulnerability information, such as CWE, CVSS v3 scores, component names and versions, and associated tags.
dojo/tools/sonarqube/sonarqube_restapi_json.py:
- Extracted more detailed information about vulnerabilities, including CWE IDs, CVSS scores, and vulnerability IDs (CVE, GHSA).
- Improved tagging and categorization of findings, such as "bug", "vulnerability", and "code_smell".
- Provided more detailed reporting, including information about the rule, component, project, textRange, flows, status, and message.
- Handled "hotspots" (potential security-related issues) from the SonarQube JSON response.
dojo/settings/settings.dist.py:
- Enabled SAML2-based authentication and remote user authentication.
- Introduced new configuration options for deduplicating findings and computing hashcodes.
- Configured Celery for asynchronous task processing, including tasks such as adding alerts, cleaning up alerts, and updating findings from source issues.
- Included various security-related settings, such as CSRF protection, session cookies, and content security policies.
- Configured logging settings, including the ability to use JSON-formatted logs.
- Configured the API, including the use of the DRF Spectacular library for generating OpenAPI documentation.
- Included settings for configuring system-level notifications.

Powered by DryRun Security

Maffooch · 2024-05-13T15:07:49Z

@quirinziessler how much longer is import taking when using a different a hash code computation? I believe there is one more conditional and a log statement added to the hashcode computation. Sonarqube is a pretty popular tool, and this update would impact a lot of folks when their hash codes are now using different values, and there is no longer a match with existing sonarqube findings.

I would recommend updating the config in your instance to allow the hash code calculation for sonarqube to accept null CWEs:
https://github.com/DefectDojo/django-DefectDojo/blob/2c7b5066736bfe13ed42227f6832ac540e188499/dojo/settings/settings.dist.py#L1275C1-L1289C29

quirinziessler · 2024-05-13T18:46:35Z

@Maffooch I am aware of the widespread use of SQ and this is also why I opened this PR against bugfix. During testing I saw my import time dropping about 5 seconds until all processes finished for aprox 50 findings. This may sound like a small impact but after successful testing with around 200 SQ projects I can say this change should definitely be part of the next release, so everyone can use DD out of the box and to keep the required adaptations as low as possible.

Furthermore the SQ ID should be preferred instead of the CWE for deduplication in my eyes. A file can have multiple findings with the same CWE in multiple lines, e.g. "missing input validation". By using the SQ ID those findings will be reflected in DefectDojo and not marked as duplicates like it is currently. Unfortunately the finding line is not consequently provided within the output of SQ so this is the only possible way to catch this behavior.

quirinziessler · 2024-05-20T13:30:26Z

@Maffooch any updates here?

Maffooch · 2024-05-20T14:36:44Z

Apologies for missing this. If the hashcode is the be changed here, I think it should align with how the API based SonarQube parser is handling deduplication. It uses unique_tool_by_id based on the rule key.

A migration will also need to be written to update all of the hash codes for sonarqube findings. Something like this would need to go against the dev branch, as the migration could potentially take long time for users with hundreds of projects

quirinziessler · 2024-05-22T10:40:43Z

No worries @Maffooch. According to the dedupe of SQ and your statement this meets my changes. I can also prepare the same PR against the latest dev version. However I currently fail on how to write a db_migration for the hash codes as the models seem not to be accessible at migration time. Can you help me on this and maybe provide an example?

quirinziessler added 2 commits May 10, 2024 11:23

🐛 fix removed cwe from SQ dedupe and added other suitable keys

168f142

✏️ flake8

050bf5c

github-actions bot added settings_changes Needs changes to settings.py based on changes in settings.dist.py included in this PR unittests parser labels May 10, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

🐛 fix sonarqube missing CWE for deduplication #10178

🐛 fix sonarqube missing CWE for deduplication #10178

quirinziessler commented May 10, 2024

dryrunsecurity bot commented May 10, 2024

Maffooch commented May 13, 2024

quirinziessler commented May 13, 2024 •

edited

quirinziessler commented May 20, 2024

Maffooch commented May 20, 2024

quirinziessler commented May 22, 2024 •

edited

🐛 fix sonarqube missing CWE for deduplication #10178

Are you sure you want to change the base?

🐛 fix sonarqube missing CWE for deduplication #10178

Conversation

quirinziessler commented May 10, 2024

dryrunsecurity bot commented May 10, 2024

Maffooch commented May 13, 2024

quirinziessler commented May 13, 2024 • edited

quirinziessler commented May 20, 2024

Maffooch commented May 20, 2024

quirinziessler commented May 22, 2024 • edited

quirinziessler commented May 13, 2024 •

edited

quirinziessler commented May 22, 2024 •

edited