insecure_code_detector.cli doesn't detect insecure code as expected #35

Closed
fuhengwu2021 opened this issue May 13, 2024 · 5 comments

@fuhengwu2021

I have a Java file, Sample.java. It contains the line import java.net.URL; which I expected to be detected by CybersecurityBenchmarks/insecure_code_detector/rules/semgrep/java/third-party/ssrf.yaml. But after running ICD, nothing was detected. Does anybody know why?

Sample.java

import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;
import java.net.URL;

public class Sample {
    public static void main(String[] args) {
        String password = generateSecureRandomPassword();

rule: (screenshot of the ssrf.yaml rule)

Result:

2024-05-13 18:28:21,636 [INFO] ICD took 968ms
2024-05-13 18:28:21,636 [INFO] Found 0 issues
@csahana95
Contributor

csahana95 commented May 14, 2024

Hi! Thanks for reporting. Could you please specify how you ran ICD?

Also, the shared code snippet doesn't appear to contain a match for the SSRF rule. If you look at the rule in detail, it looks for patterns like new URL(url).openConnection().connect(); or similar. An import of java.net.URL on its own is not enough to trigger it.
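
To make the distinction concrete, here is a minimal sketch of a file that imports java.net.URL and also contains the call chain quoted above. The class name SsrfSample and the method canConnect are hypothetical, made up for illustration. Whether ICD flags this exact form depends on the contents of ssrf.yaml, which aren't reproduced in this thread.

```java
import java.io.IOException;
import java.net.URL;

// Hypothetical sample, not taken from the repository or its test data.
class SsrfSample {

    // Opens a connection to whatever URL the caller passes in.
    // This is the call shape quoted in the reply above; the import
    // alone, with no such call, is what the original Sample.java had.
    // Note: new URL(String) is deprecated on JDK 20+ but still compiles.
    static boolean canConnect(String url) {
        try {
            new URL(url).openConnection().connect();
            return true;
        } catch (IOException e) {
            // Covers MalformedURLException as well as connection failures.
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length > 0) {
            System.out.println(canConnect(args[0]));
        }
    }
}
```

Scanning a file like this next to the original Sample.java should show whether the rule is being loaded at all, which would help separate a rule-matching question from a problem with how ICD was invoked.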

@SimonWan
Contributor

By the way, if you want to use the Insecure Code Detector independently, without running CyberSecEval, consider switching to CodeShield, our latest, upgraded version. For more context, please refer to its README.

@fuhengwu2021
Author

Thanks for the answers, @csahana95 and @SimonWan. I am not very familiar with this domain, but from my understanding, CodeShield seems to be a thin wrapper around ICD, since it just uses an LLM to parse ICD's results and make them more human-readable, right?

Also, is there an example showing that ICD can catch problematic code generated by an LLM? I tried many prompts, but the LLM generated secure code every time. Could you please share some prompts so I can see the value of ICD?

@SimonWan
Contributor

SimonWan commented May 16, 2024

Hi @fuhengwu2021

> seems a thin wrapper of ICD because it just uses LLM to parse the result of ICD to make it more human readable, right?

Not exactly. The CodeShield README provides more details, but the TL;DR is that CodeShield has improved performance (efficiency, etc.) compared to the insecure-coding-practice repo you referred to.

> Could you please share some prompts so I can see the value of ICD?

Example prompts are in the prompt dataset we open-sourced, listed under the ICD benchmark: https://github.com/meta-llama/PurpleLlama/tree/main/CybersecurityBenchmarks#running-instruct-and-autocomplete-benchmarks

You can also run the commands documented at that link to query an LLM with these prompts directly and observe some of the insecure code it generates.

@SimonWan
Contributor

I am closing this issue now as there has been no response in two weeks. Feel free to reopen it.

SimonWan self-assigned this May 30, 2024