Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FP in trivy because of confusing sha1 info in trivy java db #15

Open
wagde-orca opened this issue Apr 9, 2023 · 8 comments
Open

FP in trivy because of confusing sha1 info in trivy java db #15

wagde-orca opened this issue Apr 9, 2023 · 8 comments

Comments

@wagde-orca
Copy link

wagde-orca commented Apr 9, 2023

I see several issues in java db that all together causes a FP in trivy and need to be handled in the trivy-java-db and in trivy...

  1. in the latest fetched trivy java db I see that the latest version of kaml is missing in the db. version 0.53.0 was released on 18.3.2022. the question why is it missing... maybe because of next item
  2. as you can see https://repo1.maven.org/maven2/com/charleskorn/kaml/kaml/0.52.0/kaml-0.52.0.jar.sha1 and https://repo1.maven.org/maven2/com/charleskorn/kaml/kaml/0.53.0/kaml-0.53.0.jar.sha1. The two versions have the same sha1... I wonder how can this be.... and this is leading to next item
  3. in case we try to scan kaml-0.53.0.jar and because the only entry in the java db for the sha1 1464f167409b1df8aa89b1630f06036b71872b7a is for 0.52.0, and because the file does not have version info, we query the db for the sha1 and get version 0.52.0 and report CVE-2023-28118 for version 0.53.0 which was fixed in 0.53.0 as stated in https://nvd.nist.gov/vuln/detail/CVE-2023-28118
@wagde-orca wagde-orca changed the title FP in trivy because of missing sha1 in trivy java db FP in trivy because of confusing sha1 info in trivy java db Apr 9, 2023
@knqyf263
Copy link
Collaborator

knqyf263 commented May 9, 2023

I wonder if the digest is wrong. Weirdly, two versions have the same digest.

@prabhu
Copy link

prabhu commented May 15, 2023

@knqyf263, I told you guys on Twitter about this a few weeks ago :) I'm recreating this project to store additional metadata about the jar, such as package names, class names, etc. a better semantic comparison could be made to match packages. It's a fairly sizeable project, so collaborating would be nice.

@knqyf263
Copy link
Collaborator

If the digests are the same, all the information such as the class names are the same, right? Storing additional metadata doesn't help, then. But it could help if digests are wrong and we cannot trust them.

@prabhu
Copy link

prabhu commented May 15, 2023

Just tested my tool and it worked correctly.

With kaml there was no difference since the jars are identical with a placeholder manifest file, so the sbom tool could rightly say kaml-0.53.0 is really kaml-0.52.0.

With kaml-jvm there is a subtle difference which got picked up.

❯ diff kaml-jvm-0.52.0.cpg.slices.json kaml-jvm-0.53.0.cpg.slices.json
155a156,171
>       "name" : "com.charleskorn.kaml.ForbiddenAnchorOrAliasException",
>       "fields" : [
>       ],
>       "procedures" : [
>         {
>           "callName" : "<init>",
>           "paramTypes" : [
>             "com.charleskorn.kaml.ForbiddenAnchorOrAliasException",
>             "java.lang.String",
>             "com.charleskorn.kaml.YamlPath"
>           ],
>           "returnType" : "void"
>         }
>       ]
>     },
>     {
1222a1239,1243
>         },
>         {
>           "name" : "allowAnchorsAndAliases",
>           "typeFullName" : "boolean",
>           "literal" : false
1324a1346,1352
>           "callName" : "getAllowAnchorsAndAliases$kaml",
>           "paramTypes" : [
>             "com.charleskorn.kaml.YamlConfiguration"
>           ],
>           "returnType" : "boolean"
>         },
>         {
1347a1376
>             "boolean",
1368c1397,1398
<             "int"
---
>             "int",
>             "boolean"
1428a1459,1465
>           "callName" : "component13$kaml",
>           "paramTypes" : [
>             "com.charleskorn.kaml.YamlConfiguration"
>           ],
>           "returnType" : "boolean"
>         },
>         {
1471c1508,1509
<             "int"
---
>             "int",
>             "boolean"
1490a1529
>             "boolean",
3397a3437,3441
>           "name" : "$path",
>           "typeFullName" : "com.charleskorn.kaml.YamlPath",
>           "literal" : false
>         },
>         {
3429a3474
>             "com.charleskorn.kaml.YamlPath",
3450a3496,3500
>           "name" : "allowAnchorsAndAliases",
>           "typeFullName" : "boolean",
>           "literal" : false
>         },
>         {
3614a3665,3671
>           "callName" : "access$getAllowAnchorsAndAliases$p",
>           "paramTypes" : [
>             "com.charleskorn.kaml.YamlNodeReader"
>           ],
>           "returnType" : "boolean"
>         },
>         {
3626c3683,3684
<             "java.lang.String"
---
>             "java.lang.String",
>             "boolean"
3635a3694
>             "boolean",

Commands and tools used to create this intermediate representation for comparison.

# Install cpggen from https://github.com/AppThreat/cpggen
cpggen -i "pkg:maven/com.charleskorn.kaml/[email protected]" --slice
cpggen -i "pkg:maven/com.charleskorn.kaml/[email protected]" --slice

@prabhu
Copy link

prabhu commented May 15, 2023

If the digests are the same, all the information such as the class names are the same, right? Storing additional metadata doesn't help, then. But it could help if digests are wrong and we cannot trust them.

The logic used would be what is the lowest version of this package that is semantically similar to the version referred and downloaded for this repo. So even if kaml-0.52.0 and kaml-0.53.0 have same hash and content, the sbom tool would consider it as kaml-0.52.0.

Further, hashes should never be trusted. Someone could copy the source code of kaml and republish it as a different jar. Or they might perform trivial transformations such as changing the package or class names alone. So if a vulnerability exists in "kaml" it would also exist in these derivations but would never get reported or identified with traditional sbom collection techniques.

@knqyf263
Copy link
Collaborator

knqyf263 commented May 15, 2023

Further, hashes should never be trusted.

We can discuss it, but it seems to be off-topic in this issue. The original problem is kaml-0.52.0 and kaml-0.53.0 are the same, but the advisory says CVE-2023-28118 was fixed in 0.53.0. If kaml-jvm has differences, isn't GHSA wrong? It refers to com.charleskorn.kaml:kaml, but shouldn't it be com.charleskorn.kaml:kaml-jvm?
https://mvnrepository.com/artifact/com.charleskorn.kaml/kaml-jvm

@prabhu
Copy link

prabhu commented May 15, 2023

It's one of those meta-packages where referring it would download the correct child package. https://github.com/charleskorn/kaml#referencing-kaml. GHSA could help by referring to kaml-jvm in addition to kaml, and there is a precedence with log4j-api already. But in this case, the issue is not false negatives but false positives due to tools relying only on hashes.

@knqyf263
Copy link
Collaborator

But in this case, the issue is not false negatives but false positives due to tools relying only on hashes.

Why...?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants