Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The JSON for SSVC "options" splits out keys into individual records #576

Open
jayjacobs opened this issue May 30, 2024 · 12 comments · May be fixed by #588
Open

The JSON for SSVC "options" splits out keys into individual records #576

jayjacobs opened this issue May 30, 2024 · 12 comments · May be fixed by #588
Labels
enhancement New feature or request

Comments

@jayjacobs
Copy link

The schema (with example in data/schema_examples/Computed-CVE-2014-0751-Coordinator.json) is being leveraged by CISA for Vulnrichment and creates some unfriendly JSON.

I opened issue #40 on vulnrichment to talk about it and they suggested they are just following this schema.

Long story short, by specifying each key:value pair in it's own object under options, it is flattened (by tools) to be unique records, when all of those key:value pairs represent a single object. (see issue 40 in vulnrichment)

It also does not make sense to have the options specified as an array if it is a single object tied to the single computed field in the same record.

Fixed example:

{
    "role": "Coordinator",
    "id": "CVE-2014-0751",
    "version": "2.0.3",
    "computed": "SSVCv2/E:A/V:S/T:T/P:M/B:A/M:M/D:A/2021-09-29T15:29:44Z/",
    "timestamp": "2021-09-29T15:29:44Z",
    "options": {
	"Exploitation": "active",
	"Automatable": "no",
	"Technical Impact": "total",
	"Mission Prevalence": "Minimal",
	"Public Well-being Impact": "Material",
	"Mission & Well-being": "medium"
    },
    "$schema": "https://democert.org/ssvc/SSVC_Computed_v2.0.3.schema.json",
    "decision_tree_url": "https://democert.org/ssvc/CISA-Coordinator-v2.0.3.json"
}
@jayjacobs jayjacobs added the enhancement New feature or request label May 30, 2024
@sei-vsarvepalli
Copy link
Contributor

Hello Jay,

Thanks for your feedback. I believe the intent in options being an array was for it to be an ordered list, the Decisions Points in schema are considered ordered. This makes it possible to arrange the Decision Point bundles and embed them both in a Computed schema as well the full Decision Tree schema. There was some discussion related to this earlier that may be relevant

#403
#290

Some related information from JSON Schema discussions Google Group

https://groups.google.com/g/json-schema/c/rgkxYocPSVg

@jayjacobs
Copy link
Author

First, I do not understand the need for these to be an ordered list. The "decision tree" in SSVC is simply for visualizing. The ordering of variables will never change the outcome. But that discussion is not what this issue is about.

Let's just assume they are ordered, it still does not mean that the individual variables need to be represented in the JSON as ordered. In other words, the representation/presentation can be different than the storage.

In #290 it is stated:

This is also generally true of (I think) all CVSS vector elements to date. (Counterexamples hereby requested.)

Which I think is true, the CVSS vector representation is ordered, so take this example from CVE-2007-3484 where the CVSS is stored in JSON:

            "cvssV3_1": {
              "scope": "CHANGED",
              "version": "3.1",
              "baseScore": 6.1,
              "attackVector": "NETWORK",
              "baseSeverity": "MEDIUM",
              "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N",
              "integrityImpact": "LOW",
              "userInteraction": "REQUIRED",
              "attackComplexity": "LOW",
              "availabilityImpact": "NONE",
              "privilegesRequired": "NONE",
              "confidentialityImpact": "LOW"
            }

Notice how the vectorString is showing the ordering of the variables, but the variables themselves are stored in the object unordered. This doesn't make the values any less ordered for display than any other storage mechanism. It also "flattens" down to structured data rather nicely and can then be past to a presentation layer or wherever.

I maintain that the way the options for SSVC are stored are sub-optimal, and can be greatly simplified that could still support ordering the variables for whatever presentation may occur and allow clean flattening and extraction of structured data.

@ahouseholder
Copy link
Contributor

ahouseholder commented May 30, 2024

I think there's some confusion here on what needs to be "ordered" in SSVC. The rationale is laid out in decision record 0008, but the gist is that within a single decision point (think CVSS vector element), the values must be ordered. So for exploitation, $None < PoC < Active$, or for confidentialityImpact, $None < Low < High$.

That is not saying anything about how to represent multiple specific values across decision points, which seems to be what this thread is touching on.

I realize this doesn't resolve the question, I just wanted to point out that the "ordering" line of argument might be a red herring to the issue at hand. I haven't had a chance to reexamine the json schema mentioned, so I'm not prepared to comment on the rest of the thread at the moment.

@sei-vsarvepalli
Copy link
Contributor

Ah okay.

I think your recommended schema will work. The current schema also works, but I think your request is to primarily optimize the schema? Will the records consuming tools fail to understand or process the schema?

If you feel strongly the flattening of the schema is user, you can make a PR suggestion with the schema, and update the JavaScript library for us to run our tests and take it in. We only need that your PR be digitally signed and be evaluated by us before merging it.

On a related topic:

The reason for the Computed schema options field to be array of objects whose items can be array or string is so one can specify a Decision Point to have multiple potential values either because a Decision Point has not yet evaluated or Decision Point has been evaluated to "exclude" one of the options.

For example a valid SSVC computed schema options below states that Exploitation is either "poc" or "active" but NOT "none" and Mission Prevalence has NOT been evaluated by the Role of this metrics developer.

                "options": [
                  {
                    "Exploitation": ["poc","active"]
                  },
                  {
                    "Automatable": "no"
                  },
                  {
                    "Technical Impact": "partial"
                  },
                  {
                    "Mission Prevalence": ["minimal","support","essential"]
                  }
                ]

@ahouseholder
Copy link
Contributor

So ignoring the current schema for a moment, the CVSS example @jayjacobs mentioned might turn @sei-vsarvepalli's example into

                "options": {
                    "Exploitation": ["poc","active"],
                    "Automatable": "no",
                    "Technical Impact": "partial",
                    "Mission Prevalence": ["minimal","support","essential"]
                }

But that would have the unfortunate side effect of having a dictionary whose keys are always strings but whose values could be strings or lists, which could lead to ambiguous parsing.

So maybe

                "options": {
                    "Exploitation": ["poc","active"],
                    "Automatable": ["no"],
                    "Technical Impact": ["partial"],
                    "Mission Prevalence": ["minimal","support","essential"]
                }

would be preferable? Every value is consistently a list of strings. The list could be of length 1 up to the entire decision point (implying that the record is asserting no information about that decision point)

This seems logically consistent with what we say in https://certcc.github.io/SSVC/howto/bootstrap/use/#partial-or-incomplete-information

The basic guidance is that the analyst should communicate all of the vulnerability's possible states, to the best of the analyst's knowledge. If the analyst knows nothing, all states are possible.
...
The merit in this “list all values” approach emerges when the stakeholder knows that the value for a decision point may be A or B, but not C.

@jayjacobs
Copy link
Author

I've given this some thought and I cannot open a PR for the work required, I just do not have the time and you do not want me working on javascript.

But for future reference, an array of objects is generally treated as a collection of separate objects (think of unique rows in a table), this is the only JSON I've come across (so far) where an array is used to order a single object - meaning each object in the JSON is actually a part of an object defined at the array level. It's just not standard JSON practice.

Also, an array denotes a one-to-many relationship. If you what something like "Exploitation" to have a one-to-many relationship with its values, than the whole thing should be an array regardless if one or many values exist, do not store When someone needs to convert JSON data into columnar format that relationship needs to apply across all records and cannot exist in a per-record definition. This means that even simple records would be an array:

    "options": {
	"Exploitation": [ "active" ],
	"Automatable": [ "no" ],
	"Technical Impact": [ "total" ],
	"Mission Prevalence": [ "Minimal" ],
	"Public Well-being Impact": [ "Material" ],
	"Mission & Well-being": [ "medium" ]
    },

@ahouseholder
Copy link
Contributor

    "options": {
	"Exploitation": [ "active" ],
	"Automatable": [ "no" ],
	"Technical Impact": [ "total" ],
	"Mission Prevalence": [ "Minimal" ],
	"Public Well-being Impact": [ "Material" ],
	"Mission & Well-being": [ "medium" ]
    },

I think this is consistent with what I intended in my comment above -- that the values would always be an array, even if there is only a single element.

However, due to the nearly coincident timing of our respective comments, I'm not sure whether you were responding to mine or whether we were both replying nearly simultaneously.

@jayjacobs
Copy link
Author

Yes sorry, my last comment was for @sei-vsarvepalli and your comment came in right before I posted. I agree @ahouseholder, your comment and mine seem to be very much in line.

@j---
Copy link
Collaborator

j--- commented May 31, 2024

The decision points don't need to be an ordered list; the order of the decision points in a tree is not material to the output. There are some display choices we make about ordering points with a tree's display, but that should be something that is ensured by the display tool, not the JSON format.

I agree the values should always be an array for each decision point, just to keep us aligned with the ability to relay partial information (though we could determine that use case is overtaken by events and no one wants to do it).

Thanks Jay and Allen for converging on a solution.

@ahouseholder
Copy link
Contributor

ahouseholder commented Jun 25, 2024

Capturing some additional thoughts on a way forward based on internal discussions:

flowchart TD
SSVC_Computed --> Decision_Point_Group
SSVC_Provision --> Decision_Point_Group
Decision_Point_Group --> Decision_Point
Loading
  • We'd want any changes to also include the array changes already described in this thread as well.

@sei-vsarvepalli
Copy link
Contributor

@jayjacobs

I have a branch that tries to resolve this issue with a consistent schema that can be used for Decision Points and Decision Point Groups.

https://github.com/sei-vsarvepalli/SSVC/tree/feature/issue_576

specifically take a look at https://github.com/sei-vsarvepalli/SSVC/blob/feature/issue_576/data/schema_examples/Computed-CVE-2014-0751-Coordinator.json

if that comes close to what would be reliable way to parse and represent SSVC data via ADP. This requires a bit of coordination with CISA developers to ensure the new schema also gets adopted.

sei-vsarvepalli added a commit to sei-vsarvepalli/SSVC that referenced this issue Jun 25, 2024
sei-vsarvepalli added a commit to sei-vsarvepalli/SSVC that referenced this issue Jun 25, 2024
@sei-vsarvepalli sei-vsarvepalli linked a pull request Jun 25, 2024 that will close this issue
@sei-vsarvepalli
Copy link
Contributor

Capturing some additional thoughts on a way forward based on internal discussions:

flowchart TD
SSVC_Computed --> Decision_Point_Group
SSVC_Provision --> Decision_Point_Group
Decision_Point_Group --> Decision_Point
Loading
  • We'd want any changes to also include the array changes already described in this thread as well.

This part is also resolved in the recent PR #588

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants