The JSON for SSVC "options" splits out keys into individual records #576

jayjacobs · 2024-05-30T13:37:31Z

The schema (with example in data/schema_examples/Computed-CVE-2014-0751-Coordinator.json) is being leveraged by CISA for Vulnrichment and creates some unfriendly JSON.

I opened issue #40 on vulnrichment to talk about it and they suggested they are just following this schema.

Long story short, by specifying each key:value pair in its own object under options, it is flattened (by tools) into separate records, when all of those key:value pairs represent a single object. (see issue 40 in vulnrichment)

It also does not make sense to have the options specified as an array if it is a single object tied to the single computed field in the same record.

Fixed example:

{
    "role": "Coordinator",
    "id": "CVE-2014-0751",
    "version": "2.0.3",
    "computed": "SSVCv2/E:A/V:S/T:T/P:M/B:A/M:M/D:A/2021-09-29T15:29:44Z/",
    "timestamp": "2021-09-29T15:29:44Z",
    "options": {
        "Exploitation": "active",
        "Automatable": "no",
        "Technical Impact": "total",
        "Mission Prevalence": "Minimal",
        "Public Well-being Impact": "Material",
        "Mission & Well-being": "medium"
    },
    "$schema": "https://democert.org/ssvc/SSVC_Computed_v2.0.3.schema.json",
    "decision_tree_url": "https://democert.org/ssvc/CISA-Coordinator-v2.0.3.json"
}
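
A minimal sketch of the flattening behavior described above, assuming a generic tool that emits one row per element of an array of objects. The `explode_records` helper and the abbreviated records are hypothetical, written only to contrast the two shapes; real flattening tools may behave differently in detail.

```python
def explode_records(record, array_key):
    """Mimic a generic flattener: one output row per element of an array
    of objects, or a single row when the field is a plain object."""
    base = {k: v for k, v in record.items() if k != array_key}
    value = record[array_key]
    items = value if isinstance(value, list) else [value]
    return [{**base, **item} for item in items]


# Current shape: each key:value pair sits in its own object (abbreviated).
current = {
    "id": "CVE-2014-0751",
    "options": [
        {"Exploitation": "active"},
        {"Automatable": "no"},
        {"Technical Impact": "total"},
    ],
}

# Proposed shape: one object holding all key:value pairs.
proposed = {
    "id": "CVE-2014-0751",
    "options": {
        "Exploitation": "active",
        "Automatable": "no",
        "Technical Impact": "total",
    },
}

current_rows = explode_records(current, "options")
proposed_rows = explode_records(proposed, "options")

print(len(current_rows))   # one sparse row per decision point
print(len(proposed_rows))  # a single row carrying every decision point
```

With the array shape, each row holds only one decision point and the consumer has to regroup rows by `id` to recover the original object.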


sei-vsarvepalli · 2024-05-30T14:58:36Z

Hello Jay,

Thanks for your feedback. I believe the intent in making options an array was for it to be an ordered list, since the Decision Points in the schema are considered ordered. This makes it possible to arrange the Decision Point bundles and embed them both in a Computed schema as well as the full Decision Tree schema. There was some earlier discussion related to this that may be relevant:

#403
#290

Some related information from JSON Schema discussions Google Group

https://groups.google.com/g/json-schema/c/rgkxYocPSVg

jayjacobs · 2024-05-30T18:35:27Z

First, I do not understand the need for these to be an ordered list. The "decision tree" in SSVC is simply for visualizing. The ordering of variables will never change the outcome. But that discussion is not what this issue is about.

Let's just assume they are ordered. That still does not mean the individual variables need to be represented in the JSON as ordered. In other words, the representation/presentation can differ from the storage.

In #290 it is stated:

This is also generally true of (I think) all CVSS vector elements to date. (Counterexamples hereby requested.)

Which I think is true, the CVSS vector representation is ordered, so take this example from CVE-2007-3484 where the CVSS is stored in JSON:

            "cvssV3_1": {
              "scope": "CHANGED",
              "version": "3.1",
              "baseScore": 6.1,
              "attackVector": "NETWORK",
              "baseSeverity": "MEDIUM",
              "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N",
              "integrityImpact": "LOW",
              "userInteraction": "REQUIRED",
              "attackComplexity": "LOW",
              "availabilityImpact": "NONE",
              "privilegesRequired": "NONE",
              "confidentialityImpact": "LOW"
            }

Notice how the vectorString shows the ordering of the variables, but the variables themselves are stored in the object unordered. This doesn't make the values any less ordered for display than any other storage mechanism. It also "flattens" down to structured data rather nicely and can then be passed to a presentation layer or wherever.
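
As a rough illustration of keeping storage and presentation separate, the sketch below rebuilds the ordered vector string from the unordered object. `VECTOR_ORDER` and `build_vector` are hypothetical names, and taking the first letter of each value is a simplification that happens to hold for CVSS v3.1 base metrics.

```python
# Display order and abbreviations for the CVSS v3.1 base metrics.
VECTOR_ORDER = [
    ("AV", "attackVector"),
    ("AC", "attackComplexity"),
    ("PR", "privilegesRequired"),
    ("UI", "userInteraction"),
    ("S", "scope"),
    ("C", "confidentialityImpact"),
    ("I", "integrityImpact"),
    ("A", "availabilityImpact"),
]


def build_vector(cvss):
    """Render an ordered vector string from an unordered metrics object."""
    # Simplification: each v3.1 base metric value abbreviates to its first letter.
    parts = [f"{abbr}:{cvss[key][0]}" for abbr, key in VECTOR_ORDER]
    return f"CVSS:{cvss['version']}/" + "/".join(parts)


# The stored object from the example above, keys in arbitrary order.
cvss = {
    "scope": "CHANGED",
    "version": "3.1",
    "attackVector": "NETWORK",
    "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N",
    "integrityImpact": "LOW",
    "userInteraction": "REQUIRED",
    "attackComplexity": "LOW",
    "availabilityImpact": "NONE",
    "privilegesRequired": "NONE",
    "confidentialityImpact": "LOW",
}

print(build_vector(cvss))
```

The ordering lives entirely in the presentation code, so the stored object can stay a plain key:value mapping.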

I maintain that the way the options for SSVC are stored is sub-optimal and can be greatly simplified, while still supporting ordering of the variables for whatever presentation may occur and allowing clean flattening and extraction of structured data.

ahouseholder · 2024-05-30T20:39:43Z

I think there's some confusion here on what needs to be "ordered" in SSVC. The rationale is laid out in decision record 0008, but the gist is that within a single decision point (think CVSS vector element), the values must be ordered. So for exploitation, $None < PoC < Active$, or for confidentialityImpact, $None < Low < High$.

That is not saying anything about how to represent multiple specific values across decision points, which seems to be what this thread is touching on.
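
One way to picture the distinction, as a sketch only: the values inside a single decision point are ordered, while a selection across decision points is just a mapping. The `Exploitation` class below is a hypothetical illustration, not the project's own Python object.

```python
from enum import IntEnum


class Exploitation(IntEnum):
    """Values of one decision point, ordered from lowest to highest."""
    NONE = 0
    POC = 1
    ACTIVE = 2


# Ordering applies to the values within this decision point only.
lowest_first = sorted([Exploitation.ACTIVE, Exploitation.NONE, Exploitation.POC])
highest = max(Exploitation)

# A selection across decision points needs no order of its own:
# a plain mapping from decision point name to value is enough.
selection = {"Exploitation": Exploitation.ACTIVE}

print([value.name for value in lowest_first])
print(highest.name)
```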

I realize this doesn't resolve the question, I just wanted to point out that the "ordering" line of argument might be a red herring to the issue at hand. I haven't had a chance to reexamine the json schema mentioned, so I'm not prepared to comment on the rest of the thread at the moment.

sei-vsarvepalli · 2024-05-30T22:18:27Z

Ah okay.

I think your recommended schema will work. The current schema also works, but I think your request is primarily to optimize the schema? Will record-consuming tools fail to understand or process the schema?

If you feel strongly that flattening the schema is useful, you can submit a PR with the suggested schema and update the JavaScript library so we can run our tests and take it in. We only require that your PR be digitally signed and evaluated by us before it is merged.

On a related topic:

The reason the Computed schema options field is an array of objects whose items can be an array or a string is so that one can specify multiple potential values for a Decision Point, either because the Decision Point has not yet been evaluated or because it has been evaluated to "exclude" one of the options.

For example, the valid SSVC Computed schema options below state that Exploitation is either "poc" or "active" but NOT "none", and that Mission Prevalence has NOT been evaluated by the Role of this metrics developer.

                "options": [
                  {
                    "Exploitation": ["poc","active"]
                  },
                  {
                    "Automatable": "no"
                  },
                  {
                    "Technical Impact": "partial"
                  },
                  {
                    "Mission Prevalence": ["minimal","support","essential"]
                  }
                ]

ahouseholder · 2024-05-31T14:29:16Z

So ignoring the current schema for a moment, the CVSS example @jayjacobs mentioned might turn @sei-vsarvepalli's example into

                "options": {
                    "Exploitation": ["poc","active"],
                    "Automatable": "no",
                    "Technical Impact": "partial",
                    "Mission Prevalence": ["minimal","support","essential"]
                }

But that would have the unfortunate side effect of having a dictionary whose keys are always strings but whose values could be strings or lists, which could lead to ambiguous parsing.

So maybe

                "options": {
                    "Exploitation": ["poc","active"],
                    "Automatable": ["no"],
                    "Technical Impact": ["partial"],
                    "Mission Prevalence": ["minimal","support","essential"]
                }

would be preferable? Every value is consistently a list of strings. The list could be of length 1 up to the entire decision point (implying that the record is asserting no information about that decision point)
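
A small sketch of what that normalization could look like, assuming the input is the array-of-single-key-objects shape shown earlier in the thread. `normalize_options` is a hypothetical helper, not part of the SSVC tooling.

```python
def normalize_options(options):
    """Convert the array-of-single-key-objects shape into one object whose
    values are always lists of strings."""
    normalized = {}
    for entry in options:
        for name, value in entry.items():
            # Wrap bare strings so every value has the same type.
            normalized[name] = [value] if isinstance(value, str) else list(value)
    return normalized


options = [
    {"Exploitation": ["poc", "active"]},
    {"Automatable": "no"},
    {"Technical Impact": "partial"},
    {"Mission Prevalence": ["minimal", "support", "essential"]},
]

normalized = normalize_options(options)
print(normalized)

# Why mixed value types are risky: iterating a bare string yields characters.
chars = list("no")
print(chars)
```

With every value a list, a consumer can iterate over the values of any decision point without first checking the type.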

This seems logically consistent with what we say in https://certcc.github.io/SSVC/howto/bootstrap/use/#partial-or-incomplete-information

The basic guidance is that the analyst should communicate all of the vulnerability's possible states, to the best of the analyst's knowledge. If the analyst knows nothing, all states are possible.
...
The merit in this “list all values” approach emerges when the stakeholder knows that the value for a decision point may be A or B, but not C.
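
To make the "list all values" reading concrete, here is a hedged sketch that classifies what a record asserts about one decision point. The value sets in `ALL_VALUES` follow the examples in this thread and are assumptions rather than the schema itself; `knowledge_state` is a hypothetical helper.

```python
# Illustrative value sets for two decision points (assumed, per this thread).
ALL_VALUES = {
    "Exploitation": ["none", "poc", "active"],
    "Mission Prevalence": ["minimal", "support", "essential"],
}


def knowledge_state(name, reported):
    """Classify how much a record asserts about one decision point."""
    possible = set(ALL_VALUES[name])
    reported = set(reported)
    if not reported or not reported <= possible:
        raise ValueError(f"invalid values for {name}: {sorted(reported)}")
    if len(reported) == 1:
        return "decided"   # exactly one value asserted
    if reported == possible:
        return "unknown"   # every value listed: nothing asserted
    return "partial"       # some values ruled out, more than one remains


print(knowledge_state("Exploitation", ["active"]))
print(knowledge_state("Exploitation", ["poc", "active"]))
print(knowledge_state("Mission Prevalence", ["minimal", "support", "essential"]))
```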

jayjacobs · 2024-05-31T14:31:47Z

I've given this some thought and I cannot open a PR for the work required. I just do not have the time, and you do not want me working on javascript.

But for future reference, an array of objects is generally treated as a collection of separate objects (think of unique rows in a table). This is the only JSON I've come across (so far) where an array is used to order a single object, meaning each object in the array is actually a part of an object defined at the array level. It's just not standard JSON practice.

Also, an array denotes a one-to-many relationship. If you want something like "Exploitation" to have a one-to-many relationship with its values, then the whole thing should be an array regardless of whether one or many values exist, do not store When someone needs to convert JSON data into columnar format, that relationship needs to apply across all records and cannot exist in a per-record definition. This means that even simple records would be an array:

    "options": {
        "Exploitation": [ "active" ],
        "Automatable": [ "no" ],
        "Technical Impact": [ "total" ],
        "Mission Prevalence": [ "Minimal" ],
        "Public Well-being Impact": [ "Material" ],
        "Mission & Well-being": [ "medium" ]
    },
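
A quick sketch of the columnar concern, using made-up abbreviated records: when a field is a string in some records and an array in others, the resulting column has no single type. `column_types` is a hypothetical helper for illustration.

```python
def column_types(records):
    """Collect the Python type names seen for each options key across records."""
    seen = {}
    for record in records:
        for name, value in record["options"].items():
            seen.setdefault(name, set()).add(type(value).__name__)
    return seen


# Per-record choice between string and array: the column type is mixed.
mixed = [
    {"options": {"Exploitation": "active"}},
    {"options": {"Exploitation": ["poc", "active"]}},
]

# Always an array, even for a single value: the column type is uniform.
uniform = [
    {"options": {"Exploitation": ["active"]}},
    {"options": {"Exploitation": ["poc", "active"]}},
]

mixed_types = column_types(mixed)
uniform_types = column_types(uniform)
print(mixed_types)
print(uniform_types)
```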

ahouseholder · 2024-05-31T14:37:32Z

    "options": {
        "Exploitation": [ "active" ],
        "Automatable": [ "no" ],
        "Technical Impact": [ "total" ],
        "Mission Prevalence": [ "Minimal" ],
        "Public Well-being Impact": [ "Material" ],
        "Mission & Well-being": [ "medium" ]
    },

I think this is consistent with what I intended in my comment above -- that the values would always be an array, even if there is only a single element.

However, due to the nearly coincident timing of our respective comments, I'm not sure whether you were responding to mine or whether we were both replying nearly simultaneously.

jayjacobs · 2024-05-31T15:15:13Z

Yes sorry, my last comment was for @sei-vsarvepalli and your comment came in right before I posted. I agree @ahouseholder, your comment and mine seem to be very much in line.

j--- · 2024-05-31T15:22:15Z

The decision points don't need to be an ordered list; the order of the decision points in a tree is not material to the output. There are some display choices we make about ordering points with a tree's display, but that should be something that is ensured by the display tool, not the JSON format.

I agree the values should always be an array for each decision point, just to keep us aligned with the ability to relay partial information (though we could determine that use case is overtaken by events and no one wants to do it).

Thanks Jay and Allen for converging on a solution.

ahouseholder · 2024-06-25T15:14:48Z

Capturing some additional thoughts on a way forward based on internal discussions:

In the creation of the python versions of the decision points, we created Decision_Point.schema.json and Decision_Point_Group.schema.json which we use in unit tests (test_schema.py) to validate the json output of the pythonized object generators.
The javascript calculator relies on SSVC_Computed.schema.json and SSVC_Provision.schema.json
We need to hybridize the SSVC_*.schema.json schemas with the Decision_Point*.schema.json schemas.
Specifically, that means that we need these things to be modularized, which probably looks something like the following dependency graph

flowchart TD
SSVC_Computed --> Decision_Point_Group
SSVC_Provision --> Decision_Point_Group
Decision_Point_Group --> Decision_Point

We'd want any changes to also include the array changes already described in this thread as well.

sei-vsarvepalli · 2024-06-25T19:12:49Z

@jayjacobs

I have a branch that tries to resolve this issue with a consistent schema that can be used for Decision Points and Decision Point Groups.

https://github.com/sei-vsarvepalli/SSVC/tree/feature/issue_576

Specifically, take a look at https://github.com/sei-vsarvepalli/SSVC/blob/feature/issue_576/data/schema_examples/Computed-CVE-2014-0751-Coordinator.json to see if that comes close to what would be a reliable way to parse and represent SSVC data via ADP. This requires a bit of coordination with CISA developers to ensure the new schema also gets adopted.

sei-vsarvepalli · 2024-06-25T20:49:48Z

Capturing some additional thoughts on a way forward based on internal discussions:

In the creation of the python versions of the decision points, we created Decision_Point.schema.json and Decision_Point_Group.schema.json which we use in unit tests (test_schema.py) to validate the json output of the pythonized object generators.

The javascript calculator relies on SSVC_Computed.schema.json and SSVC_Provision.schema.json

We need to hybridize the SSVC_*.schema.json schemas with the Decision_Point*.schema.json schemas.

Specifically, that means that we need these things to be modularized, which probably looks something like the following dependency graph
flowchart TD
SSVC_Computed --> Decision_Point_Group
SSVC_Provision --> Decision_Point_Group
Decision_Point_Group --> Decision_Point
We'd want any changes to also include the array changes already described in this thread as well.

This part is also resolved in the recent PR #588

jayjacobs added the enhancement New feature or request label May 30, 2024

todb-cisa mentioned this issue Jun 5, 2024

The JSON for "other" (SSVC) "options" is not succinct json cisagov/vulnrichment#40

Open

ahouseholder mentioned this issue Jun 25, 2024

Make schema available via data/ folder for certcc.github.io #586

Merged

sei-vsarvepalli added a commit to sei-vsarvepalli/SSVC that referenced this issue Jun 25, 2024

Updated Computed schema as feedback from Jay Jacobs CERTCC#576

58bd445

sei-vsarvepalli added a commit to sei-vsarvepalli/SSVC that referenced this issue Jun 25, 2024

Updated Computed schema as feedback from Jay Jacobs CERTCC#576

a28f329

sei-vsarvepalli added a commit to sei-vsarvepalli/SSVC that referenced this issue Jun 25, 2024

Updated Computed schema as feedback from Jay Jacobs CERTCC#576

7c90fba

sei-vsarvepalli added a commit to sei-vsarvepalli/SSVC that referenced this issue Jun 25, 2024

Updated Computed schema as feedback from Jay Jacobs CERTCC#576

222b7d4

sei-vsarvepalli linked a pull request Jun 25, 2024 that will close this issue

Feature/issue 576 #588

Open

This was referenced Jun 28, 2024

Consider adding a json schema to represent a policy #591

Open

Need json schema for outcome groups #589

Open
