Show prevalence of rules in the output #1737

Open
wants to merge 50 commits into
base: master
50 commits
7603f85
Entropy Methods
Aayush-Goel-04 Jul 29, 2023
f5b38d5
Merge branch 'mandiant:master' into Aayush-Goel-04/Issue#520
Aayush-Goel-04 Aug 2, 2023
bf1f59b
Sort rules in render based on match probability
Aayush-Goel-04 Aug 5, 2023
31bd6b3
Rendering rules into two sections. * for interesting rules.
Aayush-Goel-04 Aug 6, 2023
9ca4f9d
Merge branch 'mandiant:master' into Aayush-Goel-04/Issue#520
Aayush-Goel-04 Aug 7, 2023
78877f2
Merge branch 'mandiant:master' into Aayush-Goel-04/Issue#520
Aayush-Goel-04 Aug 13, 2023
f5f3e87
update
Aayush-Goel-04 Aug 13, 2023
a6797de
Merge branch 'mandiant:master' into Aayush-Goel-04/Issue#520
Aayush-Goel-04 Aug 19, 2023
0b5a326
Update default.py
Aayush-Goel-04 Aug 19, 2023
def2d98
Merge branch 'Aayush-Goel-04/Issue#520' of https://github.com/Aayush-…
Aayush-Goel-04 Aug 19, 2023
039fdbd
Update utils.py
Aayush-Goel-04 Aug 19, 2023
8a0e61b
Merge branch 'master' into Aayush-Goel-04/Issue#520
Aayush-Goel-04 Aug 19, 2023
f6058b1
Update default.py
Aayush-Goel-04 Aug 19, 2023
dc399c3
Merge branch 'master' into Aayush-Goel-04/Issue#520
Aayush-Goel-04 Aug 27, 2023
c5302cd
prevalence db update
Aayush-Goel-04 Aug 27, 2023
430bde6
Update default.py
Aayush-Goel-04 Aug 27, 2023
7f1566d
Update capa/render/default.py
Aayush-Goel-04 Aug 28, 2023
24541b6
Merge branch 'mandiant:master' into Aayush-Goel-04/Issue#520
Aayush-Goel-04 Sep 6, 2023
6787555
updated default render
Aayush-Goel-04 Sep 6, 2023
7c84926
Update utils.py
Aayush-Goel-04 Sep 6, 2023
c1f9e72
Revert "Update utils.py"
Aayush-Goel-04 Sep 6, 2023
7d6ec15
Merge branch 'mandiant:master' into Aayush-Goel-04/Issue#520
Aayush-Goel-04 Oct 9, 2023
5c1464c
Merge branch 'mandiant:master' into Aayush-Goel-04/Issue#520
Aayush-Goel-04 Oct 10, 2023
8ede526
Resolving path issues
Aayush-Goel-04 Oct 10, 2023
4476b2c
Update utils.py
Aayush-Goel-04 Oct 10, 2023
6077e99
Update utils.py
Aayush-Goel-04 Oct 10, 2023
bc0d129
Update pyinstaller.spec
Aayush-Goel-04 Oct 16, 2023
12dea73
Merge branch 'mandiant:master' into Aayush-Goel-04/Issue#520
Aayush-Goel-04 Oct 16, 2023
3bce5a9
Merge branch 'mandiant:master' into Aayush-Goel-04/Issue#520
Aayush-Goel-04 Oct 17, 2023
5a0a3a5
Update CHANGELOG.md
Aayush-Goel-04 Oct 20, 2023
e4bb521
Merge branch 'master' into Aayush-Goel-04/Issue#520
Aayush-Goel-04 Oct 20, 2023
fe4af5c
render output with prevalence for (v) verbose
Aayush-Goel-04 Oct 20, 2023
95bdf5d
Update utils.py
Aayush-Goel-04 Oct 20, 2023
af57da8
Update RuleMetaData with Prevalence
Aayush-Goel-04 Nov 12, 2023
8057a73
Apply suggestions from code review
Aayush-Goel-04 Nov 12, 2023
5102ca1
Imports, Paths, Comments & Exceptions handled
Aayush-Goel-04 Nov 16, 2023
07553a6
Update result_document.py
Aayush-Goel-04 Nov 16, 2023
2c4931d
Update result_document.py
Aayush-Goel-04 Nov 20, 2023
c531a15
Merge branch 'master' into Aayush-Goel-04/Issue#520
Aayush-Goel-04 Feb 3, 2024
61e7459
Added prevalence to verbose
Aayush-Goel-04 Feb 3, 2024
66d0ab7
linter checks
Aayush-Goel-04 Feb 3, 2024
e3ca32b
Revert "linter checks"
Aayush-Goel-04 Feb 3, 2024
f084040
Update result_document.py
Aayush-Goel-04 Feb 3, 2024
b07d600
Merge branch 'master' into Aayush-Goel-04/Issue#520
Aayush-Goel-04 Feb 5, 2024
10d2140
Convert database to python files
Aayush-Goel-04 Feb 5, 2024
9bebffc
Lint checks
Aayush-Goel-04 Feb 5, 2024
fa89f44
Delete rules_prevalence.json.gz
Aayush-Goel-04 Feb 25, 2024
d93f135
Merge branch 'mandiant:master' into Aayush-Goel-04/Issue#520
Aayush-Goel-04 Feb 25, 2024
08ea4a9
Merge branch 'mandiant:master' into Aayush-Goel-04/Issue#520
Aayush-Goel-04 Mar 6, 2024
7992b1b
Merge branch 'master' into Aayush-Goel-04/Issue#520
mr-tz Mar 13, 2024
55 changes: 45 additions & 10 deletions capa/render/default.py
@@ -7,6 +7,7 @@
 # See the License for the specific language governing permissions and limitations under the License.

 import collections
+from typing import Dict

 import tabulate

@@ -73,19 +74,30 @@ def rec(match: rd.Match):

 def render_capabilities(doc: rd.ResultDocument, ostream: StringIO):
     """
+    render capabilities sorted by:
+    - prevalence (rare to unknown)
+    - namespace (alphabetical)
+
     example::

-        +-------------------------------------------------------+-------------------------------------------------+
-        | CAPABILITY                                            | NAMESPACE                                       |
-        |-------------------------------------------------------+-------------------------------------------------|
-        | check for OutputDebugString error (2 matches)         | anti-analysis/anti-debugging/debugger-detection |
-        | read and send data from client to server              | c2/file-transfer                                |
-        | ...                                                   | ...                                             |
-        +-------------------------------------------------------+-------------------------------------------------+
+        +-------------------------------------------------------+-------------------------------------------------+------------+
+        | CAPABILITY                                            | NAMESPACE                                       | PREVALENCE |
+        |-------------------------------------------------------+-------------------------------------------------|------------|
+        | check for OutputDebugString error (2 matches)         | anti-analysis/anti-debugging/debugger-detection | rare       |
+        | ...                                                   | ...                                             | ...        |
+        |-------------------------------------------------------|-------------------------------------------------|------------|
+        | read and send data from client to server              | c2/file-transfer                                | common     |
+        | ...                                                   | ...                                             | ...        |
+        +-------------------------------------------------------+-------------------------------------------------+------------+
     """
     subrule_matches = find_subrule_matches(doc)

-    rows = []
+    # separate rules based on their prevalence
+    common: Dict[str, str] = {"capability": "", "namespace": "", "prevalence": ""}
+    had_common = False
+    rare: Dict[str, str] = {"capability": "", "namespace": "", "prevalence": ""}
+    had_rare = False
+
     for rule in rutils.capability_rules(doc):
         if rule.meta.name in subrule_matches:
             # rules that are also matched by other rules should not get rendered by default.
@@ -98,11 +110,34 @@ def render_capabilities(doc: rd.ResultDocument, ostream: StringIO):
             capability = rutils.bold(rule.meta.name)
         else:
             capability = f"{rutils.bold(rule.meta.name)} ({count} matches)"
-        rows.append((capability, rule.meta.namespace))

+        namespace = rule.meta.namespace if rule.meta.namespace is not None else ""
+        prevalence = rutils.bold(rule.meta.prevalence) if rule.meta.prevalence != "unknown" else "unknown"
+
+        if "rare" in prevalence:
+            rare["capability"] += capability + "\n"
+            rare["namespace"] += namespace + "\n"
+            rare["prevalence"] += prevalence + "\n"
+            had_rare = True
+        else:
+            common["capability"] += capability + "\n"
+            common["namespace"] += namespace + "\n"
+            common["prevalence"] += prevalence + "\n"
+            had_common = True
+
+    rows = []
+    if had_rare:
+        rows.append((rare["capability"], rare["namespace"], rare["prevalence"]))
+    if had_common:
+        rows.append((common["capability"], common["namespace"], common["prevalence"]))
+
     if rows:
         ostream.write(
-            tabulate.tabulate(rows, headers=[width("Capability", 50), width("Namespace", 50)], tablefmt="mixed_outline")
+            tabulate.tabulate(
+                rows,
+                headers=[width("Capability", 50), width("Namespace", 50), width("Prevalence", 10)],
+                tablefmt="mixed_grid",
+            )
         )
         ostream.write("\n")
     else:
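
For readers following the diff, the two-section grouping can be sketched on its own, without capa or tabulate. This is only an illustration under stated assumptions: the `Rule` tuple and `group_by_prevalence` are hypothetical names, the namespace sort is taken from the docstring rather than from the diff (which relies on the incoming rule order), and cells are joined with newlines instead of appending a trailing "\n" per entry as the PR does.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for a matched rule: (name, namespace, prevalence).
Rule = Tuple[str, Optional[str], str]


def group_by_prevalence(rules: List[Rule]) -> List[Tuple[str, ...]]:
    """Return at most two table rows: rare rules first, then everything else."""
    rare: Dict[str, List[str]] = {"capability": [], "namespace": [], "prevalence": []}
    other: Dict[str, List[str]] = {"capability": [], "namespace": [], "prevalence": []}

    # Assumption: sort by namespace so each section is alphabetical, as the docstring states.
    for name, namespace, prevalence in sorted(rules, key=lambda r: r[1] or ""):
        bucket = rare if prevalence == "rare" else other
        bucket["capability"].append(name)
        bucket["namespace"].append(namespace or "")
        bucket["prevalence"].append(prevalence)

    rows: List[Tuple[str, ...]] = []
    for bucket in (rare, other):
        if bucket["capability"]:
            # One multi-line cell per column, so a grid table draws a single
            # separator between the rare section and the remaining rules.
            rows.append(tuple("\n".join(bucket[key]) for key in ("capability", "namespace", "prevalence")))
    return rows


rows = group_by_prevalence(
    [
        ("read and send data from client to server", "c2/file-transfer", "common"),
        ("check for OutputDebugString error", "anti-analysis/anti-debugging/debugger-detection", "rare"),
        ("example rule without namespace", None, "unknown"),
    ]
)
```

Each returned row holds one newline-joined cell per column; passing such rows to a grid-style table renderer is what produces the rare and common sections shown in the docstring.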
27 changes: 27 additions & 0 deletions capa/render/result_document.py
@@ -10,6 +10,7 @@
 from enum import Enum
 from typing import Dict, List, Tuple, Union, Literal, Optional
 from pathlib import Path
+from functools import lru_cache

 from pydantic import Field, BaseModel, ConfigDict
 from typing_extensions import TypeAlias
@@ -19,6 +20,7 @@
 import capa.features.common
 import capa.features.freeze as frz
 import capa.features.address
+import capa.render.rules_prevalence
 import capa.features.freeze.features as frzf
 from capa.rules import RuleSet
 from capa.engine import MatchResults
@@ -569,9 +571,33 @@ class MaecMetadata(FrozenModel):
     model_config = ConfigDict(frozen=True, populate_by_name=True)


+@lru_cache(maxsize=None)
+def load_rules_prevalence() -> Dict[str, str]:
+    """
+    Load and return a dictionary containing prevalence information for rules defined in capa.
+
+    Returns:

Contributor: suggested change: replace "Returns:" with "Return:".

+        Dict[str, str]: A dictionary where keys are rule names, and values are prevalence levels.
+
+    Example:
+        {
+            "capture screenshot": "rare",
+            "send data": "common",
+            "receive and write data from server to client": "common",
+            "resolve DNS": "common",
+            "reference HTTP User-Agent string": "rare"
+        }
+
+    Note:
+        Prevalence levels can be one of the following: "common", "rare"
+    """
+    return capa.render.rules_prevalence.RULES_PREVALENCE
+
+
 class RuleMetadata(FrozenModel):
     name: str
     namespace: Optional[str] = None
+    prevalence: str = "unknown"
     authors: Tuple[str, ...]
     scopes: capa.rules.Scopes
     attack: Tuple[AttackSpec, ...] = Field(alias="att&ck")
@@ -589,6 +615,7 @@ def from_capa(cls, rule: capa.rules.Rule) -> "RuleMetadata":
         return cls(
             name=rule.meta.get("name"),
             namespace=rule.meta.get("namespace"),
+            prevalence=load_rules_prevalence().get(rule.meta.get("name"), "unknown"),
Collaborator:

Is the rule prevalence database distributed with capa the library? I think it's important that people be able to use capa the library without maintaining this database, so perhaps we want to handle the case of the database not existing here?

Contributor Author:

If the database is not present, all rule matches will have prevalence "unknown" in the results.

Collaborator:

Maybe we can provide a warning if no db is found (in case that's not already there), pointing to one and briefly explaining what it does.

             authors=rule.meta.get("authors"),
             scopes=capa.rules.Scopes.from_dict(rule.meta.get("scopes")),
             attack=tuple(map(AttackSpec.from_str, rule.meta.get("att&ck", []))),
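
One possible way to address the review thread above (a missing database plus a warning) is sketched below. It is not what the PR implements: the `module_name` parameter and the warning text are assumptions made for illustration, and the sketch only relies on `importlib.import_module` raising `ImportError` when the module cannot be found.

```python
import importlib
import logging
from functools import lru_cache
from typing import Dict

logger = logging.getLogger(__name__)


@lru_cache(maxsize=None)
def load_rules_prevalence(module_name: str = "capa.render.rules_prevalence") -> Dict[str, str]:
    """Load the prevalence table, or return an empty dict if it is not installed."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Hypothetical warning text; the database only adds a prevalence label
        # to rendered results, so analysis can continue without it.
        logger.warning(
            "rule prevalence database %s not found; rule matches will be reported with prevalence 'unknown'",
            module_name,
        )
        return {}
    # Tolerate a module that exists but lacks the expected attribute.
    return getattr(module, "RULES_PREVALENCE", {})


# With no database available, every lookup falls back to "unknown".
missing = load_rules_prevalence("hypothetical_missing_prevalence_db")
label = missing.get("send data", "unknown")
```

Since the loader is cached per argument, the warning is emitted once per process for a given module name rather than once per rule.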