Make StreamAlert Normalization Searchable #1180

Open
Ryxias opened this issue Mar 5, 2020 · 0 comments

Ryxias commented Mar 5, 2020

Background

StreamAlert Normalization is a feature that makes it easier to write rules that are agnostic of the vendors and formats that generate log data. It allows @rules to be written against data types and a normalized vocabulary, instead of requiring one rule for each vendor log format. It also removes the need to be intimately familiar with each log format.

However, the normalized data are not easily searchable in Athena. In fact, it is virtually impossible to search for a particular IoC across all data schemas in the entire system.

Being able to do such a search would be greatly beneficial for Investigations. StreamQuery (#1116) could greatly benefit from it as well.

Description

Consider the opening of this issue to be a public teaser for Normalization v2.0!

As a second step in StreamAlert normalization, we will extract all normalized data and drop them into Athena in a searchable format.

To do so, we'll leverage Kinesis Firehose Data Transformation.
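A minimal sketch of what such a transformation Lambda might look like. The `records` / `recordId` / `result` / `data` envelope is the standard Firehose data-transformation contract; everything else is an assumption made for illustration and not StreamAlert's actual implementation. In particular, the `normalized`, `source_type` and `record_id` keys on the incoming record, and the `extract_artifacts` helper, are hypothetical.

```python
import base64
import json


def extract_artifacts(record):
    """Flatten one record into artifact rows (hypothetical record shape)."""
    return [
        {
            "type": item["type"],
            "value": item["value"],
            "function": item.get("function"),  # optional in the schema
            "source_type": record["source_type"],
            "record_id": record["record_id"],
        }
        for item in record.get("normalized", [])
    ]


def handler(event, context):
    """Firehose transformation: one input record in, one output record out."""
    output = []
    for fh_record in event["records"]:
        record_id = fh_record["recordId"]
        try:
            record = json.loads(base64.b64decode(fh_record["data"]))
            artifacts = extract_artifacts(record)
        except (ValueError, KeyError, TypeError, AttributeError):
            # Undecodable or unexpectedly shaped input: report it as failed.
            output.append({"recordId": record_id,
                           "result": "ProcessingFailed",
                           "data": fh_record["data"]})
            continue

        if not artifacts:
            # Nothing normalized on this record, so nothing to index.
            output.append({"recordId": record_id,
                           "result": "Dropped",
                           "data": fh_record["data"]})
            continue

        # Emit one JSON object per line so each artifact becomes one row.
        payload = "".join(json.dumps(a) + "\n" for a in artifacts)
        output.append({"recordId": record_id,
                       "result": "Ok",
                       "data": base64.b64encode(payload.encode("utf-8")).decode("utf-8")})
    return {"records": output}
```

Because Firehose expects exactly one output record per input `recordId`, the sketch packs all artifacts of a record into a single newline-delimited payload rather than emitting several output records.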

Data Schema

| dt | type | value | function | source_type | record_id |
| --- | --- | --- | --- | --- | --- |
| 2020-03-05-03 | ip_address | 50.50.50.50 | connection_destination | osquery_differential | 26ee9dec-ba5a-48b4-99f5-5118ab74507e |
| 2020-03-05-03 | ip_address | 64.64.64.64 | connection_source | osquery_differential | 8d385e22-6ceb-4ef0-ac2f-5bfaf15d1a13 |
| 2020-03-05-03 | hostname | www.badwebsite.com | dns_lookup | infoblox_dns | 187ed024-f986-4484-86d7-0faaf703fcc1 |
| 2020-03-05-03 | ip_address | 50.50.50.50 | dns_lookup_result_ip | infoblox_dns | 187ed024-f986-4484-86d7-0faaf703fcc1 |
| 2020-03-05-03 | computer_name | ryxias.macbook.pro | dns_lookup_requestor | infoblox_dns | 187ed024-f986-4484-86d7-0faaf703fcc1 |

  • dt: The Firehose-generated dt partition (for example, 2020-03-05-03).
  • type: The normalized type.
  • value: The value of the normalized type.
  • function: A short, optional string that summarizes how the field is used in the original record.
  • source_type: The name of the table that holds the original record.
  • record_id: A globally unique ID assigned to the original record. All Artifacts generated from that record share the same record_id.
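
As a rough illustration of the row shape and of how Artifacts from one record share a `record_id`, here is a small sketch. The `Artifact` class and `artifacts_for_record` helper are hypothetical names, and generating the ID with `uuid4` is an assumption based only on the look of the example values above. The `dt` column is left out because it comes from the Firehose partition rather than from the row itself.

```python
import uuid
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Artifact:
    """One searchable row: a single normalized value from one record."""
    type: str
    value: str
    source_type: str
    record_id: str
    function: Optional[str] = None  # optional, per the schema


def artifacts_for_record(
    source_type: str,
    normalized: Iterable[Tuple[str, str, Optional[str]]],
) -> List[Artifact]:
    """Build Artifacts for one record; all of them share one record_id."""
    record_id = str(uuid.uuid4())  # assumption: a random UUID per record
    return [
        Artifact(type=t, value=v, function=f,
                 source_type=source_type, record_id=record_id)
        for t, v, f in normalized
    ]


rows = artifacts_for_record("infoblox_dns", [
    ("hostname", "www.badwebsite.com", "dns_lookup"),
    ("ip_address", "50.50.50.50", "dns_lookup_result_ip"),
    ("computer_name", "ryxias.macbook.pro", "dns_lookup_requestor"),
])
```

The shared `record_id` is what lets a search hit on one value (say, an IP address) lead back to every other normalized value from the same original record.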

User Journey

The above table actually illustrates an interesting situation that is not currently easy for StreamAlert to detect.

Suppose a threat, www.badwebsite.com, constantly changes its IP address in order to evade detection. In the above example, we use osquery to monitor network traffic and are able to detect that a specific machine made a connection to an IP address, 50.50.50.50, which we may not yet have identified as a threat.

However, a neighboring record from Infoblox shows a recent DNS lookup in which 50.50.50.50 was associated with the known bad domain: www.badwebsite.com. OooOOoOoOOOoOOOOooOOO!

We infer that the computer, ryxias.macbook.pro, made an outbound connection request to what may have been a known bad domain, which means it might be pwned.
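
The correlation in this journey can be sketched in plain Python over the example rows from the Data Schema table (with `dt` omitted). This is only an in-memory illustration of the lookup logic; the `hostnames_for_ip` helper is a hypothetical name, and in practice the same question would be asked of the Athena table.

```python
from collections import defaultdict

# (type, value, function, source_type, record_id), copied from the table.
ARTIFACTS = [
    ("ip_address", "50.50.50.50", "connection_destination",
     "osquery_differential", "26ee9dec-ba5a-48b4-99f5-5118ab74507e"),
    ("ip_address", "64.64.64.64", "connection_source",
     "osquery_differential", "8d385e22-6ceb-4ef0-ac2f-5bfaf15d1a13"),
    ("hostname", "www.badwebsite.com", "dns_lookup",
     "infoblox_dns", "187ed024-f986-4484-86d7-0faaf703fcc1"),
    ("ip_address", "50.50.50.50", "dns_lookup_result_ip",
     "infoblox_dns", "187ed024-f986-4484-86d7-0faaf703fcc1"),
    ("computer_name", "ryxias.macbook.pro", "dns_lookup_requestor",
     "infoblox_dns", "187ed024-f986-4484-86d7-0faaf703fcc1"),
]


def hostnames_for_ip(artifacts, ip):
    """For each DNS record that resolved to `ip`, return the hostname
    looked up and the requestor, keyed by record_id."""
    # Step 1: find the records in which `ip` was a DNS lookup result.
    record_ids = {
        rid for typ, val, func, _src, rid in artifacts
        if typ == "ip_address" and val == ip and func == "dns_lookup_result_ip"
    }
    # Step 2: pull the sibling artifacts that share those record_ids.
    hits = defaultdict(dict)
    for _typ, val, func, _src, rid in artifacts:
        if rid in record_ids and func in ("dns_lookup", "dns_lookup_requestor"):
            hits[rid][func] = val
    return dict(hits)


result = hostnames_for_ip(ARTIFACTS, "50.50.50.50")
```

With the rows above, `result` holds a single entry for record `187ed024-f986-4484-86d7-0faaf703fcc1`, mapping `dns_lookup` to `www.badwebsite.com` and `dns_lookup_requestor` to `ryxias.macbook.pro`.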
