[Integration][AWS] Add new integration #555

Merged
shalev007 merged 142 commits into main from PORT-7056-aws-exporter-code-in-ocean on May 30, 2024

Conversation

shalev007
Contributor

@shalev007 shalev007 commented Apr 15, 2024

Description

Ocean AWS Integration 🥇
Basically the same exporter we have today, except that it:

  1. is OSS
  2. supports multiple accounts
  3. utilises the Ocean framework

What - AWS Integration
Why - Better, faster, more accounts, fewer installations
How -

Type of change

Please leave one option from the following and delete the rest:

  • New Integration (non-breaking change which adds a new integration)

Screenshots

image

API Documentation

Provide links to the API documentation used for this integration. WIP

Collaborator

@yairsimantov20 yairsimantov20 left a comment

Shallow comments

integrations/aws/CHANGELOG.md (resolved)
integrations/aws/Dockerfile (outdated, resolved)
integrations/aws/config.yaml (outdated, resolved)
integrations/aws/config.yaml (outdated, resolved)
integrations/aws/overrides.py (outdated, resolved)
integrations/aws/pyproject.toml (outdated, resolved)
integrations/aws/pyproject.toml (outdated, resolved)
integrations/aws/pyproject.toml (outdated, resolved)
integrations/aws/utils.py (outdated, resolved)
@shalev007 shalev007 force-pushed the PORT-7056-aws-exporter-code-in-ocean branch from dc1024e to da49c5e Compare April 16, 2024 15:56
integrations/aws/.port/resources/port-app-config.yml (outdated, resolved)
integrations/aws/.port/resources/port-app-config.yml (outdated, resolved)
integrations/aws/.port/spec.yaml (outdated, resolved)
integrations/aws/.port/spec.yaml (outdated, resolved)
integrations/aws/.port/spec.yaml (outdated, resolved)
integrations/aws/.port/spec.yaml (outdated, resolved)
integrations/aws/.port/resources/blueprints.json (outdated, resolved)
integrations/aws/overrides.py (outdated, resolved)
integrations/aws/utils.py (outdated, resolved)
integrations/aws/utils.py (outdated, resolved)
integrations/aws/utils.py (outdated, resolved)
integrations/aws/main.py (outdated, resolved)
integrations/aws/main.py (outdated, resolved)
integrations/aws/main.py (outdated, resolved)
integrations/aws/main.py (outdated, resolved)
integrations/aws/utils.py (outdated, resolved)
@shalev007 shalev007 force-pushed the PORT-7056-aws-exporter-code-in-ocean branch 2 times, most recently from 9ceb5c6 to 973f6ac Compare April 18, 2024 16:43
Collaborator

@yairsimantov20 yairsimantov20 left a comment

Try to make the code simpler.

integrations/aws/.port/spec.yaml (outdated, resolved)
integrations/aws/config.yaml (outdated, resolved)
integrations/aws/main.py (outdated, resolved)
integrations/aws/main.py (outdated, resolved)
integrations/aws/main.py (outdated, resolved)
integrations/aws/main.py (resolved)
integrations/aws/utils.py (outdated, resolved)
    self.enabled_regions = []

async def updateEnabledRegions(self):
    session = aioboto3.Session(self.access_key_id, self.secret_access_key, self.session_token)
Collaborator

I think it should be able to detect those on its own.

Contributor Author

By default the session uses its default region, but I would like to scan all of the regions available for each account.
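
To make the intent concrete, here is a minimal sketch of picking the regions to scan from a region listing. It assumes a response shaped like the one EC2 `describe_regions` returns (a `Regions` list whose items carry `RegionName` and `OptInStatus`). The helper name `enabled_region_names` and the sample data are hypothetical and are not taken from this PR.

```python
from typing import Any

# OptInStatus values that mean the account can use the region.
_USABLE_STATUSES = {"opt-in-not-required", "opted-in"}


def enabled_region_names(describe_regions_response: dict[str, Any]) -> list[str]:
    """Return the names of the regions the account can actually use."""
    return [
        region["RegionName"]
        for region in describe_regions_response.get("Regions", [])
        if region.get("OptInStatus") in _USABLE_STATUSES
    ]


# Hypothetical sample response, trimmed to the fields the helper reads.
sample_response = {
    "Regions": [
        {"RegionName": "us-east-1", "OptInStatus": "opt-in-not-required"},
        {"RegionName": "af-south-1", "OptInStatus": "opted-in"},
        {"RegionName": "me-south-1", "OptInStatus": "not-opted-in"},
    ]
}
regions_to_scan = enabled_region_names(sample_response)
```

Filtering on the opt-in status keeps the scan from attempting regions the account has not enabled.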

Comment on lines 45 to 52
def isRole(self):
    return self.session_token is not None

async def createSession(self, region: str) -> aioboto3.Session:
    if self.isRole():
        return aioboto3.Session(self.access_key_id, self.secret_access_key, self.session_token, region)
    else:
        return aioboto3.Session(aws_access_key_id=self.access_key_id, aws_secret_access_key=self.secret_access_key, region_name=region)
Collaborator

Why don't you work only with the role?

Contributor Author

I left it as an option so the integration can run outside of AWS. I initially developed it to use user credentials, and it supports both ways.

Collaborator

Can't you log in with the CLI?

Contributor Author

I could, but an on-prem deployment of this integration might not have a CLI available.

    else:
        return aioboto3.Session(aws_access_key_id=self.access_key_id, aws_secret_access_key=self.secret_access_key, region_name=region)

async def createSessionForEachRegion(self) -> AsyncIterator[aioboto3.Session]:
Collaborator

why is that async?

Contributor Author

aioboto3 Session returns a coroutine

@shalev007 shalev007 force-pushed the PORT-7056-aws-exporter-code-in-ocean branch 3 times, most recently from 39eded1 to 543b9cf Compare April 24, 2024 09:38
Comment on lines 29 to 31
async def create_session_for_each_region(self) -> AsyncIterator[aioboto3.Session]:
    for region in self.enabled_regions:
        yield self.create_session(region)
Collaborator

Can you please test an account with multiple regions?

Contributor Author

Yes, tested it, why?
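
A sketch of the multi-region iteration under discussion, kept runnable without AWS access. `FakeSession` is a hypothetical stand-in for `aioboto3.Session`, and `sessions_for_regions` and `collect_region_names` are illustrative names, not code from this PR. It shows an async generator yielding one session per enabled region, consumed with `async for`.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator


@dataclass(frozen=True)
class FakeSession:
    """Hypothetical stand-in for aioboto3.Session; it only records the region."""

    region_name: str


async def create_session(region: str) -> FakeSession:
    # Assumption: the real integration would build an aioboto3.Session here.
    return FakeSession(region_name=region)


async def sessions_for_regions(regions: list[str]) -> AsyncIterator[FakeSession]:
    for region in regions:
        # Await the coroutine so callers receive a session, not a coroutine object.
        yield await create_session(region)


async def collect_region_names(regions: list[str]) -> list[str]:
    return [session.region_name async for session in sessions_for_regions(regions)]


scanned = asyncio.run(collect_region_names(["us-east-1", "eu-west-1"]))
```

A multi-region test along these lines would check that every enabled region produces exactly one session, in order.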

integrations/aws/.port/spec.yaml (outdated, resolved)
integrations/aws/.port/spec.yaml (outdated, resolved)
integrations/aws/.port/spec.yaml (outdated, resolved)
integrations/aws/aws_credentials.py (outdated, resolved)
Collaborator

@yairsimantov20 yairsimantov20 left a comment

more comments

Comment on lines 179 to 204
def validate_request(request: Request) -> None:
    """
    Validates the request by checking for the presence of the API key in the request headers.
    """
    api_key = request.headers.get('x-port-aws-ocean-api-key')
    if not api_key:
        raise ValueError("API key not found in request headers")
    if not ocean.integration_config.get("aws_api_key"):
        raise ValueError("API key not found in integration config")
    if api_key != ocean.integration_config.get("aws_api_key"):
        raise ValueError("Invalid API key")
Collaborator

I think there is a better way to validate the request:

https://docs.aws.amazon.com/IAM/latest/UserGuide/create-signed-request.html

Contributor Author

I'm using the API destinations service to send the requests to the Ocean integration, so I have no control over the request. The only auth types available to me were OAuth, user/password, or an API key.

Collaborator

Let's talk about this.
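
For reference while this is discussed: within the API-key constraint described above, one small hardening is a constant-time comparison. The sketch below is not the signed-request approach linked earlier and is not the code that was merged. The header name comes from the snippet under review, while `is_valid_api_key` is a hypothetical helper that takes plain mappings instead of a `Request`.

```python
import hmac
from typing import Mapping, Optional

# Header name taken from the snippet under review.
API_KEY_HEADER = "x-port-aws-ocean-api-key"


def is_valid_api_key(headers: Mapping[str, str], expected_key: Optional[str]) -> bool:
    """Return True only when both keys are present and equal."""
    provided_key = headers.get(API_KEY_HEADER)
    if not provided_key or not expected_key:
        # A missing header or missing configuration is always a failure.
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(provided_key.encode(), expected_key.encode())


accepted = is_valid_api_key({API_KEY_HEADER: "s3cret"}, "s3cret")
rejected = is_valid_api_key({API_KEY_HEADER: "wrong"}, "s3cret")
```

Returning a boolean leaves the choice of HTTP status to the caller.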

integrations/aws/utils.py (outdated, resolved)
resource = await describe_single_resource(resource_type, identifier, account_id, body.get("awsRegion"))
for resource_config in matching_resource_configs:
    blueprint = resource_config.port.entity.mappings.blueprint.strip('"')
    if not resource:  # Resource probably deleted
Collaborator

Can you please show me an example of a delete event?

Contributor Author

An event with an AWS S3 bucket name arrived, but the bucket does not exist on AWS.

Collaborator

Can you show me what it looks like?

Comment on lines 123 to 146
resource_type = body.get("resource_type")
identifier = body.get("identifier")
account_id = body.get("accountId")
Collaborator

is there a reason for any of them to not exist?

Contributor Author

Yes. In AWS, a user can create custom events. All of our exporter's events need to be in the same format for our bot to handle them, so if an event body is not formatted correctly we cannot process it.

Please take a look at how we currently format the events in our docs (any CloudFormation example): https://docs.getport.io/build-your-software-catalog/sync-data-to-catalog/cloud-providers/aws/examples/
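
A minimal sketch of rejecting a malformed event body up front. The field names (`resource_type`, `identifier`, `accountId`, `awsRegion`) are the ones read in the snippets under review. The helper `missing_event_fields` and the sample bodies are hypothetical.

```python
from typing import Any, Mapping

# Fields the webhook handler reads from the event body in the reviewed code.
REQUIRED_EVENT_FIELDS = ("resource_type", "identifier", "accountId", "awsRegion")


def missing_event_fields(body: Mapping[str, Any]) -> list[str]:
    """Return the required fields that are absent or empty in the body."""
    return [field for field in REQUIRED_EVENT_FIELDS if not body.get(field)]


# Hypothetical bodies: one well formed, one custom event in another format.
well_formed = {
    "resource_type": "AWS::S3::Bucket",
    "identifier": "my-bucket",
    "accountId": "123456789012",
    "awsRegion": "eu-west-1",
}
custom_event = {"detail-type": "something else", "accountId": "123456789012"}

ok_missing = missing_event_fields(well_formed)
bad_missing = missing_event_fields(custom_event)
```

Listing the missing fields, rather than failing on the first one, makes the resulting log line more useful when a custom event slips through.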

integrations/aws/main.py (outdated, resolved)
integrations/aws/session_manager.py (outdated, resolved)
integrations/aws/session_manager.py (outdated, resolved)
integrations/aws/session_manager.py (outdated, resolved)
integrations/aws/session_manager.py (outdated, resolved)
Comment on lines 82 to 91
if account['Id'] == self._application_account_id:
    # Skip the current account as it is already added
    # Replace the Temp account details with the current account details
    self._aws_accessible_accounts[0] = account
    continue
Collaborator

Can we talk about this?

Contributor Author

Sure let's talk

@shalev007 shalev007 force-pushed the PORT-7056-aws-exporter-code-in-ocean branch from 543b9cf to 3c52b8d Compare April 24, 2024 14:04
@shalev007 shalev007 force-pushed the PORT-7056-aws-exporter-code-in-ocean branch from 14887c8 to 9f428ac Compare April 25, 2024 17:06

@ocean.on_resync()
async def resync_all(kind: str) -> ASYNC_GENERATOR_RESYNC_TYPE:
    await update_available_access_credentials()
Collaborator

Why are you doing it only here?

Contributor Author

My initial approach was to periodically refresh the credentials. The resync-all method, which runs before any other resync method, seems like a nice compromise; updating on every type of sync seems redundant to me. WDYT?

Collaborator

For now, do it in every resync.
Nothing guarantees it will stay that way.
You can update the credentials once and then record in the event attributes that you have already done so, to prevent further calls.

Collaborator

if "CREDENTIALS_CACHE" in event.attributes:
    return
fetch
event.attributes["CREDENTIALS_CACHE"] = X
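
The pseudocode above can be sketched as a runnable helper. A plain `dict` stands in here for Ocean's `event.attributes`, which is an assumption made so the example runs on its own. `update_credentials_once` and `fake_fetch` are hypothetical names.

```python
import asyncio
from typing import Any, Awaitable, Callable

# Key name taken from the reviewer's pseudocode.
CREDENTIALS_CACHE_KEY = "CREDENTIALS_CACHE"


async def update_credentials_once(
    attributes: dict[str, Any],
    fetch: Callable[[], Awaitable[Any]],
) -> Any:
    """Fetch credentials at most once per attributes mapping (one per event)."""
    if CREDENTIALS_CACHE_KEY in attributes:
        return attributes[CREDENTIALS_CACHE_KEY]
    attributes[CREDENTIALS_CACHE_KEY] = await fetch()
    return attributes[CREDENTIALS_CACHE_KEY]


async def demo() -> int:
    calls = 0

    async def fake_fetch() -> str:
        # Counts invocations so the caching effect is observable.
        nonlocal calls
        calls += 1
        return "credentials"

    attributes: dict[str, Any] = {}  # stand-in for event.attributes
    await update_credentials_once(attributes, fake_fetch)
    await update_credentials_once(attributes, fake_fetch)
    return calls


fetch_count = asyncio.run(demo())
```

With this shape every resync handler can call the helper unconditionally, and only the first call within an event pays for the fetch.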

integrations/aws/pyproject.toml (resolved)
integrations/aws/overrides.py (outdated, resolved)
integrations/aws/main.py (outdated, resolved)
integrations/aws/main.py (outdated, resolved)
Comment on lines 175 to 203
except Exception as e:
    logger.error(f"Failed to process event from aws error: {e}", error=e)
    return {"ok": False}
Collaborator

Not really.
Think about when we run on Ocean SaaS: this can be very helpful for monitoring.

integrations/aws/main.py (outdated, resolved)
blueprint=blueprint,
)
resource.update({KIND_PROPERTY: resource_type, ACCOUNT_ID_PROPERTY: account_id, REGION_PROPERTY: body.get("awsRegion")})
await ocean.register_raw(resource_config.kind, [_fix_unserializable_date_properties(resource)])
Collaborator

This is very problematic: Ocean doesn't let you pass the resource config itself, and therefore you might trigger the same resource parsing multiple times.

Comment on lines 176 to 192
resource.update(
    {
        KIND_PROPERTY: resource_type,
        ACCOUNT_ID_PROPERTY: account_id,
        REGION_PROPERTY: region,
        IDENTIFIER_PROPERTY: identifier,
    }
)
Collaborator

Please remove anything that can already be found in the payload.

Contributor Author

Resources that are not using Cloud Control do not always use Identifier as their identifier key. For example, for EC2 it is listed as InstanceId.
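
One way to picture the problem is a lookup that tries several candidate keys. This is a sketch only: the two key names come from the comment above, while `resolve_identifier` and the sample resources are hypothetical and do not reflect how the integration resolves identifiers.

```python
from typing import Any, Mapping, Optional

# Candidate keys, checked in order. "Identifier" and "InstanceId" come from the
# discussion above; adding keys for other services would be an assumption.
IDENTIFIER_KEYS = ("Identifier", "InstanceId")


def resolve_identifier(resource: Mapping[str, Any]) -> Optional[str]:
    """Return the first non-empty identifier found, or None."""
    for key in IDENTIFIER_KEYS:
        value = resource.get(key)
        if value:
            return str(value)
    return None


# Hypothetical payloads for the two shapes mentioned above.
cloud_control_id = resolve_identifier({"Identifier": "my-bucket"})
ec2_id = resolve_identifier({"InstanceId": "i-0123456789abcdef0"})
```

Storing the resolved value under one well-known property is what lets the mapping stay uniform across kinds.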

response = await getattr(client, describe_method)()
next_token = response.get(marker_param)
for resource in response.get(list_param, []):
    resource.update({KIND_PROPERTY: kind, ACCOUNT_ID_PROPERTY: account_id, REGION_PROPERTY: region, IDENTIFIER_PROPERTY: resource.get('Identifier')})
Collaborator

Let's talk about this. I want to see an example of the data.

integrations/aws/utils.py (outdated, resolved)
if kind == ResourceKindsWithSpecialHandling.ACM:
    async with session.client('acm') as acm:
        response = await acm.describe_certificate(CertificateArn=identifier)
        return response.get('Certificate', {})
Collaborator

? you are still using .get

@shalev007 shalev007 force-pushed the PORT-7056-aws-exporter-code-in-ocean branch 3 times, most recently from 420330f to 334b773 Compare April 30, 2024 14:07
Collaborator

@yairsimantov20 yairsimantov20 left a comment

comments

integrations/aws/overrides.py (outdated, resolved)
Comment on lines 201 to 202
if response.status_code <= 299:
    response.status_code = status.HTTP_500_INTERNAL_SERVER_ERROR
Collaborator

Can you check whether there are status codes that might close the AWS pipeline for the webhooks?

Contributor Author

I'm not sure what you meant by "close the AWS pipeline for the webhooks", but according to the AWS documentation:

Events associated with error codes 409, 429, and 5xx are retried.
Events associated with error codes 1xx, 2xx, 3xx, and 4xx (excluding 429) aren't retried.
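
The retry rule quoted above can be written as a small predicate. This sketch encodes only what the quoted documentation states. The function name `is_retried` is hypothetical.

```python
def is_retried(status_code: int) -> bool:
    """Apply the quoted rule: 409, 429 and 5xx are retried, everything else is not."""
    return status_code in (409, 429) or 500 <= status_code <= 599


# The handler under review returns 500 on failure, which falls in the retried set.
retried_on_failure = is_retried(500)
retried_on_bad_request = is_retried(400)
```

A predicate like this makes it easy to check, in a unit test, that the status the handler returns on failure leads to the intended retry behavior.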

integrations/aws/main.py (outdated, resolved)
integrations/aws/main.py (outdated, resolved)
Comment on lines 193 to 195
await ocean.register_raw(
    resource_config.kind, [fix_unserializable_date_properties(resource)]
)
Collaborator

it is problematic that you might issue the same kind multiple times

Contributor Author

I don't think so, since they have the same identifier. BTW, the Azure integration does this in exactly the same way.

Contributor Author

Added unique to get_matching_kinds_from_config so that no kinds will be duplicated
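
A minimal sketch of what de-duplicating kinds can look like. The actual change to `get_matching_kinds_from_config` is not shown in this conversation, so `unique_kinds` and the sample list are hypothetical.

```python
from typing import Iterable


def unique_kinds(kinds: Iterable[str]) -> list[str]:
    """Drop repeated kinds while keeping first-seen order."""
    # dict preserves insertion order, so fromkeys removes repeats stably.
    return list(dict.fromkeys(kinds))


# Hypothetical case: two resource configs in the mapping share a kind.
matching_kinds = unique_kinds(
    ["AWS::S3::Bucket", "AWS::Lambda::Function", "AWS::S3::Bucket"]
)
```

With unique kinds, each kind is registered once per event even when several resource configs in the mapping share it.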

integrations/aws/aws/utils.py (outdated, resolved)
integrations/aws/aws/utils.py (outdated, resolved)
integrations/aws/aws/utils.py (outdated, resolved)
integrations/aws/aws/aws_credentials.py (outdated, resolved)
@shalev007 shalev007 force-pushed the PORT-7056-aws-exporter-code-in-ocean branch from aa9bd75 to 77c3d9c Compare May 5, 2024 11:27
@Tankilevitch Tankilevitch changed the title PORT 7056 aws exporter code in ocean [Integration][AWS aws exporter code in ocean May 6, 2024
@Tankilevitch Tankilevitch changed the title [Integration][AWS aws exporter code in ocean [Integration][AWS] Add new integration May 6, 2024
@shalev007 shalev007 force-pushed the PORT-7056-aws-exporter-code-in-ocean branch from 0872d18 to db83a0d Compare May 8, 2024 08:05
@shalev007 shalev007 force-pushed the PORT-7056-aws-exporter-code-in-ocean branch from 027e5e0 to f3b1e6f Compare May 29, 2024 16:04
@shalev007 shalev007 merged commit 0205024 into main May 30, 2024
5 checks passed
@shalev007 shalev007 deleted the PORT-7056-aws-exporter-code-in-ocean branch May 30, 2024 08:09

4 participants