[COST-4745] OCPGCP Network data processing SQL #5058

Draft: wants to merge 9 commits into base: main

cgoodfred (Contributor) commented Apr 22, 2024

Jira Ticket

COST-4745

Description

This change adds OCP on GCP network data processing. This PR is not ready for review: I still need to separate out the migrations and get nise updated. The migration here will be needed for all three cloud providers and will be merged into a single migration when it's ready.

NOTE: When GCP renamed Ingress to Data Transfer In, Egress was renamed to Data Transfer, which sometimes carries an "Out" qualifier and sometimes does not. Based on my understanding of this GCP article, Ingress was simply renamed to Data Transfer In, and any other data transfer is Egress/Outbound.
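The naming rule in the note can be sketched in Python. This is only an illustration of the rule as described, not the SQL in this PR: the helper name `data_transfer_direction`, the regular expressions, and the handling of the legacy Ingress/Egress names are all assumptions.

```python
import re
from typing import Optional

# Assumed patterns for illustration; the PR's SQL conditions may differ.
# Word boundaries keep "data transfer in" from matching "data transfer inter ...".
_IN_PATTERN = re.compile(r"\b(ingress|data transfer in)\b")
_OUT_PATTERN = re.compile(r"\b(egress|data transfer)\b")


def data_transfer_direction(sku_description: str) -> Optional[str]:
    """Hypothetical helper: map a SKU description to IN, OUT, or None."""
    desc = sku_description.lower()
    # Ingress was renamed to "Data Transfer In", so check the inbound names first.
    if _IN_PATTERN.search(desc):
        return "IN"
    # Any other data transfer (with or without an "Out" qualifier) is outbound.
    if _OUT_PATTERN.search(desc):
        return "OUT"
    # Not a network SKU under this rule.
    return None
```

The inbound check has to run first because every "Data Transfer In" description also contains "data transfer" and would otherwise be classified as outbound.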

Testing

  1. Using nise > 4.5.3, create GCP compute data that has networking SKUs defined for the same resource ID as an OpenShift node, for example:
---
generators:
  - ComputeEngineGenerator:
      start_date: {{start_date}}
      end_date: {{end_date}}
      price: 2
      sku_id: CF4E-A0C7-E3BF
      usage.amount_in_pricing_units: 1
      usage.pricing_unit: hour
      currency: USD
      instance_type: m2-megamem-416
      location.region: australia-southeast1-a
      resource.name: projects/nise-populator/instances/gcp_compute1
      resource.global_name: //compute.googleapis.com/projects/nise-populator/zones/australia-southeast1-a/instances/3447398860992947181
      labels: [{"environment": "clyde", "app":"winter", "version":"green", "kubernetes-io-cluster-c32se93c-73z3-3s3d-cs23-d3245sj45349": "owned"}]
  - ComputeEngineGenerator:
      start_date: {{start_date}}
      end_date: {{end_date}}
      price: 2
      sku_id: BBF8-C07D-1DF4
      usage.amount_in_pricing_units: 50
      usage.pricing_unit: hour
      currency: USD
      instance_type: m2-megamem-416
      location.region: australia-southeast1-a
      resource.name: projects/nise-populator/instances/gcp_compute1
      resource.global_name: //compute.googleapis.com/projects/nise-populator/zones/australia-southeast1-a/instances/3447398860992947181
      labels: [{"environment": "clyde", "app":"winter", "version":"green", "kubernetes-io-cluster-c32se93c-73z3-3s3d-cs23-d3245sj45349": "owned"}]
  - ComputeEngineGenerator:
      start_date: 2024-05-01
      end_date: 2024-05-31
      price: 30
      sku_id: 9DE9-9092-B3BC
      usage.amount_in_pricing_units: 10
      usage.pricing_unit: hour
      currency: USD
      instance_type: m2-megamem-416
      location.region: australia-southeast1-a
      resource.name: projects/nise-populator/instances/gcp_compute1
      resource.global_name: //compute.googleapis.com/projects/nise-populator/zones/australia-southeast1-a/instances/3447398860992947181
      labels: [{"environment": "clyde", "app":"winter", "version":"green", "kubernetes-io-cluster-c32se93c-73z3-3s3d-cs23-d3245sj45349": "owned"}] 
  2. Create a source and load the OCP data.
  3. Create a source and load the GCP data you just created.
  4. Let summarization run, then check the OCP and OCP on GCP database records and verify that the network records are visible and distinct, with infrastructure_data_in_gigabytes or infrastructure_data_out_gigabytes filled in for each day and each Network unattributed project.
  5. Run a few SQL queries to verify that the costs before and after OCPGCP summary line up.
    docker exec -it trino trino --server localhost:8080 --catalog hive --schema org1234567 --user admin --debug
trino:org1234567> SELECT sum(cost) as cost FROM gcp_openshift_daily;
  cost   
----------
 289440.0 
(1 row)

trino:org1234567> SELECT sum(unblended_cost) as cost FROM reporting_ocpgcpcostlineitem_project_daily_summary;
  cost   
----------
 289440.0 
(1 row)

trino:org1234567> SELECT sum(cost) as cost FROM gcp_openshift_daily WHERE lower(sku_description) LIKE '%data transfer%';
  cost   
----------
 288000.0 
(1 row)

trino:org1234567> SELECT sum(unblended_cost) as cost FROM reporting_ocpgcpcostlineitem_project_daily_summary WHERE data_transfer_direction IS NOT NULL;
   cost   
----------
 288000.0 
(1 row)

trino:org1234567> select sum(unblended_cost) as cost, data_transfer_direction from reporting_ocpgcpcostlineitem_project_daily_summary WHERE data_transfer_direction IS NOT NULL GROUP BY data_transfer_direction;
   cost   | data_transfer_direction 
----------+-------------------------
  72000.0 | IN                      
 216000.0 | OUT                     
(2 rows)

72000 = 50 (usage) * 2 (rate) * 24 hours * 30 days
216000 = 10 (usage) * 30 (rate) * 24 hours * 30 days
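The expected totals can be recomputed from the generator YAML with a short Python check. This is a sketch under stated assumptions: usage is taken from usage.amount_in_pricing_units, rate from price, and the 30-day hourly window from the arithmetic above. Which SKU maps to IN or OUT is inferred from the query output, not from the SKU catalog.

```python
HOURS_PER_DAY = 24
DAYS = 30  # assumed window, matching the arithmetic in this description

# sku_id -> (usage.amount_in_pricing_units, price), copied from the nise YAML
GENERATORS = {
    "CF4E-A0C7-E3BF": (1, 2),    # inferred: non-network compute SKU
    "BBF8-C07D-1DF4": (50, 2),   # inferred: data transfer IN
    "9DE9-9092-B3BC": (10, 30),  # inferred: data transfer OUT
}


def expected_cost(usage: int, price: int) -> int:
    """Cost of one hourly line item repeated over the whole window."""
    return usage * price * HOURS_PER_DAY * DAYS


costs = {sku: expected_cost(u, p) for sku, (u, p) in GENERATORS.items()}
network_total = costs["BBF8-C07D-1DF4"] + costs["9DE9-9092-B3BC"]
grand_total = sum(costs.values())

# These match the Trino results shown in the testing steps.
assert costs["BBF8-C07D-1DF4"] == 72000
assert costs["9DE9-9092-B3BC"] == 216000
assert network_total == 288000
assert grand_total == 289440
```

The remaining 1440 (289440 - 288000) is the first generator's cost, which is why the unfiltered totals are slightly higher than the data transfer totals.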

Release Notes

  • proposed release note
* [COST-####](https://issues.redhat.com/browse/COST-####) Fix some things


codecov bot commented Apr 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.1%. Comparing base (3dcf7df) to head (0b2854f).
Report is 4 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##            main   #5058     +/-   ##
=======================================
- Coverage   94.1%   94.1%   -0.1%     
=======================================
  Files        376     375      -1     
  Lines      31308   31170    -138     
  Branches    3756    3730     -26     
=======================================
- Hits       29467   29321    -146     
- Misses      1173    1180      +7     
- Partials     668     669      +1     
