Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[COST-4741] - Separate AWS node network costs into a Network unattributed namespace #5104

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

samdoran
Copy link
Contributor

@samdoran samdoran commented May 13, 2024

Jira Ticket

COST-4741

Description

Separates out AWS node network costs into a separate project for later distribution.

Testing

  1. Load OCP on AWS test data.
    This is currently broken. You will need to run the second command twice.

    make create-test-customer
    make load-test-customer-data test_source=aws
    
  2. Inspect data in Trino. Where product_family is Data Transfer and product_code is AmazonEC2, there should be a data transfer direction

    SELECT namespace,product_code,product_family,data_transfer_direction,usage_amount FROM reporting_ocpawscostlineitem_project_daily_summary;
    
    namespace                 |  product_code  |   product_family   | data_transfer_direction |     usage_amount
    --------------------------+----------------+--------------------+-------------------------+-----------------------
     cost-management          | AmazonEC2      | Compute Instance   | NULL                    |                   8.0
     Network unattributed     | AmazonEC2      | Data Transfer      | IN                      |   0.05788159130273985
     Network unattributed     | AmazonEC2      | Data Transfer      | IN                      |   0.08385133566183305
     Network unattributed     | AmazonEC2      | Data Transfer      | OUT                     | 4.0166646410467517E-4
     Network unattributed     | AmazonEC2      | Data Transfer      | IN                      |  0.013915647933780239
     Platform unallocated     | AmazonEC2      | Data Transfer      | NULL                    |   0.04199137044014825
     Network unattributed     | AmazonEC2      | Data Transfer      | OUT                     |  0.050977435375557656
    ...
    
  3. Verify there are records for Network unattributed in PostgreSQL AWS daily summary table

    SELECT * FROM reporting_ocpawscostlineitem_project_daily_summary_p WHERE namespace = 'Network unattributed';
    
  4. Verify OpenShift costs are correct

    curl http://localhost:8000/api/cost-management/v1/reports/openshift/costs/
    
  5. Verify AWS filtered by OpenShift costs are correct

    http://localhost:8000/api/cost-management/v1/reports/openshift/infrastructures/aws/costs/
    
  6. Compare Trino and PostgreSQL data to make sure costs add up:

    postgres=# SELECT sum(unblended_cost) from reporting_aws_cost_summary_p
    sum
    -----------------
     23785.362736418
    
    trino:org1234567> SELECT sum(lineitem_unblendedcost) from aws_line_items;
    _col0
    --------------------
     23785.362736451105
    (1 row)
    ```
    
    
    
    
    

Release Notes

  • proposed release note
* [COST-3761](https://issues.redhat.com/browse/COST-3761) Separate AWS EC2 instance network costs into a `Network unattributed` project
  Network costs were previously combined with the AWS EC2 instance costs. Now those costs presented in a new `Network unattributed` project.

@samdoran samdoran added the aws-smoke-tests pr_check will build the image and run aws + ocp on aws smoke tests label May 13, 2024
@samdoran samdoran force-pushed the COST-3761/COST-4741-aws-network-costs branch from f358d79 to 6f1880b Compare May 13, 2024 22:42
@samdoran samdoran changed the title [COST-3761] - Separate AWS node network costs into a Network unattributed project [COST-3761] - Separate AWS node network costs into a Network unattributed namespace May 13, 2024
@samdoran samdoran force-pushed the COST-3761/COST-4741-aws-network-costs branch 2 times, most recently from a5fbc13 to 9f63a4c Compare May 14, 2024 20:10
@samdoran samdoran marked this pull request as ready for review May 14, 2024 20:18
@samdoran samdoran requested review from a team as code owners May 14, 2024 20:18
Copy link

codecov bot commented May 14, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.1%. Comparing base (7f8cb20) to head (2c0ded2).

Additional details and impacted files
@@          Coverage Diff          @@
##            main   #5104   +/-   ##
=====================================
  Coverage   94.1%   94.1%           
=====================================
  Files        375     375           
  Lines      31170   31170           
  Branches    3730    3730           
=====================================
+ Hits       29320   29321    +1     
  Misses      1180    1180           
+ Partials     670     669    -1     

@samdoran samdoran changed the title [COST-3761] - Separate AWS node network costs into a Network unattributed namespace [COST-4741] - Separate AWS node network costs into a Network unattributed namespace May 20, 2024
@samdoran samdoran force-pushed the COST-3761/COST-4741-aws-network-costs branch from bd7c5a0 to f2312dc Compare May 20, 2024 20:42
@samdoran samdoran force-pushed the COST-3761/COST-4741-aws-network-costs branch from f2312dc to 2c0ded2 Compare May 31, 2024 17:04
Copy link
Contributor

@cgoodfred cgoodfred left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few quick questions/comments

Comment on lines +210 to +213
WHEN position('in-bytes' IN lower(max(aws.lineitem_usagetype))) > 0 THEN 'IN'
WHEN position('out-bytes' IN lower(max(aws.lineitem_usagetype))) > 0 THEN 'OUT'
WHEN (position('regional-bytes' IN lower(max(aws.lineitem_usagetype))) > 0 AND position('-in' IN lower(max(lineitem_operation))) > 0) THEN 'IN'
WHEN (position('regional-bytes' IN lower(max(aws.lineitem_usagetype))) > 0 AND position('-out' IN lower(max(lineitem_operation))) > 0) THEN 'OUT'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason to use position over strpos when we've been using strpos everywhere else in TrinoSQL files?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason other than it's easier to read.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

literally everywhere else we use strpos...

WHEN (position('regional-bytes' IN lower(max(aws.lineitem_usagetype))) > 0 AND position('-out' IN lower(max(lineitem_operation))) > 0) THEN 'OUT'
ELSE NULL
END
END AS data_transfer_direction,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This SQL is grouped by these items below, are these enough to ensure a unique data_transfer_direction or should we be grouping by the data_transfer_direction result as well?

GROUP BY aws.lineitem_usagestartdate,
    aws.lineitem_resourceid,
    4, -- CASE satement
    aws.product_productfamily,
    aws.product_instancetype,
    aws.lineitem_availabilityzone,
    aws.product_region,
    aws.resourcetags,
    aws.costcategory

max(cluster_alias),
max(data_source),
'Network unattributed' AS namespace,
max(node),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need the max here, this is grouped by node.

Suggested change
max(node),
node,

Comment on lines +918 to +926
CASE
WHEN upper(data_transfer_direction) = 'IN' THEN usage_amount
ELSE 0
END AS infrastructure_data_in_gigabytes,
CASE
WHEN upper(data_transfer_direction) = 'OUT' THEN usage_amount
ELSE 0
END AS infrastructure_data_out_gigabytes,
data_transfer_direction,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just checking cause I'm not sure, are the units for the AWS data always in GB?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
aws-smoke-tests pr_check will build the image and run aws + ocp on aws smoke tests smokes-required
Projects
None yet
3 participants