From f8c94a08439078cb4e59870a41ef0b02d99c947e Mon Sep 17 00:00:00 2001 From: Daniel Panzella Date: Wed, 5 Jun 2024 10:51:26 -0700 Subject: [PATCH 1/7] docs: Update README.md for public-dns-with-cloud-dns example (#130) * docs: Update README.md for public-dns-with-cloud-dns example * docs: Remove the example license key --- examples/public-dns-with-cloud-dns/README.md | 30 +++++++++++--------- 1 file changed, 16 insertions(+), 14 deletions(-) diff --git a/examples/public-dns-with-cloud-dns/README.md b/examples/public-dns-with-cloud-dns/README.md index 31b89ae..6194051 100644 --- a/examples/public-dns-with-cloud-dns/README.md +++ b/examples/public-dns-with-cloud-dns/README.md @@ -79,24 +79,26 @@ Note: The domain and subdomain association might take a few minutes to reflect. - Now, create a file called `terraform.tfvars` under the current directory - `terraform.tfvars` should have the following variables included, ```markdown - project_id = "" - region = "" - zone = "" - namespace = "" - license = "" - subdomain = "" - domain = "" + project_id = "" + region = "" + zone = "" + namespace = "" + license = "" + subdomain = "" + domain_name = "" + allowed_inbound_cidrs = [""] ``` Refer to sections above to see how you can obtain these values An example `terraform.tfvars` file would look like this, ```markdown - project_id = "playground-111" - region = "us-west4" - zone = "us-west4-a" - namespace = "venky-unique-3" - license = 
"eyJhbGciOiJS6InUzhEUXM1M0xQY09yNnZhaTdoSlduYnF1bTRZTlZWd1VwSWM9In0.eyJjb25jdXJyZW50QWdlbnRzIjoxMCwiZGVwbG95bWVudElkIjoiNGU0YWNiZmYtY2E5NS00MmRiLThmYmItMjliNmY5NTI2OWE0IiwibWF4VXNlcnMiOjQsIm1heFN0b3JhZ2VHYiI6MTAwMDAwMCwibWF4VGVhbXMiOjEsImV4cGlyZXNBdCI6IjIwMjItMTAtMjBUMTY6MjY6NTUuNzA3WiIsImZsYWdzIjpbIlNDQUxBQkxFIiwibXlzcWwiLCJzMyIsInJlZGlzIiwiTk9USUZJQ0FUSU9OUyIsInNsYWNrIiwibm90aWZpY2F0aW9ucyIsIk1BTkFHRU1FTlQiLCJvcmdfZGFzaCIsImF1dGgwIl0sInRyaWFsIjpmYWxzZSwiYWNjZXNzS2V5IjoiNzk3M2FkOWItNThmOC00OTUxLWJhOTctOGQ2NGFkYzI1ZThlIiwic2VhdHMiOjQsInRlYW1zIjoxLCJzdG9yYWdlR2lncyI6MTAwMDAwMCwiZXhwIjoxNjY2MjgzMjE1fQ.O_6D3Av9QoWI16ybg54KFvs7eGWugSXPxmfhobtZe3TBFvd8PwmSCAmMojmKWsqg6KNjLJ9sjxOP_3Pj9OAdrkx5WzU0KTcIByXD2hS9VwyYUOYEohBn65oCLnQJLYphXJBrB9JVS0GSUGxR1AzwnUK1PuKZ6jQFrpt-feQOD3rvCdyM1eBQ73rdHk6zfEBmdiZ7C4LiRLV8OEMxUfwxASvVF_cFUEeVQx82AaxRwfPBLZxXTL4qlQOIFjAKwGVyDMEWq04BhQ_ASdyND45w5qXiUOlvFOergrFyGBSHg-9yDT4fhdkDw5puGthDaMFsn02rr0eYHuxKFWSY958aig" - subdomain = "venky" - domain = "wandb.ml" + project_id = "playground-111" + region = "us-west4" + zone = "us-west4-a" + namespace = "venky-unique-3" + license = "W&B-license-key goes here" + subdomain = "venky" + domain_name = "wandb.ml" + allowed_inbound_cidrs = ["0.0.0.0/0"] ``` ### Initializing the terraform From e8c46022cbb5d9ca76f62ad14bded42279e5dfb6 Mon Sep 17 00:00:00 2001 From: Daniel Panzella Date: Wed, 5 Jun 2024 10:51:45 -0700 Subject: [PATCH 2/7] fix: Need the global redis helm values to not be null even when disabled (#131) --- main.tf | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/main.tf b/main.tf index d26147a..408e9b5 100644 --- a/main.tf +++ b/main.tf @@ -226,7 +226,7 @@ module "wandb" { ttlInSeconds = 604800 caCertPath = "/etc/ssl/certs/redis_ca.pem" } - } : null + } : {} } app = { @@ -242,7 +242,7 @@ module "wandb" { } } - redis = { install = false } + redis = { install = !var.create_redis } mysql = { install = false } weave = { From 4efd6fbb4cb9b9c2898eb827090444dc207d8c68 Mon Sep 17 
00:00:00 2001 From: semantic-release-bot Date: Wed, 5 Jun 2024 17:52:15 +0000 Subject: [PATCH 3/7] chore(release): version 3.0.5 [skip ci] ### [3.0.5](https://github.com/wandb/terraform-google-wandb/compare/v3.0.4...v3.0.5) (2024-06-05) ### Bug Fixes * Need the global redis helm values to not be null even when disabled ([#131](https://github.com/wandb/terraform-google-wandb/issues/131)) ([e8c4602](https://github.com/wandb/terraform-google-wandb/commit/e8c46022cbb5d9ca76f62ad14bded42279e5dfb6)) --- CHANGELOG.md | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index 7bc3bb8..13038ce 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,6 +2,13 @@ All notable changes to this project will be documented in this file. +### [3.0.5](https://github.com/wandb/terraform-google-wandb/compare/v3.0.4...v3.0.5) (2024-06-05) + + +### Bug Fixes + +* Need the global redis helm values to not be null even when disabled ([#131](https://github.com/wandb/terraform-google-wandb/issues/131)) ([e8c4602](https://github.com/wandb/terraform-google-wandb/commit/e8c46022cbb5d9ca76f62ad14bded42279e5dfb6)) + ### [3.0.4](https://github.com/wandb/terraform-google-wandb/compare/v3.0.3...v3.0.4) (2024-04-12) From 1e8777af0f6bf6e8260a0faa488302f631b716b3 Mon Sep 17 00:00:00 2001 From: Aman Pruthi Date: Wed, 5 Jun 2024 23:43:36 +0530 Subject: [PATCH 4/7] feat: added support for stackdriver and otel metrics (#126) * feat: added support for stackdriver and otel metrics * fixed checks * terraform-docs: automated action --------- Co-authored-by: amanpruthi Co-authored-by: github-actions[bot] --- README.md | 11 +++- main.tf | 78 ++++++++++++++++++++++----- modules/app_gke/main.tf | 15 +++++- modules/app_gke/variables.tf | 5 ++ modules/service_accounts/main.tf | 24 +++++++++ modules/service_accounts/outputs.tf | 4 ++ modules/service_accounts/variables.tf | 15 ++++++ variables.tf | 15 ++++++ 8 files changed, 152 insertions(+), 15 deletions(-) diff --git a/README.md b/README.md 
index 768f473..3e2c602 100644 --- a/README.md +++ b/README.md @@ -67,7 +67,9 @@ resources that lack official modules. ## Providers -No providers. +| Name | Version | +|------|---------| +| [google](#provider\_google) | ~> 4.82 | ## Modules @@ -87,7 +89,9 @@ No providers. ## Resources -No resources. +| Name | Type | +|------|------| +| [google_client_config.current](https://registry.terraform.io/providers/hashicorp/google/latest/docs/data-sources/client_config) | data source | ## Inputs @@ -103,6 +107,7 @@ No resources. | [deletion\_protection](#input\_deletion\_protection) | If the instance should have deletion protection enabled. The database / Bucket can't be deleted when this value is set to `true`. | `bool` | `true` | no | | [disable\_code\_saving](#input\_disable\_code\_saving) | Boolean indicating if code saving is disabled | `bool` | `false` | no | | [domain\_name](#input\_domain\_name) | Domain for accessing the Weights & Biases UI. | `string` | `null` | no | +| [enable\_stackdriver](#input\_enable\_stackdriver) | n/a | `bool` | `false` | no | | [force\_ssl](#input\_force\_ssl) | Enforce SSL through the usage of the Cloud SQL Proxy (cloudsql://) in the DB connection string | `bool` | `false` | no | | [gke\_machine\_type](#input\_gke\_machine\_type) | Specifies the machine type to be allocated for the database | `string` | `"n1-standard-4"` | no | | [gke\_node\_count](#input\_gke\_node\_count) | n/a | `number` | `2` | no | @@ -121,6 +126,7 @@ No resources. | [redis\_tier](#input\_redis\_tier) | Specifies the tier for this Redis instance | `string` | `"STANDARD_HA"` | no | | [resource\_limits](#input\_resource\_limits) | Specifies the resource limits for the wandb deployment | `map(string)` |
{ "cpu": null, "memory": null }
| no | | [resource\_requests](#input\_resource\_requests) | Specifies the resource requests for the wandb deployment | `map(string)` |
{ "cpu": "2000m", "memory": "2G" }
| no | +| [service\_account\_name](#input\_service\_account\_name) | n/a | `string` | `"stackdriver"` | no | | [size](#input\_size) | Deployment size for the instance | `string` | `null` | no | | [ssl](#input\_ssl) | Enable SSL certificate | `bool` | `true` | no | | [subdomain](#input\_subdomain) | Subdomain for accessing the Weights & Biases UI. Default creates record at Route53 Route. | `string` | `null` | no | @@ -129,6 +135,7 @@ No resources. | [wandb\_image](#input\_wandb\_image) | Docker repository of to pull the wandb image from. | `string` | `"wandb/local"` | no | | [wandb\_version](#input\_wandb\_version) | The version of Weights & Biases local to deploy. | `string` | `"latest"` | no | | [weave\_wandb\_env](#input\_weave\_wandb\_env) | Extra environment variables for W&B | `map(string)` | `{}` | no | +| [workload\_account\_id](#input\_workload\_account\_id) | n/a | `string` | `"stackdriver"` | no | ## Outputs diff --git a/main.tf b/main.tf index 408e9b5..320fda0 100644 --- a/main.tf +++ b/main.tf @@ -29,10 +29,13 @@ locals { } module "service_accounts" { - source = "./modules/service_accounts" - namespace = var.namespace - bucket_name = var.bucket_name - depends_on = [module.project_factory_project_services] + source = "./modules/service_accounts" + namespace = var.namespace + bucket_name = var.bucket_name + account_id = var.workload_account_id + service_account_name = var.service_account_name + enable_stackdriver = var.enable_stackdriver + depends_on = [module.project_factory_project_services] } module "kms" { @@ -77,14 +80,15 @@ locals { } module "app_gke" { - source = "./modules/app_gke" - namespace = var.namespace - machine_type = coalesce(try(local.deployment_size[var.size].node_instance, null), var.gke_machine_type) - node_count = coalesce(try(local.deployment_size[var.size].node_count, null), var.gke_node_count) - network = local.network - subnetwork = local.subnetwork - service_account = module.service_accounts.service_account - depends_on = 
[module.project_factory_project_services] + source = "./modules/app_gke" + namespace = var.namespace + machine_type = coalesce(try(local.deployment_size[var.size].node_instance, null), var.gke_machine_type) + node_count = coalesce(try(local.deployment_size[var.size].node_count, null), var.gke_node_count) + network = local.network + subnetwork = local.subnetwork + service_account = module.service_accounts.service_account + create_workload_identity = var.enable_stackdriver + depends_on = [module.project_factory_project_services] } module "app_lb" { @@ -186,6 +190,8 @@ locals { } : {} } +data "google_client_config" "current" {} + module "wandb" { source = "wandb/wandb/helm" version = "1.2.0" @@ -241,6 +247,54 @@ module "wandb" { "ingress.gcp.kubernetes.io/pre-shared-cert" = module.app_lb.certificate } } + # To support otel rds and redis metrics need operator-wandb chart minimum version 0.13.8 ( stackdriver subchart) + stackdriver = var.enable_stackdriver ? { + install = true + stackdriver = { + projectId = data.google_client_config.current.project + } + serviceAccount = { annotations = { "iam.gke.io/gcp-service-account" = module.service_accounts.monitoring_role } } + } : { + install = false + stackdriver = {} + serviceAccount = {} + } + + otel = { + daemonset = var.enable_stackdriver ? 
{ + config = { + receivers = { + prometheus = { + config = { + scrape_configs = [ + { job_name = "stackdriver" + scheme = "http" + metrics_path = "/metrics" + dns_sd_configs = [ + { names = ["stackdriver"] + type = "A" + port = 9255 + } + ] + } + ] + } + } + } + service = { + pipelines = { + metrics = { + receivers = ["hostmetrics", "k8s_cluster", "kubeletstats", "prometheus"] + } + } + } + } + } : { config = { + receivers = {} + service = {} + } + } + } redis = { install = !var.create_redis } mysql = { install = false } diff --git a/modules/app_gke/main.tf b/modules/app_gke/main.tf index e57cbea..5027a22 100644 --- a/modules/app_gke/main.tf +++ b/modules/app_gke/main.tf @@ -1,3 +1,9 @@ +data "google_client_config" "current" {} + +locals { + project_id = data.google_client_config.current.project +} + resource "google_container_cluster" "default" { name = "${var.namespace}-cluster" @@ -11,7 +17,14 @@ resource "google_container_cluster" "default" { evaluation_mode = "PROJECT_SINGLETON_POLICY_ENFORCE" } - + # Conditionally enable workload identity + dynamic "workload_identity_config" { + for_each = var.create_workload_identity == true ? [1] : [] + content { + workload_pool = "${local.project_id}.svc.id.goog" + } + } + ip_allocation_policy { cluster_ipv4_cidr_block = "/14" services_ipv4_cidr_block = "/19" diff --git a/modules/app_gke/variables.tf b/modules/app_gke/variables.tf index a9ec740..fa502bb 100644 --- a/modules/app_gke/variables.tf +++ b/modules/app_gke/variables.tf @@ -43,4 +43,9 @@ variable "parquet_wandb_env" { variable "node_count" { type = number +} + +variable "create_workload_identity" { + description = "Flag to indicate whether to enable workload identity for the service account." 
+ type = bool } \ No newline at end of file diff --git a/modules/service_accounts/main.tf b/modules/service_accounts/main.tf index 724e7d7..ca85630 100644 --- a/modules/service_accounts/main.tf +++ b/modules/service_accounts/main.tf @@ -1,4 +1,5 @@ data "google_client_config" "current" {} +data "google_project" "project" {} resource "random_id" "main" { # 30 bytes ensures that enough characters are generated to satisfy the service account ID requirements, regardless of @@ -60,3 +61,26 @@ resource "google_project_iam_member" "secretmanager_admin" { member = local.sa_member role = "roles/secretmanager.admin" } + + +resource "google_service_account" "workload-identity-user-sa" { + count = var.enable_stackdriver == true ? 1 : 0 + account_id = "stackdriver" + display_name = "Service Account For Workload Identity" + +} + +resource "google_project_iam_member" "monitoring-role" { + count = var.enable_stackdriver == true ? 1 : 0 + project = local.project_id + role = "roles/monitoring.viewer" + member = "serviceAccount:${google_service_account.workload-identity-user-sa[count.index].email}" +} + + +resource "google_project_iam_member" "workload_identity-role" { + count = var.enable_stackdriver == true ? 1 : 0 + project = local.project_id + role = "roles/iam.workloadIdentityUser" + member = "serviceAccount:${local.project_id}.svc.id.goog[default/${var.service_account_name}]" +} \ No newline at end of file diff --git a/modules/service_accounts/outputs.tf b/modules/service_accounts/outputs.tf index 0ed66fa..ba84de5 100644 --- a/modules/service_accounts/outputs.tf +++ b/modules/service_accounts/outputs.tf @@ -2,4 +2,8 @@ output "service_account" { value = google_service_account.main description = "The service account." +} + +output "monitoring_role" { + value = var.enable_stackdriver == true ? 
google_service_account.workload-identity-user-sa[0].email : null } \ No newline at end of file diff --git a/modules/service_accounts/variables.tf b/modules/service_accounts/variables.tf index e4d4bb8..6cc7675 100644 --- a/modules/service_accounts/variables.tf +++ b/modules/service_accounts/variables.tf @@ -7,4 +7,19 @@ variable "bucket_name" { type = string description = "Existing bucket the service account will access" default = "" +} + +variable "account_id" { + description = "The ID of the Google Cloud Platform (GCP) account." + type = string +} + +variable "service_account_name" { + description = "The name of the service account." + type = string +} + +variable "enable_stackdriver" { + description = "Flag to indicate whether to enable workload identity for the service account." + type = bool } \ No newline at end of file diff --git a/variables.tf b/variables.tf index a2cbff8..57aa658 100644 --- a/variables.tf +++ b/variables.tf @@ -253,3 +253,18 @@ variable "parquet_wandb_env" { description = "Extra environment variables for W&B" default = {} } + +variable "enable_stackdriver" { + type = bool + default = false +} + +variable "workload_account_id" { + type = string + default = "stackdriver" +} + +variable "service_account_name" { + type = string + default = "stackdriver" +} \ No newline at end of file From 24c2227a64974bfd543c1f0b9c2f40fd6ae7f052 Mon Sep 17 00:00:00 2001 From: semantic-release-bot Date: Wed, 5 Jun 2024 18:14:08 +0000 Subject: [PATCH 5/7] chore(release): version 3.1.0 [skip ci] ## [3.1.0](https://github.com/wandb/terraform-google-wandb/compare/v3.0.5...v3.1.0) (2024-06-05) ### Features * added support for stackdriver and otel metrics ([#126](https://github.com/wandb/terraform-google-wandb/issues/126)) ([1e8777a](https://github.com/wandb/terraform-google-wandb/commit/1e8777af0f6bf6e8260a0faa488302f631b716b3)) --- CHANGELOG.md | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index 13038ce..2a5c4a0 100644 --- 
a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,6 +2,13 @@ All notable changes to this project will be documented in this file. +## [3.1.0](https://github.com/wandb/terraform-google-wandb/compare/v3.0.5...v3.1.0) (2024-06-05) + + +### Features + +* added support for stackdriver and otel metrics ([#126](https://github.com/wandb/terraform-google-wandb/issues/126)) ([1e8777a](https://github.com/wandb/terraform-google-wandb/commit/1e8777af0f6bf6e8260a0faa488302f631b716b3)) + ### [3.0.5](https://github.com/wandb/terraform-google-wandb/compare/v3.0.4...v3.0.5) (2024-06-05) From 34c5d94da5ba75d9d5c7ad6ebbd6aef66bf702c4 Mon Sep 17 00:00:00 2001 From: Aditya Choudhari <48932219+adityachoudhari26@users.noreply.github.com> Date: Wed, 5 Jun 2024 11:30:32 -0700 Subject: [PATCH 6/7] fix: Consistent object type for redis (#133) --- main.tf | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/main.tf b/main.tf index 320fda0..dbbd542 100644 --- a/main.tf +++ b/main.tf @@ -232,7 +232,17 @@ module "wandb" { ttlInSeconds = 604800 caCertPath = "/etc/ssl/certs/redis_ca.pem" } - } : {} + } : { + password = "" + host = "" + port = 0 + caCert = "" + params = { + tls = false + ttlInSeconds = 0 + caCertPath = "" + } + } } app = { @@ -255,9 +265,9 @@ module "wandb" { } serviceAccount = { annotations = { "iam.gke.io/gcp-service-account" = module.service_accounts.monitoring_role } } } : { - install = false - stackdriver = {} - serviceAccount = {} + install = false + stackdriver = {} + serviceAccount = {} } otel = { From 255a7f10a896310e85393120c625795664715139 Mon Sep 17 00:00:00 2001 From: semantic-release-bot Date: Wed, 5 Jun 2024 18:31:05 +0000 Subject: [PATCH 7/7] chore(release): version 3.1.1 [skip ci] ### [3.1.1](https://github.com/wandb/terraform-google-wandb/compare/v3.1.0...v3.1.1) (2024-06-05) ### Bug Fixes * Consistent object type for redis ([#133](https://github.com/wandb/terraform-google-wandb/issues/133)) 
([34c5d94](https://github.com/wandb/terraform-google-wandb/commit/34c5d94da5ba75d9d5c7ad6ebbd6aef66bf702c4)) --- CHANGELOG.md | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index 2a5c4a0..65bb069 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,6 +2,13 @@ All notable changes to this project will be documented in this file. +### [3.1.1](https://github.com/wandb/terraform-google-wandb/compare/v3.1.0...v3.1.1) (2024-06-05) + + +### Bug Fixes + +* Consistent object type for redis ([#133](https://github.com/wandb/terraform-google-wandb/issues/133)) ([34c5d94](https://github.com/wandb/terraform-google-wandb/commit/34c5d94da5ba75d9d5c7ad6ebbd6aef66bf702c4)) + ## [3.1.0](https://github.com/wandb/terraform-google-wandb/compare/v3.0.5...v3.1.0) (2024-06-05)
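
Note on patches 2 and 6: the Redis changes appear to be about keeping both branches of a Terraform conditional the same shape. The fragment below is a minimal, self-contained sketch of that pattern, not the module's actual code. The attribute names are taken from the hunks above; the `create_redis` default, the local name `redis_values`, the output name, and the host and port values are placeholders assumed for illustration.

```hcl
# Sketch only: variable default, local name, and connection values are
# hypothetical. Attribute names mirror the redis block in main.tf above.
variable "create_redis" {
  type    = bool
  default = true
}

locals {
  # Both branches declare the same attributes with the same value types,
  # so the conditional resolves to a single object type.
  redis_values = var.create_redis ? {
    password = ""
    host     = "10.0.0.3" # placeholder address
    port     = 6379       # placeholder port
    caCert   = ""
    params = {
      tls          = true
      ttlInSeconds = 604800
      caCertPath   = "/etc/ssl/certs/redis_ca.pem"
    }
  } : {
    password = ""
    host     = ""
    port     = 0
    caCert   = ""
    params = {
      tls          = false
      ttlInSeconds = 0
      caCertPath   = ""
    }
  }
}

# Mirrors `redis = { install = !var.create_redis }` from patch 2: the bundled
# Redis chart is installed only when no managed instance is created.
output "install_bundled_redis" {
  value = !var.create_redis
}
```

With `create_redis = false` the empty-valued branch is selected, which matches the placeholder object that patch 6 substitutes for the earlier `null` and `{}` fallbacks.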