Permission Denied on all pages after upgrading to 1.4.4 from 1.3.77 #4299

Closed
ikogan opened this issue Apr 29, 2024 · 7 comments

@ikogan

ikogan commented Apr 29, 2024

What went wrong?

What happened:

  • Upgraded Grafana to v10.4.1 and OnCall to 1.4.4. Old versions were 10.2.2 and 1.3.77.
  • Confirmed that plugin and backend versions are the same.
  • Confirmed that the user is an admin
  • Attempted to test OnCall
  • Got "Detail: You do not have permission to perform this action." popup.
  • Disconnected and deleted the OnCall configuration, deleted the service accounts and tokens, then reconnected. OnCall shows "Connected".
  • Still receiving the above error, logs from OnCall Engine show several 403s:
Container: oncall
[uWSGI] getting INI configuration from uwsgi.ini
2024-04-29T18:25:41.462503736Z [log-encoder] registered format ${strftime:%Y-%m-%d %H:%M:%S} ${msgnl}
2024-04-29T18:25:43.058424886Z 2024-04-29 18:25:43 *** Starting uWSGI 2.0.21 (64bit) on [Mon Apr 29 18:25:41 2024] ***
2024-04-29T18:25:43.058477105Z 2024-04-29 18:25:43 compiled with version: 12.2.1 20220924 on 25 April 2024 20:13:44
2024-04-29 18:25:43 os: Linux-5.4.0-174-generic #193-Ubuntu SMP Thu Mar 7 14:29:28 UTC 2024
2024-04-29 18:25:43 nodename: grafana-oncall-engine-777bff7447-czdjj
2024-04-29 18:25:43 machine: x86_64
2024-04-29 18:25:43 clock source: unix
2024-04-29T18:25:43.058520031Z 2024-04-29 18:25:43 pcre jit disabled
2024-04-29T18:25:43.058527963Z 2024-04-29 18:25:43 detected number of CPU cores: 8
2024-04-29 18:25:43 current working directory: /etc/app
2024-04-29T18:25:43.058544002Z 2024-04-29 18:25:43 writing pidfile to /tmp/project-master.pid
2024-04-29 18:25:43 detected binary path: /usr/local/bin/uwsgi
2024-04-29 18:25:43 chdir() to /etc/app
2024-04-29 18:25:43 your memory page size is 4096 bytes
2024-04-29T18:25:43.058575990Z 2024-04-29 18:25:43 detected max file descriptor number: 1048576
2024-04-29 18:25:43 lock engine: pthread robust mutexes
2024-04-29T18:25:43.058592383Z 2024-04-29 18:25:43 thunder lock: disabled (you can enable it with --thunder-lock)
2024-04-29T18:25:43.058600673Z 2024-04-29 18:25:43 uWSGI http bound on 0.0.0.0:8080 fd 7
2024-04-29T18:25:43.058608953Z 2024-04-29 18:25:43 uwsgi socket 0 bound to TCP address 127.0.0.1:44507 (port auto-assigned) fd 6
2024-04-29T18:25:43.058617290Z 2024-04-29 18:25:43 Python version: 3.11.4 (main, Aug  9 2023, 08:38:11) [GCC 12.2.1 20220924]
2024-04-29T18:25:43.058625447Z 2024-04-29 18:25:43 Python main interpreter initialized at 0x7f9735747578
2024-04-29T18:25:43.058633553Z 2024-04-29 18:25:43 python threads support enabled
2024-04-29T18:25:43.058641639Z 2024-04-29 18:25:43 your server socket listen backlog is limited to 1024 connections
2024-04-29T18:25:43.058649766Z 2024-04-29 18:25:43 your mercy for graceful operations on workers is 60 seconds
2024-04-29T18:25:43.058668148Z 2024-04-29 18:25:43 mapped 855306 bytes (835 KB) for 5 cores
2024-04-29 18:25:43 *** Operational MODE: preforking ***
2024-04-29T18:25:43.058689806Z 2024-04-29 18:25:43 WSGI app 0 (mountpoint='') ready in 2 seconds on interpreter 0x7f9735747578 pid: 1 (default app)
2024-04-29T18:25:43.058703176Z 2024-04-29 18:25:43 spawned uWSGI master process (pid: 1)
2024-04-29T18:25:43.058714166Z 2024-04-29 18:25:43 spawned uWSGI worker 1 (pid: 7, cores: 1)
2024-04-29T18:25:43.058725921Z 2024-04-29 18:25:43 spawned uWSGI worker 2 (pid: 8, cores: 1)
2024-04-29T18:25:43.058763205Z 2024-04-29 18:25:43 spawned uWSGI worker 3 (pid: 9, cores: 1)
2024-04-29T18:25:43.058774636Z 2024-04-29 18:25:43 spawned uWSGI worker 4 (pid: 10, cores: 1)
2024-04-29T18:25:43.058784964Z 2024-04-29 18:25:43 spawned uWSGI worker 5 (pid: 11, cores: 1)
2024-04-29T18:25:43.058796543Z 2024-04-29 18:25:43 spawned uWSGI http 1 (pid: 12)
2024-04-29 18:25:47 source=engine:app google_trace_id=none logger=root inbound latency=0.403098 status=200 method=GET path=/startupprobe/ user_agent=kube-probe/1.27 content-length=0 slow=0 
2024-04-29 18:25:47 source=engine:uwsgi status=200 method=GET path=/startupprobe/ latency=0.407084 google_trace_id=- protocol=HTTP/1.1 resp_size=221 req_body_size=0
2024-04-29 18:25:47 source=engine:app google_trace_id=none logger=root inbound latency=0.000464 status=200 method=GET path=/ready/ user_agent=kube-probe/1.27 content-length=0 slow=0 
2024-04-29T18:25:47.461725580Z 2024-04-29 18:25:47 source=engine:uwsgi status=200 method=GET path=/ready/ latency=0.001164 google_trace_id=- protocol=HTTP/1.1 resp_size=221 req_body_size=0
2024-04-29 18:25:53 source=engine:app google_trace_id=none logger=root inbound latency=0.018573 status=403 method=GET path=/api/internal/v1/alert_receive_channels/integration_options user_agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0 content-length=0 slow=0 user_id=5 org_id=1 org_slug=self_hosted_org 
2024-04-29T18:25:53.807858297Z 2024-04-29 18:25:53 source=engine:app google_trace_id=none logger=django.request Forbidden: /api/internal/v1/alert_receive_channels/integration_options
2024-04-29T18:25:53.808998990Z 2024-04-29 18:25:53 source=engine:uwsgi status=403 method=GET path=/api/internal/v1/alert_receive_channels/integration_options latency=0.019574 google_trace_id=- protocol=HTTP/1.1 resp_size=309 req_body_size=0
2024-04-29 18:25:53 source=engine:app google_trace_id=none logger=root inbound latency=0.031375 status=200 method=GET path=/api/internal/v1/user user_agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0 content-length=0 slow=0 user_id=5 org_id=1 org_slug=self_hosted_org 
2024-04-29T18:25:53.841104127Z 2024-04-29 18:25:53 source=engine:uwsgi status=200 method=GET path=/api/internal/v1/user latency=0.032027 google_trace_id=- protocol=HTTP/1.1 resp_size=1400 req_body_size=0
2024-04-29 18:25:53 Mon Apr 29 18:25:53 2024 - SIGPIPE: writing to a closed pipe/socket/fd (probably the client disconnected) on request /api/internal/v1/plugin/status (ip 10.42.4.179) !!!
2024-04-29 18:25:53 source=engine:app google_trace_id=none logger=root inbound latency=0.022968 status=403 method=GET path=/api/internal/v1/teams user_agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0 content-length=0 slow=0 user_id=5 org_id=1 org_slug=self_hosted_org 
2024-04-29T18:25:53.883520765Z 2024-04-29 18:25:53 source=engine:app google_trace_id=none logger=django.request Forbidden: /api/internal/v1/teams
2024-04-29T18:25:53.883543956Z 2024-04-29 18:25:53 source=engine:uwsgi status=403 method=GET path=/api/internal/v1/teams?include_no_team=true&only_include_notifiable_teams=false&search=&short=true latency=0.023863 google_trace_id=- protocol=HTTP/1.1 resp_size=309 req_body_size=0
2024-04-29 18:25:53 Mon Apr 29 18:25:53 2024 - SIGPIPE: writing to a closed pipe/socket/fd (probably the client disconnected) on request /api/internal/v1/alertgroups/bulk_action_options (ip 10.42.4.179) !!!
2024-04-29 18:25:53 Mon Apr 29 18:25:53 2024 - SIGPIPE: writing to a closed pipe/socket/fd (probably the client disconnected) on request /api/internal/v1/alertgroups/silence_options (ip 10.42.4.179) !!!
2024-04-29 18:25:53 source=engine:app google_trace_id=none logger=root inbound latency=0.027425 status=200 method=GET path=/api/internal/v1/features user_agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0 content-length=0 slow=0 user_id=5 org_id=1 org_slug=self_hosted_org 
2024-04-29T18:25:53.927210954Z 2024-04-29 18:25:53 source=engine:uwsgi status=200 method=GET path=/api/internal/v1/features latency=0.028232 google_trace_id=- protocol=HTTP/1.1 resp_size=335 req_body_size=0
2024-04-29 18:25:53 source=engine:app google_trace_id=none logger=root inbound latency=0.016716 status=403 method=GET path=/api/internal/v1/alertgroups/filters user_agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0 content-length=0 slow=0 user_id=5 org_id=1 org_slug=self_hosted_org 
2024-04-29T18:25:53.959087306Z 2024-04-29 18:25:53 source=engine:app google_trace_id=none logger=django.request Forbidden: /api/internal/v1/alertgroups/filters
2024-04-29 18:25:53 source=engine:uwsgi status=403 method=GET path=/api/internal/v1/alertgroups/filters latency=0.017737 google_trace_id=- protocol=HTTP/1.1 resp_size=309 req_body_size=0
2024-04-29 18:25:54 source=engine:app google_trace_id=none logger=root inbound latency=0.023076 status=403 method=GET path=/api/internal/v1/cloud_connection user_agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0 content-length=0 slow=0 user_id=5 org_id=1 org_slug=self_hosted_org 
2024-04-29T18:25:54.014240619Z 2024-04-29 18:25:54 source=engine:app google_trace_id=none logger=django.request Forbidden: /api/internal/v1/cloud_connection
2024-04-29T18:25:54.014501261Z 2024-04-29 18:25:54 source=engine:uwsgi status=403 method=GET path=/api/internal/v1/cloud_connection latency=0.024058 google_trace_id=- protocol=HTTP/1.1 resp_size=317 req_body_size=0
2024-04-29 18:25:54 source=engine:app google_trace_id=none logger=root inbound latency=0.032225 status=403 method=GET path=/api/internal/v1/alert_receive_channels/integration_options user_agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0 content-length=0 slow=0 user_id=5 org_id=1 org_slug=self_hosted_org 
2024-04-29T18:25:54.419363843Z 2024-04-29 18:25:54 source=engine:app google_trace_id=none logger=django.request Forbidden: /api/internal/v1/alert_receive_channels/integration_options
2024-04-29T18:25:54.419783345Z 2024-04-29 18:25:54 source=engine:uwsgi status=403 method=GET path=/api/internal/v1/alert_receive_channels/integration_options latency=0.033056 google_trace_id=- protocol=HTTP/1.1 resp_size=309 req_body_size=0
2024-04-29 18:25:54 source=engine:app google_trace_id=none logger=root inbound latency=0.050206 status=403 method=GET path=/api/internal/v1/teams user_agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0 content-length=0 slow=0 user_id=5 org_id=1 org_slug=self_hosted_org 
2024-04-29T18:25:54.537010718Z 2024-04-29 18:25:54 source=engine:app google_trace_id=none logger=django.request Forbidden: /api/internal/v1/teams
2024-04-29T18:25:54.537015588Z 2024-04-29 18:25:54 source=engine:uwsgi status=403 method=GET path=/api/internal/v1/teams?include_no_team=true&only_include_notifiable_teams=false&search=&short=true latency=0.051040 google_trace_id=- protocol=HTTP/1.1 resp_size=309 req_body_size=0
2024-04-29 18:25:54 source=engine:app google_trace_id=none logger=root inbound latency=0.735796 status=403 method=GET path=/api/internal/v1/organization user_agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0 content-length=0 slow=0 user_id=5 org_id=1 org_slug=self_hosted_org 
2024-04-29T18:25:54.552984563Z 2024-04-29 18:25:54 source=engine:app google_trace_id=none logger=django.request Forbidden: /api/internal/v1/organization
2024-04-29 18:25:54 source=engine:uwsgi status=403 method=GET path=/api/internal/v1/organization latency=0.741051 google_trace_id=- protocol=HTTP/1.1 resp_size=314 req_body_size=0
2024-04-29 18:25:54 source=engine:app google_trace_id=none logger=root inbound latency=0.038267 status=200 method=GET path=/api/internal/v1/features user_agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0 content-length=0 slow=0 user_id=5 org_id=1 org_slug=self_hosted_org 
2024-04-29T18:25:54.624930650Z 2024-04-29 18:25:54 source=engine:uwsgi status=200 method=GET path=/api/internal/v1/features latency=0.039083 google_trace_id=- protocol=HTTP/1.1 resp_size=335 req_body_size=0
2024-04-29 18:25:54 source=engine:app google_trace_id=none logger=root inbound latency=0.758557 status=403 method=GET path=/api/internal/v1/alertgroups/bulk_action_options user_agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0 content-length=0 slow=0 user_id=5 org_id=1 org_slug=self_hosted_org 
2024-04-29T18:25:54.674999432Z 2024-04-29 18:25:54 source=engine:app google_trace_id=none logger=django.request Forbidden: /api/internal/v1/alertgroups/bulk_action_options
2024-04-29T18:25:54.675004121Z 2024-04-29 18:25:54 source=engine:uwsgi status=403 method=GET path=/api/internal/v1/alertgroups/bulk_action_options latency=0.762940 google_trace_id=- protocol=HTTP/1.1 resp_size=309 req_body_size=0
2024-04-29 18:25:54 source=engine:app google_trace_id=none logger=root inbound latency=0.757136 status=403 method=GET path=/api/internal/v1/alertgroups/silence_options user_agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0 content-length=0 slow=0 user_id=5 org_id=1 org_slug=self_hosted_org 
2024-04-29T18:25:54.681736616Z 2024-04-29 18:25:54 source=engine:app google_trace_id=none logger=django.request Forbidden: /api/internal/v1/alertgroups/silence_options
2024-04-29 18:25:54 source=engine:uwsgi status=403 method=GET path=/api/internal/v1/alertgroups/silence_options latency=0.760844 google_trace_id=- protocol=HTTP/1.1 resp_size=309 req_body_size=0
2024-04-29 18:25:54 source=engine:app google_trace_id=none logger=apps.grafana_plugin.views.status authenticated via <class 'apps.auth_token.auth.BasePluginAuthentication'>, user=[5: kogan@REDACTED] org=[self_hosted_stack]
2024-04-29T18:25:54.689810977Z 2024-04-29 18:25:54 source=engine:app google_trace_id=none logger=apps.grafana_plugin.views.status Status - check token org=1 status=1 token_ok=True
2024-04-29 18:25:54 source=engine:app google_trace_id=none logger=root inbound latency=0.013835 status=403 method=GET path=/api/internal/v1/cloud_connection user_agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0 content-length=0 slow=0 user_id=5 org_id=1 org_slug=self_hosted_org 
2024-04-29T18:25:54.714770851Z 2024-04-29 18:25:54 source=engine:app google_trace_id=none logger=django.request Forbidden: /api/internal/v1/cloud_connection
2024-04-29 18:25:54 source=engine:uwsgi status=403 method=GET path=/api/internal/v1/cloud_connection latency=0.015358 google_trace_id=- protocol=HTTP/1.1 resp_size=317 req_body_size=0
2024-04-29 18:25:54 source=engine:app google_trace_id=none logger=root inbound latency=0.900874 status=200 method=POST path=/api/internal/v1/plugin/status user_agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0 content-length=0 slow=0 user_id=5 org_id=1 org_slug=self_hosted_org 
2024-04-29 18:25:54 source=engine:uwsgi status=200 method=POST path=/api/internal/v1/plugin/status latency=0.908849 google_trace_id=- protocol=HTTP/1.1 resp_size=519 req_body_size=0
2024-04-29 18:25:54 source=engine:app google_trace_id=none logger=root inbound latency=0.009039 status=403 method=GET path=/api/internal/v1/notification_policies/notify_by_options user_agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0 content-length=0 slow=0 user_id=5 org_id=1 org_slug=self_hosted_org 
2024-04-29T18:25:54.916857031Z 2024-04-29 18:25:54 source=engine:app google_trace_id=none logger=django.request Forbidden: /api/internal/v1/notification_policies/notify_by_options
2024-04-29 18:25:54 source=engine:uwsgi status=403 method=GET path=/api/internal/v1/notification_policies/notify_by_options latency=0.009997 google_trace_id=- protocol=HTTP/1.1 resp_size=309 req_body_size=0
2024-04-29T18:25:54.919080713Z 2024-04-29 18:25:54 source=engine:app google_trace_id=none logger=root inbound latency=0.015709 status=403 method=OPTIONS path=/api/internal/v1/notification_policies user_agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0 content-length=0 slow=0 user_id=5 org_id=1 org_slug=self_hosted_org 
2024-04-29T18:25:54.919102247Z 2024-04-29 18:25:54 source=engine:app google_trace_id=none logger=django.request Forbidden: /api/internal/v1/notification_policies
2024-04-29 18:25:54 source=engine:uwsgi status=403 method=OPTIONS path=/api/internal/v1/notification_policies latency=0.016826 google_trace_id=- protocol=HTTP/1.1 resp_size=315 req_body_size=0
2024-04-29 18:25:55 source=engine:app google_trace_id=none logger=root inbound latency=0.014814 status=403 method=GET path=/api/internal/v1/alert_receive_channels/integration_options user_agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0 content-length=0 slow=0 user_id=5 org_id=1 org_slug=self_hosted_org 
2024-04-29T18:25:55.918740104Z 2024-04-29 18:25:55 source=engine:app google_trace_id=none logger=django.request Forbidden: /api/internal/v1/alert_receive_channels/integration_options
2024-04-29 18:25:55 source=engine:uwsgi status=403 method=GET path=/api/internal/v1/alert_receive_channels/integration_options latency=0.015618 google_trace_id=- protocol=HTTP/1.1 resp_size=309 req_body_size=0
2024-04-29 18:25:55 source=engine:app google_trace_id=none logger=root inbound latency=0.018528 status=403 method=GET path=/api/internal/v1/teams user_agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0 content-length=0 slow=0 user_id=5 org_id=1 org_slug=self_hosted_org 
2024-04-29T18:25:55.930708612Z 2024-04-29 18:25:55 source=engine:app google_trace_id=none logger=django.request Forbidden: /api/internal/v1/teams
2024-04-29 18:25:55 source=engine:uwsgi status=403 method=GET path=/api/internal/v1/teams?include_no_team=true&only_include_notifiable_teams=false&search=&short=true latency=0.019667 google_trace_id=- protocol=HTTP/1.1 resp_size=309 req_body_size=0
2024-04-29 18:25:55 source=engine:app google_trace_id=none logger=root inbound latency=0.015779 status=200 method=GET path=/api/internal/v1/features user_agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0 content-length=0 slow=0 user_id=5 org_id=1 org_slug=self_hosted_org 
2024-04-29 18:25:55 source=engine:uwsgi status=200 method=GET path=/api/internal/v1/features latency=0.016623 google_trace_id=- protocol=HTTP/1.1 resp_size=335 req_body_size=0
2024-04-29 18:25:55 source=engine:app google_trace_id=none logger=root inbound latency=0.025037 status=403 method=GET path=/api/internal/v1/organization user_agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0 content-length=0 slow=0 user_id=5 org_id=1 org_slug=self_hosted_org 
2024-04-29 18:25:55 source=engine:app google_trace_id=none logger=django.request Forbidden: /api/internal/v1/organization
2024-04-29T18:25:55.944890686Z 2024-04-29 18:25:55 source=engine:uwsgi status=403 method=GET path=/api/internal/v1/organization latency=0.026414 google_trace_id=- protocol=HTTP/1.1 resp_size=314 req_body_size=0
2024-04-29 18:25:56 source=engine:app google_trace_id=none logger=root inbound latency=0.011136 status=403 method=GET path=/api/internal/v1/cloud_connection user_agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0 content-length=0 slow=0 user_id=5 org_id=1 org_slug=self_hosted_org 
2024-04-29T18:25:56.009399823Z 2024-04-29 18:25:56 source=engine:app google_trace_id=none logger=django.request Forbidden: /api/internal/v1/cloud_connection
2024-04-29 18:25:56 source=engine:uwsgi status=403 method=GET path=/api/internal/v1/cloud_connection latency=0.012029 google_trace_id=- protocol=HTTP/1.1 resp_size=317 req_body_size=0
2024-04-29 18:25:56 source=engine:app google_trace_id=none logger=root inbound latency=0.009305 status=403 method=GET path=/api/internal/v1/organization user_agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0 content-length=0 slow=0 user_id=5 org_id=1 org_slug=self_hosted_org 
2024-04-29 18:25:56 source=engine:app google_trace_id=none logger=django.request Forbidden: /api/internal/v1/organization
2024-04-29 18:25:56 source=engine:uwsgi status=403 method=GET path=/api/internal/v1/organization latency=0.010641 google_trace_id=- protocol=HTTP/1.1 resp_size=314 req_body_size=0

What did you expect to happen:

  • Expected to see existing escalation chains, alerts, etc.

How do we reproduce it?

It's hard to determine how to reproduce this issue given it occurred after an upgrade.

Grafana OnCall Version

v1.4.4

Product Area

Auth, Helm, Other

Grafana OnCall Platform?

Kubernetes

User's Browser?

Microsoft Edge 124.0.2478.51

Anything else to add?

  • Our users login through Azure SSO to Grafana.
  • We deploy Grafana separately from OnCall but in the same k8s namespace
  • Our "backend URL" is using the internal k8s fqdn of OnCall (http://grafana-oncall-engine.redacted.svc.cluster.local:8080)
  • OnCall is using the frontend fqdn for our Grafana instance (https://metrics.redacted)
  • The frontend FQDN configured in OnCall matches the Grafana root_url.
  • We've added the following ingress definition to allow this to work:
# OnCall needs `/integrations` routed to it at the frontend URL
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: {{ include "oncall.engine.fullname" (dict "Chart" .Chart "Release" .Release "Values" .Values.oncall) }}
  labels:
    {{- include "oncall.labels" $ | nindent 4 }}
  {{- if .Values.redacted.oncall.integrations.allowedips }}
  annotations:
    nginx.ingress.kubernetes.io/whitelist-source-range: {{ .Values.redacted.oncall.integrations.allowedips | join ","}}
  {{- end }}
spec:
  tls:
    - hosts:
        - {{ .Values.oncall.base_url | quote }}
  rules:
    - host: {{ .Values.oncall.base_url | quote }}
      http:
        paths:
          - path: /integrations
            pathType: Prefix
            backend:
              service:
                name: {{ include "oncall.engine.fullname" (dict "Chart" .Chart "Release" .Release "Values" .Values.oncall) }}
                port:
                  number: 8080

Note that I've tried removing the list of allowed IPs, which is currently set to the node public IP range, and got the same result.

Is there a way to enable deeper debugging?
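
While waiting on deeper debugging options, the engine log above can at least be tallied by status and path. The following is a minimal sketch, assuming the `source=engine:uwsgi` access lines keep the `key=value` layout shown in the paste; the `summarize` helper is hypothetical and not part of OnCall.

```python
import re
from collections import Counter

# Matches the key=value fields found on the "source=engine:uwsgi" lines.
FIELD = re.compile(r"(\w+)=(\S+)")


def summarize(lines):
    """Count (status, path) pairs from uwsgi access-log lines."""
    counts = Counter()
    for line in lines:
        if "source=engine:uwsgi" not in line:
            # Skip app-level and startup lines so each request is counted once.
            continue
        fields = dict(FIELD.findall(line))
        if "status" in fields and "path" in fields:
            # Drop the query string so variants of one endpoint group together.
            path = fields["path"].split("?", 1)[0]
            counts[(fields["status"], path)] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (assumed): pipe the container log into this script on stdin.
    for (status, path), n in sorted(summarize(sys.stdin).items()):
        print(status, n, path)
```

Grouping this way makes it easier to see which endpoints return 200 (here `/user`, `/features`, `/plugin/status`) and which return 403.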

@mderynck
Collaborator

Is mirageSecretKey configured in your Helm config for OnCall? If it is not set, it will be generated, which could cause the API key used by the plugin to disagree with what is stored in the OnCall backend.

@ikogan
Author

ikogan commented Apr 29, 2024

Just to verify, I checked the environment variables in the pod and MIRAGE_SECRET_KEY matches the value in our oncall-secret. I've also verified that the deployment is using the value from that secret and the secret is still owned by the SealedSecret that we use to create it.

I did notice what seems to be a MIRAGE_CIPHER_IV variable that we do not currently control. Could that be having a similar impact?
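
For anyone repeating the check described above, here is a minimal sketch of the comparison. It assumes the value under the Secret's `data` field is base64-encoded (as Kubernetes stores it) and that the pod's MIRAGE_SECRET_KEY value has already been copied out by hand; the `secret_matches` helper is hypothetical.

```python
import base64
import hmac


def secret_matches(pod_env_value: str, secret_data_b64: str) -> bool:
    """Return True if the pod's env value equals the decoded Secret value."""
    decoded = base64.b64decode(secret_data_b64).decode("utf-8")
    # compare_digest avoids revealing where the two strings differ via timing.
    return hmac.compare_digest(pod_env_value.encode(), decoded.encode())
```

A trailing newline in either value makes the comparison fail, which is worth ruling out before concluding the keys differ.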

@mderynck
Collaborator

mderynck commented Apr 29, 2024

MIRAGE_CIPHER_IV should be OK: it has a default, so it will at least be consistent if not explicitly set. I notice the call to /status is returning 200 OK, so we could see if it has anything interesting. When you first open OnCall, you should be able to see this by opening dev tools in the browser and inspecting the network tab. The response for status should have fields like this:

{
	"is_installed": true,
	"token_ok": true,
	"allow_signup": true,
	"is_user_anonymous": false,
	"license": "OpenSource",
	"version": "dev-oss",
	"recaptcha_site_key": ...,
	"currently_undergoing_maintenance_message": null,
	"api_url": ...
}

@ikogan
Author

ikogan commented Apr 30, 2024

{
    "is_installed": true,
    "token_ok": true,
    "allow_signup": true,
    "is_user_anonymous": false,
    "license": "OpenSource",
    "version": "1.4.4",
    "recaptcha_site_key": "redacted",
    "currently_undergoing_maintenance_message": null,
    "api_url": "https://metrics.sand.redacted/"
}

The only thing that may be interesting is that the api_url has no path component. Should it? Also, I noticed the call to user seems to return fine. It's calling https://metrics.sand.redacted/api/plugin-proxy/grafana-oncall-app/api/internal/v1/user/ and getting this:

{
    "pk": "redacted",
    "organization": {
        "pk": "redacted",
        "name": "Self-Hosted Organization"
    },
    "current_team": null,
    "email": "kogan@redacted",
    "username": "kogan@redacted",
    "name": "Kogan, Ilya",
    "role": 0,
    "avatar": "/avatar/redacted",
    "avatar_full": "https://metrics.sand.redacted/avatar/redacted",
    "timezone": "America/Detroit",
    "working_hours": {
        "monday": [
            {
                "start": "09:00:00",
                "end": "17:00:00"
            }
        ],
        "tuesday": [
            {
                "start": "09:00:00",
                "end": "17:00:00"
            }
        ],
        "wednesday": [
            {
                "start": "09:00:00",
                "end": "17:00:00"
            }
        ],
        "thursday": [
            {
                "start": "09:00:00",
                "end": "17:00:00"
            }
        ],
        "friday": [
            {
                "start": "09:00:00",
                "end": "17:00:00"
            }
        ],
        "saturday": [],
        "sunday": []
    },
    "unverified_phone_number": null,
    "verified_phone_number": null,
    "slack_user_identity": null,
    "telegram_configuration": null,
    "messaging_backends": {
        "MOBILE_APP": {
            "connected": false
        },
        "MOBILE_APP_CRITICAL": {
            "connected": false
        },
        "EMAIL": {
            "email": "kogan@redacted"
        }
    },
    "notification_chain_verbal": {
        "default": "Email",
        "important": "Email"
    },
    "cloud_connection_status": 1,
    "hide_phone_number": false,
    "has_google_oauth2_connected": false,
    "is_currently_oncall": false,
    "google_calendar_settings": null,
    "rbac_permissions": []
}

The calls to the following all return 403s:

  • organization/
  • teams/
  • cloud_connection/
  • integration_options/
  • notification_policies/
  • notify_by_options/
  • users/ (users is what I clicked on)

All of these are rooted at https://metrics.sand.redacted/api/plugin-proxy/grafana-oncall-app/api/internal/v1/
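
The same survey can be repeated outside the browser. Below is a minimal sketch, assuming the plugin-proxy base URL above and whatever session headers your Grafana login requires (the `headers` argument is a placeholder, not a known-good value); the `probe` helper is hypothetical.

```python
from urllib import error, request


def probe(base_url, paths, headers=None, opener=request.urlopen):
    """Return {path: HTTP status} for each path under base_url."""
    results = {}
    for path in paths:
        url = base_url.rstrip("/") + "/" + path.lstrip("/")
        req = request.Request(url, headers=headers or {})
        try:
            with opener(req) as resp:
                results[path] = resp.status
        except error.HTTPError as exc:
            # 4xx/5xx responses are raised as HTTPError; record the code.
            results[path] = exc.code
    return results
```

Calling `probe(base, ["user", "organization", "teams", "cloud_connection"], headers=...)` should reproduce the split between 200 and 403 responses seen in the network tab. Connection-level failures (`URLError`) are deliberately not caught, so they surface immediately.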

@mderynck
Collaborator

This is strange: the /status endpoint uses the info from the plugin to authenticate, so Grafana -> OnCall should be OK. OnCall -> Grafana can be tested as follows:

  1. Connect to the oncall backend engine pod
  2. From the shell: python manage.py shell
  3. Then use this script:
from apps.user_management.models import Organization
from apps.grafana_plugin.helpers.client import GrafanaAPIClient

# Default IDs for self-hosted orgs
org = Organization.objects.get(stack_id=5, org_id=100)
client = GrafanaAPIClient(api_url=org.grafana_url, api_token=org.api_token)
# Checks the stored URL and token against Grafana's /api/org endpoint
client.check_token()

You should see a response something like this:

source=engine:app google_trace_id=none logger=root outbound latency=0.02454587600004743 status=200 method=HEAD url=http://localhost:3000/api/org slow=0 
(None, {'url': 'http://localhost:3000/api/org', 'connected': True, 'status_code': 200, 'message': 'OK'})

This should tell us whether the URL OnCall is using to talk to Grafana and the token are both OK.

@ikogan
Author

ikogan commented May 3, 2024

Yup, I'm seeing generally that, except obviously the url is different:

>>> client.check_token()
source=engine:app google_trace_id=none logger=root outbound latency=0.079090875107795 status=200 method=HEAD url=https://metrics.sand.redacted/api/org slow=0
(None, {'url': 'https://metrics.sand.redacted/api/org', 'connected': True, 'status_code': 200, 'message': 'OK'})

@ikogan
Author

ikogan commented May 14, 2024

This has mysteriously been fixed in 1.4.7, thanks for all of your help!

ikogan closed this as completed May 14, 2024