Splunk OnCall migration tool #4267
this has been moved to [tools/migrators/add_users_to_grafana.py](https://github.com/grafana/oncall/pull/4267/files#diff-2b8ae690c66ef69beb40bffa232e3d9a03be89122cb0864ecae24b36ffa5156a)
(+ slightly refactored to support migrating users from Splunk OnCall -> Grafana OnCall)
LGTM, minor questions added (and nice to see complex schedule test cases! 👍 )
- the following Splunk OnCall escalation step types are not supported and will not be migrated:
  - "Notify the next user(s) in the current on-duty shift"
  - "Notify the previous user(s) in the current on-duty shift"
  - "Notify every member of this team"
To be sure, isn't this equivalent to our "Notify all users from a team" step?
Correct, it is. However, we don't currently have a script to migrate Splunk (or PagerDuty) teams to Grafana/Grafana OnCall teams (I considered adding it, and even left most of the code for it in place, but commented out).
I suppose we could support this escalation step type and attempt to find a matching team, and if we don't find one, just skip it. Maybe I'll do a follow-up PR with support for this 👍
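For reference, the skip logic described above could be sketched roughly like this. Note this is an illustrative sketch only; the step-type identifiers and the `executionType` field name are assumptions, not the migrator's actual schema:

```python
# Hypothetical sketch: drop Splunk OnCall escalation steps that have no
# Grafana OnCall equivalent. The identifiers below are illustrative
# placeholders for the three unsupported step types listed in the docs.
UNSUPPORTED_STEP_TYPES = {
    "rotation_group_next",      # "Notify the next user(s) in the current on-duty shift"
    "rotation_group_previous",  # "Notify the previous user(s) in the current on-duty shift"
    "team_page",                # "Notify every member of this team"
}


def filter_migratable_steps(steps: list[dict]) -> list[dict]:
    """Keep only escalation steps the migrator can translate."""
    return [s for s in steps if s.get("executionType") not in UNSUPPORTED_STEP_TYPES]
```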
```python
    Convert a Splunk datetime string to a datetime object.
    """
    dt = datetime.datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return dt.replace(tzinfo=datetime.timezone.utc)
```
To be sure, do we get TZ information from the Splunk API? Are we potentially overriding that with the `replace` above? Or do we always expect to get UTC timestamps?
I think it's okay here. They return the `timezone`, but they also return the `start` attribute in UTC. Just double-checked their API response for schedule rotation shifts:
```json
{
  "label": "multi-day stuff",
  "timezone": "America/Toronto",
  "start": "2024-04-24T21:00:00Z",
  "duration": 7,
  "shifttype": "cstm",
  "periods": [],
  "current": {},
  "next": {},
  "shiftMembers": [
    {
      "username": "joeyorlando1",
      "slug": "rtm-h8lTf3VyNUAenaRXGVWD"
    },
    {
      "username": "joeyorlando",
      "slug": "rtm-QIRzjSe3jyatvpIcutLt"
    }
  ]
}
```
We use this function on the `start` attribute. This shift is set to start at 17h (America/Toronto tz), which is 21h UTC. With that said, I suppose I don't need to do `dt.replace(tzinfo=datetime.timezone.utc)`.
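A quick sanity check of that claim, using the `start` value from the API response above and the stdlib `zoneinfo` module (Python 3.9+):

```python
import datetime
from zoneinfo import ZoneInfo

# Splunk OnCall returns "start" in UTC (note the trailing "Z"), so the
# string carries no offset and strptime yields a naive datetime;
# attaching UTC explicitly afterwards is safe, not an override.
text = "2024-04-24T21:00:00Z"
dt = datetime.datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
dt = dt.replace(tzinfo=datetime.timezone.utc)

# Converting to the shift's declared "timezone" recovers the local
# start time: 21:00 UTC is 17:00 in America/Toronto (EDT, UTC-4).
local = dt.astimezone(ZoneInfo("America/Toronto"))
print(local.hour)  # 17
```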
"user_id": user_id, | ||
"type": "wait", | ||
"duration": transform_wait_delay(delay), | ||
"important": False, |
Curious, I guess there isn't a default/important distinction in Splunk, right? Are we leaving the important policy empty then? Should we maybe allow some config option to duplicate the default one?
Correct, they have a concept called "custom paging policies", but those really don't map to how we model user notification policies, so I just decided to map Primary Paging Policy -> Default and ignore custom ones (the PagerDuty migrator also hardcodes created user notification policies to `"important": False`).
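The Primary Paging Policy -> Default mapping could look roughly like this. This is a sketch under stated assumptions: the `timeout` and `contactType` field names, the minutes-based delay, and the helper name are all illustrative, not the migrator's exact schema:

```python
# Hypothetical sketch: translate a Splunk OnCall Primary Paging Policy
# into Grafana OnCall *default* notification policies. Custom paging
# policies are ignored, and "important" is always False, mirroring the
# behavior described above (and the existing PagerDuty migrator).
def paging_policy_to_notification_policies(user_id: str, policy_steps: list[dict]) -> list[dict]:
    policies = []
    for step in policy_steps:
        delay_minutes = step.get("timeout", 0)  # assumed field: delay before this step
        if delay_minutes:
            policies.append({
                "user_id": user_id,
                "type": "wait",
                "duration": delay_minutes * 60,  # minutes -> seconds
                "important": False,
            })
        policies.append({
            "user_id": user_id,
            "type": "notify_by_" + step["contactType"],  # assumed field
            "important": False,
        })
    return policies
```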
What this PR does
Refactors the PagerDuty migration script to be a bit more generic + adds a migration script to migrate from Splunk OnCall (VictorOps)
tl;dr:
https://www.loom.com/share/a855062d436a4ef79f030e22528d8c71
Checklist

- Documentation added (or `pr:no public docs` PR label added if not required)
- Added the relevant release notes label (see labels prefixed with `release:`). These labels dictate how your PR will show up in the autogenerated release notes.