-
-
Notifications
You must be signed in to change notification settings - Fork 18.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add debug utilities for Vulkan #90993
base: master
Are you sure you want to change the base?
Conversation
3cb8926
to
f0c3cf7
Compare
enum BreadcrumbMarker { | ||
NONE = 0, | ||
// Environment | ||
REFLECTION_PROBES, | ||
SKY_PASS, | ||
// Light mapping | ||
LIGHTMAPPER_PASS, | ||
// Shadows | ||
SHADOW_PASS_DIRECTIONAL, | ||
SHADOW_PASS_CUBE, | ||
// Geometry passes | ||
OPAQUE_PASS, | ||
ALPHA_PASS, | ||
TRANSPARENT_PASS, | ||
// Screen effects | ||
POST_PROCESSING_PASS, | ||
BLIT_PASS, | ||
UI_PASS, | ||
// Other | ||
DEBUG_PASS | ||
}; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel the responsibility of RenderingDevice should stay as abstract as possible in this case and not have knowledge of how the rendering pipeline works. Perhaps we can keep the breadcrumbs as a typedef of just uint32_t
and the renderers (Forward+, Mobile, or future ones) can assign the semantic meaning to the numbers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Breadcrumbs don't really add knowledge.
They're just tagging an ID. The idea is that if the app crashes we look at the report and see "what was being executed?" oh breadcrumb says ID = 2, which corresponds to... let me see... SKY_PASS.
It's up to the rendering pipelines to actually use those IDs.
In fact some parts of the code do the following:
breadcrumb = LIGHTMAPPER_PASS | (pass_id << 16)
In other words, it doesn't give knowledge about the upper levels; it's just a way of standardizing some common tags that are more likely to be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I totally agree with both of you.
Dario's comment is more about a separation of responsibilities. In the RenderingDevice API, we should just use uint32_t's and then in the Renderer itself, we should use an enum to standardize the common tags
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But wouldn't that mean that:
- Mobile and Forward+ would have their own copies of the enum, which can get out of sync. The main issue about that is that it can cause silly mistakes when debugging that can waste hours (i.e. user got id = 7, dev looked it up what that means, but saw the wrong version).
- Users who want to use the
RenderingDevice
directly have no information about what that id is for, and even if they do, they would not adhere to any convention.
Btw I looked again and turns out only 16 bits of each breadcrumb argument are used:
uint32_t breadcrumb = (p_phase << 16) | (p_breadcrumb_data & ((1 << 16) - 1));
Perhaps we could change this argument signature to...?
// original:
void draw_list_begin(RID p_framebuffer, InitialAction p_initial_color_action, FinalAction p_final_color_action, InitialAction p_initial_depth_action, FinalAction p_final_depth_action, const Vector<Color> &p_clear_color_values, float p_clear_depth, uint32_t p_clear_stencil, const Rect2 &p_region, RDD::BreadcrumbMarker p_phase, uint32_t p_breadcrumb_data) {}
// My proposal:
enum BreadcrumbMarker {
// Stays the same.
}
uint32_t breadcrumb( BreadcrumbMarker p_phase, uint16_t p_data ) {
return (p_phase << 16) | (p_data & 0xFFFF)
}
void draw_list_begin(RID p_framebuffer, InitialAction p_initial_color_action, FinalAction p_final_color_action, InitialAction p_initial_depth_action, FinalAction p_final_depth_action, const Vector<Color> &p_clear_color_values, float p_clear_depth, uint32_t p_clear_stencil, const Rect2 &p_region, uint32_t p_breadcrumb_data) {}
This way the BreadcrumbMarker enum is still visible to everyone, but it is made clear the underlying datatype is an arbitrary uint32_t.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking back on this. I like your suggested approach. I really don't love the RDD having high level concepts like POST_PROCESSING_PASS
or SHADOW_PASS_CUBE
, but at the same time, to make this feature useful we want the debug utility to print out the human readable name for the pass and not force users to look it up. You are right that looking up an ID would be a huge annoyance and significantly reduce the value of breadcrumbs.
PipelineCacheRD *pipeline = nullptr; | ||
|
||
pipeline = &shader->pipelines[cull_variant][primitive][shader_version]; | ||
PipelineCacheRD *pipeline = &shader->pipelines[cull_variant][primitive][shader_version]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just as a heads up, we will have a merge conflict on this case as this code will be replaced in #90400. Seeing as it's for aesthetic reasons I'd forego this change and the removal of the comment below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh. This is exactly why I included it (so someone could notice it, it wasn't an aesthetic change but more like bait).
There are more upcoming changes that will affect RenderForwardMobile::_render_list_template
, as there is an optimization that uses Spec Constants for key in sorting the draw calls for better draw call batching.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will probably have to ask for your help when the time comes.
36bb0c3
to
e3cc35b
Compare
d71ead1
to
a7aa977
Compare
I've made an update. Look at the "3 outstanding things remain:" in the top post where I ask for a bit of help 😄 |
7e15b93
to
23c4298
Compare
23c4298
to
41c9fb4
Compare
f87b280
to
67ee11a
Compare
@DarioSamo pointed me how to make a backwards compatible C# binding (by editing rendering_device.compat.inc). However CI is still failing. I'm not sure if I did something wrong, but I noticed that we may be in a bit of a pickle: draw_list_begin already had a compatibility method and I suspect this is causing problems: Previous version before my PR: DrawListID draw_list_begin(RID p_framebuffer, InitialAction p_initial_color_action, FinalAction p_final_color_action, InitialAction p_initial_depth_action, FinalAction p_final_depth_action, const Vector<Color> &p_clear_color_values = Vector<Color>(), float p_clear_depth = 1.0, uint32_t p_clear_stencil = 0, const Rect2 &p_region = Rect2()); New Version: DrawListID draw_list_begin(RID p_framebuffer, InitialAction p_initial_color_action, FinalAction p_final_color_action, InitialAction p_initial_depth_action, FinalAction p_final_depth_action, const Vector<Color> &p_clear_color_values = Vector<Color>(), float p_clear_depth = 1.0, uint32_t p_clear_stencil = 0, const Rect2 &p_region = Rect2(), uint32_t p_breadcrumb = 0); That is, my version just adds Old version before PR #84976: DrawListID draw_list_begin(RID p_framebuffer, InitialAction p_initial_color_action, FinalAction p_final_color_action, InitialAction p_initial_depth_action, FinalAction p_final_depth_action, const Vector<Color> &p_clear_color_values, float p_clear_depth, uint32_t p_clear_stencil, const Rect2 &p_region, const TypedArray<RID> &p_storage_textures); That is, this version used to have The problem is that the very old version used to have I'm not sure if this is simply a CI bug where the compat test gets confused, or it will fail in real C#. The only solution I can think of is to add |
67ee11a
to
8e88905
Compare
c90205a
to
5b6d8cf
Compare
All checks passed! |
5cc5c3a
to
8c7c196
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me to merge for Beta 1 pending the small changes suggested by Bruvzg.
I would like to get this in to Beta 1 as it will help massively for debugging Vulkan issues. We fixed a lot of the significant bugs in 4.3 already, the rest are going to be nearly impossible to track down without this additional info. So IMO, its best to get this in now so that users of 4.3 that still run into Vulkan bugs can give us a sensible report by running a debug build
My bad missed there was a matching one in the 4.3 cycle
-------- | ||
Validate extension JSON: Error: Field 'classes/RenderingDevice/methods/draw_list_begin/arguments/9': type changed value in new API, from "Array" to "int". | ||
Validate extension JSON: Error: Field 'classes/RenderingDevice/methods/draw_list_begin/arguments/9': default_value changed value in new API, from "Array[RID]([])" to "0". | ||
Validate extension JSON: Error: Field 'classes/RenderingDevice/methods/draw_list_begin/arguments/9': type changed value in new API, from "typedarray::RID" to "int". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only one of these should be kept I think, the type part
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The last two are complaints from 4.2 -> 4.3
But the last one is a complaint from 4.1 (or 4.0?) -> 4.3
I was a bit confused by it because the same issue should've happened when going 4.1 -> 4.2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm there seems to be a clash with #84976, which is a 4.2->4.3 change that removed the 9th argument to draw_list_begin
.
Now this PR adds a new 9th argument with a different type.
I think it's the first time we have this case, not sure if the script we use to check for compatibility issues is able to handle that. CC @dsnopek
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since both changes were in 4.3, I would expect it to treat this like the 9th argument changed type, completely ignoring the previous removal of the argument.
The weird thing is these two lines:
Validate extension JSON: Error: Field 'classes/RenderingDevice/methods/draw_list_begin/arguments/9': type changed value in new API, from "Array" to "int".
Validate extension JSON: Error: Field 'classes/RenderingDevice/methods/draw_list_begin/arguments/9': type changed value in new API, from "typedarray::RID" to "int".
I'm not sure how it can change from both Array
and typedarray::RID
.
Looking at the reference extension_api.json
, it seems like typedarray::RID
is the correct one.
What happens if you remove this line from the 4.2-stable.expected
:
Validate extension JSON: Error: Field 'classes/RenderingDevice/methods/draw_list_begin/arguments/9': type changed value in new API, from "Array" to "int".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See also the CI:
The following validation errors no longer occur (compared to 4.2-stable):
Validate extension JSON: Error: Field 'classes/RenderingDevice/methods/draw_list_begin/arguments/9': type changed value in new API, from "Array" to "int".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In godot-4.0-stable/gdextension/extension_api.json:
{
"name": "storage_textures",
"type": "Array",
"default_value": "Array[RID]([])"
}
In godot-4.1-stable/gdextension/extension_api.json & godot-4.2-stable/gdextension/extension_api.json:
{
"name": "storage_textures",
"type": "typedarray::RID",
"default_value": "Array[RID]([])"
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try removing this line as well:
Validate extension JSON: Error: Field 'classes/RenderingDevice/methods/draw_list_begin/arguments/9': type changed value in new API, from "Array" to "typedarray::RID". |
Or change it to fit the new value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll look into it myself, I think Matias' time is better spent on rendering issues than on debugging our CI API compat testing framework.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO our validation scripts are buggy and need to be fixed to properly handle APIs changing multiple times across versions.
But for now this should be an ok workaround to silence both the error if not adding the line, and the false positive warning when adding it in the 4.2
file.
diff --git a/misc/extension_api_validation/4.0-stable_4.1-stable.expected b/misc/extension_api_validation/4.0-stable_4.1-stable.expected
index 5c3bf07fb2..7388c90e2e 100644
--- a/misc/extension_api_validation/4.0-stable_4.1-stable.expected
+++ b/misc/extension_api_validation/4.0-stable_4.1-stable.expected
@@ -349,3 +349,11 @@ Validate extension JSON: Error: Hash changed for 'classes/EditorUndoRedoManager/
Validate extension JSON: Error: Hash changed for 'classes/UndoRedo/methods/create_action', from 0AEC1BFC to E87757EB. This means that the function has changed and no compatibility function was provided.
Added a optional parameters with default values. No adjustments should be necessary.
+
+
+GH-90993
+--------
+Validate extension JSON: Error: Field 'classes/RenderingDevice/methods/draw_list_begin/arguments/9': type changed value in new API, from "Array" to "int".
+
+The change to `int` actually happened from 4.2 to 4.3, but our current validation script doesn't handle well the case where an argument changed type twice in different versions.
+This argument already changed in 4.1 in GH-76418, so the script gets confused.
diff --git a/misc/extension_api_validation/4.2-stable.expected b/misc/extension_api_validation/4.2-stable.expected
index 54f95cde58..cf80419cf9 100644
--- a/misc/extension_api_validation/4.2-stable.expected
+++ b/misc/extension_api_validation/4.2-stable.expected
@@ -367,7 +367,6 @@ Optional arguments added. Compatibility methods registered.
GH-90993
--------
-Validate extension JSON: Error: Field 'classes/RenderingDevice/methods/draw_list_begin/arguments/9': type changed value in new API, from "Array" to "int".
Validate extension JSON: Error: Field 'classes/RenderingDevice/methods/draw_list_begin/arguments/9': default_value changed value in new API, from "Array[RID]([])" to "0".
Validate extension JSON: Error: Field 'classes/RenderingDevice/methods/draw_list_begin/arguments/9': type changed value in new API, from "typedarray::RID" to "int".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!!!
I've applied your patch, rebased to latest master, and pushed.
Now all CI passes!
8c7c196
to
633a613
Compare
698b419
to
2446b41
Compare
void *Memory::realloc_aligned_static(void *p_memory, size_t p_bytes, size_t p_prev_bytes, size_t p_alignment) { | ||
if (p_memory == nullptr) { | ||
return alloc_aligned_static(p_bytes, p_alignment); | ||
} | ||
|
||
void *ret = alloc_aligned_static(p_bytes, p_alignment); | ||
memcpy(ret, p_memory, p_prev_bytes); | ||
free_aligned_static(p_memory); | ||
return ret; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say this has to be on realloc_static()
, to avoid pessimizing for the case where the address changes.
2446b41
to
d7fc4f1
Compare
Features: - Debug-only tracking of objects by type. See get_driver_allocs_by_object_type et al. - Debug-only Breadcrumb info for debugging GPU crashes and device lost - Performance report per frame from get_perf_report - Some VMA calls had to be modified in order to insert the necessary memory callbacks Functionality marked as "debug-only" is only available in debug or dev builds. Misc fixes: - Early break optimization in RenderingDevice::uniform_set_create ============================ The work was performed by collaboration of TheForge and Google. I am merely splitting it up into smaller PRs and cleaning it up.
d7fc4f1
to
d5a06aa
Compare
Features:
Functionality marked as "debug-only" is only available in debug or dev builds.
Misc fixes:
============================
The work was performed by collaboration of TheForge and Google. I am merely splitting it up into smaller PRs and cleaning it up.
PR comments
This is the first PR from the TheForge changed. This one is the most important one because:
Update
Thanks everyone for the help! Most of the issues have been addressed. Almost all CI passes now. Windows now compiles.
3 outstanding things remain:
Turning breadcrumbs into a singleDone.uint32_t
. I want the enum to still be defined in rendering_device_commons.h and document it better.CI is complaining about draw_list_begin changing its signature (this is correct; although I'm not sure if the error is triggering because the bindings are wrong?). Help & Tips on this are appreciated.I've been talking with Dario, and it appears TheForge addedDone for now. We'll revisit this later.RenderingDevice::shader_destroy_modules
which must be called by hand to free (unused) Shader modules after creating the PSO. We both agree it seems like adding a shotgun to the API just to save few kilobytes of RAM (maybe a few megabytes?). I will remove it from this PR and ask TheForge for further info (perhaps it's a lot more RAM than we think?). It may also be a better idea to destroy shader modules after each PSO creation (unless they're shared), instead of doing it at higher level.