Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

code generation and update to raylib 4 #100

Open
konsumer opened this issue Nov 23, 2021 · 24 comments
Open

code generation and update to raylib 4 #100

konsumer opened this issue Nov 23, 2021 · 24 comments

Comments

@konsumer
Copy link
Collaborator

konsumer commented Nov 23, 2021

Lots of discussion on #98

Basically, we can update to 4.0 raylib, and the new code-generation stuff seems like a really good path (so it will be some up-front work now, but later API changes will be easier to manage.) I started work on this fork, whcih isn't currently building, but I think just needs a few little things.

@RobLoach
Copy link
Owner

This is awesome, konsumer!

@twuky
Copy link
Collaborator

twuky commented Nov 23, 2021

I am interested to see if generating the API this way works too. The performance tests I did definitely were not by any means scientific, and 3.5->4.0 may be accounting for the improvements - for me the difference was about double (in the bunnymark example) which seemed a little far fetched.

I am thinking if the binding is going to be generated this way - why bother using the JSON to generate ToValue.h and ToObject.h - the code for those functions doesn't really need to change drastically, and raylib doesn't really introduce many new structs between versions. Even in the generator code many of the conversion functions would just be hardcoded, so its probably just easier to use the existing files for that, and only generate node-raylib.h based on the JSON

@twuky
Copy link
Collaborator

twuky commented Nov 23, 2021

Is there a way to (when developing) configure cmake to only clone the raylib repo once? This would be fine when the user ultimately installs the package, but it takes me about 3 minutes to clone raylib each time I want to recompile to test

@konsumer
Copy link
Collaborator Author

konsumer commented Nov 23, 2021

I am interested to see if generating the API this way works too.

Me too! Once it's building, I will definitely be comparing the 2 methods. I think my stuff can be pretty easily moved more to the other style, if it's better, and I'm all for it.

why bother using the JSON to generate ToValue.h and ToObject.h - the code for those functions doesn't really need to change drastically

It changed enough to be a challenge for me to port the code in this repo from 3.5 to 4.0 by hand, so I imagine future migrations will be similar. After the initial generation in this style, if it's roughly the same performance, I don't disagree with you, though. I am a big fan of "the code is the configuration" and it very well may make sense to drop the generation stuff, and just tweak the source directly, and maybe come back to it later, if it has serious problems with new API versions.

Even in the generator code many of the conversion functions would just be hardcoded, so its probably just easier to use the existing files for that, and only generate node-raylib.h based on the JSON

Yeh, agreed. I did that because I couldn't figure out a good way to automate it. if you have a look at Matrix specifically (the only one that is currently ignored, and created by hand) it creates a weird var-name with commas: m,m,m,m, etc. It may just be a problem with how I was generating it, I dunno.

It looks like this:

    {
      "name": "Matrix",
      "description": "",
      "fields": [
        {
          "name": "m0, m4, m8, m12",
          "type": "float",
          "description": "Matrix first row (4 components)"
        },
        {
          "name": "m1, m5, m9, m13",
          "type": "float",
          "description": "Matrix second row (4 components)"
        },
        {
          "name": "m2, m6, m10, m14",
          "type": "float",
          "description": "Matrix third row (4 components)"
        },
        {
          "name": "m3, m7, m11, m15",
          "type": "float",
          "description": "Matrix fourth row (4 components)"
        }
      ]
    },

So, I think it's a combo of the commas and how I'm ignoring numbers (for size-set vars.) This one issue could easily be worked around (to make parser more robust) and resolved, if it's a big deal to hand-code the one def.

@konsumer
Copy link
Collaborator Author

I think for uniformity, it should probly look like this, which maybe we can take up with the upstream parser team:

[
{
  "name": "m0",
  "type": "float"
},
{
  "name": "m1",
  "type": "float"
},
{
  "name": "m2",
  "type": "float"
},
{
  "name": "m3",
  "type": "float"
},
{
  "name": "m4",
  "type": "float"
},
{
  "name": "m5",
  "type": "float"
},
{
  "name": "m6",
  "type": "float"
},
{
  "name": "m7",
  "type": "float"
},
{
  "name": "m8",
  "type": "float"
},
{
  "name": "m9",
  "type": "float"
},
{
  "name": "m10",
  "type": "float"
},
{
  "name": "m11",
  "type": "float"
},
{
  "name": "m12",
  "type": "float"
},
{
  "name": "m13",
  "type": "float"
},
{
  "name": "m14",
  "type": "float",

},
{
  "name": "m15",
  "type": "float"
}
]

@twuky
Copy link
Collaborator

twuky commented Nov 23, 2021

RobLoach mentioned this in the raylib discord earlier today too - I went around that by manually overriding the matrix entry in the structs before I process any of the text: https://github.com/twuky/raylib-4.0/blob/279b6a4c4e08701e2c26e18c84d227c539cbed28/src/generate.ts#L41
I agree that upstream it probably should look that way too - I think it outputs like that because they are comma seperated in groups of four in the header file itself

@konsumer
Copy link
Collaborator Author

konsumer commented Nov 23, 2021

Yep, I did similar, I just have an ignore list, and I added it to the top of the other files. I suspect it will never change.

@twuky
Copy link
Collaborator

twuky commented Nov 23, 2021

I'm going to try and see if I can get this closer to compiling just by modifying the existing ToValue/ToObject scripts. There are only a few differences between 3.5 and 4.0, such as CharInfo being renamed to GlyphInfo

@konsumer
Copy link
Collaborator Author

Is there a way to (when developing) configure cmake to only clone the raylib repo once?

I am not sure. I noticed this problem, too. very annoying. For end-users it has a lot of things that make it easier, but it makes building while deving go very slow. I will look into it.

@twuky
Copy link
Collaborator

twuky commented Nov 23, 2021

pushed a couple changes to the generator that handle pointers and inner structs the way the current bindings do. It looks like it is still hanging up on a function that depends on a Matrix& arg and another with const float [2] (Vector2?).

@konsumer
Copy link
Collaborator Author

konsumer commented Nov 23, 2021

I looked into cmake git prob more. I think it might be related to npm/cmake-js. if I make a new dir and run cmake .. it caches it, properly. It's just npm i that triggers re-download.

cmake-js build seems to make the cache work, so I will change it to that. Also fixed a few other little problems with generated code. It seems like the only build errors left is a bunch like these 2:

ToObject.h:37:10: error: use of undeclared identifier 'value'
  return value.As<Napi::Boolean>();

ToObject.h:694:32: error: use of undeclared identifier 'obj'
    out.boneWeights = (float *)obj.Get("boneWeights").As<Napi::Number>().Int64Value();

Pushed another thing that fixed that I think, so yeh, now I get a ton of array and Matrix-related errors, probly in the same spot you are. Looks like it's all ToObject.h, which gets me liking your plan to only generate node-raylib.h even more.

@twuky
Copy link
Collaborator

twuky commented Nov 23, 2021

pushed another few changes - GetArgFromParam used some of the old names of structs that had been changed since 3.5. Also, a lot of the errors with Matrix were that there wasn't a conversion in ToValue.h (since its being filtered out) so I hardcoded one to that section as well
I've also for now commented out the includes to raymath/rlgl - I would think that those modules are innaccurate with 4.0 - their functions aren't covered by the JSON.

@konsumer
Copy link
Collaborator Author

konsumer commented Nov 23, 2021

Also, a lot of the errors with Matrix were that there wasn't a conversion in ToValue.h (since its being filtered out) so I hardcoded one to that section as well

Ah, right. good catch.

I've also for now commented out the includes to raymath/rlgl - I would think that those modules are innaccurate with 4.0 - their functions aren't covered by the JSON.

That makes sense. May be another good future-thing to explore with the upstream parser.

I added a few more of the built-in converters.

Here is my current log. Seems like remaining stuff is arrays, a bunch of Matrix things, and some no matching constructor for initialization of 'Napi::Value' errors.

@konsumer
Copy link
Collaborator Author

konsumer commented Nov 23, 2021

I talked to ray on discord. He said he likes the visual structure of Matrix in original header, but would consider changing the parser to output more like we want above for Matrix. The truth is I should just preprocess it, though. Like I can add info about arrays and names and seperate comma fields and stuff. it's not a huge deal.

I was looking at these:

const blocklist = [
  'Matrix',
  'ToTexture2D',
  'TraceLog',
  'TextFormat'
]

I wonder if I just prepocess all of these, it would have all the right stuff. Some, like Matrix, would not need to be skipped (because they would have the same shape.) I think I will do that next. pulll out commas, turn them into many fields. Mark arrays seperately, etc.

What happened with the other ignored types in there? I am wondering if there is some simple things we can do like above texture thing, to keep it simple and minimize edge-cases.

Update: I got rid of Matrix and ToTexture2D skips, and fixed parser to handle them. We should re-evaluate the other 2, as well. I am still getting lots of errors around arrays, so we need to figure out how to make that work right.

@konsumer
Copy link
Collaborator Author

Ok, so this weekend I have a little time to look at this. Seems like remaining issues are sized-arrays like this:

ToObject.h: In function ‘Material ToMaterial(Napi::Env&, const Napi::Value&)’:
ToObject.h:818:16: error: incompatible types in assignment of ‘float’ to ‘float [4]’
  818 |     out.params = Tofloat(env, argObject.Get("params"));

I will try to figure out how to make a type-converter for these.

Sidenote, I think I see why TraceLog and TextFomrat are skipped, They have incompatible templates. I am not sure how to make them compatible, but that might be a good step to do as well:

AddFunction.h:637:6: note:   template argument deduction/substitution failed:
In file included from src/addon.cc:6:
 node-raylib.h:655:54: note:   candidate expects 12 arguments, 1 provided
  655 |   AddFunction(env, exports, "TextFormat", &TextFormat);
      |                                                      ^
In file included from addon.cc:5:
AddFunction.h:673:6: note: candidate: ‘template<class ReturnType, class P1, class P2, class P3, class P4, class P5, class P6, class P7, class P8, class P9, class P10, class P11, class P12, class P13> void AddFunction(Napi::Env&, Napi::Object&, const string&, ReturnType (*)(P1, P2, P3, P4, P5, P6, P7, P8, P9, P10, P11, P12, P13))’
  673 | void AddFunction(Napi::Env& env, Napi::Object& exports, const std::string& name, ReturnType (*f)(P1, P2, P3, P4, P5, P6, P7, P8, P9, P10, P11, P12, P13)) {

@konsumer
Copy link
Collaborator Author

konsumer commented Jan 9, 2022

With help from @RobLoach I got my fork generating & building from current JSON, and able to run basic example on raylib4. It's still missing a few things (mostly around arrays.) I added some new info in the in-memory JSON (before it does the code-generation, but after it reads the JSON file from upstream raylib)

Our current blocklist is this:

// use this to keep from wrapping things
const blocklist = [
  // error: invalid conversion from ‘void (*)(int, const char*, ...)’ to ‘void (*)()’ [-fpermissive]
  'TraceLog',
  'TextFormat',

  // these appear to not be defined, even though they are in JSON
  'SetWindowOpacity',
  'GetRenderWidth',
  'GetRenderHeight',
  'ExportFontAsCode'
]

so I got all our blocks documented, which is good.

Additionally, I am currently adding all structs that use arrays to the blocklist, and any function that uses those:

[
  'Material',
  'BoneInfo',
  'VrDeviceInfo',
  'VrStereoConfig'
]

So I think all that is left is figuring out how to do this for float[4], for example (untested pseudo-code):

inline Napi::Value ToValue(Napi::Env& env, float[4] value) {
  // return a NAPI array of floats here
}

float[4] Tofloat4(Napi::Env& env, Napi::Value value) {
  // return float[4] here
}

and then make sure to check the new size field in the struct on generation to trigger To${type}${size}.

This PR shows an example of using a float-array inline, so it should be a good start. We could just do things in this sort of 1-off way, to wrap it in the generated wrapper, and I think it'd work great. Since there are only 4, they could also just made by hand and added to the scalar converters that are in there now for float, etc. It's not as nice of a solution, but it might make the fix go a bit faster, now.

Also, aside from the missing array stuff, the wrappers still need some work. For example core_basic_window.js works, but not core_world_screen.js:

/home/konsumer/Documents/otherdev/node-raylib/index.js:136
  const newCamera = raylib.UpdateCameraWrap(camera)
                           ^

TypeError: raylib.UpdateCameraWrap is not a function
    at Object.raylib.UpdateCamera (/home/konsumer/Documents/otherdev/node-raylib/index.js:136:28)
    at Object.<anonymous> (/home/konsumer/Documents/otherdev/node-raylib/examples/core/core_world_screen.js:43:7)

Might just be that the demo needs to be updated for ray4, though.

to summarize

For this to be ready to PR, and allow us to say "node-wrapper with full raylib 4 support", I think we need this:

  • array-handling, in a 1-off fashion similar to this PR, or at very least, add them by hand to the generator script for those 4 items.
  • make sure examples are up-to-date
  • port more examples (Material would be good to test current version, but also raylib4 stuff)
  • check sketchy areas (I am not confident in how I did it) like reference-parameters, array-parameters, etc.
  • maybe we should work with upstream to get their outputted JSON closer to what we use in-memory. It would probly help others.

@twuky
Copy link
Collaborator

twuky commented Jan 9, 2022

good work on this!
It looks like the reason for UpdateCamera not working is that using the wrapped functions was commented out on addon.cc. I added some recently, but the implementation of them changes in 4.0 a little bit. The shader ones for example use differently named constants. I can add those to this branch too

In general, after looking more closely at raylib and how its functions work, there are a lot more functions that while compiling fine won't have the expected behavior, which may be solvable with some further wrapped functions. I think even for the example of UpdateCamera - that function allows a 2D or 3D camera to be passed in, so i'm wondering if we actually need two wrapped functions, that determine the type first FROM the JS binding, then call the adequate wrapped function - UpdateCamera2DWrap vs UpdateCamera3DWrap

i started documenting some functions i've found that have additional issues here: https://github.com/twuky/raylib-4.0/blob/master/unsupported_functions.md

@RobLoach
Copy link
Owner

RobLoach commented Jan 9, 2022

Added both of you as collaborators on the project! The updated code is coming along very nicely 👍

@konsumer
Copy link
Collaborator Author

In general, after looking more closely at raylib and how its functions work, there are a lot more functions that while compiling fine won't have the expected behavior, which may be solvable with some further wrapped functions.

My thinking, after making a few wrappers in various languages, is that the target language is the right place for this. Like it'll be easier & more ergonomic to wrap in node-space. If we can generate a basic wrapper and it has enough of an interface to work with, we can add a few little things in js, if needed. I also think index.js should expose more constants from the C-lib.

@twuky
Copy link
Collaborator

twuky commented Jan 10, 2022

calling into the node addon can be expensive, so i agree that the enumerators and other raylib constants could just be implemented in plain javascript. adding that to the parser wouldn't be too difficult.

it is probably possible to get some of the missing functions implemented on index.js - like CheckCollisionLines() which currently does not update one of its "by reference" arguments that is supposed to change to the point where the collision is.

But there are still some functions that will take additional work on the C side to function - anything expecting the user to pass in an array (or a pointer to an array) is a good example. some of those functions could have improvements on the side of javascript though, for example most those array functions also require you to pass in the array length, but here the binding could handle that for users.

@konsumer
Copy link
Collaborator Author

I currently expose all the constants in the c-wrapper. It might be fine, as it would be an at-import cost, and my feeling is it would actually be pretty light, because of how NAPI exposes constants. I agree with you on the other stuff, too, but maybe we should measure the performance with js vs NAPI. I know they do a bunch of stuff to optimize it, and it's orders of magnitude better than ffi, for example, so it might be fine. Maybe we should identify what needs special-handling and focus on that in JS, next (possibly also generated.)

for example most those array functions also require you to pass in the array length, but here the binding could handle that for users.

Agreed, JS can do this easily. Can you find some examples? While I was working on it, I only saw fixed-length arrays, but I didn't take too close of a look.

I don't have much time left this weekend, and a pretty packed schedule this week, with work, so I won't have much time to work on it. I am totally open to any ideas either of you have to improve it, so just go for it. Should I PR the currently incomplete notnullgames raylib4 branch into this repo, to keep it all here?

@twuky
Copy link
Collaborator

twuky commented Jan 10, 2022

Take a look at that Unsupported_functions.md link I posted for a few of the array examples, and some other functions I noted that use pointers in ways that may take some extra attention. I started noting how they don't currently work. The way I understand most of them, it may be difficult to fix those without hand-writing solutions for each case somewhat separately.

DrawLineStrip() is one that expects a pointer to an array of vector2 points. So a function would need to be made in C that expects an Napi::Array, gets a pointer to that, and passes that into the raylib function. It may also need to clear that array from memory? But because it makes calls to draw, it can't really be implemented in pure JS.

The problem with CheckCollisionLines() for example is that it needs to return two values from C. A boolean and a Vector2. So there are a few ways to handle it i see:

  • two separate 'wrapped-style' functions, one returns the bool, one returns the array. A JS wrapper could be written that calls both.
  • One C function that returns an Napi::Object containing both properties. Then a JS wrapper could parse that to replicate the behavior
  • This function is just doing math, so it could be implemented in pure JS too.

I think this sort of thing will need to be on a by-case basis though. It's probably a bigger priority to see about the fixed-length arrays and other Material issues first, since I am mostly talking about a few outlier functions.

I'm fine with working out of either repo at the moment, but it seems like it would be fine to pull the branch, since it is at least building?

@RobLoach
Copy link
Owner

There's no worry in implementing some of the functions in JavaScript when possible. There need to be some manual wrappers sometimes.

@twuky
Copy link
Collaborator

twuky commented Feb 21, 2022

So over this weekend I've been writing some more function bindings to test different methods of using node-addon-api with performance in mind. The main challenge performance-wise with this library is that native addon function calls have a lot of overhead when converting parameters from JS to C++. The expectation seems to be that native addons focus on libraries where the amount of work the function needs to do outweighs the cost of translating data between the languages. Raylib is an example of the opposite, where we do smaller amounts of work many times over such as rendering different objects.

That being said, I have found some general principles with node-addon-api that I think could be used to squeeze the maximum amount of speed out of this 'converting' process. My focus was on improving the throughput of DrawTexture() with the bunnymark example in mind.

Main Guidelines

  1. Avoid Using JS Objects as function parameters WHEREVER reasonably possible. Calling .As<Napi::Object>() is VERY slow. And subsequent object.Get() calls are also similarly slow. Instead, flattening / unwrapping those objects' properties into individual function arguments seems to give 6-8x better performance.
    for example:
// instead of
r.DrawTexture(texture, x, y, color)
// this performs magnitudes faster
r.DrawTexture(
  texture.id, texture.width, texture.height, texture.mipmaps, texture.format,
  x, y,
  color.r, color.g, color.b, color.a
)
// would require a wrapper function that unpacks the objects for the user, for API parity
  1. With the first point in mind, functions with fewer arguments convert their data much faster. Every single argument really seems to count when it comes to this. For example, if you make the assumption that the user wants to provide r.WHITE as the color parameter to DrawTexture(), it becomes much faster (~20%) to provide a DrawTextureWhite(texture, x, y) that hardcodes the color parameter of the function on the C++ side.
  2. Inline c++ type coercion code does seem to improve performance over function calls. This has to do with the GetArgFromParams() function in the current binding. By hardcoding the type conversion from the info[] array right inside the c++ raylib function wrapper, I was able to get around double the performance compared to the current bindings (even on the 4.0 branch), which does not include the optimizations described above. This obviously means having to generate a much more verbose addon.cc file. The generated addon from my 4.0 testing branch reaches nearly 9000 lines due to this difference already.

Benchmarks

Here is some more detailed information on the benchmarks I put together while testing this. These are all based on the 'bunnymark' example code from raylib.

Maximum Draw Calls

The first test was to continually spawn new bunnies until the framerate reached below 60FPS.

native c++ bunnies: 108900
method: (3.5) DrawTexture, args: 4, bunnies: 5344
method: (4.0) DrawTexture, args: 4, bunnies: 10820
method: DrawTextureShortWhite, args: 5, bunnies: 71250
method: DrawTextureShort, args: 9, bunnies: 55000
method: DrawBunnyWhite, args: 2, bunnies: 79130
method: DrawBunny, args: 6, bunnies: 65920
method: DrawBunnyBuffer, args: 1, bunnies: 41500
method: DrawTextureFlat, args: 11, bunnies: 52900
method: DrawTextureFlatWhite, args: 7, bunnies: 62100
method: DrawTextureShortBuffer, args: 6, bunnies: 55230

Stress Test Framerate

The second test was to pre-generate an array of 20,000 bunnies, and record whatever the framerate was after a few seconds of updating their positions.

native c++ fps: 320
method: (3.5) DrawTexture, args: 4, fps: 17
method: (4.0) DrawTexture, args: 4, fps: 31
method: DrawTextureShortWhite, args: 5, fps: 212
method: DrawTextureShort, args: 9, fps: 158
method: DrawBunnyWhite, args: 2, fps: 234
method: DrawBunny, args: 6, fps: 184
method: DrawBunnyBuffer, args: 1, fps: 116
method: DrawTextureFlat, args: 11, fps: 155
method: DrawTextureFlatWhite, args: 7, fps: 185
method: DrawTextureShortBuffer, args: 6, fps: 151

Some of these look like really drastic performance improvements! Being able to render a texture 79000 times/frame is actually more than you can achieve with something like love2d. But it may be a little tricky to implement these functions in a way that matches with the raylib API, since they drastically change how the arguments need to be presented. It would be necessary to wrap on the JS side in order to use these in the same way you currently use DrawTexture.
Here's the specifics about the differences between these function bindings / wrappers.

DrawTexture

parameters: texture: Texture2D, x: number, y: number, color: Color
This is the baseline metric for the performance we could expect with raylib4.0. It uses the inline type coercion I described above, which is why it performs better than the 3.5 bindings.

DrawTextureShort

parameters: id: number, width: number, height: number, x: number, y: number, r: number, g: number, b: number, a: number
This is a little tricky. I found that you can initialize a Texture struct with only 3 parameters: texture.id, texture.width, texture.height. Raylib provides default values for mipmaps (1) and texture type ( 7 == r8g8b8a8).

DrawTextureShortWhite

parameters: id: number, width: number, height: number, x: number, y: number
Works similarly to DrawTextureShort above, but the color parameter is hardcoded to WHITE in c++.

DrawBunnyWhite

parameters: x: number, y: number
This function hardcodes both the color AND the texture. meaning you can only control where the texture is drawn. Having only 2 parameters seems to make this the absolute fastest way to draw. Obviously a DrawTexture function where you cannot specify the texture is not very useful. But it may be possible to implement on the C++ some additional functions that 'cache' a global texture/color to draw. For example:

setCachedTexture(texture)
setCachedColor(color)
// probably some loop, i.e loop through particles
drawCached(x, y)
drawCached(x1,y1)
drawCached(x2,y2)
setCachedColor(color1)
drawCached(x3,y3)

This would mean introducing a new API that does not exactly exist in raylib proper. But it represents the most performant way to issue draw calls over node-addon-api that I could find. It may be possible, in a JS wrapper, to 'detect' when multiple sequential draw calls use the same color or textre, and 'switch over' to this API behind-the-scenes. But if properties like the color or texture are changing every single draw call, splitting the addon call into 2-3 calls would likely be slower than just 1 call with more parameters.

DrawBunny

parameters: x: number, y: number, r: number, g: number, b: number, a: number
Works similarly to DrawBunnyWhite above, but allows passing in parameters for color. As such its a little bit slower.

DrawBunnyBuffer

parameters: buffer: Buffer
This function only takes one argument - I wanted to see if you could pass a single buffer containing all parameters (since for this function you only need numbers) and if node-addon-api handled reading buffers faster than individual parameters. It does not, so this ends up being the slowest of the 'fast' functions. Additionally, Buffers restrict numbers to a range of 0-255, so it does not work for drawing to the screen.

DrawTextureFlat

parameters: id: number, width: number, height: number, mipmaps: number, type: number, x: number, y: number, r: number, g: number, b: number, a: number
This is the DrawTexture function but with all function parameters unfolded or 'flattened' into individual arguments. It has more parameters than DrawTextureShort as it includes the mipmap and texture type parameters. This would be a good fallback function to use, as it maps fully to the raylib API without any feature sacrifices. But since it uses 11 arguments it has fairly middling performance.

DrawTextureFlatWhite

parameters: id: number, width: number, height: number, mipmaps: number, type: number, x: number, y: number
Same as DrawTextureFlat above, but with hardcoded color value. It would probably be possible to write a JS wrapper function that detects if white or null is being passed as an argument, and depending on the situation calls either DrawTextureFlat or DrawTextureFlatWhite.

DrawTextureShortBuffer

parameters: id: number, width: number, height: number, x: number, y: number, color: Buffer
Works similarly to DrawTextureShort, but instead of passing in 4 values for color rgba you pass in a Buffer with 4 elements. Parsing the buffer ends up being slower than parsing 4 extra arguments, so this function ends up not being very useful.

Other performance considerations

Based on all these experiments, there are a few other ideas that I haven't figured out the implementation details for that could further optimize these calls.

Caching all textures.

It might be possible to override some functionality during LoadTexture where the C++ addon caches a reference to the texture in some global list (maybe an arena allocator?). Then when a user wants to call DrawTexture, in stead of passing in 3-5 parameters, they only need to pass in the index/id/pointer to the texture, which the addon then looks up in its cache . This would probably work okay for Textures, because the struct does not store any actual texture data, just some information about the texture. This is sort of where my inexperience with C++ is limiting, because I would also worry about memory issues on the C++ side, and making sure that the user always has a correct index / pointer to their desired texture. It seems like raylib by default is pretty smart about loading textures and assigning IDs, but I'd want to be totally sure that a memory safe model could be built on top of that.
When applying this concept to other more complicated structures, like Image, Mesh, or Model, it could get much harder to write function bindings, since they have methods that update the internal data. But I dont think those issues are impossible to solve, they would just need more detailed wrappers on both JS and C++.

Colors as ints

Raylib already has some built-in functionality to convert a 32 bit unsigned int to a color. It should be possible on the JS side to do some bitwise operations (you wouldnt want to call the built in raylib function unless you did it in advance to cache) to produce a number that could be converted by C++ to the color again. This would cut down 3 function arguments anywhere a color is needed. I tried to attempt this while experimenting, and while the addon compiles I can't call the function - which means something in the addon code probably is actually wrong.


I pushed the experimental bindings over to https://github.com/twuky/raylib-texture-performance-experiments if you want to take a look at how each of those functions were implemented. It basically only includes as many functions as necessary to run the tests.

I think ideally what would happen is that DrawTexture() (and similar, like DrawTexturePRO etc) would have a JS wrapper that analyzes the supplied arguments, and finds the node-addon function that gives the best performance (IE requires the minimum amount of arguments to match functionality). Though that may take some testing of itself to see how much that improves performance, because I could also see it becoming a mess of branches. The alternative would be to expose these alternative drawing functions directly to the user (possibly with some simpler JS wrappers to unfold the objects for the user) as functions that deviate from the standard raylib API.

Knowing this about the performance specifics of node-addon-api.. it seems that in order to make the most out of it would require writing the addon bindings in a completely new way, which would essentially mean starting node-raylib from scratch. There are some things, like "flattening" function arguments, that would be pretty trivial to implement in the code generator. Some of the other concepts like caching would need a lot of manual setup.
The difference in performance here looks well worth it though, somewhere between 5-13x the throughput on drawing textures. If this were able to be implemented, raylib on nodejs would be able to rival or outperform pretty much most other interpreted language game libraries, and even gets fairly close to native raylib. I'm wondering what anyone else's thoughts are on this here, particularly about how to introduce major changes / deviations from the API to end users.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants