Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parsing may be slow for some use cases #284

Open
deavid opened this issue Mar 9, 2024 · 7 comments
Open

Parsing may be slow for some use cases #284

deavid opened this issue Mar 9, 2024 · 7 comments

Comments

@deavid
Copy link

deavid commented Mar 9, 2024

I noticed that in maps that are 128x128 tiles with several tile layer (around 7) and several tilesets with 50+ sprites, a map load can take 100ms in a fast computer. I suspect that a 256x256 map can easily take around 500ms but I have not tested.

Consider the use case where one would like to pre-load all maps in a folder to validate them and retrieve some basic info such as map properties, in order to build a database in memory for a game to show what maps are available. If the game has community map-packs, it easily can surpass 100 maps. In this case the game could take several seconds to build this initial state and data.

Of course, the solution is to load the data in another way or to cache it somewhere. But it is an inconvenience.

From the looks of what a TMX file has, I don't think that the XML parser itself is the problem (talking about xml-rs and #137 ), but the work that this library does while parsing, creating all the references for the proper sprites for each tileset, etc.

Profiling the library and benchmarking different parts of the code could give you an idea on where the time is lost. But most likely is the lookup of sprites and associating each sprite to each tile.

Consider if some of this work could be deferred to access time. Or for example, initially build a more crude representation of the file in memory that we could use as-is, and another step that creates the final one more database-like that is more intensive to create but faster to use.

At least in my case, I tend to load what I need into my own structures and discard the parser straight away. The problem might be that here we have 2 things in one, both a loader, and a in-memory map database; and we cannot avoid the costs of the latter.

@aleokdev
Copy link
Contributor

Hmm. It is true that there's not a quick way to access surface-level aspects of a map right now. I completely understand your use case. However, creating a progressively-loaded version of the library would probably require some interface rewriting.

I'd really like to have benchmarks at some point, although my intuition tells me that most time is spent in filesystem operations and XML parsing. If this is true, one quick and dirty solution that can be done on user-side for now could be to implement a reader that ignores all files save for the ones you are interested in, but of course that approach only affects entire files.

@deavid
Copy link
Author

deavid commented Mar 10, 2024

I really doubt that FS operations and XML parsing is the problem. Unless you open and parse the same file several times, which then that would be the problem.

To prove my point, I wrote a very simple Python parse to showcase parsing speed:

https://github.com/deavid/unhaunter/blob/next/assets/maps/parse_test.py

Running that Python program from that folder is in, outputs the following:

Parsed file: character1.tsx in 0.000075 seconds
Parsed file: unhaunter_custom_tileset.tsx in 0.000044 seconds
Parsed file: unhaunter_spritesheet2.tsx in 0.000036 seconds
Parsed file: unhaunter_spritesheetA_6x6x10.tsx in 0.000437 seconds
Parsed file: unhaunter_spritesheetA_3x3x3.tsx in 0.000533 seconds
Parsed file: map_house1.tmx in 0.000155 seconds
Parsed file: map_school1.tmx in 0.001008 seconds
Parsed file: map_house2.tmx in 0.000884 seconds
['character1.tsx', 'image', 'unhaunter_custom_tileset.tsx', 'tileoffset', 'grid', 'tile', 'unhaunter_spritesheet2.tsx', 'tileoffset', 'grid', 'properties', 'image', 'tile', 'unhaunter_spritesheetA_6x6x10.tsx', 'tileoffset', 'grid', 'properties', 'image', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'unhaunter_spritesheetA_3x3x3.tsx', 'tileoffset', 'grid', 'transformations', 'properties', 'image', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'tile', 'wangsets', 'map_house1.tmx', 'properties', 'tileset', 'tileset', 'tileset', 'group', 'map_school1.tmx', 'properties', 'tileset', 'tileset', 'tileset', 'group', 'map_house2.tmx', 'properties', 'tileset', 'tileset', 'tileset', 'group']

Total time spent parsing files: 0.003172 seconds

Total: 3 milliseconds. Tiled takes around 225ms to complete reading the *.tmx files in there in the same machine.

Please check your assumptions. XML parsing is slow, but it's not that slow.

Here is my use case: https://github.com/deavid/unhaunter/blob/49832097a541311be5c946ff4ac71908fbe9b6c4/src/root.rs#L377

And I had to implement a vary naive loader to make this usable: https://github.com/deavid/unhaunter/blob/49832097a541311be5c946ff4ac71908fbe9b6c4/src/tiledmap.rs#L403

All Unhaunter code, maps and assets are under Apache 2 license (except fonts), so feel free to use them for benchmarks if you want.

But you don't need even benchmarks to see the problem. I bet that if we store a 1024x1024 tile map with a 1024 sprite tileset it will take very long to parse, and just doing a profiling session it would show the problem. (or just adding manually some timing information manually by printing to the console)

Deukhoofd added a commit to Deukhoofd/rs-tiled that referenced this issue Apr 1, 2024
This little change massively reduces the amount of syscalls done when reading files. Using the test files from mapeditor#284, I saw a 400% performance boost.
Deukhoofd added a commit to Deukhoofd/rs-tiled that referenced this issue Apr 1, 2024
This little change massively reduces the amount of syscalls done when reading files. Using the test files from mapeditor#284, I saw a 400% performance boost.
Deukhoofd added a commit to Deukhoofd/rs-tiled that referenced this issue Apr 1, 2024
This little change massively reduces the amount of syscalls done when reading files. Using the test files from mapeditor#284, I saw a 400% performance boost.
@Deukhoofd
Copy link
Contributor

I did a profile with the given test files, most of the overhead consisted of small read syscalls in the xml parser. I created a PR to use a BufReader as underlying file reader, which should remove most of these syscalls. This should greatly improve performance.

@deavid
Copy link
Author

deavid commented Apr 1, 2024

I'll take a look to your PR and check performance. If it reduces the timer required by 5 fold, I would expect to see a timing of 44ms; and while the improvement is huge (thanks a lot!), it is still a far cry from "just XML parsing in Python".

I'd suggest that after the PR is reviewed and merged we still leave this issue open to have a better look on what is happening.

aleokdev pushed a commit that referenced this issue Apr 1, 2024
This little change massively reduces the amount of syscalls done when reading files. Using the test files from #284, I saw a 400% performance boost.
@Deukhoofd
Copy link
Contributor

Deukhoofd commented Apr 1, 2024

It's Easter, and I was slightly bored, so here's a proof of concept you could use that changes the underlying XML parser, and does some other performance optimizations: https://github.com/Deukhoofd/rs-tiled/tree/xml_performance

Using your test files, I got down to 2564488 ns/iter (+/- 53017) (2.5 ms) when building the crate in release mode (loading the map_house2.tmx map). Most of the overhead is now allocations, storing data, etc.

I probably won't create a PR for it unless asked for, as it fundamentally changes how the library works, and touches almost every file. I wouldn't want to get PRs like that as a maintainer myself ;).

Edit: for comparison, without the changes the benchmarks are 13814870 ns/iter (+/- 634266) (~14ms)

Another Edit: I was curious what the actual proper benchmarks were before I made the earlier BufReader changes, as I guesstimated the 400% performance gain from my debug run times. Looks like that change has a far greater effect when built in release mode. Before the BufReader changes, it benched at 279556497 ns/iter (+/- 9662169) (280 ms). As it benches at 14 ms after those changes, that one was a 1900% performance gain, not a 400% gain :).

@deavid
Copy link
Author

deavid commented Apr 1, 2024

That sounds more like it. You got a 6x speedup on top of the 4-5x speedup from the previous one. So roughly 25x performance I guess.

For testing, I would recommend the map_school1.tmx since it's the biggest and it's where the problems can be seen the most.

I see now your edit that we measure now from 280ms to 14ms, then from 14ms to 2.5ms; so ~20x first, and ~100x in total. I would be more than happy with those results.

@aleokdev I would really appreciate if you could consider such a contribution. The first PR that has been sent seems like a no-brainer to me.

@aleokdev
Copy link
Contributor

aleokdev commented Apr 1, 2024

I was suspecting XML loading was the issue, xml-rs isn't precisely fast. I'm really grateful for your contributions, @Deukhoofd, but one of the things I've wanted to do for a while now is to separate all the XML loading stuff into their own modules, finalizing the work in the current open PR. This'll clean up the codebase a lot and separate concerns. Maybe now's the best time to do it. Either way I can accept the changes without that if you'd like, create the PR and I'll review it when possible. Again, thanks a lot :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants