Parser performance improvement: Simplify the hashing function by using full length UUIDs and remove the need to check for clashes #5654
ruggi added a commit that referenced this issue on Jun 19, 2024:
Part of #5654

**Problem:**
> The UIDs are too short …and therefore prone to collisions.

**Fix:**
> I made them longer …so they won't collide.

This PR adds support for full-length UIDs. It takes a split approach between production and test environments, with the theoretical ability to configure the desired UID length (but realistically they'll just stay as they are: 3 chars for tests, 32 chars for prod).

When generating a consistent UID (remember: we'll want to get rid of the existing UID checks in the future, so we can parallelise the parsing):

- For test environments, reuse a good chunk of the previous logic, generating 3-character UIDs that are either a portion of the original or an incremental alphabetical string (now done as an actual increment rather than via nested loops, which would also have been a nightmare to parametrise).
- For production environments, use either the verbatim hash (128 bits, 32 chars) or a random UUID v4 (with dashes stripped).

**Manual Tests:**
I hereby swear that:
- [x] I opened a hydrogen project and it loaded
- [x] I could navigate to various routes in Preview mode
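The split approach above can be sketched roughly as follows. This is a minimal illustration, not Utopia's actual implementation: the function names are hypothetical, and MD5 is used here only as an example of a 128-bit hash producing 32 hex characters.

```typescript
import { createHash, randomUUID } from "crypto";

// Test environments: increment a base-26 alphabetical string ("aaa" -> "aab" -> ...),
// replacing the nested-loop generation mentioned in the commit message.
function incrementAlphabetical(uid: string): string {
  const chars = uid.split("");
  for (let i = chars.length - 1; i >= 0; i--) {
    if (chars[i] !== "z") {
      // Bump this character and stop: lower positions are untouched.
      chars[i] = String.fromCharCode(chars[i].charCodeAt(0) + 1);
      return chars.join("");
    }
    // Carry: 'z' wraps to 'a' and we move one position left.
    chars[i] = "a";
  }
  // All characters were 'z'; a real implementation might grow the string here.
  return chars.join("");
}

// Production environments: either a deterministic 128-bit hash of the source
// (32 hex chars), or a random UUID v4 with the dashes stripped (also 32 chars).
function prodUid(source?: string): string {
  if (source !== undefined) {
    return createHash("md5").update(source).digest("hex");
  }
  return randomUUID().replace(/-/g, "");
}
```

Because both branches of `prodUid` yield the same 32-character length, downstream code never needs to care which path produced a given UID.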
We originally restricted the UIDs to 3 characters to minimise the impact on the users' code, but since we no longer print them we can make them as long as we want. If we ensure we're always including the code bounds in that hash, we can keep the UIDs consistent without worrying about clashes, meaning we can completely parallelise that part of parsing.
We could probably simplify it further by hashing just the filename and code bounds rather than the actual JSX element, which I'd bet would speed it up.
See https://github.com/concrete-utopia/utopia/pull/5547/files#diff-82cdd3e4e2dfcd2a312838a7c4c4ba895c26a9d03dd5b60957248a16915a557d for where I have introduced the code bounds into one case where we're generating the hash.
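The "hash the filename and code bounds" idea can be sketched like this. The `CodeBounds` shape and function name are hypothetical (the real Utopia types may differ), and MD5 again just stands in for any 128-bit hash:

```typescript
import { createHash } from "crypto";

// Hypothetical shape for an element's position in its source file.
interface CodeBounds {
  startLine: number;
  startCol: number;
  endLine: number;
  endCol: number;
}

// A UID derived only from the filename and bounds: any element at a given
// file position always hashes to the same value, so no clash check is needed
// and files can be parsed in parallel. This avoids hashing the JSX element
// itself, which is the potential speed-up mentioned above.
function uidFromBounds(filename: string, bounds: CodeBounds): string {
  const key = [
    filename,
    bounds.startLine,
    bounds.startCol,
    bounds.endLine,
    bounds.endCol,
  ].join(":");
  return createHash("md5").update(key).digest("hex"); // 128 bits, 32 hex chars
}
```

Two distinct elements can never share a UID under this scheme, because two elements in the same file cannot occupy identical bounds.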