Parser performance improvement: Simplify the hashing function by using full length UUIDs and remove the need to check for clashes #5654
ruggi added a commit that referenced this issue on Jun 19, 2024:
Part of #5654

**Problem:**
> The UIDs are too short …and therefore prone to collisions.

**Fix:**
> I made them longer …so they won't collide.

This PR adds support for full-length UIDs. It takes a split approach between production and test environments, with the theoretical ability to configure the desired UID length (but realistically they'll just stay as they are: 3 chars for tests, 32 chars for prod).

When generating a consistent UID (remember: we'll want to get rid of the existing UID checks in the future, so we can parallelise the parsing):

- For test environments, reuse a good chunk of the previous logic, generating 3-character UIDs that are either a portion of the original or an incremental alphabetical string (now done as an actual increment rather than via nested loops, which would also have been a nightmare to parametrise).
- For production environments, use either the verbatim hash (128 bits, 32 chars) or a random UUID v4 (with dashes stripped).

**Manual Tests:**
I hereby swear that:
- [x] I opened a hydrogen project and it loaded
- [x] I could navigate to various routes in Preview mode
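The split approach above can be sketched roughly as follows. This is a minimal illustration, not Utopia's actual implementation: the function names are hypothetical, and MD5 is used here only as an example of a 128-bit hash producing 32 hex characters.

```typescript
import { createHash, randomUUID } from "crypto";

// Test environments: increment a base-26 alphabetical string ("aaa" -> "aab" -> ...),
// replacing the nested-loop generation mentioned in the commit message.
function incrementAlphabetical(uid: string): string {
  const chars = uid.split("");
  for (let i = chars.length - 1; i >= 0; i--) {
    if (chars[i] !== "z") {
      // Bump this character and stop: lower positions are untouched.
      chars[i] = String.fromCharCode(chars[i].charCodeAt(0) + 1);
      return chars.join("");
    }
    // Carry: 'z' wraps to 'a' and we move one position left.
    chars[i] = "a";
  }
  // All characters were 'z'; a real implementation might grow the string here.
  return chars.join("");
}

// Production environments: either a deterministic 128-bit hash of the source
// (32 hex chars), or a random UUID v4 with the dashes stripped (also 32 chars).
function prodUid(source?: string): string {
  if (source !== undefined) {
    return createHash("md5").update(source).digest("hex");
  }
  return randomUUID().replace(/-/g, "");
}
```

Because both branches of `prodUid` yield the same 32-character length, downstream code never needs to care which path produced a given UID.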
We originally restricted the UIDs to 3 characters to minimise the impact on the users' code, but since we no longer print them we can make them as long as we want. If we ensure we're always including the code bounds in that hash, we can keep the UIDs consistent without worrying about clashes, meaning we can completely parallelise that part of parsing.
We could probably simplify it further by hashing just the filename and code bounds rather than the actual JSX element, which I'd bet would speed it up.
See https://github.com/concrete-utopia/utopia/pull/5547/files#diff-82cdd3e4e2dfcd2a312838a7c4c4ba895c26a9d03dd5b60957248a16915a557d for where I have introduced the code bounds into one case where we're generating the hash.
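The "hash the filename and code bounds" idea can be sketched like this. The `CodeBounds` shape and function name are hypothetical (the real Utopia types may differ), and MD5 again just stands in for any 128-bit hash:

```typescript
import { createHash } from "crypto";

// Hypothetical shape for an element's position in its source file.
interface CodeBounds {
  startLine: number;
  startCol: number;
  endLine: number;
  endCol: number;
}

// A UID derived only from the filename and bounds: any element at a given
// file position always hashes to the same value, so no clash check is needed
// and files can be parsed in parallel. This avoids hashing the JSX element
// itself, which is the potential speed-up mentioned above.
function uidFromBounds(filename: string, bounds: CodeBounds): string {
  const key = [
    filename,
    bounds.startLine,
    bounds.startCol,
    bounds.endLine,
    bounds.endCol,
  ].join(":");
  return createHash("md5").update(key).digest("hex"); // 128 bits, 32 hex chars
}
```

Two distinct elements can never share a UID under this scheme, because two elements in the same file cannot occupy identical bounds.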