Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parser performance improvement: Simplify the hashing function by using full length UUIDs and remove the need to check for clashes #5654

Open
Rheeseyb opened this issue May 9, 2024 · 0 comments
Assignees

Comments

@Rheeseyb
Copy link
Contributor

Rheeseyb commented May 9, 2024

We originally restricted the UIDs to be 3 characters to minimise the amount that we impact the users' code, but since we no longer print them we can make them as long as we want. If we ensure we're always using the code bounds in that hash, we can keep them consistent and don't need to worry about clashing, meaning we can completely parallelise that part.
We could probably simplify it further by just hashing the filename and code bounds rather than the actual JSX Element, which I'd bet would speed it up.

See https://github.com/concrete-utopia/utopia/pull/5547/files#diff-82cdd3e4e2dfcd2a312838a7c4c4ba895c26a9d03dd5b60957248a16915a557d for where I have introduced the code bounds into one case where we're generating the hash

This was referenced May 15, 2024
@ruggi ruggi mentioned this issue Jun 18, 2024
2 tasks
ruggi added a commit that referenced this issue Jun 19, 2024
Part of #5654

**Problem:**

> The UIDs are too short

…and therefore prone to collisions.

**Fix:**

> I made them longer

…so they won't collide.

This PR adds support for full-length UIDs. This is done with a split
approach between production and test environments, with the theoretical
ability to configure the desired UID length (but realistically they'll
just stay as they are - 3 chars for tests, 32 chars for prod).

When generating a consistent UID (remember! we'll want to get rid of the
existing UID checks in the future, so we can parallelize the parsing):
- for test environments reuse a good chunk of the previous logic,
generating 3-chars long UIDs that are either a portion of the original
or an incremental alphabetical string (which is now done as an actual
increment rather than doing nested loops which would also be a nightmare
to parametrize).
- for production environments, use either the verbatim hash (128 bit, 32
chars) or a random UUID v4 (with stripped dashes).

**Manual Tests:**
I hereby swear that:

- [x] I opened a hydrogen project and it loaded
- [x] I could navigate to various routes in Preview mode
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants