Improve Startup Time by having Assembly Hash Cache in the database too #7

Open
richard-churchman opened this issue Nov 19, 2023 · 0 comments

Jube is written to support scalable cloud operations: many small instances of Jube, perhaps containers, can be created to achieve scalability. A principal requirement of a strategy in which instances are created dynamically to handle bursts is that new instances of the software must load very quickly, in under a minute. Support for cloud operations and dynamic scalability via Kubernetes is being requested extensively (although containers are used less often than very lightweight VMs, which amount to the same thing).

The software does load quickly. However, a new instance retrieves all configurations from the database, for all models, and lays them out in instance memory, a process that is very fast (not to mention unavoidable). The issue is that each configuration that contains rule code is compiled to an assembly, which is then stored locally in the hash cache (a dictionary mapping a code hash to its assembly) so that identical code is not compiled twice. Because that cache is local to the instance, every new instance has to compile the code again.

The task is first to refactor the hash cache so that it becomes part of the compilation class, with an instance of the compilation class taking the place of the hash cache. This compilation class, which will now include the hash cache, will also use a table in Postgres that stores the code hash and the compiled byte array.

On a call to compile, as now, the hash cache will first be inspected for the key-value pair (code hash to assembly). If it is absent, the lookup will fall back to a table in Postgres for the same pair (noting that the initial Roslyn compilation output is a byte array, which should be trivial to store). Only if it is unavailable there too will the code go on to be compiled to an assembly. Newly compiled code must, of course, be added to the hash cache in both Postgres and the instance.
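
The lookup order described above can be sketched roughly as follows. This is only an illustration, written in Python rather than in the project's own code. The names `RuleCompiler`, `InMemoryStore`, `compile_fn` and the choice of SHA-256 for the code hash are all assumptions made for the example; a real implementation would call Roslyn in place of `compile_fn` and query the proposed Postgres table in place of `InMemoryStore`.

```python
import hashlib
from typing import Callable, Dict, Optional


class InMemoryStore:
    """Hypothetical stand-in for the proposed Postgres table
    (code hash -> compiled byte array)."""

    def __init__(self) -> None:
        self._rows: Dict[str, bytes] = {}

    def get(self, code_hash: str) -> Optional[bytes]:
        return self._rows.get(code_hash)

    def put(self, code_hash: str, assembly: bytes) -> None:
        self._rows[code_hash] = assembly


class RuleCompiler:
    """Hypothetical compilation class that owns the instance-local hash cache
    and falls back to a shared store before compiling."""

    def __init__(self, store: InMemoryStore,
                 compile_fn: Callable[[str], bytes]) -> None:
        self._local: Dict[str, bytes] = {}  # instance-local hash cache
        self._store = store                 # shared across instances
        self._compile_fn = compile_fn       # stand-in for the real compiler

    @staticmethod
    def hash_code(code: str) -> str:
        # SHA-256 is an assumption; the issue does not name a hash function.
        return hashlib.sha256(code.encode("utf-8")).hexdigest()

    def compile(self, code: str) -> bytes:
        key = self.hash_code(code)

        # 1. Instance-local hash cache.
        assembly = self._local.get(key)
        if assembly is not None:
            return assembly

        # 2. Shared store (Postgres in the proposal).
        assembly = self._store.get(key)
        if assembly is None:
            # 3. Compile only when neither cache has the code,
            #    then publish the result to the shared store.
            assembly = self._compile_fn(code)
            self._store.put(key, assembly)

        # Whether loaded or compiled, keep it in the local cache.
        self._local[key] = assembly
        return assembly
```

With two `RuleCompiler` objects sharing one store, the second should find the bytes produced by the first and skip compilation, which is the behaviour a newly started instance in the cluster would rely on.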

The approach will remove the need for code to be recompiled as new instances are created in the cluster, which should improve the startup time, making models available soon after instantiation.

It would also be advantageous to capture more compile-time data and errors for monitoring and production support. At the moment, compile errors appear only in the logs.
