Fixed bloating temp directory with copies of native libraries #2622

Open
wants to merge 1 commit into
base: master

DKARAGODIN

Before submitting a pull request, please do the following steps:

  1. Read instructions for contributors.
  2. Make sure the code builds.
  3. If you add new functionality add tests to check it.
  4. Run existing tests to make sure you haven't broken anything.
  5. If you haven't already, sign the Contributor License Agreement.

Problem: to use CatBoost from Java, a native library has to be loaded into the JVM. To do so, we extract the corresponding native library into a temp folder. After JVM shutdown this file stays there. When the temp folder is the system temp folder, this may not be a problem. However, when CatBoost is used in a managed environment such as an application server or servlet container, the temp directory is that environment's own temp directory. Without any cleanup mechanism, the copies quickly fill up disk space.

In this PR I propose a mechanism that deletes stale copies of the native library from the temp directory. To accomplish this I made the following changes:

  1. Native libraries are copied to the temp directory with the version in the file name.
  2. An empty file named after the library file plus a .lck suffix is created in the temp directory.
  3. Before loading the library, the temp directory is scanned. If a file has the name and version of the library we are about to load and the corresponding .lck file does not exist, that file is deleted.
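
A minimal sketch of the scan in step 3, not the code from this PR. It assumes that every extracted copy is named with a common prefix that already contains the library name and version, and that the lock file is the library file name plus `.lck`. The class name `StaleNativeLibraryCleaner`, the method `cleanup`, and the `prefix` parameter are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the CatBoost code base.
final class StaleNativeLibraryCleaner {
    static final String LOCK_SUFFIX = ".lck";

    /**
     * Deletes files in tempDir whose name starts with prefix (assumed to hold
     * the library name and version) and that have no sibling ".lck" file.
     * Returns the paths that were actually deleted.
     */
    static List<Path> cleanup(Path tempDir, String prefix) throws IOException {
        // Collect candidates first so we do not delete while iterating.
        List<Path> candidates = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(tempDir, entry -> {
            String name = entry.getFileName().toString();
            return name.startsWith(prefix) && !name.endsWith(LOCK_SUFFIX);
        })) {
            for (Path entry : stream) {
                candidates.add(entry);
            }
        }

        List<Path> deleted = new ArrayList<>();
        for (Path candidate : candidates) {
            Path lock = candidate.resolveSibling(candidate.getFileName() + LOCK_SUFFIX);
            if (Files.exists(lock)) {
                continue; // a lock file is present: assume a live JVM owns this copy
            }
            try {
                if (Files.deleteIfExists(candidate)) {
                    deleted.add(candidate);
                }
            } catch (IOException e) {
                // The file could not be removed (for example, still in use); skip it.
            }
        }
        return deleted;
    }
}
```

A caller would run something like `StaleNativeLibraryCleaner.cleanup(tempDir, "libname-version-")` before extracting and loading a fresh copy; the prefix shown here is a placeholder.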

This algorithm works whenever the JVM shuts down gracefully. After a sudden shutdown the .lck file is not removed, so the stale copy is never cleaned up; there is no remedy for that case.

Why does this algorithm work? (Here is my explanation, but I am not 100% sure about it.)
The native library that we copied to the temp folder is not deleted even though deleteOnExit is called on it. That happens because the file was loaded by the system ClassLoader and is released only after the system ClassLoader is garbage collected, and that happens only when the JVM shuts down.
This does not apply to a plain file such as the .lck file.
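
A rough sketch of how steps 1 and 2 could be combined with deleteOnExit, under the assumptions in the explanation above. It is not the code from this PR: `NativeLibraryExtractor`, `extractWithLock`, and the `prefix`/`extension` parameters are hypothetical, and the unique part of the file name is an assumption made here so that several JVMs can extract copies side by side. The lock file is created before the library copy so that a concurrent scan never sees an unlocked library.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.UUID;

// Hypothetical helper, not part of the CatBoost code base.
final class NativeLibraryExtractor {
    /**
     * Copies the library bytes into tempDir under a versioned, unique name and
     * creates a matching ".lck" marker. Returns the path of the extracted copy.
     */
    static Path extractWithLock(InputStream librarySource, Path tempDir,
                                String prefix, String extension) throws IOException {
        String fileName = prefix + UUID.randomUUID() + extension;
        Path library = tempDir.resolve(fileName);
        Path lock = tempDir.resolve(fileName + ".lck");

        // Create the lock first: a plain file, so deleteOnExit is expected to
        // remove it on a graceful JVM shutdown.
        Files.createFile(lock);
        lock.toFile().deleteOnExit();

        Files.copy(librarySource, library, StandardCopyOption.REPLACE_EXISTING);
        // Best effort only: per the explanation above, this may not take
        // effect while the library is still loaded.
        library.toFile().deleteOnExit();
        return library;
    }
}
```

The returned path would then be passed to `System.load(library.toAbsolutePath().toString())`. If the library file survives shutdown while its .lck file is gone, the next start-up scan can recognize it as stale.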

Another approach to solving this problem is to load the native library with a different ClassLoader. I don't think this is a good idea.

The source of inspiration for this PR can be found here: https://github.com/xerial/sqlite-jdbc/blob/master/src/main/java/org/sqlite/SQLiteJDBCLoader.java

The attachments contain two projects, JNA-WAR and JNA-JAR, intended to reproduce the problem and check the solution. They depend on catboost-prediction and sqlite. JNA-WAR should be run in a servlet container; I tested it on Tomcat 10.
jna_jar.zip
jna_war.zip

@DKARAGODIN
Author

I hereby agree to the terms of the CLA available at: [https://yandex.ru/legal/cla/?lang=en].

@andrey-khropov
Member

Related issue in sqlite-jdbc: https://github.com/xerial//issues/80

@DKARAGODIN
Author

Rebased the commit onto the updated master.

Please remove the Windows tag from this PR. The issue reproduces on Linux as well. The referenced issue in the SQLite library is misleading here.
