Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hyperscan java wrapper throws an exception when one scratch space is used with multiple databases #220

Open
apismensky opened this issue Jan 29, 2024 · 2 comments

Comments

@apismensky
Copy link

This article https://intel.github.io/hyperscan/dev-reference/performance.html#allocate-one-scratch-space-per-scanning-context says that one scratch space can be used with multiple databases. But this snipped of code produces an exception:

public class BugReproduce {
    public static void main(String[] args) throws Exception {
        try (
                Database dbA = Database.compile(new Expression("a"));
                Database dbB = Database.compile(new Expression("b"));
                Scanner scanner = new Scanner()
        ) {
            scanner.allocScratch(dbA);
            List<Match> matchesA = scanner.scan(dbA, "a"); // works
            showMatches(matchesA);
            scanner.allocScratch(dbB);
            matchesA = scanner.scan(dbA, "a"); // Exception - An invalid parameter has been passed. Is scratch allocated?
            showMatches(matchesA);
        }
    }
    private static void showMatches(List<Match> matches) {
        if (!matches.isEmpty()) matches.forEach(m -> System.out.println("Match start: " + m.getStartPosition() + ", end: " + m.getEndPosition()));
        else System.out.println("No matches in hyperscan");
    }
}

Most probably the bug is in the java wrapper, cause the following C++ code works fine (no error):

TEST(order, alexey2) {
    // patterns for dbA
    vector<pattern> patternsA;
    patternsA.push_back(pattern("a", 0 , 1));
    // compile dbA
    hs_database_t *dbA = buildDB(patternsA, HS_MODE_NOSTREAM);
    ASSERT_NE(nullptr, dbA);
    // Allocate scratch space
    hs_scratch_t *scratch = nullptr;
    hs_error_t err = hs_alloc_scratch(dbA, &scratch);
    ASSERT_EQ(HS_SUCCESS, err);
    // Scan string dataA against dbA
    const char *dataA = "a";
    CallBackContext cA;
    err = hs_scan(dbA, dataA, strlen(dataA), 0, scratch, record_cb,(void *)&cA);
    ASSERT_EQ(HS_SUCCESS, err);
    // Verify one match
    EXPECT_EQ(1, countMatchesById(cA.matches, 1));
    // Patterns for dbB
    vector<pattern> patternsB;
    patternsB.push_back(pattern("b", 0 , 1));
    // compile dbB
    hs_database_t *dbB = buildDB(patternsB, HS_MODE_NOSTREAM);
    ASSERT_NE(nullptr, dbB);
    // Use the same scratch for dbB
    err = hs_alloc_scratch(dbB, &scratch);
    ASSERT_EQ(HS_SUCCESS, err);
    // Scan string dataB against dbB
    const char *dataB = "b";
    CallBackContext cB;
    err = hs_scan(dbB, dataB, strlen(dataB), 0, scratch, record_cb,(void *)&cB);
    ASSERT_EQ(HS_SUCCESS, err);
    // Verify one match
    EXPECT_EQ(1, countMatchesById(cB.matches, 1));
    // Cleanup
    err = hs_free_scratch(scratch);
    ASSERT_EQ(HS_SUCCESS, err);
    hs_free_database(dbA);
    hs_free_database(dbB);
}
@apismensky
Copy link
Author

The reason for the bug is that every call allocScratch calls scratch.registerDeallocator(); and this guy calls hs_scratch_t p = new hs_scratch_t(this); which effectively calls hs_scratch_t method twice for the same scanner object. We may fix that by checking if the deallocator instance is created, for example:

private static class NativeScratch extends hs_scratch_t {
        void registerDeallocator() {
            if (deallocator() != null) {
                hs_scratch_t p = new hs_scratch_t(this);
                deallocator(() -> hs_free_scratch(p));
            }
        }

@gliwka - lmk I can create a PR for that, but wanted to know what you think?

@gliwka
Copy link
Owner

gliwka commented Feb 1, 2024

@apismensky Thanks for the analysis. PR is welcome :-)

apismensky pushed a commit to apismensky/hyperscan-java that referenced this issue Feb 1, 2024
gliwka pushed a commit that referenced this issue Feb 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants