This repository has been archived by the owner on Jun 18, 2020. It is now read-only.

Lucene: exception - Query parser encountered <EOF> after “some word” #5

Open
stroncod opened this issue Jun 4, 2018 · 1 comment


stroncod commented Jun 4, 2018

I ran into a problem when reading a dataset with special characters and trying to get the concept vector.
This is easily solved by adding the escape function in the Vectorizer class:

public ConceptVector vectorize(String text) throws ParseException, IOException {
    // Escape query-syntax characters so the input is parsed as plain text
    Query query = queryParser.parse(QueryParser.escape(text));
    TopDocs td = searcher.search(query, conceptCount);
    return new ConceptVector(td, indexReader);
}
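
To make the effect of the escape call concrete, here is a minimal, self-contained sketch of the idea in plain Java. It is not Lucene's code: the class EscapeSketch, the method escapeQuerySyntax, and the SPECIAL character list are hypothetical stand-ins that approximate what I understand QueryParser.escape to do, and the exact set of escaped characters may differ between Lucene versions.

```java
// Illustrative stand-in for Lucene's QueryParser.escape; not the library code.
final class EscapeSketch {

    // Characters assumed to have special meaning in Lucene's classic query syntax.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    // Prefix every special character with a backslash so a parser treats it literally.
    static String escapeQuerySyntax(String text) {
        StringBuilder sb = new StringBuilder(text.length() + 8);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Input with an unbalanced quote, the kind of text that likely triggers the <EOF> error.
        String raw = "some word\" (draft)";
        // prints: some word\" \(draft\)
        System.out.println(escapeQuerySyntax(raw));
    }
}
```

In the real Vectorizer the library call should be preferred over a hand-rolled helper like this one, since it tracks whatever the parser version in use actually treats as special.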

Great implementation by the way! Thanks

Source: https://stackoverflow.com/questions/10259907/lucene-exception-query-parser-encountered-eof-after-some-word/10259944

Owner

pvoosten commented Jun 5, 2018

Thanks for using ESA, and even more for your feedback!

The text argument is expected to be plain text, without query-syntax special characters (such as quotes that combine multiple words into a single phrase), so I think your solution is correct.

Do you want to issue a pull request with the change and a unit test or two? Then your contribution will be carved into stone.
