nejckorasa/text-predictor

text-predictor

Sample project for next word predictions using n-grams.

See NGramModel and NGram classes for implementation.
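For readers new to the technique, here is a minimal sketch of n-gram next word prediction. It is not the repository's NGramModel or NGram code: the class name NGramSketch, its methods, and the whitespace tokenizer are all hypothetical and chosen only for illustration. The sketch counts which token follows each context of n-1 tokens and predicts the most frequent follower.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical illustration only; not the repository's NGramModel. */
class NGramSketch {
    private final int n; // n-gram order: 2 = bigram, 3 = trigram, ...
    // context (n-1 tokens joined by a space) -> following token -> count
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();

    NGramSketch(int n) {
        if (n < 2) throw new IllegalArgumentException("n must be at least 2");
        this.n = n;
    }

    // Assumption: lowercase and split on whitespace. A real tokenizer
    // would also handle punctuation and sentence boundaries.
    private static List<String> tokenize(String text) {
        String trimmed = text.trim().toLowerCase();
        return trimmed.isEmpty() ? List.of() : Arrays.asList(trimmed.split("\\s+"));
    }

    /** Counts every n-gram in the text as (context, next token). */
    void train(String text) {
        List<String> tokens = tokenize(text);
        for (int i = 0; i + n <= tokens.size(); i++) {
            String context = String.join(" ", tokens.subList(i, i + n - 1));
            String next = tokens.get(i + n - 1);
            counts.computeIfAbsent(context, k -> new HashMap<>())
                  .merge(next, 1, Integer::sum);
        }
    }

    /** Returns the most frequent follower of the last n-1 tokens, if any was seen. */
    Optional<String> predict(String text) {
        List<String> tokens = tokenize(text);
        if (tokens.size() < n - 1) return Optional.empty();
        String context = String.join(" ",
                tokens.subList(tokens.size() - (n - 1), tokens.size()));
        Map<String, Integer> followers = counts.getOrDefault(context, Map.of());
        return followers.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }

    public static void main(String[] args) {
        NGramSketch model = new NGramSketch(2);
        model.train("the cat sat on the mat and the cat ran");
        // "the" is followed by "cat" twice and "mat" once
        System.out.println(model.predict("I saw the")); // prints Optional[cat]
    }
}
```

Storing counts per context keeps prediction to a single map lookup. The sketch has no smoothing or back-off to shorter contexts, so an unseen context yields an empty result.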

This project uses Quarkus and Picocli to build a simple CLI with GraalVM native image.

Note that this is just a sample project. For real natural language processing work, use a machine-learning-based toolkit such as Apache OpenNLP.

Creating a native executable

You can create a native executable using:

./gradlew build -Dquarkus.package.type=native

Or, if you don't have GraalVM installed, you can run the native executable build in a container using:

./gradlew build -Dquarkus.package.type=native -Dquarkus.native.container-build=true

Running the native executable

See Commands for supported command line arguments.

The native executable is built as ./build/text-predictor-1.0-runner. For example:

./build/text-predictor-1.0-runner predict ./samples/frankenstein.txt "text to predict next tokens for"
./build/text-predictor-1.0-runner predict -all ./samples/frankenstein.txt "some other text"

Packaging and running the application

The application can be packaged using ./gradlew build.

It produces the quarkus-run.jar file in the build/quarkus-app/ directory, runnable using java -jar build/quarkus-app/quarkus-run.jar. Be aware that it is not an über-jar, as the dependencies are copied into the build/quarkus-app/lib/ directory.

If you want to build an über-jar:

  • execute ./gradlew build -Dquarkus.package.type=uber-jar.
  • run the resulting jar using java -jar build/*-runner.jar. With this package type, Quarkus writes a single -runner.jar into build/ instead of build/quarkus-app/quarkus-run.jar.