Sample project for next-word prediction using n-grams.
See the NGramModel and NGram classes for the implementation.
This project uses Quarkus and Picocli to build a simple CLI with GraalVM native image.
Note that this is only a sample project. For real natural language processing work, use a machine learning based toolkit such as Apache OpenNLP.
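To give a rough idea of how n-gram based next-word prediction works, here is a minimal, self-contained sketch. It is not the code of this project's NGramModel or NGram classes: the class name NGramSketch, its methods, and the tokenization rule are all assumptions made for illustration. The sketch counts which token follows each (n-1)-token context in the training text and predicts the most frequent follower.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration only; the project's real NGramModel may work differently.
class NGramSketch {
    private final int n;
    // context (the n-1 previous tokens joined by a space) -> next token -> count
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();

    NGramSketch(int n) {
        if (n < 2) {
            throw new IllegalArgumentException("n must be at least 2");
        }
        this.n = n;
    }

    // Assumed tokenization: lower-case, split on anything that is not a letter, digit or apostrophe.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}']+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Slide a window of n tokens over the text and count each (context, next token) pair.
    public void train(String text) {
        List<String> tokens = tokenize(text);
        for (int i = 0; i + n <= tokens.size(); i++) {
            String context = String.join(" ", tokens.subList(i, i + n - 1));
            String next = tokens.get(i + n - 1);
            counts.computeIfAbsent(context, k -> new HashMap<>()).merge(next, 1, Integer::sum);
        }
    }

    // Return the most frequent follower of the last n-1 tokens, if that context was seen in training.
    public Optional<String> predict(String text) {
        List<String> tokens = tokenize(text);
        if (tokens.size() < n - 1) {
            return Optional.empty();
        }
        String context = String.join(" ", tokens.subList(tokens.size() - (n - 1), tokens.size()));
        Map<String, Integer> followers = counts.get(context);
        if (followers == null) {
            return Optional.empty();
        }
        return followers.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }

    public static void main(String[] args) {
        NGramSketch model = new NGramSketch(2); // bigram model
        model.train("the cat sat on the mat and the cat slept");
        System.out.println(model.predict("I saw the")); // prints Optional[cat]
    }
}
```

In the training sentence, "the" is followed by "cat" twice and by "mat" once, so the bigram model predicts "cat". A fuller implementation would typically add smoothing or back off to shorter contexts when a context has never been seen; the sketch simply returns an empty result in that case.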
You can create a native executable using:
./gradlew build -Dquarkus.package.type=native
Or, if you don't have GraalVM installed, you can run the native executable build in a container using:
./gradlew build -Dquarkus.package.type=native -Dquarkus.native.container-build=true
See Commands for supported command line arguments.
You can execute your native executable with ./build/text-predictor-1.0-runner, for example:
./build/text-predictor-1.0-runner predict ./samples/frankenstein.txt "text to predict next tokens for"
./build/text-predictor-1.0-runner predict -all ./samples/frankenstein.txt "some other text"
The application can be packaged using ./gradlew build.
It produces the quarkus-run.jar file in the build/quarkus-app/ directory.
Be aware that it’s not an über-jar, as the dependencies are copied into the build/quarkus-app/lib/ directory.
If you want to build an über-jar:
- execute ./gradlew build -Dquarkus.package.type=uber-jar
- the application is now runnable using java -jar build/quarkus-app/quarkus-run.jar