Text-Prediction

This program uses NLTK to build a statistical trigram language model from the Alice in Wonderland corpus and uses it to predict the next word of a phrase.

Getting started

1. Clone or download this repository:

git clone https://github.com/jadessechan/Text-Prediction.git

2. Run main.py.
3. When prompted by the program, enter a phrase related to the corpus.

Demo

Lines 80-86 display n-gram statistics of the corpus and are commented out by default.
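As a rough idea of what such statistics involve, here is a minimal sketch that counts trigrams and lists the most frequent ones. It is not the code from main.py: it uses the standard library's Counter so it runs without any downloads, the function name top_trigrams is made up for illustration, and the token list is a toy example rather than alice.txt. NLTK's FreqDist offers a comparable most_common() method, plus plot() for drawing a frequency distribution.

```python
from collections import Counter


def top_trigrams(tokens, n=30):
    """Return the n most frequent (w1, w2, w3) tuples with their counts."""
    # zip over three staggered views of the token list yields every trigram
    trigram_counts = Counter(zip(tokens, tokens[1:], tokens[2:]))
    return trigram_counts.most_common(n)


# Toy tokens for illustration only (assumption: the real program reads alice.txt)
tokens = "the cat sat and the cat sat and the cat ran".split()

for trigram, count in top_trigrams(tokens, n=3):
    print(" ".join(trigram), count)
```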

Figure: frequency distribution plot of the 30 most common trigrams.

Figure: example of the program output.

final output of demo:

User input: alice said to the
Prediction: alice said to the table, half hoping she might find another (comma was added for readability)
What did alice want to find again?? The suspense...😖

Implementation

I used NLTK's probability module to record how often each candidate word follows the preceding words,

ConditionalFreqDist()

then the program makes a weighted random choice among the candidates to decide which word to append to the given phrase.

random.choices()
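The two steps above can be sketched as follows. This is an illustration under stated assumptions, not the code from main.py: a defaultdict of Counter objects stands in for ConditionalFreqDist so the example has no dependencies, the names build_trigram_counts and predict_next are hypothetical, and the corpus is a toy sentence rather than alice.txt.

```python
import random
from collections import Counter, defaultdict


def build_trigram_counts(tokens):
    """Map each (w1, w2) context to a Counter of the words that followed it."""
    counts = defaultdict(Counter)
    for w1, w2, w3 in zip(tokens, tokens[1:], tokens[2:]):
        counts[(w1, w2)][w3] += 1
    return counts


def predict_next(counts, phrase, rng=random):
    """Sample the next word in proportion to how often it followed the last two words."""
    context = tuple(phrase.lower().split()[-2:])
    followers = counts.get(context)
    if not followers:
        return None  # this context never appeared in the corpus
    words = list(followers)
    weights = list(followers.values())
    # random.choices draws with replacement, using the counts as relative weights
    return rng.choices(words, weights=weights, k=1)[0]


# Toy corpus for illustration only (assumption: the real program uses alice.txt)
tokens = "alice said to the cat and alice said to the queen and alice said to the cat".split()
counts = build_trigram_counts(tokens)
print(predict_next(counts, "alice said to the"))
```

Because the choice is weighted rather than greedy, the same input can produce different continuations on different runs; in this toy corpus "cat" is twice as likely as "queen" after "to the".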

The user decides when to stop the program by choosing whether or not to predict the next word.

"Do you want to generate another word? (type 'y' for yes or 'n' for no): "
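A loop of this kind might look like the sketch below. The function name generate and its parameters are assumptions made for illustration, and the order of predicting and prompting in main.py may differ. The prediction step is passed in as a callable so the loop can be read on its own, and the prompt function is injectable so the loop can be exercised without a keyboard.

```python
PROMPT = "Do you want to generate another word? (type 'y' for yes or 'n' for no): "


def generate(phrase, next_word, ask=input):
    """Append predicted words to phrase until the user declines or no prediction exists."""
    while True:
        word = next_word(phrase)
        if word is None:
            break  # nothing to predict for this context
        phrase = f"{phrase} {word}"
        print(phrase)
        # Any answer other than 'y' ends the session
        if ask(PROMPT).strip().lower() != "y":
            break
    return phrase
```

For example, generate("alice said to the", my_predictor) would keep extending the phrase one word at a time, where my_predictor is whatever function returns the next word for a phrase.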
