This project implements Part-of-Speech (POS) tagging using a Hidden Markov Model (HMM).
In the Brown corpus, each sentence is a sequence of "word/POS tag" tokens. For example, fire/nn means that the word 'fire' has the tag 'nn' (noun).
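As an illustration of this format, the sketch below splits one line of tagged text into (word, tag) pairs. The function name `parse_tagged_sentence` is hypothetical and not taken from this project; it also assumes tokens are separated by whitespace and that the tag follows the last slash in each token.

```python
def parse_tagged_sentence(line):
    """Split a line of 'word/tag' tokens into (word, tag) pairs."""
    pairs = []
    for token in line.split():
        # rpartition splits on the LAST slash, so a word that itself
        # contains '/' (e.g. '1/2/cd') keeps its slash.
        word, sep, tag = token.rpartition("/")
        if not sep:
            # Assumption: tokens without a slash are skipped, not guessed at.
            continue
        pairs.append((word, tag))
    return pairs


pairs = parse_tagged_sentence("the/at fire/nn burned/vbd")
print(pairs)  # [('the', 'at'), ('fire', 'nn'), ('burned', 'vbd')]
```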
The HMM trained in task 2 is used to assign the most probable tags to the tokens in the input_tokens.txt file.
The Viterbi algorithm finds the tag sequence with the highest probability. Rather than enumerating every possible tag sequence, it uses dynamic programming and keeps only the best-scoring path into each tag at each position. The algorithm has two steps:
- Compute the probability of the most likely tag sequence.
- Trace the back pointers to find the most likely tag sequence from the end to the beginning.
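The two steps above can be sketched as follows. This is a minimal illustration, not this project's code: the function name `viterbi`, the dictionary layout of `start_p`, `trans_p` and `emit_p`, the `floor` value used for unseen events, and the toy probabilities are all assumptions made for the example.

```python
def viterbi(tokens, tags, start_p, trans_p, emit_p, floor=1e-12):
    """Return (best tag sequence, its probability) for a list of tokens.

    Assumed layout: start_p[tag], trans_p[prev_tag][tag], emit_p[tag][word].
    Missing entries fall back to `floor` (an assumed smoothing value).
    """
    if not tokens:
        return [], 0.0

    # best[i][t]: probability of the best path for tokens[:i+1] ending in tag t
    # back[i][t]: the previous tag on that best path
    best = [{}]
    back = [{}]
    for t in tags:
        best[0][t] = start_p.get(t, floor) * emit_p.get(t, {}).get(tokens[0], floor)
        back[0][t] = None

    for i in range(1, len(tokens)):
        best.append({})
        back.append({})
        for t in tags:
            emit = emit_p.get(t, {}).get(tokens[i], floor)
            # Keep only the best-scoring predecessor for tag t.
            prev, score = max(
                ((p, best[i - 1][p] * trans_p.get(p, {}).get(t, floor)) for p in tags),
                key=lambda item: item[1],
            )
            best[i][t] = score * emit
            back[i][t] = prev

    # Step 1: probability of the most likely tag sequence.
    last = max(tags, key=lambda t: best[-1][t])
    prob = best[-1][last]

    # Step 2: follow the back pointers from the end to the beginning.
    path = [last]
    for i in range(len(tokens) - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return path, prob


# Toy model with made-up probabilities, for illustration only.
tags = ["at", "nn", "vbd"]
start_p = {"at": 0.8, "nn": 0.1, "vbd": 0.1}
trans_p = {
    "at": {"at": 0.05, "nn": 0.9, "vbd": 0.05},
    "nn": {"at": 0.1, "nn": 0.3, "vbd": 0.6},
    "vbd": {"at": 0.5, "nn": 0.3, "vbd": 0.2},
}
emit_p = {
    "at": {"the": 0.9},
    "nn": {"fire": 0.5, "burned": 0.01},
    "vbd": {"fire": 0.05, "burned": 0.6},
}

path, prob = viterbi(["the", "fire", "burned"], tags, start_p, trans_p, emit_p)
print(path)  # ['at', 'nn', 'vbd']
```

For long sentences, multiplying many small probabilities can underflow to zero, so summing log probabilities is a common alternative to the plain products used in this sketch.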
test_set.txt is used as the input file.