HMM-and-POS-Tagging-NLP

This project implements Part-of-Speech (POS) tagging using a Hidden Markov Model (HMM).

Task 1 - Loading the entire Brown corpus

Each sentence in the Brown corpus is a sequence of "word/POS tag" tokens. For example, fire/nn means that the word 'fire' has the tag 'nn', which stands for noun.
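As a minimal sketch of reading this format, the snippet below splits one line into (word, tag) pairs. The function name parse_tagged_sentence is hypothetical and is not taken from main.py. It assumes tokens are separated by whitespace and that the tag follows the last slash in each token.

```python
def parse_tagged_sentence(line):
    """Split a line such as 'the/at fire/nn' into (word, tag) pairs."""
    pairs = []
    for token in line.split():
        # Split on the LAST slash so that words containing '/' stay intact.
        word, sep, tag = token.rpartition("/")
        if not sep:
            continue  # assumption: tokens without a slash carry no tag and are skipped
        pairs.append((word, tag))
    return pairs


print(parse_tagged_sentence("the/at fire/nn"))
```

Splitting on the last slash rather than the first keeps a token such as 1/2/cd together as the word '1/2' with the tag 'cd'.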

The HMM trained in Task 2 is applied to the tokens in input_tokens.txt to assign each token its most probable tag.

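The Viterbi algorithm finds the tag sequence with the highest probability. Rather than enumerating every possible tag sequence, it uses dynamic programming to keep only the best-scoring path into each tag at each position. The algorithm has two steps: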

Compute the probability of the most likely tag sequence.
Trace the back pointers to find the most likely tag sequence from the end to the beginning.
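The two steps can be sketched as follows. This is an illustrative version, not the code in main.py. The function signature, the dictionary layout of the probability tables, the use of log probabilities, the floor value for unseen events, and the toy probabilities are all assumptions made for this example.

```python
import math


def viterbi(tokens, tags, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most likely tag sequence for tokens under a bigram HMM."""
    if not tokens:
        return []

    def lp(p):
        # Log probability with a small floor so unseen events do not give log(0).
        return math.log(max(p, floor))

    def emit(tag, word):
        return lp(emit_p.get(tag, {}).get(word, 0.0))

    # Step 1: compute the best log probability of any path ending in each tag.
    best = [{t: lp(start_p.get(t, 0.0)) + emit(t, tokens[0]) for t in tags}]
    back = [{}]
    for i in range(1, len(tokens)):
        best.append({})
        back.append({})
        for t in tags:
            prev, score = max(
                ((p, best[i - 1][p] + lp(trans_p.get(p, {}).get(t, 0.0))) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + emit(t, tokens[i])
            back[i][t] = prev  # back pointer to the best previous tag

    # Step 2: follow the back pointers from the last token to the first.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(tokens) - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return path


# Toy model with made-up probabilities, for illustration only.
tags = ["at", "nn", "vb"]
start_p = {"at": 0.6, "nn": 0.3, "vb": 0.1}
trans_p = {
    "at": {"at": 0.05, "nn": 0.9, "vb": 0.05},
    "nn": {"at": 0.1, "nn": 0.3, "vb": 0.6},
    "vb": {"at": 0.5, "nn": 0.4, "vb": 0.1},
}
emit_p = {
    "at": {"the": 0.9},
    "nn": {"fire": 0.5, "burns": 0.1},
    "vb": {"fire": 0.1, "burns": 0.6},
}

print(viterbi(["the", "fire", "burns"], tags, start_p, trans_p, emit_p))
# prints ['at', 'nn', 'vb']
```

Working in log space turns products of small probabilities into sums, which avoids numeric underflow on long sentences.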

test_set.txt is used as the input file.
