Japanese-Text-Similarity

1.1. Question (1):

1.2. Solution: Uncompress the compressed file

2.1. Understand the problem statement:

2.2. Basic EDA and Visualization before the cleaning process:

3.1. Removing Noise:

3.2. Removing Punctuation:

3.3. Tokenization:

3.4. Removing Stopwords:

4.1. Question (2.1)

4.2. Solution: Embedding Visualization

4.3. Question (2.2)

4.4. Solution: Query similarity with gensim

5.1. Question (2.3):

5.2. Solution: Text Classification with Naive Bayes (NB)

5.3. Question (2.4):

5.4. Solution: Improve the accuracy of the model

6.1. Question (3)

6.2. Solution: Topic Modelling with LDA
