NLP-Classification

This repo contains Python implementations of text feature extraction methods that I have used in my research, mostly for user input classification tasks.
Two approaches are implemented:

One based on word embeddings, described as part of the baseline methods in [1].
A typical statistical n-gram language modeling approach that estimates the probability of a sentence conditioned on a class.
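
To make the second approach concrete, here is a minimal sketch of a per-class bigram language model used as a classifier. It is not this repo's implementation: the class name `BigramClassifier`, the whitespace tokenizer, the add-one smoothing, and the four training sentences are all assumptions made for illustration. Each class gets its own bigram counts, and a sentence is assigned to the class under which its smoothed log-probability is highest.

```python
import math
from collections import Counter, defaultdict


class BigramClassifier:
    """Hypothetical per-class bigram models with add-one (Laplace) smoothing."""

    def __init__(self):
        self.bigrams = defaultdict(Counter)   # class -> (prev, word) counts
        self.contexts = defaultdict(Counter)  # class -> prev-word counts
        self.vocab = set()

    @staticmethod
    def _tokens(sentence):
        # Lowercase whitespace tokenization with sentence boundary markers.
        return ["<s>"] + sentence.lower().split() + ["</s>"]

    def fit(self, sentences, labels):
        for sentence, label in zip(sentences, labels):
            tokens = self._tokens(sentence)
            self.vocab.update(tokens)
            for prev, word in zip(tokens, tokens[1:]):
                self.bigrams[label][(prev, word)] += 1
                self.contexts[label][prev] += 1
        return self

    def log_prob(self, sentence, label):
        # log P(sentence | class) under that class's smoothed bigram model.
        tokens = self._tokens(sentence)
        vocab_size = len(self.vocab)
        total = 0.0
        for prev, word in zip(tokens, tokens[1:]):
            numerator = self.bigrams[label][(prev, word)] + 1
            denominator = self.contexts[label][prev] + vocab_size
            total += math.log(numerator / denominator)
        return total

    def predict(self, sentence):
        # Pick the class with the highest sentence log-probability.
        return max(self.bigrams, key=lambda label: self.log_prob(sentence, label))


# Made-up training sentences, for illustration only.
clf = BigramClassifier().fit(
    ["win a free prize now", "free cash prize", "see you at lunch", "are you home now"],
    ["spam", "spam", "ham", "ham"],
)
print(clf.predict("free prize"))
```

This sketch compares class-conditional likelihoods only. Adding a log class prior to each score would turn it into a full Bayes decision rule, which matters when the classes are imbalanced.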

API Reference

To do....

Toy Example

A toy example is provided to play around with. The dataset used is a randomly selected subset of the "SMS Spam Collection" dataset, available at the UCI Machine Learning Repository.
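
For readers who want to feed the data into their own code, the sketch below parses lines in the layout of the original UCI file, where each line holds a label (`ham` or `spam`), a tab, and the message text. Whether the subset shipped in this repo keeps that layout is an assumption, and the function name `parse_sms_lines` and the two sample lines are hypothetical.

```python
import io


def parse_sms_lines(lines):
    """Yield (label, text) pairs from 'label<TAB>message' lines."""
    for line in lines:
        line = line.rstrip("\n")
        if "\t" not in line:
            continue  # skip blank or malformed lines
        # Split on the first tab only, so tabs inside the message survive.
        label, text = line.split("\t", 1)
        yield label, text


# Made-up sample in the assumed layout; replace with an open file handle.
sample = io.StringIO("ham\tSee you at lunch\nspam\tWIN a FREE prize now\n")
pairs = list(parse_sms_lines(sample))
print(pairs)
```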

References

[1] Cedric De Boom, Steven Van Canneyt, Thomas Demeester, and Bart Dhoedt. 2016. Representation learning for very short texts using weighted word embedding aggregation. Pattern Recogn. Lett. 80, C (September 2016), 150-156. DOI: https://doi.org/10.1016/j.patrec.2016.06.012
