Word-Affiliation-Comparative-Study

A Comparative Study of the Performance of Static Word Embeddings

Given a large auxiliary text file (for example, one web-scraped from any website) that has been cleaned using regular expressions, we aim to build an end-to-end framework. The user supplies a word or a set of words (for the word analogy relation task), and the framework extracts from the auxiliary text file the words that correspond most directly to the input: words that are synonyms or antonyms of the input, or the word that best completes the analogy. The word analogy task has the form ‘x is to y what w is to ________’. An example is ‘red is to apple what yellow is to banana’.
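
As an illustration of the analogy task, the following is a minimal sketch, assuming the common vector-offset approach (vec(y) - vec(x) + vec(w)) and hand-made 3-dimensional toy vectors. The function names, the toy vectors, and the choice of this particular method are assumptions made for illustration; the project itself may solve analogies differently.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def solve_analogy(x, y, w, vectors, candidates):
    """Answer 'x is to y what w is to ?' from the candidate words.

    Builds the target vector vec(y) - vec(x) + vec(w) and returns the
    candidate whose vector is closest to it (None if no candidate is usable).
    """
    target = [b - a + c for a, b, c in zip(vectors[x], vectors[y], vectors[w])]
    pool = [c for c in candidates if c in vectors and c not in (x, y, w)]
    return max(pool, key=lambda c: cosine(vectors[c], target), default=None)


# Hypothetical toy vectors; dimensions are (is_fruit, redness, yellowness).
toy_vectors = {
    "red": [0.0, 1.0, 0.0],
    "yellow": [0.0, 0.0, 1.0],
    "apple": [1.0, 1.0, 0.0],
    "banana": [1.0, 0.0, 1.0],
    "cherry": [1.0, 0.9, 0.0],
}

answer = solve_analogy("red", "apple", "yellow", toy_vectors, list(toy_vectors))
print(answer)
```

With real pretrained embeddings the candidates would be the words extracted from the uploaded file rather than a hand-written dictionary.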

Features

Synonyms on a Webpage according to input
Antonyms on a Webpage according to input
Word Analogy Relation according to input words

Word Embeddings Being Tested

word2vec
GloVe
fastText

Results for Wikipedia Page

The brackets show the Cosine Similarity Score

Environment & General Details

Feature	Value
Programming Language	Python
Operating System	Windows 10
Library Packages	pickle, flask, numpy, pandas, gensim, sklearn.metrics.pairwise, re, nltk
Interface Design	Web Application hosted locally
Web App Framework	Flask
Datasets/Models Used	GoogleNews-vectors-negative300.bin, glove.6B.100d.txt.word2vec, wiki-news-300d-1M.vec
Word Embeddings	Word2Vec, GloVe, fastText
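
The pretrained files listed in the table can be loaded with gensim along the following lines. This is a sketch under assumptions: the `MODEL_FILES` mapping, the `load_embedding` helper, and the `model_dir` parameter are hypothetical names, and the binary/text flags assume the usual formats of these files (the GoogleNews file is binary; the other two are word2vec-style text files).

```python
import os

# (file name, is_binary) for each embedding, using the file names from the table.
# The binary flags are an assumption based on the usual format of these files.
MODEL_FILES = {
    "word2vec": ("GoogleNews-vectors-negative300.bin", True),
    "glove": ("glove.6B.100d.txt.word2vec", False),
    "fasttext": ("wiki-news-300d-1M.vec", False),
}


def load_embedding(name, model_dir="."):
    """Load one pretrained embedding as a gensim KeyedVectors object.

    `model_dir` is a hypothetical folder holding the downloaded files.
    gensim is imported lazily so that this module can be imported without it.
    """
    from gensim.models import KeyedVectors

    filename, is_binary = MODEL_FILES[name]
    path = os.path.join(model_dir, filename)
    return KeyedVectors.load_word2vec_format(path, binary=is_binary)
```

Loading the full models takes noticeable time and memory, so a web application would typically load each one once at startup and reuse it across requests.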

Workflow:

The whole project is accessed through a web application. The web application requires the user to upload a web-scraped file in .csv format.
The backend Python code cleans the CSV file provided by the user and stores the remaining words in a list. The user is then asked to
provide a word (or a set of words) for computing synonyms, antonyms, or a word analogy from that list. Each query outputs a list containing
the desired results.
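
The cleaning step can be sketched as follows. The exact cleaning rules are not stated here, so this example assumes lowercasing, keeping only alphabetic tokens via a regular expression, dropping a small stand-in stopword set, and removing duplicates. `extract_words`, `TOKEN_PATTERN`, and `STOPWORDS` are hypothetical names.

```python
import csv
import io
import re

# Assumed cleaning rule: keep runs of lowercase ASCII letters only.
TOKEN_PATTERN = re.compile(r"[a-z]+")

# Tiny stand-in stopword set; a real run might use a fuller list (e.g. from nltk).
STOPWORDS = {"the", "a", "an", "is", "of", "and", "to", "in"}


def extract_words(csv_text):
    """Return the unique cleaned words of a CSV document, in first-seen order."""
    words = []
    seen = set()
    for row in csv.reader(io.StringIO(csv_text)):
        for cell in row:
            for token in TOKEN_PATTERN.findall(cell.lower()):
                if token not in STOPWORDS and token not in seen:
                    seen.add(token)
                    words.append(token)
    return words


# Hypothetical scraped content with one text cell per row.
sample = '"The apple is red, and the banana is yellow!"\n"Red apples; yellow bananas."\n'
print(extract_words(sample))
```

The resulting list is the candidate pool that the synonym, antonym, and analogy queries search.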

We measure similarity using cosine similarity.
We experimented with word2vec, GloVe, and fastText embeddings. For each, we show a cosine similarity score indicating how similar or opposite each extracted word is to the input word or phrase. We achieve this through transfer learning in NLP, reusing the pretrained embedding models listed above.
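
For reference, cosine similarity between vectors u and v is (u · v) / (|u| |v|), ranging from -1 (opposite direction) to 1 (same direction). The sketch below ranks page words against a query using hand-made 2-dimensional toy vectors. Treating the lowest-scoring words as "opposite" is an assumption for illustration, as are the names `rank_words` and `toy`; the project may select antonyms differently.

```python
import math


def cosine_similarity(u, v):
    """(u . v) / (|u| * |v|), or 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_words(query, vectors, page_words, top_n=3, most_similar=True):
    """Score page words against the query and return the top (word, score) pairs.

    most_similar=True sorts from highest to lowest score;
    most_similar=False sorts from lowest to highest (assumed "opposite" words).
    """
    if query not in vectors:
        return []
    scored = [
        (word, cosine_similarity(vectors[query], vectors[word]))
        for word in page_words
        if word in vectors and word != query
    ]
    scored.sort(key=lambda pair: pair[1], reverse=most_similar)
    return scored[:top_n]


# Hypothetical toy vectors standing in for pretrained embeddings.
toy = {
    "good": [1.0, 0.2],
    "great": [0.9, 0.3],
    "bad": [-1.0, -0.1],
    "table": [0.1, 1.0],
}

for word, score in rank_words("good", toy, list(toy)):
    print(f"{word} ({score:.3f})")
```

The printed format mirrors the convention of showing the cosine similarity score in brackets next to each word.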

Web App Interface:

Finding the Synonyms of Read from Wikipedia Page

Model View Template Diagram:
