Source code for IEEE TENCON paper - "Multilingual cyber abuse detection using advanced transformer architecture"

aditya-malte/Code-Mixed-Cyberabuse-Detection

Multilingual cyber abuse detection using advanced transformer architecture (presented at IEEE TENCON 2019)

This repository contains the source code for pre-processing code-mixed text and training the models used in our paper:

Aditya Malte, Pratik Ratadiya, "Multilingual cyber abuse detection using advanced transformer architecture", IEEE TENCON 2019

Dataset:

TRAC-1 code-mixed dataset for detection of cyber abuse

Model:

BERT (Base/Large/Multilingual) and XLNet, trained with various hyperparameter settings

Preprocessing:

Demojization (replacing emojis with text tokens), transliteration, normalization, and so on.
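As a rough illustration of what demojization and normalization can look like, here is a minimal standard-library sketch. It is not the code used in the paper: the function names (`demojize`, `normalize`, `preprocess`), the `:name:` token format, and the specific normalization rules (dropping URLs and @-mentions, lowercasing, squeezing repeated characters) are all assumptions made for this example. Transliteration is left out because it needs a language-specific mapping or library.

```python
import re
import unicodedata

# All patterns below are illustrative assumptions, not the paper's rules.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
REPEAT_RE = re.compile(r"(.)\1{2,}")  # same character three or more times in a row
SPACE_RE = re.compile(r"\s+")


def demojize(text):
    """Replace emoji-like symbols with a ':name:' token built from the Unicode name."""
    out = []
    for ch in text:
        if unicodedata.category(ch) == "So":  # "Symbol, other" covers most emoji
            name = unicodedata.name(ch, "")
            if name:
                out.append(" :" + name.lower().replace(" ", "_") + ": ")
        elif ch in ("\ufe0f", "\u200d"):
            # Drop the variation selector and zero-width joiner used in emoji sequences.
            continue
        else:
            out.append(ch)
    return "".join(out)


def normalize(text):
    """Remove URLs and mentions, lowercase, squeeze repeats, collapse whitespace."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.lower()
    text = REPEAT_RE.sub(r"\1\1", text)  # "sooooo" becomes "soo"
    return SPACE_RE.sub(" ", text).strip()


def preprocess(text):
    """Hypothetical pipeline: demojize first, then normalize."""
    return normalize(demojize(text))


if __name__ == "__main__":
    print(preprocess("Sooooo bad 😀 @user http://x.co"))
    # prints: soo bad :grinning_face:
```

Running demojization before normalization keeps the emoji signal as ordinary text tokens, which a subword tokenizer can then split like any other word.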

Results:

  1. State-of-the-art performance on the Hindi dataset
  2. Top-5 performance on the English dataset

Note:

Colaboratory notebooks will be added soon.
