GitHub - pruizf/elco: Entity Linking system combination using weighted voting (*SEM 2015)

The application consists of clients that call Entity Linking (EL) services for English text, and modules that operate on the results. It implements the Entity Linking system combination described in our *SEM 2015 paper.
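As a rough illustration of what combining systems by weighted voting can look like, the sketch below sums a per-system weight for every entity proposed for a mention and keeps the highest-scoring one. It is not the code in this repository and not necessarily the exact scheme from the paper: the function name, the data layout, the weights, and the exact-span matching are all assumptions made for the example.

```python
from collections import defaultdict


def combine_by_weighted_vote(system_outputs, weights, threshold=0.0):
    """Pick one entity per mention span by summing the weights of the
    systems that proposed it (hypothetical helper, for illustration only).

    system_outputs: {system_name: {(start, end): entity}}
    weights:        {system_name: float}, e.g. a score measured on held-out data
    """
    votes = defaultdict(lambda: defaultdict(float))
    for system, annotations in system_outputs.items():
        for span, entity in annotations.items():
            votes[span][entity] += weights.get(system, 0.0)

    combined = {}
    for span, candidates in votes.items():
        # Highest score wins; ties are broken alphabetically so the
        # result is deterministic.
        entity, score = min(candidates.items(), key=lambda kv: (-kv[1], kv[0]))
        if score > threshold:
            combined[span] = entity
    return combined


# Made-up outputs and weights for three annotators
outputs = {
    "tagme": {(0, 5): "Paris", (10, 16): "France"},
    "spotlight": {(0, 5): "Paris_Hilton"},
    "babelfy": {(0, 5): "Paris", (10, 16): "France"},
}
weights = {"tagme": 0.6, "spotlight": 0.7, "babelfy": 0.5}
result = combine_by_weighted_vote(outputs, weights)
```

Here two lower-weighted systems that agree on "Paris" outvote a single higher-weighted system that proposes "Paris_Hilton", which is the behavior that distinguishes weighted voting from simply trusting the best system.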

The EL services currently supported are:

TagMe
DBpedia Spotlight
~~Wikipedia Miner~~ (public instance no longer accessible)
AIDA: both a local installation and the public web service
Babelfy

Requirements

Python 2.7
lxml
MySQL-python (aka MySQLdb)
nltk
pyspotlight
requests

To call TagMe and Babelfy, you need to request an API key from each service. The application's config module has variables to enter the keys.

Modules

analysis: Parses client responses. Computes entity-cooccurrence tables.
clients: Clients to call the services
config: Configuration
main: Example of how to use the application. Creates runners and calls them for each service
model: Data types and some methods for them
readers: Preprocess input before calling a client
runners: Classes that use a reader, a client and a writer to create an annotation workflow
utils: General tools useful for several modules
writers: Postprocess the annotations and output them (to a file, etc.)
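The reader, client, writer and runner roles described above can be sketched as follows. This is a minimal sketch of the pattern only: every class and method name here is hypothetical and does not mirror the actual classes in readers.py, clients.py, writers.py or runners.py, and the client looks mentions up in a dictionary instead of calling a web service.

```python
class ListReader(object):
    """Hypothetical reader: yields (doc_id, text) pairs with whitespace trimmed."""

    def __init__(self, docs):
        self.docs = docs

    def read(self):
        for doc_id, text in self.docs:
            yield doc_id, text.strip()


class DictionaryClient(object):
    """Hypothetical stand-in for a service client (no network access)."""

    def __init__(self, lexicon):
        self.lexicon = lexicon

    def annotate(self, text):
        annotations = []
        for mention, entity in self.lexicon.items():
            start = text.find(mention)  # first occurrence only
            if start != -1:
                annotations.append((start, start + len(mention), entity))
        return sorted(annotations)


class TsvWriter(object):
    """Hypothetical writer: collects tab-separated lines in memory."""

    def __init__(self):
        self.lines = []

    def write(self, doc_id, annotations):
        for start, end, entity in annotations:
            self.lines.append("%s\t%d\t%d\t%s" % (doc_id, start, end, entity))


class Runner(object):
    """Ties a reader, a client and a writer into one annotation workflow."""

    def __init__(self, reader, client, writer):
        self.reader, self.client, self.writer = reader, client, writer

    def run(self):
        for doc_id, text in self.reader.read():
            self.writer.write(doc_id, self.client.annotate(text))


writer = TsvWriter()
reader = ListReader([("doc1", "  Paris is in France. ")])
client = DictionaryClient({"Paris": "Paris", "France": "France"})
Runner(reader, client, writer).run()
```

Keeping the three roles separate means a new service only needs a new client, while readers and writers can be reused unchanged.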

Usage

Activate the services to call in config.py.

Call main.py:

 usage: App to work with Entity Linking [-h] [-i MYINPUT] [-o MYOUT]
                                        [-r MYOUTRESPS] [-s MYSKIPLIST]
                                        [-c CORPUS_NAME]

 optional arguments:
   -h, --help            show this help message and exit
   -i MYINPUT, --input MYINPUT
                         Input file, directory or text. A default can be set in
                         config.py (default: /path/to/some/default/input)
   -o MYOUT, --output MYOUT
                         Output file or files. Default names are created 
                         dynamically by code in writers.py module (default: None)
   -r MYOUTRESPS, --resp_output MYOUTRESPS
                         Output directory for client responses. A default is
                         created dynamically by code in writers.py module
                         (default: None)
   -s MYSKIPLIST, --skip_list MYSKIPLIST
                         File with filenames to skip (default:
                         /path/to/some/default/list)
   -c CORPUS_NAME, --corpus CORPUS_NAME
                         Name of the corpus (for output files etc.). A default
                         can be set in config.py (default: SOME_DEFAULT_NAME)
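For readers who want to see how options like these map onto code, the following is a sketch of an argparse parser that accepts the same flags. The flag names and help texts are taken from the help message above; the function name build_parser, its parameters, and the way defaults are passed in are assumptions, and the parser in main.py may be built differently.

```python
import argparse


def build_parser(default_input=None, default_skiplist=None, default_corpus=None):
    """Hypothetical helper: in the application, the defaults come from config.py."""
    parser = argparse.ArgumentParser(
        prog="App to work with Entity Linking",
        # Appends "(default: ...)" to each help string
        formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser.add_argument("-i", "--input", dest="myinput", default=default_input,
                        help="Input file, directory or text")
    parser.add_argument("-o", "--output", dest="myout", default=None,
                        help="Output file or files")
    parser.add_argument("-r", "--resp_output", dest="myoutresps", default=None,
                        help="Output directory for client responses")
    parser.add_argument("-s", "--skip_list", dest="myskiplist",
                        default=default_skiplist,
                        help="File with filenames to skip")
    parser.add_argument("-c", "--corpus", dest="corpus_name",
                        default=default_corpus,
                        help="Name of the corpus (for output files etc.)")
    return parser


# Parse an explicit argument list instead of sys.argv, for demonstration
args = build_parser(default_corpus="demo").parse_args(
    ["-i", "docs/", "-o", "out.tsv"])
```

Options that are not given on the command line fall back to the defaults passed to build_parser, which matches the help text's note that defaults can be set in config.py.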

