r0ller/hi-engmorph-foma: English morphological analyser for the Alice project

The goal of this project is to create a usable English morphological analyser for the Alice project.

The whole project is based on the following resources:

University of Pennsylvania, XTAG Project: https://www.cis.upenn.edu/~xtag/

morph-1.5: https://www.cis.upenn.edu/~xtag/swrelease.html

original text file: morph-1.5\data\morph_english.flat

I have committed all the intermediate artifacts (source code, db files, fst, build scripts, etc.) that emerged during the conversion of the text file morph_english.txt.

A short description of the files:
-createdb.sql was used to create the empty db files
-upenn_morphtxt2db.cpp was used to create converted_engmorph.db from comment_free_morph_english.txt
-adjust_upenn_morphdb.cpp was used to adjust that db and produce engmorph.db, which was used from then on
-the various build*.sh scripts were used to build the programs that convert the relevant info from engmorph.db into the corresponding *.lexc files
-english.foma was taken from the foma site and enhanced manually
-finally, some manual adjustments were applied to the lexc files
-createfst.sh was used to create english.fst in the end
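
To give a feel for the kind of conversion described above, here is a minimal sketch and not the code in this repository. It assumes each line of the comment-free text file has the layout `form<TAB>lemma POS feature...#lemma POS feature...`, with `#` separating alternative analyses; that layout, the names `Analysis`, `Entry`, `parse_flat_line` and `to_lexc_entry`, and the shape of the generated lexc entry are all illustrative assumptions. The real tools go through the db files first and may structure the lexc output quite differently.

```cpp
#include <sstream>
#include <string>
#include <vector>

// One analysis of a surface form: lemma, part of speech, and feature tags.
struct Analysis {
    std::string lemma;
    std::string pos;
    std::vector<std::string> features;
};

// A surface form together with all analyses listed for it.
struct Entry {
    std::string form;
    std::vector<Analysis> analyses;
};

// Parse one line, ASSUMING the layout
//   form<TAB>lemma POS feat...#lemma POS feat...
// Returns false if the line has no tab or no usable analysis.
bool parse_flat_line(const std::string& line, Entry& out) {
    const std::string::size_type tab = line.find('\t');
    if (tab == std::string::npos) return false;
    out.form = line.substr(0, tab);
    out.analyses.clear();

    std::istringstream rest(line.substr(tab + 1));
    std::string chunk;
    while (std::getline(rest, chunk, '#')) {  // analyses are separated by '#'
        std::istringstream fields(chunk);
        Analysis a;
        if (!(fields >> a.lemma >> a.pos)) continue;  // need lemma and POS
        std::string feat;
        while (fields >> feat) a.features.push_back(feat);
        out.analyses.push_back(a);
    }
    return !out.form.empty() && !out.analyses.empty();
}

// Format one analysis as a lexc-style "upper:lower" entry whose
// continuation class is '#' (end of word). The tag scheme is hypothetical.
std::string to_lexc_entry(const std::string& form, const Analysis& a) {
    std::string upper = a.lemma + "+" + a.pos;
    for (const std::string& f : a.features) upper += "+" + f;
    return upper + ":" + form + " # ;";
}
```

With an illustrative input line such as `walked<TAB>walk V PAST WK#walk V PPART WK`, `parse_flat_line` yields two analyses, and `to_lexc_entry` turns the first one into `walk+V+PAST+WK:walked # ;`. Special characters that lexc requires to be escaped are not handled in this sketch.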

Note: This is not a 1:1 transformation of the original and is still in development, so expect many bugs and mistakes.

Repository files:
README.md
XTAG.pdf
adjust_upenn_morphdb.cpp
build_adjust.sh
build_convert.sh
build_extract_adjs.sh
build_extract_advs.sh
build_extract_comps.sh
build_extract_conjs.sh
build_extract_dets.sh
build_extract_injects.sh
build_extract_nouns.sh
build_extract_nvcs.sh
build_extract_parts.sh
build_extract_preps.sh
build_extract_prons.sh
build_extract_puncts.sh
build_extract_verbs.sh
build_extract_vvcs.sh
comment_free_morph_english.txt
converted_engmorph.db
createdb.sql
createfst.sh
engadj.lexc
engadj_cd.lexc
engadv.lexc
engcomp.lexc
engconj.lexc
engdecnum.lexc
engdet.lexc
enginject.lexc
english.foma
english.fst
engmorph.db
engnoun.lexc
engnum.lexc
engnvc.lexc
engpart.lexc
engprep.lexc
engpron.lexc
engpunct.lexc
engverb.lexc
engverb_cd.lexc
engvvc.lexc
extract_db_adjs.cpp
extract_db_advs.cpp
extract_db_comps.cpp
extract_db_conjs.cpp
extract_db_dets.cpp
extract_db_injects.cpp
extract_db_nouns.cpp
extract_db_nvcs.cpp
extract_db_parts.cpp
extract_db_preps.cpp
extract_db_prons.cpp
extract_db_puncts.cpp
extract_db_verbs.cpp
extract_db_vvcs.cpp
feature_nr_1_as_gcat_distinct.txt
morph_english.txt
noun_sg_ending_in_s.txt
nouns.txt
savestack.sh
upenn_morphtxt2db.cpp
