Report for incorrect sentence split (JNLPBA-IOBES) #2

wonjininfo · 2019-03-28T07:32:01Z

Hi,
Thanks for providing these useful resources!
While using them, we noticed that sentences in the JNLPBA-IOBES dataset appear to be split incorrectly.

MTL-Bioinformatics-2016/data/JNLPBA-IOBES/test.tsv starts with

Number	O

of	O
glucocorticoid	B-protein
receptors	E-protein
in	O
lymphocytes	S-cell_type
and	O
their	O
sensitivity	O
to	O
hormone	O
action	O
.	O
The	O

study	O
demonstrated	O

while MTL-Bioinformatics-2016/data/JNLPBA/test.tsv starts with

-DOCSTART-	O

Number	O
of	O
glucocorticoid	B-protein
receptors	I-protein
in	O
lymphocytes	B-cell_type
and	O
their	O
sensitivity	O
to	O
hormone	O
action	O
.	O

The	O
study	O

We used our own post-processing script to fix this and used the corrected dataset in our experiments.
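
The script itself is not included in this thread. As a rough sketch only (not the script we used), one way such a fix might work is to take the tokens and tags from the IOBES file and re-cut them at the sentence boundaries of the JNLPBA file. This assumes both files contain the same tokens in the same order, which the excerpts above suggest but which has not been checked for the full files. The function names `read_sentences` and `resplit` are hypothetical.

```python
from typing import Iterable, List, Tuple

Token = Tuple[str, str]  # (word, tag)


def read_sentences(lines: Iterable[str]) -> List[List[Token]]:
    """Group tab-separated "word<TAB>tag" lines into blank-line-delimited sentences.

    -DOCSTART- marker lines are skipped.
    """
    sentences: List[List[Token]] = []
    current: List[Token] = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence, if any.
            if current:
                sentences.append(current)
                current = []
            continue
        word, tag = line.rsplit("\t", 1)
        if word == "-DOCSTART-":
            continue
        current.append((word, tag))
    if current:
        sentences.append(current)
    return sentences


def resplit(iobes_lines: Iterable[str], reference_lines: Iterable[str]) -> List[List[Token]]:
    """Re-cut the IOBES tokens at the sentence boundaries of the reference file.

    Assumption: both files hold the same words in the same order.
    A ValueError is raised if that does not hold.
    """
    # Ignore the (possibly wrong) sentence breaks in the IOBES file.
    iobes_tokens = [tok for sent in read_sentences(iobes_lines) for tok in sent]
    reference = read_sentences(reference_lines)

    if sum(len(sent) for sent in reference) != len(iobes_tokens):
        raise ValueError("token counts differ between the two files")

    fixed: List[List[Token]] = []
    pos = 0
    for sent in reference:
        chunk = iobes_tokens[pos:pos + len(sent)]
        for (ref_word, _), (word, _) in zip(sent, chunk):
            if ref_word != word:
                raise ValueError(f"token mismatch: {ref_word!r} vs {word!r}")
        fixed.append(chunk)
        pos += len(sent)
    return fixed


if __name__ == "__main__":
    # Shortened version of the excerpts quoted above.
    iobes = ["Number\tO", "", "of\tO", "receptors\tE-protein", ".\tO", "The\tO", "", "study\tO"]
    reference = ["-DOCSTART-\tO", "", "Number\tO", "of\tO", "receptors\tI-protein", ".\tO", "", "The\tO", "study\tO"]
    for sentence in resplit(iobes, reference):
        print("\n".join(f"{word}\t{tag}" for word, tag in sentence))
        print()
```

The IOBES tags are kept unchanged; only the blank lines between sentences are taken from the reference file.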

Once again, thank you so much for sharing these useful resources!


GamalC · 2019-03-28T21:08:51Z

Hi @wonjininfo. Many thanks for this information. I think others would appreciate having your script as well; would you mind sharing it? If you are willing, you can create a pull request or send me the script (gkoc2 at cam dot ac uk) and I will add it.

GamalC self-assigned this Mar 28, 2019
