The current text tokenisation is naive and misses many cases. A good tokenisation library should be able to generate every possible reading of tokens such as currency amounts, dates, numbers, and symbols.
For example, the sentence:
In 1996, 1996 people sent emails at someone @ example . com at 1:30 PM.
should also yield the spoken form:
In nineteen ninety six, one thousand nine hundred and ninety six people sent emails at someone at example dot com at one thirty p m.
along with all the other alternative readings.
The library needs to be integrated into the subtitle parser (srtparser.h).
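
As a rough illustration of the number case in the example above, here is a minimal C++ sketch that returns the alternative readings of a numeric token. All names (`cardinalToWords`, `yearToWords`, `spokenAlternatives`) are hypothetical and not part of srtparser.h. It assumes English readings and numbers from 0 to 9999 only. Currency, times, and email symbols would need their own rules.

```cpp
#include <string>
#include <vector>

// Hypothetical helpers for illustration only; none of these exist in srtparser.h.
namespace {
const char* const kOnes[] = {
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen"};
const char* const kTens[] = {
    "", "", "twenty", "thirty", "forty", "fifty",
    "sixty", "seventy", "eighty", "ninety"};

// Spell out a value in the range 0..99.
std::string belowHundred(int n) {
    if (n < 20) return kOnes[n];
    std::string out = kTens[n / 10];
    if (n % 10 != 0) out += std::string(" ") + kOnes[n % 10];
    return out;
}
}  // namespace

// Cardinal reading for 0..9999,
// e.g. 1996 becomes "one thousand nine hundred and ninety six".
std::string cardinalToWords(int n) {
    if (n < 100) return belowHundred(n);
    std::string out;
    if (n >= 1000) out += std::string(kOnes[n / 1000]) + " thousand";
    int hundreds = (n / 100) % 10;
    if (hundreds != 0) {
        if (!out.empty()) out += " ";
        out += std::string(kOnes[hundreds]) + " hundred";
    }
    int rest = n % 100;
    if (rest != 0) out += " and " + belowHundred(rest);
    return out;
}

// Year-style reading, e.g. 1996 becomes "nineteen ninety six".
// Falls back to the cardinal reading when the two-pair form does not apply.
std::string yearToWords(int y) {
    if (y < 1000 || y > 9999 || y % 100 < 10) return cardinalToWords(y);
    return belowHundred(y / 100) + " " + belowHundred(y % 100);
}

// Return the token itself plus its spoken alternatives.
// Non-numeric tokens and numbers longer than four digits are returned unchanged.
std::vector<std::string> spokenAlternatives(const std::string& token) {
    if (token.empty() || token.size() > 4) return {token};
    for (char c : token) {
        if (c < '0' || c > '9') return {token};
    }
    int n = std::stoi(token);
    std::vector<std::string> out{token, cardinalToWords(n)};
    std::string year = yearToWords(n);
    if (year != out.back()) out.push_back(year);
    return out;
}
```

Calling `spokenAlternatives("1996")` under these assumptions gives the original token, the cardinal reading, and the year reading, so a subtitle matcher could accept any of them. Returning a list instead of a single string keeps the "all alternative versions" requirement explicit in the interface.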