How can I train this model with arabic or urdu characters? #34

ghulammustufa31 · 2019-02-15T09:19:21Z

My labels contain Arabic/Urdu text.
For example "اسلام آباد : چیئرمین رضابانی کی زیر صدارت سینیٹ کا اجلاس"

What changes are required to train the model given non-English labels?

Belval · 2019-02-15T13:45:11Z

According to Britannica, Arabic has 28 letters, so it should be more compatible with the CRNN architecture than a word-based language like Chinese. I think you can get somewhat workable results by simply replacing the values in CRNN/config.py. Since Arabic is read right to left, you might run into some issues, but you'll have to try it to be sure.
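As a rough illustration of what "replacing the values" could involve, here is a minimal sketch that derives a character set from the training labels and maps each label to class indices. The names `build_char_vector`, `encode_label`, `decode_label`, `char_vector`, and `num_classes` are hypothetical and are not taken from CRNN/config.py, so check them against the actual config before copying anything. The `reverse` flag is a naive way to turn a purely right-to-left label into left-to-right visual order; it is an assumption that this helps, and it does not handle mixed-direction text or combining marks properly.

```python
import unicodedata


def build_char_vector(labels):
    """Collect every distinct character found in the labels, in sorted order."""
    chars = set()
    for label in labels:
        # Normalize so that equivalent Unicode sequences map to one character.
        chars.update(unicodedata.normalize("NFC", label))
    return "".join(sorted(chars))


def encode_label(label, char_vector, reverse=False):
    """Map a label to a list of class indices into char_vector."""
    text = unicodedata.normalize("NFC", label)
    if reverse:
        # Naive reversal: assumes the whole label is right-to-left text.
        text = text[::-1]
    return [char_vector.index(ch) for ch in text]


def decode_label(indices, char_vector, reverse=False):
    """Map class indices back to a string, undoing the optional reversal."""
    text = "".join(char_vector[i] for i in indices)
    return text[::-1] if reverse else text


# Example labels taken from the text quoted in the question.
labels = ["اسلام آباد", "سینیٹ کا اجلاس"]

char_vector = build_char_vector(labels)
# Assumption: the model is trained with CTC, which needs one extra blank class.
num_classes = len(char_vector) + 1

encoded = encode_label(labels[0], char_vector, reverse=True)
restored = decode_label(encoded, char_vector, reverse=True)
```

Building the character set from the labels themselves, rather than typing out the alphabet by hand, also picks up spaces, punctuation, and Urdu-specific letters that a 28-letter Arabic list would miss.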

For Urdu, the same process can be applied, but some characters seem to be very wide. Since CRNN is not attention-based, this could make it very hard for training to converge.
