tesstrain basically creates artificial training data, for example for fine-tuning with a specific font. You might find some existing examples that use the old tesstrain.sh script, which should be roughly equivalent to tesstrain. The Makefile approach is for "real" data only.
As mentioned by @stefan6419846 in madmaze/pytesseract#508, there is a Python wrapper for training in tesstrain/src/, which unfortunately is not documented in the tesseract, tessdoc, or tesstrain repositories.
From my understanding (please correct me if I'm wrong):
It only generates lstmf files and does not perform any training.
Of the steps listed in Overview of Training Process, it stops at step 5; steps 6 and 7 must be done separately. Is that correct?
How should steps 6 and 7 be performed? With Makefile commands? If you give me some pointers, I can help add these steps to the Python script.
The Python script takes a TEXTFILE and generates (for each font) box/tif/lstmf files for the whole text, not line by line. So, to generate them line by line, must we run the script once for each one-line file?
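To make the last point concrete, here is a minimal sketch of how a multi-line text file could be split into one-line files. It uses only the Python standard library and calls nothing from tesstrain. The function name `split_into_line_files`, the `prefix` parameter, and the output file naming are hypothetical choices for illustration, and I have not confirmed that the wrapper needs its input prepared this way.

```python
from pathlib import Path


def split_into_line_files(text_file, out_dir, prefix="line"):
    """Write each non-empty line of text_file to its own file in out_dir.

    Hypothetical helper: the naming scheme (prefix_00000.txt, ...) is an
    assumption for illustration, not something tesstrain requires.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    lines = Path(text_file).read_text(encoding="utf-8").splitlines()
    # Skip blank lines so that no empty training file is produced.
    non_empty = [line.strip() for line in lines if line.strip()]

    written = []
    for index, line in enumerate(non_empty):
        target = out_dir / f"{prefix}_{index:05d}.txt"
        target.write_text(line + "\n", encoding="utf-8")
        written.append(target)
    return written
```

Each resulting file could then be given to the script as its text input, one run per file, if that is indeed the intended way to get line-level output.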
Thanks in advance!
Cc: @stefan6419846