Hello there,
thank you for your fantastic work. Is it possible to release a filtered version of the dataset, without any tables annotated?
Background: The reading order of running text can differ substantially from the reading order within tables. In my experiments, your model mixes up the reading order on some documents with multi-column text layouts: it reads some paragraphs left to right across the columns instead of following the two-column layout from top to bottom. My guess is that this is caused by the table samples in the dataset.
Would it be possible to filter out the images that contain a table, using a layout-segmentation or table-detection model, and release a filtered version of the dataset?
Or would it be better to release two separate datasets, one for tables and one for text?
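To illustrate the kind of filtering I have in mind, here is a minimal sketch in Python. It is not based on your actual data format: the `region_labels` field and the `label_based_detector` function are hypothetical placeholders, and in practice `contains_table` would wrap whatever layout-segmentation or table-detection model is used.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def split_by_table_presence(
    samples: Iterable[T],
    contains_table: Callable[[T], bool],
) -> Tuple[List[T], List[T]]:
    """Split samples into (text_only, with_tables) using a pluggable detector."""
    text_only: List[T] = []
    with_tables: List[T] = []
    for sample in samples:
        if contains_table(sample):
            with_tables.append(sample)
        else:
            text_only.append(sample)
    return text_only, with_tables


def label_based_detector(sample: dict) -> bool:
    """Hypothetical detector: assumes each sample lists its layout region labels.

    A real detector would run a table-detection model on the page image.
    """
    return "table" in sample.get("region_labels", [])


# Toy samples with made-up fields, only to show the intended behaviour.
toy_samples = [
    {"id": "page_a", "region_labels": ["paragraph", "paragraph"]},
    {"id": "page_b", "region_labels": ["paragraph", "table"]},
    {"id": "page_c", "region_labels": []},
]

text_only, with_tables = split_by_table_presence(toy_samples, label_based_detector)
print([s["id"] for s in text_only])    # pages without any detected table
print([s["id"] for s in with_tables])  # pages with at least one detected table
```

The same split would yield both variants at once: the text-only subset as the filtered dataset, and the remainder as a separate table dataset.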
Thank you in advance!