Bug description
When I run the text detector on the following image, some of the predicted boxes include dots that belong to the i's in the line below the box (see the picture under "Error traceback").
Code snippet to reproduce the bug
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from doctr.models import detection_predictor
from doctr.io import DocumentFile
def visualize_word_boxes(image_path, word_boxes):
    # Load the image
    image = plt.imread(image_path)
    # Get image dimensions (shape[:2] also works for grayscale images)
    image_height, image_width = image.shape[:2]
    # Create figure and axes
    fig, ax = plt.subplots()
    ax.imshow(image)
    # Plot word boxes
    for box in word_boxes:
        # Convert normalized coordinates to absolute pixel values
        x1 = int(box[0] * image_width)
        y1 = int(box[1] * image_height)
        x2 = int(box[2] * image_width)
        y2 = int(box[3] * image_height)
        # Create a rectangle patch
        rect = patches.Rectangle((x1, y1), x2 - x1, y2 - y1, linewidth=1, edgecolor='r', facecolor='none')
        # Add the patch to the Axes
        ax.add_patch(rect)
    # Show the plot
    plt.show()
image_path = "/home/rmast/Downloads/Brief gemeente 300dpi voorkant.jpg"
# Run the pretrained detection model on the page
model = detection_predictor(arch='db_resnet50', pretrained=True)
doc = DocumentFile.from_images(image_path)
result = model(doc)
# Assuming result[0]['words'] holds the word boxes as (xmin, ymin, xmax, ymax, score) in relative coordinates
word_boxes = result[0]['words']
visualize_word_boxes(image_path, word_boxes)
Error traceback
In "Op meerdere plaatsen [op]", the box around [op] contains a dot from the line below.
In "kruisingen [op] het Kerkplein", the box around [op] also contains a dot from the line below.
In "We [gaan] de kruisingen", the box around [gaan] also contains a dot from the line below.
The descenders of p and g appear to increase the risk of this happening.
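To put a number on the observation above instead of relying on the picture alone, a rough check could list every box whose bottom edge reaches below the top of a box on a later line. This is only a sketch: the helper name `flag_vertical_intrusions` and its `min_x_overlap` parameter are made up for illustration, and it assumes the boxes are rows of `(xmin, ymin, xmax, ymax[, score])` in relative coordinates, as in the reproduction snippet.

```python
import numpy as np


def flag_vertical_intrusions(boxes, min_x_overlap=0.0):
    """Hypothetical helper: find boxes that reach into a lower line.

    Returns a list of (i, j, depth) where box i extends `depth` (relative
    units) below the top edge of box j, box j starts below the vertical
    centre of box i, and the two boxes overlap horizontally.
    """
    # Keep only the four coordinates; a trailing score column is ignored.
    coords = np.asarray(boxes, dtype=float)[:, :4]
    pairs = []
    for i, (ax1, ay1, ax2, ay2) in enumerate(coords):
        for j, (bx1, by1, bx2, by2) in enumerate(coords):
            if i == j:
                continue
            # Box j counts as "on a later line" if it starts below i's centre.
            starts_lower = by1 > (ay1 + ay2) / 2
            # Width of the horizontal overlap (negative if disjoint).
            x_overlap = min(ax2, bx2) - max(ax1, bx1)
            # How far the bottom of box i reaches past the top of box j.
            depth = ay2 - by1
            if starts_lower and x_overlap > min_x_overlap and depth > 0:
                pairs.append((i, j, float(depth)))
    return pairs


# Toy data: box 0 dips into box 1 on the next line, box 2 is unrelated.
example_boxes = [
    [0.10, 0.10, 0.20, 0.16, 0.9],
    [0.12, 0.15, 0.30, 0.20, 0.9],
    [0.50, 0.10, 0.60, 0.14, 0.9],
]
intrusions = flag_vertical_intrusions(example_boxes)
```

With the detector output from the snippet above, this could be called as `flag_vertical_intrusions(word_boxes)`. A flagged pair only shows that two boxes overlap vertically; whether the overlap actually contains a dot from an i still has to be checked in the image.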
Environment
DocTR version: v0.8.1
TensorFlow version: N/A
PyTorch version: 2.2.2 (torchvision 0.17.2)
OpenCV version: 4.9.0
OS: Linux Mint 20.3
Python version: 3.12.3
Is CUDA available (TensorFlow): N/A
Is CUDA available (PyTorch): Yes
CUDA runtime version: 12.1.66
GPU models and configuration: GPU 0: NVIDIA GeForce GT 1030
Nvidia driver version: 535.86.05
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.9.3
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.9.3
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.9.3
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.9.3
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.9.3
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.9.3
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.9.3
Deep Learning backend
Python 3.12.3 | packaged by Anaconda, Inc. | (main, Apr 19 2024, 16:50:38) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
from doctr.file_utils import is_tf_available, is_torch_available