Detection prefers to include dots of i's underneath #1562

rmast · 2024-04-22T18:18:32Z

Bug description

When I run the detector on the following image:

the detected boxes tend to include dots that belong to the i's in the line below them (see the picture under "Error traceback").

Code snippet to reproduce the bug

import matplotlib.pyplot as plt
import matplotlib.patches as patches
from doctr.models import detection_predictor
from doctr.io import DocumentFile



def visualize_word_boxes(image_path, word_boxes):
    # Load the image
    image = plt.imread(image_path)

    # Get image dimensions (works for both grayscale and color images)
    image_height, image_width = image.shape[:2]

    # Create figure and axes
    fig, ax = plt.subplots()
    ax.imshow(image)

    # Plot word boxes
    for box in word_boxes:
        # Convert normalized coordinates to absolute pixel values
        x1 = int(box[0] * image_width)
        y1 = int(box[1] * image_height)
        x2 = int(box[2] * image_width)
        y2 = int(box[3] * image_height)

        # Create a rectangle patch
        rect = patches.Rectangle((x1, y1), x2 - x1, y2 - y1, linewidth=1, edgecolor='r', facecolor='none')

        # Add the patch to the Axes
        ax.add_patch(rect)

    # Show the plot
    plt.show()

# Path to the scanned page used to reproduce the issue
image_path = "/home/rmast/Downloads/Brief gemeente 300dpi voorkant.jpg"

# Run text detection on the page
model = detection_predictor(arch='db_resnet50', pretrained=True)
doc = DocumentFile.from_images(image_path)
result = model(doc)
word_boxes = result[0]['words']  # Assuming 'words' holds the word boxes in normalized coordinates
visualize_word_boxes(image_path, word_boxes)
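
The coordinate conversion inside the plotting loop can also be done in one step, which makes it easier to inspect the boxes numerically. This is only a minimal sketch: `to_pixel_boxes` is a hypothetical helper, and it assumes each row starts with normalized (xmin, ymin, xmax, ymax) values, optionally followed by a score. It rounds to the nearest pixel, whereas the loop above truncates with `int()`.

```python
import numpy as np


def to_pixel_boxes(boxes, image_width, image_height):
    """Scale normalized box rows to integer pixel coordinates.

    Assumption: each row starts with (xmin, ymin, xmax, ymax) in [0, 1];
    any extra columns (e.g. a confidence score) are ignored.
    """
    boxes = np.asarray(boxes, dtype=float)
    if boxes.size == 0:
        # No detections: return an empty (0, 4) array
        return np.zeros((0, 4), dtype=int)
    scale = np.array([image_width, image_height, image_width, image_height], dtype=float)
    # Clip to the image area, scale, and round to the nearest pixel
    coords = np.clip(boxes[:, :4], 0.0, 1.0) * scale
    return np.rint(coords).astype(int)
```

For example, `to_pixel_boxes(word_boxes, image_width, image_height)` could replace the four `int(...)` lines when only the pixel rectangles are needed.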

Error traceback

In "Op meerdere plaatsen [op]", the box around [op] contains a dot from the line below.
In "kruisingen [op] het Kerkplein", the box around [op] also contains a dot from the line below.
In "We [gaan] de kruisingen", the box around [gaan] also contains a dot from the line below.
It appears that the descenders of p and g increase the risk of this happening.
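
To list the affected boxes instead of spotting them by eye, a rough check like the one below might help. It is a sketch under assumptions, not part of docTR: `find_vertical_intrusions` is a hypothetical helper, the boxes are assumed to be rows starting with (xmin, ymin, xmax, ymax), and "box A reaches below the top of a box B that starts lower on the page" is only a heuristic for "A picked up something from the line below".

```python
import numpy as np


def find_vertical_intrusions(boxes, min_x_overlap=0.0):
    """Return (upper, lower) index pairs of boxes that overlap vertically.

    A pair (i, j) is reported when box i starts above box j, the two boxes
    overlap horizontally by more than `min_x_overlap`, and the bottom edge
    of box i lies below the top edge of box j.
    Assumption: rows start with (xmin, ymin, xmax, ymax), y growing downwards.
    """
    boxes = np.asarray(boxes, dtype=float)
    if boxes.size == 0:
        return []
    coords = boxes[:, :4]
    pairs = []
    for i, (ax1, ay1, ax2, ay2) in enumerate(coords):
        for j, (bx1, by1, bx2, by2) in enumerate(coords):
            if i == j:
                continue
            # Width of the horizontal overlap (negative if the boxes are apart)
            x_overlap = min(ax2, bx2) - max(ax1, bx1)
            starts_above = ay1 < by1
            reaches_into = ay2 > by1
            if x_overlap > min_x_overlap and starts_above and reaches_into:
                pairs.append((i, j))
    return pairs
```

Calling `find_vertical_intrusions(word_boxes)` on the detection output would then give candidate pairs to compare against the picture; whether boxes around words with descenders such as p and g dominate that list is something to verify on the actual image.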

Environment

DocTR version: v0.8.1
TensorFlow version: N/A
PyTorch version: 2.2.2 (torchvision 0.17.2)
OpenCV version: 4.9.0
OS: Linux Mint 20.3
Python version: 3.12.3
Is CUDA available (TensorFlow): N/A
Is CUDA available (PyTorch): Yes
CUDA runtime version: 12.1.66
GPU models and configuration: GPU 0: NVIDIA GeForce GT 1030
Nvidia driver version: 535.86.05
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.9.3
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.9.3
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.9.3
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.9.3
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.9.3
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.9.3
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.9.3

Deep Learning backend

Python 3.12.3 | packaged by Anaconda, Inc. | (main, Apr 19 2024, 16:50:38) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> from doctr.file_utils import is_tf_available, is_torch_available
>>> print(f"is_tf_available: {is_tf_available()}")
is_tf_available: False
>>> print(f"is_torch_available: {is_torch_available()}")
is_torch_available: True


rmast added the "type: bug" label on Apr 22, 2024
