Dataset not well formatted? #1737

johnlockejrr · 2024-09-29T15:34:55Z


I am trying to train a detection model with doctr. I followed the instructions at https://github.com/mindee/doctr/tree/main/references/detection on how to format my data. Here is a snippet of my labels.json:

{
    "81_dc946_default.jpg": = {
        'img_dimensions': (640, 480),
        'img_hash': "96acfebe56c3056bdd42b18480036a9434fb426d0ff8f1ca4a610437563f79b5",
        'polygons': {
            "textzone": [[[2161, 949], [3196, 962], [3249, 1054], [3262, 4333], [3091, 4386], [461, 4386], [421, 4267], [448, 2346], [408, 2211], [448, 1015], [2161, 949]]],
            "textline": [[[491, 1084], [491, 1163], [731, 1143], [899, 1160], [926, 1160], [929, 1160], [942, 1156], [1051, 1133], [1173, 1153], [1212, 1160], [1216, 1160], [1763, 1156], [1766, 1156], [1769, 1156], [1786, 1146], [1802, 1136], [1881, 1143], [2010, 1156], [2013, 1156], [2181, 1140], [2310, 1127], [2689, 1127], [2702, 1130], [2735, 1150], [2738, 1150], [2741, 1150], [2837, 1130], [2857, 1127], [2873, 1130], [2959, 1143], [2962, 1143], [3058, 1127], [3061, 1127], [3094, 1153], [3097, 1153], [3101, 1153], [3104, 1153], [3107, 1153], [3183, 1123], [3183, 1074], [3176, 1001], [487, 1011], [491, 1084]], [[477, 1212], [477, 1285], [873, 1268], [1028, 1278], [1067, 1281], [1071, 1281], [1074, 1281], [1077, 1281], [1077, 1278], [1097, 1265], [1127, 1278], [1133, 1281], [1136, 1281], [1140, 1281], [1506, 1278], [1509, 1278], [1512, 1278], [1519, 1275], [1548, 1258], [1746, 1258], [1776, 1272], [1792, 1278], [1796, 1278], [1799, 1278], [1838, 1268], [1888, 1258], [1931, 1268], [1967, 1278], [1970, 1278], [1974, 1281], [2214, 1281], [2214, 1278], [2217, 1278], [2257, 1265], [2287, 1255], [2323, 1265], [2369, 1275], [2372, 1275], [2376, 1275], [2379, 1275], [2409, 1262], [2428, 1255], [2550, 1262], [2755, 1272], [2758, 1272], [2761, 1272], [2788, 1258], [2797, 1255], [2992, 1255], [3002, 1255], [3087, 1272], [3091, 1272], [3183, 1252], [3186, 1202], [3183, 1140], [2972, 1136], [2942, 1117], [2939, 1117], [2936, 1117], [2933, 1117], [2811, 1136], [2507, 1120], [2504, 1120], [2385, 1136], [2306, 1120], [2303, 1120], [2300, 1120], [2267, 1136], [2234, 1120], [2231, 1120], [2227, 1120], [2112, 1140], [2079, 1120], [2076, 1120], [2072, 1120], [2069, 1120], [2066, 1120], [1987, 1143], [1974, 1150], [1756, 1150], [1753, 1146], [1700, 1120], [1697, 1120], [1539, 1120], [1535, 1120], [1301, 1146], [1295, 1146], [1265, 1146], [1100, 1146], [1074, 1127], [1071, 1127], [1067, 1127], [1064, 1127], [1061, 1127], [1057, 1127], [1028, 1150], [1021, 1150], [1001, 1150], [843, 
1133], [840, 1133], [474, 1150], [477, 1212]], [[494, 1344], [494, 1417], [846, 1413], [850, 1413], [853, 1413], [883, 1390], [955, 1403], [959, 1403], [962, 1403], [965, 1403], [985, 1387], [1107, 1410], [1110, 1410], [1113, 1410], [1123, 1410], [1390, 1384], [1426, 1400], [1430, 1400], [1433, 1400], [1588, 1384], [1654, 1403], [1664, 1407], [1667, 1407], [1670, 1407], [1776, 1403], [2026, 1390], [2095, 1400], [2138, 1403], [2142, 1403], [2145, 1403], [2161, 1397], [2211, 1380], [2277, 1397], [2300, 1400], [2303, 1400], [2343, 1397], [2474, 1377], [2524, 1394], [2544, 1400], [2547, 1400], [2550, 1400], [2554, 1400], [2577, 1394], [2610, 1384], [2672, 1390], [2699, 1394], [2702, 1394], [2705, 1394], [2708, 1394], [2712, 1390], [2738, 1374], [2774, 1390], [2791, 1397], [2794, 1397], [2797, 1397], [2857, 1390], [2919, 1380], [2959, 1387], [2992, 1394], [2995, 1394], [2998, 1394], [3018, 1387], [3058, 1374], [3180, 1384], [3186, 1324], [3180, 1245], [2942, 1262], [2811, 1249], [2761, 1245], [2758, 1245], [2725, 1252], [2623, 1268], [2053, 1262], [1660, 1255], [1657, 1255], [1453, 1268], [1314, 1278], [1229, 1272], [1057, 1258], [1054, 1258], [1051, 1258], [1047, 1258], [1028, 1275], [1024, 1275], [1011, 1275], [764, 1265], [761, 1265], [757, 1265], [728, 1278], [725, 1281], [718, 1281], [662, 1265], [659, 1265], [655, 1265], [491, 1281], [494, 1344]], [[477, 1548], [655, 1545], [659, 1545], [718, 1519], [774, 1545], [777, 1545], [781, 1545], [1044, 1515], [1113, 1542], [1117, 1542], [1120, 1542], [1262, 1515], [1430, 1542], [1440, 1542], [1443, 1542], [1446, 1542], [1450, 1542], [1558, 1512], [1713, 1535], [1716, 1535], [1720, 1535], [1723, 1535], [1756, 1512], [1875, 1512], [1960, 1539], [1967, 1539], [1970, 1539], [1974, 1539], [1990, 1539], [2115, 1512], [2178, 1535], [2181, 1535], [2184, 1539], [2389, 1539], [2389, 1535], [2415, 1532], [2636, 1506], [2814, 1529], [2840, 1532], [2844, 1532], [2972, 1529], [3012, 1525], [3015, 1525], [3018, 1525], [3045, 1506], 
[3173, 1525], [3180, 1456], [3173, 1367], [2633, 1397], [2554, 1377], [2544, 1374], [2540, 1374], [2537, 1374], [2501, 1377], [2366, 1390], [2306, 1380], [2270, 1374], [2267, 1374], [2198, 1380], [2056, 1397], [1881, 1387], [1733, 1377], [1730, 1377], [1608, 1390], [1489, 1403], [1189, 1403], [1183, 1397], [1153, 1384], [1150, 1384], [1146, 1384], [1143, 1384], [1107, 1397], [1090, 1403], [1015, 1400], [846, 1390], [843, 1390], [840, 1390], [807, 1403], [794, 1410], [477, 1410], [477, 1473], [477, 1548]], [[487, 1604], [491, 1687], [576, 1657], [701, 1680], [705, 1680], [708, 1680], [774, 1660], [853, 1684], [863, 1684], [866, 1684], [870, 1684], [906, 1680], [1077, 1664], [1176, 1677], [1186, 1677], [1189, 1677], [1192, 1677], [1196, 1677], [1262, 1651], [1308, 1667], [1311, 1667], [1314, 1667], [1463, 1651], [1769, 1667], [1786, 1667], [1789, 1667], [1812, 1667], [2165, 1644], [2201, 1660], [2221, 1670], [2224, 1670], [2227, 1674], [2415, 1674], [2415, 1670], [2524, 1657], [2606, 1644], [2623, 1654], [2639, 1667], [2643, 1667], [2646, 1667], [2649, 1670], [2804, 1670], [2804, 1667], [2807, 1667], [2811, 1667], [2830, 1651], [2834, 1651], [2873, 1664], [2877, 1664], [2880, 1664], [2883, 1664], [2916, 1651], [2946, 1637], [3071, 1637], [3087, 1647], [3104, 1660], [3107, 1660], [3111, 1660], [3114, 1660], [3190, 1644], [3193, 1581], [3186, 1496], [3051, 1522], [2649, 1506], [2577, 1502], [2573, 1502], [2570, 1502], [2560, 1506], [2514, 1522], [2448, 1509], [2418, 1502], [2415, 1502], [2264, 1512], [1937, 1532], [1802, 1519], [1720, 1512], [1716, 1512], [1664, 1522], [1598, 1535], [1568, 1525], [1539, 1512], [1535, 1512], [1532, 1512], [1529, 1512], [1502, 1525], [1483, 1539], [1377, 1539], [1344, 1529], [1285, 1512], [1281, 1512], [1278, 1512], [1186, 1532], [1143, 1539], [1104, 1532], [1061, 1525], [1057, 1525], [949, 1535], [870, 1542], [860, 1535], [837, 1525], [833, 1525], [830, 1525], [484, 1542], [487, 1604]], [[458, 1733], [458, 1792], [757, 1809], [761, 
1809], [879, 1792], [929, 1786], [982, 1792], [1127, 1809], [1130, 1809], [1367, 1792], [1578, 1805], [1581, 1805], [1585, 1805], [1588, 1805], [1611, 1789], [1614, 1789], [1720, 1805], [1723, 1805], [1726, 1805], [1792, 1789], [1825, 1779], [1927, 1786], [2112, 1802], [2115, 1802], [2250, 1786], [2306, 1779], [2389, 1786], [2580, 1799], [2583, 1799], [2587, 1799], [2636, 1782], [2662, 1776], [2692, 1782], [2755, 1799], [2758, 1799], [2761, 1799], [3190, 1779], [3193, 1720], [3190, 1647], [2923, 1637], [2919, 1637], [2916, 1637], [2890, 1651], [2863, 1664], [2580, 1667], [2494, 1654], [2405, 1641], [2402, 1641], [2399, 1641], [2343, 1657], [2310, 1664], [2237, 1657], [2076, 1641], [2072, 1641], [2069, 1641], [2030, 1660], [2013, 1670], [1865, 1670], [1805, 1664], [1684, 1644], [1680, 1644], [1677, 1644], [1674, 1644], [1641, 1664], [1621, 1677], [1532, 1677], [1515, 1667], [1479, 1644], [1476, 1644], [1473, 1644], [1469, 1644], [1123, 1670], [1107, 1670], [1100, 1670], [968, 1647], [965, 1647], [962, 1647], [899, 1670], [853, 1654], [850, 1654], [846, 1654], [843, 1654], [794, 1674], [794, 1677], [454, 1677], [458, 1733]], [[471, 1865], [471, 1927], [698, 1914], [738, 1927], [757, 1934], [761, 1934], [764, 1934], [767, 1934], [787, 1927], [833, 1911], [955, 1924], [1034, 1934], [1038, 1934], [1041, 1934], [1044, 1934], [1057, 1924], [1080, 1911], [1153, 1911], [1183, 1924], [1212, 1937], [1216, 1937], [1219, 1937], [1328, 1921], [1433, 1908], [1479, 1921], [1522, 1934], [1525, 1934], [1529, 1937], [1651, 1937], [1651, 1934], [1822, 1918], [1950, 1908], [2026, 1918], [2151, 1934], [2155, 1934], [2250, 1914], [2306, 1904], [2343, 1914], [2392, 1931], [2395, 1931], [2399, 1934], [2830, 1934], [2830, 1931], [2926, 1911], [2959, 1904], [3120, 1904], [3127, 1908], [3147, 1924], [3150, 1924], [3153, 1924], [3157, 1924], [3160, 1924], [3196, 1908], [3196, 1855], [3190, 1776], [3068, 1792], [2699, 1779], [2659, 1779], [2656, 1779], [467, 1799], [471, 1865]], [[471, 2076], 
[649, 2056], [1047, 2066], [1146, 2069], [1150, 2069], [1176, 2066], [1252, 2049], [1298, 2063], [1308, 2066], [1311, 2066], [1314, 2066], [1354, 2063], [1496, 2046], [1670, 2056], [1776, 2063], [1779, 2063], [1782, 2063], [1786, 2063], [1796, 2056], [1809, 2046], [1868, 2053], [1918, 2059], [1921, 2059], [1980, 2053], [2046, 2043], [2099, 2049], [2171, 2059], [2175, 2059], [2530, 2043], [2672, 2036], [2715, 2039], [2853, 2053], [2857, 2053], [2860, 2053], [2863, 2053], [2886, 2036], [2896, 2033], [2939, 2036], [3127, 2049], [3130, 2049], [3134, 2049], [3199, 2030], [3199, 1977], [3196, 1918], [2942, 1901], [2939, 1901], [2745, 1918], [2682, 1924], [2643, 1921], [2527, 1904], [2524, 1904], [2521, 1904], [2471, 1921], [2448, 1927], [2260, 1924], [1861, 1911], [1858, 1911], [1660, 1927], [1604, 1931], [1575, 1927], [1489, 1914], [1486, 1914], [1357, 1931], [1311, 1934], [1281, 1931], [1209, 1918], [1206, 1918], [1202, 1918], [1166, 1931], [1146, 1937], [1120, 1931], [1077, 1921], [1074, 1921], [1071, 1921], [1067, 1921], [1047, 1931], [1028, 1941], [985, 1931], [926, 1921], [922, 1921], [919, 1921], [886, 1934], [860, 1941], [810, 1934], [748, 1924], [744, 1924], [741, 1924], [708, 1934], [685, 1941], [471, 1937], [471, 2003], [471, 2076]], [[458, 2125], [458, 2181], [494, 2201], [497, 2201], [500, 2204], [777, 2204], [777, 2201], [781, 2201], [840, 2178], [846, 2175], [853, 2178], [879, 2194], [883, 2194], [886, 2194], [889, 2194], [972, 2178], [988, 2175], [995, 2178], [1018, 2194], [1021, 2194], [1024, 2194], [1028, 2194], [1120, 2178], [1123, 2178], [1127, 2178], [1196, 2198], [1199, 2198], [1202, 2198], [1492, 2178], [1634, 2168], [1654, 2178], [1670, 2184], [1674, 2184], [1677, 2184], [1769, 2178], [1881, 2168], [1914, 2178], [1960, 2191], [1964, 2191], [1967, 2191], [1970, 2191], [2013, 2178], [2056, 2165], [2254, 2165], [2287, 2178], [2310, 2184], [2313, 2184], [2316, 2184], [2320, 2184], [2349, 2178], [2395, 2165], [2438, 2178], [2468, 2184], [2471, 2184], 
[2474, 2188], [2675, 2188], [2675, 2184], [2679, 2184], [2702, 2178], [2745, 2161], [2784, 2178], [2791, 2178], [2794, 2178], [2797, 2178], [2801, 2178], [2804, 2178], [2830, 2158], [2962, 2178], [3005, 2181], [3008, 2181], [3199, 2175], [3203, 2102], [3196, 2023], [3101, 2039], [2326, 2033], [2020, 2033], [2016, 2033], [454, 2059], [458, 2125]], [[467, 2250], [467, 2333], [655, 2333], [655, 2329], [659, 2329], [662, 2329], [685, 2313], [959, 2329], [1005, 2329], [1008, 2329], [1024, 2326], [1123, 2310], [1459, 2326], [1512, 2326], [1515, 2326], [1519, 2326], [1522, 2326], [1558, 2306], [1891, 2323], [1927, 2323], [1931, 2323], [1970, 2323], [2303, 2303], [3190, 2313], [3193, 2234], [3186, 2168], [464, 2184], [467, 2250]], [[474, 2511], [474, 2557], [570, 2580], [573, 2580], [576, 2580], [622, 2557], [629, 2557], [632, 2557], [678, 2580], [682, 2580], [685, 2580], [833, 2557], [850, 2557], [866, 2557], [991, 2580], [995, 2580], [998, 2580], [1001, 2580], [1034, 2557], [1044, 2554], [1054, 2557], [1080, 2570], [1084, 2570], [1087, 2570], [1090, 2570], [1117, 2557], [1127, 2554], [1156, 2557], [1225, 2570], [1229, 2570], [1232, 2570], [1235, 2570], [1249, 2557], [1258, 2554], [1394, 2554], [1397, 2557], [1417, 2573], [1420, 2573], [1423, 2573], [1426, 2573], [1581, 2557], [1670, 2550], [1690, 2557], [1726, 2573], [1730, 2573], [1733, 2573], [1736, 2573], [1769, 2557], [1786, 2550], [1993, 2550], [2010, 2557], [2039, 2573], [2043, 2573], [2046, 2573], [2412, 2570], [2474, 2557], [2534, 2547], [2606, 2557], [2662, 2567], [2666, 2567], [2811, 2557], [3048, 2544], [3071, 2557], [3081, 2567], [3084, 2567], [3087, 2567], [3091, 2567], [3203, 2557], [3206, 2494], [3203, 2425], [3137, 2409], [3134, 2409], [3008, 2425], [2982, 2428], [2966, 2425], [2913, 2415], [2909, 2415], [2530, 2428], [2442, 2432], [2399, 2432], [1888, 2399], [1885, 2399], [1881, 2399], [1766, 2432], [1707, 2415], [1703, 2415], [1700, 2415], [1499, 2435], [1443, 2402], [1440, 2402], [1436, 2402], [1275, 
2402], [1272, 2402], [1268, 2402], [1143, 2442], [1136, 2442], [1133, 2442], [1087, 2428], [1084, 2428], [1080, 2428], [784, 2445], [748, 2445], [744, 2445], [632, 2405], [629, 2405], [626, 2405], [622, 2405], [471, 2445], [474, 2511]], [[484, 2633], [484, 2705], [797, 2685], [1535, 2699], [1539, 2699], [1542, 2699], [1581, 2682], [1657, 2699], [1660, 2699], [1753, 2682], [2178, 2699], [2191, 2699], [2194, 2699], [2198, 2699], [2359, 2679], [2448, 2695], [2451, 2695], [2619, 2679], [3058, 2695], [3081, 2695], [3084, 2695], [3087, 2695], [3167, 2679], [3199, 2695], [3209, 2623], [3203, 2550], [481, 2560], [484, 2633]], [[461, 2751], [461, 2797], [560, 2827], [563, 2827], [883, 2830], [886, 2830], [889, 2830], [932, 2807], [1107, 2830], [1110, 2830], [1113, 2830], [1196, 2797], [1272, 2827], [1275, 2827], [1278, 2827], [1565, 2807], [1644, 2827], [1647, 2827], [1651, 2827], [1654, 2827], [1759, 2797], [1875, 2817], [1878, 2817], [1881, 2817], [1885, 2817], [1914, 2794], [2013, 2820], [2016, 2820], [2020, 2820], [2023, 2820], [2079, 2801], [2132, 2820], [2135, 2820], [2362, 2824], [2366, 2824], [2369, 2824], [2372, 2824], [2409, 2797], [2501, 2814], [2504, 2814], [2507, 2814], [2577, 2791], [2774, 2791], [2827, 2817], [2830, 2817], [2834, 2817], [2919, 2804], [2959, 2820], [2962, 2820], [2966, 2820], [2969, 2820], [3041, 2788], [3203, 2784], [3203, 2738], [3196, 2682], [458, 2689], [461, 2751]], [[467, 2873], [467, 2956], [599, 2956], [599, 2952], [603, 2952], [606, 2952], [639, 2929], [853, 2929], [929, 2949], [942, 2952], [945, 2952], [949, 2952], [952, 2952], [965, 2949], [1028, 2929], [1087, 2949], [1104, 2952], [1107, 2952], [1110, 2952], [1146, 2949], [1252, 2929], [1311, 2946], [1324, 2949], [1328, 2949], [1331, 2949], [1387, 2946], [1558, 2929], [1581, 2942], [1591, 2952], [1595, 2952], [1598, 2952], [1601, 2952], [1677, 2942], [1773, 2929], [1825, 2942], [1875, 2952], [1878, 2952], [1881, 2952], [1911, 2939], [1921, 2936], [1937, 2939], [1987, 2952], [1990, 
2952], [1993, 2956], [2362, 2956], [2362, 2952], [2366, 2952], [2409, 2936], [2442, 2952], [2445, 2952], [2448, 2956], [2801, 2956], [2801, 2952], [2804, 2952], [2853, 2933], [2860, 2929], [3203, 2929], [3203, 2873], [3199, 2794], [3005, 2811], [2702, 2797], [2603, 2794], [2600, 2794], [2596, 2794], [2593, 2794], [2590, 2797], [2560, 2817], [2438, 2801], [2435, 2801], [2432, 2801], [2425, 2801], [2244, 2820], [1904, 2804], [1855, 2804], [1852, 2804], [1848, 2804], [1845, 2804], [1812, 2824], [1624, 2824], [1575, 2807], [1545, 2801], [1542, 2801], [1539, 2801], [1499, 2807], [1446, 2820], [1140, 2814], [827, 2804], [823, 2804], [467, 2817], [467, 2873]], [[487, 3064], [820, 3084], [823, 3084], [929, 3064], [942, 3064], [952, 3064], [1074, 3084], [1077, 3084], [1225, 3064], [1255, 3061], [1275, 3064], [1351, 3081], [1354, 3081], [1443, 3064], [1476, 3081], [1479, 3081], [1483, 3084], [1825, 3084], [1825, 3081], [1829, 3081], [1901, 3061], [1924, 3058], [2053, 3058], [2063, 3061], [2082, 3078], [2086, 3078], [2089, 3078], [2092, 3078], [2095, 3078], [2132, 3061], [2142, 3058], [2188, 3061], [2329, 3078], [2333, 3078], [2336, 3078], [2379, 3061], [2382, 3061], [2412, 3078], [2415, 3078], [2418, 3078], [2544, 3061], [2550, 3061], [2557, 3061], [2629, 3074], [2633, 3074], [2636, 3074], [2675, 3061], [2695, 3054], [2732, 3061], [2811, 3074], [2814, 3074], [3203, 3058], [3206, 3002], [3203, 2936], [3130, 2919], [3127, 2919], [3124, 2919], [3094, 2936], [3074, 2946], [3041, 2936], [3008, 2926], [3005, 2926], [3002, 2926], [2853, 2939], [2715, 2949], [2669, 2939], [2619, 2929], [2616, 2929], [2458, 2942], [2320, 2952], [2254, 2942], [2112, 2923], [2109, 2923], [2105, 2923], [2102, 2923], [2069, 2946], [2056, 2956], [1964, 2956], [1950, 2946], [1911, 2926], [1908, 2926], [1904, 2926], [1901, 2926], [1825, 2946], [1733, 2933], [1730, 2933], [1726, 2933], [1693, 2949], [1680, 2956], [1674, 2949], [1660, 2939], [1657, 2939], [1654, 2939], [1651, 2939], [1143, 2946], [1094, 
2929], [1090, 2929], [1087, 2929], [922, 2952], [889, 2929], [886, 2929], [883, 2929], [771, 2929], [767, 2929], [764, 2929], [649, 2959], [642, 2962], [487, 2962], [487, 3015], [487, 3064]], [[504, 3193], [744, 3209], [748, 3209], [751, 3209], [804, 3193], [830, 3209], [833, 3209], [837, 3209], [840, 3209], [843, 3209], [876, 3193], [926, 3209], [929, 3209], [932, 3209], [935, 3209], [939, 3209], [965, 3190], [972, 3190], [1018, 3190], [1331, 3206], [1334, 3206], [1337, 3206], [1397, 3190], [1407, 3186], [1463, 3186], [1805, 3203], [1809, 3203], [1911, 3183], [1924, 3183], [1967, 3183], [2273, 3199], [2277, 3199], [2280, 3199], [2283, 3199], [2306, 3180], [2310, 3180], [2329, 3196], [2333, 3196], [2336, 3196], [2339, 3199], [2534, 3199], [2534, 3196], [3206, 3173], [3209, 3124], [3206, 3064], [2909, 3048], [2906, 3048], [2784, 3068], [2689, 3084], [2639, 3071], [2610, 3061], [2606, 3061], [2603, 3061], [2385, 3068], [2303, 3051], [2300, 3051], [2211, 3071], [2132, 3054], [2128, 3054], [2036, 3074], [1921, 3054], [1918, 3054], [1799, 3081], [1756, 3094], [1555, 3094], [1542, 3084], [1489, 3061], [1486, 3061], [1483, 3061], [1311, 3084], [1219, 3061], [1216, 3061], [1212, 3061], [1209, 3061], [1206, 3061], [1173, 3081], [1034, 3061], [1031, 3061], [883, 3094], [860, 3097], [837, 3094], [692, 3071], [688, 3071], [685, 3071], [682, 3071], [645, 3097], [642, 3101], [504, 3101], [504, 3143], [504, 3193]], [[504, 3341], [632, 3338], [636, 3338], [688, 3312], [708, 3328], [711, 3328], [715, 3328], [718, 3328], [945, 3312], [1008, 3338], [1011, 3338], [1015, 3338], [1021, 3338], [1229, 3308], [1311, 3331], [1314, 3331], [1318, 3331], [1321, 3331], [1370, 3308], [1473, 3308], [1555, 3335], [1558, 3335], [1562, 3335], [1565, 3335], [1568, 3335], [1571, 3335], [1601, 3312], [1647, 3325], [1651, 3325], [1654, 3325], [1657, 3325], [1660, 3325], [1697, 3305], [1927, 3312], [1950, 3331], [1954, 3331], [1957, 3331], [1960, 3331], [1974, 3331], [2125, 3302], [2181, 3328], [2184, 
3328], [2188, 3331], [2514, 3331], [2514, 3328], [2517, 3328], [2527, 3325], [2596, 3302], [2636, 3325], [2639, 3328], [2643, 3328], [2646, 3328], [2649, 3328], [2817, 3325], [2820, 3325], [2824, 3325], [2824, 3321], [2863, 3298], [2946, 3298], [2979, 3321], [2982, 3325], [2985, 3325], [2989, 3325], [2992, 3325], [2995, 3325], [3005, 3321], [3045, 3308], [3068, 3321], [3071, 3325], [3074, 3325], [3078, 3325], [3081, 3325], [3203, 3318], [3206, 3256], [3203, 3186], [2995, 3180], [2992, 3180], [2989, 3180], [2972, 3186], [2939, 3206], [2834, 3206], [2781, 3190], [2741, 3180], [2738, 3180], [2735, 3180], [2669, 3190], [2587, 3206], [2471, 3193], [2395, 3186], [2392, 3186], [2224, 3196], [2023, 3209], [1990, 3199], [1950, 3186], [1947, 3186], [1944, 3186], [1941, 3186], [1904, 3199], [1888, 3206], [1868, 3199], [1819, 3186], [1815, 3186], [1604, 3186], [1601, 3186], [1519, 3203], [1476, 3190], [1473, 3190], [1469, 3190], [1466, 3190], [1463, 3190], [1440, 3206], [1426, 3213], [1331, 3206], [1044, 3190], [1041, 3190], [1038, 3190], [952, 3213], [945, 3213], [926, 3213], [701, 3196], [698, 3196], [695, 3196], [655, 3216], [652, 3219], [504, 3219], [504, 3272], [504, 3341]], [[454, 3391], [454, 3437], [771, 3460], [774, 3460], [883, 3440], [965, 3440], [995, 3460], [998, 3460], [1001, 3460], [1005, 3460], [1245, 3443], [1410, 3460], [1413, 3460], [1532, 3440], [1657, 3460], [1660, 3460], [1664, 3460], [1730, 3440], [1782, 3460], [1786, 3460], [1789, 3460], [2056, 3440], [2148, 3460], [2151, 3463], [2395, 3463], [2395, 3460], [2399, 3460], [2402, 3460], [2432, 3440], [2458, 3460], [2461, 3460], [2465, 3460], [2468, 3460], [2471, 3460], [2530, 3440], [2563, 3460], [2567, 3460], [2570, 3460], [2573, 3463], [2702, 3463], [2702, 3460], [2705, 3460], [2708, 3460], [2741, 3440], [3094, 3440], [3134, 3460], [3137, 3460], [3140, 3460], [3143, 3460], [3209, 3437], [3209, 3391], [3209, 3328], [3117, 3312], [3114, 3312], [3111, 3312], [3054, 3328], [3051, 3328], [3031, 3312], [3028, 
3312], [3025, 3312], [3022, 3312], [2913, 3328], [2860, 3335], [2768, 3328], [2633, 3318], [2629, 3318], [2626, 3318], [2590, 3328], [2554, 3338], [2333, 3328], [2115, 3321], [2112, 3321], [2030, 3331], [1967, 3338], [1944, 3331], [1885, 3312], [1881, 3312], [1878, 3312], [1684, 3328], [1657, 3312], [1654, 3312], [1651, 3312], [1647, 3312], [1644, 3312], [1641, 3312], [1614, 3331], [1604, 3338], [1430, 3331], [1087, 3321], [1084, 3321], [1080, 3321], [1077, 3321], [1057, 3335], [1054, 3338], [1047, 3335], [1015, 3312], [1011, 3312], [1008, 3312], [870, 3312], [866, 3312], [754, 3335], [751, 3338], [454, 3338], [454, 3391]], [[461, 3522], [461, 3575], [583, 3592], [586, 3592], [589, 3592], [593, 3592], [612, 3578], [626, 3572], [741, 3572], [774, 3578], [840, 3592], [843, 3592], [846, 3592], [873, 3578], [883, 3572], [896, 3578], [926, 3595], [929, 3595], [932, 3595], [1005, 3578], [1038, 3572], [1064, 3578], [1120, 3595], [1123, 3595], [1127, 3595], [1130, 3595], [1166, 3578], [1186, 3569], [1545, 3569], [1628, 3582], [1700, 3592], [1703, 3592], [1707, 3592], [1730, 3582], [1756, 3569], [1881, 3582], [1950, 3588], [1954, 3588], [1957, 3588], [1960, 3588], [1967, 3582], [1990, 3565], [2013, 3582], [2020, 3588], [2023, 3588], [2026, 3588], [2030, 3592], [2306, 3592], [2306, 3588], [2310, 3588], [2320, 3585], [2356, 3565], [2455, 3585], [2484, 3588], [2488, 3588], [2491, 3588], [2501, 3585], [2537, 3569], [2807, 3585], [2827, 3585], [2830, 3585], [2834, 3585], [2837, 3585], [2873, 3562], [3203, 3585], [3209, 3509], [3203, 3447], [2422, 3463], [2250, 3447], [2181, 3440], [2178, 3440], [2175, 3440], [2158, 3447], [2122, 3460], [2086, 3447], [2063, 3437], [2059, 3437], [2056, 3437], [2053, 3437], [2030, 3447], [2003, 3457], [1796, 3447], [1647, 3440], [1644, 3440], [1641, 3440], [1621, 3447], [1585, 3460], [1522, 3447], [1483, 3440], [1479, 3440], [1476, 3440], [1450, 3447], [1384, 3466], [1334, 3447], [1331, 3447], [1328, 3447], [1324, 3447], [1308, 3447], [1150, 3466], 
[1120, 3447], [1113, 3443], [1110, 3443], [1107, 3443], [893, 3443], [889, 3443], [886, 3443], [870, 3447], [790, 3470], [741, 3447], [738, 3447], [734, 3447], [731, 3447], [728, 3447], [725, 3447], [695, 3470], [596, 3447], [566, 3443], [563, 3443], [458, 3447], [461, 3522]], [[454, 3654], [454, 3691], [530, 3714], [533, 3714], [537, 3714], [649, 3700], [701, 3723], [705, 3723], [708, 3723], [912, 3691], [1351, 3691], [1608, 3717], [1611, 3717], [1614, 3717], [1693, 3697], [1815, 3714], [1819, 3714], [1822, 3714], [1825, 3714], [1852, 3691], [1904, 3714], [1908, 3714], [1911, 3714], [2366, 3691], [2432, 3710], [2435, 3710], [2438, 3714], [2745, 3714], [2745, 3710], [2880, 3687], [3028, 3707], [3031, 3707], [3209, 3671], [3209, 3634], [3206, 3575], [2850, 3565], [2847, 3565], [2778, 3578], [2722, 3588], [2639, 3578], [2530, 3569], [2527, 3569], [2524, 3569], [2521, 3569], [2504, 3582], [2471, 3605], [2366, 3605], [2316, 3585], [2300, 3578], [2297, 3578], [2293, 3578], [2250, 3585], [2201, 3592], [2155, 3585], [2016, 3562], [2013, 3562], [2010, 3562], [1957, 3585], [1878, 3562], [1875, 3562], [1871, 3562], [1664, 3585], [1479, 3565], [1476, 3565], [1314, 3595], [1219, 3615], [655, 3615], [616, 3602], [530, 3572], [527, 3572], [523, 3572], [520, 3572], [451, 3602], [454, 3654]], [[461, 3845], [767, 3832], [856, 3845], [929, 3855], [932, 3855], [935, 3855], [975, 3845], [1024, 3832], [1077, 3845], [1133, 3855], [1136, 3859], [1509, 3859], [1509, 3855], [1585, 3842], [1621, 3836], [1654, 3842], [1720, 3855], [1723, 3855], [1802, 3842], [1868, 3829], [2043, 3829], [2059, 3839], [2086, 3852], [2089, 3852], [2092, 3852], [2194, 3839], [2227, 3836], [2254, 3839], [2316, 3849], [2320, 3849], [2323, 3849], [2326, 3849], [2339, 3839], [2349, 3832], [2359, 3839], [2392, 3852], [2395, 3852], [2399, 3852], [2504, 3839], [2527, 3836], [2560, 3836], [2748, 3852], [2751, 3852], [2755, 3852], [2758, 3852], [2781, 3836], [2791, 3832], [3206, 3832], [3213, 3770], [3209, 3704], [2877, 
3687], [2873, 3687], [2870, 3687], [2814, 3704], [2791, 3710], [2600, 3704], [2240, 3694], [2237, 3694], [2234, 3694], [2198, 3707], [2184, 3710], [2132, 3707], [1878, 3687], [1875, 3687], [1871, 3687], [1829, 3710], [1822, 3714], [1423, 3714], [1417, 3710], [1367, 3687], [1364, 3687], [1067, 3687], [1064, 3687], [1061, 3687], [1057, 3687], [1031, 3704], [800, 3687], [797, 3687], [794, 3687], [790, 3687], [757, 3710], [458, 3714], [461, 3776]], [[469, 3964], [509, 3994], [514, 3994], [519, 3994], [523, 3994], [528, 3994], [617, 3964], [647, 3989], [647, 3994], [652, 3994], [657, 3994], [662, 3999], [795, 3999], [795, 3994], [800, 3994], [874, 3959], [1013, 3994], [1018, 3994], [1023, 3994], [1230, 3959], [1240, 3959], [1245, 3959], [1418, 3984], [1423, 3984], [1571, 3959], [1626, 3994], [1631, 3994], [1636, 3994], [1641, 3994], [1833, 3959], [1873, 3984], [1878, 3984], [1883, 3984], [1888, 3984], [2150, 3959], [2234, 3994], [2239, 3994], [2244, 3994], [2249, 3994], [2372, 3959], [2407, 3989], [2407, 3994], [2412, 3994], [2417, 3994], [2422, 3999], [2792, 3999], [2792, 3994], [3193, 3954], [3198, 3900], [3188, 3821], [464, 3831], [469, 3910]], [[458, 4109], [688, 4119], [692, 4119], [695, 4119], [715, 4109], [751, 4093], [840, 4093], [932, 4109], [975, 4116], [978, 4116], [982, 4116], [1005, 4109], [1074, 4086], [1377, 4086], [1420, 4106], [1430, 4112], [1433, 4112], [1436, 4112], [1591, 4106], [1716, 4099], [1720, 4099], [1723, 4099], [1746, 4083], [1802, 4099], [1805, 4099], [1809, 4099], [1997, 4083], [2076, 4102], [2079, 4102], [2082, 4102], [2086, 4102], [2142, 4086], [2382, 4102], [2425, 4106], [2428, 4106], [2432, 4106], [2442, 4102], [2530, 4076], [2646, 4102], [2652, 4102], [2656, 4102], [2669, 4102], [2962, 4079], [3005, 4099], [3008, 4099], [3012, 4102], [3219, 4102], [3223, 4027], [3213, 3938], [3107, 3971], [3022, 3948], [3018, 3948], [3015, 3948], [2685, 3974], [2613, 3944], [2610, 3944], [2606, 3944], [2445, 3944], [2442, 3944], [2438, 3944], [2435, 
3944], [2432, 3948], [2399, 3971], [2329, 3948], [2323, 3948], [2320, 3948], [2316, 3948], [2303, 3951], [2204, 3974], [2145, 3951], [2132, 3948], [2128, 3948], [2125, 3948], [1740, 3954], [1710, 3957], [1463, 4004], [1242, 4007], [662, 4007], [606, 3974], [603, 3971], [599, 3971], [596, 3971], [593, 3971], [454, 3974], [458, 4050]], [[491, 4218], [896, 4244], [899, 4244], [902, 4244], [959, 4224], [1090, 4241], [1094, 4241], [1097, 4241], [1186, 4218], [1249, 4241], [1252, 4241], [1255, 4244], [1397, 4244], [1397, 4241], [1539, 4221], [1680, 4238], [1684, 4238], [1687, 4238], [1733, 4215], [1736, 4215], [1746, 4215], [1871, 4231], [1875, 4231], [1878, 4231], [1911, 4215], [1918, 4211], [2030, 4215], [2353, 4224], [2356, 4224], [2448, 4211], [2488, 4208], [2511, 4211], [2590, 4231], [2593, 4231], [3216, 4208], [3219, 4155], [3213, 4089], [3002, 4099], [2966, 4093], [2880, 4070], [2877, 4070], [2873, 4070], [2718, 4096], [2652, 4106], [2629, 4096], [2596, 4083], [2593, 4083], [2590, 4083], [2587, 4083], [2544, 4099], [2494, 4119], [2119, 4119], [2092, 4106], [2043, 4076], [2039, 4076], [2036, 4076], [2033, 4076], [1614, 4109], [1235, 4083], [1232, 4083], [1229, 4083], [1140, 4116], [1054, 4083], [1051, 4083], [1047, 4083], [1044, 4083], [998, 4102], [965, 4089], [962, 4089], [959, 4089], [817, 4109], [721, 4086], [718, 4086], [715, 4086], [711, 4086], [603, 4129], [599, 4132], [491, 4132], [491, 4175], [491, 4218]], [[467, 4297], [467, 4373], [537, 4389], [540, 4389], [603, 4376], [672, 4360], [698, 4376], [718, 4389], [721, 4389], [725, 4389], [728, 4389], [787, 4376], [814, 4373], [830, 4379], [879, 4396], [883, 4396], [886, 4396], [889, 4396], [926, 4379], [935, 4376], [998, 4379], [1169, 4392], [1173, 4392], [1288, 4383], [1344, 4379], [1357, 4383], [1407, 4402], [1410, 4402], [1413, 4402], [1417, 4402], [1466, 4386], [1502, 4373], [1525, 4386], [1545, 4399], [1548, 4399], [1552, 4399], [1703, 4412], [1707, 4412], [1710, 4412], [1786, 4389], [1894, 4353], [1987, 
4389], [2016, 4402], [2020, 4402], [2023, 4402], [2026, 4402], [2066, 4392], [2168, 4363], [2303, 4386], [2306, 4386], [2554, 4350], [2827, 4350], [2929, 4402], [2939, 4406], [2942, 4406], [2946, 4406], [3216, 4402], [3216, 4284], [3209, 4205], [2517, 4251], [2438, 4251], [2372, 4215], [2369, 4215], [2366, 4215], [2362, 4215], [2303, 4231], [2155, 4221], [2151, 4221], [2148, 4221], [2082, 4247], [1700, 4228], [1697, 4228], [1571, 4251], [1506, 4221], [1502, 4221], [1499, 4221], [1496, 4221], [1394, 4254], [1265, 4254], [1186, 4218], [1176, 4215], [1173, 4215], [1169, 4215], [1136, 4218], [995, 4238], [949, 4218], [942, 4218], [939, 4218], [935, 4218], [464, 4221], [467, 4297]]]
        }
    },
    "136_7aab7_default.jpg": = {
...

It gives me this error:

(doctr-py3.10) incognito@DESKTOP-H1BS9PO:~/doctr-py3.10$ python doctr/references/detection/train_pytorch.py datasets/sam/train_out datasets/sam/val_out db_resnet50 --epochs 5 --device 0
Namespace(train_path='datasets/sam/train_out', val_path='datasets/sam/val_out', arch='db_resnet50', name=None, epochs=5, batch_size=2, device=0, save_interval_epoch=False, input_size=1024, lr=0.001, weight_decay=0, workers=None, resume=None, test_only=False, freeze_backbone=False, show_samples=False, wb=False, push_to_hub=False, pretrained=False, rotation=False, eval_straight=False, sched='poly', amp=False, find_lr=False, early_stop=False, early_stop_epochs=5, early_stop_delta=0.01)
Traceback (most recent call last):
  File "/home/incognito/doctr-py3.10/doctr/references/detection/train_pytorch.py", line 481, in <module>
    main(args)
  File "/home/incognito/doctr-py3.10/doctr/references/detection/train_pytorch.py", line 182, in main
    val_set = DetectionDataset(
  File "/home/incognito/doctr-py3.10/doctr/doctr/datasets/detection.py", line 54, in __init__
    labels = json.load(f)
  File "/usr/lib/python3.10/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/usr/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.10/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 2 column 30 (char 31)

Any idea?
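
For reference, the JSONDecodeError in the traceback is raised by Python's json module inside json.load, before doctr inspects any label. The snippet above is not valid JSON: a `=` follows the `:` after the file name, some strings use single quotes, and `(640, 480)` is a Python tuple rather than a JSON array. Below is a minimal sketch, assuming a made-up file name and hash, that reproduces the parse failure and then builds the same structure as Python objects so that `json.dumps` produces valid syntax by construction:

```python
import json

# Hand-written text in the style of the snippet: '=' after the key is not JSON.
broken = '{\n    "sample.jpg": = {"img_dimensions": [640, 480]}\n}'

try:
    json.loads(broken)
except json.JSONDecodeError as err:
    # The decoder reports the position where it expected a value.
    print(f"invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}")

# Build the labels as Python objects and let json serialize them.
# "sample.jpg" and the hash are placeholders, not values from the dataset.
labels = {
    "sample.jpg": {
        "img_dimensions": (640, 480),  # tuples are written as JSON arrays
        "img_hash": "placeholder-hash",
        "polygons": [[[10, 10], [200, 10], [200, 50], [10, 50]]],
    }
}
text = json.dumps(labels, indent=4)

# Round trip: the generated text parses back without errors.
parsed = json.loads(text)
```

Writing the file with `json.dump(labels, f)` instead of assembling the text by hand avoids this class of error entirely.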

Answered by felixdittrich92

Sep 30, 2024

Hi @johnlockejrr 👋,

The values are already absolute, so that's fine 👍
But what you have are multi-point polygons, and doctr requires a 4-point polygon for each label :)
So all you have to do is extract the top-left, top-right, bottom-right and bottom-left points from each polygon label to get

[[x1, y1], [x2, y2], [x3, y3], [x4, y4]]

Then the polygons key in labels.json contains all the annotations for one image:

{
    "sample_img_01.png" = {
        'img_dimensions': (900, 600),
        'img_hash': "theimagedumpmyhash",
        'polygons': [
             [[x1, y1], [x2, y2], [x3, y3], [x4, y4]],
             [[x1, y1], [x2, y2], [x3, y3], [x4, y4]],
            .....
        ]
     …
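
One way to get four corner points from a many-point outline is to take its axis-aligned bounding box. The sketch below assumes that this simplification is acceptable for the data; `polygon_to_quad` is a hypothetical helper, not part of doctr. For rotated or strongly curved lines a minimum-area rectangle (for example via OpenCV's `cv2.minAreaRect` and `cv2.boxPoints`) may fit better.

```python
from typing import List, Sequence

Point = List[int]


def polygon_to_quad(points: Sequence[Sequence[int]]) -> List[Point]:
    """Return the axis-aligned bounding box of a polygon as 4 corner points.

    Order: top-left, top-right, bottom-right, bottom-left, in image
    coordinates where y grows downwards. Any tilt of the line is lost.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    return [[x_min, y_min], [x_max, y_min], [x_max, y_max], [x_min, y_max]]


# The "textzone" polygon from the question's labels.json.
textzone = [
    [2161, 949], [3196, 962], [3249, 1054], [3262, 4333], [3091, 4386],
    [461, 4386], [421, 4267], [448, 2346], [408, 2211], [448, 1015],
    [2161, 949],
]
quad = polygon_to_quad(textzone)
```

Applying the helper to every polygon before writing labels.json gives each entry the `[[x1, y1], [x2, y2], [x3, y3], [x4, y4]]` shape described above.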


johnlockejrr · 2024-09-29T19:40:37Z


It seems I had formatted labels.json badly.
That is fixed now, but I get another error:

(doctr-py3.10) incognito@DESKTOP-H1BS9PO:~/doctr-py3.10$ python doctr/references/detection/train_pytorch.py datasets/sam/train_out datasets/sam/val_out db_resnet50 --epochs 5 --device 0
Namespace(train_path='datasets/sam/train_out', val_path='datasets/sam/val_out', arch='db_resnet50', name=None, epochs=5, batch_size=2, device=0, save_interval_epoch=False, input_size=1024, lr=0.001, weight_decay=0, workers=None, resume=None, test_only=False, freeze_backbone=False, show_samples=False, wb=False, push_to_hub=False, pretrained=False, rotation=False, eval_straight=False, sched='poly', amp=False, find_lr=False, early_stop=False, early_stop_epochs=5, early_stop_delta=0.01)
Traceback (most recent call last):
  File "/home/incognito/doctr-py3.10/doctr/references/detection/train_pytorch.py", line 481, in <module>
    main(args)
  File "/home/incognito/doctr-py3.10/doctr/references/detection/train_pytorch.py", line 182, in main
    val_set = DetectionDataset(
  File "/home/incognito/doctr-py3.10/doctr/doctr/datasets/detection.py", line 63, in __init__
    geoms, polygons_classes = self.format_polygons(label["polygons"], use_polygons, np_dtype)
  File "/home/incognito/doctr-py3.10/doctr/doctr/datasets/detection.py", line 90, in format_polygons
    _polygons = np.concatenate([np.asarray(poly, dtype=np_dtype) for poly in polygons.values() if poly], axis=0)
  File "/home/incognito/doctr-py3.10/doctr/doctr/datasets/detection.py", line 90, in <listcomp>
    _polygons = np.concatenate([np.asarray(poly, dtype=np_dtype) for poly in polygons.values() if poly], axis=0)
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (25,) + inhomogeneous part.
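
The error above can be reproduced in isolation. In this sketch (made-up coordinates), two polygons with different numbers of points are passed to `np.asarray` with a numeric dtype, as in the list comprehension from the traceback. NumPy cannot stack them into one rectangular array, whereas polygons that all have four points convert cleanly.

```python
import numpy as np

# Two polygons with 4 and 5 points: the nested list is ragged.
ragged = [
    [[0, 0], [10, 0], [10, 5], [0, 5]],
    [[0, 0], [10, 0], [12, 3], [10, 5], [0, 5]],
]
try:
    np.asarray(ragged, dtype=np.float32)
    ragged_ok = True
except ValueError as exc:
    ragged_ok = False
    print("ragged polygons:", exc)

# Two polygons with 4 points each: a regular (N, 4, 2) array.
uniform = [
    [[0, 0], [10, 0], [10, 5], [0, 5]],
    [[0, 8], [10, 8], [10, 13], [0, 13]],
]
arr = np.asarray(uniform, dtype=np.float32)
print(arr.shape)  # (2, 4, 2)
```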

My original dataset is in PAGE-XML format, which stores each polygon as a sequence of x,y coordinate pairs.

    <TextRegion id="eSc_textblock_c1152229"  custom="structure {type:Main;}">
      <Coords points="1215,969 3268,1010 3309,1119 3322,3835 3322,4501 3281,4597 1338,4624 464,4583 491,1051 519,996 1215,969"/>


      <TextLine id="eSc_line_a83ee08e" custom="structure {type:default;}">
        <Coords points="549,1129 546,1180 566,1197 570,1197 573,1197 905,1201 908,1201 1048,1184 1120,1177 1133,1187 1157,1204 1161,1204 1164,1204 1167,1204 1249,1187 1280,1184 1318,1191 1410,1204 1413,1204 1478,1191 1540,1180 1584,1194 1625,1208 1629,1208 1632,1208 1745,1197 1878,1184 1912,1201 1936,1211 1939,1211 1943,1211 1987,1201 2045,1187 2090,1204 2103,1208 2107,1208 2110,1208 2113,1208 2131,1204 2192,1187 2650,1187 2718,1214 2721,1214 2725,1214 2728,1214 2738,1214 2892,1191 2950,1218 2960,1221 2964,1221 2967,1225 3251,1225 3258,1156 3254,1085 2875,1081 2715,1054 2711,1054 2708,1054 2633,1075 2592,1054 2588,1054 2585,1054 2411,1064 2291,1023 2288,1023 2182,1020 2178,1020 1922,1057 1806,1034 1803,1034 1488,1054 1420,1013 1417,1013 1413,1013 1191,1013 1188,1013 1185,1013 1048,1054 905,1034 901,1034 898,1034 823,1061 549,1061 549,1129"/>
        <Baseline points="549,1129 1973,1129 3258,1156"/>

I wrote some code to extract the polygons and convert them to docTR labels. Should I do anything else to the original polygons?
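
For reference, a minimal sketch of the extraction step, assuming the PAGE-XML layout shown above (`TextLine` elements containing a `Coords` element with a `points` attribute). The function name `textline_points` and the sample document, including its namespace URI, are hypothetical. The `{*}` wildcard matches the tag in any namespace and needs Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; real PAGE-XML files declare their own namespace.
SAMPLE = """<PcGts xmlns="http://example.org/hypothetical-page-namespace">
  <Page>
    <TextRegion id="r1">
      <Coords points="0,0 100,0 100,80 0,80"/>
      <TextLine id="l1">
        <Coords points="5,10 95,12 96,30 50,33 4,29"/>
        <Baseline points="5,28 95,28"/>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>"""


def textline_points(page_xml):
    """Map each TextLine id to its list of (x, y) integer points."""
    root = ET.fromstring(page_xml)
    lines = {}
    for line in root.iterfind(".//{*}TextLine"):
        # Direct child only, so the region outline and Baseline are skipped.
        coords = line.find("{*}Coords")
        if coords is None:
            continue
        lines[line.get("id")] = [
            tuple(int(v) for v in pair.split(","))
            for pair in coords.get("points", "").split()
        ]
    return lines


parsed = textline_points(SAMPLE)
print(parsed)
```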

0 replies

johnlockejrr · 2024-09-29T23:10:33Z

johnlockejrr
Sep 29, 2024
Author

This is how I "convert" my tuples to doctr polygons. Is that wrong?

import numpy as np

def to_absolute(width, height, label):
    polygons = np.empty(label['boxes'].shape, dtype=np.int32)
    polygons[..., 0] = label['boxes'][..., 0] * width
    polygons[..., 1] = label['boxes'][..., 1] * height
    return polygons

def convert_to_doctr_polygons(polygon_str, imageWidth, imageHeight):
    # Step 1: Parse the input string into a list of tuples
    points = polygon_str.strip().split()
    # Step 2: Convert to tuples of integers
    points = [tuple(map(int, point.split(','))) for point in points]

    # Step 3: Normalize the coordinates
    normalized_points = [(x / imageWidth, y / imageHeight) for x, y in points]

    # Prepare the label structure expected by `to_absolute`
    label = {'boxes': np.array(normalized_points, dtype=np.float32)}  # Use float32 for normalized points

    # Step 4: Convert normalized points back to absolute pixel coordinates
    absolute_polygons = to_absolute(imageWidth, imageHeight, label)

    # Format the output for DocTR
    doctr_polygons = [absolute_polygons.tolist()]  # Wrap it for a single polygon

    return doctr_polygons

# Example usage
polygon_str = "1215,969 3268,1010 3309,1119 3322,3835 3322,4501 3281,4597 1338,4624 464,4583 491,1051 519,996 1215,969"
imageWidth = 4607
imageHeight = 6143

# Convert the string to DocTR polygon format with absolute coordinates
result = convert_to_doctr_polygons(polygon_str, imageWidth, imageHeight)
print(result)
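
A note on the round trip above: dividing by the image size and multiplying back is not needed when the PAGE-XML values are already pixel coordinates, and because assigning floats into an `int32` array truncates, a `float32` round trip may shift a coordinate by one pixel. A minimal sketch, with the hypothetical helper `parse_points`, that keeps the integers untouched and, for comparison, shows that rounding instead of truncating recovers them:

```python
import numpy as np


def parse_points(polygon_str):
    """Parse 'x,y x,y ...' into an (N, 2) int32 array of pixel coordinates."""
    return np.array(
        [[int(v) for v in pair.split(",")] for pair in polygon_str.split()],
        dtype=np.int32,
    )


polygon_str = "1215,969 3268,1010 3309,1119 3322,3835"
width, height = 4607, 6143

# Direct parse: no normalization, no precision loss.
direct = parse_points(polygon_str)

# Same round trip as in the post, but rounded to the nearest integer.
size = np.array([width, height], dtype=np.float32)
normalized = direct.astype(np.float32) / size
restored = np.rint(normalized * size).astype(np.int32)

print(direct.tolist())
print(np.array_equal(direct, restored))
```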

4 replies

felixdittrich92 Sep 30, 2024
Maintainer

Hi @johnlockejrr 👋,

The values are already absolute so that's fine 👍
But what you have is a multi-point polygon, and doctr requires a 4-point polygon as the label :)
So all you have to do is extract the top-left, top-right, bottom-right and bottom-left points from each polygon label to get

[[x1, y1], [x2, y2], [x3, y3], [x4, y4]]

Then the polygons key in the labels.json contains all the annotations for 1 image:

{
    "sample_img_01.png": {
        "img_dimensions": [900, 600],
        "img_hash": "theimagedumpmyhash",
        "polygons": [
             [[x1, y1], [x2, y2], [x3, y3], [x4, y4]],
             [[x1, y1], [x2, y2], [x3, y3], [x4, y4]],
            .....
        ]
     },
     "sample_img_02.png": {
        "img_dimensions": [900, 600],
        "img_hash": "thisisahash",
        "polygons": [
             [[x1, y1], [x2, y2], [x3, y3], [x4, y4]],
             [[x1, y1], [x2, y2], [x3, y3], [x4, y4]],
             [[x1, y1], [x2, y2], [x3, y3], [x4, y4]],
            .....
        ]
     }
     ...
}

You can find datasets for reference here: #1654

Hope this helps

Best regards,
Felix
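
A small check along these lines can catch polygons that are not four-point before training starts. This is a sketch, not part of docTR: `find_bad_polygons` is a hypothetical helper, and it accepts `polygons` either as a list or as a dict of class name to list, since the traceback above shows the loader calling `polygons.values()`.

```python
def find_bad_polygons(labels):
    """Return (image, class, index, n_points) for every non-4-point polygon."""
    problems = []
    for img_name, entry in labels.items():
        polygons = entry["polygons"]
        # Treat a plain list as a single unnamed class.
        groups = polygons if isinstance(polygons, dict) else {"": polygons}
        for class_name, polys in groups.items():
            for idx, poly in enumerate(polys):
                ok = len(poly) == 4 and all(len(pt) == 2 for pt in poly)
                if not ok:
                    problems.append((img_name, class_name, idx, len(poly)))
    return problems


# Made-up labels: one valid entry, one with a 5-point polygon.
labels = {
    "a.png": {"polygons": [[[0, 0], [9, 0], [9, 4], [0, 4]]]},
    "b.png": {"polygons": {"textline": [[[0, 0], [9, 0], [11, 2], [9, 4], [0, 4]]]}},
}
problems = find_bad_polygons(labels)
print(problems)  # [('b.png', 'textline', 0, 5)]
```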

Answer selected by johnlockejrr

johnlockejrr Sep 30, 2024
Author

@felixdittrich92 thank you for your reply! I wasn't aware of that! So my code below should be ok?

def convert_to_four_point_polygon(polygon_string):
    # Step 1: Parse the string into a list of tuples
    points = [tuple(map(int, point.split(','))) for point in polygon_string.split()]

    # Step 2: Initialize extreme points
    top_left = (float('inf'), float('inf'))
    top_right = (float('-inf'), float('inf'))
    bottom_left = (float('inf'), float('-inf'))
    bottom_right = (float('-inf'), float('-inf'))

    # Step 3: Find the extreme points
    for x, y in points:
        if y < top_left[1] or (y == top_left[1] and x < top_left[0]):
            top_left = (x, y)
        if y < top_right[1] or (y == top_right[1] and x > top_right[0]):
            top_right = (x, y)
        if y > bottom_left[1] or (y == bottom_left[1] and x < bottom_left[0]):
            bottom_left = (x, y)
        if y > bottom_right[1] or (y == bottom_right[1] and x > bottom_right[0]):
            bottom_right = (x, y)

    # Step 4: Return the list of four corner points
    return [list(top_left), list(top_right), list(bottom_right), list(bottom_left)]

# Example usage
polygon_str = "1215,969 3268,1010 3309,1119 3322,3835 3322,4501 3281,4597 1338,4624 464,4583 491,1051 519,996 1215,969"
four_point_polygon = convert_to_four_point_polygon(polygon_str)
print(four_point_polygon)

Output:

[[1215, 969], [1215, 969], [1338, 4624], [1338, 4624]]

felixdittrich92 Sep 30, 2024
Maintainer

That doesn't look correct.

def convert_to_four_point_polygon(polygon_string):
    # Parse "x,y x,y ..." into a list of (x, y) integer tuples
    points = [tuple(map(int, point.split(','))) for point in polygon_string.split()]
    print(points)

    # Extents of the axis-aligned bounding box around the polygon
    min_x = min(points, key=lambda p: p[0])[0]
    max_x = max(points, key=lambda p: p[0])[0]
    min_y = min(points, key=lambda p: p[1])[1]
    max_y = max(points, key=lambda p: p[1])[1]

    # For each bounding-box corner, pick the polygon vertex closest to it
    # (Manhattan distance)
    top_left = min(points, key=lambda p: (p[0] - min_x) + (p[1] - min_y))
    top_right = min(points, key=lambda p: (max_x - p[0]) + (p[1] - min_y))
    bottom_left = min(points, key=lambda p: (p[0] - min_x) + (max_y - p[1]))
    bottom_right = min(points, key=lambda p: (max_x - p[0]) + (max_y - p[1]))

    # Clockwise order, starting at the top-left
    return [list(top_left), list(top_right), list(bottom_right), list(bottom_left)]

# Example usage
polygon_str = "1215,969 3268,1010 3309,1119 3322,3835 3322,4501 3281,4597 1338,4624 464,4583 491,1051 519,996 1215,969"
four_point_polygon = convert_to_four_point_polygon(polygon_str)
print(four_point_polygon)

To evaluate you can display the boxes on the corresponding image :)
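
One way to do that check, sketched with Pillow (an assumption; any drawing library would do). The helper name `draw_boxes` is hypothetical, and a blank in-memory image stands in for the real page.

```python
from PIL import Image, ImageDraw


def draw_boxes(image, boxes, color=(255, 0, 0), width=3):
    """Return a copy of `image` with each 4-point box outlined."""
    canvas = image.convert("RGB")  # convert() returns a new image
    draw = ImageDraw.Draw(canvas)
    for box in boxes:
        pts = [tuple(pt) for pt in box]
        # Repeat the first point to close the outline.
        draw.line(pts + [pts[0]], fill=color, width=width)
    return canvas


# Blank stand-in page and one made-up box.
page = Image.new("RGB", (400, 300), "white")
boxes = [[[50, 40], [350, 40], [350, 120], [50, 120]]]
preview = draw_boxes(page, boxes)
```

With a real page, open it with `Image.open(path)` in place of `Image.new(...)`, then inspect the result with `preview.show()` or `preview.save(...)`.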

johnlockejrr Sep 30, 2024
Author

Ok, I'll try that

johnlockejrr · 2024-09-30T08:50:24Z

johnlockejrr
Sep 30, 2024
Author

I think you were right. The code below should do it:

def calculate_four_points(coordinate_string, image_width, image_height):
    # Split the string into individual coordinate pairs
    coordinates = [tuple(map(int, point.split(','))) for point in coordinate_string.split()]

    # Extract x and y coordinates
    xs = [x for x, y in coordinates]
    ys = [y for x, y in coordinates]

    # Calculate min and max for x and y
    min_x = min(xs)
    max_x = max(xs)
    min_y = min(ys)
    max_y = max(ys)

    # Define the four points using the bounding box
    top_left = (min_x, min_y)
    top_right = (max_x, min_y)
    bottom_left = (min_x, max_y)
    bottom_right = (max_x, max_y)

    # Ensure the points are within image boundaries
    top_left = (max(0, top_left[0]), max(0, top_left[1]))
    top_right = (min(image_width, top_right[0]), max(0, top_right[1]))
    bottom_left = (max(0, bottom_left[0]), min(image_height, bottom_left[1]))
    bottom_right = (min(image_width, bottom_right[0]), min(image_height, bottom_right[1]))

    # Return the corners clockwise, starting at the top-left
    return [list(top_left), list(top_right), list(bottom_right), list(bottom_left)]

# Example usage
coordinate_string = "549,1129 546,1180 566,1197 570,1197 573,1197 905,1201 908,1201 1048,1184 1120,1177 1133,1187 1157,1204 1161,1204 1164,1204 1167,1204 1249,1187 1280,1184 1318,1191 1410,1204 1413,1204 1478,1191 1540,1180 1584,1194 1625,1208 1629,1208 1632,1208 1745,1197 1878,1184 1912,1201 1936,1211 1939,1211 1943,1211 1987,1201 2045,1187 2090,1204 2103,1208 2107,1208 2110,1208 2113,1208 2131,1204 2192,1187 2650,1187 2718,1214 2721,1214 2725,1214 2728,1214 2738,1214 2892,1191 2950,1218 2960,1221 2964,1221 2967,1225 3251,1225 3258,1156 3254,1085 2875,1081 2715,1054 2711,1054 2708,1054 2633,1075 2592,1054 2588,1054 2585,1054 2411,1064 2291,1023 2288,1023 2182,1020 2178,1020 1922,1057 1806,1034 1803,1034 1488,1054 1420,1013 1417,1013 1413,1013 1191,1013 1188,1013 1185,1013 1048,1054 905,1034 901,1034 898,1034 823,1061 549,1061 549,1129"
image_width = 4607
image_height = 6143

four_points = calculate_four_points(coordinate_string, image_width, image_height)
print(four_points)

Output:

Anyway, my original dataset had something like this:

Would this still work for training a docTR model? I am trying to train the model on handwritten text.

1 reply

felixdittrich92 Sep 30, 2024
Maintainer

The box looks OK, but docTR is trained at word level (both detection and recognition), and your annotation looks like it is at line level, correct? ^^
Do you also have word-level annotations for your dataset?

johnlockejrr · 2024-09-30T12:05:45Z

johnlockejrr
Sep 30, 2024
Author

Oh... that explains a lot. Unfortunately, almost all my datasets are annotated at line level, for segmentation/detection as well as recognition.

1 reply

felixT2K Sep 30, 2024

In the end you can give it a try, but as mentioned it's not specifically designed to work at line level :)
