Given an image of a math expression, this program separates it into its individual terms. This is what that means:
How to run?
git clone git@github.com:utkarsh-21st/math-expression-seperator.git
cd math-expression-seperator
python main -i input_path -o output_path --show_boxes
- input_path -> image path
- output_path -> path where the results will get saved
- --show_boxes is an optional argument
First, the program reads the image in grayscale, resizes it to a width of 600 pixels while maintaining the aspect ratio, and applies OpenCV adaptive thresholding. Then it finds contours using OpenCV's findContours function and filters out irrelevant contours (noise) based on their areas. Each remaining contour captures the outline of one term. Using these contours, the program extracts each term from the image and saves the results. The program can separate any abstract image into its individual terms.
It works well as long as the terms are distinctly partitioned; it cannot separate a term that is connected to any other part of the expression. I originally began this project intending to design a model that could read handwritten math expressions, but I later discarded the idea. If you are interested in pursuing that, here are some datasets that you could train your model on: