Analysis and use of different types of classification algorithms for handwritten number recognition

Dataset examples

About this project

This project is a small example of handwritten number recognition, a classic machine learning classification problem: given an image of a handwritten number as input, identify the number it represents. It's one of the most interesting problems in the machine learning field, so I wanted to give it a shot. Here I decided to work with a reduced version of the problem that considers only two numbers, or labels, for the classification: one and five.

The data was previously split into two files, which were imported into the project as two datasets, the first for training and the second for testing. Each one contains grayscale image data for numbers from zero to nine, together with the label (number) that each image represents. I'll put both datasets in the repository.

I implemented three different algorithms, or models, in order to analyze and compare them and see which one gives better results: the perceptron, the pocket algorithm, and a linear regression model. Instead of using the raw image data as input for training, I computed two more representative characteristics to use in its place: the intensity and the symmetry of an image.
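
As an illustration of the two characteristics mentioned above, here is a minimal sketch of how intensity and symmetry could be computed from one image. It assumes each image is stored as 256 grayscale values forming a 16x16 row-major grid, that intensity is the mean pixel value, and that symmetry is the negative mean absolute difference between the image and its left-right mirror. These definitions and the name `extract_features` are assumptions made for the example and may differ from what the notebook actually does.

```python
import numpy as np

def extract_features(pixels):
    """Return [intensity, symmetry] for one image.

    Assumption: `pixels` holds 256 grayscale values of a 16x16 image
    stored row by row. Both feature definitions are illustrative.
    """
    img = np.asarray(pixels, dtype=float).reshape(16, 16)

    # Intensity (assumed definition): average pixel value of the image.
    intensity = img.mean()

    # Symmetry (assumed definition): negative mean absolute difference
    # between the image and its horizontal mirror. A perfectly symmetric
    # image scores 0; less symmetric images score lower.
    symmetry = -np.abs(img - np.fliplr(img)).mean()

    return np.array([intensity, symmetry])
```

Applying this function to every row of the training and test sets would reduce each image from 256 values to 2.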

👨🏻‍💻  Algorithms and techniques used

  • Perceptron Algorithm
  • Pocket Algorithm
  • Linear regression model
  • Learning paradigm: Supervised
  • Machine learning
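
To make the relationship between the first two algorithms concrete, here is a minimal sketch of the pocket algorithm, which runs perceptron updates and keeps "in its pocket" the weights with the lowest training error seen so far. It assumes labels encoded as +1 and -1 and a 2-column feature matrix; the function names, the random choice of misclassified sample, and the zero initialization are assumptions for the example, not necessarily what the notebook uses.

```python
import numpy as np

def classification_error(w, Xb, y):
    """Fraction of samples whose predicted sign differs from the label."""
    return np.mean(np.sign(Xb @ w) != y)

def pocket(X, y, iterations=200, seed=0):
    """Pocket algorithm sketch. Assumes labels in y are +1 or -1."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    # Prepend a column of ones so w[0] acts as the bias term.
    Xb = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(Xb.shape[1])
    best_w, best_err = w.copy(), classification_error(w, Xb, y)

    for _ in range(iterations):
        misclassified = np.flatnonzero(np.sign(Xb @ w) != y)
        if misclassified.size == 0:
            break  # every training sample is classified correctly

        # Standard perceptron update on one misclassified sample.
        i = rng.choice(misclassified)
        w = w + y[i] * Xb[i]

        # Pocket step: keep the new weights only if they improve the error.
        err = classification_error(w, Xb, y)
        if err < best_err:
            best_w, best_err = w.copy(), err

    return best_w, best_err
```

Dropping the pocket step and returning the final `w` gives the plain perceptron, which is only guaranteed to settle when the two classes are linearly separable.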

⚙️  Features

  • Dataset size: 9298 images (numbers from 0 to 9)
  • Dataset file format: .csv (2 files)
  • Initial number of characteristics: 256
  • Train data size: 1561 images (images with 1 and 5)
  • Test data size: 421 images (images with 1 and 5)
  • Final number of characteristics: 2
  • Characteristics: Intensity and symmetry of an image
  • Output: 2 labels
  • Iterations: 200
  • Best result: 0.321 (pocket algorithm error)
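
For comparison with the perceptron-style models, here is a minimal sketch of how a linear regression model can be used for a two-label problem: fit the weights by least squares against labels encoded as +1 and -1, then classify by the sign of the output. The function names, the label encoding, and the use of the misclassification rate as the error measure are assumptions for the example; the README does not state how the reported error is computed.

```python
import numpy as np

def add_bias(X):
    """Prepend a column of ones so the first weight acts as the bias."""
    X = np.asarray(X, dtype=float)
    return np.column_stack([np.ones(len(X)), X])

def fit_linear_regression(X, y):
    """Least-squares weights for targets y (assumed to be +1 or -1)."""
    w, *_ = np.linalg.lstsq(add_bias(X), np.asarray(y, dtype=float), rcond=None)
    return w

def predict(w, X):
    """Turn the real-valued regression output into a +1 / -1 label."""
    return np.where(add_bias(X) @ w >= 0, 1, -1)

def error_rate(w, X, y):
    """Assumed error measure: fraction of misclassified samples."""
    return np.mean(predict(w, X) != np.asarray(y))
```

The weights returned by `fit_linear_regression` could also serve as a starting point for the pocket algorithm instead of zeros, which is a common way to combine the two approaches.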

🚀  Tech Stack

Programming languages & tools: Python, Jupyter notebook

Libraries & modules: NumPy, Matplotlib, seaborn, scikit-learn