Implementation of an ML RS1 algorithm based on the theory of rough sets by Zdzislaw Pawlak

SayMyName1337/RST-RS1-algorithm

Rough Set Theory (RST)

Python Visual Studio Code Pandas

Project Description

This project implements the RS1 machine learning algorithm, based on Zdzislaw Pawlak's Rough Set Theory, to predict whether golf will be played given the weather conditions.

Project structure

The project consists of the following files:

  • Train_data_golf_14ex.csv: Training dataset.
  • Test_data_golf_50ex.csv: Test dataset.
  • algorithm.py: The main script with the implementation of the algorithm.

Installation

  1. Clone the repository:
git clone https://github.com/your-username/rst-golf-prediction.git
  2. Go to the project folder:
cd rst-golf-prediction
  3. Install the required dependencies:
pip install pandas

Using the algorithm

  1. Place your CSV data files in the project root folder.
  2. In the script, set the paths to the training and test datasets according to where the files are stored on your computer:
df_path = 'Put your personal path here'
df_test_path = 'Put your personal path here too'
  3. Run the script RS-ML.py:
python RS-ML.py

Example run

Outlook   Humidity %  Wind   Play
Overcast  87          False  Yes
Sunny     80          True   Yes
Sunny     80          True   Yes
Overcast  75          True   Yes
Overcast  75          True   Yes
Rainy     80          False  No
Sunny     80          True   No
Rainy     80          False  No
Rainy     85          False  No
Overcast  87          False  Yes

Running the script on this data prints the following intermediate results, which show how the production rules are constructed:

Getting an elementary subsets of dataset:
[[0, 9], [1, 2, 6], [3, 4], [5, 7], [8]]
[[0, 9], [3, 4]]

======== Production rules for positive region ========
1) IF (Outlook = Overcast)& (Humidity% = 87 & 75)& (Wind = False & True)& THEN DECISION "PLAY" = PLAY

======== Production rules for negative region ========
2) IF (Outlook = Rainy)&(Humidity% = 85 V 80)&(Wind = False) THEN DECISION "PLAY" = DON'T PLAY

======== Production rules for boundry region ========
3) IF (Outlook = Sunny)&(Humidity% = 80)&(Wind = True) THEN DECISION "PLAY" = MAYBE PLAY

Approximation accuracy: 0.571
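
The approximation accuracy reported above can be reproduced by hand from the printed elementary subsets. The following is a minimal sketch, not the code from the repository: the helper names `lower_approximation` and `upper_approximation` are hypothetical, and it assumes the accuracy is the size of the lower approximation divided by the size of the upper approximation (Pawlak's definition), which agrees with the 0.571 shown.

```python
# Elementary subsets as printed by the script (row indexes of the example table).
elementary = [[0, 9], [1, 2, 6], [3, 4], [5, 7], [8]]

# Row indexes whose decision "Play" is Yes in the example table.
play_yes = {0, 1, 2, 3, 4, 9}


def lower_approximation(elementary, target):
    """Elementary subsets that lie entirely inside the target set."""
    return [block for block in elementary if set(block) <= target]


def upper_approximation(elementary, target):
    """Elementary subsets that share at least one object with the target set."""
    return [block for block in elementary if set(block) & target]


lower = lower_approximation(elementary, play_yes)
upper = upper_approximation(elementary, play_yes)

# Accuracy = |lower approximation| / |upper approximation| (assumed definition).
lower_size = sum(len(block) for block in lower)
upper_size = sum(len(block) for block in upper)
accuracy = lower_size / upper_size

print(lower)               # [[0, 9], [3, 4]]
print(upper)               # [[0, 9], [1, 2, 6], [3, 4]]
print(round(accuracy, 3))  # 0.571
```

Under this reading, the subset [1, 2, 6] is in the upper approximation but not the lower one, so it forms the boundary region: rows 1 and 2 have Play = Yes while row 6 has Play = No for the same weather conditions.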

The final result is the classification of the test dataset using the constructed rules, shown alongside the true values for comparison.

Outlook   Humidity %  Wind   Play  Classification
Overcast  87          False  Yes   Yes
Sunny     80          True   Yes   Maybe
Rainy     80          True   Yes   Unknown
Sunny     75          True   Yes   Maybe
NaN       75          True   Yes   Unknown
Overcast  80          False  No    Yes
Rainy     80          True   No    No

Accuracy of the classification RS1: 42.9 %

Code Structure

The main implemented functions of the algorithm are:

  • get_elementary_subsets(X): Returns the elementary subsets of a set of objects.
  • get_lower(elementary, X_true_indexes): Builds the lower approximation.
  • get_upper(elementary, X_true_indexes): Builds the upper approximation.
  • get_pos_rule(pos_dataframe): Creates production rules for the positive region (the lower approximation).
  • get_neg_rule(not_pos_dataframe): Creates production rules for the negative region (objects outside the upper approximation).
  • get_maybe_rule(maybe_dataframe): Creates production rules for the boundary region.
  • classify_new_data(row, pos_df, maybe_df, neg_df): Classifies a row of the test dataset using the constructed rules.
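
To illustrate the first step in this list, here is a minimal sketch of how elementary subsets can be derived from the example table. It is an assumption-laden illustration, not the repository's `get_elementary_subsets`: the function name `elementary_subsets` is hypothetical, the rows are hard-coded instead of read from CSV, and the standard library is used in place of pandas so the snippet runs on its own.

```python
from collections import defaultdict

# Condition attributes of the example table: (Outlook, Humidity %, Wind).
rows = [
    ("Overcast", 87, False),
    ("Sunny", 80, True),
    ("Sunny", 80, True),
    ("Overcast", 75, True),
    ("Overcast", 75, True),
    ("Rainy", 80, False),
    ("Sunny", 80, True),
    ("Rainy", 80, False),
    ("Rainy", 85, False),
    ("Overcast", 87, False),
]


def elementary_subsets(rows):
    """Group row indexes whose condition attributes are identical."""
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        groups[row].append(index)
    # Dicts keep insertion order, so groups appear in order of first occurrence.
    return list(groups.values())


print(elementary_subsets(rows))
# [[0, 9], [1, 2, 6], [3, 4], [5, 7], [8]]
```

Rows with the same Outlook, Humidity % and Wind values are indistinguishable to the algorithm, which is why they end up in the same subset regardless of their Play value.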
