Supervised Machine Learning Model for Thyroid Disease Classification
These reports were available at https://rb.gy/jbazib. The dataset contained the following attributes:
- Age
- Sex
- On_thyroxine
- Query_on_thyroxine
- On_antithyroid_medication
- Sick
- Pregnant
- Thyroid_surgery
- I131_treatment
- Query_hypothyroid
- Query_hyperthyroid
- Lithium
- Goiter
- Tumor
- Hypopituitary
- Psych
- TSH_measured
- TSH
- T3_measured
- T3
- TT4_measured
- TT4
- T4U_measured
- T4U
- FTI_measured
- FTI
- TBG_measured
- TBG
- Referral_source
- Target
- ID
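Several of the attributes above come in pairs: a laboratory value (for example TSH) and a flag saying whether it was measured (TSH_measured). A minimal sketch of how such pairs might be reconciled with pandas is given here. The `"t"`/`"f"` flag encoding, the helper name `mask_unmeasured`, and the sample values are assumptions made for illustration and are not taken from the dataset.

```python
import numpy as np
import pandas as pd

# Laboratory attributes from the list above that have a matching *_measured flag.
LAB_COLUMNS = ["TSH", "T3", "TT4", "T4U", "FTI", "TBG"]


def mask_unmeasured(df: pd.DataFrame, measured_value: str = "t") -> pd.DataFrame:
    """Return a copy where a lab value is NaN if its flag is not `measured_value`.

    The flag encoding ("t" for measured) is an assumption, not a documented fact.
    """
    out = df.copy()
    for lab in LAB_COLUMNS:
        flag = f"{lab}_measured"
        if lab in out.columns and flag in out.columns:
            # Series.where keeps values where the condition holds, NaN elsewhere.
            out[lab] = out[lab].where(out[flag] == measured_value)
    return out


# Hypothetical two-row sample, for illustration only.
sample = pd.DataFrame(
    {
        "TSH_measured": ["t", "f"],
        "TSH": [1.3, 0.0],
        "T3_measured": ["t", "t"],
        "T3": [2.1, 1.8],
    }
)
cleaned = mask_unmeasured(sample)
print(cleaned)
```

Marking unmeasured values as missing, rather than leaving a placeholder number, keeps them from being mistaken for real measurements in later steps.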
The cleaned data for the different thyroid conditions (hyperthyroidism, hypothyroidism, and euthyroid) underwent synthetic data generation using MOSTLY AI, which pioneered the generation of synthetic data for the development of AI models and software testing. Exploratory data analysis was carried out to examine the correlations and significant relationships between the variables. The data were inspected and encoded in CSV files. They then underwent preprocessing with Python, NumPy, and pandas to get an overview of the data types, fill null values, label categorical values, and remove outliers and
Following data preprocessing, the dataset was split for modeling using the scikit-learn library: eighty percent (80%) was assigned to the training set and twenty percent (20%) to the validation set.
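The preprocessing and splitting steps described above might look roughly like the following sketch. The median imputation, the 1.5 × IQR outlier rule, the `random_state` value, the helper name `preprocess`, and the sample rows are all assumptions chosen for illustration; the text does not state which specific strategies were used.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Fill nulls, drop outlier rows, and label-encode categorical columns."""
    out = df.copy()
    numeric = out.select_dtypes(include="number").columns
    categorical = [c for c in out.columns if c not in numeric]

    # Fill null numeric values with the column median (assumed strategy).
    out[numeric] = out[numeric].fillna(out[numeric].median())

    # Drop rows with any numeric value outside 1.5 * IQR (assumed rule).
    q1 = out[numeric].quantile(0.25)
    q3 = out[numeric].quantile(0.75)
    iqr = q3 - q1
    within = (
        (out[numeric] >= q1 - 1.5 * iqr) & (out[numeric] <= q3 + 1.5 * iqr)
    ).all(axis=1)
    out = out[within].reset_index(drop=True)

    # Label-encode categorical columns, including the target.
    for col in categorical:
        out[col] = LabelEncoder().fit_transform(out[col].astype(str))
    return out


# Hypothetical 20-row sample using a few of the listed attributes.
sample = pd.DataFrame(
    {
        "Age": [25, 34, 41, 29, 52, 47, 38, 61, 33, 45] * 2,
        "Sex": ["F", "M"] * 10,
        "TSH": [1.2, 2.0, np.nan, 1.8, 2.5, 1.1, 2.2, 1.9, 1.4, 2.1] * 2,
        "Target": ["euthyroid", "hypothyroidism", "hyperthyroidism",
                   "euthyroid", "hypothyroidism"] * 4,
    }
)

clean = preprocess(sample)
X = clean.drop(columns=["Target"])
y = clean["Target"]

# 80% training, 20% validation, as described in the text.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42  # seed is an arbitrary choice
)
print(len(X_train), len(X_val))
```

Fixing `random_state` makes the split reproducible between runs. With imbalanced classes, passing `stratify=y` to `train_test_split` is a common option, although the text does not say whether it was used here.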