Stroke-Prediction-Dataset-Practice

Whilst looking for health data scientist positions, I came across several that focus on stroke research. To become more familiar with this area of medicine, I decided to practise health data analysis on a stroke-centred dataset: https://www.kaggle.com/datasets/fedesoriano/stroke-prediction-dataset

This is a small dataset of 5110 rows and the following 12 columns:

id: unique identifier
gender: "Male", "Female" or "Other"
age: age of the patient
hypertension: 0 if the patient doesn't have hypertension, 1 if the patient has hypertension
heart_disease: 0 if the patient doesn't have any heart diseases, 1 if the patient has a heart disease
ever_married: "No" or "Yes"
work_type: "children", "Govt_job", "Never_worked", "Private" or "Self-employed"
Residence_type: "Rural" or "Urban"
avg_glucose_level: average glucose level in blood
bmi: body mass index
smoking_status: "formerly smoked", "never smoked", "smokes" or "Unknown"*
stroke: 1 if the patient had a stroke or 0 if not

*Note: "Unknown" in smoking_status means that the information is unavailable for this patient
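
As a rough illustration of how the columns above could be typed in base R, here is a minimal sketch. The three rows are made-up toy values typed in by hand, not records from the dataset. Coding a missing bmi as the text "N/A" is an assumption about the raw file, as are the object names `csv_text`, `stroke_raw` and `stroke`.

```r
# Toy rows that mirror the column layout; the values are invented for illustration.
csv_text <- 'id,gender,age,hypertension,heart_disease,ever_married,work_type,Residence_type,avg_glucose_level,bmi,smoking_status,stroke
1,Male,70,0,1,Yes,Private,Urban,210.5,31.2,formerly smoked,1
2,Female,58,1,0,Yes,Self-employed,Rural,95.3,N/A,never smoked,0
3,Female,9,0,0,No,children,Urban,88.1,17.4,Unknown,0'

# Assumption: a missing bmi appears as the string "N/A" in the raw file.
stroke_raw <- read.csv(text = csv_text, na.strings = "N/A", stringsAsFactors = FALSE)

stroke <- stroke_raw

# Categorical text columns become factors with explicit levels.
stroke$gender <- factor(stroke$gender, levels = c("Male", "Female", "Other"))
stroke$ever_married <- factor(stroke$ever_married, levels = c("No", "Yes"))
stroke$work_type <- factor(stroke$work_type)
stroke$Residence_type <- factor(stroke$Residence_type, levels = c("Rural", "Urban"))

# "Unknown" means the information is unavailable, so it is treated as missing here.
stroke$smoking_status[stroke$smoking_status == "Unknown"] <- NA
stroke$smoking_status <- factor(
  stroke$smoking_status,
  levels = c("never smoked", "formerly smoked", "smokes")
)

# The 0/1 indicator columns become labelled factors.
for (flag in c("hypertension", "heart_disease", "stroke")) {
  stroke[[flag]] <- factor(stroke[[flag]], levels = c(0, 1), labels = c("No", "Yes"))
}

str(stroke)
```

Whether "Unknown" is better kept as its own level rather than set to missing is a judgement call that depends on the later modelling step.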

I plan to format the variables and carry out exploratory data analysis (summary statistics, Q-Q plots, box plots, t-tests, etc.), followed by survival analysis and prediction modelling, all in an R Markdown (.Rmd) file to be knitted into an interactive HTML document.
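
The exploratory steps named above could look roughly like the following base R sketch. It runs on simulated glucose values (the group sizes, means and standard deviations are arbitrary assumptions), so the output says nothing about the real dataset. The names `toy`, `group_summary` and `welch` are illustrative only.

```r
set.seed(1)

# Simulated data standing in for the real columns; all numbers are arbitrary.
toy <- data.frame(
  stroke = factor(rep(c("No", "Yes"), times = c(40, 20))),
  avg_glucose_level = c(
    rnorm(40, mean = 100, sd = 20),
    rnorm(20, mean = 130, sd = 30)
  )
)

# Summary statistics per outcome group.
group_summary <- aggregate(
  avg_glucose_level ~ stroke,
  data = toy,
  FUN = function(x) c(mean = mean(x), sd = sd(x), median = median(x))
)
print(group_summary)

# Q-Q plot to eyeball how close the variable is to a normal distribution.
qqnorm(toy$avg_glucose_level)
qqline(toy$avg_glucose_level)

# Box plot of the variable split by outcome.
boxplot(avg_glucose_level ~ stroke, data = toy, ylab = "Average glucose level")

# Two-sample t-test; t.test() uses the Welch (unequal variance) form by default.
welch <- t.test(avg_glucose_level ~ stroke, data = toy)
print(welch)
```

If the Q-Q plot shows a clear departure from normality, a rank-based test such as `wilcox.test()` is one alternative to the t-test.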

I've planned the analysis using the following resources:

Other repos on my GitHub - https://github.com/georgemelrose/Dummy-HES-APC-Data-Work, https://github.com/georgemelrose/Mental-Health-Data-Practice
The excellent HealthyR textbook - https://argoshare.is.ed.ac.uk/healthyr_book/
The unsurpassable survival analysis tutorial of Dr Emily Zabor - https://www.emilyzabor.com/tutorials/survival_analysis_in_r_tutorial.html
The example survival analysis of research biostatistician Mr Jacky Choi - https://jmc2392.github.io/survival.html

