TarikKaanKoc/Diagnosis_of_Diabetes

Business Problem

The goal is to develop a machine learning model that predicts whether a person has diabetes, given their characteristics. Before the model is built, the necessary data analysis and feature engineering steps must be carried out. (Scenario)

Dataset Story

The dataset is part of a larger dataset held at the National Institute of Diabetes and Digestive and Kidney Diseases in the USA. It was collected for diabetes research on Pima Indian women aged 21 and over living in Phoenix, the largest city in the State of Arizona and the fifth most populous city in the USA. The target variable is "Outcome": 1 indicates a positive diabetes test result, 0 indicates a negative one.



  • Total Features: 9 (8 predictors plus the target, Outcome)
  • Total Rows: 768
  • CSV File Size: 24 KB
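
The figures above can be checked directly against the CSV. The following is a minimal sketch using only the Python standard library. The helper name `summarize` and the sample rows are assumptions made for illustration; the rows are made up and are not taken from the real dataset.

```python
import csv
import io
from collections import Counter


def summarize(csv_file):
    """Return (row_count, column_count, Outcome class counts) for an open CSV file."""
    reader = csv.DictReader(csv_file)
    outcome_counts = Counter()
    row_count = 0
    for row in reader:
        row_count += 1
        # Outcome is the target: 1 = positive test, 0 = negative test.
        outcome_counts[int(row["Outcome"])] += 1
    return row_count, len(reader.fieldnames), dict(outcome_counts)


# Made-up rows for illustration only; they are NOT real records from the dataset.
SAMPLE = """Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,DiabetesPedigreeFunction,Age,Outcome
2,120,70,25,80,28.5,0.45,33,0
5,160,82,30,150,34.1,0.60,47,1
1,95,64,20,60,24.0,0.30,25,0
"""

rows, cols, counts = summarize(io.StringIO(SAMPLE))
print(rows, cols, counts)
```

To run it on the real file, pass an open file handle instead of the in-memory sample, for example `open("diabetes.csv", newline="")` (the file name is an assumption). If the figures listed above hold, the first two values returned should be 768 and 9.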

| Sr. | Feature | Description |
| --- | --- | --- |
| 1 | Pregnancies | Number of pregnancies |
| 2 | Glucose | 2-hour plasma glucose concentration in an oral glucose tolerance test |
| 3 | BloodPressure | Diastolic blood pressure (mmHg) |
| 4 | SkinThickness | Triceps skinfold thickness (mm) |
| 5 | Insulin | 2-hour serum insulin (mu U/ml) |
| 6 | DiabetesPedigreeFunction | A function that scores the likelihood of diabetes based on family history |
| 7 | BMI | Body mass index |
| 8 | Age | Age (years) |
| 9 | Outcome | Diabetes test result: 1 = positive (has diabetes), 0 = negative (does not have diabetes) |
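
The feature table can also serve as a small data dictionary for validating records before analysis. The sketch below is one possible way to do that, not the repository's own code. The names `FEATURE_TYPES` and `parse_row` are hypothetical, and the column names are assumed to match the CSV header exactly.

```python
# Assumed column names and types, taken from the feature table above.
FEATURE_TYPES = {
    "Pregnancies": int,
    "Glucose": float,
    "BloodPressure": float,
    "SkinThickness": float,
    "Insulin": float,
    "DiabetesPedigreeFunction": float,
    "BMI": float,
    "Age": int,
    "Outcome": int,
}


def parse_row(raw):
    """Convert a dict of strings into typed values; raise ValueError on schema problems."""
    missing = set(FEATURE_TYPES) - set(raw)
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    parsed = {name: cast(raw[name]) for name, cast in FEATURE_TYPES.items()}
    # The target must be binary: 1 = positive, 0 = negative.
    if parsed["Outcome"] not in (0, 1):
        raise ValueError(f"Outcome must be 0 or 1, got {parsed['Outcome']}")
    # The dataset description says all participants are aged 21 and over.
    if parsed["Age"] < 21:
        raise ValueError(f"Age must be at least 21, got {parsed['Age']}")
    return parsed


# Made-up record for illustration only; not a real row from the dataset.
example = {
    "Pregnancies": "2", "Glucose": "120", "BloodPressure": "70",
    "SkinThickness": "25", "Insulin": "80", "DiabetesPedigreeFunction": "0.45",
    "BMI": "28.5", "Age": "33", "Outcome": "0",
}
record = parse_row(example)
print(record["BMI"], record["Outcome"])
```

Rejecting malformed rows early keeps later analysis and feature engineering steps from silently working on values that contradict the dataset description.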