Breast_Cancer

This project applies machine learning classifiers to a breast cancer dataset, to help doctors predict whether a person has the disease.

Introduction

Breast cancer is a disease in which abnormal breast cells grow out of control and form tumours. If left unchecked, the tumours can spread throughout the body and become fatal.
Breast cancer cells begin inside the milk ducts and/or the milk-producing lobules of the breast. The earliest form (in situ) is not life-threatening. Cancer cells can spread into nearby breast tissue (invasion). This creates tumours that cause lumps or thickening.
Invasive cancers can spread to nearby lymph nodes or other organs (metastasize). Metastasis can be fatal.
Treatment is based on the person, the type of cancer and its spread. Treatment combines surgery, radiation therapy and medications.

Consider the data present in the Breast Cancer Dataset file

The attribute information follows. The data set includes 201 instances of one class and 85 instances of another class. The instances are described by 9 attributes, some of which are linear and some nominal.

Age
Menopause
inv-nodes
node-caps
deg-malig
breast
breast-quad
irradiat
Outcome (no-recurrence-events, recurrence-events)
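
The class counts quoted above imply an imbalanced dataset. As a rough check, the arithmetic below works out the total number of instances and the accuracy a trivial model would reach by always predicting the larger class. Only the two counts come from this README; the variable names are illustrative.

```python
# Class counts as stated in this README.
n_larger_class = 201
n_smaller_class = 85

total = n_larger_class + n_smaller_class
smaller_share = n_smaller_class / total

# Accuracy of a trivial model that always predicts the larger class.
baseline_accuracy = n_larger_class / total

print(f"instances: {total}")                                   # instances: 286
print(f"smaller class share: {smaller_share:.2%}")             # 29.72%
print(f"majority-class baseline: {baseline_accuracy:.2%}")     # 70.28%
```

Any accuracy figure reported for this dataset is easier to judge against that baseline.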

Problem Statement

To predict whether or not a patient has breast cancer, based on the diagnostic measurements included in the dataset.

Steps Followed for the Project

Importing Necessary Libraries
Performing Exploratory Data Analysis
Data Preprocessing
Converting categorical data to numerical (Label Encoder)
Creating X and Y
Splitting the data into train and test sets
Training various models: Logistic Regression, Decision Tree, Random Forest, Extra_Tree_Classifier, SVC, KNeighbors
Tuning the above models
Applying SMOTE and rerunning all of the above models to improve accuracy and reduce false negatives (Type II errors)
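
The encoding, X/Y and train/test steps can be sketched without any dependencies, as below. This is not the notebook's code: the notebook presumably uses a library label encoder and splitter, whereas `label_encode`, `encode_table` and `split_train_test` are hypothetical helpers written for illustration, and the rows are made-up examples rather than records from breast-cancer.data. The 80/20 split ratio and the seed are also assumptions.

```python
import random

# Illustrative rows only (NOT taken from breast-cancer.data). Column order
# follows the attribute list in this README, with Outcome last.
ROWS = [
    ["30-39", "premeno", "0-2", "no",  "3", "left",  "left_low",  "no",  "no-recurrence-events"],
    ["40-49", "premeno", "0-2", "no",  "2", "right", "right_up",  "no",  "no-recurrence-events"],
    ["50-59", "ge40",    "3-5", "yes", "3", "left",  "left_up",   "yes", "recurrence-events"],
    ["60-69", "ge40",    "0-2", "no",  "1", "right", "left_low",  "no",  "no-recurrence-events"],
    ["40-49", "premeno", "6-8", "yes", "3", "left",  "central",   "yes", "recurrence-events"],
    ["50-59", "lt40",    "0-2", "no",  "2", "left",  "right_low", "no",  "no-recurrence-events"],
    ["30-39", "premeno", "3-5", "yes", "2", "right", "left_up",   "no",  "recurrence-events"],
    ["60-69", "ge40",    "0-2", "no",  "1", "left",  "left_low",  "no",  "no-recurrence-events"],
    ["50-59", "ge40",    "0-2", "no",  "2", "right", "right_up",  "yes", "no-recurrence-events"],
    ["40-49", "premeno", "0-2", "no",  "3", "left",  "left_up",   "no",  "recurrence-events"],
]


def label_encode(values):
    """Map each distinct value to an integer code, assigned in sorted order."""
    classes = sorted(set(values))
    index = {c: i for i, c in enumerate(classes)}
    return [index[v] for v in values], classes


def encode_table(rows):
    """Label-encode every column of a row-oriented table independently."""
    columns = list(zip(*rows))
    encoded_columns = [label_encode(list(col))[0] for col in columns]
    return [list(row) for row in zip(*encoded_columns)]


def split_train_test(X, y, test_size=0.2, seed=0):
    """Shuffle the row indices and hold out a fraction of rows for testing."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(len(X) * test_size))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return ([X[i] for i in train_idx], [X[i] for i in test_idx],
            [y[i] for i in train_idx], [y[i] for i in test_idx])


encoded = encode_table(ROWS)
X = [row[:-1] for row in encoded]   # features: every column except Outcome
y = [row[-1] for row in encoded]    # target: encoded Outcome column
X_train, X_test, y_train, y_test = split_train_test(X, y, test_size=0.2, seed=42)
```

Because codes are assigned in sorted order, `no-recurrence-events` becomes 0 and `recurrence-events` becomes 1 here. If oversampling such as SMOTE is used, it is normally applied to the training portion only, so the test set still reflects the original class balance.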

Conclusion

Multiple models were trained: Logistic Regression, Decision Tree, Random Forest, Extra_Tree_Classifier, SVC and KNeighbors. Amongst them, the tuned logistic regression model shows promising performance, with an accuracy of 75.86%. Its Type II error count is low, at 8 false negatives, which makes it a strong choice for applications where identifying positive cases is crucial.
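
A Type II error is a false negative: a positive case that the model predicts as negative. The sketch below shows how accuracy, recall and the Type II error rate follow from the four confusion-matrix counts. The function name `summarize` and the counts passed to it are hypothetical, chosen only for illustration; they are not the notebook's results.

```python
def summarize(tp, fp, tn, fn):
    """Derive accuracy, recall and Type II error rate from confusion-matrix counts."""
    total = tp + fp + tn + fn
    actual_positives = tp + fn
    if total == 0 or actual_positives == 0:
        raise ValueError("need at least one sample and one actual positive")
    return {
        "accuracy": (tp + tn) / total,
        "recall": tp / actual_positives,               # share of positives found
        "type_ii_error_rate": fn / actual_positives,   # share of positives missed
    }


# Hypothetical counts, NOT taken from the notebook.
metrics = summarize(tp=12, fp=5, tn=35, fn=8)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Recall and the Type II error rate always sum to 1, so a count of false negatives is most informative when read against the number of actual positive cases in the test set.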
