Fraud detection via residual neural network. (+ DVC)

Yoskutik/CreditCardsResNN

Credit Card Fraud Detection

Credit Card Fraud Detection is a highly unbalanced dataset containing 492 fraudulent transactions out of 284,807 in total. Every record in the data frame has 30 features:

  • Time: the number of seconds elapsed between each transaction and the first transaction;
  • Amount: the transaction amount;
  • V1, V2, ..., V28: features of the transaction obtained via PCA.
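
To put the class imbalance in numbers, here is a small arithmetic sketch. It assumes 284,807 is the total number of rows, as in the public Kaggle release of this dataset; the variable names are illustrative only.

```python
fraud = 492
total = 284_807  # assumption: total row count, fraud included

fraud_rate = fraud / total
# A classifier that always answers "normal" is right on every non-fraud row.
baseline_accuracy = 1 - fraud_rate

print(f"fraud rate: {fraud_rate:.4%}")
print(f"accuracy of always predicting normal: {baseline_accuracy:.4%}")
```

With so few positive examples, accuracy on the raw data says little by itself, so metrics such as ROC AUC and F1-score are useful alongside it.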

A residual neural network was built as the classifier. The NN consists of 6 fully connected hidden layers, a sigmoid output layer and 2 shortcut connections. Here is its structure:

___________________________________________________________________
Layer           Output Shape        Activation        Connected to
===================================================================
input           [(None, 29)]        
___________________________________________________________________
dense_0          (None, 64)           ReLU             input
___________________________________________________________________
dense_1          (None, 64)           ReLU             dense_0
___________________________________________________________________
add_0            (None, 64)                            dense_1
                                                       dense_0
___________________________________________________________________
dense_2          (None, 64)           ReLU             add_0
___________________________________________________________________
dense_3          (None, 29)           ReLU             dense_2
___________________________________________________________________
dense_4          (None, 29)           ReLU             dense_3
___________________________________________________________________
add_1            (None, 29)                            dense_4
                                                       dense_3
                                                       input
___________________________________________________________________
dense_5          (None, 29)           ReLU             add_1
___________________________________________________________________
dense_6          (None, 1)            Sigmoid          dense_5
===================================================================
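
The structure in the table can be sketched with the Keras functional API as follows. This is a minimal reconstruction from the table alone, not the repository's training code: the function name `build_model` and the model name are hypothetical, and the optimizer, loss and any regularisation are not given here, so the model is left uncompiled.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_model(n_features=29):
    """Rebuild the layer graph from the table (hypothetical helper)."""
    inputs = keras.Input(shape=(n_features,), name="input")

    # First residual block: dense_1 is added to dense_0 (both 64 units wide).
    d0 = layers.Dense(64, activation="relu", name="dense_0")(inputs)
    d1 = layers.Dense(64, activation="relu", name="dense_1")(d0)
    a0 = layers.Add(name="add_0")([d1, d0])

    d2 = layers.Dense(64, activation="relu", name="dense_2")(a0)

    # Second residual block: width drops back to 29 so the raw input
    # can be added together with dense_3 and dense_4.
    d3 = layers.Dense(n_features, activation="relu", name="dense_3")(d2)
    d4 = layers.Dense(n_features, activation="relu", name="dense_4")(d3)
    a1 = layers.Add(name="add_1")([d4, d3, inputs])

    d5 = layers.Dense(n_features, activation="relu", name="dense_5")(a1)
    outputs = layers.Dense(1, activation="sigmoid", name="dense_6")(d5)
    return keras.Model(inputs, outputs, name="credit_card_resnn")


model = build_model()
model.summary()
```

The Add layers require all their inputs to have the same shape, which is why dense_3 and dense_4 use 29 units: it lets the second shortcut reach all the way back to the input.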

The trained model is saved in the model.h5 file.
On the test set, the model reaches an accuracy of 0.93, a ROC AUC of 0.97 and an F1-score of 0.93.
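
For reference, the three reported metrics can be computed with scikit-learn roughly as sketched below. The `evaluate` helper, the 0.5 decision threshold and the toy labels are assumptions made for illustration; they are not taken from the repository.

```python
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score


def evaluate(y_true, y_prob, threshold=0.5):
    """Return accuracy, ROC AUC and F1 for predicted fraud probabilities."""
    # Accuracy and F1 need hard labels; ROC AUC works on the raw scores.
    y_pred = [int(p >= threshold) for p in y_prob]
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "roc_auc": roc_auc_score(y_true, y_prob),
        "f1": f1_score(y_true, y_pred),
    }


# Toy labels and scores, for illustration only.
metrics = evaluate([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```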


The project is built with DVC. There are 3 stages in total:

  • stages/split.dvc - Pre-processes the dataset: removes the Time column, fits a RobustScaler and applies it to the Amount column, then splits the dataset into train and test sets with test_size=0.2;
  • stages/train.dvc - Trains the model and saves it to data/model.h5 for DVC and to model.h5 for Git;
  • stages/evaluate.dvc - Evaluates the model on the test dataset. Outputs accuracy, ROC AUC score and F1-score, and saves all metrics to metrics.txt, so you can view them later using dvc metrics show.
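
The pre-processing done by the split stage could look roughly like the sketch below. It assumes the label column is named Class (as in the public Kaggle CSV) and uses a hypothetical `split_dataset` helper with a fixed seed; the actual stage script may differ.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import RobustScaler


def split_dataset(df, label_col="Class", seed=0):
    """Drop Time, robust-scale Amount, then split 80/20 (hypothetical helper)."""
    df = df.drop(columns=["Time"])
    # RobustScaler centres on the median and scales by the interquartile
    # range, so a few very large amounts do not dominate the scaling.
    df["Amount"] = RobustScaler().fit_transform(df[["Amount"]]).ravel()
    features = df.drop(columns=[label_col])
    labels = df[label_col]
    return train_test_split(features, labels, test_size=0.2, random_state=seed)


# Tiny synthetic frame, for illustration only (not real transactions).
toy = pd.DataFrame({
    "Time": range(10),
    "V1": [0.1 * i for i in range(10)],
    "Amount": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 500.0],
    "Class": [0, 0, 0, 0, 1, 0, 0, 0, 0, 1],
})
X_train, X_test, y_train, y_test = split_dataset(toy)
print(X_train.shape, X_test.shape)
```

Dropping Time leaves 29 of the 30 features, which matches the input shape of the network.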

The whole pipeline behind stages/evaluate.dvc is visualised below:

                 +-------------------------+
                 | data\creditcard.csv.dvc |
                 +-------------------------+
                               *
                               *
                               *
                     +------------------+
                     | stages\split.dvc |
                     +------------------+
                   ***                  ****
               ****                         ***
             **                                ****
+------------------+                               **
| stages\train.dvc |                           ****
+------------------+                        ***
                   ***                  ****
                      ****          ****
                          **      **
                   +---------------------+
                   | stages\evaluate.dvc |
                   +---------------------+

To reproduce the model and the evaluation process, run: dvc repro stages/evaluate.dvc.
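
The order in which the stages have to run follows from the dependency graph. The sketch below only illustrates that ordering with Python's standard graphlib module; it is not how DVC is implemented, and the dictionary is transcribed by hand from the diagram.

```python
from graphlib import TopologicalSorter

# Each key depends on the nodes in its set, mirroring the diagram.
dependencies = {
    "stages/split.dvc": {"data/creditcard.csv.dvc"},
    "stages/train.dvc": {"stages/split.dvc"},
    "stages/evaluate.dvc": {"stages/split.dvc", "stages/train.dvc"},
}

# A topological order lists every stage after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

Because evaluate depends on both split and train, asking for stages/evaluate.dvc is enough to cover the whole pipeline.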
