The Credit Card Fraud Detection Example

The sample data already loaded in MySQL comes from Kaggle. To train the model on the full dataset, download it and load it into MySQL manually.
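As a rough illustration of the manual loading step, the Python sketch below builds a CREATE TABLE statement and a parameterized INSERT statement for the full Kaggle CSV. The column names follow the COLUMN and LABEL clauses used in the training statement of this tutorial. The column types (DOUBLE, INT), the helper names, and the assumption that the CSV has a header row with the label in the last column are not from the original and may need adjusting. Executing the statements is left to whichever MySQL client you use.

```python
import csv

# Column order taken from the COLUMN and LABEL clauses in this tutorial.
FEATURES = ["time"] + [f"v{i}" for i in range(1, 29)] + ["amount"]
COLUMNS = FEATURES + ["class"]


def create_table_sql(table="creditcard.creditcard"):
    """Build a CREATE TABLE statement; the column types are assumptions."""
    cols = [f"`{name}` DOUBLE" for name in FEATURES] + ["`class` INT"]
    return f"CREATE TABLE IF NOT EXISTS {table} ({', '.join(cols)});"


def insert_sql(table="creditcard.creditcard"):
    """Build a parameterized INSERT with one %s placeholder per column."""
    names = ", ".join(f"`{c}`" for c in COLUMNS)
    marks = ", ".join(["%s"] * len(COLUMNS))
    return f"INSERT INTO {table} ({names}) VALUES ({marks});"


def read_rows(csv_file):
    """Yield one tuple per CSV row: 30 floats followed by the integer label."""
    reader = csv.reader(csv_file)
    next(reader)  # skip the header line
    for row in reader:
        yield tuple(float(x) for x in row[:-1]) + (int(row[-1]),)
```

With a DB-API style MySQL driver that accepts `%s` placeholders, the output of `create_table_sql()` could be passed to `cursor.execute`, and `insert_sql()` together with `read_rows(...)` over the downloaded CSV file to `cursor.executemany`. The choice of driver and the local file path are up to you.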

You can inspect the sample data in MySQL with the following query:

    %%sqlflow
    SELECT * FROM creditcard.creditcard LIMIT 5;

Train a DNN Model Using SQLFlow

Once your dataset is prepared, run the SQL statement below to start training. Note that SQLFlow automatically splits the dataset into training and validation sets; the reported evaluation result is calculated on the validation set.

    %%sqlflow
    SELECT * FROM creditcard.creditcard
    TO TRAIN DNNClassifier
    WITH model.n_classes=2, model.hidden_units=[128,32], train.epoch=100
    COLUMN time,v1,v2,v3,v4,v5,v6,v7,v8,v9,v10,v11,v12,v13,v14,v15,v16,v17,v18,v19,v20,v21,v22,v23,v24,v25,v26,v27,v28,amount
    LABEL class
    INTO creditcard.creditcard_deep_model;
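The automatic training/validation split mentioned above can be pictured as a random hold-out split. The Python sketch below is only a conceptual illustration and not SQLFlow's actual implementation: the function names, the 0.2 validation fraction, and the fixed seed are all assumptions made for the example.

```python
import random


def holdout_split(rows, validation_fraction=0.2, seed=0):
    """Shuffle rows and split them into (training, validation) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * validation_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def accuracy(labels, predictions):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)
```

The model is fitted only on the training part. A metric such as `accuracy` is then computed on the held-out validation part, which is why the reported evaluation reflects data the model did not see during training.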

Run Predict

We can use the trained model to make predictions on new data. For example, we can select the positive samples (class=1) in the dataset and run prediction on them:

    %%sqlflow
    SELECT * FROM creditcard.creditcard
    WHERE class=1
    TO PREDICT creditcard.predict.class
    USING creditcard.creditcard_deep_model;

Then we can retrieve the prediction results with:

    %%sqlflow
    SELECT * FROM creditcard.predict;
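Because the prediction input contained only positive samples (class=1), the share of rows predicted as class 1 indicates how many of those fraud cases the model detected. The Python sketch below summarizes such a result. It assumes the rows of creditcard.predict have already been fetched as dictionaries with a "class" key; the function name and that row format are hypothetical.

```python
from collections import Counter


def summarize_predictions(rows, label_key="class"):
    """Count predicted labels and return (counts, fraction predicted as fraud)."""
    counts = Counter(int(row[label_key]) for row in rows)
    total = sum(counts.values())
    # Fraction of rows predicted as class 1; 0.0 when there are no rows.
    detected = counts[1] / total if total else 0.0
    return counts, detected
```

For example, if three of four fetched rows carry a predicted class of 1, the function reports a detected fraction of 0.75.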