This project implements a fraud detection system for credit card applications. It uses a Self-Organizing Map (SOM) for unsupervised anomaly detection, followed by an Artificial Neural Network (ANN) for supervised classification. On the dataset described below, the SOM flags 14 applications as potential fraud and the ANN reaches about 99.5% training accuracy after 10 epochs.
- Unsupervised Learning: Uses SOM to identify potential fraud patterns without labeled data
- Supervised Learning: Implements ANN for final fraud classification
- Data Visualization: Interactive maps showing fraud clusters
- Scalable Architecture: Modular design for easy extension and modification
- Comprehensive Analysis: Detailed exploratory data analysis and results interpretation
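To make the unsupervised step more concrete, here is a minimal NumPy sketch of how a SOM is trained. It is an illustration only, not MiniSom's implementation: the function name `train_som`, its parameters, and the fixed (non-decaying) neighbourhood and learning rate are simplifying assumptions.

```python
import numpy as np

def train_som(data, rows=10, cols=10, sigma=1.0, learning_rate=0.5,
              num_iteration=100, seed=0):
    """Train a tiny SOM with a fixed Gaussian neighbourhood (no decay, for brevity)."""
    rng = np.random.default_rng(seed)
    # Initialise each node's weight vector from a randomly chosen sample.
    picks = rng.integers(0, len(data), size=rows * cols)
    weights = data[picks].reshape(rows, cols, -1).astype(float)
    # Grid coordinates of every node, used to measure distance to the winner.
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    for _ in range(num_iteration):
        x = data[rng.integers(0, len(data))]
        # Best-matching unit (BMU): the node whose weights are closest to the sample.
        dists = np.linalg.norm(weights - x, axis=-1)
        bmu = np.unravel_index(np.argmin(dists), dists.shape)
        # Gaussian neighbourhood centred on the BMU.
        grid_dist_sq = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
        h = np.exp(-grid_dist_sq / (2 * sigma ** 2))
        # Pull every node towards the sample, weighted by the neighbourhood.
        weights += learning_rate * h[..., None] * (x - weights)
    return weights

# Toy usage on random data already scaled to [0, 1).
toy_data = np.random.default_rng(1).random((50, 3))
toy_weights = train_som(toy_data, rows=4, cols=4, num_iteration=200)
```

Nodes that end up far from their neighbours in weight space are the ones a distance map highlights, which is why samples mapped to them are treated as candidate anomalies.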
The project uses the Credit Card Applications Dataset containing:
- 690 applications with 15 features each
- Anonymized attributes (A1-A15) for privacy protection
- Binary classification: Approved (1) vs Rejected (0) applications
- Customer demographic information
- Financial attributes
- Credit history indicators
- Application-specific details
- Python 3.8+
- TensorFlow/Keras - Neural network implementation
- Scikit-learn - Data preprocessing and evaluation
- MiniSOM - Self-Organizing Maps implementation
- Pandas - Data manipulation and analysis
- NumPy - Numerical computations
- Matplotlib - Data visualization
- Seaborn - Statistical plotting
credit-card-fraud-detection-som/
│
├── data/
│   └── Credit_Card_Applications.csv
│
├── notebooks/
│   ├── som.ipynb                # SOM implementation
│   └── mega_case_study.ipynb    # Complete analysis
│
├── src/
│   ├── __init__.py
│   ├── data_preprocessing.py
│   ├── som_model.py
│   ├── ann_model.py
│   └── visualization.py
│
├── requirements.txt
├── README.md
└── LICENSE
- Clone the repository:
  git clone https://github.com/Ahmadhammam03/credit-card-fraud-detection-som.git
  cd credit-card-fraud-detection-som
- Create a virtual environment:
  python -m venv fraud_detection_env
  source fraud_detection_env/bin/activate # On Windows: fraud_detection_env\Scripts\activate
- Install required packages:
  pip install -r requirements.txt
# Import required libraries
import numpy as np
import pandas as pd
from minisom import MiniSom
from sklearn.preprocessing import MinMaxScaler
# Load and preprocess data
dataset = pd.read_csv('data/Credit_Card_Applications.csv')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values
# Feature scaling
sc = MinMaxScaler(feature_range=(0, 1))
X = sc.fit_transform(X)
# Train SOM
som = MiniSom(x=10, y=10, input_len=15, sigma=1.0, learning_rate=0.5)
som.random_weights_init(X)
som.train_random(data=X, num_iteration=100)
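# Identify fraud patterns
# The node coordinates below are read off the SOM distance map and change from run to run
mappings = som.win_map(X)
frauds = np.concatenate((mappings[(8,1)], mappings[(6,9)]), axis=0)
# Convert the flagged rows back to their original (unscaled) values
frauds = sc.inverse_transform(frauds)
# Run the Jupyter notebook
jupyter notebook notebooks/mega_case_study.ipynb
- Identified 14 potential fraud cases from unlabeled data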
- Created visual fraud detection map with clear anomaly clusters
- Fraud patterns concentrated in specific SOM regions
- Training Accuracy: ~99.5% after 10 epochs
- Fraud Detection Rate: High precision in identifying fraudulent applications
- Model Architecture: 2 hidden layers with optimized parameters
- Fraudulent applications show distinct patterns in financial attributes
- Spatial clustering of fraud cases on the SOM grid
- Effective combination of unsupervised and supervised learning approaches
- Data Preprocessing: Feature scaling using MinMaxScaler
- SOM Training: 10x10 grid with 100 iterations
- Anomaly Detection: Identification of outlier clusters
- Visualization: Color-coded fraud detection map
- Feature Engineering: Using SOM results as additional features
- Neural Network Design: 2-layer architecture with sigmoid activation
- Training: Binary classification with Adam optimizer
- Evaluation: Comprehensive performance metrics
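The hand-off from the anomaly detection step to the neural network can be sketched in NumPy alone. The helper below is hypothetical (the name `flag_outliers` and the `threshold` value are assumptions, not part of this repository): it marks every sample whose winning SOM node has a high mean inter-neuron distance. With MiniSom, `distance_map` could come from `som.distance_map()` and `winners` from `[som.winner(x) for x in X]`.

```python
import numpy as np

def flag_outliers(distance_map, winners, threshold=0.9):
    """Return a 0/1 vector marking samples mapped to high-distance SOM nodes.

    distance_map: 2-D array of normalised mean inter-neuron distances.
    winners: sequence of (row, col) winning-node coordinates, one per sample.
    threshold: hypothetical cut-off; tune it by inspecting the distance map.
    """
    winners = np.asarray(winners)
    # Look up the distance-map value of each sample's winning node.
    node_scores = distance_map[winners[:, 0], winners[:, 1]]
    return (node_scores >= threshold).astype(int)

# Toy example: a 2x2 map in which node (0, 1) is the only outlier node.
distance_map = np.array([[0.10, 0.95],
                         [0.20, 0.30]])
winners = [(0, 1), (1, 0), (0, 1), (0, 0)]
is_fraud = flag_outliers(distance_map, winners)
```

The resulting vector can be appended to the feature matrix as an extra column or passed to the network as its target. Choosing nodes by threshold rather than by fixed coordinates also avoids depending on one particular random initialisation.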
The project includes several key visualizations:
- SOM Distance Map: Showing fraud clusters and patterns
- Training Progress: Neural network convergence plots
- Fraud Distribution: Spatial (SOM grid) and feature-based analysis
- Performance Metrics: ROC curves and confusion matrices
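As a rough illustration of what the confusion matrix behind these metrics contains, the sketch below computes the four counts plus precision and recall with plain NumPy. The function names and the toy labels are assumptions for illustration; the notebooks may rely on Scikit-learn's metrics instead.

```python
import numpy as np

def confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary 0/1 labels."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = int(np.sum(y_true & y_pred))
    tn = int(np.sum(~y_true & ~y_pred))
    fp = int(np.sum(~y_true & y_pred))
    fn = int(np.sum(y_true & ~y_pred))
    return tn, fp, fn, tp

def precision_recall(y_true, y_pred):
    """Return (precision, recall); each is 0.0 when its denominator is zero."""
    tn, fp, fn, tp = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Toy labels: 1 = flagged as fraud, 0 = not flagged.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
counts = confusion_counts(y_true, y_pred)
precision, recall = precision_recall(y_true, y_pred)
```

With so few flagged applications relative to the whole dataset, precision and recall are usually more informative than raw accuracy.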
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
Ahmad Hammam
- GitHub: @Ahmadhammam03
- LinkedIn: Ahmad Hammam
- Original dataset source and research community
- MiniSOM library developers
- TensorFlow and Scikit-learn communities
- Open source machine learning community
If you found this project helpful, please give it a star!