Adaptive-Ensemble-Learning-for-Intrusion-Detection-Using-CICIDS2017-and-CICIDS2018

This repository provides a complete workflow for the CICIDS2017 and CICIDS2018 datasets: loading and cleaning the raw data, exploratory data analysis (EDA), and building and evaluating dynamic ensemble learning models for Intrusion Detection Systems (IDS).

🗂️ Contents

Part 1: 📌 CICIDS2017 Preprocessing & EDA
Part 2: 📌 CICIDS2018 Preprocessing & EDA
Part 3: 🚀 Dynamic Ensemble Performance Evaluation

📊 Overview

This project helps you:

✅ Load large network traffic CSVs efficiently
✅ Clean, optimize, and engineer features
✅ Visualize and understand attack distributions
✅ Train, load, and test multiple ML models
✅ Combine base models using static and adaptive ensembling
✅ Evaluate performance across datasets and techniques

⚙️ Requirements

Install once for all modules:

pip install pandas numpy scikit-learn matplotlib seaborn missingno joblib

Or inside a Colab cell:

!pip install pandas numpy scikit-learn matplotlib seaborn missingno joblib

📌 Part 1 — CICIDS2017 Data Preprocessing & EDA

✅ Key Steps:

Mount Drive

from google.colab import drive
drive.mount('/content/drive')

Load and Clean

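# load_cicids_data and optimize_dtypes are project helper functions, not part of pandas
data = load_cicids_data('/content/drive/MyDrive/Capstone/CICIDS2017')
data = optimize_dtypes(data)
data.drop_duplicates(inplace=True)  # remove exact duplicate rows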
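
The two helpers above are not reproduced in this README. The following is a minimal sketch of what they might look like, assuming the directory holds one or more CSV files and that "optimizing" means downcasting numeric columns. The function bodies are illustrative assumptions, not the notebook's actual code.

```python
import glob
import os

import numpy as np
import pandas as pd


def load_cicids_data(directory):
    """Read every CSV in `directory` into one DataFrame (assumed behavior)."""
    paths = sorted(glob.glob(os.path.join(directory, "*.csv")))
    if not paths:
        raise FileNotFoundError(f"No CSV files found in {directory}")
    frames = [pd.read_csv(path, low_memory=False) for path in paths]
    data = pd.concat(frames, ignore_index=True)
    # Column names in some CICIDS2017 CSV exports carry leading spaces; strip them.
    data.columns = data.columns.str.strip()
    return data


def optimize_dtypes(data):
    """Downcast numeric columns to the smallest dtype pandas considers safe."""
    data = data.copy()
    for col in data.select_dtypes(include=[np.integer]).columns:
        data[col] = pd.to_numeric(data[col], downcast="integer")
    for col in data.select_dtypes(include=[np.floating]).columns:
        data[col] = pd.to_numeric(data[col], downcast="float")
    return data
```

Downcasting only changes storage width, so it is worth comparing `data.memory_usage(deep=True).sum()` before and after to confirm the saving on the real files.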

Handle Missing Values

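import numpy as np

# Treat infinite values as missing, then fill gaps with each column's median
data.replace([np.inf, -np.inf], np.nan, inplace=True)
# numeric_only=True skips string columns such as Label, which have no median
data.fillna(data.median(numeric_only=True), inplace=True)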
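
To make the effect of this step concrete, here is a small self-contained example on a toy frame. The column values are invented for illustration; only the two-step pattern (infinities to NaN, then median fill) comes from the text above.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "Flow Bytes/s": [10.0, np.inf, 30.0, np.nan],
    "Flow Duration": [5, 0, 7, 9],
    "Label": ["BENIGN", "DDoS", "BENIGN", "PortScan"],
})

# Treat +/-inf as missing so it cannot distort the median.
toy = toy.replace([np.inf, -np.inf], np.nan)

# Medians are computed over numeric columns only; the Label column is skipped.
medians = toy.median(numeric_only=True)
toy = toy.fillna(medians)

print(toy["Flow Bytes/s"].tolist())  # [10.0, 20.0, 30.0, 20.0]
```

The median is used rather than the mean because it is less sensitive to the extreme values that rate features tend to contain.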

Label Engineering

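from sklearn.preprocessing import LabelEncoder

# attack_map is a dict from raw Label values to attack type names; define it before this step
data['Attack Type'] = data['Label'].map(attack_map)
le = LabelEncoder()
data['Attack Number'] = le.fit_transform(data['Attack Type'])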
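
The contents of `attack_map` are not shown here. As a sketch, the mapping below groups a handful of CICIDS2017 labels into coarser families. The grouping itself is a hypothetical example, and the check for unmapped labels is an added safeguard rather than part of the original steps.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical grouping of raw labels into coarser attack families.
attack_map = {
    "BENIGN": "Normal",
    "DDoS": "DDoS",
    "DoS Hulk": "DoS",
    "DoS GoldenEye": "DoS",
    "PortScan": "Port Scan",
    "FTP-Patator": "Brute Force",
    "SSH-Patator": "Brute Force",
}

data = pd.DataFrame({"Label": ["BENIGN", "DoS Hulk", "SSH-Patator", "DDoS", "BENIGN"]})
data["Attack Type"] = data["Label"].map(attack_map)

# map() yields NaN for labels missing from the dictionary; fail early if any exist.
unmapped = data.loc[data["Attack Type"].isna(), "Label"].unique()
if len(unmapped) > 0:
    raise ValueError(f"Unmapped labels: {list(unmapped)}")

le = LabelEncoder()
data["Attack Number"] = le.fit_transform(data["Attack Type"])

# Keep the integer -> name lookup for reading confusion matrices later.
number_to_type = dict(enumerate(le.classes_))
```

`LabelEncoder` assigns integers in sorted order of the class names, so `number_to_type` is the reliable way to translate predictions back into attack types.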

EDA

import missingno as msno
import seaborn as sns

msno.bar(data)  # per-column count of non-missing values
sns.heatmap(data.corr(numeric_only=True))  # pairwise feature correlations

📌 Part 2 — CICIDS2018 Data Preprocessing & EDA

✅ Key Steps:

Mount Drive

from google.colab import drive
drive.mount('/content/drive')

Combine Multiple CSVs

import pandas as pd

df1 = pd.read_csv('/path/to/file1.csv')
df2 = pd.read_csv('/path/to/file2.csv')
# ignore_index=True gives the combined frame a fresh 0..n-1 index
data = pd.concat([df1, df2], ignore_index=True)
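
When the individual files are large, reading them in chunks can lower peak memory use. The sketch below is one way to do that; `read_csv_in_chunks` is a hypothetical helper, not a function from this repository, and the float32 downcast is an assumption about what precision is acceptable.

```python
import io

import pandas as pd


def read_csv_in_chunks(source, chunksize=100_000):
    """Read a CSV piece by piece, downcasting float64 columns per chunk."""
    chunks = []
    for chunk in pd.read_csv(source, chunksize=chunksize, low_memory=False):
        float_cols = chunk.select_dtypes(include="float64").columns
        chunk = chunk.astype({col: "float32" for col in float_cols})
        chunks.append(chunk)
    return pd.concat(chunks, ignore_index=True)


# Two in-memory CSVs stand in for the real files on Drive.
csv_a = "Flow Duration,rate,Label\n1,0.5,Benign\n2,1.5,Bot\n"
csv_b = "Flow Duration,rate,Label\n3,2.5,Benign\n4,3.5,Bot\n"

parts = [read_csv_in_chunks(io.StringIO(text), chunksize=1) for text in (csv_a, csv_b)]
data = pd.concat(parts, ignore_index=True)
```

With real files, pass the file path instead of a `StringIO` object and keep `chunksize` large enough that the per-chunk overhead stays small.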

Fix Data Types

data = fixDataType(data)
data = optimize_dtypes(data)
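
`fixDataType` is not reproduced in this README. One common reason such a step is needed is that a stray non-numeric row, for example a header line repeated mid-file, makes pandas read a whole column as `object`. The sketch below assumes that is the problem being solved; `fix_data_type` is a hypothetical stand-in, not the notebook's function.

```python
import pandas as pd


def fix_data_type(data, label_col="Label"):
    """Hypothetical stand-in for fixDataType: coerce feature columns to numeric."""
    # Drop rows that are a repeated header line (the label cell equals the column name).
    data = data[data[label_col] != label_col].copy()
    for col in data.columns:
        if col == label_col:
            continue
        # errors="coerce" turns unparseable cells into NaN instead of raising.
        data[col] = pd.to_numeric(data[col], errors="coerce")
    return data.reset_index(drop=True)


raw = pd.DataFrame({
    "Flow Duration": ["10", "Flow Duration", "30"],
    "Label": ["Benign", "Label", "Bot"],
})
fixed = fix_data_type(raw)
```

Any cell that becomes NaN through coercion can then be handled by the same missing-value treatment used elsewhere in the workflow.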

Label Encoding

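from sklearn.preprocessing import LabelEncoder

attack_map = {...}  # raw Label value -> attack type name (mapping elided here)
data['Attack Type'] = data['Label'].map(attack_map)
le = LabelEncoder()
data['Attack Number'] = le.fit_transform(data['Attack Type'])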

Visual EDA

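import missingno as msno
import seaborn as sns

msno.bar(data)  # per-column count of non-missing values
sns.boxplot(x='Attack Type', y='Flow Duration', data=data)  # flow duration spread per attack type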

🚀 Part 3 — Dynamic Ensemble Performance Evaluation

✅ Main Features

Load trained models for 2017 & 2018
Combine models: average, weighted, max-voting
Adaptive ensembling with:
- Confidence metrics
- Meta-learner (RandomForestRegressor)
Evaluate all pairwise model combinations
Generate comparison tables & plots
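
The three static combination rules listed above can be sketched on class-probability arrays as follows. This is an illustration under assumptions: `combine_probabilities` is a hypothetical function, and "max-voting" is interpreted here as taking the element-wise maximum probability per class, which may differ from the notebook's definition.

```python
import numpy as np


def combine_probabilities(proba_a, proba_b, method="average", weights=(0.5, 0.5)):
    """Combine two (n_samples, n_classes) probability arrays into class predictions."""
    proba_a = np.asarray(proba_a, dtype=float)
    proba_b = np.asarray(proba_b, dtype=float)
    if method == "average":
        combined = (proba_a + proba_b) / 2.0
    elif method == "weighted":
        w_a, w_b = weights
        combined = (w_a * proba_a + w_b * proba_b) / (w_a + w_b)
    elif method == "max":
        # Keep, per class, the more confident of the two models (assumed meaning).
        combined = np.maximum(proba_a, proba_b)
    else:
        raise ValueError(f"Unknown method: {method}")
    return combined.argmax(axis=1)


pa = np.array([[0.6, 0.4], [0.2, 0.8]])
pb = np.array([[0.3, 0.7], [0.9, 0.1]])

print(combine_probabilities(pa, pb, "average"))               # [1 0]
print(combine_probabilities(pa, pb, "weighted", (0.8, 0.2)))  # [0 1]
print(combine_probabilities(pa, pb, "max"))                   # [1 0]
```

In practice the inputs would come from each model's `predict_proba`, and both models must have been trained with the same class ordering for the columns to line up.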

🧩 How to Run

1️⃣ Mount Google Drive

from google.colab import drive
drive.mount('/content/drive')

2️⃣ Run the Pipeline

if __name__ == "__main__":
    runner, standard_summary, adaptive_summary = main()

🧩 Core Classes

| Class | Role |
| --- | --- |
| `ModelLoader` | Loads models and test splits |
| `EnsemblePredictor` | Static ensembling |
| `AdaptiveEnsemblePredictor` | Confidence and meta-learning |
| `EvaluationMetrics` | Accuracy, F1, recall, precision |
| `VisualizationTools` | Confusion matrices, bar plots, heatmaps |
| `EnsembleExperimentRunner` | Runs all experiments and reporting |
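
As an illustration of the metrics `EvaluationMetrics` reports, the sketch below computes them with scikit-learn. The `evaluate` function and the choice of weighted averaging are assumptions; the class's real interface is not shown in this README.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support


def evaluate(y_true, y_pred, average="weighted"):
    """Return accuracy, precision, recall and F1 as a dictionary."""
    accuracy = accuracy_score(y_true, y_pred)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average=average, zero_division=0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


metrics = evaluate([0, 0, 1, 1], [0, 0, 1, 0])
print(metrics["accuracy"])  # 0.75
```

Weighted averaging accounts for class imbalance by weighting each class's score by its support; `average="macro"` would instead treat rare attack classes as equally important.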

📈 Example: Run an Adaptive Ensemble

adaptive = AdaptiveEnsemblePredictor()
preds, confs, weights = adaptive.predict_ensemble(
    X_input, model1, model2,
    method='meta_learner',
    X_train=X_train_subset, y_train=y_train_subset
)
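
How the `meta_learner` method works internally is not described here. One plausible design, sketched below under stated assumptions, trains a `RandomForestRegressor` to predict a per-sample weight for the first model from the input features. The function names, the target definition, and the toy data are all hypothetical, and labels are assumed to be integers 0..K-1 in the same order as each model's `classes_`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier


def fit_weight_learner(X_train, y_train, model1, model2, random_state=0):
    """Fit a regressor that predicts, per sample, how much to trust model1 over model2."""
    rows = np.arange(len(y_train))
    # Probability each model assigns to the true class of every training sample.
    p1 = model1.predict_proba(X_train)[rows, y_train]
    p2 = model2.predict_proba(X_train)[rows, y_train]
    # Target in [0, 1]: 0.5 when the models tie, closer to 1 when model1 is better.
    target = 0.5 + 0.5 * (p1 - p2)
    meta = RandomForestRegressor(n_estimators=50, random_state=random_state)
    meta.fit(X_train, target)
    return meta


def predict_with_learned_weights(X, model1, model2, meta):
    """Blend the two models' probabilities using the learned per-sample weight."""
    w1 = np.clip(meta.predict(X), 0.0, 1.0)[:, None]
    combined = w1 * model1.predict_proba(X) + (1.0 - w1) * model2.predict_proba(X)
    return combined.argmax(axis=1), combined.max(axis=1), w1.ravel()


# Synthetic data and simple base models stand in for the trained IDS models.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
model1 = LogisticRegression(max_iter=1000).fit(X[:200], y[:200])
model2 = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X[:200], y[:200])

meta = fit_weight_learner(X[:200], y[:200], model1, model2)
preds, confs, weights = predict_with_learned_weights(X[200:], model1, model2, meta)
```

For brevity the weight learner is fitted on the same rows as the base models; a held-out subset would give a less optimistic picture of which model to trust.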

📊 Outputs

Individual model metrics (accuracy, F1, precision, recall)
Confusion matrix comparisons
Top-k model combination heatmaps
CSV-style DataFrame of results
Summary reports comparing standard vs adaptive ensembles
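
The results table for pairwise combinations could be assembled along these lines. This is a sketch only: `pairwise_average_results`, the model names, and the probability values are invented for illustration, and simple averaging stands in for whichever ensembling method is being compared.

```python
from itertools import combinations

import numpy as np
import pandas as pd


def pairwise_average_results(probas, y_true):
    """Score the averaged ensemble of every model pair; `probas` maps name -> array."""
    y_true = np.asarray(y_true)
    rows = []
    for name_a, name_b in combinations(sorted(probas), 2):
        combined = (np.asarray(probas[name_a]) + np.asarray(probas[name_b])) / 2.0
        accuracy = float((combined.argmax(axis=1) == y_true).mean())
        rows.append({"model_a": name_a, "model_b": name_b, "accuracy": accuracy})
    return pd.DataFrame(rows).sort_values("accuracy", ascending=False, ignore_index=True)


probas = {
    "rf": [[0.9, 0.1], [0.2, 0.8], [0.4, 0.6]],
    "svm": [[0.6, 0.4], [0.7, 0.3], [0.3, 0.7]],
    "knn": [[0.2, 0.8], [0.4, 0.6], [0.9, 0.1]],
}
results = pairwise_average_results(probas, y_true=[0, 1, 1])
print(results)
```

A DataFrame in this shape can be pivoted on `model_a` and `model_b` to feed a heatmap of the best-performing combinations.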

🏷️ License

Academic & research use only. Please cite CICIDS2017 and CICIDS2018.

✍️ Author

This project was built for security researchers working on real-time Intrusion Detection using ensemble learning techniques.

Repository: snigdhasv/Adaptive-Ensemble-Learning-for-Intrusion-Detection-Using-CICIDS2017-and-CICIDS2018