Abstract:
A distributed denial of service (DDoS) attack aims to prevent authorized individuals
from accessing a server or website by flooding it with traffic from many sources. Such an
attack leaves the system unsafe, so detection is required to keep it from damaging the
target system. The aim of this thesis is to provide an ensemble machine learning technique
for detecting DDoS attacks. Another objective is to select the optimal features of the
dataset. The dataset used in this thesis is collected from the Kaggle repository and
contains 42 columns and 17171 rows. First, three feature selection
techniques (ANOVA, mutual information, and feature importance) are used to reduce the
dataset and improve performance. Then, optimal features are selected using domain
knowledge. The traditional machine learning methods K-Nearest Neighbors (KNN), Support
Vector Machine (SVM), Decision Tree (DT), and Naive Bayes (NB) are then applied to the
chosen features. Next, five ensemble models are created from
the three-model and four-model combinations of the four traditional models: (KNN, SVM,
DT), (KNN, SVM, NB), (KNN, NB, DT), (SVM, NB, DT), and (KNN, SVM, NB, DT). The outcome
of each experiment is evaluated using accuracy, precision, recall, and F1-score. The
results show that the ensemble voting classifier combining KNN, SVM, and DT gives the
highest accuracy. Among the feature selection techniques, feature importance gives the
highest accuracy, 98.86%, and with the optimal features the highest accuracy for
detecting DDoS attacks reaches 99.4%.
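
As an illustration of how the five ensembles arise and how a voting classifier combines its members, the following is a minimal sketch in plain Python, not the implementation used in this thesis. It assumes hard (majority) voting, and it assumes that ties go to the smallest label. The function names and the toy predictions are hypothetical and are not taken from the dataset.

```python
from collections import Counter
from itertools import combinations

BASE_MODELS = ["KNN", "SVM", "DT", "NB"]


def ensemble_combinations(models):
    """Return every 3-model subset plus the full 4-model set."""
    combos = list(combinations(models, 3))
    combos.append(tuple(models))
    return combos


def hard_vote(predictions):
    """Majority vote over one sample's predicted labels.

    Ties (possible with an even number of voters) go to the smallest
    label; this tie-break is an assumption of the sketch.
    """
    counts = Counter(predictions)
    top = max(counts.values())
    return min(label for label, n in counts.items() if n == top)


def vote_ensemble(per_model_preds, members):
    """Combine the per-sample predictions of the chosen member models."""
    columns = [per_model_preds[m] for m in members]
    return [hard_vote(sample) for sample in zip(*columns)]


# Hypothetical predictions for four traffic samples (1 = attack, 0 = normal).
toy_preds = {
    "KNN": [1, 0, 1, 0],
    "SVM": [1, 0, 0, 0],
    "DT": [1, 1, 1, 0],
    "NB": [0, 1, 0, 1],
}

# Print each of the five ensembles with its voted predictions.
for members in ensemble_combinations(BASE_MODELS):
    print(members, vote_ensemble(toy_preds, members))
```

Four subsets of three models plus the single set of all four give the five ensembles listed above. With four voters a 2-2 split is possible, so any four-model voting ensemble needs an explicit tie-breaking rule.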