A MACHINE LEARNING FRAMEWORK FOR DETECTING MALICIOUS URLS IN IOT NETWORK TRAFFIC
Abstract
The rapid proliferation of Internet of Things (IoT) devices has significantly expanded the attack surface for cyber threats. Malicious or obfuscated URLs have emerged as a primary attack vector, often used to deliver payloads that exploit vulnerabilities such as buffer overflows. To address this, we propose EIoT-MLF, a machine learning framework for detecting malicious URLs in IoT network traffic. Our methodology employs a structured pipeline comprising data cleaning, correlation-based feature selection, and techniques for handling class imbalance. We conducted a comparative evaluation of five machine learning classifiers (Decision Tree, Random Forest, K-Nearest Neighbors, Logistic Regression, and Gaussian Naive Bayes) across multiple heterogeneous datasets. Model performance was assessed using standard metrics: accuracy, precision, recall, and F1-score. Our results show that the Random Forest classifier performed best, achieving 98% accuracy and high recall, which is crucial for minimizing false negatives in security applications. Feature importance analysis identified URL length, specific token frequencies, digit ratios, and the use of non-standard ports as the most significant indicators of malicious activity. These findings indicate that a purpose-built, URL-centric machine learning approach can offer a generalizable and reliable solution for enhancing IoT security and an effective strategy for mitigating web-based intrusions.
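To make the URL-centric indicators named above concrete, the following is a minimal illustrative sketch of how URL length, digit ratio, token frequency, and non-standard port use might be computed from a raw URL. It is not the EIoT-MLF implementation: the function name `url_features`, the token list `SUSPICIOUS_TOKENS`, and the port table `STANDARD_PORTS` are assumptions introduced here purely for illustration, since the abstract does not enumerate the exact tokens or port rules used.

```python
from urllib.parse import urlsplit

# Hypothetical token list: the abstract does not say which tokens were counted.
SUSPICIOUS_TOKENS = ("login", "admin", "cmd", "exec", "update")

# Assumed default ports per scheme; anything else is treated as non-standard.
STANDARD_PORTS = {"http": 80, "https": 443}


def url_features(url: str) -> dict:
    """Compute simple lexical features from a single URL string."""
    parts = urlsplit(url)

    # .port raises ValueError when the port is malformed or out of range.
    try:
        port = parts.port
    except ValueError:
        port = None

    lowered = url.lower()
    digit_count = sum(ch.isdigit() for ch in url)

    return {
        "url_length": len(url),
        "digit_ratio": digit_count / len(url) if url else 0.0,
        # Total occurrences of any listed token anywhere in the URL.
        "token_hits": sum(lowered.count(tok) for tok in SUSPICIOUS_TOKENS),
        # True only when an explicit port differs from the scheme default.
        "non_standard_port": port is not None
        and port != STANDARD_PORTS.get(parts.scheme.lower()),
    }


if __name__ == "__main__":
    print(url_features("http://192.168.1.10:8080/cgi-bin/login.cgi?cmd=reboot"))
```

Feature dictionaries of this kind could then be stacked into a table and passed to any of the five classifiers compared in this work; the features and thresholds that matter in practice would come from the feature selection step rather than from a fixed list like the one assumed here.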
