DEEP LEARNING-BASED TRAFFIC SIGN DETECTION IN DEVELOPING-COUNTRY ROAD CONDITIONS: A COMPARATIVE STUDY OF YOLOV8, YOLOV5, VISION TRANSFORMER AND RESNET18
Keywords:
Traffic sign detection; YOLOv8n; YOLOv5; Vision Transformer; ResNet18; intelligent transportation systems; road safety; developing countries
Abstract
Traffic sign detection is a safety-critical perception task for intelligent transportation systems, driver-assistance applications and road-infrastructure monitoring. In developing-country road environments, this task is complicated by faded or damaged signboards, inconsistent installation heights, dust, partial occlusion, motion blur, illumination variation and visually cluttered backgrounds. This paper presents a comparative deep learning study of traffic sign detection and recognition using two detectors, YOLOv8n and YOLOv5nu, and two image-level classifiers, Vision Transformer (ViT-tiny) and ResNet18. A YOLO-format dataset containing 2,099 labelled images across 21 class identifiers was normalized and split into 1,679 training images and 420 validation images. The YOLO models were trained at 640-pixel image size for 22 epochs using AdamW, while the image-level classifiers were fine-tuned for 10 epochs. Experimental results show that YOLOv8n achieved 95.94% precision, 97.53% recall, 96.73% F1-score, 98.51% mAP@50 and 84.50% mAP@50-95. YOLOv5nu obtained a slightly higher mAP@50 of 98.72%, whereas YOLOv8n provided stronger recall and marginally better mAP@50-95. For classification, ResNet18 reached 98.90% on both accuracy and weighted F1-score, while ViT-tiny achieved only 62.91% accuracy, indicating that the transformer branch requires more data, stronger augmentation or a hybrid local-global design before deployment. The findings support YOLOv8n as a practical real-time detection backbone for cost-aware traffic sign monitoring, while also showing that detector-classifier cascades must be validated end to end before being claimed as operationally superior.
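As a quick consistency check on the figures reported above, the F1-score can be recomputed from precision and recall as their harmonic mean, and the dataset split can be recomputed from the image counts. The following is a minimal sketch using only the numbers stated in the abstract; the function name `f1_score` and the variable names are illustrative assumptions and are not taken from the study's code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Detection metrics reported in the abstract for YOLOv8n.
precision, recall = 0.9594, 0.9753
f1 = f1_score(precision, recall)
print(f"F1 = {f1:.2%}")

# Dataset split reported in the abstract.
train_images, val_images = 1679, 420
total = train_images + val_images
val_share = val_images / total
print(f"total images = {total}, validation share = {val_share:.1%}")
```

The recomputed F1 rounds to the reported 96.73%, and the two subsets sum to the stated 2,099 images, with roughly one fifth held out for validation.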













