Real-Time Fusion of Sign Language Recognition and YOLO-Based Object Detection for Context-Aware Communication
DOI: https://doi.org/10.55011/59mh6b11
Keywords: Sign language recognition, YOLOv3, object detection, deep learning, accessibility, real-time systems
Abstract
This paper presents a unified, real-time system that integrates sign language recognition with object detection to enhance communication for the Deaf and hard-of-hearing community. The proposed framework combines a CNN-LSTM-based gesture recognition model with YOLOv3 for rapid object detection, enabling simultaneous interpretation of user gestures and the surrounding visual context. Live video captured via webcam is processed frame by frame, with gesture and object outputs displayed through a web-based user interface. The gesture model was trained on the ISL-30 dataset, achieving an F1-score of 91.2%, while the object detector reached a mean average precision (mAP@0.5) of 46.5%, all while maintaining a real-time throughput of 44 FPS. Experimental results demonstrate that fusing linguistic and environmental cues significantly improves context-aware interaction, offering a scalable assistive solution for inclusive communication.
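The frame-by-frame pipeline summarized in the abstract can be sketched as a simple capture-predict-fuse loop. The snippet below is a minimal illustration only: the CNN-LSTM gesture classifier and YOLOv3 detector are replaced by dummy stand-ins, and the sequence length, fusion step, and web-UI hand-off are assumptions for illustration rather than the authors' actual implementation.

```python
# Minimal sketch of the frame-by-frame fusion loop described in the abstract.
# The models, sequence length, and fusion logic below are hypothetical stand-ins.
import cv2


class DummyGestureModel:
    """Stand-in for the trained CNN-LSTM sign classifier (hypothetical)."""
    def predict(self, frames):
        return "HELLO"  # dummy gesture label


class DummyDetector:
    """Stand-in for the YOLOv3 object detector (hypothetical)."""
    def detect(self, frame):
        return [("person", 0.90)]  # dummy (label, confidence) detections


def fuse_outputs(gesture, detections):
    # Combine the recognized sign with detected objects into one context-aware
    # message; in the described system this would be sent to the web UI.
    return {"gesture": gesture, "objects": detections}


def main():
    gesture_model = DummyGestureModel()
    detector = DummyDetector()
    cap = cv2.VideoCapture(0)      # live webcam capture
    frames = []                    # short buffer forming the LSTM's temporal window
    seq_len = 16                   # assumed sequence length; not specified in the paper

    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)
        if len(frames) > seq_len:
            frames.pop(0)

        gesture = gesture_model.predict(frames)   # gesture from the frame sequence
        detections = detector.detect(frame)       # objects from the current frame
        print(fuse_outputs(gesture, detections))  # placeholder for the web UI update

        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()


if __name__ == "__main__":
    main()
```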
