Real-Time Binary Classification Using Classical Vision and Deep Learning
This project implements a real-time binary face classification pipeline by combining classical computer vision techniques with modern deep learning models. The goal is to detect human faces from live video streams and classify them into two categories—male or female—under real-world conditions such as varying lighting, pose, and background clutter.
Face detection is performed using OpenCV’s Haar Cascade classifier, a lightweight and fast method based on handcrafted features and boosted classifiers. Haar cascades enable efficient face localization on CPU-only systems, making them well suited for real-time applications where computational resources are limited. Detected face regions are cropped, normalized, and passed to a neural network for classification. For gender classification, a convolutional neural network is trained on labeled face images. The CNN learns discriminative spatial features such as facial structure, texture, and local appearance patterns that are difficult to capture with rule-based methods. The model outputs a binary prediction indicating the most likely gender class for each detected face.
The system is designed as a modular perception pipeline, where classical vision handles fast region proposal while deep learning focuses on semantic understanding. This separation improves efficiency, interpretability, and robustness compared to an end-to-end deep model operating directly on full images. The final system operates in real time on live camera feeds, demonstrating stable face tracking and classification with minimal latency. This project highlights the practical value of hybrid vision systems that combine traditional algorithms with learning-based models for deployment-ready computer vision applications. The CNN model is a ResNet-18 based fine tuned model.