Abstract:
Object detection is used in many applications, including autonomous vehicles, digital
image scanning, street traffic monitoring, object classification, and face detection.
It does more than classify the objects in an image: it also locates each object and
draws a bounding box around it. Finding every
instance of an object from a given class, such as people, cars, or faces, in a picture is
the aim of object detection. Although an image usually contains only a few instances of
the object, they can appear at a vast range of locations and scales, all of which must be
searched. As a result, most effective object detection networks combine object
localization methods with image classifiers based on neural networks. We
develop, train, and deploy object detection models using the TensorFlow Object
Detection API, an open-source framework built on Google's TensorFlow, together with
OpenCV, an open-source computer vision library with Python bindings that provides
image processing routines. This thesis focuses on the real-time detection of defective
products on a production line, which requires a well-trained object detection model. We
used SSD-MobileNet-v2, a model pre-trained on the COCO (Common Objects in Context)
dataset. Because the pre-trained model cannot detect our target classes, we collected
training samples and fine-tuned the model for better
prediction. We fine-tuned the model for 2,500, 5,000, and 10,000 steps. Performance
metrics such as mAP (mean Average Precision) and recall increased with the number of
training steps, so the model fine-tuned for 10,000 steps showed the best overall
performance, with an mAP of 0.9278 and a recall of 0.9379.
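
To make the reported metrics concrete, the following is a minimal sketch of how precision
and recall can be computed for one image at a single IoU (intersection over union)
threshold. It is an illustration, not the evaluation code used in this thesis: the function
names `iou` and `precision_recall`, the (xmin, ymin, xmax, ymax) box format, and the 0.5
threshold are all assumptions. mAP additionally averages precision over recall levels and
classes, which this sketch does not cover.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(detections, ground_truth, iou_threshold=0.5):
    """Greedy matching of detections to ground-truth boxes for one class.

    detections: list of (score, box); ground_truth: list of boxes.
    The 0.5 default threshold is an assumption made for illustration.
    """
    matched = set()  # indices of ground-truth boxes already claimed
    true_pos = 0
    # Visit detections from highest to lowest confidence score.
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt_box in enumerate(ground_truth):
            if idx in matched:
                continue
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_pos += 1
    false_pos = len(detections) - true_pos
    false_neg = len(ground_truth) - true_pos
    precision = true_pos / (true_pos + false_pos) if detections else 0.0
    recall = true_pos / (true_pos + false_neg) if ground_truth else 0.0
    return precision, recall


# Hypothetical example: one correct detection, one false alarm, one missed object.
dets = [(0.9, (0, 0, 2, 2)), (0.8, (10, 10, 12, 12))]
gts = [(0, 0, 2, 2), (5, 5, 7, 7)]
print(precision_recall(dets, gts))  # (0.5, 0.5)
```

In this toy example one of two detections is correct and one of two ground-truth objects
is found, so both precision and recall are 0.5. A recall of 0.9379, as reported above,
would correspond to roughly 94 of every 100 ground-truth objects being matched.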