Machine learning is a branch of artificial intelligence in which computers learn independently and apply what they have learned to new data sets. Deep learning is a subset of machine learning in which the machine is trained on example inputs, for instance to recognize objects. The trained machine can then recognize an object even in scenarios where it has not seen that particular example of the object before.
We see young children do this every day. Imagine for a moment a toddler learning the word ‘cat’. The toddler will point to different objects and say ‘cat’, and the parents will confirm or correct each guess. Over time, the toddler learns what a cat actually is and recognizes cats without parental help.
Triple Ring Technologies designed a demo to showcase machine learning applied to images and video. We chose the NVIDIA cuDNN library as the ‘back end’ of our application because of our extensive experience with high-performance computing on GPUs. The library can also dramatically reduce the time required to train a model and improve the model’s quality. The model itself is a Convolutional Neural Network (CNN).
For the demo, we trained the application to recognize the FILA logo in sports games and on clothing. The application was trained on roughly 500 images of differing sizes and resolutions, each containing at least one occurrence of the FILA logo. The content of these images varied, from logos on clothing to large banners in the background of sports games. Training took only a few hours.
The application was tested on a sports video composed of frames that were not used for training. The application tagged each occurrence of the FILA logo with a bounding box and calculated a confidence level for the detection. The results are shown in the video above, which displays the bounding boxes and confidence levels.
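To make the idea of a tagged detection concrete, here is a minimal Python sketch of how a bounding box and its confidence level might be represented and filtered before being drawn on a frame. It is an illustration only, not the demo's actual code: the `Detection` class, the `keep_confident` and `label` helpers, and the 0.5 threshold are all assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    """One tagged logo: a bounding box in pixels plus a confidence in [0, 1].

    Hypothetical structure for illustration; not taken from the demo.
    """
    x: int
    y: int
    width: int
    height: int
    confidence: float


def keep_confident(detections: List[Detection], threshold: float = 0.5) -> List[Detection]:
    """Drop detections below the threshold and sort the rest, most confident first."""
    kept = [d for d in detections if d.confidence >= threshold]
    return sorted(kept, key=lambda d: d.confidence, reverse=True)


def label(detection: Detection) -> str:
    """Text that could be drawn next to the bounding box."""
    return f"FILA {detection.confidence:.0%}"


if __name__ == "__main__":
    # Made-up values, used only to exercise the helpers.
    frame_detections = [
        Detection(40, 60, 120, 50, 0.87),
        Detection(300, 210, 30, 12, 0.20),
    ]
    for d in keep_confident(frame_detections):
        print(label(d), (d.x, d.y, d.width, d.height))
```

Raising the threshold trades missed logos for fewer false tags; the right value depends on the application.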
Achieving high detection accuracy requires careful and complex engineering of the training model, because many factors influence its quality. Since the location of the target object in a frame is not known in advance, every frame must be scanned for any sub-region that might contain it. We successfully generated a robust model that not only met the required accuracy but also analyzed the video stream at more than 60 frames per second on a single GPU.
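One common way to scan a frame for sub-regions is a sliding window, sketched below in Python. This is a generic illustration of the scanning problem, not a description of how the demo is implemented: the function names, the square window shape, the window sizes, and the half-window stride are all assumptions. The sketch also shows why speed is hard to achieve: at 60 frames per second there are only about 16.7 ms available per frame, and the number of candidate regions grows quickly.

```python
from typing import Iterator, Sequence, Tuple

Window = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def sliding_windows(frame_w: int, frame_h: int, win: int, stride: int) -> Iterator[Window]:
    """Yield square sub-regions of a frame, left to right, top to bottom."""
    for y in range(0, frame_h - win + 1, stride):
        for x in range(0, frame_w - win + 1, stride):
            yield (x, y, win, win)


def count_windows(frame_w: int, frame_h: int, sizes: Sequence[int],
                  stride_fraction: float = 0.5) -> int:
    """Count candidate regions over several window sizes (hypothetical helper)."""
    total = 0
    for win in sizes:
        stride = max(1, int(win * stride_fraction))  # never step by zero pixels
        total += sum(1 for _ in sliding_windows(frame_w, frame_h, win, stride))
    return total


if __name__ == "__main__":
    # Illustrative frame size and window sizes; not values from the demo.
    n = count_windows(1280, 720, sizes=[64, 128, 256])
    budget_ms = 1000.0 / 60.0  # time available per frame at 60 frames per second
    print(f"{n} candidate regions per frame, {budget_ms:.1f} ms per frame")
```

Each yielded window would be passed to the classifier. Smaller strides and more window sizes find more logos but multiply the work per frame, which is one reason GPU acceleration matters for real-time video.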
Triple Ring Technologies offers a range of services covering all aspects of the machine learning process, drawing on our deep physics and software capabilities. The goal is to maximize the efficacy and speed of the model on new data streams, and Triple Ring will be tenacious in our quest to provide excellent results for our customers.
Please contact us for further demonstrations of our capabilities in this fast-growing category of High Performance Computing.