Analysis

Computer Vision

Computer Vision is a field of artificial intelligence that teaches machines how to interpret and understand visual information from the world, similar to how humans perceive and comprehend images and videos. It involves enabling computers to analyze, recognize, and make decisions based on visual data.

Computer vision intersects with neural networks through Convolutional Neural Networks (CNNs), a specialized class of deep learning models designed for processing grid-like data, particularly images. They are widely used in computer vision because they can automatically learn hierarchical features from visual data.


There are several notable applications of CNNs which are used in this project.

Applications of Convolutional Neural Networks
 

CNNs have proven to be versatile and potent tools, contributing significantly to advancements in computer vision across various industries and applications. Three of these applications are often confused with one another: image classification, image segmentation, and object detection.

Let's have a look at how they differ:

Image Classification

Image Segmentation

Semantic Segmentation

Object Detection

Project Architecture

PokerMate aims to be an intelligent tool that not only recognizes one or more of the 52 playing cards of a standard deck in an image, but also guides you through the options those cards give you. The project achieves this goal in three stages, in order: Classification, Detection, and an Interface.

Stages of Modelling

Stage 1 | Image Classification

To start simple, image classification marks the initial stage. The goal of this stage was to develop an image classification model capable of recognizing playing cards using Convolutional Neural Networks (CNNs). The dataset consists of images of various playing cards, each labeled with its corresponding card type.


MODEL

Using Keras, a convolutional neural network was constructed to categorize a given image of a playing card. The input to this model is an image file, presumably containing one card, and the output is a label representing one of the 52 valid cards.

Input (Sample image of a playing card)        - - - - >        Model      - - - - - >           Output (Ace of Spades)
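The last step of that flow, turning the model's 52 output scores into a readable label, can be sketched as follows. This is a minimal illustration only: the class ordering in CLASS_NAMES and the helper name decode_prediction are assumptions, not taken from the project, and the real ordering depends on how the training labels were encoded.

```python
RANKS = ["Ace", "2", "3", "4", "5", "6", "7", "8", "9", "10",
         "Jack", "Queen", "King"]
SUITS = ["Clubs", "Diamonds", "Hearts", "Spades"]

# Assumed class ordering (suit-major, then rank); the trained model's
# actual ordering may differ.
CLASS_NAMES = [f"{rank} of {suit}" for suit in SUITS for rank in RANKS]


def decode_prediction(probabilities):
    """Return (label, score) for the highest-scoring of the 52 classes."""
    if len(probabilities) != len(CLASS_NAMES):
        raise ValueError("expected one score per card class")
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return CLASS_NAMES[best], probabilities[best]
```

Given the per-class scores from the classifier's final softmax layer, decode_prediction returns the most likely card name together with its score, which a caller could compare against a threshold before trusting the result.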

Sample Model Classification Outputs

Jack of Hearts

7 of Hearts

Queen of Clubs

Ace of Clubs

Stage 2 | Object Detection

After training a basic model to identify playing cards, the project progressed to the task of recognizing multiple cards within a single image, a crucial step for identifying poker hands consisting of 2 + 5 cards. This stage involved object detection.

To improve accuracy, the project raised the ante by adding object detection through transfer learning. This meant starting from the pre-trained weights of the Nano variant of YOLOv8, a popular object detection model originally trained on everyday objects like dogs and cats. With specific layers frozen, the model was then fine-tuned on the project's custom dataset, transferring what it had already learned to the task of detecting playing cards.

MODEL

After fine-tuning, the customized model could detect (rather than just classify) one or more cards in a given image and infer the label of each one.

Input (Image of one or more cards)       - - - - >        Model        - - - - - >         Output (Ace Spades, King Clubs, 10 Hearts)
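A detector can report the same card more than once, for example when both corner indices of one card are picked up, and it may also emit low-confidence guesses. The sketch below shows one way raw detections could be reduced to a clean list of card labels. It is an assumption-laden illustration: the (label, confidence) pair format, the function name cards_from_detections, and the 0.5 threshold are hypothetical and not taken from the project.

```python
from typing import Dict, List, Tuple

# Hypothetical detection format: (card label, confidence in [0, 1]).
Detection = Tuple[str, float]


def cards_from_detections(detections: List[Detection],
                          min_confidence: float = 0.5) -> List[str]:
    """Drop weak detections and keep each card label once.

    Labels are returned ordered from most to least confident.
    """
    best: Dict[str, float] = {}
    for label, confidence in detections:
        if confidence < min_confidence:
            continue  # ignore low-confidence guesses
        if confidence > best.get(label, 0.0):
            best[label] = confidence  # keep the strongest hit per card
    return sorted(best, key=lambda label: best[label], reverse=True)
```

The resulting list of unique labels is the kind of input a hand-evaluation step would need, since a valid deck contains each card only once.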

Sample Model Detection Outputs

Stage 3 | User Interface

The project involved establishing an underlying logic, followed by the development of a user-friendly interface using Streamlit. This interface includes a straightforward design with an upload option, enabling users to submit images of their poker hands.

After upload, the image passes through a sequence of steps: image classification to assess the overall suitability of the image, followed by object detection to identify specific card labels. The business logic then computes the strength of the user's poker hand. Once this is complete, the user interface presents the result for users to consider in their decision-making.
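The hand-strength step can be illustrated with a small evaluator. This is a sketch under stated assumptions rather than the project's actual business logic: cards are represented as hypothetical (rank, suit) tuples with ranks 2 to 14 (Ace high), and the best five-card hand is found by brute force over all five-card combinations of the five to seven cards detected.

```python
from collections import Counter
from itertools import combinations

CATEGORIES = ["High Card", "One Pair", "Two Pair", "Three of a Kind",
              "Straight", "Flush", "Full House", "Four of a Kind",
              "Straight Flush"]


def score_five(cards):
    """Score exactly five (rank, suit) cards as (category, tiebreak ranks)."""
    counts = Counter(rank for rank, _ in cards)
    # Ranks ordered by how often they occur, then by rank, highest first.
    ordered = sorted(counts, key=lambda r: (counts[r], r), reverse=True)
    shape = sorted(counts.values(), reverse=True)
    flush = len({suit for _, suit in cards}) == 1
    unique = sorted(counts, reverse=True)
    straight = len(unique) == 5 and unique[0] - unique[4] == 4
    if unique == [14, 5, 4, 3, 2]:  # Ace plays low in A-2-3-4-5
        straight, ordered = True, [5, 4, 3, 2, 1]
    if straight and flush:
        category = 8
    elif shape == [4, 1]:
        category = 7
    elif shape == [3, 2]:
        category = 6
    elif flush:
        category = 5
    elif straight:
        category = 4
    elif shape == [3, 1, 1]:
        category = 3
    elif shape == [2, 2, 1]:
        category = 2
    elif shape == [2, 1, 1, 1]:
        category = 1
    else:
        category = 0
    return category, ordered


def best_hand(cards):
    """Name the strongest five-card hand available from 5 to 7 cards."""
    if not 5 <= len(cards) <= 7:
        raise ValueError("expected between 5 and 7 cards")
    category, _ = max(score_five(five) for five in combinations(cards, 5))
    return CATEGORIES[category]
```

With seven cards there are only 21 five-card combinations to check, so the brute-force search is cheap enough to run on every upload.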

Voila! Never a dull moment at the Poker table!