Development and Evaluation of a Multi-Class Image Recognition Model Using Google Teachable Machine


Introduction

Artificial Intelligence (AI)-based recognition models have become increasingly prominent across diverse fields, including healthcare and autonomous systems, due to their ability to automate complex classification tasks and extract meaningful patterns from large datasets (Russell & Norvig, 2020; Mayer-Schönberger & Cukier, 2013). These systems rely on deep learning architectures, particularly Convolutional Neural Networks (CNNs), which have proven effective in tasks such as image recognition and object detection (LeCun, Bengio, & Hinton, 2015; He et al., 2016).

 

This report presents the development and evaluation of a multi-class image recognition model implemented using Google Teachable Machine—a web-based platform that simplifies machine learning model creation (Google Teachable Machine, 2023). The model is trained to classify a custom dataset comprising animal images (cats, dogs, and birds), providing an opportunity to explore both model effectiveness and practical deployment in lightweight applications.

The study adopts a systematic approach, beginning with task selection and dataset collection, followed by model training and configuration. It then proceeds to the testing methodology and results analysis, concluding with a critical assessment of the model’s performance and actionable strategies for improvement. This structured framework ensures a comprehensive evaluation of the model’s capabilities and limitations, providing insights into its practical applicability and potential enhancements in real-world AI systems (Mitchell, 2019; Georgia Institute of Technology, n.d.).

Task Selection and Description

The chosen task for this project is image classification, specifically using a supervised learning approach. Supervised learning is one of the most widely used techniques in machine learning, where models learn to map inputs to outputs based on labelled training data (Russell & Norvig, 2020; Mitchell, 2019). In this context, the model is trained to classify images into three distinct categories: Cat, Dog, and Bird.

This task was selected because it is both simple and diverse, making it suitable for evaluating the model’s ability to generalize across different visual classes. Tasks with multiple but distinct categories are commonly used to assess model flexibility and robustness (LeCun, Bengio, & Hinton, 2015). Moreover, image classification has significant real-world relevance, especially in domains such as pet monitoring, wildlife tracking, and smart surveillance systems—demonstrating its practical value beyond theoretical experimentation (Mayer-Schönberger & Cukier, 2013; Mitchell, 2019).

Dataset Selection and Description

The dataset used in this project is a custom dataset collected through web scraping: training images were obtained from Unsplash, while testing images were sourced from Pexels. To ensure consistency and compatibility with lightweight models, all images were preprocessed by resizing them to 224×224 pixels. The distribution of the dataset is presented in Table 1 below.
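The resizing step described above can be sketched in a few lines with Pillow. This is a minimal illustration, not the project's actual preprocessing script, and the synthetic demonstration image stands in for the scraped files:

```python
from PIL import Image

TARGET_SIZE = (224, 224)  # input resolution used for all images in this project

def preprocess(image: Image.Image) -> Image.Image:
    """Normalise an image to RGB and resize it to 224x224."""
    return image.convert("RGB").resize(TARGET_SIZE)

# Demonstration on a synthetic image; a real pipeline would call
# Image.open(path) on each scraped file instead.
raw = Image.new("RGB", (640, 480), color="gray")
resized = preprocess(raw)
print(resized.size)  # (224, 224)
```

Converting to RGB before resizing also guards against greyscale or RGBA files slipping into the dataset with an unexpected number of channels.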

Table 1: Dataset distribution by class

Model Training Details

Platform: Google Teachable Machine

  • Model Type: Convolutional Neural Network (CNN) (default architecture).
  • Training Configuration:
    • Epochs: 50 (default).
    • Batch Size: 16.
    • Learning Rate: 0.001.
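Teachable Machine hides the training loop, but the three hyperparameters above have standard meanings. The toy softmax classifier below (pure NumPy, with synthetic stand-in features rather than Teachable Machine's actual internals) shows where each one enters:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for extracted image features: 150 samples, 3 classes.
X = rng.normal(size=(150, 64))
y = X[:, :3].argmax(axis=1)  # labels derived from the features so learning is possible

W = np.zeros((64, 3))
EPOCHS, BATCH_SIZE, LEARNING_RATE = 50, 16, 0.001  # the configuration listed above

def mean_cross_entropy(W):
    logits = X @ W
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return float(-np.log(probs[np.arange(len(y)), y]).mean())

initial_loss = mean_cross_entropy(W)

for epoch in range(EPOCHS):                      # one epoch = one full pass over the data
    order = rng.permutation(len(X))
    for start in range(0, len(X), BATCH_SIZE):   # mini-batches of 16 samples
        idx = order[start:start + BATCH_SIZE]
        xb, yb = X[idx], y[idx]
        logits = xb @ W
        logits = logits - logits.max(axis=1, keepdims=True)
        probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
        probs[np.arange(len(yb)), yb] -= 1.0     # d(loss)/d(logits) for cross-entropy
        W -= LEARNING_RATE * (xb.T @ probs) / len(yb)  # SGD step scaled by the learning rate

final_loss = mean_cross_entropy(W)
print(f"loss: {initial_loss:.4f} -> {final_loss:.4f}")
```

The small learning rate (0.001) means each mini-batch nudges the weights only slightly, which is why 50 epochs are needed for the loss to fall appreciably.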

Training Process

  • Input: Uploaded labelled images per class.
  • Output: Trained model.
  • Training Time: ~5 minutes (cloud-based processing).

Visualisations

Testing and Results

Testing Methodology

  • Test Set: 50 images (stratified by class, sourced from Pexels—a different source than the training images).
  • Metrics Evaluated:
    • Accuracy (Overall correctness).
    • Precision and recall (Per-class performance).
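Stratifying the 50-image test set means sampling it so that every class is represented in fixed proportion. A minimal sketch in pure Python (the filenames and per-class count here are illustrative, not the project's actual files):

```python
import random
from collections import defaultdict

def stratified_sample(items, labels, per_class, seed=0):
    """Draw an equal number of items from each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for item, label in zip(items, labels):
        by_class[label].append(item)
    sample = []
    for label in sorted(by_class):
        sample.extend(rng.sample(by_class[label], per_class))
    return sample

# Illustrative candidate pool: 90 scraped images across three balanced classes.
images = [f"img_{i}.jpg" for i in range(90)]
labels = [("cat", "dog", "bird")[i % 3] for i in range(90)]

test_set = stratified_sample(images, labels, per_class=10)
print(len(test_set))  # 30 images, 10 per class
```

Without stratification, a random draw could under-represent one class and make its per-class metrics unreliable.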

Explanation of Metrics

  1. Accuracy:
    • Cat: 70.59% of cat samples were correctly classified.
    • Dog: 94.12% of dog samples were correctly identified.
    • Bird: 68.75% of bird samples were accurately predicted.
    • Overall accuracy is calculated as the total number of correct predictions across all categories divided by the total number of test samples.
  2. Precision:
    • Cat: Precision is calculated as (True Positives / (True Positives + False Positives)). For cats, this indicates the proportion of correctly identified cats out of all samples predicted as cats.
    • Dog: With a precision of 100%, all predicted dogs were correctly classified.
    • Bird: Precision reflects that 78.57% of predicted birds were accurate.
  3. Recall:
    • Cat: Recall for cats matches the per-class accuracy reported above, since both measure the proportion of actual cat samples that were correctly classified.
    • Dog: High recall (94.12%) indicates that most actual dog samples were correctly identified.
    • Bird: Recall of 68.75% mirrors the per-class accuracy, indicating some misclassification in this category.
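All three metrics above can be read off a confusion matrix. The sketch below uses a small hypothetical matrix (the counts are illustrative, not this project's actual results) to show how accuracy, precision, and recall are computed:

```python
# Rows = actual class, columns = predicted class.
# Hypothetical counts for illustration only.
classes = ["cat", "dog", "bird"]
cm = [
    [8, 1, 1],   # actual cats
    [0, 9, 1],   # actual dogs
    [2, 0, 8],   # actual birds
]

total = sum(sum(row) for row in cm)
correct = sum(cm[i][i] for i in range(3))
accuracy = correct / total

for i, name in enumerate(classes):
    tp = cm[i][i]
    fn = sum(cm[i]) - tp                       # actual class i, predicted as something else
    fp = sum(cm[r][i] for r in range(3)) - tp  # predicted class i, actually something else
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    print(f"{name}: precision={precision:.2%}, recall={recall:.2%}")

print(f"overall accuracy={accuracy:.2%}")
```

Note that precision reads down a column (what was predicted as that class) while recall reads across a row (what that class actually was), which is why a model can score 100% precision on dogs while still missing some actual dogs.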

Overall Assessment

The model performs best in identifying dogs, while it struggles slightly more with cats and birds. The precision metrics highlight the model’s reliability in predicting dogs, while recall indicates good performance in identifying actual samples across categories. Overall, the metrics suggest areas for improvement, particularly for birds and cats.

Analysis and Discussion

The evaluation of the image classification model trained on a custom dataset using Google Teachable Machine reveals key insights into its performance and limitations. The model demonstrated strong accuracy and precision in classifying dog images, achieving a perfect precision score (100%) and high recall (94.12%), indicating both confident and consistent identification of this class. This outcome reflects the effectiveness of convolutional neural networks (CNNs) in capturing salient visual features, even within simplified architectures (LeCun, Bengio, & Hinton, 2015; He et al., 2016).

In contrast, the model showed lower performance for cat and bird images, with both classes yielding accuracy and recall around 70%, and bird precision at 78.57%. This performance gap may be attributed to intra-class variability, overlapping features, or less diverse representations in the training set—factors that are known to affect generalization in image recognition tasks (Mitchell, 2019). Additionally, the use of different image sources for training (Unsplash) and testing (Pexels) could have introduced a domain shift, impacting model accuracy due to inconsistencies in lighting, background, or composition (Google Teachable Machine, 2023; Pexels, n.d.; Unsplash, n.d.).

Despite these challenges, the model showed adequate generalization ability for a beginner-friendly framework. The architecture provided by Google Teachable Machine enables rapid deployment and experimentation, making it ideal for educational use and prototype development (Google Teachable Machine, 2023). However, performance on more complex categories like cats and birds could be improved through data augmentation, use of larger or more balanced datasets, and exploration of deeper or residual network architectures (He et al., 2016; Russell & Norvig, 2020).
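Of the improvements suggested above, data augmentation is the easiest to apply before re-uploading a dataset to Teachable Machine. A minimal sketch with Pillow (the specific transforms and parameters are illustrative choices, not a prescribed recipe):

```python
from PIL import Image

def augment(image: Image.Image) -> list:
    """Generate simple augmented variants of a single training image."""
    return [
        image.transpose(Image.FLIP_LEFT_RIGHT),  # horizontal mirror
        image.rotate(15, expand=False),          # slight clockwise tilt
        image.rotate(-15, expand=False),         # slight counter-clockwise tilt
        # Centre crop, then resize back to the original resolution.
        image.crop((10, 10, image.width - 10, image.height - 10)).resize(image.size),
    ]

original = Image.new("RGB", (224, 224), color="white")
variants = augment(original)
print(len(variants))  # 4 extra training samples from one image
```

Each original image yields several plausible variants, which increases effective dataset size and exposes the model to pose and framing variation without new scraping.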

Conclusion

This project successfully implemented a supervised image classification task using Google Teachable Machine to categorize animals into three classes: Cat, Dog, and Bird. The model demonstrated high accuracy and precision in classifying dogs, while its performance was more moderate for cats and birds—indicating areas for further improvement. These results highlight the importance of dataset diversity, consistent image sourcing, and the potential need for more advanced model customization (LeCun, Bengio, & Hinton, 2015; He et al., 2016).

To build on these results, future work will focus on expanding the model’s capabilities and practical relevance. This includes testing the model with real-time camera input to evaluate its performance in dynamic, real-world environments and experimenting with audio-based animal recognition, such as distinguishing between barks and meows. These enhancements aim to create a more robust and versatile recognition system, paving the way for broader applications in fields like pet monitoring, wildlife observation, and intelligent home systems (Mitchell, 2019; Mayer-Schönberger & Cukier, 2013).

The use of Google Teachable Machine proved effective for rapid prototyping and ease of implementation, especially for educational and lightweight applications (Google Teachable Machine, 2023). However, the relatively simple convolutional neural network (CNN) model used may benefit from further architectural enhancements and training with a more balanced and diverse dataset to boost generalization performance (Russell & Norvig, 2020).

References

Georgia Institute of Technology. (n.d.). Natural language processing. https://sites.cc.gatech.edu/classes/AY2021/cs7650_fall/

Google Teachable Machine. (2023). Official documentation. https://teachablemachine.withgoogle.com/

He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 770–778.

Jurafsky, D. (2021, May 6). Part-of-speech tagging and named entity recognition [PowerPoint slides]. Stanford University. https://web.stanford.edu/~jurafsky/slp3/slides/8_POSNER_intro_May_6_2021.pptx

LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444. https://doi.org/10.1038/nature14539

Mayer-Schönberger, V., & Cukier, K. (2013). Big data: A revolution that will transform how we live, work, and think. Houghton Mifflin Harcourt.

Mitchell, M. (2019). Artificial intelligence: A guide for thinking humans. Farrar, Straus and Giroux. https://melaniemitchell.me/aibook/

Pexels. (n.d.). Testing Images [Photograph]. Pexels. https://www.pexels.com/photo/

Russell, S. J., & Norvig, P. (2020). Artificial intelligence: A modern approach (4th ed.). Pearson. https://github.com/pemagrg1/AI_class2022/blob/main/book/Artificial-Intelligence-A-Modern-Approach-4th-Edition-1-compressed.pdf

University of Washington. (2020). Word embeddings [PDF slides]. https://courses.cs.washington.edu/courses/csep517/20wi/slides/csep517wi20-WordEmbeddings.pdf

University of Waterloo. (n.d.). NLP fundamentals [PowerPoint slides]. https://ov-research.uwaterloo.ca/MSCI641/Week2_NLP_fundamentals.pptx

Unsplash. (n.d.). Training Images [Photograph]. Unsplash. https://unsplash.com/photos/
