Application of computer vision based on neural networks for object recognition in images

Abstract

Background. Computer vision is the scientific field concerned with processing and analyzing images by computer. Prominent applications include “smart city” systems, drones, human and object recognition, fire detection, and user identification systems. The field has been strongly shaped by the rapid development of neural networks. Over the past decade, the use of artificial intelligence, and of neural networks in particular, has grown rapidly, driven by the active development of AI, its popularization, a growing number of users, and advances in mobile technologies. After the deep convolutional neural network AlexNet achieved a breakthrough result at the 2012 ImageNet Large Scale Visual Recognition Challenge (ILSVRC), specialists began to consider trained neural networks as a means of image recognition.

Objective. This work examines, in practice, the difference between computer vision based on neural networks and traditional computer vision.

Methods. We consider the similarities and differences between the two types of computer vision. Computer vision based on neural networks requires the model to be trained on labeled examples before it is used. This training allows it to recognize objects in images with greater accuracy, and a trained model can operate efficiently on video and real-time frames. Traditional computer vision requires no training and relies on classical image processing methods such as thresholding, edge detection, or template matching. However, it may be less accurate in recognizing objects, especially in complex images.
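To make the contrast concrete, the following is a minimal sketch of one traditional, training-free technique: template matching by sum of absolute differences. It is not the traditional implementation used in this work, which is not reproduced here. The function names, the toy image, and the template are all assumptions chosen for illustration, and images are represented as plain nested lists so that the example has no dependencies.

```python
# Traditional (training-free) object localization by template matching.
# Images are nested lists of grayscale intensities; all names are illustrative.

def sum_abs_diff(image, template, top, left):
    """Sum of absolute differences between the template and one image window."""
    return sum(
        abs(image[top + r][left + c] - template[r][c])
        for r in range(len(template))
        for c in range(len(template[0]))
    )


def match_template(image, template):
    """Return (row, col, score) of the best-matching window (lowest score)."""
    img_h, img_w = len(image), len(image[0])
    tpl_h, tpl_w = len(template), len(template[0])
    best = None
    # Slide the template over every position where it fits entirely.
    for top in range(img_h - tpl_h + 1):
        for left in range(img_w - tpl_w + 1):
            score = sum_abs_diff(image, template, top, left)
            if best is None or score < best[2]:
                best = (top, left, score)
    return best


# Toy 5x5 image with a bright 2x2 square whose top-left corner is at row 2, column 1.
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 0, 0, 0, 0],
]
template = [
    [9, 9],
    [9, 9],
]

row, col, score = match_template(image, template)
print(row, col, score)  # 2 1 0
```

No training step is involved: the template itself encodes what to look for. This is also why such methods tend to degrade on complex images, since any change in scale, rotation, or lighting raises the difference score even when the object is present.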

Results. We present the code for a simple neural network program in Python for object recognition in images below:

    import tensorflow as tf
    from tensorflow.keras import layers, models

    # Create the neural network model
    model = models.Sequential()

    # First convolutional layer; expects 28x28 single-channel (grayscale) images
    model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
    model.add(layers.MaxPooling2D((2, 2)))

    # Second convolutional layer
    model.add(layers.Conv2D(64, (3, 3), activation='relu'))
    model.add(layers.MaxPooling2D((2, 2)))

    # Flatten the feature maps into a vector
    model.add(layers.Flatten())

    # Fully connected layers
    model.add(layers.Dense(64, activation='relu'))
    model.add(layers.Dense(10, activation='softmax'))  # 10 classes to recognize

    # Compile the model; the loss expects integer class labels
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])

    # A function for training the model on loaded images
    def train_model(model, train_images, train_labels):
        model.fit(train_images, train_labels, epochs=10)

    # Load images and train the model.
    # load_images() and load_test_images() are placeholders that are not defined
    # in this listing; each must return an (images, labels) pair.
    train_images, train_labels = load_images()
    train_model(model, train_images, train_labels)

    # Test the model on new images
    test_images, test_labels = load_test_images()
    test_loss, test_acc = model.evaluate(test_images, test_labels)
    print('The accuracy of object recognition in new images:', test_acc)
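As a check on the architecture above, the sketch below traces the tensor shape and the number of trainable parameters through each layer using plain arithmetic. It assumes the Keras defaults that the listing relies on implicitly: unpadded ("valid") convolutions with stride 1, and non-overlapping pooling. The helper names are illustrative and not part of TensorFlow.

```python
# Shape and parameter walk-through for the model in the listing above.
# Assumes Keras defaults: 'valid' padding, stride 1 for Conv2D, and
# MaxPooling2D with stride equal to the pool size. Helper names are illustrative.

def conv2d(shape, filters, kernel):
    """Output shape and parameter count of an unpadded, stride-1 convolution."""
    h, w, c = shape
    out = (h - kernel + 1, w - kernel + 1, filters)
    params = kernel * kernel * c * filters + filters  # weights + biases
    return out, params


def max_pool(shape, size):
    """Output shape of non-overlapping max pooling; it has no parameters."""
    h, w, c = shape
    return (h // size, w // size, c), 0


def dense(units_in, units_out):
    """Output width and parameter count of a fully connected layer."""
    return units_out, units_in * units_out + units_out  # weights + biases


def summarize(input_shape=(28, 28, 1)):
    """Walk the layer stack from the listing and collect (name, shape, params)."""
    rows = []
    shape, params = conv2d(input_shape, filters=32, kernel=3)
    rows.append(("conv2d_1", shape, params))
    shape, params = max_pool(shape, size=2)
    rows.append(("max_pool_1", shape, params))
    shape, params = conv2d(shape, filters=64, kernel=3)
    rows.append(("conv2d_2", shape, params))
    shape, params = max_pool(shape, size=2)
    rows.append(("max_pool_2", shape, params))
    flat = shape[0] * shape[1] * shape[2]
    rows.append(("flatten", (flat,), 0))
    units, params = dense(flat, 64)
    rows.append(("dense_1", (units,), params))
    units, params = dense(units, 10)
    rows.append(("dense_2", (units,), params))
    return rows


rows = summarize()
for name, shape, params in rows:
    print(f"{name:<11}{str(shape):<15}{params}")
print("total parameters:", sum(p for _, _, p in rows))  # total parameters: 121930
```

Most of the parameters (102,464 of 121,930) sit in the first fully connected layer, because the flattened 5x5x64 feature map has 1,600 values that each connect to all 64 units.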

Conclusions. Comparing the neural network implementation with a traditional computer vision implementation, we found that the neural network approach is more effective. This is because neural networks can be trained on large datasets, which allows them to recognize objects with high accuracy.
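For a comparison of this kind, both approaches need to be scored with the same metric on the same test images. The sketch below shows how classification accuracy can be computed from a softmax output like the one produced by the final layer of the model. The function names and all numeric values are made up for illustration; they are not measurements from this work.

```python
import math

# Accuracy computation for comparing two recognition methods on the same labels.
# All inputs below are invented toy values, not results from this work.

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(probabilities):
    """Index of the most probable class (argmax)."""
    return max(range(len(probabilities)), key=probabilities.__getitem__)


def accuracy(predicted, expected):
    """Fraction of predictions that equal the expected labels."""
    if not expected or len(predicted) != len(expected):
        raise ValueError("predicted and expected must be non-empty and equal in length")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# One toy score vector: the largest score wins after softmax.
probabilities = softmax([2.0, 1.0, 0.1])
print(predict_class(probabilities))  # 0

# Two hypothetical sets of predictions scored against the same toy labels.
labels = [0, 1, 2, 1]
predictions_a = [0, 1, 2, 2]
predictions_b = [0, 2, 2, 2]
print(accuracy(predictions_a, labels), accuracy(predictions_b, labels))  # 0.75 0.5
```

This is the same quantity that `model.evaluate` reports as `test_acc` when the model is compiled with `metrics=['accuracy']`.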

Applying computer vision based on neural networks makes it possible to build more powerful and efficient systems that can process large volumes of data and draw more accurate conclusions.

About the authors

Togliatti State University

Author for correspondence.
Email: stasyrez@gmai.com

student, PIb-2106a

Russian Federation, Togliatti

References

  1. Goryachkin B.S., Kitov M.A. Computer vision // E-Scio. 2020. N 9(48). P. 317–345.
  2. Shevchenko V.D., Maryenkov A.N., Khanova A.A. Analysis of computer vision methods for detecting prohibited symbols in images on the internet // Caspian Journal: Management and High Technologies. 2022. N 2(58). P. 9–18. doi: 10.54398/20741707_2022_2_9

Copyright (c) 2024 Reznikova A.R.

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.