Object detection with computer vision models has long been held back by processing lag, which in some pipelines amounts to a few seconds per image. This delay hindered practical adoption in latency-sensitive use cases like autonomous driving. The YOLOv8 model, released by Ultralytics, targets this bottleneck: it detects objects in real time with high accuracy and speed, which has made it popular in the computer vision space.
This article explores YOLOv8, its capabilities, and how you can fine-tune and create your own models through its open-source GitHub repository.
YOLOv8 Explained
YOLO (You Only Look Once) is a popular computer vision model capable of detecting and segmenting objects in images. The model has gone through several updates, with YOLOv8 marking the eighth version.
YOLOv8 builds on the capabilities of previous versions by introducing new features and improvements. These enable real-time object detection in image and video data with enhanced accuracy and precision.
From v1 to v8: A Brief History
YOLOv1: Released in 2015, the first version of YOLO was introduced as a single-stage object detection model. It reads the entire image and predicts every bounding box in a single evaluation.
YOLOv2: The next version, released in 2016, achieved top performance on benchmarks like PASCAL VOC and COCO while operating at high speeds (40 to 67 FPS). It could also detect over 9,000 object categories, even with limited category-specific detection data.
YOLOv3: Launched in 2018, YOLOv3 introduced a more effective backbone network, multiple anchors, and spatial pyramid pooling for multi-scale feature extraction.
YOLOv4: Released in 2020, YOLOv4 introduced the Mosaic data augmentation technique, which improved training.
YOLOv5: Released in 2020, YOLOv5 added new features, including hyperparameter optimization and integrated experiment tracking.
YOLOv6: Open-sourced by Meituan in 2022, YOLOv6 introduced new features such as a self-distillation strategy and an Anchor-Aided Training (AAT) strategy.
YOLOv7: Also released in 2022, YOLOv7 improved on existing models in speed and accuracy and was the fastest object detection model at the time of its release.
What Makes YOLOv8 Stand Out?
YOLOv8’s accuracy and speed set it apart from previous versions. Objects can be detected in real time with less delay than in earlier releases.
But besides this, YOLOv8 comes packed with powerful capabilities, which include:
- Customizable architecture: YOLOv8 offers a flexible architecture that developers can customize to fit their specific requirements.
- Adaptive training: YOLOv8’s adaptive training capabilities include balancing the loss function during training and adapting the learning rate. Optimizers such as Adam, for example, contribute to better accuracy, faster convergence, and overall better model performance.
- Advanced image analysis: Through new semantic segmentation and class prediction capabilities, the model can detect activities, color, texture, and even relationships between objects besides its core object detection functionality.
- Data augmentation: New data augmentation techniques help tackle aspects of image variations like low resolution, occlusion, etc., in real-world object detection situations where conditions are not ideal.
- Backbone support: YOLOv8 offers support for multiple backbones, including CSPDarknet (default backbone), EfficientNet (lightweight backbone), and ResNet (classic backbone), that users can choose from.
Users can even customize the backbone by replacing the CSPDarknet53 with any other CNN architecture compatible with YOLOv8’s input and output dimensions.
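The adaptive-training and data-augmentation items above correspond to training arguments in the Ultralytics package. The following is a minimal sketch rather than a recommended recipe: the argument names (`optimizer`, `lr0`, `mosaic`, `fliplr`, `hsv_h`) come from the Ultralytics training settings, while the specific values and the `build_train_args` and `train_with_options` helpers are assumptions made for illustration.

```python
def build_train_args(data="coco8.yaml", epochs=100, imgsz=640):
    """Collect training options in one dictionary (values are illustrative)."""
    return {
        "data": data,
        "epochs": epochs,
        "imgsz": imgsz,
        "optimizer": "Adam",  # optimizer choice, e.g. "SGD", "Adam", "AdamW", or "auto"
        "lr0": 0.001,         # initial learning rate (assumed value)
        "mosaic": 1.0,        # probability of applying Mosaic augmentation
        "fliplr": 0.5,        # probability of a horizontal flip
        "hsv_h": 0.015,       # hue jitter fraction
    }


def train_with_options(weights="yolov8n.pt"):
    """Train with the options above; requires `pip install ultralytics`."""
    from ultralytics import YOLO  # imported lazily so build_train_args works without it

    model = YOLO(weights)
    return model.train(**build_train_args())


if __name__ == "__main__":
    # Only prints the options; call train_with_options() to start a real run.
    print(build_train_args())
```

Keeping the options in one dictionary makes it easy to compare runs that differ only in the optimizer or in augmentation strength.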
Training and Fine-tuning YOLOv8
The YOLOv8 model can be either fine-tuned to fit certain use cases or be trained entirely from scratch to create a specialized model. More details about the training procedures can be found in the official documentation.
Let’s explore how you can carry out both of these operations.
Fine-tuning YOLOv8 With a Custom Dataset
The fine-tuning operation loads a pre-existing model and uses its pretrained weights as the starting point for training. Intuitively speaking, the model retains its previous knowledge, and fine-tuning adds new information by adjusting the weights.
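The intuition above can be illustrated with a toy example that has nothing to do with YOLO's real architecture. This sketch assumes a one-parameter linear model trained by plain gradient descent. The "pretrained" starting weight of 2.5 and all other numbers are made up for illustration.

```python
def mse(w, xs, ys):
    """Mean squared error of the one-parameter model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(w, xs, ys, lr=0.01, steps=5):
    """Run a few steps of gradient descent starting from weight w."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w


xs = [1.0, 2.0, 3.0]
ys = [3.0 * x for x in xs]  # target relationship: y = 3x

w_finetuned = train(2.5, xs, ys)  # "pretrained" start, already close to the answer
w_scratch = train(0.0, xs, ys)    # untrained start

print("fine-tuned loss:", mse(w_finetuned, xs, ys))
print("from-scratch loss:", mse(w_scratch, xs, ys))
```

With the same number of steps, the run that starts from the nearby weight ends with the lower loss, which is the practical appeal of fine-tuning over training from scratch.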
The YOLOv8 model can be fine-tuned from Python code or through the command line interface (CLI).
1. Fine-tune a YOLOv8 model using Python
Start by installing the Ultralytics package from PyPI. You will then import it into your code and load the pretrained model that you want to fine-tune.
# Install the ultralytics package from PyPI
pip install ultralytics
Next, execute the following code within a Python file:
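from ultralytics import YOLO

# Load a pretrained model
model = YOLO("yolov8n.pt")

# Train the model on the dataset described in coco8.yaml (a small MS COCO sample)
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)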
This code trains the model on a COCO dataset for 100 epochs. You can change settings such as the image size and the number of epochs, or define them in a YAML file.
Once you train the model with your settings and data path, monitor progress, test and tune the model, and keep retraining until your desired results are achieved.
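The data path mentioned above is usually supplied as a small dataset YAML file. The sketch below writes one using only the standard library. The keys (`path`, `train`, `val`, `names`) follow the Ultralytics dataset format, while the file name, folder layout, and class names are hypothetical placeholders.

```python
import tempfile
from pathlib import Path


def write_dataset_yaml(path, root, class_names):
    """Write a minimal dataset description in the Ultralytics YAML layout."""
    lines = [
        f"path: {root}",        # dataset root directory
        "train: images/train",  # training images, relative to path
        "val: images/val",      # validation images, relative to path
        "names:",               # class index -> class name
    ]
    lines += [f"  {index}: {name}" for index, name in enumerate(class_names)]
    target = Path(path)
    target.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return target


# Hypothetical dataset with two classes, written to the temp directory.
yaml_path = write_dataset_yaml(
    Path(tempfile.gettempdir()) / "custom_data.yaml",
    "datasets/custom",
    ["helmet", "vest"],
)
print(yaml_path.read_text(encoding="utf-8"))
```

The resulting file can then be passed as the `data` argument, for example `model.train(data=str(yaml_path), epochs=100, imgsz=640)`, once the images and labels exist at the listed locations.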
2. Fine-tune a YOLOv8 model using the CLI
To train a model using the CLI, run the following command:
yolo train model=yolov8n.pt data=coco8.yaml epochs=100 imgsz=640
The CLI command loads the pretrained `yolov8n.pt` model and trains it further on the dataset defined in the `coco8.yaml` file.
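The CLI takes its settings as `key=value` pairs after the mode name, so the same command can be assembled programmatically. This is a minimal sketch: `build_yolo_command` is a hypothetical helper, not part of the Ultralytics package, and it only builds the argument list without running anything.

```python
import shlex


def build_yolo_command(mode, **options):
    """Build the argument list for a `yolo <mode> key=value ...` invocation."""
    return ["yolo", mode] + [f"{key}={value}" for key, value in options.items()]


cmd = build_yolo_command(
    "train", model="yolov8n.pt", data="coco8.yaml", epochs=100, imgsz=640
)
print(shlex.join(cmd))

# To launch the run (requires the ultralytics package to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Building the command from a dictionary of options can be convenient when the same script needs to launch several runs with different settings.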
Creating Your Own Model with YOLOv8
There are essentially two ways of creating a custom model with the YOLO framework:
- Training From Scratch: This approach allows you to use the predefined YOLOv8 architecture but will NOT use any pre-trained weights. The training will occur from scratch.
- Custom Architecture: You tweak the default YOLO architecture and train the new structure from scratch.
The implementation of both these methods remains the same. To train a YOLO model from scratch, run the following Python code:
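from ultralytics import YOLO

# Load a model from its architecture definition (no pretrained weights)
model = YOLO("yolov8n.yaml")

# Train the model
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)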
Notice that this time, we have loaded a ‘.yaml’ file instead of a ‘.pt’ file. The YAML file contains the architecture information for the model, and no weights are loaded. The training command will start training this model from scratch.
To train a custom architecture, you must define the custom structure in a ‘.yaml’ file similar to the ‘yolov8n.yaml’ above. Then, you load this file and train the model using the same code as above.
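One simple way to derive a custom structure is to copy an existing architecture file and edit it. The sketch below changes the number of classes (`nc`) in an abbreviated, illustrative excerpt. A real architecture file also lists every backbone and head layer. The `set_num_classes` helper and the `custom_yolov8.yaml` name in the usage note are assumptions for illustration.

```python
import re

# Abbreviated, illustrative excerpt; the real yolov8n.yaml lists every layer.
BASE_CONFIG = """\
nc: 80  # number of classes
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]]
  - [-1, 1, Conv, [128, 3, 2]]
"""


def set_num_classes(config_text, num_classes):
    """Return a copy of the config text with the top-level `nc` value replaced."""
    new_text, count = re.subn(r"(?m)^nc:\s*\d+", f"nc: {num_classes}", config_text)
    if count != 1:
        raise ValueError("expected exactly one top-level 'nc:' entry")
    return new_text


custom_config = set_num_classes(BASE_CONFIG, 2)
print(custom_config)
```

After saving a complete edited file, for example as `custom_yolov8.yaml`, it can be loaded and trained in the same way as the `.yaml` file in the from-scratch example.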
To learn more about object detection using AI and to stay informed with the latest AI trends, visit unite.ai.