Object Detection, Pose Estimation and More

Want to build a model for pose estimation? I know one that can also handle object detection, instance segmentation, classification and more, all in real time. Yes, I am talking about YOLO26 from Ultralytics.
It can power security systems, or it can be fine-tuned to detect even small objects. Wondering how to get started? No worries, we'll cover the basics of YOLO and learn how to run inference with a model.
The Idea Behind YOLO
YOLO (You Only Look Once) is a family of deep learning models used for computer vision tasks. The core idea is to combine localization and classification. In simple words, localization finds objects and draws a bounding box around each one. Then, the classifier predicts the class probabilities and assigns the most likely class to that object. The latest family of YOLO models is YOLO26; as mentioned earlier, these models can perform:
- Object Detection: It finds one or more objects in an image and predicts a bounding box and a confidence score for each. This tells you what the object is and where it is.
- Classification: It assigns an image to one of ImageNet's 1000 categories. The class with the highest probability is chosen as the final prediction.
- Pose Estimation: It detects 17 keypoints of the human body defined by the COCO dataset. These include points such as the nose, shoulders, elbows, knees and ankles, which are used to estimate each person's pose.
- Oriented Bounding Box (OBB) detection: It predicts rotated bounding boxes using five parameters: x, y, w, h and θ. This is especially useful in aerial and satellite imagery, where objects rarely appear perfectly axis-aligned.
- Instance Segmentation: It generates a pixel-level mask for every detected object. This helps to distinguish individual items even if they are in the same category.
These models have higher accuracy and better performance than previous generations of models.
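To make the five OBB parameters from the list above more concrete, here is a minimal sketch that turns (x, y, w, h, θ) into four corner points. It assumes (x, y) is the box centre and θ is in radians; the function name `obb_corners` and the rotation sign convention are illustrative assumptions, not the library's own definition.

```python
import math

def obb_corners(x, y, w, h, theta):
    """Return the four corners of a rotated box.

    Assumes (x, y) is the box centre, w and h are width and height,
    and theta is the rotation angle in radians.
    """
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    half_w, half_h = w / 2.0, h / 2.0
    # Corner offsets of the unrotated box, relative to the centre.
    offsets = [(-half_w, -half_h), (half_w, -half_h),
               (half_w, half_h), (-half_w, half_h)]
    corners = []
    for dx, dy in offsets:
        # Standard 2D rotation of the offset, then shift back to the centre.
        corners.append((x + dx * cos_t - dy * sin_t,
                        y + dx * sin_t + dy * cos_t))
    return corners

# With theta = 0 the result is an ordinary axis-aligned box.
axis_aligned = obb_corners(0, 0, 4, 2, 0)
# A quarter turn swaps the horizontal and vertical extents.
rotated = obb_corners(100.0, 50.0, 40.0, 20.0, math.pi / 2)
```

With θ set to zero the function reduces to a regular bounding box, which is why the extra angle only matters when objects are not axis-aligned.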
Architecture

- Input Image: The input image is resized and normalized before the model processes it.
- Backbone (C3k2 + CSP): It extracts features from the image, such as edges, textures, shapes, and object patterns.
- Neck (PAN-FPN): It combines the P3, P4 and P5 feature maps. This helps improve detection of small, medium, and large objects respectively.
- Detection Head: It predicts object classes, bounding boxes, and confidence scores using the fused feature maps.
- Key Difference: It removes several components used in previous generations, mainly DFL (Distribution Focal Loss) and NMS (Non-Maximum Suppression), simplifying the pipeline while improving latency.
- Output: Object detection, instance segmentation, pose estimation, oriented bounding boxes, or classification.
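For readers who have not met NMS before, the sketch below shows the classic greedy version of the post-processing step that the list above says YOLO26 removes. It is not part of YOLO26 itself; the helper names `iou` and `nms`, the box format (x1, y1, x2, y2), and the 0.5 threshold are assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: visit boxes from best to worst score and keep a box
    only if it does not overlap an already kept box too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep

# Hypothetical detections: the first two boxes cover the same object.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

In this toy case the second box overlaps the first heavily and is dropped. Skipping this extra pass is what the article means by a simpler pipeline.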
For context
- C3k2: A feature extraction block introduced in recent YOLO models. It improves feature learning with fewer parameters.
- PAN (Path Aggregation Network): It passes low-level and high-level features in both directions, making it easier to detect objects of various sizes accurately.
- FPN (Feature Pyramid Network): It combines feature maps from multiple depths, helping to detect objects at multiple scales.
- P3 -> high-resolution feature map, P4 -> medium-resolution feature map and P5 -> low-resolution feature map. They help the model to see small, medium, and large objects respectively.
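As a rough illustration of the P3, P4 and P5 levels described above, the sketch below computes the grid size of each feature map. It assumes a 640x640 input and the conventional strides of 8, 16 and 32; neither number is stated in this article, so treat them as assumptions.

```python
# Conventional downsampling factor per pyramid level (assumed, not from the article).
STRIDES = {"P3": 8, "P4": 16, "P5": 32}

def feature_map_sizes(image_size=640):
    """Return the side length of each feature map for a square input."""
    return {level: image_size // stride for level, stride in STRIDES.items()}

sizes = feature_map_sizes(640)
for level, side in sizes.items():
    print(f"{level}: {side}x{side} cells, each covering {STRIDES[level]}x{STRIDES[level]} pixels")
```

Under these assumptions P3 has the finest grid, so each cell covers a small patch of the image and suits small objects, while P5 cells cover large patches and suit large objects.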
Hands-On
Let's try YOLO26 with the help of Google Colab. We will use this image for inference:

Note: YOLO models do not require high-end hardware. They can also be run locally or in a Jupyter Notebook.
Installation
!pip install -q "ultralytics>=8.4.0"
Here '-q' (quiet) installs the library and its dependencies with minimal output.
Defining a Helper Function
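from IPython.display import display
from PIL import Image

# Helper function: render an annotated result inline in the notebook.
def show(result):
    # result.plot() returns a BGR array, so reverse the channels to get RGB for PIL.
    display(Image.fromarray(result.plot()[..., ::-1]))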
This will be used to display the results.
Object Detection
from ultralytics import YOLO
IMAGE = "
model = YOLO("yolo26n.pt")
result = model(IMAGE)[0]
show(result)

The model successfully found the bus and the people.
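Beyond the plotted image, it is often useful to summarise what was detected. The sketch below counts detections per class. The helper `count_by_class` and the sample values are hypothetical; with a real result, the ids could be taken from `result.boxes.cls.tolist()` and the mapping from `result.names`.

```python
from collections import Counter

def count_by_class(class_ids, names):
    """Count detections per class name.

    class_ids: iterable of class ids (floats or ints).
    names: dict mapping class id -> class name.
    """
    counts = Counter(names[int(c)] for c in class_ids)
    return dict(counts)

# Hypothetical values standing in for a real detection result.
names = {0: "person", 5: "bus"}
class_ids = [5.0, 0.0, 0.0, 0.0]
summary = count_by_class(class_ids, names)
print(summary)  # prints {'bus': 1, 'person': 3}
```

The ids are cast to `int` because tensors converted with `tolist()` come back as floats.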
Instance Segmentation
seg_model = YOLO("yolo26n-seg.pt")
result = seg_model(IMAGE)[0]
show(result)

Here the model has done the segmentation, hiding the objects it has found. Edge detection also looks good.
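Because each detected object gets its own pixel-level mask, simple per-object measurements become possible. As a minimal sketch, the function below computes how much of the image one mask covers. The mask here is a hand-written nested list; with a real result it could be one slice of `result.masks.data`, converted to a list first. The function name is an assumption for illustration.

```python
def mask_area_fraction(mask):
    """Fraction of pixels that belong to the object in a binary mask.

    mask: list of rows, each a list of 0/1 (or False/True) values.
    """
    total = sum(len(row) for row in mask)
    covered = sum(1 for row in mask for value in row if value)
    return covered / total if total else 0.0

# Hypothetical 3x4 mask with 5 object pixels.
toy_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
fraction = mask_area_fraction(toy_mask)
```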
Pose Estimation / Keypoint Detection
pose_model = YOLO("yolo26n-pose.pt")
result = pose_model(IMAGE)[0]
show(result)

The model successfully predicted the key points of the human body to determine posture.
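The 17 predicted keypoints follow the COCO ordering, so each index maps to a named body part. The sketch below attaches names to one person's keypoints and drops low-confidence ones. The helper `label_keypoints`, the 0.5 threshold and the sample coordinates are assumptions; with a real result, the inputs could come from `result.keypoints.xy[0].tolist()` and `result.keypoints.conf[0].tolist()`.

```python
# COCO keypoint order (17 points).
COCO_KEYPOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]

def label_keypoints(points, scores, min_conf=0.5):
    """Map keypoint names to (x, y), keeping only confident points."""
    labelled = {}
    for name, (x, y), score in zip(COCO_KEYPOINTS, points, scores):
        if score >= min_conf:
            labelled[name] = (x, y)
    return labelled

# Hypothetical person: every point at (i, 2*i), only the first three confident.
points = [(float(i), float(2 * i)) for i in range(17)]
scores = [0.9, 0.8, 0.7] + [0.1] * 14
visible = label_keypoints(points, scores)
```

Filtering by confidence matters in practice because occluded joints still get a predicted position, just with a low score.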
Oriented Bounding Boxes
obb_model = YOLO("yolo26n-obb.pt")
result = obb_model("https://ultralytics.com/images/boats.jpg")[0]
show(result)

This model is suited to detecting rotated objects in aerial and satellite images. As you can see, it clearly detected the boats in the picture.
Image Classification
cls_model = YOLO("yolo26n-cls.pt")
result = cls_model(IMAGE)[0]
for i in result.probs.top5:
    print(f"{result.names[i]:<25} {result.probs.data[i]:.2%}")
Output:

The model outputs 1000 class probabilities, and the classifier accurately predicts the class as minibus.
Conclusion
In summary, you learned the basics of YOLO and YOLO26, explored its architecture, and ran inference in Google Colab for object detection, instance segmentation, pose estimation, oriented bounding boxes, and image classification. With its improved accuracy, efficiency, and real-time performance, YOLO26 is a good choice for many computer vision applications.
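To show how a list of class scores becomes a ranked prediction, here is a minimal sketch of softmax followed by top-k selection. It assumes raw scores are converted to probabilities with a softmax, which is the usual approach for classifiers; the function names and the four sample scores are illustrative, not taken from the model.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(probs, k=5):
    """Return (index, probability) pairs for the k most likely classes."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in order[:k]]

# Hypothetical scores for four classes.
probs = softmax([2.0, 0.5, 3.0, -1.0])
best = top_k(probs, k=2)
```

The first entry of `top_k` corresponds to the final prediction, and the remaining entries are what a top-5 printout shows.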
Frequently Asked Questions
A. In Google Colab, you can upload an image using the files.upload() function and pass the uploaded file's path to the model for inference.
A. Yes. You can read a video as a sequence of images (frames), apply the model to every frame, and combine the processed frames back into a video.
A. No. YOLO26 models can run on the CPU, although a GPU will be much faster for large tasks.


