Breaking the Bottleneck: GPU-Accelerated Video Decoding for Deep Learning

Deep learning (DL) applications often require processing video data for tasks such as object detection, classification, and segmentation. However, conventional video pipelines typically cannot keep up with the throughput demands of deep learning inference, which leads to performance bottlenecks. In this post we will use PyTorch and FFmpeg with NVIDIA hardware acceleration to address this.
The bottleneck comes from how video frames are typically decoded and transferred between the CPU and the GPU. The usual workflow found in most tutorials follows this structure:
- Decode frames on the CPU: video files are first decoded with CPU-based tools (e.g., OpenCV, or FFmpeg without GPU support).
- Transfer to the GPU: the decoded frames are copied from CPU memory to GPU memory to run deep learning inference with frameworks such as TensorFlow, PyTorch, ONNX Runtime, etc.
- Inference on the GPU: once the frames are in GPU memory, the model runs inference.
- Return to the CPU (if required): some post-processing steps may need the data back on the CPU.
This CPU-GPU transfer process introduces a significant bottleneck, especially when video is processed at high frame rates. The redundant memory copies and context switches reduce overall throughput and make real-time processing harder to achieve.
For example, consider a standard video processing pipeline in which frames are decoded on the CPU before being handed to a deep learning model.
Solution: GPU-Based Video Decoding and Inference
The most effective fix is to keep the entire pipeline on the GPU, from video decoding to inference, eliminating unnecessary CPU-GPU transfers. This can be achieved using FFmpeg with NVIDIA's GPU hardware decoder (NVDEC).
Here is how it works:
- GPU-accelerated video decoding: instead of CPU-based decoding, we use FFmpeg with NVIDIA's hardware decoder (NVDEC) to decode video frames directly on the GPU.
- Zero-copy frame handling: decoded frames stay in GPU memory, avoiding unnecessary memory transfers.
- GPU-based inference: once the frames are decoded, we run inference directly with any model on the same GPU, reducing latency.

Hands-on!
Requirements
To benefit from the optimization described above, we will rely on the dependencies listed below.
Installation
For a deeper understanding of how to build FFmpeg with NVIDIA GPU acceleration, follow these instructions.
Tested with:
- OS: Ubuntu 22.04
- NVIDIA driver version: 550.120
- CUDA version: 12.4
- torch: 2.4.0
- torchaudio: 2.4.0
- torchvision: 0.19.0
1. Install the NVIDIA codec headers (nv-codec-headers)
2. Clone and build FFmpeg
3. Verify the installation with torchaudio.utils
Time to get to the optimized pipeline code!
Benchmarks
To see whether it makes a difference, we will use this video from Perzonowski. Since it is a short video, I concatenated the same video with itself repeatedly to produce results at different video lengths. The original video is 32 seconds long and gives us 960 frames. The two new videos have 5520 and 9300 frames respectively.
The original video (960 frames)
- Standard pipeline: 28.51s
- Optimized pipeline: 24.20s
OK… doesn't seem like much of an improvement, right? Let's check the longer videos.
Modified video v1 (5520 frames)
- Standard pipeline: 118.72s
- Optimized pipeline: 100.23s
Modified video v2 (9300 frames)
- Standard pipeline: 292.26s
- Optimized pipeline: 240.85s
As video length increases, the performance benefit becomes more visible. In the longest test case we reach an 18% speedup, a significant reduction in processing time. These gains matter when processing large video datasets or running real-time video analytics.
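For reference, the relative speedups can be recomputed directly from the wall-clock times reported above:

```python
# Relative speedup for each benchmark, from the reported wall-clock times.
results = {
    "original (960 frames)": (28.51, 24.20),
    "v1 (5520 frames)": (118.72, 100.23),
    "v2 (9300 frames)": (292.26, 240.85),
}
for name, (standard, optimized) in results.items():
    speedup = (standard - optimized) / standard * 100
    print(f"{name}: {speedup:.1f}% faster")
# original (960 frames): 15.1% faster
# v1 (5520 frames): 15.6% faster
# v2 (9300 frames): 17.6% faster
```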
Conclusion
In today's post, we compared two video processing pipelines: the standard one, which decodes on the CPU and copies frames to the GPU, and an optimized one, which decodes and runs inference entirely on the GPU. Keeping everything on the GPU saves a significant amount of time, and the savings grow as video length increases.



