How to Import Pascal VOC Data into Label Studio and Run the Full Stack with Docker

Preparing a dataset for a computer vision model can take a long time and is often frustrating. Label Studio, an open-source labeling tool, helps by providing an easy way to import and annotate data. It supports various annotation types, including computer vision, natural language processing, and audio processing. In this article, however, we will focus specifically on object detection.
But what if you want to start from an existing open-source dataset, such as the Pascal VOC dataset? In this article, I will show you how to easily convert it into the Label Studio format while running the whole stack – including a PostgreSQL database, MinIO, and the Label Studio backend – with Docker. MinIO is an S3-compatible object store: you can use cloud-native storage in production, but run MinIO locally for development and testing.
In this tutorial, we will go through the following steps:
- Convert Pascal VOC annotations – Turn the bounding boxes from XML into Label Studio tasks in JSON format.
- Run the full stack – Start Label Studio with PostgreSQL, MinIO, nginx, and the backend using Docker.
- Set up the Label Studio project – Configure a new project in the Label Studio interface.
- Add images and tasks to MinIO – Store your data in the corresponding S3-compatible bucket.
- Connect MinIO to Label Studio – Attach the bucket as cloud storage in your project so that Label Studio can load images directly.
Requirements
To follow this tutorial, make sure you have:
From VOC to Label Studio: Preparing the Import
The Pascal VOC dataset has a folder structure in which the training and validation sets are kept separate. The Annotations folder contains each image's annotation file. Overall, the dataset includes 17,125 images, each with a corresponding XML annotation file.
.
└── VOC2012
├── Annotations # 17125 annotations
├── ImageSets
│ ├── Action
│ ├── Layout
│ ├── Main
│ └── Segmentation
├── JPEGImages # 17125 images
├── SegmentationClass
└── SegmentationObject
An XML snippet below, taken from one of the annotation files, describes the bounding box around an object labeled “person”. The box is specified using four pixel coordinates: xmin, ymin, xmax, and ymax.
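A representative Pascal VOC annotation with these fields looks like the following (the values here are illustrative, chosen to be consistent with the converted JSON example shown later in this article; real VOC files contain additional fields such as pose and truncated):

```xml
<annotation>
  <filename>2007_000027.jpg</filename>
  <size>
    <width>486</width>
    <height>500</height>
    <depth>3</depth>
  </size>
  <object>
    <name>person</name>
    <bndbox>
      <xmin>174</xmin>
      <ymin>101</ymin>
      <xmax>349</xmax>
      <ymax>351</ymax>
    </bndbox>
  </object>
</annotation>
```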
The figure below shows an inner rectangle as an illustrative bounding box, defined by its upper-left corner (xmin, ymin) and lower-right corner (xmax, ymax), inside an outer rectangle representing the image.

Label Studio expects each bounding box to be described by its width, height, and upper-left corner, expressed as percentages of the image size. Below is a valid example of the converted JSON format for the annotation shown above.
{
  "data": {
    "image": "s3:////2007_000027.jpg"
  },
  "annotations": [
    {
      "result": [
        {
          "from_name": "label",
          "to_name": "image",
          "type": "rectanglelabels",
          "value": {
            "x": 35.802,
            "y": 20.20,
            "width": 36.01,
            "height": 50.0,
            "rectanglelabels": ["person"]
          }
        }
      ]
    }
  ]
}
As you can see in the JSON format, you need to specify the image file location – for example, a local path, or an S3 path if you use cloud storage.
While converting the annotations, I included the entire dataset, even though it is split into training and validation sets. This mimics a real-world scenario where you start with a single dataset and make your own training/validation split before training.
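The conversion itself fits in a few lines of Python using only the standard library. This is a minimal sketch rather than the exact script used for the article: it assumes VOC-style XML with size and object/bndbox elements, and you supply the image URI (e.g. its future S3 path) yourself.

```python
import xml.etree.ElementTree as ET


def voc_to_task(xml_string: str, image_uri: str) -> dict:
    """Convert one Pascal VOC XML annotation into a Label Studio task dict."""
    root = ET.fromstring(xml_string)
    img_w = float(root.findtext("size/width"))
    img_h = float(root.findtext("size/height"))
    results = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        xmin = float(box.findtext("xmin"))
        ymin = float(box.findtext("ymin"))
        xmax = float(box.findtext("xmax"))
        ymax = float(box.findtext("ymax"))
        results.append({
            "from_name": "label",
            "to_name": "image",
            "type": "rectanglelabels",
            "value": {
                # Label Studio expects percentages of the image size,
                # not absolute pixel coordinates
                "x": xmin / img_w * 100,
                "y": ymin / img_h * 100,
                "width": (xmax - xmin) / img_w * 100,
                "height": (ymax - ymin) / img_h * 100,
                "rectanglelabels": [obj.findtext("name")],
            },
        })
    return {"data": {"image": image_uri}, "annotations": [{"result": results}]}
```

Running this over every file in the Annotations folder and dumping each resulting dict to a JSON file produces the import tasks.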
Running the Full Stack with Docker Compose
I have combined the docker-compose.yml and docker-compose.minio.yml files into a single simplified configuration so that the whole stack runs on the same network. Both of these files are taken from the official Label Studio GitHub repository.
services:
  nginx:
    # Acts as a reverse proxy for Label Studio frontend/backend
    image: heartexlabs/label-studio:latest
    restart: unless-stopped
    ports:
      - "8080:8085"
      - "8081:8086"
    depends_on:
      - app
    environment:
      - LABEL_STUDIO_HOST=${LABEL_STUDIO_HOST:-}
    volumes:
      - ./mydata:/label-studio/data:rw # Stores Label Studio projects, configs, and uploaded files
    command: nginx

  app:
    stdin_open: true
    tty: true
    image: heartexlabs/label-studio:latest
    restart: unless-stopped
    expose:
      - "8000"
    depends_on:
      - db
    environment:
      - DJANGO_DB=default
      - POSTGRE_NAME=postgres
      - POSTGRE_USER=postgres
      - POSTGRE_PASSWORD=
      - POSTGRE_PORT=5432
      - POSTGRE_HOST=db
      - LABEL_STUDIO_HOST=${LABEL_STUDIO_HOST:-}
      - JSON_LOG=1
    volumes:
      - ./mydata:/label-studio/data:rw # Stores Label Studio projects, configs, and uploaded files
    command: label-studio-uwsgi

  db:
    image: pgautoupgrade/pgautoupgrade:13-alpine
    hostname: db
    restart: unless-stopped
    environment:
      - POSTGRES_HOST_AUTH_METHOD=trust
      - POSTGRES_USER=postgres
    volumes:
      - ${POSTGRES_DATA_DIR:-./postgres-data}:/var/lib/postgresql/data # Persistent storage for PostgreSQL database

  minio:
    image: "minio/minio:${MINIO_VERSION:-RELEASE.2025-04-22T22-12-26Z}"
    command: server /data --console-address ":9009"
    restart: unless-stopped
    ports:
      - "9000:9000"
      - "9009:9009"
    volumes:
      - minio-data:/data # Stores uploaded dataset objects (like images or JSON tasks)
    # configure env vars in .env file or your system's environment
    environment:
      - MINIO_ROOT_USER=${MINIO_ROOT_USER:-minio_admin_do_not_use_in_production}
      - MINIO_ROOT_PASSWORD=${MINIO_ROOT_PASSWORD:-minio_admin_do_not_use_in_production}
      - MINIO_PROMETHEUS_URL=${MINIO_PROMETHEUS_URL:-}
      - MINIO_PROMETHEUS_AUTH_TYPE=${MINIO_PROMETHEUS_AUTH_TYPE:-public}

volumes:
  minio-data: # Named volume for MinIO object storage
This simplified Compose file defines four main services and one named volume:
app – Runs Label Studio itself.
- Shares the mydata directory with nginx, storing projects, configuration, and uploaded files.
- Uses a bind mount: ./mydata:/label-studio/data:rw → maps a folder from your host into the container.
nginx – Acts as a reverse proxy for the Label Studio frontend and backend.
- Shares the mydata directory with the app service.
db (PostgreSQL) – Manages project and task metadata.
- Persists the database files on disk.
- Uses a bind mount: ${POSTGRES_DATA_DIR:-./postgres-data}:/var/lib/postgresql/data.
minio – Local S3-compatible storage service.
- Stores dataset objects such as images or JSON task files.
- Uses a named volume: minio-data:/data.
When using bind mounts such as ./mydata and ./postgres-data, you need to assign ownership to the same user that runs inside the container. Label Studio does not run as root – it uses a non-root user with UID 1001. If the directories are owned by a different user, the container will not have write access and will fail with permission denied errors.
After creating these folders in your project directory, you can change their ownership with:
mkdir mydata
mkdir postgres-data
sudo chown -R 1001:1001 ./mydata ./postgres-data
Now that the permissions are fixed, we can bring up the stack with Docker Compose. Simply run:
docker compose up -d
It may take a few minutes to pull all the required images from Docker Hub and set up Label Studio. When the setup is complete, open http://localhost:8080 in your browser to access the Label Studio interface. You need to create a new account, and then you can sign in with your credentials. You can generate an API token under Organization → API Token Settings. This token allows you to access the Label Studio API, which is especially useful for automation.
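As a small illustration of using the token, the snippet below builds an authorized request for the Label Studio REST API using only the Python standard library. The base URL and token are placeholders; the /api/projects endpoint (listing your projects) comes from the Label Studio REST API, and every request is authorized via the Authorization: Token header.

```python
from urllib.request import Request, urlopen


def make_api_request(base_url: str, path: str, token: str) -> Request:
    """Build a request authorized with a Label Studio API token."""
    req = Request(f"{base_url}{path}")
    # Label Studio expects the token in the Authorization header
    req.add_header("Authorization", f"Token {token}")
    return req


# Example: list your projects (call urlopen(req) once the stack is running)
req = make_api_request("http://localhost:8080", "/api/projects", "your-api-token")
```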
Set Up the Label Studio Project
We can now create our first annotation project in Label Studio, configured specifically for object detection. But before importing your images, you need to define the object classes to choose from. In the Pascal VOC dataset, there are 20 predefined object classes.
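A minimal labeling configuration for this setup might look like the following sketch. Note that the name attributes ("label" and "image") must match the from_name and to_name fields used in the JSON tasks shown earlier; the class list here is shortened, and you would add all 20 VOC classes in practice.

```xml
<View>
  <Image name="image" value="$image"/>
  <RectangleLabels name="label" toName="image">
    <Label value="person"/>
    <Label value="car"/>
    <Label value="dog"/>
    <!-- ...remaining Pascal VOC classes... -->
  </RectangleLabels>
</View>
```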

Add Images and Tasks to MinIO
You can open the MinIO console in your browser at localhost:9009 (the S3 API itself is served on port 9000), then log in using the credentials you specified under the minio service in the docker-compose.yml file.
I created a bucket with two folders: one to store the images, and the other for the JSON task files formatted according to the instructions above.

We run the S3-compatible MinIO service locally, which lets us emulate S3 cloud storage without incurring any costs. If you want to transfer files to an S3 bucket on AWS instead, it is better to plan this carefully, considering the transfer costs. The good news is that you can work with your MinIO bucket using the AWS CLI. To do this, you need to set up a profile in ~/.aws/config and provide the corresponding credentials in ~/.aws/credentials under the same profile name.
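For example, assuming a profile named minio (the profile name is arbitrary, and the keys must match the MINIO_ROOT_USER / MINIO_ROOT_PASSWORD values from your environment; the defaults from the Compose file are shown here):

```ini
# ~/.aws/config
[profile minio]
region = us-east-1
output = json

# ~/.aws/credentials
[minio]
aws_access_key_id = minio_admin_do_not_use_in_production
aws_secret_access_key = minio_admin_do_not_use_in_production
```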
Then you can easily synchronize your local folder using the following script:
#!/bin/bash
set -e
PROFILE=
MINIO_ENDPOINT= # e.g.
BUCKET_NAME=
SOURCE_DIR=
DEST_DIR=
aws s3 sync \
  --endpoint-url "$MINIO_ENDPOINT" \
  --no-verify-ssl \
  --profile "$PROFILE" \
  "$SOURCE_DIR" "s3://$BUCKET_NAME/$DEST_DIR"
Connect Minio to Label Studio
Once all the data, including images and annotations, has been uploaded, we can attach the cloud storage to the project created in the previous step.
From your project settings, go to Cloud Storage and add the required parameters, such as the endpoint (pointing to the service name on the Docker network and its port number, e.g. minio:9000), the bucket name, and the correct prefix under which the task files are stored. Each task in the JSON files references its corresponding image.

After verifying that the connection is valid, you can synchronize your project with the cloud storage. You may need to run the synchronization several times, since the dataset contains over 17,000 images. It may appear to fail at first, but each time you restart the synchronization, it continues to make progress. Eventually, all of the Pascal VOC data will be successfully imported into Label Studio.

You can see the imported tasks with their thumbnail images in the task list. When you click on a task, the image appears with its pre-annotated bounding boxes.

Conclusions
In this tutorial, we showed how to import Pascal VOC data into Label Studio by converting the XML annotations into Label Studio's JSON format and connecting MinIO as a local S3-compatible storage. This setup lets you work with a large, pre-annotated dataset entirely on your local machine. Getting the project settings and file formats right up front ensures a smooth transition to cloud storage later.
I hope this tutorial helps you kick off your annotation project with pre-annotated data to review or verify. Once your data is ready for training, you can export all tasks into popular formats such as COCO or YOLO.



