Mixedbread Cloud: A Unified API for RAG Pipelines

During a conversation with some machine learning engineers, I asked why we need LangChain, with its many APIs and services, just to set up a well-known pipeline. Why don't we have one API that handles everything – uploading documents, extraction, embedding, reranking, and vector storage – all in one place?
It turns out there is a solution called Mixedbread. The platform is fast, easy to use, and comes with ready-to-use tooling. In this tutorial, we will explore Mixedbread Cloud and learn how to build a fully functional RAG pipeline using the Mixedbread API and the latest OpenAI model.
Introducing Mixedbread Cloud
Mixedbread Cloud is an all-in-one solution for building AI applications with advanced text-understanding capabilities. Designed to streamline the development process, it provides a complete suite of tools, from document management and processing to embedding and retrieval.
Mixedbread Cloud provides:
- File uploads: Upload any kind of document through an easy-to-use interface or API
- Document processing: Extract structured information from a wide range of document formats, converting unstructured data into usable text
- Vector stores: Store and retrieve embeddings with searchable collections of files
- Text embeddings: Convert text into high-quality vector representations that capture semantic meaning
- Reranking: Improve search quality by reordering results based on their relevance to the original query
Building a RAG Application with Mixedbread and OpenAI
In this project, we will learn how to build a RAG application using the Mixedbread and OpenAI APIs. This step-by-step guide covers setting up the environment, uploading documents, creating a vector store, monitoring file processing, and building a fully functional RAG pipeline.
1. Setting Up
- Visit the Mixedbread website and create an account. Once registered, generate your API key. Likewise, make sure you have your OpenAI API key ready.
- Then, save your API keys as environment variables for secure access in your code.
- Make sure you have the required Python libraries installed:
pip install mixedbread openai
- Initialize the Mixedbread client and the OpenAI client using your API keys. Also, set the path to the PDF folder, the vector store name, and the OpenAI model to use.
import os
import time
from mixedbread import Mixedbread
from openai import OpenAI
# --- Configuration ---
# 1. Get your Mixedbread API Key
mxbai_api_key = os.getenv("MXBAI_API_KEY")
# 2. Get your OpenAI API Key
openai_api_key = os.getenv("OPENAI_API_KEY")
# 3. Define the path to the FOLDER containing your PDF files
pdf_folder_path = "/work/docs"
# 4. Vector Store Configuration
vector_store_name = "Abid Articles"
# 5. OpenAI Model Configuration
openai_model = "gpt-4.1-nano-2025-04-14"
# --- Initialize Clients ---
mxbai = Mixedbread(api_key=mxbai_api_key)
openai_client = OpenAI(api_key=openai_api_key)
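Before moving on, it can help to fail fast if either key is missing. The small check below is an addition of mine, not part of the original walkthrough:
# Optional sanity check (my addition): stop early if a key is missing,
# since the clients will otherwise fail later with less obvious errors.
if not mxbai_api_key or not openai_api_key:
    raise ValueError("Please set the MXBAI_API_KEY and OPENAI_API_KEY environment variables.")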
2. Uploading the Files
We will gather all the PDF files from the specified folder and then upload them to Mixedbread Cloud using the API.
import glob

# Find all .pdf files in the configured folder
pdf_files_to_upload = glob.glob(os.path.join(pdf_folder_path, "*.pdf"))

print(f"Found {len(pdf_files_to_upload)} PDF files to upload:")
for pdf_path in pdf_files_to_upload:
    print(f" - {os.path.basename(pdf_path)}")

uploaded_file_ids = []
print("\nUploading files...")
for pdf_path in pdf_files_to_upload:
    filename = os.path.basename(pdf_path)
    print(f"  Uploading {filename}...")
    with open(pdf_path, "rb") as f:
        upload_response = mxbai.files.create(file=f)
    file_id = upload_response.id
    uploaded_file_ids.append(file_id)
    print(f"  -> Uploaded successfully. File ID: {file_id}")

print(f"\nSuccessfully uploaded {len(uploaded_file_ids)} files.")
All four PDF files were uploaded successfully.
Found 4 PDF files to upload:
- Building Agentic Application using Streamlit and Langchain.pdf
- Deploying DeepSeek Janus Pro locally.pdf
- Fine-Tuning GPT-4o.pdf
- How to Reach $500k on Upwork.pdf
Uploading files...
Uploading Building Agentic Application using Streamlit and Langchain.pdf...
-> Uploaded successfully. File ID: 8a538aa9-3bde-4498-90db-dbfcf22b29e9
Uploading Deploying DeepSeek Janus Pro locally.pdf...
-> Uploaded successfully. File ID: 52c7dfed-1f9d-492c-9cf8-039cc64834fe
Uploading Fine-Tuning GPT-4o.pdf...
-> Uploaded successfully. File ID: 3eaa584f-918d-4671-9b9c-6c91d5ca0595
Uploading How to Reach $500k on Upwork.pdf...
-> Uploaded successfully. File ID: 0e47ba93-550a-4d4b-9da1-6880a748402b
Successfully uploaded 4 files.
You can go to your Mixedbread dashboard and click on the “Files” tab to see all the uploaded files.

3. Creating and Populating the Vector Store
Now we will create the vector store and add the uploaded files to it by providing the list of uploaded file IDs.
vector_store_response = mxbai.vector_stores.create(
    name=vector_store_name,
    file_ids=uploaded_file_ids  # Add all uploaded file IDs during creation
)
vector_store_id = vector_store_response.id
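To confirm the store was created before processing begins, you can print the new ID (a small addition of mine, not in the original walkthrough):
# Print the vector store ID for reference (my addition)
print(f"Vector store '{vector_store_name}' created with ID: {vector_store_id}")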
4. Monitoring File Processing
The Mixedbread vector store converts each file into chunks and embeddings and stores them. This means you can run semantic search over the text, and even the images, inside the PDFs.
We wrote custom code to monitor the file processing status.
print("nMonitoring file processing status (this may take some time)...")
all_files_processed = False
max_wait_time = 600 # Maximum seconds to wait (10 minutes, adjust as needed)
check_interval = 20 # Seconds between checks
start_time = time.time()
final_statuses = {}
while not all_files_processed and (time.time() - start_time) < max_wait_time:
all_files_processed = True # Assume true for this check cycle
current_statuses = {}
files_in_progress = 0
files_completed = 0
files_failed = 0
files_pending = 0
files_other = 0
for file_id in uploaded_file_ids:
status_response = mxbai.vector_stores.files.retrieve(
vector_store_id=vector_store_id,
file_id=file_id
)
current_status = status_response.status
final_statuses[file_id] = current_status # Store the latest status
if current_status == "completed":
files_completed += 1
elif current_status in ["failed", "cancelled", "error"]:
files_failed += 1
elif current_status == "in_progress":
files_in_progress += 1
all_files_processed = False # At least one file is still processing
elif current_status == "pending":
files_pending += 1
all_files_processed = False # At least one file hasn't started
else:
files_other += 1
all_files_processed = False # Unknown status, assume not done
print(f" Status Check (Elapsed: {int(time.time() - start_time)}s): "
f"Completed: {files_completed}, Failed: {files_failed}, "
f"In Progress: {files_in_progress}, Pending: {files_pending}, Other: {files_other} "
f"/ Total: {len(uploaded_file_ids)}")
if not all_files_processed:
time.sleep(check_interval)
# --- Check Final Processing Outcome ---
completed_count = sum(1 for status in final_statuses.values() if status == 'completed')
failed_count = sum(1 for status in final_statuses.values() if status in ['failed', 'cancelled', 'error'])
print("n--- Processing Summary ---")
print(f"Total files processed: {len(final_statuses)}")
print(f"Successfully completed: {completed_count}")
print(f"Failed or Cancelled: {failed_count}")
for file_id, status in final_statuses.items():
if status != 'completed':
print(f" - File ID {file_id}: {status}")
if completed_count == 0:
print("nNo files completed processing successfully. Exiting RAG pipeline.")
exit()
elif failed_count > 0:
print("nWarning: Some files failed processing. RAG will proceed using only the successfully processed files.")
elif not all_files_processed:
print(f"nWarning: File processing did not complete for all files within the maximum wait time ({max_wait_time}s). RAG will proceed using only the successfully processed files.")
It took about 42 seconds to process more than 100 pages.
Monitoring file processing status (this may take some time)...
Status Check (Elapsed: 0s): Completed: 0, Failed: 0, In Progress: 4, Pending: 0, Other: 0 / Total: 4
Status Check (Elapsed: 21s): Completed: 0, Failed: 0, In Progress: 4, Pending: 0, Other: 0 / Total: 4
Status Check (Elapsed: 42s): Completed: 4, Failed: 0, In Progress: 0, Pending: 0, Other: 0 / Total: 4
--- Processing Summary ---
Total files processed: 4
Successfully completed: 4
Failed or Cancelled: 0
If you click the “Vector Stores” tab in the Mixedbread dashboard, you will see that the vector store was created successfully, with 4 files stored in it.

5. Building the RAG Pipeline
The RAG pipeline consists of three main components: retrieval, augmentation, and generation. Below is a step-by-step breakdown of how these components work together to create a robust question-answering system.
The first step in the RAG pipeline is retrieval, where the system fetches relevant information based on the user's question. This is done by querying the vector store for the most similar results.
user_query = "How to Deploy Deepseek Janus Pro?"
retrieved_context = ""

search_results = mxbai.vector_stores.search(
    vector_store_ids=[vector_store_id],  # Search within our newly created store
    query=user_query,
    top_k=10  # Retrieve top 10 relevant chunks across all documents
)

if search_results.data:
    # Combine the content of the chunks into a single context string
    context_parts = []
    for i, chunk in enumerate(search_results.data):
        context_parts.append(f"Chunk {i+1} from '{chunk.filename}' (Score: {chunk.score:.4f}):\n{chunk.content}\n---")
    retrieved_context = "\n".join(context_parts)
else:
    retrieved_context = "No context was retrieved."
The next step is augmentation, where the retrieved context is combined with the user's question to create a custom prompt. This prompt includes the system instructions, the user's question, and the retrieved context.
prompt_template = f"""
You are an assistant answering questions based *only* on the provided context from multiple documents.
Do not use any prior knowledge. If the context does not contain the answer to the question, state that clearly.
Context from the documents:
---
{retrieved_context}
---
Question: {user_query}
Answer:
"""
The final step is generation, where the combined prompt is sent to the language model (OpenAI's GPT-4.1-nano) to produce an answer. This model was chosen for its performance and speed.
response = openai_client.chat.completions.create(
    model=openai_model,
    messages=[
        {"role": "user", "content": prompt_template}
    ],
    temperature=0.2,
    max_tokens=500
)

final_answer = response.choices[0].message.content.strip()
print(final_answer)
The RAG pipeline produces a highly relevant and context-aware response:
To deploy DeepSeek Janus Pro locally, follow these steps:
1. Install Docker Desktop from and set it up with default settings. On Windows, ensure WSL is installed if prompted.
2. Clone the Janus repository by running:
```
git clone
```
3. Navigate into the cloned directory:
```
cd Janus
```
4. Build the Docker image using the provided Dockerfile:
```
docker build -t janus .
```
5. Run the Docker container with the following command, which sets up port forwarding, GPU access, and persistent storage:
```
docker run -it --rm -p 7860:7860 --gpus all --name janus_pro -e TRANSFORMERS_CACHE=/root/.cache/huggingface -v huggingface:/root/.cache/huggingface janus:latest
```
6. Wait for the container to download the model and start the Gradio application. Once running, access the app at
7. The application has two sections: one for image understanding and one for image generation, allowing you to upload images, ask for descriptions or poems, and generate images based on prompts.
This process enables you to deploy DeepSeek Janus Pro locally on your machine.
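Since the tutorial runs each stage once, here is an optional sketch that wraps retrieval, augmentation, and generation into a single reusable function. It is my addition, reusing only the calls demonstrated above; the helper name `ask` and the sample question are hypothetical:
def ask(question: str, top_k: int = 10) -> str:
    """Retrieve relevant chunks, build the prompt, and generate an answer (my addition)."""
    # Retrieval: search the vector store for the most relevant chunks
    results = mxbai.vector_stores.search(
        vector_store_ids=[vector_store_id],
        query=question,
        top_k=top_k,
    )
    # Augmentation: combine the retrieved chunks into a context string
    context = "\n".join(
        f"Chunk {i+1} from '{chunk.filename}':\n{chunk.content}\n---"
        for i, chunk in enumerate(results.data)
    ) or "No context was retrieved."
    prompt = (
        "You are an assistant answering questions based *only* on the provided context.\n"
        "If the context does not contain the answer, state that clearly.\n"
        f"Context:\n---\n{context}\n---\nQuestion: {question}\nAnswer:"
    )
    # Generation: send the prompt to the language model
    completion = openai_client.chat.completions.create(
        model=openai_model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.2,
        max_tokens=500,
    )
    return completion.choices[0].message.content.strip()

# Example usage with a hypothetical question about another uploaded document:
print(ask("How do I fine-tune GPT-4o?"))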
Final Thoughts
Creating a RAG application with Mixedbread was a straightforward and efficient process. The Mixedbread team also recommends using the dashboard for tasks such as uploading documents, extracting data, building vector stores, and performing semantic searches. This approach makes it easy for professionals from different fields to build text-understanding applications without requiring deep technical expertise.
In this tutorial, we learned how the Mixedbread API simplifies the process of building a RAG pipeline. The implementation requires only a few steps and delivers fast, accurate results. Unlike traditional approaches that extract plain text from documents, Mixedbread converts entire pages into embeddings, enabling effective and accurate retrieval of relevant information. This page-level approach helps ensure that the results are faithful and relevant.
Abid Ali Awan (@1abidaliawan) is a certified data scientist professional who loves building machine learning models. Currently, he is focusing on content creation and writing technical blogs on machine learning and data science technologies. Abid holds a Master's degree in technology management and a bachelor's degree in telecommunication engineering. His vision is to build an AI product using a graph neural network for students struggling with mental illness.