
Run the Full DeepSeek-R1-0528 Model Locally


Image by the author

DeepSeek-R1-0528 is the latest update to DeepSeek's R1 reasoning model. The full model requires 715GB of disk space, making it one of the largest open-source models available. However, thanks to advanced quantization techniques from Unsloth, the model can be shrunk to 162GB, an 80% reduction. This lets users experience the full power of the model with far lower hardware requirements, albeit with a slight performance trade-off.

In this tutorial, we will:

  1. Set up Ollama and Open Web UI to run the DeepSeek-R1-0528 model locally.
  2. Download and run the 1.78-bit quantized version (IQ1_S) of the model.
  3. Run the model with both a GPU + CPU setup and a CPU-only setup.

Step 0: Requirements

To run the IQ1_S quantized version, your system should meet the following requirements:

GPU requirements: At least one 24GB GPU (e.g., NVIDIA RTX 4090 or A6000) and 128GB of RAM. With this setup, you can expect a generation speed of around 5 tokens/second.

RAM requirements: A minimum of 64GB of RAM is required to run the model without a GPU, but performance will be limited to around 1 token/second.

Optimal setup: For the best performance (5+ tokens/second), you need at least 180GB of unified memory, or a combined 180GB of RAM + VRAM.

Storage: Make sure you have at least 200GB of free disk space for the model and its dependencies.
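
The requirement tiers above can be summarized in a small helper function. This is just a sketch: the function name and thresholds simply restate the numbers in this post and are not part of any official tool.

```python
def expected_speed(ram_gb: float, vram_gb: float = 0.0) -> str:
    """Map available memory to the rough throughput tiers described above."""
    total = ram_gb + vram_gb
    if total >= 180:
        # 180GB+ of unified memory or combined RAM + VRAM
        return "optimal: 5+ tokens/second"
    if vram_gb >= 24 and ram_gb >= 128:
        # One 24GB GPU plus 128GB RAM
        return "GPU+CPU: ~5 tokens/second"
    if ram_gb >= 64:
        # CPU-only minimum
        return "CPU-only: ~1 token/second"
    return "below minimum requirements"

print(expected_speed(ram_gb=128, vram_gb=24))  # GPU+CPU tier
print(expected_speed(ram_gb=64))               # CPU-only tier
```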

Step 1: Install dependencies and Ollama

Update your system and install the required tools. Ollama is a lightweight server for running large language models locally. Install it on Ubuntu using the following commands:

apt-get update
apt-get install pciutils -y
curl -fsSL https://ollama.com/install.sh | sh
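
After the install script finishes, you can confirm that the ollama binary landed on your PATH. The snippet below is a quick check using only Python's standard library; the helper name is illustrative.

```python
import shutil

def find_ollama():
    """Return the full path to the ollama binary, or None if it is not installed."""
    return shutil.which("ollama")

if __name__ == "__main__":
    path = find_ollama()
    print(path if path else "ollama not found; re-run the install script")
```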

Step 2: Download and run the model

Run the 1.78-bit quantized version (IQ1_S) of the DeepSeek-R1-0528 model using the following commands:

ollama serve &
ollama run hf.co/unsloth/DeepSeek-R1-0528-GGUF:TQ1_0
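
Besides the interactive `ollama run` session, you can also query the model programmatically through Ollama's HTTP API, which listens on port 11434 by default. This is a minimal sketch using only the standard library; the model tag matches the one pulled above, and `"stream": False` returns the whole response in one JSON object.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "hf.co/unsloth/DeepSeek-R1-0528-GGUF:TQ1_0"

def build_request(prompt: str) -> dict:
    """Build the JSON payload for Ollama's /api/generate endpoint."""
    return {"model": MODEL, "prompt": prompt, "stream": False}

def generate(prompt: str) -> str:
    """Send a prompt to the local Ollama server and return the response text."""
    data = json.dumps(build_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    # Show the payload that would be sent; with the server running you could
    # instead call: print(generate("What is 10 + 10?"))
    print(json.dumps(build_request("What is 10 + 10?"), indent=2))
```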


Step 3: Set up and use Open Web UI

Pull the Open Web UI Docker image with CUDA support, then run the container with GPU support and Ollama integration.

This command will:

  • Start the Open Web UI server on port 8080 inside the container (mapped to port 9783 on the host)
  • Enable GPU acceleration using the --gpus all flag
  • Mount the required data directory (-v open-webui:/app/backend/data)

docker pull ghcr.io/open-webui/open-webui:cuda
docker run -d -p 9783:8080 --gpus all -v open-webui:/app/backend/data --name open-webui ghcr.io/open-webui/open-webui:cuda

Once the container is running, access the Open Web UI interface in your browser at http://localhost:9783/ (the host port from the -p 9783:8080 mapping).
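
If you are unsure whether the container has finished starting, a small readiness check can poll the mapped host port before you open the browser. This is a sketch; the port matches the -p 9783:8080 mapping above, and the helper name is illustrative.

```python
import socket
import time

def wait_for_port(host: str, port: int, timeout: float = 60.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=2):
                return True
        except OSError:
            time.sleep(1)  # not up yet; retry until the deadline
    return False

if __name__ == "__main__":
    print("Open Web UI reachable:", wait_for_port("localhost", 9783, timeout=5))
```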

Step 4: Running DeepSeek R1 0528 in Open Web UI

Select the hf.co/unsloth/DeepSeek-R1-0528-GGUF:TQ1_0 model from the model menu.


If the Ollama server fails to use the GPU properly, you can fall back to CPU-only execution. While this reduces performance (to roughly 1 token/second), it ensures the model can still run.

# Kill any existing Ollama processes
pkill ollama 

# List the processes holding GPU memory (stop them to free VRAM)
sudo fuser -v /dev/nvidia* 

# Restart Ollama service
CUDA_VISIBLE_DEVICES="" ollama serve
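
The same CPU-only fallback can be scripted: setting CUDA_VISIBLE_DEVICES to an empty string hides every GPU from CUDA applications, so Ollama falls back to the CPU. The sketch below only prepares the environment; the helper name is illustrative and the actual server launch is left commented out.

```python
import os

def cpu_only_env() -> dict:
    """Copy the current environment and hide all GPUs from CUDA applications."""
    env = os.environ.copy()
    env["CUDA_VISIBLE_DEVICES"] = ""  # empty string = no visible CUDA devices
    return env

if __name__ == "__main__":
    env = cpu_only_env()
    print("CUDA_VISIBLE_DEVICES set to", repr(env["CUDA_VISIBLE_DEVICES"]))
    # Equivalent to the shell commands above, the server could then be started
    # CPU-only with: subprocess.Popen(["ollama", "serve"], env=env)
```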

Once the model is running, you can interact with it through the Open Web UI. Note, however, that the speed will be limited to about 1 token/second due to the lack of GPU acceleration.


Final Thoughts

Running even a quantized version of the model proved challenging. You need a fast, stable internet connection to download the model, and if the download fails, you have to restart the whole process from the beginning. I also faced many problems trying to run it on my GPU, as I kept hitting GGUF-related errors. After trying several common fixes for the GPU errors without success, I eventually switched everything over to the CPU. While this works, it now takes about 10 minutes for the model to produce a response, which is far from ideal.

I'm sure there are better solutions out there, perhaps using llama.cpp, but trust me, it took me a whole day just to get this running.

Abid Ali Awan (@1abidaliawan) is a certified data scientist who loves building machine learning models. Currently, he is focusing on content creation and writing technical blogs on machine learning and data science technologies. Abid holds a master's degree in technology management and a bachelor's degree in telecommunication engineering. His vision is to build an AI product using a graph neural network for students struggling with mental illness.
