
Low-Cost Local Agentic Coding: Claude Code + Ollama + Gemma 4

# Introduction

Picture this: a multi-agent workflow that reads files, writes patches, runs tests, and iterates across four services, making 400 API calls in a single afternoon. The notification arrives. You have blown past the soft limit again. Every token costs money, every request sends your proprietary code to a third-party server, and rate limits interrupt long-running sessions, where the only fix on offer is to pay more.

Gemma 4 26B MoE activates only 3.8 billion of its 26 billion parameters per forward pass. It scores 77.1% on LiveCodeBench v6, and the Gemma 4 family reaches 86.4% on τ2-bench agentic tool use, a benchmark that specifically tests what happens when a model has to call tools, sequence steps, and handle errors in multi-step workflows. The previous generation, Gemma 3 27B, scored 6.6% on that same benchmark. That is not an incremental improvement. It is the difference between a model that cannot call tools reliably and one that can drive Claude Code's agentic loop without constantly malforming its tool-call parameters.

This article builds the full stack: Ollama serving Gemma 4 locally, a Modelfile that prevents context-window failures in agentic sessions, a settings.json that pins Claude Code to the local endpoint, a verification script that confirms everything works before you point it at real code, and an honest account of what breaks and how to fix it. The audience is engineers who already understand what large language models (LLMs) are and what agent loops cost. There is no hand-holding on the basics.

# Why Gemma 4?

Released on April 2, 2026 under Apache 2.0, Gemma 4 is Google DeepMind's latest open-weight model family. Four variants shipped: E2B (2B effective), E4B (4B effective), 26B MoE, and 31B Dense. The 26B MoE uses 128 small experts and activates only 8 of them per token, plus one shared expert, delivering quality close to the 31B at a far lower compute cost.
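The sparsity figures above reduce to simple arithmetic. The sketch below only restates numbers quoted in this article (26B total parameters, 3.8B active, 8 of 128 routed experts); the function name `active_fraction` is illustrative and not part of any library.

```python
def active_fraction(active: float, total: float) -> float:
    """Return the share of `total` that is active, as a value in [0, 1]."""
    if total <= 0:
        raise ValueError("total must be positive")
    return active / total


# Figures quoted in the article for the 26B MoE variant.
param_share = active_fraction(3.8e9, 26e9)  # parameters used per forward pass
routed_share = active_fraction(8, 128)      # routed experts selected per token

print(f"Active parameters per forward pass: {param_share:.1%}")
print(f"Routed experts used per token: {routed_share:.2%}")
```

The shared expert is left out of the second ratio because it runs for every token regardless of routing.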

Earlier Gemma versions used a custom Google license whose commercial-use restrictions were confusing enough that corporate legal teams routinely flagged it as a blocker. Gemma 4 is Apache 2.0, a first for the Gemma family. If your team wants to embed this in internal tools, ship products on top of it, or run it in production pipelines without a legal review, that changes things in practice.

// The Numbers That Matter for Coding Agents

Benchmark	Gemma 3 27B	Gemma 4 26B MoE	Gemma 4 31B Dense
τ2-bench (agentic tool use)	6.6%	~79%	86.4%
LiveCodeBench v6	29.1%	77.1%	80.0%
GPQA Diamond	42.4%	82.3%	84.3%
AIME 2026 (math)	20.8%	88.3%	89.2%
Arena AI ELO	1365	1441	1452

// Hardware Requirements

Before pulling an 18 GB model, know what you are actually running on. The Gemma 4 family was designed to span devices from edge hardware to workstations, and the four variants reflect that range.

Variant	Ollama tag	Active parameters	VRAM at Q4	Context window
E4B (edge)	gemma4:e4b	4B	~6 GB	128K
26B MoE	gemma4:26b	3.8B	~16–18 GB	256K
31B Dense	gemma4:31b	31B	~24–32 GB	256K
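As a rough aid for reading the table, the sketch below picks the largest variant whose upper Q4 VRAM bound fits a given budget. The helper `pick_variant` is hypothetical, the tags follow the `gemma4:` naming used in the pull commands in this article, and the thresholds are simply the upper ends of the ranges in the table. It leaves no headroom for the KV cache, so treat the answer as optimistic.

```python
from typing import Optional

# (Ollama tag, upper bound of the Q4 VRAM range in GB), copied from the table.
VARIANTS = [
    ("gemma4:e4b", 6),
    ("gemma4:26b", 18),
    ("gemma4:31b", 32),
]


def pick_variant(vram_gb: float) -> Optional[str]:
    """Return the largest variant whose VRAM bound fits, or None if none fit."""
    fitting = [tag for tag, needed_gb in VARIANTS if needed_gb <= vram_gb]
    return fitting[-1] if fitting else None


for budget in (4, 8, 24, 48):
    print(f"{budget} GB -> {pick_variant(budget)}")
```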

// Installing Ollama, Gemma 4, and Claude Code

Step 1: Install Ollama

# macOS and Linux -- one-line install
curl -fsSL  | sh

# Verify version -- must be 0.14.0+ for Anthropic Messages API support
# The Anthropic-compatible endpoint was added in January 2026
ollama --version
# Expected: ollama version is 0.22.x or higher (as of May 2026)

# Windows: download the native installer from 
# WSL2 is recommended if you want GPU passthrough on Windows

After installation, Ollama starts as a background service on port 11434. Confirm it is up:

curl 
# Expected response: Ollama is running

Step 2: Pull Gemma 4

# The 26B MoE -- recommended for this setup (~18 GB download)
ollama pull gemma4:26b

# ollama pull prints its own progress bar while downloading.
# Once a model has been loaded, this lists what is currently in memory:
ollama ps

# Optional: also pull the 31B for comparison on capable hardware
ollama pull gemma4:31b

# Confirm the pull completed
ollama list
# Should show gemma4:26b with size and modification date

Step 3: Install Claude Code

# Prerequisites: Node.js 18 or later
node --version   # Confirm you are on 18+

# Install Claude Code CLI globally
npm install -g @anthropic-ai/claude-code

# Verify the install
claude --version

With Ollama running and Gemma 4 pulled, the natural next instinct is to export a few environment variables and launch Claude Code right away. One default has to be fixed first.

# The Modelfile

Ollama's default context window for Gemma 4 is 4K tokens. Gemma 4's actual context window is 128K to 256K. That 4K default is not a suggestion: it is what Ollama will use unless you override it. In an agentic Claude Code session that reads source files, carries conversation history, and accumulates tool-call results across many turns, 4K tokens are gone within seconds.

Without a context override, Claude Code loses track of file contents mid-edit, forgets earlier instructions, and produces inconsistent changes. Concretely: when the agent tries to refactor a 200-line service class, it quietly forgets that the second half exists. The agent raises no error. It works silently on an incomplete view of the file and produces subtly wrong output that breaks downstream.
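To make the budget problem concrete, here is a back-of-the-envelope sketch. It assumes roughly four characters per token, which is a common heuristic and not Gemma 4's actual tokenizer, and the file size and session contents are made up for illustration.

```python
CHARS_PER_TOKEN = 4  # rough heuristic; real tokenizers vary with the content


def estimate_tokens(text: str) -> int:
    """Approximate the token count of `text` from its character length."""
    return max(1, len(text) // CHARS_PER_TOKEN)


def fits_in_context(chunks: list, num_ctx: int, reserve_for_output: int = 1024) -> bool:
    """True if all chunks plus an output reserve fit inside num_ctx tokens."""
    used = sum(estimate_tokens(chunk) for chunk in chunks)
    return used + reserve_for_output <= num_ctx


# Hypothetical session: instructions, one file read, and one rewritten copy
# of a 200-line file at about 60 characters per line.
service_file = ("x" * 59 + "\n") * 200
session = ["system prompt and instructions " * 40, service_file, service_file]

print("Fits in 4K: ", fits_in_context(session, num_ctx=4096))
print("Fits in 64K:", fits_in_context(session, num_ctx=65536))
```

Under these assumptions a single read-and-rewrite of one mid-sized file already exceeds a 4K window, before any conversation history or test output is added.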

The fix is a Modelfile that bakes the correct context size and other inference parameters into a named model variant. Create this file:

# ~/.ollama/Modelfiles/gemma4-claude
# Gemma 4 26B MoE variant tuned for Claude Code agentic sessions.
# Bakes context window, temperature, and system prompt into the model
# so every Claude Code session starts with the correct configuration.
#
# Build with:
#   mkdir -p ~/.ollama/Modelfiles
#   ollama create gemma4-claude -f ~/.ollama/Modelfiles/gemma4-claude

FROM gemma4:26b

# Context window -- 65536 tokens (64K) is the tested-safe floor for real
# codebases without triggering swap on 16-18 GB VRAM systems.
# Increase to 131072 (128K) if you have headroom on 24 GB+ systems.
# Do not go above 131072 unless you have profiled your memory usage
# under load -- Ollama pre-allocates the full KV cache upfront.
PARAMETER num_ctx 65536

# Temperature -- 0.2 is deliberately low for agentic coding.
# Higher temperature introduces variability in tool call parameter
# formatting that causes Claude Code's tool validator to reject calls.
# For creative tasks, you would set this higher. For agentic loops: low.
PARAMETER temperature 0.2

# top_p -- nucleus sampling threshold. 0.9 keeps generation focused
# while avoiding the repetition loops that top_p=1.0 can produce on
# long agentic sessions.
PARAMETER top_p 0.9

# repeat_penalty -- penalizes the model for repeating tokens.
# 1.15 helps prevent tool call loops where Gemma 4 retries the same
# failed tool call with nearly identical parameters indefinitely.
PARAMETER repeat_penalty 1.15

# num_predict -- maximum tokens per response. 4096 is sufficient for
# most code patches. Increase to 8192 if you regularly generate
# large files in a single generation.
PARAMETER num_predict 4096

# System prompt -- reinforces coding agent behavior and explicit
# tool use discipline. Gemma 4 benefits from being reminded to
# commit to tool calls rather than describing what it would do.
SYSTEM """You are a senior software engineer operating as a coding agent.

When working with code:
- Read files before editing them. Never assume file contents.
- Make one focused change at a time and verify it before proceeding.
- When a tool call fails, examine the error carefully before retrying.
  Do not retry with identical parameters. Diagnose first.
- Prefer surgical edits over full file rewrites.
- Run tests after each meaningful change, not after a batch of changes.
- If you are uncertain about the codebase structure, read more files
  rather than guessing.

Be precise and methodical. Avoid explaining what you are about to do
when you could simply do it."""

Build the variant:

# Create the Modelfiles directory if it does not exist
mkdir -p ~/.ollama/Modelfiles

# Save the Modelfile content from above to this path, then build:
ollama create gemma4-claude -f ~/.ollama/Modelfiles/gemma4-claude

# Verify the variant was created
ollama list
# Should show gemma4-claude alongside gemma4:26b

# Quick smoke test -- verify it loads and responds
ollama run gemma4-claude "What is the time complexity of binary search and why?"
# Expect a clear, concise technical response within a few seconds
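One way to confirm that the parameters were baked into the variant is to ask Ollama for the model's stored settings. The sketch below assumes Ollama's `/api/show` endpoint returns a `parameters` field as plain text with one `name value` pair per line, and that the server listens on `http://localhost:11434`; both the base URL and the helper names are assumptions to adjust for your installation.

```python
import json
import urllib.request


def parse_parameters(parameters_text: str) -> dict:
    """Turn lines such as 'num_ctx    65536' into {'num_ctx': '65536'}."""
    parsed = {}
    for line in parameters_text.splitlines():
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2:
            parsed[parts[0]] = parts[1].strip()
    return parsed


def fetch_parameters(model: str, base_url: str = "http://localhost:11434") -> dict:
    """POST to /api/show and parse the 'parameters' field of the reply."""
    body = json.dumps({"model": model}).encode()
    request = urllib.request.Request(
        f"{base_url}/api/show",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_parameters(json.load(response).get("parameters", ""))


# Offline demonstration of the parser on text shaped like the expected reply.
sample = "num_ctx                        65536\ntemperature                    0.2\n"
print(parse_parameters(sample))
```

With the server running, `fetch_parameters("gemma4-claude").get("num_ctx")` should report the value set in the Modelfile if the build picked it up.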

# Wiring Claude Code to the Local Model

With the model variant built, the configuration layer connects Claude Code to Ollama. Two environment variables are the core of this, but three additional variables prevent the most common failure modes.

The base URL must be Ollama's root address, without a /v1 suffix. Claude Code speaks the Anthropic Messages API and appends /v1/messages to the base URL itself, so a base URL that already ends in /v1 yields a doubled path and produces authentication errors or unexpected behavior. A trailing /v1 is what OpenAI-style clients expect when they talk to Ollama's OpenAI-compatible layer, which is the usual source of the mix-up.

// Global Settings: ~/.claude/settings.json

This configuration applies to every Claude Code session across all projects. It is the right choice unless you regularly switch between local and cloud models on a per-project basis.

{
  "env": {
    "ANTHROPIC_BASE_URL": "

    "ANTHROPIC_AUTH_TOKEN": "ollama",

    "ANTHROPIC_API_KEY": "",

    "ANTHROPIC_MODEL": "gemma4-claude",

    "ANTHROPIC_DEFAULT_SONNET_MODEL": "gemma4-claude",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "gemma4-claude",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "gemma4-claude",

    "CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS": "1"
  }
}

Why each variable matters:

ANTHROPIC_BASE_URL redirects all Claude Code API calls away from Anthropic's servers to your local Ollama instance.
ANTHROPIC_AUTH_TOKEN must be set to any non-empty string; Ollama ignores the value, but Claude Code requires the header to be present.
ANTHROPIC_API_KEY: "" explicitly clears the key so that Claude Code cannot fall back to a real Anthropic API key that may be set in your shell environment. Without this, a misconfigured ANTHROPIC_BASE_URL could silently fail over to the paid API.
ANTHROPIC_MODEL is the primary model name Claude Code sends with its requests. Set it to your custom Modelfile variant, gemma4-claude, not gemma4:26b. The raw model tag does not carry the context window override.
ANTHROPIC_DEFAULT_SONNET_MODEL, ANTHROPIC_DEFAULT_HAIKU_MODEL, and ANTHROPIC_DEFAULT_OPUS_MODEL: Claude Code internally routes different task types to different models. Pointing all three at the same local model ensures every request lands on your Ollama instance regardless of which tier Claude Code picks internally.
CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS: "1" strips the Anthropic-specific beta headers that Claude Code adds to requests. Local servers do not recognize these headers and reject requests that include them. Setting this variable prevents that error without disrupting any core Claude Code functionality.
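The rules above can be checked mechanically before launching a session. The following is a minimal sketch of such a linter, not a Claude Code feature: `check_env` and `check_settings_file` are hypothetical helpers, and the base URL in the example is an assumed local address on the port mentioned earlier in this article.

```python
import json

REQUIRED_NON_EMPTY = [
    "ANTHROPIC_BASE_URL",
    "ANTHROPIC_AUTH_TOKEN",
    "ANTHROPIC_MODEL",
    "ANTHROPIC_DEFAULT_SONNET_MODEL",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL",
    "ANTHROPIC_DEFAULT_OPUS_MODEL",
    "CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS",
]


def check_env(env: dict) -> list:
    """Return human-readable problems found in a settings.json 'env' block."""
    problems = [f"missing or empty: {key}" for key in REQUIRED_NON_EMPTY if not env.get(key)]
    if env.get("ANTHROPIC_BASE_URL", "").rstrip("/").endswith("/v1"):
        problems.append("ANTHROPIC_BASE_URL should not end in /v1")
    if env.get("ANTHROPIC_API_KEY") != "":
        problems.append("ANTHROPIC_API_KEY should be present and empty")
    return problems


def check_settings_file(path: str) -> list:
    """Load a settings.json file and lint its 'env' block."""
    with open(path, encoding="utf-8") as handle:
        return check_env(json.load(handle).get("env", {}))


# Example env block; the URL is an assumption for a default local install.
example = {key: "gemma4-claude" for key in REQUIRED_NON_EMPTY}
example.update({
    "ANTHROPIC_BASE_URL": "http://localhost:11434",
    "ANTHROPIC_AUTH_TOKEN": "ollama",
    "ANTHROPIC_API_KEY": "",
    "CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS": "1",
})
print(check_env(example) or "no problems found")
```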

// Per-Project Configuration: .claude/settings.json

For projects where you want local inference isolated from your global setup (private repositories, sensitive codebases, or projects with specific model requirements), use a project-level settings file instead:

# In your project root
mkdir -p .claude

cat > .claude/settings.json << 'EOF'
{
  "env": {
    "ANTHROPIC_BASE_URL": "
    "ANTHROPIC_AUTH_TOKEN": "ollama",
    "ANTHROPIC_API_KEY": "",
    "ANTHROPIC_MODEL": "gemma4-claude",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "gemma4-claude",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "gemma4-claude",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "gemma4-claude",
    "CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS": "1"
  }
}
EOF

Claude Code reads the project-level .claude/settings.json when it exists, overriding the global settings for that project. Add .claude/settings.json to your .gitignore if the settings contain anything machine-specific, or commit it if you want the whole team to use local inference on that project.

// Verifying the Setup

Before running Claude Code against a real codebase, verify three things: Ollama is running properly, the model answers API calls in the Anthropic Messages format, and tool calling works correctly. The third point is non-negotiable: tool calling is how Claude Code reads files, writes patches, and runs commands. A model that cannot format tool calls correctly will start up fine and then fail at basic agent tasks.

Prerequisite:

pip install httpx   # Async HTTP client for the verification script

The full verification script:


#!/usr/bin/env python3
"""
verify_local_setup.py

Verifies the full Claude Code + Ollama + Gemma 4 stack before use.
Runs four checks in sequence:
  1. Ollama health
  2. Model availability
  3. Basic Anthropic Messages API call
  4. Tool calling round-trip

Prerequisites:
  pip install httpx

How to run:
  python verify_local_setup.py

Expected output on a working setup (abridged):
  [PASS] Ollama is running on localhost:11434
  [PASS] Model 'gemma4-claude' is available
  [PASS] Anthropic Messages API call successful
  [PASS] Tool calling: model produced a valid tool_use block
  All 4 checks passed.
  Claude Code + Ollama + Gemma 4 is ready.
"""

import httpx
import json
import sys

# ── Configuration ─────────────────────────────────────────────────────────────
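# Reconstructed from the host and port named elsewhere in this article
# (localhost, 11434); change it if your Ollama instance listens elsewhere.
OLLAMA_BASE_URL = "http://localhost:11434"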
MODEL_NAME      = "gemma4-claude"   # Must match your Modelfile variant name
TIMEOUT         = 120.0             # Seconds -- generation can be slow on first call


def check_ollama_health() -> bool:
    """
    Check 1: Verify Ollama is running and responding.
    Hits the root endpoint which returns 'Ollama is running' when healthy.
    """
    print("\nCheck 1: Ollama health")
    try:
        response = httpx.get(OLLAMA_BASE_URL, timeout=5.0)
        if "Ollama is running" in response.text:
            print(f"  [PASS] Ollama is running on {OLLAMA_BASE_URL}")
            return True
        else:
            print(f"  [FAIL] Unexpected response: {response.text[:100]}")
            return False
    except httpx.ConnectError:
        print(f"  [FAIL] Cannot connect to {OLLAMA_BASE_URL}")
        print("         Is Ollama running? Try: ollama serve")
        return False


def check_model_available() -> bool:
    """
    Check 2: Verify the specific model variant is available in Ollama.
    Uses the /api/tags endpoint which lists all pulled models.
    """
    print("\nCheck 2: Model availability")
    try:
        response = httpx.get(f"{OLLAMA_BASE_URL}/api/tags", timeout=5.0)
        data     = response.json()
        models   = [m["name"] for m in data.get("models", [])]

        # Normalize: Ollama may add ":latest" if not specified
        normalized = [m.split(":")[0] for m in models]

        if MODEL_NAME in models or MODEL_NAME in normalized:
            print(f"  [PASS] Model '{MODEL_NAME}' is available")
            return True
        else:
            print(f"  [FAIL] Model '{MODEL_NAME}' not found")
            print(f"         Available models: {', '.join(models) or 'none'}")
            print(f"         Run: ollama create {MODEL_NAME} -f ~/.ollama/Modelfiles/gemma4-claude")
            return False
    except Exception as e:
        print(f"  [FAIL] Error checking model list: {e}")
        return False


def check_messages_api() -> bool:
    """
    Check 3: Send a basic Anthropic Messages API call to the local endpoint.
    Verifies the request format, model routing, and basic generation work.
    Uses the same /v1/messages path and request schema that Claude Code uses.
    Note: the base URL configured for Claude Code is the server root; the
    client appends /v1/messages itself, exactly as this check does.
    """
    print("\nCheck 3: Anthropic Messages API call")

    payload = {
        "model": MODEL_NAME,
        "max_tokens": 100,
        "messages": [
            {
                "role": "user",
                "content": "Reply with exactly: VERIFICATION_OK"
            }
        ]
    }

    headers = {
        "Content-Type":      "application/json",
        "x-api-key":         "ollama",            # Required by the API spec; value ignored locally
        "anthropic-version": "2023-06-01"         # Required version header
    }

    try:
        response = httpx.post(
            f"{OLLAMA_BASE_URL}/v1/messages",
            json=payload,
            headers=headers,
            timeout=TIMEOUT
        )

        if response.status_code != 200:
            print(f"  [FAIL] HTTP {response.status_code}: {response.text[:200]}")
            return False

        data = response.json()

        # Anthropic Messages API response structure:
        # { "content": [{"type": "text", "text": "..."}], "stop_reason": "..." }
        content_blocks = data.get("content", [])
        text_blocks    = [b for b in content_blocks if b.get("type") == "text"]

        if not text_blocks:
            print(f"  [FAIL] No text content in response: {json.dumps(data, indent=2)}")
            return False

        response_text = text_blocks[0].get("text", "")
        print(f"  [PASS] Anthropic Messages API call successful")
        print(f"         Model response: {response_text[:80]}")
        return True

    except Exception as e:
        print(f"  [FAIL] Request failed: {e}")
        return False


def check_tool_calling() -> bool:
    """
    Check 4: Verify tool calling works end-to-end.
    This is the most important check for Claude Code agentic use.
    Claude Code relies on the model correctly producing tool_use blocks
    for every file operation, shell command, and code execution.

    Sends a simple tool definition and a prompt that should trigger it.
    Verifies the model returns a tool_use block (not just text describing the call).
    """
    print("\nCheck 4: Tool calling verification")

    # A minimal tool definition using the Anthropic function calling schema
    tools = [
        {
            "name": "read_file",
            "description": "Read the contents of a file at the given path.",
            "input_schema": {
                "type": "object",
                "properties": {
                    "path": {
                        "type": "string",
                        "description": "The absolute or relative file path to read"
                    }
                },
                "required": ["path"]
            }
        }
    ]

    payload = {
        "model": MODEL_NAME,
        "max_tokens": 256,
        "tools": tools,
        # Force the model to call a tool rather than respond in text.
        # tool_choice: {"type": "any"} requires any tool use.
        # Remove this if testing whether the model self-selects tools.
        "tool_choice": {"type": "any"},
        "messages": [
            {
                "role": "user",
                "content": "Read the file at /tmp/test.py and show me its contents."
            }
        ]
    }

    headers = {
        "Content-Type":      "application/json",
        "x-api-key":         "ollama",
        "anthropic-version": "2023-06-01"
    }

    try:
        response = httpx.post(
            f"{OLLAMA_BASE_URL}/v1/messages",
            json=payload,
            headers=headers,
            timeout=TIMEOUT
        )

        if response.status_code != 200:
            print(f"  [FAIL] HTTP {response.status_code}: {response.text[:200]}")
            return False

        data           = response.json()
        content_blocks = data.get("content", [])
        tool_blocks    = [b for b in content_blocks if b.get("type") == "tool_use"]

        if not tool_blocks:
            print("  [FAIL] Model did not produce a tool_use block")
            print("         This means tool calling is not working correctly.")
            print("         Agentic Claude Code sessions will fail on file operations.")
            print(f"         Full response: {json.dumps(data, indent=2)}")
            return False

        tool_call  = tool_blocks[0]
        tool_name  = tool_call.get("name", "")
        tool_input = tool_call.get("input", {})

        print(f"  [PASS] Tool calling: model produced a valid tool_use block")
        print(f"         Tool called: {tool_name}")
        print(f"         Parameters:  {json.dumps(tool_input)}")

        # Sanity check: did it call the right tool with the right parameter?
        if tool_name == "read_file" and "path" in tool_input:
            print(f"         Tool name and parameter are correct.")
        else:
            print(f"         WARNING: Unexpected tool name or missing 'path' parameter.")
            print(f"         The model called a tool but not the expected one.")

        return True

    except Exception as e:
        print(f"  [FAIL] Request failed: {e}")
        return False


def main():
    print("=" * 60)
    print("Claude Code + Ollama + Gemma 4 Setup Verification")
    print("=" * 60)

    checks = [
        check_ollama_health,
        check_model_available,
        check_messages_api,
        check_tool_calling,
    ]

    results = [check() for check in checks]

    print("\n" + "=" * 60)
    passed = sum(results)
    total  = len(results)

    if all(results):
        print(f"All {total} checks passed.")
        print("Claude Code + Ollama + Gemma 4 is ready.")
        print("\nLaunch with: claude")
        sys.exit(0)
    else:
        failed_checks = [i + 1 for i, r in enumerate(results) if not r]
        print(f"{passed}/{total} checks passed. Failed: {failed_checks}")
        print("Resolve the failures above before using Claude Code locally.")
        sys.exit(1)


if __name__ == "__main__":
    main()

How to run it:

pip install httpx
python verify_local_setup.py
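The script confirms that the model emits a `tool_use` block, but an agent loop also has to send the tool's output back. The sketch below builds that follow-up turn in the Anthropic Messages format, where a `tool_result` block refers to the earlier call through `tool_use_id`. The helper name and the sample IDs are illustrative, and how faithfully a local endpoint handles this turn is something to test rather than assume.

```python
def build_tool_result_turn(messages: list, assistant_content: list, result_text: str) -> list:
    """Append the assistant's tool_use turn and a matching tool_result turn.

    `assistant_content` is the 'content' list from the model's response.
    Raises ValueError if it contains no tool_use block.
    """
    tool_use = next((b for b in assistant_content if b.get("type") == "tool_use"), None)
    if tool_use is None:
        raise ValueError("assistant response contains no tool_use block")
    return messages + [
        {"role": "assistant", "content": assistant_content},
        {
            "role": "user",
            "content": [
                {
                    "type": "tool_result",
                    "tool_use_id": tool_use["id"],  # must echo the id of the call
                    "content": result_text,
                }
            ],
        },
    ]


# Illustrative data shaped like the response checked by the script.
first_turn = [{"role": "user", "content": "Read the file at /tmp/test.py and show me its contents."}]
model_reply = [{"type": "tool_use", "id": "toolu_example_1", "name": "read_file",
                "input": {"path": "/tmp/test.py"}}]

follow_up = build_tool_result_turn(first_turn, model_reply, "print('hello')\n")
print(follow_up[-1]["content"][0]["tool_use_id"])
```

Sending `follow_up` as the `messages` field of a second request, with the same `tools` list, would complete one full round trip.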

# Agentic Task Walkthrough

With verification passing, here is what a real agentic session looks like. The task: take an existing Python module with no tests, analyze it, write a test suite, run the tests, and fix any failures.

# Navigate to a project directory
cd ~/projects/my-service

# Confirm Claude Code sees the local configuration
claude --version
# Verify it does not prompt for an Anthropic API key -- if it does,
# the settings.json is not being read correctly

# Start an agentic session
claude

# Inside Claude Code, give the agent a concrete task:
# > Analyze the UserService class in src/user_service.py.
# > Write a pytest test suite covering all public methods.
# > Run the tests and fix any failures.
# > The goal is a clean pytest run with no skips.

Here is what the Claude Code tool call trace looks like for this session:

→ read_file("src/user_service.py")
  Reading 247 lines...

→ list_files("src/")
  Found: user_service.py, models.py, db.py, exceptions.py

→ read_file("src/models.py")
  Reading 89 lines...

→ write_file("tests/test_user_service.py", [test content])
  Written: 312 lines

→ bash("python -m pytest tests/test_user_service.py -v 2>&1")
  Running 14 tests...
  FAILED tests/test_user_service.py::test_update_email_invalid
    AssertionError: Expected ValidationError, got None

→ read_file("src/user_service.py")  [targeted re-read of update_email method]
  ...

→ write_file("tests/test_user_service.py", [corrected test])
  Patched test_update_email_invalid assertion

→ bash("python -m pytest tests/test_user_service.py -v 2>&1")
  14 passed in 1.23s

Gemma 4 handles this pattern reliably: reading files before editing, running tests after changes, and diagnosing failures from the error output rather than retrying blindly. Complex architectural decisions that span many files are where cloud models still have the edge. For the task above (analysis, test generation, and targeted fixes), the local setup is fully capable.

What to watch for: If you see the agent produce "Invalid tool parameters" errors and then retry with the same parameters over and over, either the temperature is too high or the session is not using the gemma4-claude Modelfile variant. Both the temperature and the context window override are baked into the variant; the raw gemma4:26b tag carries neither.
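If you log tool calls yourself, for example in a proxy or a test harness, the retry loop described above is easy to spot programmatically. The `RepeatGuard` class below is a hypothetical helper and not part of Claude Code or Ollama; it counts identical calls and flags the point where a limit is reached.

```python
import json
from collections import Counter


class RepeatGuard:
    """Flag a tool call once it has been issued `max_repeats` times unchanged."""

    def __init__(self, max_repeats: int = 3):
        self.max_repeats = max_repeats
        self.seen = Counter()

    def record(self, name: str, tool_input: dict) -> bool:
        """Record one call; return True when the repeat limit is reached."""
        # sort_keys makes the key independent of parameter ordering.
        key = (name, json.dumps(tool_input, sort_keys=True))
        self.seen[key] += 1
        return self.seen[key] >= self.max_repeats


guard = RepeatGuard(max_repeats=3)
calls = [
    ("read_file", {"path": "src/user_service.py"}),
    ("read_file", {"path": "src/user_service.py"}),
    ("read_file", {"path": "src/models.py"}),
    ("read_file", {"path": "src/user_service.py"}),
]
flags = [guard.record(name, params) for name, params in calls]
print(flags)
```

Only the last call trips the guard, because it is the third identical request for the same path.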

// What Breaks and How to Fix It

Tool Call Formatting Errors
- Symptom: Claude Code reports Invalid tool parameters repeatedly. The agent apologizes and retries with identical or nearly identical parameters, then loops.
- Cause: This is documented in the Ollama GitHub issues. The model generates tool-call JSON that does not match the schema Claude Code expects. The most common problems are wrong field names, missing required fields, and nested objects where scalars are expected.
- Fix: Confirm you are running gemma4-claude (the Modelfile variant), not gemma4:26b directly. The temperature of 0.2 and the system prompt in the Modelfile reduce this substantially. If the problem persists, lower the temperature to 0.1 in the Modelfile and rebuild.
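The three failure shapes named above (wrong field names, missing required fields, nested objects where scalars are expected) can be detected with a small check against the tool's `input_schema`. This is a deliberately minimal sketch that handles only top-level properties and basic JSON types; it is not how Claude Code validates calls internally, and `validate_tool_input` is a hypothetical name.

```python
JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def validate_tool_input(schema: dict, tool_input: dict) -> list:
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []
    properties = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in tool_input:
            problems.append(f"missing required field: {name}")
    for name, value in tool_input.items():
        if name not in properties:
            problems.append(f"unknown field: {name}")
            continue
        declared = properties[name].get("type")
        expected = JSON_TYPES.get(declared)
        if expected is not None and not isinstance(value, expected):
            problems.append(f"wrong type for {name}: expected {declared}")
    return problems


# The same schema the verification script uses for its read_file tool.
read_file_schema = {
    "type": "object",
    "properties": {"path": {"type": "string"}},
    "required": ["path"],
}

print(validate_tool_input(read_file_schema, {"path": "/tmp/test.py"}))
print(validate_tool_input(read_file_schema, {"file": "/tmp/test.py"}))
print(validate_tool_input(read_file_schema, {"path": {"value": "/tmp/test.py"}}))
```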

Context Window Swapping to Disk

Symptom: Generation slows to a crawl after many turns. ollama ps shows GPU utilization dropping. The OS is paging the KV cache to disk.

Fix:

# Option 1: Reduce context window in the Modelfile
# Edit ~/.ollama/Modelfiles/gemma4-claude
# Change: PARAMETER num_ctx 65536
# To:     PARAMETER num_ctx 32768
# Then rebuild: ollama create gemma4-claude -f ~/.ollama/Modelfiles/gemma4-claude

# Option 2: Enable KV cache quantization to reduce memory footprint
# Ollama applies a quantized KV cache only when flash attention is enabled
export OLLAMA_FLASH_ATTENTION=1
export OLLAMA_KV_CACHE_TYPE=q8_0
# This quantizes the KV cache itself, reducing memory at a small quality cost
# Restart Ollama after setting this: pkill ollama && ollama serve
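Both options work because KV cache size scales linearly with context length and with bytes per cached element. The sketch below uses the standard estimate for a transformer with full attention in every layer. The layer count, KV head count, and head dimension are placeholders chosen for round numbers, not Gemma 4's real configuration, and models that use sliding-window layers need less than this formula suggests.

```python
# Approximate bytes per cached element. q8_0 stores blocks of 32 int8 values
# plus one 16-bit scale, which is 34 bytes per 32 elements.
BYTES_PER_ELEMENT = {"f16": 2.0, "q8_0": 34 / 32}


def kv_cache_gib(num_ctx: int, n_layers: int, n_kv_heads: int,
                 head_dim: int, cache_type: str = "f16") -> float:
    """Estimate KV cache size in GiB for one sequence of num_ctx tokens."""
    elements = 2 * n_layers * n_kv_heads * head_dim * num_ctx  # keys + values
    return elements * BYTES_PER_ELEMENT[cache_type] / 2**30


# Placeholder architecture values, NOT taken from the Gemma 4 model card.
ARCH = dict(n_layers=48, n_kv_heads=8, head_dim=128)

for ctx, cache_type in [(65536, "f16"), (32768, "f16"), (65536, "q8_0")]:
    size = kv_cache_gib(ctx, cache_type=cache_type, **ARCH)
    print(f"num_ctx={ctx:>6} cache={cache_type:<5} ~{size:.2f} GiB")
```

Halving `num_ctx` halves the estimate, and switching from f16 to q8_0 cuts it by a little under half, which is why either change can bring a swapping system back inside VRAM.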

Model Reloading Between Agent Turns

Symptom: A noticeable cold-start delay at the beginning of each Claude Code message. Ollama unloads the model after an idle timeout and reloads it on the next request.

Fix:

# Keep the model loaded indefinitely during your work session
export OLLAMA_KEEP_ALIVE=-1

# Or set it in your shell profile for permanent effect
echo 'export OLLAMA_KEEP_ALIVE=-1' >> ~/.zshrc

# Alternatively, use the Ollama API to pin the model
curl /api/generate \
  -d '{"model": "gemma4-claude", "keep_alive": -1}'
# This pins the model until you explicitly unload it or restart Ollama
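The same pin request can be issued from Python, which is convenient at the start of a scripted session. This sketch uses only the standard library; the base URL `http://localhost:11434` is an assumption for a default local install, and the helper names are illustrative.

```python
import json
import urllib.request


def build_pin_request(model: str, base_url: str = "http://localhost:11434") -> urllib.request.Request:
    """Build the POST that asks Ollama to keep `model` loaded indefinitely."""
    body = json.dumps({"model": model, "keep_alive": -1}).encode()
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def pin_model(model: str, base_url: str = "http://localhost:11434") -> bool:
    """Send the pin request; the first call may block while the model loads."""
    request = build_pin_request(model, base_url)
    with urllib.request.urlopen(request, timeout=120) as response:
        return response.status == 200


# Inspect the request without sending it.
request = build_pin_request("gemma4-claude")
print(request.get_method(), request.full_url, request.data.decode())
```

Calling `pin_model("gemma4-claude")` sends the request; building it separately keeps the payload easy to inspect or log.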

Beta Header Rejection Errors
- Symptom: Claude Code produces Unexpected value(s) for the anthropic-beta header errors at startup or mid-session.
- Fix: Confirm CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS: "1" is in your settings.json. If you set it through a shell export instead of settings.json, confirm it is exported in the same session where claude runs:
```
echo $CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS
# Must print: 1
```

# Wrapping Up

The stack described in this article is not a proof of concept. It is a working production setup that engineers have been running daily since Ollama added Anthropic Messages API support in January 2026. The Modelfile is not optional; it is the difference between a tool that works and one that silently produces incomplete output on multi-file tasks. The verification script catches configuration problems before they surface mid-session as confusing agent failures.

The setup built in this article is a private, zero-token-cost agent that handles the bulk of day-to-day engineering work (code analysis, test generation, targeted refactoring, and debugging) at productive speeds on modern consumer hardware.

This setup is not a replacement for cloud inference on complex architectural reasoning across large codebases, or on SWE-bench-class tasks that demand deep repository understanding at scale.

Shittu Olumide is a software engineer and technical writer passionate about leveraging cutting-edge technologies to craft compelling narratives, with a keen eye for detail and a knack for simplifying complex concepts. You can also find Shittu on Twitter.
