Python 3.14 and its New JIT Compiler

Python 3.14 marks a turning point in the development of the world's most popular programming language. Although Python has long been known for its readability and large ecosystem, its lack of speed has often been the “elephant in the room.”
With 3.14, the CPython development team has brought not one, but two of the most anticipated features of recent times.
The end of the GIL
I have written about this before. True multi-threaded parallelism is now available in Python if you want it. If you want more details about GIL-free Python, I will leave a link to my article about it at the end.
A Just-In-Time (JIT) compiler
This experimental feature is now bundled directly into the official installers, and that's what we'll focus on here. It's the result of years of architectural work by the Python core team and others, aimed at making Python “automatically faster” without breaking the C-extension ecosystem that powers everything from data science to web backends.
In this article, we'll lift the hood of the new JIT, examine how it differs from previous optimization efforts, and walk through some benchmarks to help you decide if it's time to try the JIT on your own workloads.
What is Python's New Just-In-Time (JIT) compiler?
To understand the 3.14 JIT, we need to understand how Python runs traditionally. Standard Python (CPython) is an interpreted language. When you run a script, your code is compiled into bytecode, which is a set of instructions executed by the CPython virtual machine.
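To make the bytecode step concrete, the standard-library dis module can show what CPython compiled a function into. This is a minimal sketch: add_numbers is a hypothetical example function, and the exact instruction names vary between CPython versions.

```python
import dis


def add_numbers(a, b):
    """Hypothetical example function whose bytecode we want to inspect."""
    return a + b


def opcode_names(func):
    """Return the names of the bytecode instructions compiled for func."""
    return [instr.opname for instr in dis.get_instructions(func)]


if __name__ == "__main__":
    # Human-readable listing of the bytecode the virtual machine executes.
    dis.dis(add_numbers)
    print(opcode_names(add_numbers))
```

Running it on different Python versions is a quick way to see how the instruction set has evolved.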
The JIT changes this flow. Instead of simply interpreting the bytecode instruction by instruction, the JIT monitors which parts of your code are executed the most (“hot” paths). When a function or loop is considered “hot,” the JIT translates the bytecode into native machine code (instructions that the CPU understands directly). Then, the next time that code runs, no interpretation is needed. Instead, the machine code runs directly. This can be a great time saver, as we will see later.
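The idea of counting executions and promoting hot code can be sketched with a toy model. This is only an illustration of the concept, not how CPython implements it: ToyTieredRunner, HOT_THRESHOLD, and the tier labels are all made up for this example, and CPython's real counters and thresholds differ.

```python
from collections import Counter

HOT_THRESHOLD = 3  # assumed value for illustration only


class ToyTieredRunner:
    """Toy model: count executions and 'promote' code once it becomes hot."""

    def __init__(self, threshold=HOT_THRESHOLD):
        self.threshold = threshold
        self.counts = Counter()
        self.compiled = set()  # names of code units that were promoted

    def run(self, name, func, *args):
        if name in self.compiled:
            tier = "native"  # fast path: would be machine code in a real JIT
        else:
            tier = "interpreted"  # slow path: bytecode interpretation
            self.counts[name] += 1
            if self.counts[name] >= self.threshold:
                self.compiled.add(name)  # promote for the next call
        return tier, func(*args)


if __name__ == "__main__":
    runner = ToyTieredRunner()
    for i in range(5):
        print(runner.run("square", lambda x: x * x, i))
```

With the assumed threshold of 3, the first three calls report the interpreted tier and later calls report the native tier, which mirrors the warm-up behaviour described above.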
How JIT fits into CPython
The Python 3.14 JIT is not a total rewrite. It is designed as an add-on component that works alongside the existing interpreter. It uses a so-called “copy-and-patch” approach, which keeps the JIT lightweight and portable across CPU architectures without requiring a large, complex compiler backend like LLVM at runtime.
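The copy-and-patch idea can be illustrated with a toy: pre-built templates contain placeholder “holes,” and code generation just copies a template and patches the holes with concrete values. Everything below is hypothetical (the STENCILS table, the byte values, and the HOLE marker); the real JIT works with machine-code templates prepared when CPython itself is built.

```python
import struct

HOLE = b"\xAA\xAA\xAA\xAA"  # placeholder bytes marking where an operand goes

# Hypothetical "stencils": one pre-built byte template per operation.
STENCILS = {
    "LOAD_CONST": b"\x01" + HOLE,
    "ADD_CONST": b"\x02" + HOLE,
    "RETURN": b"\x03",
}


def copy_and_patch(trace):
    """Copy the stencil for each (op, operand) pair and patch in the operand."""
    code = bytearray()
    for op, operand in trace:
        chunk = bytearray(STENCILS[op])  # copy the template
        pos = chunk.find(HOLE)
        if pos != -1:
            # Patch the hole with the operand as a 4-byte little-endian int.
            chunk[pos:pos + 4] = struct.pack("<i", operand)
        code += chunk
    return bytes(code)


if __name__ == "__main__":
    trace = [("LOAD_CONST", 5), ("ADD_CONST", 7), ("RETURN", None)]
    print(copy_and_patch(trace).hex())
```

The appeal of the approach is that the expensive work (producing good templates) happens ahead of time, while the runtime step is little more than memory copies and a few writes.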
What Changed in Python 3.14?
Python 3.13 had a basic, experimental JIT, but it was disabled by default. If you wanted to test it, you had to build CPython from source with a configure flag such as --enable-experimental-jit.
With Python 3.14, that changed. The JIT is now included in the official Windows and macOS installers, which means you no longer need a C compiler on your machine to try it. Although it is still “experimental,” its inclusion in the official binaries indicates that the core team believes the JIT is stable enough for public testing.
Getting Python 3.14
Go to the official Python website (python.org) and you will see a download option for 3.14. Click that, and follow the instructions.
Otherwise, if you have the uv tool installed, you can type the following.
PS C:\> uv python install 3.14
Enabling the JIT
By default, the JIT is disabled. This is a safety measure; because it's experimental, the Python Steering Council wants to ensure that users don't face unexpected regressions in stability or memory usage without explicitly opting in.
To enable the JIT, you use an environment variable. This tells the CPython runtime to start the JIT engine when it starts.
On Windows (PowerShell):
$env:PYTHON_JIT=1
python my_script.py
On macOS/Linux (Bash/Zsh):
export PYTHON_JIT=1
python my_script.py
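To confirm that the environment variable took effect, Python 3.14 documents a small introspection namespace, sys._jit. The sketch below assumes that namespace and guards against its absence on older versions; the leading underscore marks it as a private API that may change, and jit_status is a helper name made up for this example.

```python
import sys


def jit_status():
    """Report JIT state; values are None where sys._jit does not exist."""
    jit = getattr(sys, "_jit", None)
    if jit is None:
        # Older Python versions have no JIT introspection namespace.
        return {"available": None, "enabled": None}
    return {
        "available": jit.is_available(),  # was this build compiled with a JIT?
        "enabled": jit.is_enabled(),      # is the JIT switched on for this process?
    }


if __name__ == "__main__":
    print(sys.version)
    print(jit_status())
```

Running the script once with PYTHON_JIT=0 and once with PYTHON_JIT=1 should show the "enabled" value change on a build where the JIT is available.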
Once enabled, CPython does not compile everything immediately with the JIT. It uses a tiering system. Basically, it tries to run the code as cheaply as possible first, and only spends compilation effort on the parts that prove to be hot.
- Tier 0: Standard interpretation.
- Tier 1: Specialized bytecode (introduced in 3.11).
- Tier 2 (JIT): Machine code generation for frequently executed code paths.
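The specialized-bytecode tier is the easiest one to observe from Python itself. A rough sketch, assuming Python 3.11 or later for the adaptive view of dis: after a function has run many times with the same operand types, some generic instructions may be replaced by specialized variants. The function add_ints and the warm-up count are arbitrary choices for this example.

```python
import dis
import sys


def add_ints(a, b):
    """Hypothetical function that is always called with two ints."""
    return a + b


def adaptive_opnames(func, warmup=1000):
    """Run func repeatedly, then list its (possibly specialized) instructions."""
    for i in range(warmup):
        func(i, 1)
    if sys.version_info >= (3, 11):
        instrs = dis.get_instructions(func, adaptive=True)
    else:
        instrs = dis.get_instructions(func)  # no adaptive view before 3.11
    return [ins.opname for ins in instrs]


if __name__ == "__main__":
    print(adaptive_opnames(add_ints))
```

Whether and how an instruction gets specialized depends on the interpreter version, so treat the printed names as something to explore rather than a fixed result.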
Measuring the Impact of JIT
When testing the JIT, you cannot simply wrap a single call in time.time() and trust the number. JITs need time to warm up. The first few iterations of a loop may be slower than usual while the JIT profiles the code, but subsequent iterations can be much faster.
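One way to make warm-up visible is to time every repetition separately instead of taking a single measurement. A minimal sketch, where hot_loop is a hypothetical CPU-bound workload and the repeat count is arbitrary:

```python
import time


def hot_loop(n=200_000):
    """Hypothetical CPU-bound workload used only for this illustration."""
    total = 0
    for i in range(n):
        total += i * i % 7
    return total


def time_iterations(func, repeats=10):
    """Time each call separately so warm-up effects are visible."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    for i, t in enumerate(time_iterations(hot_loop), 1):
        print(f"iteration {i}: {t:.6f}s")
```

time.perf_counter() is used here because it is a monotonic, high-resolution clock intended for measuring short durations.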
The Benchmark Suite
Below is a test program designed to exercise different aspects of the JIT, from heavy calculations to complex object management.
File 1: workloads.py
This file contains three separate CPU-bound tasks.
1/ The Mandelbrot function iterates the Mandelbrot formula over a pixel grid and returns a checksum of the per-pixel iteration counts.
2/ The Dijkstra function builds a random but deterministic weighted graph (fixed seed), runs Dijkstra's algorithm from node 0, and returns how many nodes were visited.
3/ The Levenshtein function generates N pairs of random strings and returns the sum of their Levenshtein distances.
from __future__ import annotations

import random
import heapq


# Workload 1: Mandelbrot (CPU + math loops)
def mandelbrot(width: int = 1000, height: int = 1000, iters: int = 500) -> int:
    checksum = 0
    for y in range(height):
        cy = (y / height) * 2.4 - 1.2
        for x in range(width):
            cx = (x / width) * 3.2 - 2.2
            zx, zy, count = 0.0, 0.0, 0
            while zx * zx + zy * zy <= 4.0 and count < iters:
                zx, zy = zx * zx - zy * zy + cx, 2.0 * zx * zy + cy
                count += 1
            checksum += count
    return checksum


# Workload 2: Dijkstra (heap + list + logic)
def dijkstra(n: int = 10000, edges_per_node: int = 50, seed: int = 123) -> int:
    rng = random.Random(seed)
    graph = [[] for _ in range(n)]
    for u in range(n):
        for _ in range(edges_per_node):
            v = rng.randrange(n)
            if v != u:
                graph[u].append((v, rng.randrange(1, 30)))
    dist = [10**12] * n
    dist[0] = 0
    pq = [(0, 0)]
    visited = 0
    while pq:
        d, u = heapq.heappop(pq)
        if d != dist[u]:
            continue
        visited += 1
        for v, w in graph[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return visited


# Workload 3: Levenshtein distance (dynamic programming)
def levenshtein(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(cur[j - 1] + 1, prev[j] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def levenshtein_batch(n: int = 10000, seed: int = 7, k: int = 50) -> int:
    """
    Deterministic batch: fixed RNG seed, fixed alphabet, fixed string length.
    Returns the sum of distances.
    """
    rng = random.Random(seed)
    alphabet = "abc"
    total = 0
    for _ in range(n):
        a = "".join(rng.choices(alphabet, k=k))
        b = "".join(rng.choices(alphabet, k=k))
        total += levenshtein(a, b)
    return total
File 2: benchmark.py
This script automatically compares different tasks with JIT enabled and disabled.
import os
import time
import json
import subprocess
from pathlib import Path

PYTHON_EXE = r"C:\Users\thoma\AppData\Local\Programs\Python\Python314\python.exe"
PROJECT_DIR = Path(__file__).resolve().parent

# Original workloads (statement prints a result for sanity)
WORKLOADS = [
    ("mandelbrot", 'from workloads import mandelbrot; print(mandelbrot())'),
    ("dijkstra", 'from workloads import dijkstra; print(dijkstra())'),
    ("levenshtein_batch", 'from workloads import levenshtein_batch; print(levenshtein_batch())'),
]

N_RUNS = 10  # average of ALL runs (set to 6/10/20 as you like)
OUTFILE = PROJECT_DIR / "results_avg.json"


def run_once(stmt: str, jit_val: int) -> tuple[float, str]:
    env = os.environ.copy()
    env["PYTHON_JIT"] = str(jit_val)
    # Ensure local workloads.py is importable in subprocess
    env["PYTHONPATH"] = str(PROJECT_DIR) + (os.pathsep + env.get("PYTHONPATH", ""))
    # Note: the timing covers the whole subprocess, including interpreter startup
    t0 = time.perf_counter()
    p = subprocess.run(
        [PYTHON_EXE, "-c", stmt],
        env=env,
        cwd=str(PROJECT_DIR),
        capture_output=True,
        text=True,
    )
    t1 = time.perf_counter()
    if p.returncode != 0:
        raise RuntimeError(
            f"Run failed (PYTHON_JIT={jit_val})\n\n"
            f"Statement:\n{stmt}\n\n"
            f"STDOUT:\n{p.stdout}\n\nSTDERR:\n{p.stderr}"
        )
    return (t1 - t0, p.stdout.strip())


def summarize(times: list[float]) -> dict:
    return {
        "avg": sum(times) / len(times),
        "min": min(times),
        "max": max(times),
        "runs": times,
    }


def bench_workload(name: str, stmt: str) -> dict:
    results = {}
    outputs = {}
    for jit_val in (0, 1):
        times = []
        outs = []
        print(f" PYTHON_JIT={jit_val}: running {N_RUNS} times...")
        for i in range(1, N_RUNS + 1):
            dt, out = run_once(stmt, jit_val)
            times.append(dt)
            outs.append(out)
            print(f" run {i}/{N_RUNS}: {dt:.6f}s")
        results[jit_val] = summarize(times)
        outputs[jit_val] = outs
    avg0 = results[0]["avg"]
    avg1 = results[1]["avg"]
    speedup = avg0 / avg1 if avg1 else float("inf")
    delta_pct = (avg1 - avg0) / avg0 * 100.0 if avg0 else 0.0
    return {
        "workload": name,
        "jit0": results[0],
        "jit1": results[1],
        "speedup_jit0_over_jit1": speedup,
        "delta_pct_jit1_vs_jit0": delta_pct,
        "outputs": outputs,  # sanity: should be stable
    }


def main() -> int:
    all_results = []
    print(f"Using Python: {PYTHON_EXE}")
    print(f"Project dir: {PROJECT_DIR}")
    print(f"Runs per setting (avg of all runs): {N_RUNS}\n")
    for name, stmt in WORKLOADS:
        print(f"=== {name} ===")
        r = bench_workload(name, stmt)
        all_results.append(r)
        print("\n Averages:")
        print(f" JIT=0 avg: {r['jit0']['avg']:.6f}s (min {r['jit0']['min']:.6f}, max {r['jit0']['max']:.6f})")
        print(f" JIT=1 avg: {r['jit1']['avg']:.6f}s (min {r['jit1']['min']:.6f}, max {r['jit1']['max']:.6f})")
        print(f" Speedup (JIT=0 / JIT=1): {r['speedup_jit0_over_jit1']:.3f}× (Δ={r['delta_pct_jit1_vs_jit0']:+.2f}%)\n")
        # Optional: warn if outputs vary across runs (nondeterminism)
        if len(set(r["outputs"][0])) != 1:
            print(" !! WARNING: JIT=0 output differs across runs (nondeterministic workload?)")
        if len(set(r["outputs"][1])) != 1:
            print(" !! WARNING: JIT=1 output differs across runs (nondeterministic workload?)")
    OUTFILE.write_text(json.dumps(all_results, indent=2), encoding="utf-8")
    print(f"Wrote: {OUTFILE}")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
Here are my results.
C:\Users\thoma\projects\python_jit>C:\Users\thoma\AppData\Local\Programs\Python\Python314\python.exe benchmark.py
Using Python: C:\Users\thoma\AppData\Local\Programs\Python\Python314\python.exe
Project dir: C:\Users\thoma\projects\python_jit
Runs per setting (avg of all runs): 10
=== mandelbrot ===
PYTHON_JIT=0: running 10 times...
run 1/10: 6.890924s
run 2/10: 6.950737s
run 3/10: 7.265357s
run 4/10: 6.947150s
run 5/10: 6.932333s
run 6/10: 6.939378s
run 7/10: 7.194705s
run 8/10: 6.995550s
run 9/10: 6.902696s
run 10/10: 7.256164s
PYTHON_JIT=1: running 10 times...
run 1/10: 5.216740s
run 2/10: 5.241888s
run 3/10: 5.350822s
run 4/10: 5.246767s
run 5/10: 5.294771s
run 6/10: 5.273295s
run 7/10: 5.272135s
run 8/10: 5.617062s
run 9/10: 5.251656s
run 10/10: 5.239060s
Averages:
JIT=0 avg: 7.027499s (min 6.890924, max 7.265357)
JIT=1 avg: 5.300420s (min 5.216740, max 5.617062)
Speedup (JIT=0 / JIT=1): 1.326× (Δ=-24.58%)
=== dijkstra ===
PYTHON_JIT=0: running 10 times...
run 1/10: 0.235401s
run 2/10: 0.227603s
run 3/10: 0.244492s
run 4/10: 0.232971s
run 5/10: 0.249589s
run 6/10: 0.232229s
run 7/10: 0.229422s
run 8/10: 0.238399s
run 9/10: 0.230657s
run 10/10: 0.235772s
PYTHON_JIT=1: running 10 times...
run 1/10: 0.238862s
run 2/10: 0.239266s
run 3/10: 0.240312s
run 4/10: 0.231413s
run 5/10: 0.232692s
run 6/10: 0.233783s
run 7/10: 0.230016s
run 8/10: 0.237760s
run 9/10: 0.240895s
run 10/10: 0.246033s
Averages:
JIT=0 avg: 0.235653s (min 0.227603, max 0.249589)
JIT=1 avg: 0.237103s (min 0.230016, max 0.246033)
Speedup (JIT=0 / JIT=1): 0.994× (Δ=+0.62%)
=== levenshtein_batch ===
PYTHON_JIT=0: running 10 times...
run 1/10: 2.176256s
run 2/10: 2.171253s
run 3/10: 2.171834s
run 4/10: 2.170444s
run 5/10: 2.149874s
run 6/10: 2.162820s
run 7/10: 2.171975s
run 8/10: 2.199151s
run 9/10: 2.168398s
run 10/10: 2.167821s
PYTHON_JIT=1: running 10 times...
run 1/10: 1.575666s
run 2/10: 1.612615s
run 3/10: 1.571106s
run 4/10: 1.584650s
run 5/10: 1.579948s
run 6/10: 1.582633s
run 7/10: 1.593924s
run 8/10: 1.573608s
run 9/10: 1.581427s
run 10/10: 1.578553s
Averages:
JIT=0 avg: 2.170983s (min 2.149874, max 2.199151)
JIT=1 avg: 1.583413s (min 1.571106, max 1.612615)
Speedup (JIT=0 / JIT=1): 1.371× (Δ=-27.06%)
Interpreting the Results
As you can see, the results are a mixed bag. This is common in JIT testing.
- 10–30% Speed-Up: Typical of “pure Python” loops (such as the Mandelbrot and Levenshtein tests, whose run time dropped by roughly 25% and 27% here) where the JIT can avoid the bytecode dispatch loop.
- 0% Improvement: Common for I/O-bound functions or code that leans heavily on C extensions. The Dijkstra workload didn't speed up, most likely because its runtime is dominated by heap and tuple operations (heapq is implemented in C) and allocation-heavy work that the current CPython JIT doesn't optimize, so any savings in the interpreter are lost in the noise.
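To check whether a difference like the Dijkstra one is really “lost in the noise,” a crude comparison of the gap between the means against the run-to-run spread is often enough. The sketch below uses the Dijkstra timings from the benchmark output above; within_noise is a helper made up for this example and is a rough heuristic, not a proper statistical test.

```python
from statistics import mean, stdev

# Dijkstra run times (seconds) copied from the benchmark output above.
JIT_OFF = [0.235401, 0.227603, 0.244492, 0.232971, 0.249589,
           0.232229, 0.229422, 0.238399, 0.230657, 0.235772]
JIT_ON = [0.238862, 0.239266, 0.240312, 0.231413, 0.232692,
          0.233783, 0.230016, 0.237760, 0.240895, 0.246033]


def within_noise(a, b):
    """Crude check: is the gap between means smaller than the larger spread?"""
    gap = abs(mean(a) - mean(b))
    return gap < max(stdev(a), stdev(b))


if __name__ == "__main__":
    print(f"mean off={mean(JIT_OFF):.6f}s, on={mean(JIT_ON):.6f}s")
    print(f"stdev off={stdev(JIT_OFF):.6f}s, on={stdev(JIT_ON):.6f}s")
    print("difference within noise:", within_noise(JIT_OFF, JIT_ON))
```

For the Dijkstra numbers the gap between the two averages is smaller than the spread of either set of runs, which supports reading that result as “no measurable change” rather than as a slowdown.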
When to use the Python 3.14 JIT
The JIT is a powerful tool, but it is not a “magic button.” In my experience, you should try the JIT if you have…
- CPU-Bound Logic: Your application performs heavy calculations, data processing, or complex logic in pure Python.
- Long-Running Processes: Web servers (Gunicorn/Uvicorn) or background workers (Celery) run for hours, giving the JIT more time to warm up and optimize hot paths.
- Forward Testing: You want to prepare your codebase for future versions of Python (3.15+), where the JIT will likely be more aggressive.
And avoid it if you have…
- I/O-Bound Applications: If your application mostly waits for database queries or API responses, the JIT will not help.
- Memory-Constrained Environments: Small serverless (Lambda) functions or small containers may suffer from the extra memory used by the JIT's code cache.
- Short-Lived CLI Tools: A script that runs in less than a second doesn't need a JIT.
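To judge whether a script counts as “short-lived” in the sense above, it can help to compare its total run time with the bare cost of starting the interpreter. A small sketch, assuming the interpreter under test is the one running the script (sys.executable); startup_seconds is a helper name made up for this example.

```python
import subprocess
import sys
import time


def startup_seconds(python_exe=sys.executable, repeats=5):
    """Best-of-N wall time to start the interpreter and exit immediately."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run([python_exe, "-c", "pass"], check=True)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    print(f"interpreter startup: {startup_seconds():.3f}s")
```

If a tool's total run time is close to this number, there is little hot code left for the JIT to improve.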
Future directions: Beyond 3.14
The CPython core team views 3.14 as a “foundation year.” Future iterations (Python 3.15 and 3.16) are expected to include:
- Advanced Optimization Passes: Using type information gathered at runtime to generate more aggressive machine code.
- Better Heuristics: Smarter decisions about when to compile, reducing the “warm-up” penalty.
- Lower Overhead: Refining the copy-and-patch approach to reduce memory usage.
Summary
The JIT for Python 3.14 is more than just a performance patch. It is a statement of purpose. It shows that Python is committed to closing the performance gap with languages like Java or Go while maintaining the “batteries-included” simplicity that made it popular.
For many developers, JIT is just another tool worth exploring. If performance is important to your projects, it's worth testing Python 3.14 against your existing workload. A few benchmarks on your most important code paths may reveal performance gains you didn't expect.
Here is a link to my previous article on GIL-free Python, which I mentioned earlier.



