
I asked ChatGPT, Claude and DeepSeek to build Tetris


# Introduction

It seems like almost every week, a new model claims to be the best, beating existing AI models in every benchmark.

I get free access to the latest AI models at my full-time job during release weeks. I usually don't pay too much attention to the hype and use whatever model is automatically selected by the program.

However, I know developers and friends who want to build software with AI that can be shipped to production. Since they pay for these tools out of their own pocket, their challenge lies in finding the best model for the job: one that balances cost and reliability.

Because of this, after the release of GPT-5.2, I decided to run a practical test to see whether this model was worth the hype, and whether it really was better than the competition.

Specifically, I chose to test the flagship models from each provider: Claude Opus 4.5 (Anthropic's most capable model), GPT-5.2 Pro (OpenAI's latest extended-reasoning model), and DeepSeek V3.2 (one of the latest open-source models).

To test these models, I had each one create a playable Tetris game from a single prompt.

These are the metrics I used to evaluate the success of each model:

| Criterion | Explanation |
|---|---|
| First-attempt success | With just one prompt, does the model deliver working code? Multiple debugging iterations add cost over time, which is why this metric was chosen. |
| Feature completeness | Are all the features mentioned in the prompt built by the model, or are any missed? |
| Playability | Beyond technically working, is the game actually smooth to play? Or were there issues that hurt the user experience? |
| Cost effectiveness | How much did it cost to get production-ready code? |

# Prompt

Here's the prompt I gave each AI model:

Build a fully functional Tetris game as a single HTML file that I can open directly in my browser.

Requirements:

GAME COMPONENTS:
– All 7 Tetris piece types
– Smooth piece rotation with wall collision detection
– Pieces should fall automatically, with speed gradually increasing as the player's score increases
– Line clearing with visual effects
– “Next piece” preview box
– Game over detection when the pieces reach the top

CONTROLS:
– Arrow keys: Left/Right to move, Down to soft drop, Up to rotate
– Mobile touch controls: Swipe left/right to move, swipe down to drop, tap to rotate
– Spacebar to pause/unpause
– Enter key to restart after game over

VISUAL DESIGN:
– Gradient colors for each piece type
– Smooth animations for piece movement and line clears
– Clean UI with rounded corners
– Real-time score updates
– Level indicator
– Game over screen with final score and restart button

GAMEPLAY FEEL AND POLISH:
– Smooth 60fps gameplay
– Particle effects when lines are cleared (optional but awesome)
– Score bonus based on the number of lines cleared at once
– Grid background
– Responsive design

Make it look polished and feel satisfying to play. The code should be clean and well organized.
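The two mechanics in this prompt that models most often get subtly wrong are the score-driven gravity curve and the multi-line scoring bonus. As a reference point, here is a minimal sketch of how they might be implemented; the interval constants and level thresholds are my own illustrative assumptions, not taken from any model's output (the 100/300/500/800 line values follow classic Tetris scoring):

```javascript
// Hypothetical gravity/scoring helpers; constants are illustrative assumptions.

// Drop interval in ms: starts at 800ms and shrinks 10% per level, floored at 100ms.
function dropInterval(level) {
  return Math.max(100, 800 * Math.pow(0.9, level));
}

// Level derived from total lines cleared: one level per 10 lines.
function levelFor(totalLines) {
  return Math.floor(totalLines / 10);
}

// Classic Tetris-style scoring: clearing more lines at once pays more than linearly.
const LINE_SCORES = [0, 100, 300, 500, 800];
function scoreFor(linesCleared, level) {
  return LINE_SCORES[linesCleared] * (level + 1);
}
```

A superlinear table like this is what makes multi-line clears feel rewarding; a model that simply multiplies a flat per-line value misses that part of the requirement.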

# Results

// 1. Claude Opus 4.5

The Opus 4.5 model did exactly what I asked for.

The UI was clean and the instructions were clearly displayed on the screen. All the controls were responsive and the game was fun to play.

The game is so smooth that I ended up playing for a long time and was distracted from testing other models.

Also, Opus 4.5 took less than 2 minutes to provide me with this working game, leaving me impressed with the first attempt.

Tetris game developed by Opus 4.5

// 2. GPT-5.2 Pro

GPT-5.2 Pro is OpenAI's latest extended-reasoning model. For context, GPT-5.2 comes in three tiers: Instant, Thinking, and Pro. At the time of writing, GPT-5.2 Pro is their most intelligent model, offering extended reasoning capabilities.

It is also 4x more expensive than Opus 4.5.

There was a lot of hype around this model, which led to me going in with high expectations.

Unfortunately, I was underwhelmed by the performance of this model.

On the first attempt, GPT-5.2 Pro produced a Tetris game with a layout glitch. The bottom rows of the board were out of view, and I couldn't see where the pieces landed.

This made the game unplayable, as shown in the screenshot below:

Tetris game developed by GPT-5.2

I was especially surprised by this bug as it took about 6 minutes for the model to generate this code.

I decided to try again with this follow-up prompt to fix the viewport problem:

The game works, but there is a bug. The bottom rows of the Tetris board are cut off at the bottom of the screen. I can't see the pieces when they land, and the canvas extends beyond the visible viewport.

Please fix this by:
1. Making sure the entire game board fits within the viewport
2. Adding proper centering so that the full board is visible

The game must fit on the screen with all rows visible.
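The underlying fix is straightforward: the canvas has to be sized so the whole board fits inside the viewport. A sketch of the sizing logic I would expect in the corrected code (the function and parameter names here are my own, not from GPT-5.2 Pro's output):

```javascript
// Hypothetical helper: pick a cell size so a cols×rows board fits the viewport.
function fitCellSize(viewportW, viewportH, cols, rows, maxCell = 30) {
  const byWidth = Math.floor(viewportW / cols);   // largest cell that fits horizontally
  const byHeight = Math.floor(viewportH / rows);  // largest cell that fits vertically
  return Math.min(maxCell, byWidth, byHeight);
}

// The canvas is then sized from the cell size, so no row can fall off-screen.
function boardPixelSize(cols, rows, cell) {
  return { width: cols * cell, height: rows * cell };
}
```

Deriving the canvas dimensions from the viewport (rather than hard-coding them) is what keeps the bottom rows visible on any screen.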

After this follow-up prompt, GPT-5.2 Pro produced a working game, as seen in the screenshot below:

Tetris second try with GPT-5.2

However, the game was not as smooth as the one produced by the Opus 4.5 model.

When I pressed the “down” arrow to drop a piece, the next piece would sometimes immediately drop at high speed, not giving me enough time to think about where to place it.

The game ended up being playable only if I let each piece fall on its own, which wasn't great.
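I didn't dig into GPT-5.2 Pro's code, but this is a classic soft-drop bug: the held “down” state carries over to the newly spawned piece. A common guard, sketched here with hypothetical names, is to clear the soft-drop flag whenever a new piece spawns:

```javascript
// Hypothetical input-state sketch: reset soft drop when a new piece spawns,
// so a held Down key never carries over to the next piece.
const input = { softDrop: false };

function onKeyDown(key) {
  if (key === "ArrowDown") input.softDrop = true;
}

function onKeyUp(key) {
  if (key === "ArrowDown") input.softDrop = false;
}

function spawnPiece(state) {
  input.softDrop = false; // require a fresh key press for the new piece
  state.piece = state.nextPiece;
}
```

With this guard, the player has to press Down again for each piece, which is exactly the behavior the Opus 4.5 version got right.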

(Note: I also tried the standard GPT-5.2 model, which produced similarly buggy code on the first attempt.)

// 3. DeepSeek V3.2

DeepSeek's first attempt at building this game had two problems:

  • Pieces would disappear when they reached the bottom of the screen.
  • The “down” arrow used to quickly drop pieces ended up scrolling the entire web page instead of moving the game piece.
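The page-scrolling issue is well-known browser behavior: arrow keys and Space scroll the page by default, so a Tetris keydown handler has to call `preventDefault()` on the keys it consumes. A minimal version of the fix (handler and set names are mine):

```javascript
// Keys the game consumes; anything else is left to the browser.
const GAME_KEYS = new Set(["ArrowLeft", "ArrowRight", "ArrowDown", "ArrowUp", " "]);

function handleKeyDown(event) {
  if (!GAME_KEYS.has(event.key)) return;
  event.preventDefault(); // stop the browser from scrolling the page
  // ...dispatch the move/rotate/drop/pause action here...
}

// Register the handler when running in a browser.
if (typeof document !== "undefined") {
  document.addEventListener("keydown", handleKeyDown);
}
```

Only intercepting the game's own keys matters too: swallowing every keydown would break browser shortcuts like F5 or Ctrl+R.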

Tetris game developed by DeepSeek V3.2

I asked the model to fix these problems, and the game controls then worked fine.

However, some pieces still disappeared before they landed. This made the game completely unplayable even after the second attempt.

I'm sure this problem could be fixed with 2–3 more prompts, and given DeepSeek's low prices, you could afford 10+ rounds of debugging and still spend less than a single successful Opus 4.5 attempt.

# Summary: GPT-5.2 vs Opus 4.5 vs DeepSeek 3.2

// Cost Comparison

Here is a cost comparison between the three models:

| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| DeepSeek V3.2 | $0.27 | $1.10 |
| GPT-5.2 | $1.75 | $14.00 |
| Claude Opus 4.5 | $5.00 | $25.00 |
| GPT-5.2 Pro | $21.00 | $84.00 |

DeepSeek V3.2 is the cheapest option, and you can download the model weights for free and use them in your infrastructure.

GPT-5.2 is about 7x more expensive than DeepSeek V3.2, followed by Opus 4.5 and GPT-5.2 Pro.

For this particular task (creating a Tetris game), each attempt consumed about 1,000 input tokens and 3,500 output tokens.

For each additional debugging iteration, I estimated an extra 1,500 tokens. Here are the estimated costs for each model:

| Model | Total Cost | Result |
|---|---|---|
| DeepSeek V3.2 | ~$0.005 | Game unplayable |
| GPT-5.2 | ~$0.07 | Playable, but poor user experience |
| Claude Opus 4.5 | ~$0.09 | Playable and user-friendly |
| GPT-5.2 Pro | $0.41 | Playable, but poor user experience |
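These estimates can be reproduced from the per-million-token prices above. A small helper of my own (not part of any vendor API) that converts token counts into dollars:

```javascript
// Cost in dollars for a single request, given per-1M-token prices.
function requestCost(inputTokens, outputTokens, inputPricePer1M, outputPricePer1M) {
  return (inputTokens * inputPricePer1M + outputTokens * outputPricePer1M) / 1e6;
}

// Opus 4.5's one-shot success: ~1,000 input and ~3,500 output tokens.
const opusCost = requestCost(1000, 3500, 5.0, 25.0); // ≈ $0.09
```

Note how output tokens dominate the bill: at these prices, the 3,500 output tokens cost roughly 17x more than the 1,000 input tokens.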

# Takeaways

Based on my experience in creating this game, I would stick to the Opus 4.5 model for everyday coding tasks.

Although GPT-5.2 is cheaper than Opus 4.5, I personally would not use it for coding, as the extra iterations required to produce the same result would end up costing roughly the same amount.

DeepSeek V3.2, however, is far more affordable than the other models on this list.

If you're a developer on a budget and have free time to debug, you'll still end up saving money even if it takes you more than 10 tries to get the code working.

I was surprised by GPT-5.2 Pro's inability to produce a working game on the first try, given that it spent about 6 minutes thinking before producing broken code. After all, this is OpenAI's flagship model, and Tetris should be an easy task.

However, GPT-5.2 Pro's strengths lie in mathematical and scientific research, and it is designed specifically for problems that don't rely on pattern recognition from training data. Perhaps this model is overkill for simple day-to-day coding tasks, and should instead be reserved for building something complex that requires novel architecture.

What this test showed:

  • Opus 4.5 works very well for everyday coding tasks.
  • DeepSeek V3.2 is a budget alternative that delivers a reasonable result, although it requires some debugging effort to get there.
  • GPT-5.2 (Standard) didn't perform as well as Opus 4.5, while GPT-5.2 (Pro) is better suited for complex reasoning than for quick coding tasks like this one.

Feel free to repeat this test with the prompt I shared above, and happy coding!

Natassha Selvaraj is a self-taught data scientist with a passion for writing. Natassha writes on everything related to data science. You can connect with her on LinkedIn or check out her YouTube channel.
