Generative AI

MoE Architecture Comparison: Qwen3 30B-A3B vs. GPT-OSS 20B

This article provides a technical comparison between two open-weight Mixture-of-Experts (MoE) models: Alibaba's Qwen3 30B-A3B (April 2025) and OpenAI's GPT-OSS 20B (August 2025).

Model Overview

Feature                | Qwen3 30B-A3B           | GPT-OSS 20B
Total parameters       | 30.5B                   | 21B
Active parameters      | 3.3B                    | 3.6B
Number of layers       | 48                      | 24
MoE experts            | 128 (8 active)          | 32 (4 active)
Attention architecture | Grouped Query Attention | Grouped multi-query attention
Query/KV heads         | 32Q / 4KV               | 64Q / 8KV
Context length         | 32,768 (ext. 262,144)   | 128,000
Vocabulary size        | 151,936                 | o200k_harmony (~200k)
Quantization           | Standard precision      | Native MXFP4
Release date           | April 2025              | August 2025

Sources: Official Qwen3 documentation, OpenAI GPT-OSS model card

Qwen3 30B-A3B Technical Specifications

Architecture Details

Qwen3 30B-A3B uses a deep transformer architecture of 48 layers, each containing a Mixture-of-Experts block with 128 experts per layer. The model activates 8 experts per token during inference, striking a balance between computational efficiency and model capacity.
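To make the routing mechanics concrete, here is a minimal, self-contained sketch of top-k expert routing in PyTorch. The hidden sizes are illustrative placeholders, not Qwen3's actual configuration, and production implementations batch tokens per expert rather than looping:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoELayer(nn.Module):
    """Minimal top-k MoE routing sketch (sizes are illustrative, not Qwen3's)."""
    def __init__(self, d_model=256, d_ff=512, num_experts=128, top_k=8):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):                      # x: (num_tokens, d_model)
        scores = self.router(x)                # one logit per expert per token
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # normalize over the selected experts
        out = torch.zeros_like(x)
        for t in range(x.size(0)):             # each token runs only its top-k experts
            for k in range(self.top_k):
                expert = self.experts[idx[t, k].item()]
                out[t] += weights[t, k] * expert(x[t])
        return out

moe = TopKMoELayer()
print(moe(torch.randn(4, 256)).shape)  # torch.Size([4, 256])
```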

Attention Mechanism

The model uses Grouped Query Attention (GQA) with 32 query heads and 4 key-value heads³. This design reduces KV-cache memory usage while preserving attention quality, which is especially beneficial for long-context processing.
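A back-of-envelope calculation shows why fewer KV heads matter for long contexts. The head dimension of 128 is an assumption for illustration, not an official figure:

```python
# KV cache bytes = 2 (K and V) * layers * kv_heads * head_dim * seq_len * bytes/elem
def kv_cache_gb(layers, kv_heads, head_dim=128, seq_len=32_768, bytes_per_elem=2):
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem / 1e9

print(f"GQA, 4 KV heads (Qwen3-like): {kv_cache_gb(48, 4):.1f} GB")   # ~3.2 GB
print(f"Full MHA, 32 KV heads (same): {kv_cache_gb(48, 32):.1f} GB")  # ~25.8 GB
```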

Context and Language Support

  • Native context length: 32,768 tokens
  • Extended context: up to 262,144 tokens (recent variants)
  • Multilingual support: 119 languages and dialects
  • Vocabulary: 151,936 tokens using BPE tokenization (see the tokenizer check after this list)
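As a quick sanity check of the vocabulary figure, the tokenizer can be loaded on its own from Hugging Face. The repository ID `Qwen/Qwen3-30B-A3B` is taken from the model card; only the small tokenizer files are downloaded, not the weights:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-30B-A3B")
print(tok.vocab_size)  # base BPE vocabulary (special tokens add a few hundred more)
print(tok.encode("Mixture of Experts"))  # token IDs for a sample string
```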

Distinctive Features

Qwen3 includes a hybrid reasoning system supporting both "thinking" and "non-thinking" modes, allowing users to control inference overhead based on task complexity.
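According to the official Qwen3 model cards, this switch is exposed as an `enable_thinking` flag on the chat template; a minimal sketch:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-30B-A3B")
messages = [{"role": "user", "content": "What is 17 * 24?"}]

# Thinking mode on: the model first emits a <think>...</think> reasoning block.
prompt_think = tok.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=True
)
# Thinking mode off: the model answers directly, reducing inference overhead.
prompt_fast = tok.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=False
)
```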

GPT-OSS 20B Technical Specifications

Architecture Details

GPT-OSS 20B comprises a 24-layer transformer with 32 MoE experts per layer⁸. The model activates 4 experts per token, favoring a smaller pool of larger experts over fine-grained specialization.

Attention Mechanism

The model uses grouped multi-query attention with 64 query heads and 8 key-value heads arranged in 8 groups¹⁰. This configuration enables efficient inference while maintaining attention quality over long inputs.

Context and Efficiency

  • Native context length: 128,000 tokens
  • Quantization: native MXFP4 precision (4.25-bit) for the MoE weights
  • Memory efficiency: runs within 16GB of memory when quantized
  • Tokenizer: o200k_harmony (a superset of the GPT-4o tokenizer)

Distinctive Features

GPT-OSS 20B uses alternating dense and locally banded sparse attention patterns similar to GPT-3, with Rotary Position Embeddings (RoPE) for positional encoding¹⁵.
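A minimal sketch of what "alternating dense and locally banded" means in mask form: even-indexed layers attend to the full causal prefix, odd-indexed layers only to a local sliding window. Which layers are banded and the window size are illustrative assumptions here, not GPT-OSS's published values:

```python
import torch

def attention_mask(seq_len: int, layer_idx: int, window: int = 4) -> torch.Tensor:
    """Causal mask; odd layers additionally restrict attention to a local band."""
    i = torch.arange(seq_len).unsqueeze(1)  # query positions
    j = torch.arange(seq_len).unsqueeze(0)  # key positions
    mask = j <= i                           # dense causal attention
    if layer_idx % 2 == 1:                  # banded (sliding-window) layer
        mask &= (i - j) < window
    return mask

print(attention_mask(6, layer_idx=0).int())  # dense: full lower triangle
print(attention_mask(6, layer_idx=1).int())  # banded: lower triangle, width 4
```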

Architectural Philosophy Comparison

Depth vs. Width Strategy

Qwen3 30B-A3B emphasizes depth and expert diversity:

  • 48 layers enable multi-stage, hierarchical feature processing
  • 128 experts per layer provide fine-grained specialization
  • Well suited to complex reasoning tasks that require deep processing

GPT-OSS 20B prioritizes width and expert capacity:

  • A shallower 24-layer stack with larger experts maximizes per-layer capacity
  • Fewer but larger experts (32 vs. 128) concentrate capability in each expert
  • Optimized for efficient inference throughput

Expert Routing Strategies

Qwen3: tokens are routed to 8 of 128 experts, promoting diverse processing paths and fine-grained specialization per token.

GPT-OSS: tokens are routed to 4 of 32 experts, concentrating more capacity in each selected expert and focusing computation per token.
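One way to quantify the "diverse processing paths" claim is to count how many expert combinations each router can choose per token; a purely combinatorial illustration:

```python
from math import comb

qwen3_paths = comb(128, 8)   # ways to pick 8 of 128 experts per token
gptoss_paths = comb(32, 4)   # ways to pick 4 of 32 experts per token
print(f"Qwen3:   C(128, 8) = {qwen3_paths:,}")   # ~1.4e12 combinations
print(f"GPT-OSS: C(32, 4)  = {gptoss_paths:,}")  # 35,960 combinations
```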

Memory and Deployment

QWEN3 30B-A3B

  • Memory requirements: variable, depending on quantization and context length
  • Deployment: suited to both cloud and edge scenarios
  • Quantization: supports a variety of post-training quantization schemes (a loading sketch follows this list)
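As one example of such a post-training scheme, the sketch below loads the checkpoint in 4-bit NF4 via bitsandbytes. This is one option among several, not an officially recommended recipe, and actual memory use also depends on context length:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 post-training quantization (one of several possible schemes).
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-30B-A3B",            # repository ID from the model card
    quantization_config=quant_config,
    device_map="auto",               # spread layers across available devices
)
```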

GPT-OSS 20B

  • Memory requirements: ~16GB with native MXFP4 quantization, ~48GB in bfloat16 (see the arithmetic after this list)
  • Deployment: designed to run on consumer hardware
  • Quantization: native MXFP4 training enables efficient inference without quality degradation
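The weight-memory figures follow from simple arithmetic. Treating all 21B parameters at the quantized rate is a simplification (attention weights stay at higher precision), and activations plus KV cache account for the gap up to the ~16GB/~48GB totals above:

```python
params = 21e9          # total parameter count
mxfp4_bits = 4.25      # MXFP4: 4-bit values plus shared per-block scales
bf16_bits = 16

print(f"MXFP4 weights: {params * mxfp4_bits / 8 / 1e9:.1f} GB")  # ~11.2 GB
print(f"bf16 weights:  {params * bf16_bits / 8 / 1e9:.1f} GB")   # ~42.0 GB
```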

Performance Characteristics

QWEN3 30B-A3B

  • Excels at mathematical reasoning, coding, and complex agentic tasks
  • Strong multilingual performance across all 119 supported languages
  • Thinking mode provides improved reasoning on complex problems

GPT-OSS 20B

  • Achieves performance comparable to OpenAI o3-mini on standard benchmarks
  • Built for tool use, web browsing, and function calling (see the sketch after this list)
  • Strong chain-of-thought reasoning with adjustable reasoning-effort levels
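A minimal function-calling sketch against an OpenAI-compatible server is shown below. The endpoint URL, model name, and `get_weather` tool are illustrative assumptions; any runtime that serves GPT-OSS behind the OpenAI API shape (e.g., vLLM or Ollama) works the same way:

```python
from openai import OpenAI

# Endpoint and model name are assumptions for a local OpenAI-compatible server.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool, for illustration only
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="gpt-oss-20b",
    messages=[{"role": "user", "content": "What's the weather in Padova?"}],
    tools=tools,
)
print(resp.choices[0].message.tool_calls)  # model's structured tool invocation
```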

Use Case Recommendations

Choose Qwen3 30B-A3B for:

  • Complex reasoning tasks that require deep, multi-stage processing
  • Multilingual applications across diverse languages
  • Scenarios that require extended context lengths
  • Applications where transparent, visible reasoning is valuable

Choose GPT-OSS 20B for:

  • Resource-constrained deployments that need strong performance
  • Agentic applications with tool and function calling
  • Fast inference with consistent throughput
  • On-device deployment scenarios with limited memory

Conclusion

Qwen3 30B-A3B and GPT-OSS 20B reflect complementary approaches to MoE design. Qwen3 emphasizes depth, expert diversity, and multilingual coverage, making it a strong fit for complex reasoning applications. GPT-OSS 20B prioritizes efficiency, tool use, and deployability, targeting resource-constrained production environments.

Both models illustrate how far MoE architectures have matured: design philosophy now matters more than raw parameter count.

Note: This article was inspired by a Reddit post and an architecture graphic by Sebastian Raschka.


Resources

  1. Qwen3 30B-A3B model card – Hugging Face
  2. Qwen3 technical blog
  3. Qwen3 30B-A3B base model information
  4. Qwen3 30B-A3B Instruct 2507
  5. Qwen3 official documentation
  6. Qwen tokenizer documentation
  7. Qwen3 model features
  8. OpenAI GPT-OSS introduction
  9. GPT-OSS GitHub repository
  10. GPT-OSS 20B – Groq documentation
  11. GPT-OSS technical information
  12. Hugging Face GPT-OSS blog
  13. OpenAI GPT-OSS 20B model card
  14. OpenAI GPT-OSS introduction
  15. NVIDIA GPT-OSS blog
  16. Hugging Face GPT-OSS blog
  17. Qwen3 performance analysis
  18. OpenAI GPT-OSS model card
  19. GPT-OSS 20B benchmarks


Michal Sutter holds a Master of Science in Data Science from the University of Padova. With a solid foundation in mathematics, machine learning, and data engineering, he excels at transforming complex information into actionable insights.
