Together AI Releases DeepCoder-14B-Preview: A Fully Open-Source Code Reasoning Model That Rivals o3-mini with Only 14B Parameters

Demand for intelligent code generation and automated programming solutions has intensified, driven by rapid growth in software complexity and developer productivity needs. While natural language processing and general reasoning models have advanced quickly, progress in the coding domain has been slower. This lag is mainly attributed to the scarcity of high-quality, verifiable data for reinforcement learning (RL) training. Unlike mathematical problems, which benefit from a wealth of structured, verifiable examples online, coding tasks often suffer from noise, insufficient test coverage, and unverifiable outputs. As a result, advancing LLMs for code generation has remained a major challenge.
DeepCoder-14B-Preview was released by Together AI in collaboration with the Agentica team. The model was fine-tuned from DeepSeek-R1-Distilled-Qwen-14B using reinforcement learning, and it shows substantial progress in code reasoning. With 60.6% Pass@1 accuracy on LiveCodeBench (LCB), DeepCoder-14B-Preview performs on par with o3-mini while using only 14 billion parameters.
The release is especially significant in light of the benchmarks. DeepSeek-R1-Distill-Qwen-14B scores 53.0% on LCB, so DeepCoder-14B-Preview gains roughly 8 percentage points in accuracy over its base model. It also competes head-to-head with established models such as o3-mini (60.9%) and o1 (59.5%) in accuracy and coding ability. On competitive programming metrics, it reaches a Codeforces rating of 1936 and a percentile of 95.3%, which are clear indicators of its real-world coding competence.
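For readers unfamiliar with the metric, Pass@1 is the share of problems a model solves with a single sampled solution. The sketch below uses the widely used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), averaged over problems. It is an illustration only: the article does not say how the DeepCoder team computed its scores, and the function names and example tallies are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem: n samples drawn, c of them correct."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) / comb(n, k) is the chance that a random size-k subset
    # of the n samples contains no correct solution.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_1(results):
    """Average pass@1 over problems; each entry is (samples drawn, samples correct)."""
    return sum(pass_at_k(n, c, 1) for n, c in results) / len(results)


# Hypothetical tallies for three problems, not real benchmark data.
example = [(8, 8), (8, 4), (8, 0)]
score = benchmark_pass_at_1(example)
```

For k = 1 the estimator reduces to c / n per problem, so the benchmark score is simply the mean fraction of correct samples.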
The model was trained over 2.5 weeks on 32 H100 GPUs using a curated dataset of 24,000 verifiable coding problems. The dataset was built by rigorously filtering existing resources to ensure quality and diversity. It combines problems from TACO Verified, PrimeIntellect's SYNTHETIC-1, and submissions to LiveCodeBench. The selection process emphasized programmatic verification of test cases, a minimum of five unit tests per problem, and deduplication to avoid data contamination. This helped preserve training integrity and maximize the effectiveness of RL.
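Two of the curation rules mentioned above, a minimum of five tests per problem and the removal of duplicates, can be sketched as a simple filter. This is a minimal illustration and not the project's actual pipeline: the `Problem` type, the `fingerprint` and `curate` helpers, and the exact-match hashing are all assumptions, and real deduplication would likely need fuzzier matching.

```python
import hashlib
from dataclasses import dataclass

MIN_TESTS = 5  # minimum number of unit tests per problem, as stated in the article


@dataclass(frozen=True)
class Problem:  # hypothetical record type for illustration
    statement: str
    tests: tuple  # (stdin, expected stdout) pairs


def fingerprint(statement: str) -> str:
    """Hash of the case- and whitespace-normalized statement (exact-match dedup only)."""
    normalized = " ".join(statement.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def curate(problems, held_out=()):
    """Keep problems with enough tests that duplicate neither each other nor held_out."""
    seen = {fingerprint(p.statement) for p in held_out}  # e.g. evaluation problems
    kept = []
    for p in problems:
        if len(p.tests) < MIN_TESTS:
            continue  # too few tests to verify a solution reliably
        key = fingerprint(p.statement)
        if key in seen:
            continue  # duplicate of a kept problem or of a held-out problem
        seen.add(key)
        kept.append(p)
    return kept
```

Passing the evaluation problems as `held_out` is one simple way to keep benchmark items out of the training set.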
To support this level of verification, DeepCoder's training incorporated a scalable code sandbox environment capable of running evaluations in parallel. More than 1,000 coding problems were evaluated at each RL step using two sandboxes: the Together Code Interpreter and a local sandbox. These environments ensured that every model-generated solution was rigorously tested against multiple unit tests, filtering out reward hacking and encouraging genuine reasoning.
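The idea of rewarding only solutions that survive every unit test can be sketched as follows. This is a hedged illustration, not DeepCoder's implementation: the all-or-nothing reward, the stdin/stdout test format, and the helper names `run_case` and `outcome_reward` are assumptions. A plain subprocess is also not a security sandbox, so a real system would run untrusted code in an isolated environment.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Optional


def run_case(script: Path, stdin_text: str, timeout_s: float) -> Optional[str]:
    """Run the script on one input; return stdout, or None on crash or timeout."""
    try:
        proc = subprocess.run(
            [sys.executable, str(script)],
            input=stdin_text,
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return None
    return proc.stdout if proc.returncode == 0 else None


def outcome_reward(solution_code: str, tests, timeout_s: float = 5.0) -> float:
    """Return 1.0 only if the solution passes every (stdin, expected) test, else 0.0."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "solution.py"
        script.write_text(solution_code, encoding="utf-8")
        for stdin_text, expected in tests:
            out = run_case(script, stdin_text, timeout_s)
            if out is None or out.strip() != expected.strip():
                return 0.0  # any failure, crash, or timeout forfeits the reward
    return 1.0
```

Giving no partial credit means a solution that happens to pass only some of the tests earns nothing, which is one way to discourage gaming the reward.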
In addition, the system architecture supporting DeepCoder was optimized with "verl-pipe", an upgraded extension of the RL post-training pipeline that doubled training speed. This enhancement accelerates development cycles and provides a modular framework for others who want to build or iterate on similar LLMs in open-source ecosystems.
Some key takeaways from the release of DeepCoder-14B-Preview include:
- DeepCoder-14B-Preview achieves 60.6% Pass@1 on LiveCodeBench, matching o3-mini's performance with only 14B parameters.
- Training used 24K verifiable coding problems, carefully curated to avoid noise and reward hacking.
- It was trained on 32 H100 GPUs for 2.5 weeks, with an emphasis on reproducibility and system efficiency.
- A dual-sandbox environment ensured accurate and scalable code verification during training.
- System optimization via verl-pipe doubled training speed and provides a reusable pipeline for future models.
- DeepCoder is fully open source, including datasets, code, and training logs, which supports community-driven development.
Check out the technical details, the model on Hugging Face, and the GitHub page. All credit for this research goes to the researchers of this project.
Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of artificial intelligence for social good. His most recent endeavor is the launch of an artificial intelligence media platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable. The platform receives over two million monthly views, illustrating its popularity among audiences.
