IQuest-Coder-V1: Tiny Core, Titan Power for Code Intelligence
Overview
IQuest-Coder-V1 is a family of code-specialized large language models developed with the philosophy of "Tiny Core, Titan Power." The model family ranges from 7B to 40B parameters and achieves state-of-the-art performance on a wide range of coding benchmarks, including agentic software engineering and competitive programming tasks.
Model Variants
The IQuest-Coder-V1 family includes multiple variants tailored for different use cases:
- Base models (7B / 14B / 40B): Pre-trained foundation models for code understanding and generation.
- Instruct models: Instruction-tuned variants optimized for general coding assistance and efficient instruction-following.
- Thinking models: Reasoning-enhanced variants that leverage reinforcement learning to produce explicit reasoning traces for complex problem-solving.
- Loop-Instruct models: Recurrent transformer variants with shared parameters across iterations, enabling significantly more efficient deployment.
Architecture
The model architecture is built on several key design choices:
- Grouped Query Attention (GQA): 40 query heads with 8 key-value heads for efficient inference.
- Native 128K context window: long-context support out of the box, with no additional context-scaling techniques required.
- Vocabulary size: 76,800 tokens with a hidden dimension of 5,120.
- Layer configurations: 14 layers (7B), 28 layers (14B), and 80 layers (40B).
The Loop variants introduce a recurrent transformer design with shared parameters across two iterations, which reduces memory footprint while maintaining competitive performance.
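The inference benefit of GQA comes mainly from the smaller KV cache. As a rough illustration using the numbers above (40 query heads, 8 key-value heads, 80 layers for the 40B model, 128K context), the sketch below estimates the cache saving versus a hypothetical full multi-head-attention configuration; the head dimension of 128 (5,120 hidden / 40 query heads) is an assumption, not a stated spec.

```python
# Sketch: estimate the KV-cache saving from GQA, using the head counts
# stated above. Illustrative arithmetic only, not an official figure.

def kv_cache_bytes(num_kv_heads: int, head_dim: int, num_layers: int,
                   seq_len: int, bytes_per_elem: int = 2) -> int:
    """Bytes for the K and V caches of one sequence (fp16/bf16 by default)."""
    return 2 * num_kv_heads * head_dim * num_layers * seq_len * bytes_per_elem

HEAD_DIM = 128          # assumption: 5,120 hidden dim / 40 query heads
LAYERS_40B = 80         # from the layer configurations above
SEQ_LEN = 128 * 1024    # native 128K context

mha = kv_cache_bytes(40, HEAD_DIM, LAYERS_40B, SEQ_LEN)  # hypothetical full MHA
gqa = kv_cache_bytes(8, HEAD_DIM, LAYERS_40B, SEQ_LEN)   # GQA as described

print(f"MHA cache: {mha / 2**30:.1f} GiB, GQA cache: {gqa / 2**30:.1f} GiB")
print(f"Reduction: {mha / gqa:.0f}x")
```

With 8 KV heads instead of 40, the per-sequence cache shrinks by a factor of five, which is what makes full 128K-context serving tractable.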
Code-Flow Multi-Stage Training
A distinctive aspect of IQuest-Coder-V1 is its code-flow multi-stage training paradigm. Unlike traditional approaches that treat code as static text, this method captures repository evolution patterns, commit transitions, and dynamic code transformations. By learning from real-world development workflows, the model develops a deeper understanding of how code evolves over time, leading to more practical and context-aware code generation.
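The document does not specify how commit transitions are turned into training examples, but one plausible framing is next-state prediction: given a file's previous state and a commit message, predict the file after the commit. The sketch below is a hypothetical illustration of that idea; the tag names and pairing scheme are assumptions, not the actual IQuest data pipeline.

```python
# Hypothetical sketch: frame an ordered commit history as (context, target)
# training pairs, where the model learns to predict the post-commit file
# state from the pre-commit state plus the commit message.

def commit_transition_pairs(commits):
    """commits: ordered list of (commit_message, file_snapshot) tuples,
    where each message describes the change that produced its snapshot.
    Returns (prompt, target) pairs for next-state prediction."""
    pairs = []
    for (_, before), (msg, after) in zip(commits, commits[1:]):
        prompt = f"<before>\n{before}\n<commit_msg>\n{msg}\n<after>\n"
        pairs.append((prompt, after))
    return pairs

history = [
    ("initial commit", "def add(a, b):\n    return a + b\n"),
    ("round result to 10 places", "def add(a, b):\n    return round(a + b, 10)\n"),
]
pairs = commit_transition_pairs(history)
```

Training on such transitions, rather than on isolated file snapshots, is what lets the model internalize how edits relate to intent.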
Performance
IQuest-Coder-V1 achieves strong results across diverse coding benchmarks:
- SWE-Bench Verified: 76.2% — demonstrating strong agentic software engineering capability.
- LiveCodeBench v6: 81.1% — competitive programming performance.
- BigCodeBench: 49.9% — practical coding task evaluation.
Additional evaluations span Evalplus-HumanEval, Evalplus-MBPP, FullStackBench, CruxEval, Aider-Polyglot, Mercury, Bird, Spider, Terminal-Bench, Mind2Web, and BFCL V3, covering code generation, program repair, full-stack development, database queries, and tool-use scenarios.
Dual Specialization
The model family offers two complementary specialization paths:
- Thinking models prioritize reasoning depth with explicit chain-of-thought traces, ideal for complex algorithmic and debugging tasks.
- Instruct models prioritize efficiency and directness, suitable for everyday coding assistance and rapid prototyping.
This dual approach allows users to choose the right trade-off between reasoning thoroughness and response speed depending on the task at hand.
Deployment
IQuest-Coder-V1 is production-ready with vLLM integration for OpenAI-compatible API deployment. The 40B models should be run with tensor parallelism (e.g., --tensor-parallel-size 8). Recommended sampling parameters for the instruct models are temperature=0.6, top_p=0.85, and top_k=20.
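The recommended instruct-model settings can be carried directly into a request against the OpenAI-compatible endpoint. The snippet below builds such a request payload; the model identifier is a placeholder assumption, and top_k is a vLLM extension to the standard OpenAI chat schema rather than part of it.

```python
# Sketch of an OpenAI-compatible chat request using the recommended
# instruct-model sampling parameters. Model id is hypothetical.
import json

payload = {
    "model": "IQuestLab/IQuest-Coder-V1-40B-Instruct",  # placeholder model id
    "messages": [
        {"role": "user", "content": "Write a function that reverses a string."}
    ],
    "temperature": 0.6,
    "top_p": 0.85,
    "top_k": 20,  # vLLM extension; not part of the core OpenAI schema
}
print(json.dumps(payload, indent=2))
```

POSTing this body to the server's /v1/chat/completions route (or passing top_k via extra_body with the official OpenAI Python client) applies the recommended decoding configuration.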
Resources
- Project page: https://iquestlab.github.io/
- Code and technical report: https://github.com/IQuestLab/IQuest-Coder-V1
- Models on Hugging Face: https://huggingface.co/collections/IQuestLab/iquest-coder