IQuest-Coder-V1: Tiny Core, Titan Power for Code Intelligence
Overview
IQuest-Coder-V1 is a family of code-specialized large language models developed with the philosophy of "Tiny Core, Titan Power." The model family ranges from 7B to 40B parameters and achieves state-of-the-art performance on a wide range of coding benchmarks, including agentic software engineering and competitive programming tasks.
Model Variants
The IQuest-Coder-V1 family includes multiple variants tailored for different use cases:
- Base models (7B / 14B / 40B): Pre-trained foundation models for code understanding and generation.
- Instruct models: Instruction-tuned variants optimized for general coding assistance and efficient instruction-following.
- Thinking models: Reasoning-enhanced variants that leverage reinforcement learning to produce explicit reasoning traces for complex problem-solving.
- Loop-Instruct models: Recurrent transformer variants with shared parameters across iterations, enabling significantly more efficient deployment.
Architecture
The model architecture is built on several key design choices:
- Grouped Query Attention (GQA): 40 query heads with 8 key-value heads for efficient inference.
- Native 128K context window: long-context support out of the box, with no additional context-scaling techniques required.
- Vocabulary size: 76,800 tokens with a hidden dimension of 5,120.
- Layer configurations: 14 layers (7B), 28 layers (14B), and 80 layers (40B).
The Loop variants introduce a recurrent transformer design with shared parameters across two iterations, which reduces memory footprint while maintaining competitive performance.
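The inference benefit of GQA comes mainly from the smaller KV cache. As a rough illustration using the numbers above (40 query heads, 8 key-value heads, 80 layers for the 40B model, 128K context), the sketch below estimates the cache saving versus a hypothetical full multi-head-attention configuration; the head dimension of 128 (5,120 hidden / 40 query heads) is an assumption, not a stated spec.

```python
# Sketch: estimate the KV-cache saving from GQA, using the head counts
# stated above. Illustrative arithmetic only, not an official figure.

def kv_cache_bytes(num_kv_heads: int, head_dim: int, num_layers: int,
                   seq_len: int, bytes_per_elem: int = 2) -> int:
    """Bytes for the K and V caches of one sequence (fp16/bf16 by default)."""
    return 2 * num_kv_heads * head_dim * num_layers * seq_len * bytes_per_elem

HEAD_DIM = 128          # assumption: 5,120 hidden dim / 40 query heads
LAYERS_40B = 80         # from the layer configurations above
SEQ_LEN = 128 * 1024    # native 128K context

mha = kv_cache_bytes(40, HEAD_DIM, LAYERS_40B, SEQ_LEN)  # hypothetical full MHA
gqa = kv_cache_bytes(8, HEAD_DIM, LAYERS_40B, SEQ_LEN)   # GQA as described

print(f"MHA cache: {mha / 2**30:.1f} GiB, GQA cache: {gqa / 2**30:.1f} GiB")
print(f"Reduction: {mha / gqa:.0f}x")
```

With 8 KV heads instead of 40, the per-sequence cache shrinks by a factor of five, which is what makes full 128K-context serving tractable.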
Code-Flow Multi-Stage Training
A distinctive aspect of IQuest-Coder-V1 is its code-flow multi-stage training paradigm. Unlike traditional approaches that treat code as static text, this method captures repository evolution patterns, commit transitions, and dynamic code transformations. By learning from real-world development workflows, the model develops a deeper understanding of how code evolves over time, leading to more practical and context-aware code generation.
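The document does not specify how commit transitions are turned into training examples, but one plausible framing is next-state prediction: given a file's previous state and a commit message, predict the file after the commit. The sketch below is a hypothetical illustration of that idea; the tag names and pairing scheme are assumptions, not the actual IQuest data pipeline.

```python
# Hypothetical sketch: frame an ordered commit history as (context, target)
# training pairs, where the model learns to predict the post-commit file
# state from the pre-commit state plus the commit message.

def commit_transition_pairs(commits):
    """commits: ordered list of (commit_message, file_snapshot) tuples,
    where each message describes the change that produced its snapshot.
    Returns (prompt, target) pairs for next-state prediction."""
    pairs = []
    for (_, before), (msg, after) in zip(commits, commits[1:]):
        prompt = f"<before>\n{before}\n<commit_msg>\n{msg}\n<after>\n"
        pairs.append((prompt, after))
    return pairs

history = [
    ("initial commit", "def add(a, b):\n    return a + b\n"),
    ("round result to 10 places", "def add(a, b):\n    return round(a + b, 10)\n"),
]
pairs = commit_transition_pairs(history)
```

Training on such transitions, rather than on isolated file snapshots, is what lets the model internalize how edits relate to intent.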
Performance
IQuest-Coder-V1 achieves strong results across diverse coding benchmarks:
- SWE-Bench Verified: 76.2% — demonstrating strong agentic software engineering capability.
- LiveCodeBench v6: 81.1% — competitive programming performance.
- BigCodeBench: 49.9% — practical coding task evaluation.
Additional evaluations span Evalplus-HumanEval, Evalplus-MBPP, FullStackBench, CruxEval, Aider-Polyglot, Mercury, Bird, Spider, Terminal-Bench, Mind2Web, and BFCL V3, covering code generation, program repair, full-stack development, database queries, and tool-use scenarios.
Dual Specialization
The model family offers two complementary specialization paths:
- Thinking models prioritize reasoning depth with explicit chain-of-thought traces, ideal for complex algorithmic and debugging tasks.
- Instruct models prioritize efficiency and directness, suitable for everyday coding assistance and rapid prototyping.
This dual approach allows users to choose the right trade-off between reasoning thoroughness and response speed depending on the task at hand.
Deployment
IQuest-Coder-V1 is production-ready with vLLM integration for OpenAI-compatible API deployment. The 40B models should be run with tensor parallelism (e.g., --tensor-parallel-size 8). Recommended sampling parameters for the instruct models are temperature=0.6, top_p=0.85, and top_k=20.
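The recommended instruct-model settings can be carried directly into a request against the OpenAI-compatible endpoint. The snippet below builds such a request payload; the model identifier is a placeholder assumption, and top_k is a vLLM extension to the standard OpenAI chat schema rather than part of it.

```python
# Sketch of an OpenAI-compatible chat request using the recommended
# instruct-model sampling parameters. Model id is hypothetical.
import json

payload = {
    "model": "IQuestLab/IQuest-Coder-V1-40B-Instruct",  # placeholder model id
    "messages": [
        {"role": "user", "content": "Write a function that reverses a string."}
    ],
    "temperature": 0.6,
    "top_p": 0.85,
    "top_k": 20,  # vLLM extension; not part of the core OpenAI schema
}
print(json.dumps(payload, indent=2))
```

POSTing this body to the server's /v1/chat/completions route (or passing top_k via extra_body with the official OpenAI Python client) applies the recommended decoding configuration.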
Resources
- Project page: https://iquestlab.github.io/
- Code and technical report: https://github.com/IQuestLab/IQuest-Coder-V1
- Models on Hugging Face: https://huggingface.co/collections/IQuestLab/iquest-coder