IQuest-Coder-V1 Technical Report
IQuest Coder Team


Overview

IQuest-Coder-V1 is a family of code-specialized large language models developed with the philosophy of "Tiny Core, Titan Power." The model family ranges from 7B to 40B parameters and achieves state-of-the-art performance on a wide range of coding benchmarks, including agentic software engineering and competitive programming tasks.

Model Variants

The IQuest-Coder-V1 family includes multiple variants tailored for different use cases:

  • Base models (7B / 14B / 40B): Pre-trained foundation models for code understanding and generation.
  • Instruct models: Instruction-tuned variants optimized for general coding assistance and efficient instruction-following.
  • Thinking models: Reasoning-enhanced variants that leverage reinforcement learning to produce explicit reasoning traces for complex problem-solving.
  • Loop-Instruct models: Recurrent transformer variants that share parameters across iterations, reducing parameter memory for more efficient deployment.

Architecture

The model architecture is built on several key design choices:

  • Grouped Query Attention (GQA): 40 query heads with 8 key-value heads for efficient inference.
  • Native 128K context window: Full long-context support without additional scaling techniques.
  • Vocabulary size: 76,800 tokens with a hidden dimension of 5,120.
  • Layer configurations: 14 layers (7B), 28 layers (14B), and 80 layers (40B).
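The grouped-query layout above can be made concrete with a small sketch. This is not the model's implementation, only an illustration of the head mapping implied by the stated configuration (40 query heads, 8 key-value heads): consecutive groups of 5 query heads attend against one shared KV head, so the KV cache stores 8 rather than 40 head-sized K/V tensors per layer.

```python
# Illustrative sketch of GQA head grouping (not the actual model code),
# using the configuration stated above: 40 query heads, 8 KV heads.
NUM_Q_HEADS = 40
NUM_KV_HEADS = 8
GROUP_SIZE = NUM_Q_HEADS // NUM_KV_HEADS  # 5 query heads share one KV head

def kv_head_for(query_head: int) -> int:
    """Return the index of the shared KV head a given query head uses."""
    return query_head // GROUP_SIZE

# Every run of 5 consecutive query heads reads the same KV head.
assert [kv_head_for(h) for h in range(10)] == [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

# KV-cache savings relative to standard multi-head attention:
# only 8 of 40 head-sized K/V tensors are cached per layer.
cache_reduction = NUM_Q_HEADS / NUM_KV_HEADS
```

The 5x smaller KV cache is what makes the "efficient inference" claim concrete: at 128K context, cache size is usually the dominant memory cost.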

The Loop variants introduce a recurrent transformer design with shared parameters across two iterations, which reduces memory footprint while maintaining competitive performance.
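The recurrence described above can be sketched as follows. This is a toy illustration under the stated design (one shared block applied for two iterations), not the actual architecture: the same parameters are reused on each pass, so effective depth doubles while parameter memory stays that of a single pass. The `make_block` helper is a hypothetical stand-in for a stack of transformer layers.

```python
# Toy sketch of the Loop variants' weight sharing (illustrative only):
# one parameter block is applied repeatedly, so parameters are stored
# once while the computation runs through them twice.
NUM_ITERATIONS = 2

def loop_forward(x, shared_block, n_iters=NUM_ITERATIONS):
    """Apply the same block repeatedly; its weights are reused each pass."""
    for _ in range(n_iters):
        x = shared_block(x)
    return x

def make_block(scale, bias):
    """Hypothetical stand-in for a stack of layers: a single affine map."""
    return lambda x: scale * x + bias

block = make_block(2.0, 1.0)
# Two iterations with shared weights compute f(f(x)).
result = loop_forward(1.0, block)  # f(1)=3, then f(3)=7
```

The memory saving comes purely from storing one copy of the block; the trade-off is extra compute per token, which is why the report frames it as a deployment-efficiency option rather than a speed optimization.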

Code-Flow Multi-Stage Training

A distinctive aspect of IQuest-Coder-V1 is its code-flow multi-stage training paradigm. Unlike traditional approaches that treat code as static text, this method captures repository evolution patterns, commit transitions, and dynamic code transformations. By learning from real-world development workflows, the model develops a deeper understanding of how code evolves over time, leading to more practical and context-aware code generation.

Performance

IQuest-Coder-V1 achieves strong results across diverse coding benchmarks:

  • SWE-Bench Verified: 76.2% — demonstrating strong agentic software engineering capability.
  • LiveCodeBench v6: 81.1% — competitive programming performance.
  • BigCodeBench: 49.9% — practical coding task evaluation.

Additional evaluations span Evalplus-HumanEval, Evalplus-MBPP, FullStackBench, CruxEval, Aider-Polyglot, Mercury, Bird, Spider, Terminal-Bench, Mind2Web, and BFCL V3, covering code generation, program repair, full-stack development, database queries, and tool-use scenarios.

Dual Specialization

The model family offers two complementary specialization paths:

  • Thinking models prioritize reasoning depth with explicit chain-of-thought traces, ideal for complex algorithmic and debugging tasks.
  • Instruct models prioritize efficiency and directness, suitable for everyday coding assistance and rapid prototyping.

This dual approach allows users to choose the right trade-off between reasoning thoroughness and response speed depending on the task at hand.

Deployment

IQuest-Coder-V1 is production-ready with vLLM integration for OpenAI-compatible API deployment. The 40B models are recommended to run with tensor parallelism (e.g., --tensor-parallel-size 8). Recommended sampling parameters for instruct models are temperature=0.6, top_p=0.85, and top_k=20.
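With an OpenAI-compatible endpoint served by vLLM, a client request would carry the recommended sampling parameters roughly as below. This is a minimal sketch that only builds the request body: the model name and prompt are placeholders, and passing `top_k` as a top-level field relies on vLLM's OpenAI-compatible server accepting it as an extra sampling parameter.

```python
# Sketch of a chat-completions request body for a vLLM-served endpoint,
# using the sampling parameters recommended above. The model name is a
# placeholder; top_k is a vLLM extension to the OpenAI schema.
import json

payload = {
    "model": "IQuest-Coder-V1-40B-Instruct",  # hypothetical served name
    "messages": [
        {"role": "user", "content": "Write a binary search in Python."}
    ],
    "temperature": 0.6,
    "top_p": 0.85,
    "top_k": 20,
}
body = json.dumps(payload)  # POST this to /v1/chat/completions
```

Thinking models would typically warrant different sampling settings; the values above are the report's recommendation for the instruct variants only.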

Resources