<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en"><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://ids-lab-scu.github.io/feed.xml" rel="self" type="application/atom+xml"/><link href="https://ids-lab-scu.github.io/" rel="alternate" type="text/html" hreflang="en"/><updated>2026-05-06T20:36:43+00:00</updated><id>https://ids-lab-scu.github.io/feed.xml</id><title type="html">IDs Lab | SCU</title><subtitle>IDs Lab aims to build next-generation systems for artificial intelligence and database applications. </subtitle><entry><title type="html">IQuest-Coder-V1: Tiny Core, Titan Power for Code Intelligence</title><link href="https://ids-lab-scu.github.io/blog/2026/post-iquest-coder/" rel="alternate" type="text/html" title="IQuest-Coder-V1: Tiny Core, Titan Power for Code Intelligence"/><published>2026-01-15T14:00:00+00:00</published><updated>2026-01-15T14:00:00+00:00</updated><id>https://ids-lab-scu.github.io/blog/2026/post-iquest-coder</id><content type="html" xml:base="https://ids-lab-scu.github.io/blog/2026/post-iquest-coder/"><![CDATA[<article> <div class="title"><strong>IQuest-Coder-V1 Technical Report</strong></div> <div class="author"> IQuest Coder Team </div> <p><br/></p> <h4 id="overview">Overview</h4> <p> IQuest-Coder-V1 is a family of code-specialized large language models developed with the philosophy of <strong>"Tiny Core, Titan Power."</strong> The model family ranges from 7B to 40B parameters and achieves state-of-the-art performance on a wide range of coding benchmarks, including agentic software engineering and competitive programming tasks. 
</p> <h4 id="model-variants">Model Variants</h4> <p> The IQuest-Coder-V1 family includes multiple variants tailored for different use cases: </p> <ul> <li><strong>Base models</strong> (7B / 14B / 40B): Pre-trained foundation models for code understanding and generation.</li> <li><strong>Instruct models</strong>: Instruction-tuned variants optimized for general coding assistance and efficient instruction-following.</li> <li><strong>Thinking models</strong>: Reasoning-enhanced variants that leverage reinforcement learning to produce explicit reasoning traces for complex problem-solving.</li> <li><strong>Loop-Instruct models</strong>: Recurrent transformer variants with shared parameters across iterations, enabling significantly more efficient deployment.</li> </ul> <h4 id="architecture">Architecture</h4> <p> The model architecture is built on several key design choices: </p> <ul> <li><strong>Grouped Query Attention (GQA)</strong>: 40 query heads with 8 key-value heads for efficient inference.</li> <li><strong>Native 128K context window</strong>: Full long-context support without additional scaling techniques.</li> <li><strong>Vocabulary size</strong>: 76,800 tokens with a hidden dimension of 5,120.</li> <li><strong>Layer configurations</strong>: 14 layers (7B), 28 layers (14B), and 80 layers (40B).</li> </ul> <p> The Loop variants introduce a recurrent transformer design with shared parameters across two iterations, which reduces memory footprint while maintaining competitive performance. </p> <h4 id="training">Code-Flow Multi-Stage Training</h4> <p> A distinctive aspect of IQuest-Coder-V1 is its <strong>code-flow multi-stage training paradigm</strong>. Unlike traditional approaches that treat code as static text, this method captures repository evolution patterns, commit transitions, and dynamic code transformations. 
By learning from real-world development workflows, the model develops a deeper understanding of how code evolves over time, leading to more practical and context-aware code generation. </p> <h4 id="benchmarks">Performance</h4> <p> IQuest-Coder-V1 achieves strong results across diverse coding benchmarks: </p> <ul> <li><strong>SWE-Bench Verified</strong>: 76.2% — demonstrating strong agentic software engineering capability.</li> <li><strong>LiveCodeBench v6</strong>: 81.1% — competitive programming performance.</li> <li><strong>BigCodeBench</strong>: 49.9% — practical coding task evaluation.</li> </ul> <p> Additional evaluations span Evalplus-HumanEval, Evalplus-MBPP, FullStackBench, CruxEval, Aider-Polyglot, Mercury, Bird, Spider, Terminal-Bench, Mind2Web, and BFCL V3, covering code generation, program repair, full-stack development, database queries, and tool-use scenarios. </p> <h4 id="dual-specialization">Dual Specialization</h4> <p> The model family offers two complementary specialization paths: </p> <ul> <li><strong>Thinking models</strong> prioritize reasoning depth with explicit chain-of-thought traces, ideal for complex algorithmic and debugging tasks.</li> <li><strong>Instruct models</strong> prioritize efficiency and directness, suitable for everyday coding assistance and rapid prototyping.</li> </ul> <p> This dual approach allows users to choose the right trade-off between reasoning thoroughness and response speed depending on the task at hand. </p> <h4 id="deployment">Deployment</h4> <p> IQuest-Coder-V1 is production-ready with vLLM integration for OpenAI-compatible API deployment. The 40B models are recommended to run with tensor parallelism (e.g., <code>--tensor-parallel-size 8</code>). Recommended sampling parameters for instruct models are Temperature=0.6, TopP=0.85, and TopK=20. 
</p> <h4 id="resources">Resources</h4> <ul> <li>Project page: <a href="https://iquestlab.github.io/">https://iquestlab.github.io/</a></li> <li>Code and technical report: <a href="https://github.com/IQuestLab/IQuest-Coder-V1">https://github.com/IQuestLab/IQuest-Coder-V1</a></li> <li>Models on Hugging Face: <a href="https://huggingface.co/collections/IQuestLab/iquest-coder">https://huggingface.co/collections/IQuestLab/iquest-coder</a></li> </ul> <p><br/></p> </article>]]></content><author><name></name></author><category term="tech-report"/><category term="AI"/><category term="SYSTEM"/><summary type="html"><![CDATA[A technical overview of IQuest-Coder-V1, a family of code-specialized language models from 7B to 40B parameters.]]></summary></entry><entry><title type="html">CloneShield: A Framework for Universal Perturbation Against Zero-Shot Voice Cloning</title><link href="https://ids-lab-scu.github.io/blog/2025/post-cloneshield/" rel="alternate" type="text/html" title="CloneShield: A Framework for Universal Perturbation Against Zero-Shot Voice Cloning"/><published>2025-05-15T14:00:00+00:00</published><updated>2025-05-15T14:00:00+00:00</updated><id>https://ids-lab-scu.github.io/blog/2025/post-cloneshield</id><content type="html" xml:base="https://ids-lab-scu.github.io/blog/2025/post-cloneshield/"><![CDATA[<article> <div class="title"><strong>CloneShield: A Framework for Universal Perturbation Against Zero-Shot Voice Cloning</strong></div> <div class="author"> Renyuan Li, Zhibo Liang, Haichuan Zhang, Tianyu Shi, Zhiyuan Cheng, Jia Shi, Carl Yang, Mingjie Tang </div> <p><br/></p> <h4 id="abstract">Abstract</h4> <p> Recent breakthroughs in text-to-speech (TTS) voice cloning have raised serious privacy concerns, allowing highly accurate vocal identity replication from just a few seconds of reference audio, while retaining the speaker’s vocal authenticity. 
In this paper, we introduce <strong>CloneShield</strong>, a universal time-domain adversarial perturbation framework specifically designed to defend against zero-shot voice cloning. Our method provides protection that is robust across speakers and utterances, without requiring any prior knowledge of the synthesized text. We formulate perturbation generation as a multi-objective optimization problem and propose using the Multi-Gradient Descent Algorithm (MGDA) to ensure robust protection across diverse utterances. To preserve natural auditory perception for users, we decompose the adversarial perturbation via Mel-spectrogram representations and fine-tune it for each sample. This design ensures imperceptibility while maintaining strong degradation effects on zero-shot cloned outputs. Experiments on three state-of-the-art zero-shot TTS systems and five benchmark datasets, together with evaluations from 60 human listeners, demonstrate that our method preserves near-original audio quality in protected inputs (PESQ = 3.90, SRS = 0.93) while substantially degrading both speaker similarity and speech quality in cloned samples (PESQ = 1.07, SRS = 0.08). 
</p> <p><br/></p> </article>]]></content><author><name></name></author><category term="tech-report"/><category term="AI"/><category term="SECURITY"/><summary type="html"><![CDATA[CloneShield: A Framework for Universal Perturbation Against Zero-Shot Voice Cloning Renyuan Li, Zhibo Liang, Haichuan Zhang, Tianyu Shi, Zhiyuan Cheng, Jia Shi, Carl Yang, Mingjie Tang]]></summary></entry><entry><title type="html">ATTACK AS DEFENSE RUN-TIME BACKDOOR IMPLANTATION FOR IMAGE CONTENT PROTECTION</title><link href="https://ids-lab-scu.github.io/blog/2024/post-attack_as_defense/" rel="alternate" type="text/html" title="ATTACK AS DEFENSE RUN-TIME BACKDOOR IMPLANTATION FOR IMAGE CONTENT PROTECTION"/><published>2024-10-07T13:56:00+00:00</published><updated>2024-10-07T13:56:00+00:00</updated><id>https://ids-lab-scu.github.io/blog/2024/post-attack_as_defense</id><content type="html" xml:base="https://ids-lab-scu.github.io/blog/2024/post-attack_as_defense/"><![CDATA[<article> <div class="title"><strong>ATTACK AS DEFENSE: RUN-TIME BACKDOOR IMPLANTATION FOR IMAGE CONTENT PROTECTION </strong></div> <div class="author"> Haichuan Zhang, Meiyu Lin, Zhaoyi Liu, Renyuan Li, Zhiyuan Cheng, Carl Yang, Mingjie Tang </div> <p><br/></p> <h4 id="abstract">Abstract</h4> <p>As generative models achieve great success, tampering with and modifying sensitive image content (e.g., human faces, artist signatures, and commercial logos) has become a significant threat with social impact. A backdoor attack implants vulnerabilities in a target model that can be activated through a trigger. In this work, we prevent the abuse of image content modification by implanting a backdoor into image-editing models. Once the protected sensitive content in an image is modified by an editing model, the backdoor is triggered, making the editing fail. 
Unlike traditional backdoor attacks that use data poisoning, to enable protection on individual images and eliminate the need for model training, we developed the first framework for run-time backdoor implantation, which is both time- and resource-efficient. We generate imperceptible perturbations on the images to inject the backdoor and define the protected area as the only backdoor trigger. Editing other unprotected, non-sensitive areas will not trigger the backdoor, which minimizes the negative impact on legitimate image modifications. Evaluations with state-of-the-art image editing models show that our protective method can increase the CLIP-FID of generated images from 12.72 to 39.91, or reduce the SSIM from 0.503 to 0.167 when subjected to malicious editing. At the same time, our method exhibits minimal impact on benign editing, which demonstrates the efficacy of our proposed framework. The proposed run-time backdoor can also achieve effective protection on the latest diffusion models.</p> <div class="row"> <div class="col-12 col-sm-12 col-md-9 col-lg-8 mx-auto d-block"> <figure> <picture> <source class="responsive-img-srcset" media="(max-width: 480px)" srcset="/assets/img/publications/examples_ATTACK_AS_DEFENSE/OVERVIEW-480.webp"/> <source class="responsive-img-srcset" media="(max-width: 800px)" srcset="/assets/img/publications/examples_ATTACK_AS_DEFENSE/OVERVIEW-800.webp"/> <source class="responsive-img-srcset" media="(max-width: 1400px)" srcset="/assets/img/publications/examples_ATTACK_AS_DEFENSE/OVERVIEW-1400.webp"/> <img src="/assets/img/publications/examples_ATTACK_AS_DEFENSE/OVERVIEW.png" class="img-fluid rounded z-depth-1" width="auto" height="auto" title="post_couler" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> <div class="caption"> Overview </div> </div> </div> <div class="row"> <p><br/></p> <div class="col-12 col-sm-12 col-md-9 col-lg-8 mx-auto d-block"> <figure> <picture> <source class="responsive-img-srcset" 
media="(max-width: 480px)" srcset="/assets/img/publications/examples_ATTACK_AS_DEFENSE/EXAMPLE1-480.webp"/> <source class="responsive-img-srcset" media="(max-width: 800px)" srcset="/assets/img/publications/examples_ATTACK_AS_DEFENSE/EXAMPLE1-800.webp"/> <source class="responsive-img-srcset" media="(max-width: 1400px)" srcset="/assets/img/publications/examples_ATTACK_AS_DEFENSE/EXAMPLE1-1400.webp"/> <img src="/assets/img/publications/examples_ATTACK_AS_DEFENSE/EXAMPLE1.png" class="img-fluid rounded z-depth-1" width="auto" height="auto" title="post_couler" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> <div class="caption"> Some Examples </div> </div> </div> <p><br/></p> </article>]]></content><author><name></name></author><category term="tech-report"/><category term="AI"/><category term="Security,"/><category term="Image"/><category term="Protection"/><summary type="html"><![CDATA[Protecting sensitive image content using runtime backdoor implantation.]]></summary></entry><entry><title type="html">m-LoRA - How to Efficiently Fine-Tune Dozens of Large Language Models on a Single GPU</title><link href="https://ids-lab-scu.github.io/blog/2024/post-m-LoRA/" rel="alternate" type="text/html" title="m-LoRA - How to Efficiently Fine-Tune Dozens of Large Language Models on a Single GPU"/><published>2024-05-13T02:56:00+00:00</published><updated>2024-05-13T02:56:00+00:00</updated><id>https://ids-lab-scu.github.io/blog/2024/post-m-LoRA</id><content type="html" xml:base="https://ids-lab-scu.github.io/blog/2024/post-m-LoRA/"><![CDATA[<article> <div class="title"><strong>m-LoRA: How to Efficiently Fine-Tune Dozens of Large Language Models on a Single GPU </strong></div> <div class="author"> Zhengmao Ye, Dengchun Li, Jingqi Tian, Tingfeng Lan, Jie Zuo, Lei Duan, Hui Lu, Yexi Jiang, Jian Sha, Ke Zhang, Mingjie Tang </div> <p><br/></p> <h4 id="abstract">Abstract</h4> <p>Transformer-based large language models (LLMs) have demonstrated 
outstanding performance across diverse domains, particularly when fine-tuned for specific domains. Recent studies suggest that the resources required for fine-tuning LLMs can be economized through parameter-efficient methods such as Low-Rank Adaptation (LoRA). While LoRA effectively reduces computational burdens and resource demands, it currently supports only a single-job fine-tuning setup. In this paper, we present m-LoRA, a high-throughput framework for fine-tuning LLMs. m-LoRA efficiently trains multiple jobs on a single GPU using the LoRA method, leveraging a shared pre-trained model and adaptive scheduling. m-LoRA is compatible with transformer-based language models such as LLaMA and ChatGLM. Experiments show that m-LoRA saves 53% of GPU memory when training multiple LLaMA-7B models on an NVIDIA A100 80GB GPU and boosts training throughput by about 17% compared to existing methods when training with various pre-trained models on different GPUs. The adaptive scheduling algorithm reduces turnaround time by 24% and end-to-end training latency by 12%, while prioritizing jobs and preventing out-of-memory issues.</p> <div class="row"> <div class="col-12 col-sm-12 col-md-9 col-lg-8 mx-auto d-block"> <figure> <picture> <source class="responsive-img-srcset" media="(max-width: 480px)" srcset="/assets/img/publications/post_m-lora-480.webp"/> <source class="responsive-img-srcset" media="(max-width: 800px)" srcset="/assets/img/publications/post_m-lora-800.webp"/> <source class="responsive-img-srcset" media="(max-width: 1400px)" srcset="/assets/img/publications/post_m-lora-1400.webp"/> <img src="/assets/img/publications/post_m-lora.png" class="img-fluid rounded z-depth-1" width="auto" height="auto" title="post_m-LoRA" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> <div class="caption"> Overview of m-LoRA. 
</div> </div> </div> <p><br/></p> </article>]]></content><author><name></name></author><category term="tech-report"/><category term="AI"/><category term="System"/><summary type="html"><![CDATA[m-LoRA: How to Efficiently Fine-Tune Dozens of Large Language Models on a Single GPU Zhengmao Ye, Dengchun Li, Jingqi Tian, Tingfeng Lan, Jie Zuo, Lei Duan, Hui Lu, Yexi Jiang, Jian Sha, Ke Zhang, Mingjie Tang]]></summary></entry><entry><title type="html">GPTuner - A Manual-Reading Database Tuning System via GPT-Guided Bayesian Optimization</title><link href="https://ids-lab-scu.github.io/blog/2024/post-gptuner/" rel="alternate" type="text/html" title="GPTuner - A Manual-Reading Database Tuning System via GPT-Guided Bayesian Optimization"/><published>2024-03-13T13:56:00+00:00</published><updated>2024-03-13T13:56:00+00:00</updated><id>https://ids-lab-scu.github.io/blog/2024/post-gptuner</id><content type="html" xml:base="https://ids-lab-scu.github.io/blog/2024/post-gptuner/"><![CDATA[<article> <div class="title"><strong>GPTuner - A Manual-Reading Database Tuning System via GPT-Guided Bayesian Optimization </strong></div> <div class="author"> Jiale Lao, Yibo Wang, Yufei Li, Jianping Wang, Yunjia Zhang, Zhiyuan Cheng, Wanghu Chen, Mingjie Tang, Jianguo Wang </div> <p>Accepted by VLDB 2024, <a href="/assets/pdf/gptuner.pdf">[PDF]</a> </p> <p><br/></p> <h4 id="abstract">Abstract</h4> <p>Modern database management systems (DBMS) expose hundreds of configurable knobs to control system behaviours. Determining the appropriate values for these knobs to improve DBMS performance is a long-standing problem in the database community. As the number of knobs to tune keeps increasing and each knob can take continuous or categorical values, manual tuning becomes impractical. Recently, automatic tuning systems using machine learning methods have shown great potential. However, existing approaches still incur significant tuning costs or only yield sub-optimal performance. 
This is because they either ignore the extensive domain knowledge available (e.g., DBMS manuals and forum discussions) and only rely on the runtime feedback of benchmark evaluations to guide the optimization, or they utilize the domain knowledge in a limited way. Hence, we propose GPTuner, a manual-reading database tuning system. Firstly, we develop a Large Language Model (LLM)-based pipeline to collect and refine heterogeneous knowledge, and propose a prompt ensemble algorithm to unify a structured view of the refined knowledge. Secondly, using the structured knowledge, we (1) design a workload-aware and training-free knob selection strategy, (2) develop a search space optimization technique considering the value range of each knob, and (3) propose a Coarse-to-Fine Bayesian Optimization Framework to explore the optimized space. Finally, we evaluate GPTuner under different benchmarks (TPC-C and TPC-H), metrics (throughput and latency) as well as DBMS (PostgreSQL and MySQL). Compared to the state-of-the-art approaches, GPTuner identifies better configurations in 16x less time on average. Moreover, GPTuner achieves up to 30% performance improvement (higher throughput or lower latency) over the best-performing alternative.</p> <div class="row"> <div class="col-12 col-sm-12 col-md-9 col-lg-8 mx-auto d-block"> <figure> <picture> <source class="responsive-img-srcset" media="(max-width: 480px)" srcset="/assets/img/publications/post_gptuner-480.webp"/> <source class="responsive-img-srcset" media="(max-width: 800px)" srcset="/assets/img/publications/post_gptuner-800.webp"/> <source class="responsive-img-srcset" media="(max-width: 1400px)" srcset="/assets/img/publications/post_gptuner-1400.webp"/> <img src="/assets/img/publications/post_gptuner.png" class="img-fluid rounded z-depth-1" width="auto" height="auto" title="post_gptuner" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> <div class="caption"> Overview of GPTuner. 
</div> </div> </div> <h4 id="abstract">More About GPTuner</h4> <p>For more information about GPTuner, please visit Jiale Lao's blog <a href="https://solidlao.github.io/post/llm-db-tuning/">[From DB-BERT to DB-BART and Beyond]</a>, where he gives a full introduction to the project background and introduces DB-BERT, which is also a masterpiece in the field of knob tuning and inspired GPTuner's design.</p> <p><br/></p> </article>]]></content><author><name></name></author><category term="tech-report"/><category term="AI"/><category term="DB"/><summary type="html"><![CDATA[GPTuner - A Manual-Reading Database Tuning System via GPT-Guided Bayesian Optimization Jiale Lao, Yibo Wang, Yufei Li, Jianping Wang, Yunjia Zhang, Zhiyuan Cheng, Wanghu Chen, Mingjie Tang, Jianguo Wang]]></summary></entry><entry><title type="html">Couler - Unified Machine Learning Workflow Optimization in Cloud</title><link href="https://ids-lab-scu.github.io/blog/2024/post-couler/" rel="alternate" type="text/html" title="Couler - Unified Machine Learning Workflow Optimization in Cloud"/><published>2024-03-06T13:56:00+00:00</published><updated>2024-03-06T13:56:00+00:00</updated><id>https://ids-lab-scu.github.io/blog/2024/post-couler</id><content type="html" xml:base="https://ids-lab-scu.github.io/blog/2024/post-couler/"><![CDATA[<article> <div class="title"><strong>Couler - Unified Machine Learning Workflow Optimization in Cloud </strong></div> <div class="author"> Xiaoda Wang, Yuan Tang, Tengda Guo, Bo Sang, Jingji Wu, Jian Sha, Ke Zhang, Jiang Qian, Mingjie Tang </div> <p>Accepted by IEEE ICDE 2024, <a href="/assets/pdf/couler.pdf">[PDF]</a> </p> <p><br/></p> <h4 id="abstract">Abstract</h4> <p>Machine Learning (ML) has become ubiquitous, fueling data-driven applications across various organizations. Contrary to the traditional perception of ML in research, ML workflows can be complex, resource-intensive, and time-consuming. 
Expanding an ML workflow to encompass a wider range of data infrastructure and data types may lead to larger workloads and increased deployment costs. Currently, numerous workflow engines are available (with over ten being widely recognized). This variety poses a challenge for end-users in terms of mastering different engine APIs. While efforts have primarily focused on optimizing ML Operations (MLOps) for a specific workflow engine, current methods largely overlook workflow optimization across different engines. In this work, we design and implement Couler, a system for unified ML workflow optimization in the cloud. Our main insight lies in the ability to generate an ML workflow using natural language (NL) descriptions. We integrate Large Language Models (LLMs) into workflow generation, and provide a unified programming interface for various workflow engines. This approach alleviates the need to understand various workflow engines' APIs. Moreover, Couler enhances workflow computation efficiency by introducing automated caching at multiple stages, enabling auto-parallelization of large workflows and automatic hyperparameter tuning. These enhancements minimize redundant computational costs and improve fault tolerance during deep learning workflow training. 
Couler is extensively deployed in real-world production scenarios at Ant Group, handling approximately 22k workflows daily, and has successfully improved the CPU/Memory utilization by more than 15% and the workflow completion rate by around 17%.</p> <div class="row"> <div class="col-12 col-sm-12 col-md-9 col-lg-8 mx-auto d-block"> <figure> <picture> <source class="responsive-img-srcset" media="(max-width: 480px)" srcset="/assets/img/publications/post_couler-480.webp"/> <source class="responsive-img-srcset" media="(max-width: 800px)" srcset="/assets/img/publications/post_couler-800.webp"/> <source class="responsive-img-srcset" media="(max-width: 1400px)" srcset="/assets/img/publications/post_couler-1400.webp"/> <img src="/assets/img/publications/post_couler.png" class="img-fluid rounded z-depth-1" width="auto" height="auto" title="post_couler" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> <div class="caption"> Overview of Couler. </div> </div> </div> <p><br/></p> </article>]]></content><author><name></name></author><category term="tech-report"/><category term="AI"/><category term="System"/><summary type="html"><![CDATA[Couler - Unified Machine Learning Workflow Optimization in Cloud Xiaoda Wang, Yuan Tang, Tengda Guo, Bo Sang, Jingji Wu, Jian Sha, Ke Zhang, Jiang Qian, Mingjie Tang]]></summary></entry></feed>