publications | Weian Mao

2026

Preprint
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression

Weian Mao, Xi Lin, Wei Huang, and 5 more authors

2026

Extended reasoning in large language models (LLMs) creates severe KV cache memory bottlenecks. We propose TriAttention, which leverages pre-RoPE Q/K concentration and trigonometric series to estimate key importance for KV cache compression. On AIME25 with 32K-token generation, TriAttention matches Full Attention reasoning accuracy while achieving 2.5x higher throughput or 10.7x KV memory reduction, whereas leading baselines achieve only about half the accuracy at the same efficiency.
@article{mao2026triattention,
  title  = {TriAttention: Efficient Long Reasoning with Trigonometric KV Compression},
  author = {Mao, Weian and Lin, Xi and Huang, Wei and Xie, Yuxin and Fu, Tianfu and Zhuang, Bohan and Han, Song and Chen, Yukang},
  year   = {2026}
}

2025

ICLR
Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions

Xiaoran Jiao*, Weian Mao*, Wengong Jin, and 3 more authors

In The Thirteenth International Conference on Learning Representations (ICLR Spotlight), 2025

Predicting the change in binding free energy (ΔΔG) upon mutation is crucial for understanding protein-protein interactions in drug design. We introduce a Boltzmann Alignment technique to transfer knowledge from pre-trained inverse folding models to ΔΔG prediction. On SKEMPI v2, we achieve Spearman coefficients of 0.3201 (unsupervised) and 0.5134 (supervised), substantially exceeding the prior state of the art.
@inproceedings{jiao2025baddg,
  title     = {Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions},
  author    = {Jiao, Xiaoran and Mao, Weian and Jin, Wengong and Yang, Peiyuan and Chen, Hao and Shen, Chunhua},
  booktitle = {The Thirteenth International Conference on Learning Representations (ICLR Spotlight)},
  year      = {2025}
}

ICLR
Revisiting Convolution Architecture in the Realm of DNA Foundation Models

Yu Bo*, Weian Mao*, Yanjun Shao, and 6 more authors

In The Thirteenth International Conference on Learning Representations (ICLR), 2025

We present ConvNova, a CNN-based DNA foundation model employing dilated convolutions, gated convolutions, and a dual-branch gating framework. ConvNova significantly outperforms recent Transformer and state space model methods on more than half of the benchmark tasks, with particularly strong results in histone-related applications, while requiring fewer parameters.
@inproceedings{bo2025convnova,
  title     = {Revisiting Convolution Architecture in the Realm of DNA Foundation Models},
  author    = {Bo, Yu and Mao, Weian and Shao, Yanjun and Bai, Weiqiang and Ye, Peng and Ma, Xinzhu and Zhao, Junbo and Chen, Hao and Shen, Chunhua},
  booktitle = {The Thirteenth International Conference on Learning Representations (ICLR)},
  year      = {2025}
}

2024

ICLR
De novo Protein Design Using Geometric Vector Field Networks

Weian Mao, Muzhi Zhu, Zheng Sun, and 4 more authors

In The Twelfth International Conference on Learning Representations (ICLR Spotlight), 2024

We introduce the Vector Field Network (VFN), a novel deep learning approach for protein structure modeling. VFN enables learnable vector computations between the coordinates of frame-anchored virtual atoms, improving frame representation capabilities. In protein diffusion tasks, VFN surpasses existing methods such as IPA, with a designability score of 67.04% vs. 53.58%. For inverse folding, VFN outperforms PiFold, with a sequence recovery rate of 54.7% vs. 51.66%.
@inproceedings{mao2024vfn,
  title     = {De novo Protein Design Using Geometric Vector Field Networks},
  author    = {Mao, Weian and Zhu, Muzhi and Sun, Zheng and Shen, Shuaike and Wu, Lin Yuanbo and Chen, Hao and Shen, Chunhua},
  booktitle = {The Twelfth International Conference on Learning Representations (ICLR Spotlight)},
  year      = {2024}
}

ICML
Floating Anchor Diffusion Model for Multi-motif Scaffolding

Ke Liu*, Weian Mao*, Shuaike Shen, and 4 more authors

In International Conference on Machine Learning (ICML), 2024

We propose the Floating Anchor Diffusion (FADiff) model for multi-motif scaffolding in protein design. FADiff allows motifs to float rigidly and independently during diffusion, guaranteeing the presence of motifs and automating motif position design. FADiff is the first work to tackle the challenge of scaffolding multiple motifs without relying on expert knowledge of relative motif positions.
@inproceedings{liu2024fadiff,
  title        = {Floating Anchor Diffusion Model for Multi-motif Scaffolding},
  author       = {Liu, Ke and Mao, Weian and Shen, Shuaike and Jiao, Xiaoran and Sun, Zheng and Chen, Hao and Shen, Chunhua},
  booktitle    = {International Conference on Machine Learning (ICML)},
  pages        = {31691--31708},
  year         = {2024},
  organization = {PMLR}
}

ICML
Generative Active Learning for Long-tailed Instance Segmentation

Muzhi Zhu, Chengxiang Fan, Hao Chen, and 4 more authors

In International Conference on Machine Learning (ICML), 2024

We propose BSGAL, an algorithm for long-tailed instance segmentation that estimates the contribution of generated data online, based on a gradient cache. Experimental results demonstrate that BSGAL surpasses baseline methods and substantially improves performance in long-tailed segmentation scenarios.
@inproceedings{zhu2024divergen,
  title     = {Generative Active Learning for Long-tailed Instance Segmentation},
  author    = {Zhu, Muzhi and Fan, Chengxiang and Chen, Hao and Liu, Yang and Mao, Weian and Xu, Xiaogang and Shen, Chunhua},
  booktitle = {International Conference on Machine Learning (ICML)},
  year      = {2024}
}

2023

ICCV
CTVIS: Consistent Training for Online Video Instance Segmentation

Kaining Ying, Qing Zhong, Weian Mao, and 7 more authors

In IEEE/CVF International Conference on Computer Vision (ICCV), 2023

We propose a consistent training strategy for online video instance segmentation that aligns training and inference pipelines. Our approach incorporates momentum-averaged embeddings and memory bank mechanisms. Results show improvements of up to +5.0 points on YTVIS19, YTVIS21, and OVIS benchmarks.
@inproceedings{ying2023ctvis,
  title     = {CTVIS: Consistent Training for Online Video Instance Segmentation},
  author    = {Ying, Kaining and Zhong, Qing and Mao, Weian and Wang, Zhenhua and Chen, Hao and Wu, Lin Yuanbo and Liu, Yifan and Fan, Chengxiang and Zhuge, Yunzhi and Shen, Chunhua},
  booktitle = {IEEE/CVF International Conference on Computer Vision (ICCV)},
  pages     = {899--908},
  year      = {2023}
}

ICCV
SegPrompt: Boosting Open-World Segmentation via Category-Level Prompt Learning

Muzhi Zhu, Hengtao Li, Hao Chen, and 5 more authors

In IEEE/CVF International Conference on Computer Vision (ICCV), 2023

We introduce SegPrompt, a training mechanism that uses category information to improve class-agnostic segmentation ability for both known and unknown categories. Results demonstrate improvements of 5.6% and 6.1% in average recall on our benchmark.
@inproceedings{zhu2023segprompt,
  title     = {SegPrompt: Boosting Open-World Segmentation via Category-Level Prompt Learning},
  author    = {Zhu, Muzhi and Li, Hengtao and Chen, Hao and Fan, Chengxiang and Mao, Weian and Jing, Chenchen and Liu, Yifan and Shen, Chunhua},
  booktitle = {IEEE/CVF International Conference on Computer Vision (ICCV)},
  pages     = {999--1008},
  year      = {2023}
}

2022

ECCV
Poseur: Direct Human Pose Regression with Transformers

Weian Mao, Yongtao Ge, Chunhua Shen, and 4 more authors

In European Conference on Computer Vision (ECCV), 2022

We propose a direct, regression-based approach to 2D human pose estimation. The problem is formulated as a sequence prediction task solved using a Transformer network that directly learns a regression mapping from images to keypoint coordinates, without resorting to heatmaps. The approach is end-to-end differentiable and achieves competitive performance on MS-COCO and MPII datasets.
@inproceedings{mao2022poseur,
  title        = {Poseur: Direct Human Pose Regression with Transformers},
  author       = {Mao, Weian and Ge, Yongtao and Shen, Chunhua and Tian, Zhi and Wang, Xinlong and Wang, Zhibin and van den Hengel, Anton},
  booktitle    = {European Conference on Computer Vision (ECCV)},
  pages        = {72--88},
  year         = {2022},
  organization = {Springer}
}

2021

CVPR
FCPose: Fully Convolutional Multi-Person Pose Estimation with Dynamic Instance-Aware Convolutions

Weian Mao, Zhi Tian, Xinlong Wang, and 1 more author

In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021

We present a fully convolutional framework for multi-person pose estimation using dynamic instance-aware convolutions. The approach eliminates Region of Interest operations and post-processing grouping steps. On COCO, the real-time version achieves approximately 4.5x faster inference than Mask R-CNN while delivering superior performance.
@inproceedings{mao2021fcpose,
  title     = {FCPose: Fully Convolutional Multi-Person Pose Estimation with Dynamic Instance-Aware Convolutions},
  author    = {Mao, Weian and Tian, Zhi and Wang, Xinlong and Shen, Chunhua},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages     = {9034--9043},
  year      = {2021}
}