Publications


2024

Guser: A GPGPU Power Stressmark Generator
Yalong Shan, Yongkui Yang, Xuehai Qian, Zhibin Yu
HPCA 2024
NAPA: Intermediate-level Variational Native-pulse Ansatz for Variational Quantum Algorithms
Zhiding Liang, Jinglei Cheng, Hang Ren, Hanrui Wang, Fei Hua, Zhixin Song, Yongshan Ding, Fred Chong, Song Han, Yiyu Shi, Xuehai Qian
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 2024

2023

GNNPipe: Scaling Deep GNN Training with Pipelined Model Parallelism
Jingji Chen, Zhuoming Chen, Xuehai Qian
arXiv 2023
Hybrid Gate-Pulse Model for Variational Quantum Algorithms
Zhiding Liang, Zhixin Song, Jinglei Cheng, Zichang He, Ji Liu, Hanrui Wang, Ruiyang Qin, Yiru Wang, Song Han, Xuehai Qian, Yiyu Shi
DAC 2023
Khuzdul: Efficient and Scalable Distributed Graph Pattern Mining Engine
Jingji Chen, Xuehai Qian
ASPLOS 2023
Achieving Sub-second Pairwise Query over Evolving Graphs
Hongtao Chen, Mingxing Zhang, Ke Yang, Kang Chen, Albert Zomaya, Yongwei Wu, Xuehai Qian
ASPLOS 2023
DecoMine: A Compilation-based Graph Pattern Mining System with Pattern Decomposition
Jingji Chen, Xuehai Qian
ASPLOS 2023
RobustState: Boosting Fidelity of Quantum State Preparation via Noise-Aware Variational Training
Hanrui Wang, Yilian Liu, Pengyu Liu, Jiaqi Gu, Zirui Li, Zhiding Liang, Jinglei Cheng, Yongshan Ding, Xuehai Qian, Yiyu Shi, David Z. Pan, Frederic T. Chong, Song Han
ICCAD 2023 ML4Sci

2022

SparseCore: Stream ISA and Processor Specialization for Sparse Computation
Gengyu Rao, Jingji Chen, Jason Yik, Xuehai Qian
ASPLOS 2022
HyBP: Hybrid Isolation-Randomization Secure Branch Predictor
Lutan Zhao, Peinan Li, Rui Hou, Michael Huang, Xuehai Qian, Lixin Zhang, Dan Meng
HPCA 2022
QuEst: Graph Transformer for Quantum Circuit Reliability Prediction
Hanrui Wang, Pengyu Liu, Jinglei Cheng, Zhiding Liang, Jiaqi Gu, Zirui Li, Yongshan Ding, Weiwen Jiang, Yiyu Shi, Xuehai Qian, David Z. Pan, Frederic T. Chong, Song Han
ICCAD 2022
Variational Quantum Pulse Learning
Hanrui Wang, Zhiding Liang, Jinglei Cheng, Yongshan Ding, Hang Ren, Zhengqi Gao, Xuehai Qian, Song Han, Weiwen Jiang, Yiyu Shi
QCE 2022

2021

RDMA-enabled Concurrency Control Protocols for Transactions in the Cloud Era
Chao Wang, Xuehai Qian
IEEE Transactions on Cloud Computing 2021
Distributed Graph Processing System and Processing-in-memory Architecture with Precise Loop-carried Dependency Guarantee
Youwei Zhuo, Jingji Chen, Gengyu Rao, Qinyi Luo, Yanzhi Wang, Hailong Yang, Depei Qian, Xuehai Qian
ACM Transactions on Computer Systems 2021
Non-Structured DNN Weight Pruning--Is It Beneficial in Any Platform?
Xiaolong Ma, Sheng Lin, Shaokai Ye, Zhezhi He, Linfeng Zhang, Geng Yuan, Sia Huat Tan, Zhengang Li, Deliang Fan, Xuehai Qian, Xue Lin, Kaisheng Ma, Yanzhi Wang
IEEE Transactions on Neural Networks and Learning Systems 2021
ESCALATE: Boosting the Efficiency of Sparse CNN Accelerator with Kernel Decomposition
Shiyu Li, Edward Hanson, Xuehai Qian, Hai Li, Yiran Chen
MICRO 2021
Kudu: An Efficient and Scalable Distributed Graph Pattern Mining Engine
Jingji Chen, Xuehai Qian
arXiv 2021
HASCO: Towards Agile HArdware and Software CO-design for Tensor Computation
Qincheng Xiao, Size Zheng, Bingzhe Wu, Pengcheng Xu, Xuehai Qian, Yun Liang
ISCA 2021
FORMS: Fine-grained Polarized ReRAM-based In-situ Computation for Mixed-signal DNN Accelerator
Geng Yuan, Payman Behnam, Zhengang Li, Ali Shafiee, Sheng Lin, Xiaolong Ma, Hang Liu, Xuehai Qian, Mahdi Nazm Bojnordi, Yanzhi Wang, Caiwen Ding
ISCA 2021
GoSPA: An Energy-efficient High-performance Globally Optimized SParse Convolutional Neural Network Accelerator
Chunhua Deng, Siyu Liao, Yang Sui, Xuehai Qian, Bo Yuan
ISCA 2021
Mix and Match: A Novel FPGA-Centric Deep Neural Network Quantization Framework
Sung-En Chang, Yanyu Li, Mengshu Sun, Runbin Shi, Hayden K.-H. So, Yanzhi Wang, Xuehai Qian, Xue Lin
HPCA 2021

2020

IntersectX: An Accelerator for Graph Mining
Gengyu Rao, Jingji Chen, Xuehai Qian
arXiv 2020
DwarvesGraph: A High-Performance Graph Mining System with Pattern Decomposition
Jingji Chen, Xuehai Qian
arXiv 2020
Low-Cost Floating-Point Processing in ReRAM for Scientific Computing
Linghao Song, Fan Chen, Xuehai Qian, Hai Li, Yiran Chen
arXiv 2020
A Lightweight Isolation Mechanism for Secure Branch Predictors
Lutan Zhao, Peinan Li, Rui Hou, Michael C. Huang, Jiazhen Li, Lixin Zhang, Xuehai Qian, Dan Meng
Non-Structured DNN Weight Pruning -- Is It Beneficial in Any Platform?
Xiaolong Ma, Sheng Lin, Shaokai Ye, Zhezhi He, Linfeng Zhang, Geng Yuan, Sia Huat Tan, Zhengang Li, Deliang Fan, Xuehai Qian, Xue Lin, Kaisheng Ma, Yanzhi Wang
A Comprehensive Evaluation of RDMA-enabled Concurrency Control Protocols
Chao Wang, Kezhao Huang, Xuehai Qian
ReversiSpec: Reversible Coherence Protocol for Defending Transient Attacks
You Wu, Xuehai Qian
AccQOC: Accelerating Quantum Optimal Control Based Pulse Generation
Jinglei Cheng, Haoqing Deng, Xuehai Qian
ISCA 2020
SympleGraph: Distributed Graph Processing with Precise Loop-carried Dependency Guarantee
Youwei Zhuo, Jingji Chen, Qinyi Luo, Yanzhi Wang, Hailong Yang, Depei Qian, Xuehai Qian
PLDI 2020
Prague: High-Performance Heterogeneity-Aware Asynchronous Decentralized Training
Qinyi Luo, Jiaao He, Youwei Zhuo, Xuehai Qian
ASPLOS 2020
Capuchin: Tensor-based GPU Memory Management for Deep Learning
Xuan Peng, Xuanhua Shi, Hulin Dai, Hai Jin, Weiliang Ma, Fan Yang, Xuehai Qian
ASPLOS 2020
AsymNVM: An Efficient Framework for Implementing Persistent Data Structures on Asymmetric NVM Architecture
Teng Ma, Mingxing Zhang, Kang Chen, Zhuo Song, Yongwei Wu, Xuehai Qian
ASPLOS 2020
DNN-Guard: An Elastic Heterogeneous Architecture for DNN Accelerator against Adversarial Attacks
Xingbin Wang, Rui Hou, Boyan Zhao, Fengkai Yuan, Jun Zhang, Dan Meng, Xuehai Qian
ASPLOS 2020
PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with Pattern-based Weight Pruning
Wei Niu, Xiaolong Ma, Sheng Lin, Shihao Wang, Xuehai Qian, Xue Lin, Yanzhi Wang, Bin Ren
ASPLOS 2020
AccPar: Tensor Partitioning for Heterogeneous Deep Learning Accelerator Arrays
Linghao Song, Fan Chen, Youwei Zhuo, Xuehai Qian, Hai Li, Yiran Chen
HPCA 2020
TUPIM: A Transparent and Universal Processing-in-memory Architecture for Unmodified Binaries
Sheng Xu, Xiaoming Chen, Yinhe Han, Xuehai Qian, Xiaowei Li
GLSVLSI 2020
Efficient Performance Estimation and Work-Group Size Pruning for OpenCL Kernels on GPUs
Xiebing Wang, Xuehai Qian, Alois Knoll, Kai Huang
IEEE Transactions on Parallel and Distributed Systems
CELIA: A Full-Stack Framework for STT-MRAM-Based Deep Learning Acceleration
Hao Yan, Hebin R. Cherian, Ethan C. Ahn, Xuehai Qian, Lide Duan
IEEE Transactions on Parallel and Distributed Systems

2019

GraphQ: Scalable PIM-Based Graph Processing
Youwei Zhuo, Chao Wang, Mingxing Zhang, Rui Wang, Dimin Niu, Yanzhi Wang, Xuehai Qian
MICRO 2019
SpeedyBox: Low-Latency NFV Service Chains with Cross-NF Runtime Consolidation
Yimin Jiang, Yong Cui, Wenfei Wu, Zhe Xu, Jiahan Gu, K. K. Ramakrishnan, Yongchao He, Xuehai Qian
ICDCS 2019
TIE: Energy-efficient Tensor Train-Based Inference Engine for Deep Neural Network
Chunhua Deng, Fangxuan Sun, Xuehai Qian, Jun Lin, Zhongfeng Wang, Bo Yuan
ISCA 2019
A Stochastic-Computing based Deep Learning Framework using Adiabatic Quantum-Flux-Parametron Superconducting Technology
Ruizhe Cai, Ao Ren, Olivia Chen, Ning Liu, Caiwen Ding, Xuehai Qian, Jie Han, Wenhui Luo, Yoshikawa Nobuyuki, Yanzhi Wang
ISCA 2019
HOP: Heterogeneity-Aware Decentralized Training
Qinyi Luo, Jinkun Lin, Youwei Zhuo, Xuehai Qian
ASPLOS 2019
SW-Lock: A Fast Lock for Sunway Taihulight
Xiongchao Tang, Jidong Zhai, Xuehai Qian, Wenguang Chen
ASPLOS 2019
ADMM-NN: An Algorithm-Hardware Co-Design Framework of DNNs Using Alternating Direction Methods of Multipliers
Ao Ren, Jiayu Li, Tianyun Zhang, Shaokai Ye, Wenyao Xu, Xuehai Qian, Xue Lin, Yanzhi Wang
ASPLOS 2019
HyPar: Towards Hybrid Parallelism for Deep Learning Accelerator Array
Linghao Song, Jiachen Mao, Youwei Zhuo, Xuehai Qian, Hai Li, Yiran Chen
HPCA 2019
A Hybrid Framework for Fast and Accurate GPU Performance Estimation through Source-Level Analysis and Trace-Based Simulation
Xiebing Wang, Kai Huang, Alois Knoll, Xuehai Qian
HPCA 2019
E-RNN: Design Optimization for Efficient Recurrent Neural Networks in FPGAs
Zhe Li, Caiwen Ding, Siyue Wang, Wujie Wen, Youwei Zhuo, Chang Liu, Qinru Qiu, Wenyao Xu, Xue Lin, Xuehai Qian, Yanzhi Wang
HPCA 2019
PIMSim: A Flexible and Detailed Processing-in-Memory Simulator
Sheng Xu, Xiaoming Chen, Ying Wang, Yinhe Han, Xuehai Qian, Xiaowei Li
Computer Architecture Letters 2019
CLIP: A Disk I/O Focused Parallel Out-of-core Graph Processing System
Zhiyuan Ai, Mingxing Zhang, Yongwei Wu, Xuehai Qian, Kang Chen, Weimin Zheng
IEEE Transactions on Parallel and Distributed Systems 2019
HEIF: Highly Efficient Stochastic Computing based Inference Framework for Deep Neural Networks
Zhe Li, Ji Li, Ao Ren, Ruizhe Cai, Caiwen Ding, Xuehai Qian, Jeffrey Draper, Bo Yuan, Jian Tang, Qinru Qiu, Yanzhi Wang
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 2019

2018

CSE: Parallel Finite State Machines with Convergence Set Enumeration
Youwei Zhuo, Jinglei Cheng, Qinyi Luo, Jidong Zhai, Yanzhi Wang, Zhongzhi Luan, Xuehai Qian
MICRO 2018
CounterMiner: Mining Big Performance Data from Hardware Counters
Yirong Lv, Bin Sun, Qinyi Luo, Zhibin Yu, Xuehai Qian
MICRO 2018
PermDNN: Efficient Compressed Deep Neural Network Architecture with Permuted Diagonal Matrices
Chunhua Deng, Siyu Liao, Yi Xie, Keshab K. Parhi, Xuehai Qian, Bo Yuan
MICRO 2018
ReRAM-based accelerator for deep learning
Bing Li, Linghao Song, Fan Chen, Xuehai Qian, Yiran Chen, Hai Helen Li
DATE 2018
vSensor: Leveraging Fixed-Workload Modules of Programs for Performance Variance Detection
Xiongchao Tang, Jidong Zhai, Xuehai Qian, Bingsheng He, Wei Xue, Wenguang Chen
PPOPP 2018
Wonderland: A Novel Abstraction-Based Out-Of-Core Graph Processing System
Mingxing Zhang, Yongwei Wu, Youwei Zhuo, Xuehai Qian, Chenying Huan, Kang Chen
ASPLOS 2018
VIBNN: Hardware Acceleration of Bayesian Neural Networks
Ruizhe Cai, Ao Ren, Ning Liu, Caiwen Ding, Luhao Wang, Xuehai Qian, Massoud Pedram, Yanzhi Wang
ASPLOS 2018
DAC: Data-Aware Auto-Tuning High Dimensional Configurations of In-Memory Cluster Computing
Zhibin Yu, Zhendong Bei, Xuehai Qian
ASPLOS 2018
GraphR: Accelerating Graph Processing Using ReRAM
Linghao Song, Youwei Zhuo, Xuehai Qian, Hai Li, Yiran Chen
HPCA 2018
GraphP: Reducing Communication of PIM-based Graph Processing with Efficient Data Partition
Mingxing Zhang, Youwei Zhuo, Chao Wang, Mingyu Gao, Yongwei Wu, Kang Chen, Christos Kozyrakis, Xuehai Qian
HPCA 2018
G-TSC: Timestamp Based Coherence for GPUs
Abdulaziz Tabbakh, Xuehai Qian, Murali Annavaram
HPCA 2018
Towards Ultra-High Performance and Energy Efficiency of Deep Learning Systems: An Algorithm-Hardware Co-Optimization Framework
Yanzhi Wang, Caiwen Ding, Geng Yuan, Siyu Liao, Zhe Li, Xiaolong Ma, Bo Yuan, Xuehai Qian, Jian Tang, Qinru Qiu, Xue Lin
AAAI 2018
Neu-NoC: A high-efficient interconnection network for accelerated neuromorphic systems
Xiaoxiao Liu, Wei Wen, Xuehai Qian, Hai Li, Yiran Chen
ASP-DAC 2018
DudeTx: Durable Transactions Made Decoupled
Mengxing Liu, Mingxing Zhang, Kang Chen, Xuehai Qian, Yongwei Wu, Weimin Zheng, Jinglei Ren
ACM Transactions on Storage 2018

2017

CIRCNN: Accelerating and Compressing Deep Neural Networks Using Block-Circulant Weight Matrices
Caiwen Ding, Yanzhi Wang, Siyu Liao, Zhe Li, Yu Bai, Youwei Zhuo, Chao Wang, Xuehai Qian, Ning Liu, Geng Yuan, Xiaolong Ma, Yipeng Zhang, Xue Lin, Jian Tang, Qinru Qiu, Bo Yuan
MICRO 2017
Squeezing out All the Value of Loaded Data: An Out-of-core Graph Processing System with Reduced Disk I/O
Zhiyuan Ai, Mingxing Zhang, Yongwei Wu, Xuehai Qian, Kang Chen, Weimin Zheng
ATC 2017
Power Efficient Sharing-Aware GPU Data Management
Abdulaziz Tabbakh, Murali Annavaram and Xuehai Qian
IPDPS 2017
SC-DCNN: Highly-Scalable Deep Convolutional Neural Network using Stochastic Computing
Ao Ren, Ji Li, Zhe Li, Caiwen Ding, Xuehai Qian, Qinru Qiu, Bo Yuan and Yanzhi Wang
ASPLOS 2017
DudeTM: Building Durable Transactions with Decoupling for Persistent Memory
Mengxing Liu, Mingxing Zhang, Kang Chen, Xuehai Qian, Yongwei Wu, Weimin Zheng and Jinglei Ren
ASPLOS 2017
PipeLayer: A Pipelined ReRAM-Based Accelerator for Deep Learning
Linghao Song, Xuehai Qian, Hai Li and Yiran Chen
HPCA 2017

2016

Exploring the Hidden Dimension in Graph Processing
Mingxing Zhang, Yongwei Wu, Kang Chen, Xuehai Qian, Xue Li and Weimin Zheng
OSDI 2016
SReplay: Deterministic Group Replay for One-Sided Communication
Xuehai Qian, Koushik Sen, Paul Hargrove and Costin Iancu
ICS 2016

Prior to 2015

Pacifier: Record and Replay for Relaxed-Consistency Multiprocessors with Distributed Directory Protocol
Xuehai Qian, Benjamin Sahelices and Depei Qian
ISCA 2014
OmniOrder: Directory-Based Conflict Serialization of Transactions
Xuehai Qian, Benjamin Sahelices and Josep Torrellas
ISCA 2014
BulkCommit: Scalable and Fast Commit of Atomic Blocks in a Lazy Multiprocessor Environment
Xuehai Qian, Benjamin Sahelices, Josep Torrellas and Depei Qian
MICRO 2013
Volition: Precise and Scalable Sequential Consistency Violation Detection
Xuehai Qian, Benjamin Sahelices, Josep Torrellas and Depei Qian
ASPLOS 2013
Rainbow: Efficient Memory Race Recording with High Replay Parallelism for Relaxed Memory Model
Xuehai Qian, He Huang, Benjamin Sahelices and Depei Qian
HPCA 2013
BulkSMT: Designing SMT Processors for Atomic-Block Execution
Xuehai Qian, Wonsun Ahn and Josep Torrellas
HPCA 2012
ScalableBulk: Scalable Cache Coherence for Atomic Blocks in a Lazy Environment
Xuehai Qian, Wonsun Ahn and Josep Torrellas
MICRO 2010
Optimized Register Renaming Scheme for Stack-Based x86 Operations
Xuehai Qian, He Huang, Zhenzhong Duan, Junchao Zhang, Nan Yuan, Yongbin Zhou, Hao Zhang, Huimin Cui and Dongrui Fan
ARCS 2007
Circuit Implementation of Floating Point Range Reduction for Trigonometric Functions
Xuehai Qian, Hao Zhang, Jingang Yang, He Huang, Junchao Zhang and Dongrui Fan
ISCAS 2007
Design and Implementation of Floating Point Stack on General RISC Architecture
Xuehai Qian, He Huang, Hao Zhang, Guoping Long, Junchao Zhang and Dongrui Fan
PDP 2007