Basic Information


Xiaofei Yang





Research Topic




Brief introduction to my research topic

Uncovering the mechanisms underlying the formation and evolution of complex phenotypes is a key problem in the life sciences. This process involves two major steps: accurate and complete genome assembly, and the discovery of genetic and epigenetic functional patterns from high-dimensional multi-omics data. I have made a series of original contributions to both genome assembly and functional pattern discovery. 1) I proposed a sequence graph model based on sequential k-mer patterns to accurately assemble highly repetitive sequences, such as telomeres and centromeres, and thereby obtain accurate and complete reference genomes. 2) I developed efficient graph pattern mining methods, built on the proposed multi-omics dimensionality reduction approaches, to discover genetic and epigenetic functional patterns. 3) By applying these methods to three Papaver species, I obtained a high-quality genome for each species and revealed the punctuated patchwork evolution of the morphinan and noscapine biosynthesis pathway. I have published 20 papers as first or corresponding author (including co-first and co-corresponding authorship) in Nature, Science, Nature Communications, Fundamental Research, Molecular Biology and Evolution, Genome Biology, Briefings in Bioinformatics, Bioinformatics, and Genomics, Proteomics & Bioinformatics (total impact factor: 200). My papers have been cited more than 1000 times in total, and my h-index is 16.


  1. Li Guo#, Thilo Winzer#, Xiaofei Yang#, Yi Li#, Zemin Ning#, Zhesi He, Roxana Teodor, Ying Lu, Tim A. Bowser, Ian A. Graham*, Kai Ye*, The opium poppy genome and morphinan production, Science, 2018, 362(6412): 343-347. (co-first author, IF=47.728)
  2. Xiaofei Yang#, Shenghan Gao#, Li Guo#, Bo Wang, Yanyan Jia, Jian Zhou, Yizhuo Che, Peng Jia, Jiadong Lin, Tun Xu, Jianyong Sun, Kai Ye*. Three chromosome-scale Papaver genomes reveal punctuated patchwork evolution of the morphinan and noscapine biosynthesis pathway. Nature Communications, 2021, 12: 6030. (IF=14.919)
  3. Xiaofei Yang#, Xixi Zhao#, Shoufang Qu, Peng Jia, Bo Wang, Shenghan Gao, Tun Xu, Wenxin Zhang, Jie Huang*, Kai Ye*, Haplotype-resolved Chinese male genome assembly based on high-fidelity sequencing, Fundamental Research, 2022.
  4. Shenghan Gao, Xiaofei Yang*#, Jianyong Sun, Xixi Zhao, Bo Wang, Kai Ye, IAGS: Inferring Ancestor Genome Structure under a wide range of evolutionary scenarios, Molecular Biology and Evolution, 2022, 39(3): msac041. (co-first author, co-corresponding author, IF=16.240)
  5. Bo Wang, Xiaofei Yang*, Yanyan Jia, Yu Xu, Peng Jia, Ningxin Dang, Songbo Wang, Tun Xu, Xixi Zhao, Shenghan Gao, Quanbin Dong, Kai Ye*. High-quality Arabidopsis thaliana Genome Assembly with Nanopore and HiFi Long Reads. Genomics, Proteomics & Bioinformatics, 2021. (*co-corresponding author, IF=7.691)
  6. Tun Xu, Xiaofei Yang*, Yanyan Jia, Zihang Li, Guangbo Tang, Xiujuan Li, Bo Wang, Tingjie Wang, Jiadong Lin, Li Guo, Kai Ye*, A global survey of the transcriptome of the opium poppy (Papaver somniferum) based on single-molecule long-read isoform sequencing. The Plant Journal, 2022. (*co-corresponding author, IF=6.417)
  7. Tingjie Wang, Ningxin Dang, Guangbo Tang, Zihang Li, Xiujuan Li, Bingyin Shi, Zhong Xu, Lei Li, Xiaofei Yang*, Chuanrui Xu*, Kai Ye*, Integrating bulk and single-cell RNA sequencing reveals cellular heterogeneity and immune infiltration in hepatocellular carcinoma. Molecular Oncology, 2022. (*co-corresponding author, IF=6.603)
  8. Jiadong Lin#, Xiaofei Yang#, Walter Kosters, Tun Xu, Yanyan Jia, Songbo Wang, Qihui Zhu, Mallory Ryan, Li Guo, Chengsheng Zhang, The Human Genome Structural Variation Consortium, Charles Lee, Scott E. Devine, Evan E. Eichler, Kai Ye*. Mako: a graph-based pattern growth approach to detect complex structural variants, Genomics, Proteomics & Bioinformatics, 2021. (#co-first author, IF=7.691)
  9. Xiaofei Yang#, Shenghan Gao#, Tingjie Wang#, Boyu Yang, Ningxin Dang, Kai Ye*. gCAnno: a graph-based single cell type annotation method. BMC Genomics, 2020, 21: 823. (IF=3.969)
  10. Xiaofei Yang, Tun Xu, Peng Jia, Han Xia, Li Guo, Lei Zhang, Kai Ye, Transportation, germs, culture: a dynamic graph model of COVID-19 outbreak, Quantitative Biology, 2020, 8(3): 238–244.
  11. Xiaofei Yang, Wan-Ping Lee, Kai Ye, Charles Lee. One reference genome is not enough. Genome Biology, 2019, 20(1): 104. (IF=13.583)
  12. Xiaofei Yang, Lin Gao, Shihua Zhang. Comparative pan-cancer DNA methylation analysis reveals cancer common and specific patterns, Briefings in Bioinformatics, 2017, 18(5): 761-773. (IF=11.622)
  13. Xiaofei Yang, Xiaojian Shao, Lin Gao, Shihua Zhang. Systematic DNA methylation analysis of multiple cell lines reveals common and specific patterns within and across tissues of origin. Human Molecular Genetics, 2015, 24(15): 4374-4384. (IF=6.150)
  14. Xiaofei Yang, Lin Gao, Xingli Guo, Xinhua Shi, Hao Wu, Fei Song, Bingbo Wang. A network based method for analysis of lncRNA-disease associations and prediction of lncRNAs implicated in diseases. PLoS One, 2014, 9(1): e87797. (IF=3.24)
  15. Xiaofei Yang, Xiaojian Shao, Lin Gao, et al. Comparative DNA methylation analysis to decipher common and cell type-specific patterns among multiple cell types. Briefings in Functional Genomics, 2016, 15(6): 399-407. (IF=4.241)
  16. Peter Ebert#, Peter A. Audano#, Qihui Zhu#, Bernardo Rodriguez-Martin#, ... Jiadong Lin, Xiaofei Yang, Kai Ye, ..., Charles Lee*, Jan O. Korbel*, Tobias Marschall*, Evan E. Eichler*. Haplotype-resolved diverse human genomes and integrated analysis of structural variation. Science, 2021: eabf7117.
  17. Wan-Ping Lee#, Qihui Zhu#, Xiaofei Yang, Silvia Liu, Eliza Cerveira, Mallory Ryan, Adam Mil-Homens, Lauren Bellfy, Kai Ye, Chengsheng Zhang*, and Charles Lee*. JAX-CNV: A whole genome sequencing-based algorithm for copy number detection at clinical grade level. Genomics, Proteomics & Bioinformatics, 2022.
  18. Yongyong Kang, Xinchao Ji, Li Guo, Han Xia, Xiaofei Yang, Zhen Xie, Xiaodan Shi, Rui Wu, Dongyun Feng, Chen Wang, Min Chen, Wenliang Zhang, Hong Wei, Yuanlin Guan, Kai Ye, Gang Zhao. Cerebrospinal Fluid from Healthy Pregnant Women Does Not Harbor a Detectable Microbial Community. Microbiology Spectrum, 2021, 9(3): e00769-21.
  19. Rundong Li, Pinghui Wang*, Jiongli Zhu, Junzhou Zhao, Jia Di, Xiaofei Yang, Kai Ye. Building Fast and Compact Sketches for Approximately Multi-Set Multi-Membership Querying. SIGMOD 2021. (CCF Rank A)
  20. Yue Wang, Yufeng Liu, Xiaofei Yang, Hui Guo, Jiadong Lin, Jinkui Yang, Mingqian He, Jingya Wang, Xiaomei Liu, Tingting Shi, Liping Wu, Chengsheng Zhang, Kai Ye, Bingyin Shi. Predicting the early risk of ophthalmopathy in Graves’ disease patients using TCR repertoire. Clinical and Translational Medicine, 2020, 10(7): e218.
  21. Peng Jia, Xiaofei Yang, Li Guo, Bowen Liu, Jiadong Lin, Hao Liang, Jianyong Sun, Chengsheng Zhang, Kai Ye, MSIsensorpro: Fast, Accurate, and Matched-normal-sample-free Detection of Microsatellite Instability, Genomics, Proteomics & Bioinformatics, 2020, 18(1): 65-71, doi:
  22. Bowen Liu, Xiaofei Yang, Tingjie Wang, Jiadong Lin, Yongyong Kang, Peng Jia, Kai Ye. MEpurity: estimating tumor purity using DNA methylation data. Bioinformatics, 2019, 35(24): 5298-5300.
  23. Yongyong Kang, Xiaofei Yang, Jiadong Lin, Kai Ye. PVTree: A Sequential Pattern Mining Method for Alignment Independent Phylogeny Reconstruction. Genes, 2019, 10(2): 73.
  24. Kai Ye, Li Guo, Xiaofei Yang, et al. Split-Read Indel and Structural Variant Calling Using PINDEL, Methods in Molecular Biology (Clifton, N.J.), 2018, 1833: 95-105, DOI: 10.1007/978-1-4939-8666-8_7. (cited 3 times)
  25. Xiaoke Ma, Liang Yu, Peizhuo Wang, Xiaofei Yang. Discovering DNA methylation patterns for long non-coding RNAs associated with cancer subtypes. Computational Biology and Chemistry, 2017, 69: 164-170. (cited 11 times)
  26. Xingli Guo, Lin Gao, Qi Liao, Hui Xiao, Xiaoke Ma, Xiaofei Yang, et al. Long non-coding RNAs function annotation: a global prediction method based on bi-colored networks. Nucleic Acids Research, 2013, 41(2): e35. (cited 140 times)
  27. Hao Wu, Lin Gao, Feng Li, Fei Song, Xiaofei Yang, et al. Identifying overlapping mutated driver pathways by constructing gene networks in cancer. BMC Bioinformatics, 2015, 16(Suppl 5): S3. (cited 14 times)
  28. Hao Wu, Lin Gao, Jihua Dong, Xiaofei Yang. Detecting overlapping protein complexes by rough-fuzzy clustering in protein-protein interaction networks. PLoS One, 2014, 9(3): e91856. (cited 28 times)
  29. Xingli Guo, Lin Gao, Chunshui Wei, Xiaofei Yang, et al. A computational method based on the integration of heterogeneous networks for predicting disease-gene associations. PLoS One, 2011, 6(9): e34171. (cited 29 times)
  30. Feng Li, Lin Gao, Xiaoke Ma, Xiaofei Yang. Detection of driver pathways using mutated gene network in cancer. Molecular BioSystems, 2016, 12: 2135-2141. (cited 5 times)