Li Song is an Assistant Professor in the Department of Biomedical Data Science at the Geisel School of Medicine at Dartmouth. He obtained his Ph.D. in Computer Science from Johns Hopkins University in 2018 (Advisor: Liliana Florea), along with a second master's degree in Applied Mathematics and Statistics. He then joined the labs of X. Shirley Liu and Heng Li in the Department of Data Science at Dana-Farber Cancer Institute for his postdoctoral training. His work focuses on developing algorithms to better analyze next-generation sequencing data, especially for immunology.
We are looking for passionate postdocs, graduate students, and rotation students to join our team! Please contact Li.Song@dartmouth.edu to discuss. Postdoc applicants should send a CV and contact information for three potential references. Students can apply to the Dartmouth Quantitative Biomedical Sciences program or the Department of Computer Science.
Research interests
Our research focuses on designing algorithms and developing highly efficient methods to analyze sequencing data. These methods have been widely used to investigate immune receptors, the microbiome, and the transcriptome.
Computational Immunology
Many immune-related genes are not well represented in the human reference genome and require specialized methods to analyze. One example is the T-cell receptor (TCR) and B-cell receptor (BCR), whose sequences can differ from cell to cell. We developed TRUST4 to assemble these receptors de novo from bulk and single-cell RNA-seq data, and we have applied it to study how the immune repertoire changes in various diseases, such as cancers. Other examples are the HLA and KIR genes, which vary from person to person. We therefore developed T1K to genotype these highly polymorphic immune genes. Currently, we are particularly interested in improving these methods and applying them to disease studies.
Computational Microbiology
A key challenge in identifying microbiome composition from sequencing data is aligning reads against a huge microbial genome database that is difficult to hold in memory. We developed Centrifuger, which losslessly compresses the microbial database for taxonomic classification. This method improves on our previous method, Centrifuge. Our future goal is to keep improving Centrifuger, in both its data structures and its classification accuracy, and to apply it to more microbiome studies.
Selected Publications
- Song L*, Langmead B, Centrifuger: lossless compression of microbial genomes for efficient and accurate metagenomic sequence classification. Genome Biol. 2024 Apr 25;25(1):106. [PubMed] (Best Paper Award at RECOMB 2024) (*: corresponding author)
- Song L, Cohen D, Ouyang Z, Cao Y, Hu X and Liu XS, TRUST4: immune repertoire reconstruction from bulk and single-cell RNA-seq data. Nat Methods. 2021 Jun;18(6):627-630. [PubMed]
- Song L, Bai G, Liu XS, Li B and Li H, Efficient and accurate KIR and HLA genotyping with massively parallel sequencing data. Genome Res. 2023 Jun;33(6):923-931. [PubMed]
- Zhang H, Song L#, …, Liu XS, Li H, Fast alignment and preprocessing of chromatin profiles with Chromap. Nat Commun. 2021 Nov 12;12(1):6566. [PubMed] (#: co-first author)
- Song L, Sabunciyan S, Yang G and Florea L, A multi-sample approach increases the accuracy of transcript assembly. Nat Commun. 2019 Nov 1;10(1):5000. [PubMed]