Our lab develops general machine learning models and algorithms for integrative analysis of large-scale genomic data, with the goal of understanding the molecular characteristics of biological functions and phenotypes. We design mathematically principled methods, including graph-based semi-supervised learning, transfer learning, string kernels and other kernel methods, sequence alignment methods, and various statistical models, for a unified analysis of heterogeneous biological data. Our current projects center on the following topics:

  • Cancer genomics: Development of graph-based learning algorithms, sequence alignment algorithms and association rule-mining algorithms for building predictive models and mining biomarkers of cancer phenotypes from microarray or sequencing transcriptome data, DNA copy number variations, SNPs and protein-protein interactions.
  • Phenome-genome association analysis: Development of graph-based learning algorithms for analyzing disease and gene associations in a network context.
  • Protein remote homology detection: Development of string kernel algorithms and label propagation algorithms to infer remote protein homologs and study their structures and functions.
  • Semi-supervised and transfer learning algorithms: Development of general and scalable graph-based learning, transfer learning, sparse group learning and kernel learning methods.