Research Areas


Our lab has been focused on genomic analysis of mammalian cells, with a specific emphasis on three areas.

Identifying the transcriptional regulatory sequences in genomes

With the rapid advances in sequencing technologies, obtaining the genome sequence of an individual organism is no longer rate-limiting. Instead, identifying the functional elements throughout the genome has become a major bottleneck. Previously, my lab invented an approach for finding cis-elements that involves identifying transcription factor binding sites and chromatin modification status genome-wide using chromatin immunoprecipitation-based methods, and demonstrated that this is an effective approach for genome-wide mapping of cis-elements (Kim et al., Nature 2005; Kim et al., Cell 2007; Heintzman et al., Nature Genetics 2007). This approach has now been widely adopted in the field. My lab has continued to play a key role in delineating the functional elements in genomes. For example, as a member of the NIH ENCODE consortium, my group has been working to map the cis-regulatory elements in the mouse genome. Using this strategy, we have identified more than 300,000 cis-regulatory elements in the mouse genome, including a large number of novel promoters, enhancers and insulator elements (Shen et al., Nature 2012). We are continuing this research as part of the ENCODE project, with a focus on embryonic development.

Epigenetic mechanisms regulating pluripotency and lineage commitment

The functional properties and differentiation potential of each cell are controlled not only by its primary DNA sequence but also by its epigenome, the collection of covalent modifications to the DNA and chromatin. To understand the molecular basis of cellular differentiation and cell fate determination, we need to extensively characterize the epigenome of each cell type. Since 2008, we have been conducting research as part of the NIH Roadmap Epigenomics Project. We have generated comprehensive epigenome maps for human embryonic stem cells (ESCs), fibroblasts and a number of ESC-derived cell types. Analysis of these epigenomic profiles has revealed dramatic differences in DNA methylomes and chromatin landscapes between pluripotent and lineage-committed cell types. For example, we showed that H3K27me3 and H3K9me3 chromatin domains expand significantly in differentiated cells, and that these domains tend to affect genes coding for developmental regulators and lineage-specific functions (Hawkins et al., Cell Stem Cell 2010). We suggest that formation of such large, special chromatin domains is a critical step in cellular differentiation. The information that we produced has provided novel insights into processes regulating tissue-specific gene expression (Xie et al., Cell 2013). Additionally, we have gained considerable knowledge about the epigenomic landscapes of pluripotent and lineage-committed cell types, finding evidence for the involvement of several key epigenetic processes in normal development and tumorigenesis.

Higher-order genome architecture

Higher-order chromatin architecture is emerging as an important regulator of diverse nuclear processes, from gene regulation to DNA replication. Recent methodological advances have made it possible, for the first time, to interrogate higher-order chromatin interactions on a genome-wide scale. However, how chromatin architecture changes during cellular differentiation, and how these changes relate to chromatin state and gene expression, need to be better understood. We recently discovered that mammalian genomes are partitioned into a few thousand megabase-sized topological domains, which display strong local chromatin interactions but infrequent inter-domain interactions (Dixon et al., Nature 2012). These domains are surprisingly stable during development and are evolutionarily conserved. The physical partitioning of the genome provides a structural basis for understanding long-range regulation by distal enhancers, which are often located hundreds of kilobases away from their target genes and whose mode of action has been a longstanding question. The existence of topological domains also helps explain the coordinated regulation of gene expression programs shared by clusters of genes, and could potentially provide insights into the mechanisms of chromosomal translocations during evolution and in cancer cells. We are investigating the mechanisms that regulate the establishment and maintenance of topological domains, and how these domains contribute to gene regulation.