Research Summary

Systems biology is a new field that aims to gain a system-level understanding of the complex functions of biological systems and processes, ranging from molecules to cells, tissues, and entire organisms. From the systems science point of view, many important properties of a complex system emerge from the interactions of its components. Therefore, rather than investigating the characteristics of isolated parts separately, systems biologists focus on discovering emergent properties and functions that do not appear in individual components but are driven by the interactions among the system components or component groups.

The development of high-throughput ‘-omics’ technologies has yielded large data resources, such as the genomic sequences gathered by the Human Genome Project, gene expression data from microarray experiments, proteome databases and protein interaction databases. Together with large volumes of digital textual documents such as PubMed, these resources have created an unprecedented opportunity to apply computational techniques and tools to a comprehensive study of the structure and dynamics of system components, and thus provide a foundation for systems biology.

A variety of data mining and machine learning techniques have been developed in recent years to extract information, knowledge and patterns from large volumes of data. Because they can process large datasets and uncover hidden patterns, data mining and machine learning have been applied in bioinformatics to manage data and to identify the structure and functions of system components (genes and proteins) and their functional relationships. However, traditional data mining techniques face many new challenges when applied to systems biology. A fundamental issue is that biomedical data repositories are built with diverse technologies and are normally presented in heterogeneous and unstructured forms. Automatically and effectively extracting, integrating, understanding, and making use of the information embedded in such heterogeneous, unstructured data remains a challenging task.

The research of Dr Peng’s group aims at developing a new computational approach, called integrative data mining, that can assist systems biologists in directing the whole investigation process, from information gathering through analysis to interpretation, so as to incrementally improve our understanding and eventually gain a panorama of biological systems. The integrative data mining approach can be generally described by three main steps: (1) identifying the elements/components of the system; (2) describing the system using connective networks, in which nodes represent the system components and edges represent interactions between nodes. The network describes the functional relationships among the system components, and the interactions ultimately determine an organism’s behaviour and functions; (3) gaining insights into emergent properties of biological systems by analysing the structural properties and dynamics of the network.
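
As a rough illustration of the three steps above, the following Python sketch builds a small undirected network and inspects two simple structural properties. It is only a toy example under stated assumptions, not the group's actual method: the component names (geneA to geneF), the interaction list, and the helper functions build_network, degrees and connected_components are all hypothetical, and node degree and connected components merely stand in for the richer structural and dynamic analyses a real study would use.

```python
from collections import defaultdict, deque

# Hypothetical interaction pairs, invented purely for illustration.
# In practice these would come from interaction databases or text mining.
interactions = [
    ("geneA", "geneB"),
    ("geneA", "geneC"),
    ("geneB", "geneC"),
    ("geneC", "geneD"),
    ("geneE", "geneF"),
]


def build_network(pairs):
    """Step 2: nodes are components, undirected edges are interactions."""
    adjacency = defaultdict(set)
    for a, b in pairs:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)


def degrees(adjacency):
    """Step 3a: number of interaction partners per component."""
    return {node: len(neighbours) for node, neighbours in adjacency.items()}


def connected_components(adjacency):
    """Step 3b: groups of components linked by some chain of interactions."""
    seen = set()
    groups = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        group = []
        while queue:  # breadth-first traversal from the start node
            node = queue.popleft()
            group.append(node)
            for neighbour in sorted(adjacency[node]):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        groups.append(sorted(group))
    return groups


network = build_network(interactions)
components = sorted(network)          # Step 1: the identified components
degree = degrees(network)
hub = max(degree, key=degree.get)     # most highly connected component
groups = connected_components(network)

print("components:", components)
print("hub:", hub, "with degree", degree[hub])
print("connected groups:", groups)
```

In this toy network, geneC has the most interaction partners and the components fall into two separate groups, which is the kind of network-level observation that cannot be read off any single component in isolation.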

Please contact me at y.h.peng@brad.ac.uk if you are interested in joining us as a research student, visiting researcher or postdoctoral fellow.