The mapping of the human genome and the determination of corresponding gene functions, pathways, and biological mechanisms are driving the emergence of the new research fields of toxicogenomics and systems toxicology. Technological advances such as microarrays are enabling this paradigm shift, which marks an unprecedented advance in methods for understanding the expression of toxicity at the molecular level. At the National Center for Toxicological Research (NCTR) of the U.S. Food and Drug Administration, core facilities for genomic, proteomic, and metabonomic technologies have been established that use standardized experimental procedures to support centerwide toxicogenomic research. Collectively, these facilities are continuously producing an unprecedented volume of data. NCTR plans to develop a toxicoinformatics integrated system (TIS) to fully integrate genomic, proteomic, and metabonomic data with the data in public repositories as well as with conventional in vitro and in vivo toxicology data. The TIS will enable data curation in accordance with standard ontology and will provide, or interface with, a rich collection of tools for data analysis and knowledge mining. In this article we discuss the design, practical issues, and functions of the TIS by presenting its prototype version, ArrayTrack, for the management and analysis of DNA microarray data. ArrayTrack is logically constructed of three linked components: a) a library (LIB) that mirrors critical data in public databases; b) a database (MicroarrayDB) that stores microarray experiment information in a form compliant with the Minimum Information About a Microarray Experiment (MIAME) guidelines; and c) tools (TOOL) that operate on experimental and public data for knowledge discovery. Using ArrayTrack, we can select an analysis method from the TOOL and apply the method to selected microarray data stored in the MicroarrayDB; the analysis results can be linked directly to gene information in the LIB.
Key words: bioinformatics, data analysis, database, genomics, infrastructure, MIAME, microarray, systems toxicology, toxicogenomics, toxicology. Environ Health Perspect 111:1819-1826 (2003). doi:10.1289/txg.6497 available via http://dx.doi.org/ [Online 15 September 2003]
While modern toxicology has focused on understanding biological mechanisms involved in the expression of toxicity at the molecular level, a technological revolution has occurred that enables researchers to perform experiments on an unprecedented scale (Marshall and Hodgson 1998; Ramsay 1998). High-throughput experimentation is producing large amounts of data that are impossible to analyze without informatics-related support (Bellenson 1999; Spengler 2000). We see a paradigm shift in toxicology research, where hypothesis-driven research is complemented by data-driven experimentation designed to be hypothesis generating (Afshari et al. 1999). Although toxicogenomics, the study of toxicology using high-throughput "omics" technologies (Aardema and MacGregor 2002; Hamadeh et al. 2002; Nuwaysir et al. 1999; Schmidt 2002; Ulrich and Friend 2002), and systems toxicology, the study of toxicology through data integration (Waters et al. 2003), have advanced rapidly and are likely to continue to advance, development of software infrastructures to manage, analyze, and integrate the diverse data has lagged behind. Recently, Waters et al. (2003) proposed a conceptual framework, the Chemical Effects in Biological Systems (CEBS) knowledge base, to meet the expanding toxicogenomic research needs at the National Center for Toxicogenomics (NCT) (Tennant 2002), including both NCT intramural research and research within the Toxicogenomics Research Consortium (TRC) (Medlin 2002). Both the NCT and the TRC are located at the National Institute of Environmental Health Sciences (NIEHS) in Research Triangle Park, North Carolina.