2003-05-27

Bio-informatics and database construction for the interpretation of human gene variation.

The human genome contains a great deal of sequence variation between individuals. This variation includes single nucleotide polymorphisms (SNPs) as well as sequence deletions and insertions of various sizes. The variations can occur anywhere in a gene. They may affect transcription/expression levels, alter the resulting protein sequence (which may or may not change protein function), or be neutral (of no consequence).
Worldwide, the study of gene sequence polymorphism is growing dramatically because it is the most functionally relevant dimension of the maturing Human Genome Project. But, despite the existence of a range of multi-million dollar projects for polymorphism discovery, there is only one effort attempting to centrally collect the portions of this data that relate to functional variations in known human genes: the database we have produced (HGBASE: http://hgbase.interactiva.de/). HGBASE is visited by over 2,000 scientists every month, and currently contains details of approximately 500,000 human gene variations, with many more being processed for release later this year. Our data processing activities entail complex integration of different data sources, quality checking of information, 'in silico' prediction of coding consequences, and semi-automated Web and literature scanning. This requires substantial curation and programming input that must continually be advanced and improved to keep up with the world's discovery and functional investigation efforts. Additionally, our database design must be continually enhanced to capture all the relevant information that scientists define as important. Any and all of these bio-informatics challenges are available as part of the current project, which will provide wide-ranging experience at the interface of genomics and computing.
The Center for Genomics Research at Karolinska Institute is a prominent research and educational organisation, concentrating on research into the function of the human genome. It harbours units for Functional Genomics, Bioinformatics, Genomics Technologies and Clinical Genomics.


