Bioinformatics

Bioinformatics is the use of mathematical and informational techniques to solve biological problems, usually by creating or using computer programs, mathematical models, or both. One of its main applications is the mining and analysis of the data gathered in genome projects. Other applications include sequence alignment, protein structure prediction, the modelling of metabolic networks, morphometrics and virtual evolution.

Scripting languages such as Perl and Python are often used to interface with biological databases and to parse the output of bioinformatics programs. Communities of bioinformatics programmers have set up free/open source projects such as Bioperl, Bioruby, and Biopython, which develop and distribute shared programming tools and objects (as program modules) that make bioinformatics easier.
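As an illustration of the kind of parsing such scripts perform, here is a minimal sketch in plain Python. It assumes the input is in the widely used FASTA text format (a ">" header line followed by sequence lines); the format, the function name parse_fasta and the example records are assumptions made for illustration and do not come from any of the projects named above.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of record name -> sequence.

    Illustrative sketch only: real files may need more careful handling
    (duplicate names, very large inputs, non-sequence characters).
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: use the first word after '>' as the record name.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            records[name] = []
        elif name is not None:
            # Sequence line: collect it under the most recent header.
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


# Hypothetical example input, not real database output.
example = """>seq1 made-up example record
ATGGCC
TAA
>seq2
ggatcc
"""
sequences = parse_fasta(example)
for record_name, sequence in sequences.items():
    print(record_name, len(sequence), sequence)
```

In practice a shared module from one of the projects above would normally be preferred over a hand-written parser, since it already handles the many variations found in real database output.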

Since the Epstein-Barr virus was sequenced in 1984, the DNA sequences of more and more organisms have been stored in electronic databases. This data is analyzed to identify genes that code for proteins, as well as regulatory sequences. A comparison of genes within a species or between different species can reveal similarities between protein functions, or relations between species (phylogenetic trees). With the growing amount of data, it has become impossible to analyze DNA sequences manually. Today, computer programs are used to find similar sequences in the genomes of dozens of organisms, within billions of nucleotides. The programs can compensate for mutations (exchanged, deleted or inserted bases) in the DNA sequence.

A variant of this sequence alignment is used in the sequencing process itself. So-called shotgun sequencing (which was used, for example, by Celera Genomics to sequence the human genome) does not give a sequential list of nucleotides, but instead the sequences of thousands of small DNA fragments (each about 600 nucleotides long). The ends of these fragments overlap and, aligned in the right way, make up the complete genome. Shotgun sequencing is very fast, but reassembling the fragments is a complicated task. In the case of the Human Genome Project (1988-2000), it took several months on a supercomputer array to align them correctly.
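The idea of rebuilding a sequence from overlapping fragment ends can be sketched with a simple greedy strategy: repeatedly merge the two fragments with the longest suffix-to-prefix overlap. This is only a toy illustration under strong assumptions (error-free reads, a single strand, no repeated regions); it is not the method used by any actual genome project, and the function names and the min_len parameter are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if no overlap of at least min_len characters exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments, min_len=3):
    """Greedily merge the best-overlapping pair until no overlaps remain."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_n, best_i, best_j = 0, None, None
        # Find the ordered pair (i, j) with the longest overlap.
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing overlaps any more; return what is left
        merged = frags[best_i] + frags[best_j][best_n:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Three made-up fragments cut from one short made-up sequence.
reads = ["CGTTAGC", "ATGGCGTA", "CGTACGTT"]
print(greedy_assemble(reads))
```

With real data the greedy choice can go wrong when the genome contains repeated regions or the reads contain errors, which is part of why the reassembly step is so computationally demanding.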

Protein structure prediction is another important application of bioinformatics. The amino acid sequence of a protein, the so-called primary structure, can be easily determined from the sequence of the gene that codes for it. However, the protein can only function correctly if it is folded in a very specific way (that is, if it has the correct secondary, tertiary and quaternary structure). Predicting this folding from the amino acid sequence alone is quite difficult. Several methods for computer prediction of protein folding are currently (2001) under development.
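The easy step mentioned above, reading the primary structure off a coding sequence, can be sketched as a lookup in the standard genetic code. The sketch assumes the input is the coding strand of a DNA sequence that is already in the correct reading frame and contains no introns; the function name translate and the use of "X" for unrecognized codons are illustrative choices.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G codon order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding DNA sequence into one-letter amino acid codes.

    Reads codons from position 0, stops at the first stop codon, and
    writes "X" for any codon not in the table (for example one with N).
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCTAA"))
```

Going from this one-dimensional string to the folded three-dimensional structure is the hard part of the problem and is not addressed by the sketch.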

One of the key principles in bioinformatics is homology. In the genomic branch of bioinformatics, homology is used to predict the function of a gene: if gene A is homologous to gene B, whose function is known, then A is likely to have a similar function. In the structural branch of bioinformatics, homology is used to determine which parts of a protein are important for structure formation and for interaction with other proteins. In a technique called homology modelling, this information is used to predict the structure of a protein once the structure of a homologous protein is known. Despite many attempts at alternatives, this is currently the only way to predict protein structures with some reliability.
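In practice, homology between two genes is usually inferred from the similarity of their sequences after alignment. The following is a minimal sketch of a global alignment score in the style of the Needleman-Wunsch dynamic programming algorithm, which tolerates exchanged, deleted and inserted characters. The scoring values (match +1, mismatch -1, gap -1) are arbitrary assumptions chosen for illustration; real tools use empirically derived scoring schemes.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best global alignment score of sequences a and b.

    Uses dynamic programming over prefixes of a and b, keeping only the
    previous row of the score table to save memory.
    """
    # Row 0: aligning an empty prefix of a against b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            # Align a[i-1] with b[j-1] (match or exchanged base).
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Or align one of the two characters against a gap.
            gap_in_b = prev[j] + gap
            gap_in_a = curr[j - 1] + gap
            curr.append(max(diagonal, gap_in_b, gap_in_a))
        prev = curr
    return prev[-1]


# Made-up sequences: identical, one deleted base, one exchanged base.
print(global_alignment_score("GATTACA", "GATTACA"))
print(global_alignment_score("GATTACA", "GATACA"))
print(global_alignment_score("ACGT", "AGGT"))
```

A high score relative to the sequence lengths suggests, but does not prove, a common origin; deciding whether a score is significant requires statistical treatment that is outside the scope of this sketch.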

There are many other applications of bioinformatics. Computer simulations of cellular subsystems, such as the networks of metabolites and enzymes that make up metabolism, signal transduction pathways and gene networks, help to both analyze and visualize the complex connections within these cellular processes. Morphometrics is used to analyze pictures of embryos in order to track and predict the fate of cell clusters during morphogenesis. Artificial life or virtual evolution attempts to understand evolutionary processes through the computer simulation of simple (artificial) life forms. Another application is the automatic search for genes and regulatory sequences within a genome. Not all of the nucleotides within a genome belong to genes: in the genomes of higher organisms, large parts of the DNA serve no obvious purpose and are often called junk DNA. Bioinformatics also helps to bridge the gap between genome and proteome projects, for example in the use of DNA sequence for protein identification.
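One very simple form of the automatic gene search mentioned above is to scan a sequence for open reading frames (ORFs): stretches that begin with a start codon and run, in the same reading frame, to the first stop codon. The sketch below assumes the standard start codon ATG and stop codons TAA, TAG and TGA, looks only at the forward strand, and ignores introns, so it is far cruder than real gene finders; the function name find_orfs and the min_codons threshold are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end, sequence) tuples for forward-strand ORFs.

    start and end are 0-based positions with end exclusive; the returned
    sequence includes the stop codon. min_codons counts the codons before
    the stop codon. Results are grouped by reading frame, not by position.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(dna):
            if dna[i:i + 3] == "ATG":
                # Walk codon by codon until an in-frame stop codon.
                j = i + 3
                while j + 3 <= len(dna) and dna[j:j + 3] not in STOP_CODONS:
                    j += 3
                if j + 3 <= len(dna):  # a stop codon was found
                    if (j - i) // 3 >= min_codons:
                        orfs.append((i, j + 3, dna[i:j + 3]))
                    i = j + 3  # continue scanning after this ORF
                    continue
            i += 3
    return orfs


# Made-up sequence containing a single short ORF.
print(find_orfs("CCATGAAACCCGGGTAACC"))
```

Real gene prediction also has to deal with both strands, splicing and statistical signals in the sequence, which is why a large share of a genome cannot be assigned to genes by a scan like this one.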

In summary, the genome projects gave us long lists of letters; with bioinformatics, we can determine the words, grammar, sentences and, finally, their meaning.

See also: biologically-inspired computing
