Biological Networks

Networks are widely used in many branches of biology as a convenient representation of patterns of interaction between appropriate biological elements. These biological networks include biochemical networks, neural networks, and ecological networks.

Biochemical networks

Biochemical networks represent the molecular-level patterns of interaction and mechanisms of control in the biological cell. The principal types of these networks are metabolic networks, protein-protein interaction networks, and genetic regulatory networks.

Metabolic networks

Metabolism is the chemical process by which cells break down food and nutrients into usable building blocks and then reassemble those building blocks to form the biological molecules the cell needs to complete its other tasks. Typically this breakdown and reassembly involves chains or pathways, sets of successive chemical reactions that convert initial inputs into useful end products by a series of steps. The complete set of all reactions in all pathways forms the metabolic network.

The vertices in a metabolic network are the chemicals produced and consumed by the reactions, known as metabolites. These are small molecules such as carbohydrates, lipids, amino acids, and nucleotides. The metabolites consumed by a reaction are called its substrates, while those produced are called its products.

Most metabolic reactions do not occur spontaneously, or do so only at a very low rate. To make reactions occur at a usable rate, the cell employs an array of chemical catalysts, referred to as enzymes. Enzymes are not consumed in the reactions they catalyze. By increasing or decreasing the concentration of the enzyme that catalyzes a particular reaction, the cell can turn that reaction on or off, or moderate its speed. Enzymes tend to be highly specific to the reaction they catalyze.

The most accurate representation of a metabolic network is a bipartite graph. The two types of vertices represent metabolites and metabolic reactions, with edges joining each metabolite to the reactions in which it participates. The edges are directed, since some metabolites (the substrates) go into the reaction and some (the products) come out of it. Enzymes can be incorporated by adding a third class of vertex to represent them, with undirected edges connecting them to the reactions they catalyze. The resulting graph is a mixed (directed and undirected) tripartite network.
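
As a minimal sketch of this representation, the following Python builds the edge lists of the mixed tripartite network from a list of reactions. The `Reaction` class and the `build_tripartite` function are hypothetical names introduced for illustration, and the two reactions are simplified versions of the first two steps of glycolysis (cofactors and reversibility are ignored).

```python
from dataclasses import dataclass, field


@dataclass
class Reaction:
    """One reaction vertex with its metabolites and catalyzing enzymes."""
    name: str
    substrates: list
    products: list
    enzymes: list = field(default_factory=list)


def build_tripartite(reactions):
    """Return (directed_edges, catalysis_edges) for a list of reactions.

    directed_edges:  (source, target) pairs, substrate -> reaction
                     and reaction -> product.
    catalysis_edges: (enzyme, reaction) pairs, to be read as undirected.
    """
    directed_edges = []
    catalysis_edges = []
    for r in reactions:
        for metabolite in r.substrates:
            directed_edges.append((metabolite, r.name))
        for metabolite in r.products:
            directed_edges.append((r.name, metabolite))
        for enzyme in r.enzymes:
            catalysis_edges.append((enzyme, r.name))
    return directed_edges, catalysis_edges


# Illustrative input: simplified first two steps of glycolysis.
reactions = [
    Reaction("R1", ["glucose", "ATP"], ["glucose-6-phosphate", "ADP"],
             ["hexokinase"]),
    Reaction("R2", ["glucose-6-phosphate"], ["fructose-6-phosphate"],
             ["phosphoglucose isomerase"]),
]

directed, catalysis = build_tripartite(reactions)
```

Keeping the directed and undirected edges in separate lists is only one possible design; a single edge list with a type label per edge would carry the same information.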

Nevertheless, the most common representations of metabolic networks project the bipartite network onto the metabolite vertices. In one approach, the vertices in the network represent metabolites and there is an undirected edge between any two metabolites that participate in the same reaction, either as substrates or as products. Clearly this projection loses much of the information contained in the bipartite network. Another, more informative approach is to represent the network as a directed graph with a single type of vertex representing metabolites and a directed edge from one metabolite to another if there is a reaction in which the first metabolite appears as a substrate and the second as a product. This representation is richer in information than the undirected one but still loses the association of metabolites with reactions.
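
The two projections can be sketched in a few lines of Python. The reaction labels `R1`, `R2` and the metabolite labels `A` to `E` are hypothetical placeholders, and the function names are introduced here only for illustration.

```python
from itertools import combinations, product

# Hypothetical reactions: name -> (substrates, products).
# R1: A + B -> C + D,  R2: C -> E
reactions = {
    "R1": (["A", "B"], ["C", "D"]),
    "R2": (["C"], ["E"]),
}


def undirected_projection(reactions):
    """Join every pair of metabolites that share a reaction."""
    edges = set()
    for substrates, products in reactions.values():
        participants = sorted(set(substrates) | set(products))
        for a, b in combinations(participants, 2):
            edges.add((a, b))
    return edges


def directed_projection(reactions):
    """Add an edge substrate -> product for every reaction."""
    edges = set()
    for substrates, products in reactions.values():
        for s, p in product(substrates, products):
            edges.add((s, p))
    return edges


undirected = undirected_projection(reactions)
directed = directed_projection(reactions)
```

In this toy case the undirected projection links A and B even though neither is converted into the other, while the directed projection keeps only substrate-to-product pairs. Neither result records which reaction an edge came from.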

Protein-protein interaction networks

Proteins do interact with one another and with other biomolecules, both large and small, but the interactions are not purely chemical. Proteins are long-chain molecules formed by the concatenation of a series of basic units called amino acids. Once created, a protein does not stay in a loose chain-like form, but folds on itself into a compact form whose shape depends on the amino acid sequence. The folded form dictates the physical interactions the protein can have with other molecules. Hence, the primary mode of protein-protein interaction is physical rather than chemical: the complicated folded shapes interlock to create so-called protein complexes, but without the exchange of particles that defines chemical reactions.

In a protein-protein interaction network the vertices are proteins and two vertices are connected by an undirected edge if the corresponding proteins interact. However, this representation omits useful information. Interactions that involve three or more proteins are represented by multiple edges, and there is no way to tell from the network itself that such edges represent aspects of the same interaction. This problem could be addressed by adopting a bipartite representation, with proteins and interactions as different types of vertex, and undirected edges connecting proteins to the interactions in which they participate.
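
The loss of information can be made concrete with a small Python sketch. The protein labels `P1` to `P3`, the interaction labels `I1` to `I3`, and the function names are hypothetical. The sketch compares a single three-protein interaction with three separate two-protein interactions under both representations.

```python
from itertools import combinations


def pairwise_edges(interactions):
    """Protein-protein representation: one undirected edge per interacting pair."""
    edges = set()
    for members in interactions.values():
        for a, b in combinations(sorted(members), 2):
            edges.add((a, b))
    return edges


def bipartite_edges(interactions):
    """Bipartite representation: one edge per (protein, interaction) membership."""
    return {(protein, name)
            for name, members in interactions.items()
            for protein in members}


# One interaction involving three proteins at once ...
one_complex = {"I1": {"P1", "P2", "P3"}}
# ... versus three separate two-protein interactions.
three_binary = {"I1": {"P1", "P2"}, "I2": {"P2", "P3"}, "I3": {"P1", "P3"}}

same_pairwise = pairwise_edges(one_complex) == pairwise_edges(three_binary)
same_bipartite = bipartite_edges(one_complex) == bipartite_edges(three_binary)
```

Here `same_pairwise` is true and `same_bipartite` is false: the pairwise network cannot tell the two situations apart, while the bipartite one can.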

Genetic regulatory networks

The small molecules needed by biological organisms, such as sugars and fats, are manufactured in the cell by the chemical reactions of metabolism. Proteins, however, which are much larger molecules, are manufactured in a different manner, following recipes recorded in the cell's genetic material, DNA.

Proteins are long-chain molecules formed by the concatenation of amino acids. The protein amino acid sequence is determined by a corresponding sequence stored in the DNA of the cell in which the protein is synthesized. This is the primary function of DNA, to act as an information storage medium containing the sequences of proteins needed by the cell. DNA is itself made up of units called nucleotides, of which there are four distinct species, denoted A, C, G, and T. The amino acids in proteins are encoded in DNA as trios of consecutive nucleotides called codons, such as ACG and TTT, and a succession of codons spells out the complete sequence of amino acids in a protein. A single strand of DNA can code for many proteins, and two special codons, called the start and stop codons, signal the beginning and end of the sequence coding for a protein. The DNA code for a single protein, from start to stop codon, is called a gene.
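
Reading a gene codon by codon can be sketched as follows. The sketch assumes the standard genetic code, in which ATG serves as the start codon and TAA, TAG, and TGA all act as stop codons; it also assumes the input is a single coding strand given as a string. The function name `read_gene` and the example sequence are hypothetical.

```python
START = "ATG"                  # start codon (standard genetic code, assumed)
STOPS = {"TAA", "TAG", "TGA"}  # stop codons (standard genetic code, assumed)


def read_gene(dna):
    """Return the codons from the first start codon through the first
    in-frame stop codon, or an empty list if no complete gene is found."""
    begin = dna.find(START)
    if begin == -1:
        return []
    codons = []
    # Step through the sequence three nucleotides at a time.
    for i in range(begin, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        codons.append(codon)
        if codon in STOPS:
            return codons
    return []  # ran off the end without meeting a stop codon


# Made-up sequence containing the example codons ACG and TTT.
gene = read_gene("CCATGACGTTTTAAGG")
```

This simplified reader only looks at the first start codon and ignores everything else on the strand.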

Proteins are created in the cell by a mechanism that operates in two stages. In the first stage, known as transcription, an enzyme makes a copy of the coding sequence of a single gene. The copy is made of RNA, another information-bearing molecule similar to DNA. In the second stage, called translation, the protein is assembled step by step from the RNA sequence. In the jargon of molecular biology, one says that the gene has been expressed.

The cell does not, in general, need to produce at all times every possible protein for which it contains a gene. Individual proteins serve specific purposes, such as catalyzing metabolic reactions, and it is important for the cell to respond to its environment by turning on or off the production of individual proteins. It does this by the use of transcription factors, which are themselves proteins. The presence in the cell of the transcription factor for a gene either turns on or enhances the expression of that gene, or inhibits it, depending on whether the transcription factor is promoting or inhibiting.

Here comes the network. Being proteins, transcription factors are themselves produced by transcription from genes, and the transcription process is regulated by other transcription factors, which again are proteins, and so forth. The complete set of such interactions forms a genetic regulatory network. The vertices in this network are proteins, or equivalently the genes that code for them, and a directed edge from gene A to gene B indicates that A regulates the expression of B. One can distinguish between promoting and inhibiting transcription factors, giving the network two distinct types of edges.
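
One simple way to store such a network, offered here only as a sketch, is a list of directed edges labeled with their type. The gene names, the "+" and "-" labels, and the function `regulators_of` are hypothetical choices made for this example.

```python
# Hypothetical regulatory interactions: (regulator, target, effect).
# "+" marks a promoting transcription factor, "-" an inhibiting one.
edges = [
    ("geneA", "geneB", "+"),
    ("geneA", "geneC", "-"),
    ("geneB", "geneC", "+"),
    ("geneC", "geneA", "-"),
]


def regulators_of(target, edges):
    """Return (promoters, inhibitors): the genes with an edge into target,
    split by edge type."""
    promoters = {src for src, dst, sign in edges
                 if dst == target and sign == "+"}
    inhibitors = {src for src, dst, sign in edges
                  if dst == target and sign == "-"}
    return promoters, inhibitors


promoters_of_c, inhibitors_of_c = regulators_of("geneC", edges)
```

In this toy network geneC is promoted by geneB and inhibited by geneA, and geneC in turn inhibits geneA, which illustrates how regulation can loop back on itself.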

Neural networks

One of the main functions of the brain is to process information and the primary information processing element is the neuron, a specialized brain cell that combines several inputs to generate a single output.

A typical neuron consists of a cell body or soma, along with a number of protruding tentacles, called dendrites, which are input wires for carrying signals into the cell. Most neurons have only one output, called the axon, which is typically longer than the dendrites. It usually branches near its end into axon terminals to allow the output of the cell to feed the inputs of several others. There is a small gap, called a synapse, between terminal and dendrite, across which the output signal of the first (presynaptic) neuron must be conveyed to reach the second (postsynaptic) neuron.

At the simplest level, a neural network can be represented as a set of vertices, the neurons, connected by two types of directed edges, one for excitatory inputs and one for inhibitory inputs. In practice, neurons are not all the same. This variation can be encoded in our network representation by different types of vertices.

Current science cannot tell us exactly how the brain performs the more sophisticated cognitive tasks that allow animals to survive, but it is known that the brain constantly changes the pattern of wiring between neurons in response to inputs and experience, and it is presumed that this pattern - the neural network - holds much of the secret.

Ecological networks

Ecological networks are networks of ecological interactions between species. Species in an ecosystem can interact in different ways: they can eat one another, they can parasitize one another, or they can have any of a variety of mutually advantageous interactions, such as pollination or seed dispersal.

Food webs are the most studied ecological networks. The biological organisms on our planet can be divided into ecosystems, groups of organisms that interact with one another and with elements of their environment. A food web is a directed network that represents which species prey on which others in a given ecosystem. The vertices of the network correspond to species and the directed edges to predator-prey interactions. In fact, ecologists conventionally draw edges in the opposite direction, from prey to predator, that is, in the direction of energy flow. In some cases attempts have been made to measure not merely the presence of interactions between species but also their strength, for instance as the fraction of energy that a species derives from each of its prey. The result is a weighted directed network that sheds more light on the flow of energy through an ecosystem.
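
As a sketch of the weighted variant, the following Python turns energy intake measurements into edge weights, so that each edge from prey to predator carries the fraction of the predator's energy that comes from that prey. The species names and intake numbers are made up for illustration, and `diet_fractions` is a hypothetical function name.

```python
def diet_fractions(intake):
    """Convert {predator: {prey: energy}} into weighted directed edges
    {(prey, predator): fraction of the predator's energy from that prey}."""
    weights = {}
    for predator, sources in intake.items():
        total = sum(sources.values())
        if total == 0:
            continue  # no measured intake, so no weights can be assigned
        for prey, energy in sources.items():
            weights[(prey, predator)] = energy / total
    return weights


# Made-up measurements in arbitrary energy units.
intake = {
    "fox": {"rabbit": 30.0, "mouse": 10.0},
    "rabbit": {"grass": 5.0},
}

weights = diet_fractions(intake)
```

With this normalization the weights on the edges entering each predator sum to one, and the edges point in the direction of energy flow.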

Food webs are approximately directed acyclic graphs (DAGs). The acyclic nature of food webs indicates that there is an intrinsic hierarchy among the species in an ecosystem. Those higher up the hierarchy prey on those lower down, but not vice versa, although there are some counterexamples. Ecologists call a species' position in this hierarchy its trophic level. This is the rank of the species' vertex in the acyclic food web, that is, the length of the longest path leading to the vertex representing the species. Species at lower trophic levels tend to be smaller and more abundant, while those at higher trophic levels are usually larger-bodied, less numerous predators.
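
The definition of trophic level as a longest path can be sketched in Python as follows, assuming that edges point from prey to predator and that the web is strictly acyclic. The function name `trophic_levels` and the species in the example are hypothetical. Because real food webs are only approximately acyclic, the sketch raises an error if it meets a cycle instead of guessing a level.

```python
def trophic_levels(edges):
    """Return {species: length of the longest path leading to it}.

    edges: iterable of (prey, predator) pairs forming a DAG.
    Species with no prey get level 0.
    """
    prey_of = {}
    for prey, predator in edges:
        prey_of.setdefault(predator, set()).add(prey)
        prey_of.setdefault(prey, set())

    levels = {}
    in_progress = set()  # species on the current recursion path

    def level(species):
        if species in levels:
            return levels[species]
        if species in in_progress:
            raise ValueError(f"food web contains a cycle through {species!r}")
        in_progress.add(species)
        # One step above the highest-ranked prey; 0 if there is no prey.
        result = max((level(p) + 1 for p in prey_of[species]), default=0)
        in_progress.discard(species)
        levels[species] = result
        return result

    for species in prey_of:
        level(species)
    return levels


# Made-up food web, edges drawn from prey to predator.
web = [
    ("grass", "rabbit"), ("grass", "mouse"),
    ("rabbit", "fox"), ("mouse", "fox"),
    ("mouse", "eagle"), ("fox", "eagle"),
]

levels = trophic_levels(web)
```

In this example the eagle sits at level 3, because its longest chain runs through grass, a herbivore, and the fox, even though it also eats mice directly.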

Other ecological networks include host-parasite networks and mutualistic networks. Host-parasite networks are networks of parasitic relationships between organisms. Parasites tend to be smaller-bodied than their hosts, and parasites can live off their hosts without killing them. Host-parasite networks are directed acyclic networks. Mutualistic networks are networks of mutually beneficial interactions between species. These include networks of plants and the animals (insects) that pollinate them, networks of plants and the animals (birds) that disperse their seeds, and networks of ant species and the plants that they protect and eat. Mutualistic networks can be represented as undirected bipartite graphs.