Dictionary:

gene

  (jēn)
n.

A hereditary unit consisting of a sequence of DNA that occupies a specific location on a chromosome and determines a particular characteristic in an organism. Genes undergo mutation when their DNA sequence changes.

[German Gen, from gen-, begetting, in Greek words (such as genos, race, offspring).]


 
 

The basic unit in inheritance. There is no general agreement as to the exact usage of the term, since several criteria that have been used for its definition have been shown not to be equivalent.

The facts of mendelian inheritance indicate the presence of discrete hereditary units that replicate at each cell division, producing remarkably exact copies of themselves, and that in some highly specific way determine the characteristics of the individuals that bear them. The evidence also shows that each of these units may at times mutate to give a new equally stable unit (called an allele), which has more or less similar but not identical effects on the characters of its bearers. These hereditary units are the genes, and the criteria for the recognition that certain genes are alleles have been that they (1) arise from one another by a single mutation, (2) have similar effects on the characters of the organism, and (3) occupy the same locus in the chromosome. It has long been known that there were a few cases where these criteria did not give consistent results, but these were explained by special hypotheses in the individual cases. However, such cases have been found to be so numerous that they appear to be the rule rather than the exception. See also Allele; Gene action; Mendelism; Mutation; Recombination (genetics).

The term gene, or cistron, may be used to indicate a unit of function. The term is used to designate an area in a chromosome made up of subunits present in an unbroken unit to give their characteristic effect. See also Chromosome.

Every gene consists of a linear sequence of bases in a nucleic acid molecule. Genes are specified by the sequence of bases in DNA in prokaryotic, archaeal, and eukaryotic cells, and in DNA or ribonucleic acid (RNA) in prokaryotic or eukaryotic viruses. The ultimate expressions of gene function are the formation of structural and regulatory RNA molecules and proteins. These macromolecules carry out the biochemical reactions and provide the structural elements that make up cells. See also Deoxyribonucleic acid (DNA); Nucleic acid; Ribonucleic acid (RNA); Virus.

One goal of molecular biology is to understand the function, expression, and regulation of a gene in terms of its DNA or RNA sequence. The genetic information in genes that encode proteins is first transcribed from one strand of DNA into a complementary messenger RNA (mRNA) molecule by the action of the RNA polymerase enzyme. Many kinds of eukaryotic and a limited number of prokaryotic mRNA molecules are further processed by splicing, which removes intervening sequences called introns. In some eukaryotic mRNA molecules, certain bases are also changed posttranscriptionally by a process called RNA editing. The genetic code in the resulting mRNA molecules is translated into proteins with specific amino acid sequences by the action of the translation apparatus, consisting of transfer RNA (tRNA) molecules, ribosomes, and many other proteins. The genetic code in an mRNA molecule is the correspondence of three contiguous (triplet) bases, called a codon, to the common amino acids and translation stop signals; the bases are adenine (A), uracil (U), guanine (G), and cytosine (C). There are 61 codons that specify the 20 common amino acids, and 3 codons that lead to translation stopping. See also Genetic code; Intron.
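The codon counts quoted above can be checked with a short sketch. It writes out the standard genetic code in the conventional U, C, A, G ordering; the names `CODON_TABLE`, `sense_codons`, and `stop_codons` are illustrative only and do not come from any particular library.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G order for
# each codon position; "*" marks a translation stop signal.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Codons that specify an amino acid, codons that stop translation,
# and the set of distinct amino acids (one-letter codes).
sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]
stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
amino_acids = {aa for aa in CODON_TABLE.values() if aa != "*"}

print(len(sense_codons), len(stop_codons), len(amino_acids))  # 61 3 20
print(stop_codons)  # ['UAA', 'UAG', 'UGA']
```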

In many cases, the genes that mediate a specific cellular or viral function can be isolated. The recombinant DNA methods used to isolate a gene vary widely depending on the experimental system, and genes from RNA genomes must be converted into a corresponding DNA molecule by biochemical manipulation using the enzyme reverse transcriptase. The isolation of the gene is referred to as cloning, and allows large quantities of DNA corresponding to a gene of interest to be isolated and manipulated.

After the gene is isolated, the sequence of the nucleotide bases can be determined. The goal of the large-scale Human Genome Project is to sequence all the genes of several model organisms and humans. The sequence of the region containing the gene can reveal numerous features. If a gene is thought to encode a protein molecule, the genetic code can be applied to the sequence of bases determined from the cloned DNA. The application of the genetic code is done automatically by computer programs, which can identify the sequence of contiguous amino acids of the protein molecule encoded by the gene. If the function of a gene is unknown, comparisons of its nucleic acid or predicted amino acid sequence with the contents of huge international databases can often identify genes or proteins with analogous or related functions. These databases contain all the known sequences from many prokaryotic, archaeal, and eukaryotic organisms. Putative regulatory and transcript-processing sites can also be identified by computer. These putative sites, called consensus sequences, have been shown to play roles in the regulation and expression of groups of prokaryotic, archaeal, or eukaryotic genes. However, computer predictions are just a guide and not a substitute for analyzing expression and regulation by direct experimentation. See also Genetic engineering; Human Genome Project; Molecular biology.
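As a rough illustration of how a program might apply the genetic code to a cloned DNA sequence, the sketch below scans one strand for the longest stretch that starts with ATG and ends at a stop codon, then reports the encoded amino acids. The function name `longest_orf` is hypothetical, and the approach is deliberately simplified: it ignores the complementary strand and introns, and it assumes the input contains only A, C, G, and T.

```python
from itertools import product

# Standard genetic code in DNA letters (T in place of U); "*" is a stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def longest_orf(dna: str) -> str:
    """Return the protein encoded by the longest ATG-to-stop reading frame."""
    dna = dna.upper()
    best = ""
    for start in range(len(dna) - 2):
        if dna[start:start + 3] != "ATG":
            continue
        protein = []
        for i in range(start, len(dna) - 2, 3):
            aa = CODON_TABLE[dna[i:i + 3]]
            if aa == "*":
                # Only frames that terminate in a stop codon are counted.
                if len(protein) > len(best):
                    best = "".join(protein)
                break
            protein.append(aa)
    return best


print(longest_orf("CCATGGCTTGGAAATAGGC"))  # MAWK
```

A predicted amino acid sequence of this kind is what would then be compared against sequence databases; as the text notes, such predictions are a guide rather than a substitute for experiment.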


 

A gene is part of a DNA molecule within the nucleus of all cells. Each gene codes for a particular protein. Thus a gene is a unit of the inheritable characteristics of the organism. Humans have tens of thousands of different genes; these determine the phenotype of the individual.

— Alan W. Cuthbert

See cell; gene therapy; genetic testing; genetics, human.

 

In the early 1860s, Gregor Mendel developed the concept of the gene to help explain results obtained while crossbreeding strains of garden peas. He identified physical characteristics (phenotypes), such as plant height and seed color, that could be passed on, unchanged, from one generation to the next. The hereditary factor that predicted the phenotype was termed a "gene." Mendel hypothesized that genes were inherited in pairs, one from the male and one from the female parent. Plants that bred true (homozygotes) had inherited identical genes from their parents, whereas plants that did not breed true (hybrids, or heterozygotes) inherited alternative copies of the genes (alleles) from one parent that were similar, but not identical, to those from the other parent.

Some of these alleles had a greater effect on the phenotypes of hybrids than others. For example, if a single copy of a given allele was sufficient to produce the same phenotype seen in homozygous organisms, that gene was termed a "dominant." Conversely, if the allele could only be detected in the minority of the offspring of hybrid parents that were homozygous for that "weaker" allele, the gene was termed a "recessive." Based on these observations, Mendel formulated a series of laws that are the basis of what we now term "Mendelian" inheritance patterns.

The "law of unit inheritance" holds that factors retain their identity from generation to generation and do not blend in the hybrid. The "law of segregation" states that two members (alleles) of a single pair of genes are never found in the same mature sperm or ovum (gamete) but always separate out (segregate). Finally, the "law of independent assortment" holds that members of different pairs of genes (nonalleles) are sorted out (assort) independently to different gametes.
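The law of segregation can be illustrated with a small sketch of a single-gene cross. It assumes one gene with two alleles, written as an uppercase letter for the dominant allele and a lowercase letter for the recessive one; the function name `cross` is illustrative.

```python
from collections import Counter
from itertools import product


def cross(parent1: str, parent2: str) -> Counter:
    """Count offspring genotypes for a single-gene cross, e.g. "Aa" x "Aa"."""
    # Each parent passes one of its two alleles to a gamete (segregation).
    # Sorting puts the uppercase allele first, so "aA" and "Aa" are merged.
    return Counter("".join(sorted(pair)) for pair in product(parent1, parent2))


genotypes = cross("Aa", "Aa")

# One copy of the dominant (uppercase) allele is enough to show its phenotype.
dominant = sum(n for g, n in genotypes.items() if g[0].isupper())
recessive = sum(n for g, n in genotypes.items() if g.islower())

print(dict(genotypes))      # {'AA': 1, 'Aa': 2, 'aa': 1}
print(dominant, recessive)  # 3 1
```

Crossing two hybrids therefore gives the familiar 1:2:1 genotype ratio and a 3:1 ratio of dominant to recessive phenotypes.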

Almost a century later, in 1953, Watson and Crick solved the structure of the DNA molecule and helped explain how this genetic information could be encoded in a polymer, deoxyribonucleic acid (DNA), which was found in the nucleus of the cell. They demonstrated that DNA is a double-stranded polymer consisting of two linear arrays of diverse purine (adenine [A] and guanine [G]) and pyrimidine (thymine [T] and cytosine [C]) bases. Each purine or pyrimidine on one strand pairs with a complementary base (A:T and G:C) on the other strand. Each strand is thus complementary to the other. The two antiparallel polynucleotide strands are gently twisted to form what is termed a "double helix."
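The pairing rules (A:T and G:C) and the antiparallel arrangement mean that either strand fully determines the other. A minimal sketch, with an illustrative function name:

```python
# Watson-Crick pairing: A pairs with T, and G pairs with C.
COMPLEMENT = str.maketrans("ATGC", "TACG")


def reverse_complement(strand: str) -> str:
    """Return the complementary strand, read in its own 5'-to-3' direction."""
    # Complement each base, then reverse, because the strands are antiparallel.
    return strand.upper().translate(COMPLEMENT)[::-1]


print(reverse_complement("ATGCCG"))  # CGGCAT
```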

In humans, the nucleus of each somatic cell contains twenty-three pairs of chromosomes, which are formed by tightly coiled DNA strands. Twenty-two of these pairs are found in the cells of both men and women. These chromosomes are termed "autosomes," and they are numbered by size from 1 (the largest) to 22 (the smallest). The twenty-third pair determines the sex of the individual, and these two chromosomes are thus termed the "sex chromosomes." Women have a pair of X chromosomes, whereas men have a single X chromosome, which they inherit from their mother, and a single Y chromosome, which they inherit from their father. The Y chromosome is dominant for maleness.

During "mitosis," the DNA double strand is unwound and split apart. Each individual strand is then duplicated. By making copies of each DNA strand, a parental cell can transmit a complete set of genetic information into each of its two daughter cells.

Gametes result from "meiosis," which differs from mitosis in two ways. First, allelic chromosomes are paired prior to their duplication. Second, there are two sets of divisions before the final product, the gamete, is created. In the first set of divisions after DNA duplication, allelic chromosomes, rather than chromatids, segregate into the daughter cells. In the second set of divisions, the chromatids separate and segregate into the gamete. Thus, one and only one copy of each allelic pair is contributed to the gamete. In this way, a "diploid" germ cell gives rise to a "haploid" sperm or egg that contains an assortment of one of each of the twenty-three pairs of allelic chromosomes in the parental cell. During fertilization, a sperm and an egg unite to create a zygote with a newly constituted complete set of forty-six chromosomes. These fundamental properties of DNA and cell division are the basis of Mendel's laws of unit inheritance, segregation, and independent assortment.

The classical "one gene, one polypeptide" view of molecular genetics holds that each gene encodes one polypeptide, forming a monomeric protein. The portion of the gene that specifies the polypeptide sequence is termed "coding" DNA. Each human cell contains approximately 3.9 × 10^9 base pairs of DNA per haploid genome, which is enough to encode about 1 million polypeptides of average length. However, there are approximately 35,000 structural genes (possibly as few as 30,000) in humans; thus more than 90 percent of DNA does not encode peptide sequences. The DNA that does not code for protein, termed "noncoding" DNA, is often involved in the regulation of gene expression. Noncoding DNA can also play a structural role. Structural functions include providing structural stability for the chromosome (e.g., matrix-associated regions, or MARs), providing the specialized sequences that define the ends of the chromosome (telomeres), and providing a site to which the cellular cytoskeleton can be attached in order to allow the movement of chromosomes during meiosis and mitosis (centromeres). Approximately 10 percent of cellular DNA consists of a repetitive sequence that has been randomly inserted throughout the genome. Although the function of this repetitive DNA is unknown, its presence has proven useful for gene mapping studies.
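The arithmetic behind these figures can be laid out explicitly. The sketch below uses only the numbers quoted in the paragraph (3.9 × 10^9 base pairs, about 1 million polypeptides, about 35,000 genes); the "average length" it derives is simply what those numbers imply, not an independently measured value.

```python
GENOME_BP = 3_900_000_000      # base pairs per haploid genome (from the text)
MAX_POLYPEPTIDES = 1_000_000   # polypeptides the genome could encode (from the text)
STRUCTURAL_GENES = 35_000      # approximate number of structural genes (from the text)

# Implied coding length of an "average" polypeptide, in base pairs and codons.
bp_per_polypeptide = GENOME_BP / MAX_POLYPEPTIDES   # 3900 base pairs
codons_per_polypeptide = bp_per_polypeptide / 3     # 1300 codons

# Fraction of the genome needed to encode the actual number of genes.
coding_fraction = STRUCTURAL_GENES * bp_per_polypeptide / GENOME_BP
noncoding_fraction = 1 - coding_fraction

print(f"{bp_per_polypeptide:.0f} bp, {codons_per_polypeptide:.0f} codons")
print(f"coding {coding_fraction:.1%}, noncoding {noncoding_fraction:.1%}")
```

On these assumptions only about 3.5 percent of the genome would be coding, which is consistent with the statement that more than 90 percent of DNA does not encode peptide sequences.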

Genetic information proceeds in a stepwise fashion from the sequence of a gene to the synthesis of a polypeptide. Located near the coding sequence of the gene are sequences, called DNA control regions, that identify the transcription start site (promoters), mark the tissue in which it will be expressed (enhancers), and control the use of batteries of genes during ontogeny (locus control regions). The regions of DNA that specify the sequence of a polypeptide chain, or structural genes, are organized into discrete units (exons) that are separated by noncoding sequences (introns). The first step in synthesizing a new protein occurs in the nucleus, where the sequence of the coding DNA is copied (transcribed) into ribonucleic acid (RNA), a less stable nucleic acid that can be rapidly degraded. The ends of the RNA are modified to help stabilize the final product and the introns are removed, or spliced out, generating messenger ribonucleic acid (mRNA). The mRNA is transported from the nucleus to the cytoplasm, where it is translated by ribosomes into polypeptide strands.

Ribosomes read the sequence of the mRNA in sequential groups of three, or triplets, each termed a codon. There are sixty-four different combinations (e.g., AAA, UUU, CAC), all but three of which specify an amino acid. Each codon specifies a single amino acid, but amino acids can be encoded by more than one codon; thus there is considerable degeneracy in the code. Translation begins when the mRNA is bound to the ribosome. Transfer RNA (tRNA), an adapter molecule, contains a complementary triplet anticodon at one end, and an amino acid bound to the other end. The tRNA anticodon binds to the mRNA codon and helps stabilize the interaction with the ribosome. Each ribosome has two sites where the tRNA can bind. Binding of the downstream tRNA, which contains sequence complementary to the next three-nucleotide codon on the mRNA, brings its amino acid next to the end of the growing polypeptide strand. Formation of a peptide bond allows the ribosome to shift down the mRNA, providing a site for the next amino acid and its adapter to bind. Step by step, the protein is allowed to grow until the mRNA brings one of the three remaining codons into the ribosome. These codons do not have tRNA partners, and they function to terminate translation and allow the release from the ribosome of the mRNA and its protein product.
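The codon-by-codon reading described above can be sketched as a loop that adds one amino acid per triplet and halts at the first stop codon. The function name `translate` and the example sequence are illustrative; the sketch assumes the mRNA is already in frame and starts at its first codon.

```python
from itertools import product

# Standard genetic code in U, C, A, G order; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Read an in-frame mRNA three bases at a time until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codons have no tRNA partner, so the chain is released
        protein.append(amino_acid)
    return "".join(protein)


print(translate("AUGGCUUGGAAAUAGGGC"))  # MAWK
```

The trailing GGC in the example is never read, because translation ends at UAG.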

Many genes are composed of a series of structural or functional domains, with each exon specifying part or all of the sequence of a single structural domain. Each domain can endow the protein with a different property. For example, a protein may have one or more extracellular domains that allow it to bind to a specific soluble ligand, a transmembrane domain that allows it to be anchored in the cell membrane, and one or more intracellular domains that allow it to signal inside the cell. These types of proteins are the product of mixing and matching different types of domains during evolution, a process that is facilitated by the exon/intron structure of the gene. By changing the extracellular domains while maintaining the rest of the molecule relatively intact, for example, a similar signal can be elicited by the binding of several different types of ligands. Conversely, the presence or absence of a transmembrane domain can allow the protein to be tethered to the cell or to exist as a soluble factor. The function of an unknown protein can often be guessed by analyzing its complement of domains.

At first glance, the linking of genes in chromosomal units and their transmission as a unit to daughter cells would seem to violate Mendel's laws of independent assortment and segregation, because effectively one might expect genes to be inherited as part of only 23 sets of genes. However, when allelic chromosomes are brought into close juxtaposition during the process of meiosis, breaks occur in the chromosomes and allow bridges, or chiasmata, to form between homologous portions of the chromosomes. This crossing over of DNA strands allows allelic chromosomes to recombine, forming patchwork or chimeric chromosomes that contain portions of each of the parental chromosomes. Although recombination can occur anywhere in the chromosome, only a limited number of chiasmata form during each meiosis. Two genes that are on opposite ends of the chromosome may thus behave as if they were on different chromosomes, whereas recombination is less likely between genes that are very close to each other in their primary sequence. The increased frequency of the joint inheritance of two genes that are closely physically linked on a chromosome is termed "linkage disequilibrium."

Distances between genes on a chromosome are quantified by either their physical distance from each other in millions of base pairs (megabases), or by their genetic distance, as measured by the frequency of recombination between the two genes per generation. One percent of genetic recombination is termed a "centimorgan," after the geneticist Thomas Hunt Morgan, whose studies of the common fruitfly, Drosophila, in the first half of the twentieth century helped elucidate the properties of recombination. As a rough guide, one centimorgan covers approximately one megabase of DNA. However, the relationship between linear and genetic distance is not absolute. The frequency of recombination, and thus the genetic distance between genes in specific regions of the genome, may differ depending on the sequence or the nonhistone proteins that cover the DNA. Recombination frequencies in selected regions of the genome may differ in male and female gametes, implying that segments of chromosomes can be handled differently by spermatogonia and oocytes. This disparity in how DNA is treated by male and female gametes can lead to differences in the function of alleles, depending on whether they have been inherited from the mother or the father, a process termed "imprinting."
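The definition of the centimorgan can be made concrete with a small calculation. The offspring counts below are invented for illustration, and the function name is hypothetical. The conversion to megabases uses only the rough guide quoted in the text, and a recombination frequency tracks map distance well only when the two genes are fairly close together.

```python
CM_PER_MEGABASE = 1.0  # rough guide from the text: 1 centimorgan per megabase


def genetic_distance_cm(recombinants: int, total_offspring: int) -> float:
    """Recombination frequency between two genes, in centimorgans (percent)."""
    if total_offspring <= 0 or not 0 <= recombinants <= total_offspring:
        raise ValueError("counts must satisfy 0 <= recombinants <= total > 0")
    return 100.0 * recombinants / total_offspring


# Illustrative data: 12 recombinant offspring out of 400 scored.
distance_cm = genetic_distance_cm(12, 400)
approx_megabases = distance_cm / CM_PER_MEGABASE

print(distance_cm, approx_megabases)  # 3.0 3.0
```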

A "mutation" is defined as a stable, heritable alteration in the DNA sequence that can be passed from a parental cell to at least one of its daughters. From the standpoint of evolution, mutations are required to generate the genetic diversity that is needed to permit species to adapt to a changing environment. The normal rate of mutation is approximately one base pair change per generation per 10^7 base pairs; thus, on average, each child differs from its parent by approximately 390 base pairs as a result of mutations in the gametes. Mutations in the nonreproductive cells of the body are termed "somatic" mutations. Although by definition these alterations are not transmitted to the gametes, the mutations are passed on to the daughter cells of the mutated parent. Somatic mutations in oncogenes, for example, foster the development of many cancers.
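The figure of roughly 390 new base-pair changes follows directly from the stated rate of one change per 10^7 base pairs and the genome size of 3.9 × 10^9 base pairs given elsewhere in this entry:

```python
GENOME_BP = 3_900_000_000    # base pairs per haploid genome (from this entry)
BP_PER_MUTATION = 10**7      # one change per 10^7 base pairs per generation

expected_changes = GENOME_BP / BP_PER_MUTATION
print(expected_changes)  # 390.0
```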

Mutations can involve an entire human genome, as in triploidy, in which a third copy of the entire chromosomal complement occurs. Mutations may involve all or part of a single chromosome, including duplications, deletions, and translocations of a portion of one chromosome to another. At the other extreme, a mutation can be minute and involve a small deletion or insertion, or a replacement of only a single base pair (point mutation). Deletions or insertions that occur in a coding region can alter the reading frame distal to the mutation (frameshift mutations). Frameshift mutations frequently alter the protein sequence and can lead to premature peptide termination by generating a stop codon, one of the three triplet sequences that do not encode an amino acid. Point mutations in coding regions may be of three types: (1) a nonsense mutation (about 4% of base substitutions in coding regions), in which the base change generates one of the three termination codons; (2) a missense, or replacement, mutation (about 73% of base substitutions in coding regions), in which the base change results in substitution of one amino acid for another; and (3) a synonymous, or silent, mutation (about 23% of random base substitutions in coding regions), in which the base replacement does not lead to a change in the amino acid but only to a different codon for the same amino acid. Even synonymous mutations can have deleterious effects, however. A change in the coding sequence of a given gene may alter splicing patterns or diminish mRNA stability, reducing protein production.
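The three classes of point mutation can be told apart mechanically by comparing the amino acid encoded before and after the change. The sketch below does this for a single codon and then tallies every possible single-base substitution in the 61 sense codons. The function name is illustrative, and the tally weights all substitutions and all codons equally, which real genes do not, so it only approximates the percentages quoted above.

```python
from collections import Counter
from itertools import product

# Standard genetic code in U, C, A, G order; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon: str, position: int, new_base: str) -> str:
    """Classify a single-base change in a sense codon (position is 0, 1, or 2)."""
    before = CODON_TABLE[codon]
    if before == "*":
        raise ValueError("expected a sense codon, got a stop codon")
    mutated = codon[:position] + new_base + codon[position + 1:]
    after = CODON_TABLE[mutated]
    if after == "*":
        return "nonsense"
    return "synonymous" if after == before else "missense"


print(classify_substitution("UGG", 2, "A"))  # nonsense   (Trp -> stop)
print(classify_substitution("GAA", 1, "U"))  # missense   (Glu -> Val)
print(classify_substitution("GGU", 2, "C"))  # synonymous (Gly -> Gly)

# Unweighted tally over every single-base change in every sense codon.
tally = Counter(
    classify_substitution(codon, pos, base)
    for codon, aa in CODON_TABLE.items() if aa != "*"
    for pos in range(3)
    for base in BASES if base != codon[pos]
)
total = sum(tally.values())
for kind, count in sorted(tally.items()):
    print(f"{kind}: {count} of {total} ({count / total:.0%})")
```

Under this equal-weighting assumption the proportions come out near, though not exactly at, the percentages given in the text.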

The consequences of a single-point mutation to the function of a given protein can vary greatly. Enzymes, for example, exhibit a hierarchy of resistance to mutation. Portions of the hydrophilic exterior may serve primarily to allow the protein to be soluble in an aqueous solution, hence changes in the amino acid sequence that preserve hydropathicity may have little or no effect on the function of the protein. The hydrophobic core provides structural stability for the molecule, and amino acid changes may result in an unstable protein product that is temperature sensitive (e.g., falling apart at high temperature). Finally, the catalytic site is exquisitely sensitive, and a single mutation may completely abolish function.

Large deletions may interrupt a coding region and cause an absence of one or more closely linked protein products. If the deletion removes a bridge between two coding regions, the result may be a fusion or hybrid protein containing the initial sequence of one protein and the terminal portion of the other. Such deletions can also result from unequal crossing-over between homologous genes. Finally, alterations of the DNA in the surrounding regions may lead to changes in RNA splicing, transcriptional efficiency, or control of tissue expression.

The Human Genome Project began in 1990 with the goals of developing genetic and physical maps and determining the complete DNA sequence of the human genome. The ultimate goal is to use this mapping and sequence information to isolate and study the structure and function of genes that can contribute to the development of disease. Knowledge of the genetic basis of susceptibility for specific diseases is likely to aid in disease prevention as well as therapy. Associated with these benefits, however, is the risk of discrimination against healthy at-risk individuals that may never develop a disorder. Thus, in addition to learning how to use this new knowledge, we must gain the wisdom to use genetic information appropriately.

(SEE ALSO: Genetic Disorders; Genetics and Health; Human Genome Project; Medical Genetics)


— HARRY W. SCHROEDER, JR.



 

Genes are functional units of DNA that contain the instructions for making proteins or RNA. Genes also act as units of heredity, transferring the same instructions from parent to offspring. The nature, structure, and regulation of genes has been a central topic of scientific research for more than 100 years.

History of the Gene and Structure of DNA

Genes were first defined as units of hereditary transmission. The name "gene" was coined by Wilhelm Johannsen in 1909, although the concept of a discrete unit governing inherited characteristics goes back at least to Gregor Mendel in 1861. The work of Thomas Hunt Morgan and his colleagues established that genes were located on chromosomes, and in the mid-1940s Oswald Avery demonstrated that genes were composed of DNA (deoxyribonucleic acid). Since that time, some types of viruses have been discovered that use ribonucleic acid (RNA) instead of DNA, but here we shall concentrate on DNA genes. The discovery of the structure of DNA in 1953 by James Watson and Francis Crick set the stage for the next fifty years of research into gene structure, function, and regulation.

DNA is a linear molecule composed of subunits called nucleotides. Each nucleotide is made of a sugar and phosphate group, plus a chemical base, of which there are four types: adenine, thymine, guanine, and cytosine (A, T, G, C). Nucleotides are typically referred to by the name of their base. DNA exists as a pair of strands, wound around one another into a double helix, with the bases directed into the center. The structure and charges of the bases dictate that A on one strand can match up only with T on the other, and C only with G. This complementarity provides the basis for faithful replication of the entire DNA molecule.

Genes Code for Protein and RNA

While all genes are made of DNA, not all stretches of DNA act as genes. Indeed, in eukaryotic organisms, most of the DNA does not function as genes, meaning it is not the code for making proteins or RNA. Some DNA outside of genes has a structural role, some consists of remnants of old genes that are now functionless, and much of it appears to be "junk," inserted and copied by viruslike sequences. Within a gene, usually only one side of the double helix actually codes for product; the other side is silent. Which side of the helix acts as code varies from gene to gene.

Almost all genes code for proteins. Proteins are strings of amino acids, and the sequence of nucleotides in the gene dictates the sequence of amino acids in the protein. Proteins perform almost all the functions in cells, and can be grouped into four major classes: they act as enzymes that control the rate of chemical reactions in the cell; they form structural components of organelles, membranes, and other cell components; they receive and transmit signals between and within cells; or they act as regulators of genes by latching onto DNA, thereby increasing or decreasing the rate at which the gene is used, or "expressed."

Genes vary in length. The largest human gene is 2.5 million base pairs in length, and codes for the muscle protein named dystrophin, which is more than 3,500 amino acids long. Eukaryotic genes generally produce proteins of about 150 to 3,000 amino acids in length. Some genes are relatively small, as in prokaryotes, which produce proteins of 50 to 300 amino acids. Most eukaryotic protein-coding genes are present in only two copies per genome, occurring in the same position on homologous chromosomes, one of which is received from each parent. If the two copies differ slightly they are called alleles. Changes in nucleotide sequences are termed mutations or polymorphisms, depending on their effect.

Some genes code not for protein but for RNA molecules that have their own functions within the cell. These include the transfer RNAs, ribosomal RNAs, and a variety of other smaller RNAs with roles in the nucleus. RNA-coding genes are usually present in multiple copies per eukaryotic genome.

Gene Expression

Expression of protein-coding genes begins with the process of transcription. During transcription, the helix is unwound, and an enzyme (RNA polymerase) binds to the DNA. It then moves along the DNA, and beginning slightly "downstream" at the so-called initiation site, it copies one of the strands to form a molecule of RNA. Transcription ceases when the polymerase reaches a special DNA sequence called the termination site, usually a region high in G-Cs followed by A-Ts.
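The copying step can be sketched as base-by-base complementation of the template strand, with uracil (U) taking the place of thymine in the RNA. The function name is illustrative, and the sketch assumes the template is written in the 3'-to-5' direction so that the RNA reads 5'-to-3'. Promoter binding, initiation, and termination signals are not modeled.

```python
# Pairing used during transcription: A->U, T->A, G->C, C->G.
RNA_COMPLEMENT = str.maketrans("ATGC", "UACG")


def transcribe(template_3_to_5: str) -> str:
    """Build the RNA complementary to a template strand written 3'-to-5'."""
    return template_3_to_5.upper().translate(RNA_COMPLEMENT)


print(transcribe("TACCGA"))  # AUGGCU
```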

In prokaryotes, this RNA product is ready to use for protein synthesis, and is called messenger RNA (mRNA). After the mRNA of a gene is formed, it is used by the cell in protein synthesis (translation) at the ribosomes.

Thus, the prokaryotic gene consists of an RNA polymerase binding site (called the "promoter"), a transcription initiation site, the coding region, and a termination signal. The initiation site should not be confused with the start signal for protein synthesis, nor the termination site with the stop signal in protein synthesis. Each of the translation signals is within the coding region, or "open reading frame," of the gene.

Eukaryotic Genes

In eukaryotic cells, genes are more complex. It was discovered in 1977 that eukaryotic genes are functionally separated into coding segments called exons, which are interrupted by noncoding sequences of DNA called introns. The entire region between the initiation and termination sites is transcribed, including the introns, to form the primary transcript. This must then be processed by special enzymes that cut out the introns and splice together the exons to form an mRNA. The mRNA is then exported from the nucleus for translation.

The existence of introns allows for the creation of multiple proteins from one gene, by the use or exclusion of different exons. Such alternative splicing gives rise to protein "isoforms," highly similar but slightly different proteins, with functions that vary as well. Isoforms are typically tissue-specific. For example, the muscle enzyme creatine kinase exists in one form in the heart, and another form in the skeletal muscles (such as the biceps), which have different ends formed through use of different exons. Even though it codes for two or more proteins, most scientists call such a DNA sequence a single gene.
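Splicing and alternative splicing can be sketched as joining chosen segments of a primary transcript. The sequence, the exon coordinates, and the function name below are all invented for illustration; in the example, exons are written in uppercase and introns in lowercase purely to make them easy to see.

```python
def splice(primary_transcript: str, exons: list[tuple[int, int]]) -> str:
    """Join the given exon intervals (0-based, end-exclusive), dropping the rest."""
    return "".join(primary_transcript[start:end] for start, end in exons)


# Hypothetical primary transcript: three exons separated by two introns.
primary = "AUGGCU" + "guaagu" + "UGGAAA" + "guccag" + "CCCUAG"
exon1, exon2, exon3 = (0, 6), (12, 18), (24, 30)

# Using every exon gives one mRNA; skipping exon 2 gives a shorter isoform.
isoform_a = splice(primary, [exon1, exon2, exon3])
isoform_b = splice(primary, [exon1, exon3])

print(isoform_a)  # AUGGCUUGGAAACCCUAG
print(isoform_b)  # AUGGCUCCCUAG
```

Both mRNAs come from the same stretch of DNA, which is why such a sequence is still usually counted as a single gene.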

Eukaryotic genes also contain a sequence close to the termination site called the polyadenylation signal. After transcription, this sequence prompts a special enzyme, called poly-A polymerase, to cut the RNA chain and begin adding multiple adenine nucleotides, as many as 250, to the primary transcript. This poly-A tail helps transport the RNA out of the nucleus, stabilizes it in the cytoplasm, and promotes efficient translation at the ribosome.

Thus, the eukaryotic gene consists of an RNA polymerase binding site (promoter), a transcription initiation site, the coding region including exons and introns, the polyadenylation signal, and a termination site.

Genes for RNAs are transcribed in the same way, but the RNA formed is not translated into protein. Details vary among different types, but most RNA-coding genes do not contain introns. Transcripts of the ribosomal RNA genes must be cut apart to form a number of smaller functional RNA molecules.

Controlling Gene Expression

The complexity of any living cell is due to the well-orchestrated interactions of its proteins. Just as an orchestra cannot have every instrument play at once, a cell cannot have all its proteins function at once. One method of regulating protein function is to control when the protein is made, which is to say when the gene is expressed. Prokaryotic genes are usually controlled by operon systems, relatively simple systems that tie expression directly to metabolic activity in the cell. Eukaryotic genes are controlled by more complex regulatory systems that respond to hormones, growth factors, internal conditions, and many other influences.

To ensure that each gene is expressed when, and only when, it is needed, each eukaryotic gene has several control regions, termed the promoter and enhancer regions. These do not code for amino acids but are critical for proper gene expression. Mutations in these regions often change the rate at which a gene is expressed, or the factors in the cell or the environment to which it responds.

The promoter region is a sequence of 20 to 200 nucleotides "upstream" of the coding region to which the RNA polymerase enzyme binds, permitting it to begin transcribing the DNA. Promoters differ in size and sequence in prokaryotic and eukaryotic genes. Promoters attract RNA polymerase by first binding a variety of other proteins, called transcription factors. In some eukaryotic genes, promoter sites also occur within the coding region, allowing alternative transcripts with fewer exons.

Enhancers, also called activation sites, are located either nearby or far away from the promoter. Because DNA is looped and coiled, however, these sites are actually physically close to the gene's promoter even when distant on the DNA strand. Enhancers are gene-specific, and attract a variety of transcription factors. All of these work together to increase the rate of transcription by increasing the likelihood of RNA polymerase binding. Controlling the availability of these proteins is an important factor in regulating expression of the gene.

Bibliography

Alberts, Bruce, et al. Molecular Biology of the Cell, 4th ed. New York: Garland Science, 2002.

Carlson, Elof. The Gene: A Critical History. Philadelphia, PA: Saunders Publishing, 1966.

Muller, H. J. "The Development of the Gene Theory." In Genetics in the Twentieth Century, L. C. Dunn, ed. New York: Macmillan, 1951.

Olby, Robert. The Path to the Double Helix. Seattle, WA: University of Washington Press, 1974.

—Elof Carlson

 

Unit of heredity that occupies a fixed position on a chromosome. Genes achieve their effects by directing protein synthesis. They are composed of DNA, except in some viruses that contain RNA instead. The sequence of nitrogenous bases along a strand of DNA determines the genetic code. When the product of a particular gene is needed, the portion of the DNA molecule that contains that gene splits, and a complementary strand of RNA, called messenger RNA (mRNA), forms and then passes to ribosomes, where proteins are synthesized. A second type of RNA, transfer RNA (tRNA), matches up the mRNA with specific amino acids, which combine in series to form polypeptide chains, the building blocks of proteins. Experiments have shown that many of the genes within a cell are inactive much or even all of the time, but they can be switched on and off. Mutations occur when the number or order of bases in a gene is disrupted. See also genetic engineering, genetics, Hardy-Weinberg law, Human Genome Project, linkage group.
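
The path from DNA to mRNA to a polypeptide chain described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the function names and the template sequence are hypothetical, the codon table is deliberately partial (only the codons this example needs, taken from the standard genetic code), and the template strand is written 3' to 5' so that the mRNA reads left to right.

```python
# Partial codon table: only the codons used below (standard genetic code).
CODON_TABLE = {
    "AUG": "Met", "GCU": "Ala", "GGA": "Gly", "UUU": "Phe",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}

# Base pairing from a DNA template base to the RNA base that pairs with it.
COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_strand):
    """Build the mRNA complementary to a DNA template strand (given 3'->5')."""
    return "".join(COMPLEMENT[base] for base in template_strand)


def translate(mrna):
    """Read the mRNA three bases at a time until a stop codon is reached."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "Stop":
            break
        peptide.append(amino_acid)
    return peptide


template = "TACCGACCTAAAATT"  # hypothetical template strand
mrna = transcribe(template)
peptide = translate(mrna)
print(mrna, peptide)
```

In the cell the matching of codons to amino acids is done by tRNA molecules; the dictionary lookup here merely stands in for that step.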

For more information on gene, visit Britannica.com.

 

The ‘unit of inheritance’ that controls the passing of a hereditary characteristic from parent to offspring, by controlling the structure of proteins or other genetic material. The term was introduced by W. L. Johannsen in 1909 as an abbreviation of ‘pangene’. It is interesting as having been consciously intended as a purely functional notion: ‘completely free from any hypothesis; it expresses only the evident fact that, in any case, many characteristics of the organism are specified in the gametes by means of special conditions, foundations and determiners…’ Genes are now identified with lengths of DNA or RNA. Simplistic forms of biological determinism suppose that arbitrary characteristics of an organism (e.g. poverty, criminality) are genetically specified.

 

The basic unit of inheritance by which hereditary characteristics are passed from parents to offspring. It is generally considered that one gene contains the information responsible for the synthesis of one polypeptide chain. See also genetic endowment.

 
The structural unit of inheritance in living organisms. A gene is, in essence, a segment of DNA that has a particular purpose, i.e., that codes for (contains the chemical information necessary for the creation of) a specific enzyme or other protein. The strands of DNA on which the genes occur are organized into chromosomes. The nucleus of each eukaryotic (nucleated) cell has a complete set of chromosomes and therefore a complete set of genes. Each gene provides a blueprint for the synthesis (via RNA) of enzymes and other proteins and specifies when these substances are to be made (see nucleic acid). Genes govern both the structure and metabolic functions of the cells, and thus of the entire organism and, when located in reproductive cells, they pass their information to the next generation.

Chemically, each gene consists of a specific sequence of DNA building blocks called nucleotides. Each nucleotide is composed of three subunits: a nitrogen-containing compound, a sugar, and phosphoric acid. Genes may vary in their precise makeup from person to person, including, for example, one nucleotide in a certain location in some people but another nucleotide in that location in others. Geometrically, the gene is a double helix formed by the nucleotides. Gene loci are often interspersed with segments of DNA that do not code for proteins; these segments are termed “junk DNA.” When junk DNA occurs within a gene, the coding portions are called exons and the noncoding (junk) portions are called introns. Junk DNA makes up 97% of the DNA in the human genome, and, despite its name, is necessary for the proper functioning of the genes.

Each chromosome of each species has a definite number and arrangement of genes. Alteration of the number or arrangement of the genes can result in mutation. When the mutation occurs in the germ cells (egg or sperm), the change can be transmitted to the next generation. Mutations that affect somatic cells can result in certain cancers.

The scientific study of inheritance is genetics. The genetic makeup of an organism with reference to its set of genetic traits is called its genotype. The interaction of the environment and the genotype produces the observable attributes of the organism, or its phenotype. The sum total of the genes contained in an organism's full set of chromosomes is termed the genome. Scientists are working toward identifying the location and function of each gene in the human genome (see Human Genome Project). The decoding of the genome of the first free-living organism (a bacterium, Haemophilus influenzae) was completed in 1995 by J. Craig Venter and Hamilton Smith.

See also gene therapy; genetic engineering.


 

A portion of a DNA molecule that serves as the basic unit of heredity. Genes control the characteristics that an offspring will have by transmitting information in the sequence of nucleotides on short sections of DNA.

 

The unit of heredity most simply defined as a specific segment of DNA, usually in the order of 1000 nucleotides, that specifies a single polypeptide. Many phenotypic characteristics are determined by a single gene, while others are multigenic. Genes are specifically located in linear order along the single DNA molecule that makes up each chromosome. All eukaryotic cells contain a diploid (2n) set of chromosomes so that two copies of each gene, one derived from each parent, are present in each cell; the two copies often specify a different phenotype, i.e. the polypeptide will have a somewhat different amino acid composition. These alternative forms of gene, both within and between individuals, are called alleles. Genes determine the physical (structural genes), the biochemical (enzymes), physiological and behavioral characteristics of an animal.
The formation of gametes (sperm, ova) involves a process of meiosis, which allows crossing over between the chromatids of paired homologous chromosomes, one derived from each parent, which means that new forms of a particular chromosome are created. Gamete formation also results in cells (gametes) with a haploid (n) set of chromosomes; fertilization then creates a new individual with 2n chromosomes, half derived by way of the ovum from the mother and half via the spermatozoon from the father.
Changes in the nucleotide sequence of a gene, either by substitution of a different nucleotide or by deletion or insertion of other nucleotides, constitute mutations which add to the diversity of animal species by creating different alleles and can be used as a basis for genetic selection of different phenotypes. Some mutations, be they a single base change in a single gene or a major deletion, are lethal.

  • g. action — the way in which genes exert their effects on tissues or processes, e.g. by being dominant or recessive, or partially so, being absent, being sex-linked, being involved in chromosomal aberrations.
  • allelic g's — different forms of a particular gene usually situated at the same position (locus) in a pair of chromosomes.
  • g. amplification — see gene duplication (below).
  • g. bank — the collection of DNA sequences in a given genome. Called also gene library.
  • barring g. — responsible for the barred pattern on the feathers of Barred Plymouth Rock birds.
  • g. box — see box (4).
  • g. clone — see clone.
  • g. cluster — a group of related genes derived from a common ancestral gene, located closely together on the same chromosome. Called also multigene family.
  • complementary g's — two independent pairs of nonallelic genes, neither of which is functional without the other.
  • g. conversion — a non-reciprocal exchange of DNA elements during meiosis which results in a functional rearrangement of chromosomal DNA.
  • dhfr g. — dihydrofolate reductase gene; encodes an enzyme required to maintain cellular concentrations of tetrahydrofolate (H4 folate) for nucleotide biosynthesis, and which has been used as a ‘selective marker’; cells lacking the enzyme survive only in media containing thymidine, glycine and purines; mutant cells (dhfr−) transfected with DNA that is dhfr+ can be selectively grown in medium lacking these supplements.
  • diversity (D) g. — genes located in diversity (D) segment; contribute to the hypervariable region of immunoglobulins.
  • dominant g. — one that produces an effect (the phenotype) in the organism regardless of the state of the corresponding allele. Examples of traits determined by dominant genes are short hair in cats and black coat color in dogs.
  • g. duplication — as a result of non-homologous recombination, a chromosome carries two or more copies of a gene.
  • g. expression — see expression (3).
  • g. frequency — the proportion of the substances or animals in the group which carry a particular gene.
  • holandric g's — genes located on the Y chromosome and appearing only in male offspring.
  • immune response (Ir) g's — genes of the major histocompatibility complex (MHC) that govern the immune response to individual immunogens.
  • jumping g. — see mobile DNA.
  • g. knockout — replacement of a normal gene with a mutant allele, as in gene knockout mice.
  • lethal g. — one whose presence brings about the death of the organism or permits survival only under certain conditions.
  • g. library — see gene bank (above).
  • g. locus — see locus.
  • mutant g. — one that has undergone a detectable mutation.
  • non-protein encoding g. — the final products of some genes are RNA molecules rather than proteins.
  • overlapping g's — when more than one mRNA is transcribed from the same DNA sequence; the mRNAs may be in the same reading frame but of different size or they may be in different reading frames.
  • g. pool — total of all genes possessed by all members of the population which are capable of reproducing during their lifetime.
  • g. probe — see probe (2).
  • recessive g. — one that produces an effect in the organism only when it is transmitted by both parents, i.e. only when the individual is homozygous.
  • regulator g., repressor g. — one that synthesizes repressor, a substance which, through interaction with the operator gene, switches off the activity of the structural genes associated with it in the operon.
  • reporter g. — one that produces products which can be measured and therefore used as an indicator of whether a DNA construct has successfully been transferred.
  • sex-linked g. — one that is carried on a sex chromosome, especially an X chromosome.
  • g. splicing — see splicing.
  • structural g. — nucleotide sequences coding for proteins.
  • g. therapy — the insertion of functional genes into cells of the host in order to alter its phenotype, usually used to treat an inherited defect.
  • g. transcription — see transcription.
  • g. transfer — see recombination.
  • tumor suppressor g's — a class of genes that encode proteins that normally suppress cell division and that, when mutated, allow cells to continue unrestricted cell division, which may result in a tumor.
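
The g. frequency entry in the list above defines gene frequency as the proportion of animals in a group that carry a particular gene. A minimal sketch of that calculation follows, together with the related allele-copy frequency often used in population genetics. The herd data, the genotype notation (two letters per diploid animal), and both function names are assumptions made for illustration.

```python
def carrier_proportion(genotypes, allele):
    """Proportion of animals carrying at least one copy of the allele."""
    carriers = sum(1 for g in genotypes if allele in g)
    return carriers / len(genotypes)


def allele_frequency(genotypes, allele):
    """Proportion of all gene copies in the group that are the given allele."""
    copies = sum(g.count(allele) for g in genotypes)
    return copies / (2 * len(genotypes))  # two copies per diploid animal


# Hypothetical herd: one two-letter genotype per animal.
herd = ["BB", "Bb", "Bb", "bb", "bb", "bb", "Bb", "BB"]

print(carrier_proportion(herd, "b"))
print(allele_frequency(herd, "b"))
```

The two figures differ because a homozygous animal counts once as a carrier but contributes two copies of the allele.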
 
Essay: The discovery of genes

The monk Gregor Mendel is now famous for explaining the laws of heredity, including the roles of dominant and recessive genes and the different mathematical consequences arising from the two types of genes when sexual organisms reproduce. He worked with garden peas and looked for such traits as tall versus short or smooth seed versus wrinkled. By mating peas with different traits, he discovered such rules as this one: when two organisms that each carry one gene for a recessive trait are mated, approximately one-quarter of the offspring will exhibit the recessive trait. In fact, Mendel worked backward from the ratios of traits in offspring to determine the rules of inheritance.
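
The one-quarter rule can be checked by enumerating the possible combinations, in the manner of a Punnett square. The sketch below assumes a single gene with a dominant allele written "T" and a recessive allele written "t"; the cross function and the notation are illustrative choices, not Mendel's own.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1, parent2):
    """Return offspring genotype probabilities for a single-gene cross.

    Each parent is a two-letter genotype; each passes on one allele
    with equal probability.
    """
    counts = Counter(
        "".join(sorted(a + b)) for a, b in product(parent1, parent2)
    )
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Two parents that each carry one copy of the recessive allele.
offspring = cross("Tt", "Tt")
recessive = offspring.get("tt", Fraction(0))
print(offspring, recessive)
```

Only the "tt" offspring show the recessive trait, and they make up one of the four equally likely combinations.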

Mendel performed his famous experiments at about the same time that Charles Darwin was explaining evolution. Darwin was already a popular author, and his ideas were soon known around the world. Mendel, however, had trouble getting published. He first sent his work to a prominent biologist who, as it happened, did not like mathematics. The biologist sent the paper back to Mendel with negative comments. In 1865 and 1869 Mendel's work was published -- by the local natural history society. After this Mendel was promoted to abbot, which kept him busy at the same time that it allowed him to grow fat. He gave up both gardening and science.

Darwin never got the chance to learn of Mendel's work, which is unfortunate, since Mendel's laws neatly fill a major gap in Darwin's theory. Darwin knew that variation occurred, but he did not know how it was inherited. Mendel's laws described the mechanism by which many traits pass from generation to generation.

In 1900, however, an astonishing coincidence put Mendel's work into the scientific mainstream. Three different biologists working in three different countries -- Hugo de Vries in the Netherlands, Karl Correns in Germany, and Erich Tschermak von Seysenegg in Austria -- worked out Mendel's laws for themselves. Each searched the scientific literature for prior discoveries of these laws and each somehow found the obscure papers from over 30 years before. When they published their work, they each unselfishly credited Mendel. The concept of a gene finally entered the mainstream of science.

 

IN BRIEF: Any of the units for inherited characteristics that are carried by chromosomes.

Scientists recently discovered the gene that determines if a person will have blond hair.

Tutor's tip: You must have the skinny "gene" (part of a cell which determines hereditary traits) to wear tight "jeans" (pants made of a kind of cotton cloth) well.

 
Wikipedia: gene
This stylistic schematic diagram shows a gene in relation to the double helix structure of DNA and to a chromosome (right). Introns are regions often found in eukaryote genes which are removed in the splicing process (after the DNA is transcribed into RNA): only the exons encode the protein. This diagram labels a region of only 40 or so bases as a gene. In reality most genes are hundreds of times larger, and the relationships between introns and exons can be highly complex.

A gene is a locatable region of genomic sequence, corresponding to a unit of inheritance, which is associated with regulatory regions, transcribed regions and/or other functional sequence regions.[1][2] The physical development and phenotype of organisms can be thought of as a product of genes interacting with each other and with the environment,[3] and genes can be considered as units of inheritance. A concise definition of gene, taking into account complex patterns of regulation and transcription, genic conservation and non-coding RNA genes, has been proposed by Gerstein et al.:[4] "A gene is a union of genomic sequences encoding a coherent set of potentially overlapping functional products."

In cells, genes consist of a long strand of DNA that contains a promoter, which controls the activity of a gene, and a coding sequence, which determines what the gene produces. When a gene is active, the coding sequence is copied in a process called transcription, producing an RNA copy of the gene's information. This RNA can then direct the synthesis of proteins via the genetic code. However, RNAs can also be used directly, for example as part of the ribosome. These molecules resulting from gene expression, whether RNA or protein, are known as gene products.

Most genes contain non-coding regions that do not code for the gene products, but regulate gene expression. The genes of eukaryotic organisms can contain non-coding regions called introns that are removed from the messenger RNA in a process known as splicing. The regions that actually encode the gene product, which can be much smaller than the introns, are known as exons. A single gene can lead to the synthesis of multiple proteins through the different arrangements of exons produced by alternative splicing.
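
How one gene can yield several products through alternative splicing can be illustrated with a small enumeration. The sketch below models only exon skipping, one of several real splicing patterns, and assumes that some exons are always retained while the rest are optional; the exon labels and the splice_variants function are hypothetical.

```python
from itertools import combinations


def splice_variants(exons, constitutive):
    """Enumerate transcripts produced by exon skipping.

    exons: exon labels in genomic order.
    constitutive: indices of exons that are always retained.
    Optional exons may be kept or skipped; exon order is preserved.
    """
    optional = [i for i in range(len(exons)) if i not in constitutive]
    variants = []
    for r in range(len(optional) + 1):
        for kept in combinations(optional, r):
            chosen = sorted(set(constitutive) | set(kept))
            variants.append("-".join(exons[i] for i in chosen))
    return variants


# Hypothetical four-exon gene in which the first and last exons are always kept.
variants = splice_variants(["E1", "E2", "E3", "E4"], constitutive={0, 3})
print(variants)
```

With two optional exons this simple model gives four distinct transcripts from one gene; actual genes are constrained by splice-site signals and regulation, so not every combination need occur.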

The total complement of genes in an organism or cell is known as its genome. Prokaryotes such as bacteria and archaea generally have smaller genomes, both in number of base pairs and number of genes, than even single-celled eukaryotes, although there is no clear relationship between genome size and perceived complexity among eukaryotic organisms. One of the largest known genomes belongs to the single-celled amoeba Amoeba dubia, with over 670 billion base pairs, some 200 times larger than the human genome.[5] The estimated number of genes in the human genome has been repeatedly revised downward since the completion of the Human Genome Project; current estimates place the human genome at just under 3 billion base pairs and about 20,000–25,000 genes.[6] A recent Science article gives a final number of 20,488, with perhaps 100 more yet to be discovered.[7] The gene density of a genome is a measure of the number of genes per million base pairs (called a megabase, Mb); prokaryotic genomes have much higher gene densities than eukaryotes. The gene density of the human genome is roughly 12–15 genes/Mb.[8]

History

Main article: History of genetics

The existence of genes was first suggested by Gregor Mendel (1822-1884), who, in the 1860s, studied inheritance in pea plants and hypothesized a factor that conveys traits from parent to offspring. He spent over 10 years of his life on one experiment. Although he did not use the term gene, he explained his results in terms of inherited characteristics. Mendel was also the first to hypothesize independent assortment, the distinction between dominant and recessive traits, the distinction between a heterozygote and homozygote, and the difference between what would later be described as genotype and phenotype. Mendel's concept was given a name by Hugo de Vries in 1889, who, at that time probably unaware of Mendel's work, in his book Intracellular Pangenesis coined the term "pangen" for "the smallest particle [representing] one hereditary characteristic"[9]. Wilhelm Johannsen abbreviated this term to "gene" ("gen" in Danish and German) two decades later.

In the early 1900s, Mendel's work received renewed attention from scientists. In 1910, Thomas Hunt Morgan showed that genes reside on specific chromosomes. He later showed that genes occupy specific locations on the chromosome. With this knowledge, Morgan and his students began the first chromosomal map of the fruit fly Drosophila. In 1928, Frederick Griffith showed that genes could be transferred. In what is now known as Griffith's experiment, injections into a mouse of a deadly strain of bacteria that had been heat-killed transferred genetic information to a safe strain of the same bacteria, killing the mouse.

In 1941, George Wells Beadle and Edward Lawrie Tatum showed that mutations in genes caused errors in certain steps in metabolic pathways. This showed that specific genes code for specific proteins, leading to the "one gene, one enzyme" hypothesis.[10] Oswald Avery, Colin MacLeod, and Maclyn McCarty showed in 1944 that DNA holds the gene's information. In 1953, James D. Watson and Francis Crick demonstrated the molecular structure of DNA. Together, these discoveries established the central dogma of molecular biology, which states that proteins are translated from RNA which is transcribed from DNA. This dogma has since been shown to have exceptions, such as reverse transcription in retroviruses.

In 1972, Walter Fiers and his team at the Laboratory of Molecular Biology of the University of Ghent (Ghent, Belgium) were the first to determine the sequence of a gene: the gene for