Genetic Program and Genome Structure
On Earth, there are millions of animal and plant species, which differ from one another by morphological and behavioral criteria. The information needed to establish these criteria is contained in the genome, which is specific to each species. Within a species, variability between individuals is generally observed, due to polymorphism in the genome. These individual differences are exploited, for example, in forensic science, which can positively identify an individual by analyzing the genome of the cells found in evidence (blood, semen ...). Similarly, archaeologists can establish the kinship between individuals found in the same tomb, provided that the state of preservation of the remains allows at least a small part of the genome to be extracted.
Within an individual, all cells have the same genome because they all derive, by division, from a single original cell: the egg. Nevertheless, there are a large number of cell types, about two hundred in humans, which differ in morphology and function. This is because the genome is expressed differently in each cell type. Adult cells are said to be differentiated, as opposed to the original egg cell, which is totipotent. The differentiated state is acquired gradually during embryonic development, through a sequence of events linked together in a specific order. The establishment of cell differentiation is often compared to the execution of a program. The paradox is that this program, which ultimately leads to differential expression of the genome, is itself inherited, that is to say, essentially contained within the genome: a bean seed always gives a bean plant, and an earthworm egg always gives an earthworm. The environment, in these examples the soil, does not change the program, at least in outline. In addition, the genetic program of a given cell can only be executed in the context of that cell, which alone contains all the molecular elements that allow a particular gene to be read at a particular time of embryonic development.
1. Structure of the genome of eukaryotic cells:
Concretely, the genome is a set of deoxyribonucleic acid (DNA) molecules, forty-six in humans. DNA is a macromolecule formed by a chain of nucleotides, each composed of a phosphate group, a sugar (deoxyribose) and a base [cf. MOLECULAR BIOLOGY]. There are only four different bases: adenine, thymine, guanine and cytosine, denoted A, T, G and C. It is the non-random order of these four bases that constitutes the genetic message. The extent of this message is considerable: the human genome contains about 3 × 10^9 bases, which, written out with only the letters A, T, G and C, would fill a book of ... five hundred thousand pages! However, genome size does not necessarily reflect the degree of complexity or evolution of an organism: some plants and amphibians have a hundred times more DNA than humans. In addition, only about 10% of the DNA is coding, that is to say, contains the information necessary for the synthesis of proteins, which play a predominant role in the structure and function of cells. DNA sequences are classified into three categories: highly repeated sequences, moderately repeated sequences and unique sequences. Highly repeated DNA is non-coding and its function is not known; it is found primarily at the centromeres and telomeres of chromosomes [cf. CHROMOSOMES]. The majority of moderately repeated DNA is non-coding and consists of mobile elements dispersed throughout the genome. The remainder of the moderately repeated DNA codes for small RNAs, ribosomal RNAs and transfer RNAs, which are part of the translation machinery [cf. MOLECULAR BIOLOGY]. Finally, the unique sequences represent the key words of the genetic message: they contain the sequences encoding proteins. A gene is defined as any DNA sequence encoding a protein or a small RNA. More precisely, the gene is not strictly limited to the coding part but also includes the flanking sequences required for its expression and for the regulation of that expression.
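The orders of magnitude quoted above can be checked with a few lines of arithmetic. This is only a sketch: the genome size, the page count and the 10% coding fraction are the figures given in the text, and the variable names are chosen here for illustration.

```python
# Figures taken from the text; the names are illustrative.
GENOME_BASES = 3 * 10**9      # approximate size of the human genome
BOOK_PAGES = 500_000          # size of the imagined book
CODING_PERCENT = 10           # share of coding DNA quoted in the text

# Number of letters (A, T, G or C) each page of the book would hold.
letters_per_page = GENOME_BASES // BOOK_PAGES

# Number of bases that would be coding under the 10% figure.
coding_bases = GENOME_BASES * CODING_PERCENT // 100

print(letters_per_page)   # 6000
print(coding_bases)       # 300000000
```

In other words, the book would carry about six thousand letters on each page, and roughly three hundred million bases would fall in coding regions under the figure quoted.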
2. General mechanisms of information transfer:
DNA is contained in the nucleus of the cell, while protein synthesis takes place in the cytoplasm. This implies the involvement of an intermediary, a messenger, able to pass from the nucleus to the cytoplasm. To simplify, the gene is first transcribed into a ribonucleic acid (RNA) called pre-messenger RNA. RNA is very similar to DNA: it is also a macromolecule composed of a chain of nucleotides, each comprising a sugar, ribose (instead of deoxyribose in DNA), and a base: adenine, guanine, cytosine or uracil (instead of thymine in DNA). The pre-messenger RNA is in fact a complementary copy of the gene: it is synthesized by an enzyme, RNA polymerase, which places a guanine in the RNA each time it reads a cytosine in the DNA, a cytosine for a guanine, a uracil for an adenine and an adenine for a thymine. This pre-messenger RNA is then matured into a messenger RNA. In the gene, the coding sequence is interrupted by sequences called introns, whose function is still unknown. Most of the maturation consists of eliminating the introns, a process called splicing, leaving only the coding sequences, also called exons. In addition, during maturation, both ends of the RNA are modified. Upstream, a "cap" consisting of a methylated nucleotide is added; this cap is essential for the subsequent translation of the messenger RNA. Downstream, an enzyme, poly-A polymerase, attaches a "tail" composed exclusively of adenine nucleotides; this tail stabilizes the RNA against degradation. Maturation of the messenger RNA is accompanied by its migration to the cytoplasm, where translation into protein takes place. Like DNA and RNA, proteins are polymers formed by the non-random sequence of basic elements, the amino acids, of which there are only twenty. The four-letter language of DNA is therefore simply translated into a language of twenty letters. Each amino acid is encoded by a nucleotide triplet.
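The base-by-base copying and the removal of introns can be sketched as follows. This is a minimal illustration, not a model of the real enzymes: the template sequence and the exon coordinates are hypothetical, and strand orientation is ignored for simplicity.

```python
# Base placed in the RNA for each base read on the DNA template strand.
DNA_TO_RNA = {"C": "G", "G": "C", "A": "U", "T": "A"}

def transcribe(template_strand):
    """Return the pre-messenger RNA complementary to a DNA template strand."""
    return "".join(DNA_TO_RNA[base] for base in template_strand)

def splice(pre_mrna, exons):
    """Join the exon segments, given as (start, end) pairs, dropping the introns."""
    return "".join(pre_mrna[start:end] for start, end in exons)

template = "TACGGATTTCCGACT"        # hypothetical 15-base template strand
exon_coordinates = [(0, 6), (9, 15)]  # hypothetical: bases 6 to 8 form an intron

pre_mrna = transcribe(template)
mrna = splice(pre_mrna, exon_coordinates)

print(pre_mrna)  # AUGCCUAAAGGCUGA
print(mrna)      # AUGCCUGGCUGA
```

The three bases removed here stand in for an intron; real introns are usually far longer than the exons they separate.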
The genetic code, that is to say, the dictionary that gives the translation from one language into the other, is degenerate: several nucleotide triplets encode the same amino acid. This gives cells a certain resistance to change: among the mutations that affect a single base, some will be silent, because the meaning of the nucleotide triplet remains unchanged.
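A small excerpt of the standard genetic code is enough to illustrate degeneracy and silent mutations. The sketch below uses only eight of the sixty-four codons, and the helper name `is_silent` is an illustrative choice.

```python
# Excerpt of the standard genetic code (RNA codons -> amino acids).
CODON_TABLE = {
    "GCU": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "GAU": "Asp", "GAC": "Asp",
    "GAA": "Glu", "GAG": "Glu",
}

def is_silent(codon, position, new_base):
    """True if replacing one base of the codon leaves the amino acid unchanged."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    return CODON_TABLE[codon] == CODON_TABLE[mutated]

print(is_silent("GCU", 2, "C"))  # True: GCU and GCC both encode alanine
print(is_silent("GAU", 2, "A"))  # False: GAU is aspartate, GAA is glutamate
```

Changes at the third position of a triplet are the ones most often silent, as the four alanine codons show.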
3. Regulation of transcription:
Historically, the first question to arise was whether, during cell differentiation, the different cell types each lose a different subset of genes or whether, on the contrary, all cells retain all the genes present in the egg, with expression differing from one cell type to another. The answer to this question was clearly provided in 1960 by the team of John Gurdon, molecular embryologist at the University of Cambridge, UK. The principle of his experiments, conducted on a species of South African toad, Xenopus laevis, is to remove the nucleus from an egg and replace it with the nucleus of a differentiated cell, in this case a tadpole intestinal cell. Thus transformed, the egg develops into a normal, viable tadpole capable of metamorphosis. This indicates that the nucleus of the differentiated intestinal cell still has all the genes that were present in the egg. Differentiation therefore involves not the differential elimination of genes, but their differential expression. This is why so much effort is devoted to understanding the mechanisms of this differential expression.
As we have seen, the coding portion of a gene is typically surrounded by non-coding parts involved in the regulation of its expression. Immediately upstream of the coding sequence lies a sequence called the "promoter", which contains binding sites for RNA polymerase and for a number of factors called "general transcription factors". These molecular actors provide a basal level of transcription, which may be modulated up or down by other transcription factors, called specific. It is these factors that are thought to be responsible for the differential regulation of gene expression. Specific transcription factors bind to specific DNA sequences, called "consensus sequences", five to twenty nucleotides long and different for each transcription factor. These sequences are located either in the promoter region itself or in sequences known as enhancers, which lie upstream or downstream of the gene, sometimes very far from the coding sequence, or even directly in an intron within the coding sequence [cf. MOLECULAR GENETICS].
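Locating the binding sites of a factor amounts to searching a DNA sequence for short motifs. The sketch below does this with a regular expression; the consensus pattern and the test sequence are hypothetical examples chosen for illustration, and real consensus sequences tolerate more variation than a single degenerate position.

```python
import re

def find_sites(sequence, consensus):
    """Return the start positions of every match of a consensus pattern.

    A zero-width lookahead is used so that overlapping sites are all reported.
    """
    pattern = re.compile(f"(?={consensus})")
    return [match.start() for match in pattern.finditer(sequence)]

# Hypothetical 7-base consensus with one degenerate position (C or G).
CONSENSUS = "TGA[CG]TCA"

# Hypothetical stretch of promoter or enhancer DNA.
region = "AATGACTCAGGTTGAGTCAAT"

print(find_sites(region, CONSENSUS))  # [2, 12]
```

The same function applies whether the region scanned is a promoter, a distant enhancer or an intron, since only the sequence matters for the search.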
Specific transcription factors are proteins made up of several domains with specialized functions. Each has, on the one hand, a DNA-binding domain that allows the protein to recognize and bind its consensus sequence and, on the other hand, a domain capable of interacting with RNA polymerase or with the general transcription factors and thereby modulating their transcriptional activity. Specific transcription factors often bind to DNA as homodimers or heterodimers. To do this, they have a dimerization domain that allows protein-protein interactions. Finally, some transcription factors are able to bind to DNA only when they have bound a ligand. This is the case of nuclear hormone receptors. In the absence of hormone, these factors are localized in the cytoplasm of the cell. When the hormone enters the cell, it binds to its receptor, which is then able to migrate into the nucleus and bind to DNA at its consensus sequence.
Some transcription factors activate transcription, with varying efficiency, while others repress it. Paradoxically, some transcription factors, such as p53, which is mutated in many human cancers, or the Drosophila Dorsal protein, are able to activate some genes and repress others. Among the repressors of transcription, there are, on the one hand, active repressors, which have the intrinsic ability to reduce the rate of gene transcription and, on the other hand, passive repressors, which act indirectly by decreasing the activity of a transcriptional activator. For example, the factor AP1 represses the activity of the osteocalcin gene by competing with the nuclear receptor for retinoic acid, which is a transcriptional activator. The competition between the two factors is explained by the fact that, in the osteocalcin promoter, the consensus sequences bound by the retinoic acid receptor partly overlap those of AP1. The binding of AP1 to the DNA therefore hinders that of the retinoic acid receptor. Other passive repressors act by a more direct mechanism. For example, during the differentiation of muscle cells, the transcription factor c-Jun, which is highly oncogenic in vertebrates, binds to the transcriptional activator MyoD, preventing it from binding to DNA and therefore from activating transcription.
The expression of each gene is finely controlled by the fact that several transcription factors act in combination. There is a finite and limited number of transcription factors, probably not all known today, but the number of different combinations they can form is immense. For example, the albumin gene, which is expressed only in the liver, has its transcription regulated by at least five different factors, called C/EBP, HNF3, NF1, NFY and HNF1, three of which act at two different sites in the promoter or enhancer. NF1 and NFY are ubiquitous, that is to say, present in all cell types, whereas C/EBP and HNF1 are restricted to a small number of cell types, including the liver, and HNF3 is specific to the liver and lungs. This example shows that identifying the transcription factors involved in the expression of a gene is not enough to understand how that gene is regulated during cell differentiation, since some of these regulators are themselves regulated. To understand cellular differentiation, it is therefore important to study the variations in expression of the genes that encode transcription factors.
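The combinatorial argument can be made concrete by counting. The sketch below assumes, as a deliberate simplification, that each factor is either present or absent and that the albumin gene is transcribed only when all five factors named in the text are present together. The cell-type repertoires are hypothetical, loosely following the text, and "muscle" is an invented example of a cell type holding only the ubiquitous factors.

```python
from itertools import product

FACTORS = ["C/EBP", "HNF3", "NF1", "NFY", "HNF1"]

def all_combinations(factors):
    """Every present/absent combination of the given factors, as sets."""
    return [
        {name for name, present in zip(factors, state) if present}
        for state in product([False, True], repeat=len(factors))
    ]

# Hypothetical factor repertoires for three cell types.
CELL_TYPES = {
    "liver": {"C/EBP", "HNF3", "NF1", "NFY", "HNF1"},
    "lung": {"HNF3", "NF1", "NFY"},
    "muscle": {"NF1", "NFY"},
}

def expresses_albumin(present_factors):
    """Assumption: transcription requires all five factors at once."""
    return set(FACTORS) <= present_factors

print(len(all_combinations(FACTORS)))  # 32
for cell_type, repertoire in CELL_TYPES.items():
    print(cell_type, expresses_albumin(repertoire))
```

Five factors already give 2^5 = 32 combinations, and each additional factor doubles the count, which is how a limited set of regulators can specify a large number of distinct expression patterns.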
4. Other levels of regulation of gene expression:
For a gene to be transcribed and, a fortiori, for its rate of transcription to be regulated, its promoter and enhancer must be accessible to RNA polymerase and to the various transcription factors. However, in the nucleus of the cell, DNA is tightly compacted into chromatin through its association with a number of proteins. The first level of DNA packaging is the nucleosome. A nucleosome consists of an octamer of particular proteins, the histones, around which the DNA is wound for two turns. Each nucleosome is attached to the next by means of another histone, so that the whole forms a DNA-histone fiber 10 nanometers in diameter. In turn, this fiber is wound on itself, like a solenoid, to form a thicker fiber 30 nanometers in diameter. In addition to histones, HMG (high mobility group) proteins are also found associated with DNA. The state of chromatin when DNA is transcribed is still poorly understood. What is known is that transcribed DNA is particularly susceptible to degradation by nucleases, indicating that it is not protected by proteins, and that two HMG proteins (HMG 14 and HMG 17) associate preferentially with these sensitive sites.
The DNA molecule is composed of two strands joined together, most commonly wound into a right-handed double helix: this is the B conformation of DNA. However, in specific areas of the DNA molecule, the double helix is not right-handed but left-handed: this is the Z conformation. It seems that the same area of a DNA molecule can switch from one conformation to the other, for example during embryonic development, and that this conformational change alters the level of transcription of the genes located in that area.
DNA can be chemically modified. The most frequent and best known modification is the methylation of cytosines. In general, it appears that the more DNA is methylated, the less it is transcribed. This is particularly evident in the case of the X sex chromosome. In females, one of the two X chromosomes is hypermethylated, and as a result it is almost completely inactivated [cf. BIOCHEMICAL REGULATIONS]. In addition, it has long been known that the two alleles of the same gene, one from the father and one from the mother, are not always expressed at the same rate. This difference in expression appears to be due to an "imprint" acquired by the gene during its passage through the gametes. This imprint appears to consist essentially of methylation.
The level of gene expression is also affected by the fate of the RNA after transcription. What matters from a functional point of view is not the amount of a given RNA, but the quality and quantity of the final protein product. As we have seen, the gene is first transcribed into pre-messenger RNA, which is then matured into a messenger RNA. A given pre-messenger RNA can lead to several different messenger RNAs, which are then translated into several related proteins that differ in one or more domains. This is because, when the introns are spliced out, some exons may also be left out. This is called alternative splicing. The different protein isoforms produced by alternative splicing usually have similar functions, but in some cases their functions are totally different. The best example is the calcitonin gene, which can give rise, by alternative splicing, either to calcitonin itself or to a second peptide (calcitonin gene-related peptide) that acts as a neurotransmitter.
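Exon skipping can be sketched as a choice of which exons to keep when the messenger is assembled. In this minimal illustration the exon names and sequences are hypothetical, and only whole exons marked as optional can be left out.

```python
from itertools import combinations

def isoforms(exons, optional):
    """Return every messenger obtained by skipping any subset of optional exons.

    exons    -- ordered list of (name, sequence) pairs
    optional -- names of the exons that splicing may leave out
    """
    messengers = []
    for count in range(len(optional) + 1):
        for skipped in combinations(optional, count):
            kept = [sequence for name, sequence in exons if name not in skipped]
            messengers.append("".join(kept))
    return messengers

# Hypothetical gene with three exons, the second being optional.
EXONS = [("E1", "AUGGCU"), ("E2", "GAUGAA"), ("E3", "GCCUGA")]

for messenger in isoforms(EXONS, optional=["E2"]):
    print(messenger)
# AUGGCUGAUGAAGCCUGA
# AUGGCUGCCUGA
```

One pre-messenger therefore yields two messengers here, and with n optional exons the number of possible messengers under this scheme grows as 2^n.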
The amount of protein synthesized by the cell at a given time obviously depends on the amount of messenger RNA present. This amount depends, in turn, on the level of transcription on the one hand and on the stability of the RNA on the other. This stability is controlled both at the pre-messenger level and at the messenger level. At the pre-messenger level, it depends on the presence, upstream of the coding sequence, of sequences rich in A and U, which have a destabilizing effect. At the messenger level, stability is instead reinforced by the presence of the poly-A tail, downstream of the coding sequence.
Finally, whatever the molecule, it must be located in the correct cellular compartment to fulfill its normal function. Thus, a transcription factor can modulate the transcription of its target genes only when it is present in the nucleus. Similarly, an RNA can be translated only if it is present in the cytoplasm and accessible to the translational machinery. Once transcription is initiated, the nascent RNA is surrounded by proteins to form ribonucleoprotein particles called hnRNP (heterogeneous nuclear ribonucleoprotein particles). It is estimated that only about one-fifth of the synthesized RNA leaves the nucleus. Half of the RNAs are degraded directly in the nucleus, and the remaining 30% are sequestered for a more or less long time, probably by a mechanism involving the hnRNP proteins, which would thus regulate the expression levels of the corresponding genes. RNA can also be stored in the cytoplasm. An extreme example is the storage of RNA by yolk-rich oocytes, such as those of Xenopus laevis. During oogenesis, the oocytes synthesize and store a considerable amount of RNA. These RNAs are used, sometimes several months later, during the first divisions of the early embryo, while the genome of the embryo itself is still silent.
Some transcription factors are constitutively present in many cell types but are inactive because they are localized in the cytoplasm. This is the case of the ubiquitous transcription factor NF-kB, which is retained in the cytoplasm by its association with a protein called IkB (inhibitor of kB). This protein has in its amino acid sequence a motif repeated six times, which is thought to give it the ability to bind to the cytoskeleton. The NF-kB/IkB complex is thus anchored in the cytoplasm. In response to extracellular stimuli such as growth factors or certain interleukins, the IkB protein is phosphorylated, dissociates from NF-kB and is then degraded. NF-kB is thus released and can migrate into the nucleus, where it activates its target genes. Interestingly, one of the target genes of NF-kB is the gene that encodes ... IkB! So, very quickly, the level of IkB protein rises again in the cell, and NF-kB is once again sequestered in the cytoplasm. This mechanism allows an activation of the transcription factor NF-kB that is both very fast (it requires neither transcription nor translation) and transient.
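The logic of this loop can be sketched as a toy Boolean model. It is only an illustration of the feedback described above, not a quantitative model: time is cut into arbitrary steps, IkB is either present or absent, and new IkB is assumed to be available one step after NF-kB reaches the nucleus.

```python
def simulate(stimulus_steps, total_steps):
    """Toy model of the NF-kB / IkB loop.

    Returns the location of NF-kB ("cytoplasm" or "nucleus") at each step.
    """
    ikb_present = True  # at rest, IkB holds NF-kB in the cytoplasm
    history = []
    for step in range(total_steps):
        if step in stimulus_steps:
            # Stimulus: IkB is phosphorylated, released and degraded.
            ikb_present = False
        in_nucleus = not ikb_present
        history.append("nucleus" if in_nucleus else "cytoplasm")
        if in_nucleus:
            # NF-kB activates the IkB gene; assumed: new IkB by the next step.
            ikb_present = True
    return history

print(simulate(stimulus_steps={2}, total_steps=5))
# ['cytoplasm', 'cytoplasm', 'nucleus', 'cytoplasm', 'cytoplasm']
```

A single stimulus produces a single, short-lived pulse of nuclear NF-kB, which is the transient behavior the negative feedback is meant to ensure.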