What distinguishes you from me is encoded by our genetic make-up, or DNA. DNA, or deoxyribonucleic acid, defines our unique physical features, such as eye color and height, and even plays a role in determining our personalities. But how exactly is the information in our DNA translated into the characteristics that we observe? The answer is described by the central dogma of molecular biology, which states that DNA directs its own replication and its transcription into RNA (ribonucleic acid), which, in turn, directs translation to form proteins.
Proteins are one of the major classes of macromolecules that form the foundation of biology. A protein consists of one or more chains of amino acids, called polypeptides. Each protein is defined by its unique sequence of amino acids, which is determined by the nucleotide sequence of the gene that encodes it. These chains of amino acids fold into specific three-dimensional structures, which dictate the activity and function of the protein.
Each protein is defined by a specific order of amino acids that is encoded by the nucleotide sequence of the gene. An amino acid is specified by a combination of three nucleotides called a codon. Because DNA is composed of four different nucleotides, there are 4 × 4 × 4 = 64 possible codons in the genetic code. Since these 64 codons encode only 20 standard amino acids (plus stop signals that end translation), a particular amino acid may be specified by several codons.
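The codon count can be checked by enumerating every ordered triple of bases. The following is a minimal Python sketch; the variable names are illustrative and not part of the original text. The figures of three stop codons and 20 standard amino acids come from the standard genetic code.

```python
from itertools import product

NUCLEOTIDES = "ACGT"  # the four DNA bases

# Every ordered triple of bases is one codon: 4 * 4 * 4 = 64.
codons = ["".join(triplet) for triplet in product(NUCLEOTIDES, repeat=3)]

print(len(codons))   # 64
print(codons[:4])    # ['AAA', 'AAC', 'AAG', 'AAT']

# In the standard genetic code, 3 of the 64 codons are stop signals,
# leaving 61 codons to specify 20 amino acids. Some amino acids must
# therefore be specified by more than one codon.
sense_codons = len(codons) - 3
print(sense_codons)  # 61
```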
To synthesize a protein from a gene, the DNA must first be transcribed into messenger RNA (mRNA) by an enzyme called RNA polymerase. In eukaryotic cells, this process occurs in the nucleus. After undergoing several modifications, the mRNA is exported from the nucleus into the cytoplasm, where a large complex of RNA and protein called the ribosome converts the mRNA into an amino acid sequence. The process of synthesizing a protein from the mRNA at the ribosome is called translation.
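The flow from DNA to mRNA to amino acid sequence can be illustrated with a short Python sketch. It rests on several simplifying assumptions: the input is the coding strand of the DNA, so the mRNA is the same sequence with T replaced by U; mRNA modifications are ignored; and translation starts at the first AUG and ends at the first stop codon. The function names transcribe and translate are hypothetical, and the table uses the standard genetic code with one-letter amino acid symbols.

```python
from itertools import product

# Standard genetic code, listed in UCAG order for each codon position.
# '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(bases): amino_acid
    for bases, amino_acid in zip(product("UCAG", repeat=3), AMINO_ACIDS)
}

def transcribe(coding_dna):
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")

def translate(mrna):
    """Translate from the first AUG until a stop codon or the end of the mRNA."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

mrna = transcribe("ATGGCCTTTTAA")
print(mrna)             # AUGGCCUUUUAA
print(translate(mrna))  # MAF
```

Here AUG, GCC and UUU are read as methionine (M), alanine (A) and phenylalanine (F), and UAA ends the chain. A real cell does considerably more at each step, so this shows only the logic of the genetic code, not the biochemistry.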
After the mRNA sequence is translated into the corresponding amino acids, the polypeptide folds into a specific three-dimensional structure known as the native conformation of the protein. Protein structure is described at four levels: primary, secondary, tertiary, and quaternary. The linear polypeptide sequence obtained after translation is the primary structure. Hydrogen bonds between amino acids along the chain generate secondary structures, such as alpha helices and beta sheets, which recur throughout the protein. These secondary structures pack together through further chemical interactions, such as hydrogen bonds and salt bridges, to form the tertiary structure, which defines the protein's final shape, fold and activity. Proteins that consist of two or more subunits also have a quaternary structure, which describes how those subunits are arranged. Protein structures, however, are not rigid; they alter their conformation in response to protein-protein interactions and chemical changes within the cell.
Protein structures can be studied using techniques such as X-ray crystallography and Nuclear Magnetic Resonance (NMR) spectroscopy, which determine the positions of the atoms in a protein molecule. A large database called the Protein Data Bank contains the protein structures that have been solved using these techniques. Knowledge of a protein's structure may provide insight into its function and is also important for translational applications. For example, many therapeutics are protein kinase inhibitors, which bind to a specific pocket in the active region of the kinase and block its activity.
Proteins have many different functions in the cell. One of the best-known roles of a protein is that of an enzyme, a catalyst for a specific cellular reaction. Enzymes are involved in many cellular processes, such as DNA transcription and metabolism. Proteins can also participate in cell signaling pathways that regulate various cellular processes. For example, insulin is a protein released from beta cells in the pancreas that signals to target cells in the liver. Through this signaling pathway, cells communicate with one another using the insulin ‘signal’ to regulate glucose levels in the blood; this regulation is disrupted in diabetes. Proteins also serve structural roles within the body. Collagen and elastin are proteins that form connective tissue, such as that found in skin and tendons, while keratin is the main structural protein of hair and fingernails. Other structural proteins, such as actin and tubulin, make up the cytoskeleton, which provides cells with a distinct shape and size.
The complete set of proteins present in a cell is called the proteome. Using techniques such as mass spectrometry and protein microarrays, scientists can generate large datasets from cells affected by a disease of interest or exposed to particular stimulation conditions. These data can be analyzed against resources such as the Ingenuity Knowledge Base to generate large maps of the known interactions between proteins in a cell. In disease studies, this information facilitates the development of new drugs to target these proteins.
Knowledge of protein structure is critical for drug design. Chemists in pharmaceutical companies use structure-activity relationships (SAR) to optimize chemical compounds so that they bind specifically to their protein targets of interest. The compounds are then assayed according to the protein’s known function in the cell. Here, a detailed understanding of protein structure is essential to prevent drugs from binding other proteins and causing off-target, toxic side effects.