8_Proj_S23

.docx

School

Northeastern University

Course

2301

Subject

Biology

Date

Apr 3, 2024

Type

docx

Pages

3

Uploaded by KidBeeMaster1068

Project Eight: Homologs and conserved domains (15 points)

Phylogenetic analyses focus on homologs and can yield a lot of useful information about a gene and its products, including the evolutionary relationships among organisms in which it is found, regions of conservation in the sequence that indicate active sites in a protein, and regions of variability that can indicate rapid evolution in response to pathogens.

In a previous project you used pairwise alignments through the BLAST programs to identify homologs. Identifying homologs is the first step in carrying out a phylogenetic analysis. In this project, you'll see how comparison of amino acid sequences of homologs reveals regions (conserved domains) that are important for a protein's function. For the next project, you'll align homologs and generate a phylogenetic tree.

The "Phylogenetics Guide" (available on your section's course site) is meant to introduce you to the key concepts and considerations in performing good, sound phylogenetic analyses. Please read it and refer to it when answering the questions in part I.

I. Concepts in Phylogenetics (refer to the Guide to answer these questions)

1. Define “homolog” in your own words, and provide an example.
A homolog is a gene that two species have inherited from a common ancestor. An example would be a gene found in both the house cat and the leopard that both inherited from their shared ancestor.

2. What are two reasons a pair of homologs could have a high percentage of matching positions (a high % identity)?
First, there are only four bases, so there are few options at each position and many positions will match simply because variation is limited. Second, homologs are derived from a common ancestor and therefore share similar sequences, which leads to a high number of matching positions.

3. Which e-value indicates a more significant hit, 1 x 10^-6 or 1 x 10^-3?
1 x 10^-6 indicates a more significant hit, because a lower e-value means fewer hits of that quality are expected by chance.

4. What do the internal nodes on a phylogenetic tree represent?
Each internal node on a phylogenetic tree represents the common ancestor of the branches that are connected to it.

5. What do the numbers at the nodes of a phylogenetic tree usually represent?
The numbers at the nodes of a phylogenetic tree usually represent the level of support for the node: the higher the number, the more support.

6. What does the number on the scale bar of a phylogenetic tree usually represent?
The number on the scale bar of a phylogenetic tree represents branch length, which corresponds to the number of substitutions per site or the percentage of sites with substitutions.

II. Visualizing conserved domains in protein structures

We will be working with a gene that codes for the enzyme phosphoglycerate kinase (PGK1). Go to the NCBI Gene database and search for mouse PGK1. Pay special attention to the summary to understand the function of the protein product.

A protein domain is a distinct portion of a protein. A conserved domain is a sequence or structural unit of a protein that is recurrently seen throughout molecular evolution. Conserved domains can be thought of as foundational “building blocks” of proteins. These gene regions sometimes undergo duplications and recombination that result in proteins with different functions. When a region is highly conserved among many species, it indicates that mutations to that specific region are not tolerated, which in turn indicates that the region is important either structurally or functionally.

Follow the steps below to identify a conserved domain of PGK1 and visualize its sequence and 3D structure.

Use the back button to return to the page that resulted from your search.
In the box with information on the gene, click on "RefSeq proteins." Remember that RefSeq is a curated database that links records for a gene’s DNA, mRNA, and protein.
On the right side of the page, click on Identify Conserved Domains.
On the conserved domains page, find the Graphical Summary section. Click on “zoom to residue level”.

The term "residue" refers to an amino acid when it is part of a polypeptide chain. When amino acids form peptide bonds, the elements of water are removed, leaving behind amino acid "residues". So when referring to protein sequences, "residue" is interchangeable with "amino acid".

Scroll along the sequence until you find the specific residues involved in the “substrate binding site”. The site is defined by five residues that are conserved across species, shown in bold.

7. Write each of the five residues and its position below (the first one is done for you): (4 points)
1. D 24
2. N 26
3. R 39
4. H 63
5. R 123

8. What are the names of the other three sites or regions visible in the graph display?
Click on any of the orange arrows that are part of the “substrate binding site" to be brought to the Conserved Protein Domain Family page. Scroll down the page to see the MSA of this protein in species (listed to the right of the MSA) in which these substrate binding sites are very well conserved. Modify the “Row Display” option to select the largest number of rows.
The other sites on the graph display are the catalytic site, the ADP binding site, and the hinge regions.

The top row of the alignment shows the sequences in human PGK1 that represent conserved domains. The rows beneath it each show the same conserved regions in numerous homologs of PGK1. For example, near the top is the sequence of these conserved regions in the organism known as "baker's yeast".

9. How conserved (how similar in sequence) would you say this protein is across species? Not very, somewhat, very much?

Scroll to the top of the page and click on “Structure” on the header ribbon. Enter "mouse PGK1" in the window
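
Questions 2 and 9 both depend on how similar aligned sequences are. As a rough illustration only (this is not part of the assignment and not how any NCBI tool is implemented), the Python sketch below computes percent identity for a pair of aligned sequences and a simple per-column conservation score for a small alignment. The function names `percent_identity` and `column_conservation` are hypothetical, and `toy_alignment` is made-up data, not real PGK1 sequence. It also assumes that gap columns are skipped when computing identity; tools differ in how they treat gaps, so their reported values may not match.

```python
from collections import Counter


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of compared columns in which both sequences have the same residue.

    Assumption: columns with a gap ("-") in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def column_conservation(alignment: list[str]) -> list[float]:
    """For each column, the fraction of all sequences carrying the most common residue.

    Gaps never count toward the most common residue, so they lower the score.
    """
    n_sequences = len(alignment)
    scores = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        top_count = Counter(residues).most_common(1)[0][1] if residues else 0
        scores.append(top_count / n_sequences)
    return scores


# Hypothetical toy alignment, NOT real PGK1 sequences.
toy_alignment = [
    "MDLNKR",
    "MDLNRR",
    "MD-NKR",
]

identity = percent_identity(toy_alignment[0], toy_alignment[1])
conservation = column_conservation(toy_alignment)

print(f"Identity of first two sequences: {identity:.1f}%")
print("Per-column conservation:", [round(score, 2) for score in conservation])
```

Columns that score 1.0 are identical in every sequence, which is the pattern the conserved substrate binding residues would be expected to show in the MSA.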