8_Proj_S23.docx
School: Northeastern University
Course: 2301
Subject: Biology
Date: Apr 3, 2024
Type: docx
Pages: 3
Uploaded by KidBeeMaster1068
Project Eight: Homologs and conserved domains (15 points)
Phylogenetic analyses focus on homologs and can yield a lot of useful information about a gene and its products, including the evolutionary relationships among organisms in which it is found, regions of conservation in the sequence that indicate active sites in a protein, and regions of variability that can indicate rapid evolution in response to pathogens. In a previous project you used pairwise alignments through the BLAST programs to identify homologs. Identifying homologs is the first step in carrying out a phylogenetic analysis. In this project, you'll see how comparison of amino acid sequences of homologs reveals regions (conserved domains) that are important for a protein's function. For the next project, you'll align homologs and generate a phylogenetic tree.
The "Phylogenetics Guide" (available on your section's course site) is meant to introduce you to the key concepts and considerations in performing good, sound phylogenetic analyses. Please read it and refer to it when answering the questions in part I.
I. Concepts in Phylogenetics (refer to the Guide to answer these questions)
1. Define “homolog” in your own words, and provide an example.
A homolog is a gene that two species inherited from a common ancestor. An example would be a gene shared by the house cat and the leopard because both inherited it from the same ancestor.
2. What are two reasons a pair of homologs could have a high percentage of matching positions (a high % identity)?
One reason is that there are only four bases, each with a single partner that it pairs with. With so few options there is little room for variation, so many positions match by chance. The other reason is that homologs are derived from a common ancestor and therefore share similar sequences, leading to a high number of matching positions.
3. Which e-value indicates a more significant hit, 1x10^-6 or 1x10^-3?
1x10^-6 indicates a more significant hit.
4. What do the internal nodes on a phylogenetic tree represent?
The internal nodes on a phylogenetic tree represent the common ancestor of the branches that are connected to them.
5. What do the numbers at the nodes of a phylogenetic tree usually represent?
The numbers at the nodes of a phylogenetic tree usually represent the level of support for the node: the higher the number, the more support.
6. What does the number on the scale bar of a phylogenetic tree usually represent?
The number on the scale bar of a phylogenetic tree represents branch length, which corresponds to the number of substitutions per site or the percentage of sites with substitutions.
II. Visualizing conserved domains in protein structures
We will be working with a gene that codes for the enzyme phosphoglycerate kinase (PGK1). Go to the NCBI Gene database and search for mouse PGK1. Pay special attention to the summary to understand the function of the protein product. A protein domain is a distinct portion of a protein. A conserved domain is a sequence or structural unit of a protein that is recurrently seen throughout molecular evolution. Conserved domains can be thought of as foundational “building blocks” of proteins. These gene regions sometimes undergo duplications and recombination that result in proteins with different functions. When a region is highly conserved among many species, it indicates that mutations to that specific region are not tolerated, which in turn indicates that the region is important either structurally or functionally. Follow the steps below to identify a conserved domain of PGK1 and visualize its sequence and 3D structure.
Use the back button to return to the page that resulted from your search. In the box with information on the gene, click on "RefSeq proteins." Remember that RefSeq is a curated database that links records for a gene’s DNA, mRNA, and protein.
On the right side of the page, click on Identify Conserved Domains.
On the conserved domains page, find the Graphical Summary section. Click on “zoom to residue level”. The term "residue" refers to an amino acid when it is part of a polypeptide chain. When amino acids form peptide bonds, the elements of water are removed, leaving behind amino acid "residues". So when referring to protein sequences, "residue" is interchangeable with "amino acid".
Scroll along the sequence until you find the specific residues involved in the “substrate binding site”. The site is defined by five residues that are conserved across species, shown in bold.
7. Write each of the five residues and its position below (the first one is done for you): (4 points)
1. D 24
2. N 26
3. R 39
4. H 63
5. R 123
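The residue-and-position notation used in question 7 (for example, D at position 24) counts positions from 1 along the protein. As a minimal sketch, the Python below shows how such site annotations could be checked against a sequence. It assumes a made-up placeholder sequence, not the real mouse PGK1 record, and the names `SUBSTRATE_SITE`, `check_sites`, and `build_toy_sequence` are hypothetical.

```python
# Substrate binding site residues as recorded in question 7 (1-based positions).
SUBSTRATE_SITE = {24: "D", 26: "N", 39: "R", 63: "H", 123: "R"}


def check_sites(sequence, sites):
    """Return {position: bool} telling whether each expected residue is present.

    Positions are 1-based, so position p corresponds to Python index p - 1.
    Positions beyond the end of the sequence are reported as False.
    """
    result = {}
    for position, residue in sites.items():
        in_range = 1 <= position <= len(sequence)
        result[position] = in_range and sequence[position - 1] == residue
    return result


def build_toy_sequence(sites, length=130, filler="A"):
    """Build a placeholder sequence (NOT real PGK1) carrying the listed residues."""
    chars = [filler] * length
    for position, residue in sites.items():
        chars[position - 1] = residue
    return "".join(chars)


toy_sequence = build_toy_sequence(SUBSTRATE_SITE)
site_check = check_sites(toy_sequence, SUBSTRATE_SITE)

# Compact labels such as "D24", ordered by position.
labels = [f"{residue}{position}" for position, residue in sorted(SUBSTRATE_SITE.items())]
print(labels)
print(all(site_check.values()))
```

To check a real protein, the placeholder would be replaced with the sequence copied from the RefSeq protein record.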
8. What are the names of the other three sites or regions visible in the graph display?
The other sites on the graph display are the catalytic site, the ADP binding site, and the hinge regions.
Click on any of the orange arrows that are part of the “substrate binding site" to be brought to the Conserved Protein Domain Family page. Scroll down the page to see the MSA (multiple sequence alignment) of this protein in species (listed to the right of the MSA) in which these substrate binding sites are very well conserved.
Modify the “Row Display” option to select the largest number of rows. The top row of the alignment shows the sequences in human PGK1 that represent conserved domains. The rows beneath each show the same conserved regions in numerous homologs of PGK1. For example, near the top is the sequence of these conserved regions in the organism known as "baker's yeast".
9. How conserved (how similar in sequence) would you say this protein is across species? Not very, somewhat, very much?
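One way to put a number on "how conserved" is to compute the percent identity between two aligned rows, and to count the columns that are identical in every row. The Python below is a minimal sketch of both ideas. The three-row alignment is invented for illustration and is not taken from the PGK1 alignment, and the names `percent_identity` and `fully_conserved_columns` are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned columns where both rows carry the same residue.

    Columns where either row has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def fully_conserved_columns(alignment):
    """Return 0-based indices of columns that are identical and ungapped in every row."""
    conserved = []
    for index, column in enumerate(zip(*alignment)):
        if "-" not in column and len(set(column)) == 1:
            conserved.append(index)
    return conserved


# Toy alignment, invented for illustration only.
toy_alignment = [
    "MKDNRLH",
    "MKDNRIH",
    "MRDN-LH",
]

print(round(percent_identity(toy_alignment[0], toy_alignment[1]), 1))
print(fully_conserved_columns(toy_alignment))
```

Skipping gapped columns is only one convention; some tools count gaps as mismatches, which would lower the reported identity.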
Scroll to the top of the page and click on “Structure” on the header ribbon.
Enter "mouse PGK1" in the window