
Biology: The Dynamic Science (MindTap Course List)
4th Edition
ISBN: 9781305389892
Author: Peter J. Russell, Paul E. Hertz, Beverly McMillan
Publisher: Cengage Learning
Chapter 19: Genomes And Proteomes
Section: Chapter Questions
Problem 1ITD: Below is a sequence of 540 bases from a genome. What information would you use to find the...
Question

Eukaryotic genomes are replete with repetitive sequences that make genome assembly from sequence reads difficult. For example, sequences such as CTCTCTCTCT . . . (tandem repeats of the dinucleotide sequence CT) are found at many chromosomal locations, with variable numbers (n) of the CT repeating unit at each location. Scientists can assemble genomes despite these difficulties by using the paired-end sequencing strategy diagrammed in Fig. 9.9. In other words, they can make libraries with genomic inserts of defined size, and then sequence both ends of individual clones.

Following are 12 DNA sequence reads from six cloned fragments analyzed in a genome project. 1A and 1B represent the two end reads from clone 1, 2A and 2B the two end reads from clone 2, etc. Clones 1–4 were obtained from a library in which the genomic inserts are about 2 kb long, while the inserts in clones 5 and 6 are about 4 kb long. All of these sequences have their 5′ ends at the left and their 3′ ends at the right. To simplify your analysis, assume that these sequences together represent two genomic locations (loci; singular locus), each of which contains a (CT)n repeat, and that each of the 12 sequences overlaps with one and only one other sequence.

1A: CCGGGAACTCCTAGTGCCTGTGGCACGATCCTATCAAC
1B: AGGACTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCT
2A: GTTTTTGAGAGAGAGAGAGAGAGAGAGAGACCTGGGGG
2B: ACGTAGCTAGCTAACCGGTTAAGCGCGCATTACTTCAA
3A: CTCTCTCTCTCTCTCTCTCTCAAAAACTATGGAAATTT
3B: TAGTGATAGGTAACCCAGGTACTGCACCACCAGAAGTC
4A: GGCCGGCCGTTGTTGACGCAATCATGAATTTAATGCCG
4B: TCATGGGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGA
5A: TAGTGCCTGTGGCACGATCCTATCAACTAACGACTGCT
5B: AAGGAAAGGCCGGCCGTTGTTGACGCAATCATGAATTT
6A: CAGCAGCTAGTGATAGGTAACCCAGGTACTGCACCACC
6B: GGACTATACGTAGCTAGCTAACCGGTTAAGCGCGCATT

a. Diagram the two loci, showing the locations of the repetitive DNA and the relative positions and orientations of the 12 DNA sequence reads.
b. If possible, indicate how many copies of the CT repeating unit reside at either locus.
c. Are the data compatible with the alternative hypothesis that these clones actually represent two alleles of a single locus that differ in the number of CT repeating units?
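
One way to approach part (a) is to search the 12 reads for end-to-end overlaps, remembering that a read may match another read directly or as its reverse complement. The Python sketch below is only an illustration, not the textbook's method: the function names, the configuration labels, and the 20-base minimum overlap are assumptions chosen for this example. It reports every candidate overlap it finds. Overlaps made up entirely of the CT/GA repeat can match more than one partner, so they have to be resolved with the paired-end and insert-size information given in the problem rather than by sequence alone.

```python
from itertools import combinations

# The 12 end reads exactly as given in the problem (5' end on the left).
READS = {
    "1A": "CCGGGAACTCCTAGTGCCTGTGGCACGATCCTATCAAC",
    "1B": "AGGACTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCT",
    "2A": "GTTTTTGAGAGAGAGAGAGAGAGAGAGAGACCTGGGGG",
    "2B": "ACGTAGCTAGCTAACCGGTTAAGCGCGCATTACTTCAA",
    "3A": "CTCTCTCTCTCTCTCTCTCTCAAAAACTATGGAAATTT",
    "3B": "TAGTGATAGGTAACCCAGGTACTGCACCACCAGAAGTC",
    "4A": "GGCCGGCCGTTGTTGACGCAATCATGAATTTAATGCCG",
    "4B": "TCATGGGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGA",
    "5A": "TAGTGCCTGTGGCACGATCCTATCAACTAACGACTGCT",
    "5B": "AAGGAAAGGCCGGCCGTTGTTGACGCAATCATGAATTT",
    "6A": "CAGCAGCTAGTGATAGGTAACCCAGGTACTGCACCACC",
    "6B": "GGACTATACGTAGCTAGCTAACCGGTTAAGCGCGCATT",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def suffix_prefix_overlap(left, right, min_len=20):
    """Length of the longest suffix of `left` equal to a prefix of `right`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    The threshold of 20 is an arbitrary choice for this sketch.
    """
    for k in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def pair_overlaps(a, b, min_len=20):
    """Check the four ways two reads can overlap end to end."""
    configs = {
        "a then b, same strand": (a, b),
        "b then a, same strand": (b, a),
        "3' ends overlap, opposite strands": (a, revcomp(b)),
        "5' ends overlap, opposite strands": (revcomp(a), b),
    }
    hits = {}
    for label, (left, right) in configs.items():
        k = suffix_prefix_overlap(left, right, min_len)
        if k:
            hits[label] = k
    return hits


# Report every candidate overlap between two different reads.
for name_a, name_b in combinations(READS, 2):
    for label, k in pair_overlaps(READS[name_a], READS[name_b]).items():
        print(f"{name_a} / {name_b}: {k} bases ({label}; a={name_a}, b={name_b})")
```

Reads whose overlap lies in unique sequence pair up unambiguously. A read that ends in the repeat may be listed with more than one partner, which is exactly the assembly difficulty the problem describes.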
