Database System Concepts
7th Edition
ISBN: 9780078022159
Author: Abraham Silberschatz Professor, Henry F. Korth, S. Sudarshan
Publisher: McGraw-Hill Education

Question

Could you help with the code and explanation? 

Write a function in Python that takes a DNA sequence and kmer size (integer) as input, and returns a dictionary of all kmers (keys) in the string with a list of
positions as values. The positions should start at 1. Use your function to make a dictionary of the 'seq' string below and print the dictionary.
The following sequence with size = 3 should return:
seq = 'ATCGTTCATCG'
kmerdict(seq,3)
{'ATC': [1, 8],
 'CAT': [7],
 'CGT': [3],
 'GTT': [4],
 'TCA': [6],
 'TCG': [2, 9],
 'TTC': [5]}
Note that the order in the output is not important.
Use your function and the second string and print the positions of all ATGs.

seq = 'ATCGTTCATCG'

def kmerdict(sequence, size):
    index = {}
    return index

seq2 = "CACTTCACTCCATGGCCCATCTCTCATGAATCAGTACCAAATGCACTCACATCATTATGCACGGCACTTGCCTCAGCGGTCTATACCCTGTTGCCATTTACCCATAACGCCC
print("Here are all the ATG positions in seq2: ")
Expert Solution
Explanation and approach

The approach I used is as follows:

  • First, find all possible substrings of the string and collect them in a list.
  • Keep only the substrings whose length equals the requested size (3 in our case).
  • For each such substring, find its occurrences in the sequence using startswith.
  • Add the position of each occurrence to a list, one by one.
  • Add an entry to the dictionary with the substring as the key and its list of positions as the value.
  • Finally, return the dictionary.
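
The steps above can be sketched as follows. This is a minimal sketch and not necessarily the code from the original answer: the helper name all_substrings and the demo string are assumptions made for illustration, and the helper keeps the full string as a candidate so that a size equal to the sequence length still works.

```python
def all_substrings(sequence):
    """Return every substring of `sequence` (hypothetical helper for this sketch)."""
    return [sequence[i:j]
            for i in range(len(sequence))
            for j in range(i + 1, len(sequence) + 1)]


def kmerdict(sequence, size):
    """Map each k-mer of length `size` to a list of its 1-based start positions."""
    index = {}
    # Steps 1-2: collect all substrings, keep only those of the requested size.
    kmers = {s for s in all_substrings(sequence) if len(s) == size}
    for kmer in kmers:
        # Steps 3-4: test every start offset with startswith and record each hit.
        positions = [i + 1
                     for i in range(len(sequence) - size + 1)
                     if sequence.startswith(kmer, i)]
        # Step 5: the k-mer is the key, its list of positions is the value.
        index[kmer] = positions
    return index


seq = 'ATCGTTCATCG'
print(kmerdict(seq, 3))  # key order may differ from the expected output

# Looking up one k-mer; `demo` is a made-up string, not the question's seq2.
demo = 'CCATGGATGA'
print(kmerdict(demo, 3).get('ATG', []))  # [3, 7]
```

Enumerating every substring first is quadratic in the sequence length. Taking only the slice sequence[i:i + size] at each start position i would likely give the same dictionary in a single pass, but the sketch follows the steps as described.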

Everything is explained in the code comments.

The code is added in step 2, along with a screenshot of the code and its output.

# Finding all substrings of a string

We use two indices: i holds the position of the current starting character, and j moves ahead of it to mark where each substring ends.

Example: Hello

First, i points at H and j also points at H, so the first substring is H.

Then i stays where it is and j increments to e, so the second substring is He; j increments again and we get Hel.

We continue like this until the end. When j reaches o we would get Hello, the full string, but we discard it and do not add it.

Then i increments to e and j starts at e as well, so the next substring is e; j again keeps incrementing until the end.
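
The two-index walk described here can be sketched like this. The function name proper_substrings is an assumption for illustration; the sketch mirrors the description, including the step that discards the full string.

```python
def proper_substrings(text):
    """List substrings using a start index i and an end index j, skipping the full string."""
    result = []
    for i in range(len(text)):           # i marks where the substring starts
        for j in range(i, len(text)):    # j marks the last character included
            piece = text[i:j + 1]
            if piece != text:            # the full string is discarded, as described
                result.append(piece)
    return result


subs = proper_substrings("Hello")
print(subs[:4])   # ['H', 'He', 'Hel', 'Hell']
print(subs[4:8])  # ['e', 'el', 'ell', 'ello']
```

Note that discarding the full string means a k-mer whose length equals the whole sequence would never be found by a later size filter, so that edge case needs separate handling if it matters.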

 
