Database System Concepts
7th Edition
ISBN: 9780078022159
Author: Abraham Silberschatz Professor, Henry F. Korth, S. Sudarshan
Publisher: McGraw-Hill Education

Question

I need help with this Python assignment.

Background
In Lecture 5 to Lecture 6, we learned the Python programming skills needed to process biological
sequences and search for patterns. In this assignment, you are required to write Python programs
to practice biological sequence processing and pattern searching.
Question 1 (3%)
Write a function that checks whether an input sequence is a valid protein sequence. The function
template is given below. The function should return True if the input sequence is a
valid protein sequence, and False otherwise. The following is the list of valid amino acid
symbols:
"A", "C", "D", "E", "F", "G", "H", "I", "L", "M", "N", "P", "Q", "R", "S",
"T", "V", "W", "Y"
def validate_protein(protein_seq):
    """Checks if protein sequence is valid. Returns True if sequence is
    valid, or False otherwise."""
    # To be completed...
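One possible way to complete validate_protein is sketched below. It is not an official solution. It assumes the symbol list is used exactly as given in the question (note that this list has 19 symbols and does not include "K", the standard one-letter code for lysine, so it may be worth confirming with the instructor), that lowercase input should be accepted, and that an empty sequence counts as invalid. The constant name VALID_AMINO_ACIDS is introduced here for illustration only.

```python
# Valid symbols copied verbatim from the question (assumption: the list is
# used as given; it does not contain "K").
VALID_AMINO_ACIDS = set("ACDEFGHILMNPQRSTVWY")


def validate_protein(protein_seq):
    """Checks if protein sequence is valid. Returns True if sequence is
    valid, or False otherwise."""
    # Assumption: lowercase letters are accepted by normalising to uppercase.
    seq = protein_seq.upper()
    # Assumption: an empty sequence is treated as invalid.
    if len(seq) == 0:
        return False
    # Every symbol must belong to the allowed set.
    return all(symbol in VALID_AMINO_ACIDS for symbol in seq)
```

A set is used for the symbol list so that each membership check takes constant time regardless of sequence length.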
Question 2 (5%)
Write a function that, given a sequence as its first argument, detects whether there are repeated
sub-sequences of size k (the second argument of the function). The result should be a dictionary
where keys are sub-sequences and values are the number of times they occur (at least 2). The
function template is given below. (Hint: you can make use of the function
"search_all_occurrences" shown in Lecture 6.)
def number_of_repeated_subseq(seq, k):
    """Return a dictionary where keys are sub-sequences of size k and
    values are number of times they occur (at least 2)"""
    dic = {}
    # To be completed...
    return dic
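A minimal sketch of one way to fill in this template follows. The lecture's search_all_occurrences is not shown in the question, so the version here is a hypothetical stand-in that returns the start index of every occurrence of a pattern; the real lecture function may differ. The sketch also assumes that overlapping occurrences are counted and that a non-positive k yields an empty dictionary.

```python
def search_all_occurrences(seq, pattern):
    """Hypothetical stand-in for the Lecture 6 helper: returns the start
    index of every (possibly overlapping) occurrence of pattern in seq."""
    positions = []
    for i in range(len(seq) - len(pattern) + 1):
        if seq[i:i + len(pattern)] == pattern:
            positions.append(i)
    return positions


def number_of_repeated_subseq(seq, k):
    """Return a dictionary where keys are sub-sequences of size k and
    values are number of times they occur (at least 2)"""
    dic = {}
    # Assumption: a non-positive k has no meaningful sub-sequences.
    if k <= 0:
        return dic
    # Slide a window of size k over the sequence.
    for i in range(len(seq) - k + 1):
        sub = seq[i:i + k]
        if sub not in dic:
            count = len(search_all_occurrences(seq, sub))
            # Keep only sub-sequences that occur at least twice.
            if count >= 2:
                dic[sub] = count
    return dic
```

For instance, under these assumptions number_of_repeated_subseq("ATATAT", 2) counts "AT" three times and "TA" twice, while a sequence with no repeats such as "ACGT" gives an empty dictionary.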
Question 3 (7%)
Write a function that performs a text search in a text file. The function takes the filename and
a list of strings as inputs. The list of strings contains the patterns to be searched for within the text file. A
"*" character within a string is a wildcard, which stands for a run of unknown characters
of any length greater than zero. The function returns a dictionary where keys are the patterns in the list
of strings being searched, and values are the number of times they occur. Your function should
have the following template:
def text_search(filename, patterns):
    """It searches the file filename and returns a dictionary of search
    result, showing patterns with number of occurrences"""
    dic = {}
    # To be completed...
    return dic
For example, suppose a file named "File 1.txt" contains the following text:
Mary is a girl.
They are girls.
Fish has gills.
By calling the following lines of code,
strings = ["is", "gi*l"]
print(text_search("File 1.txt", strings))
The following output is obtained:
{'is': 2, 'girl': 2, 'gill': 1}
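The sketch below is one way text_search could be written so that it produces the output shown in the example above. It is not an official solution and rests on several assumptions: the dictionary is keyed by the text that was actually matched (as the example output suggests), "*" is translated to the regular expression \S+? (one or more non-whitespace characters, matched non-greedily), matches are counted without overlap, the file is UTF-8 encoded, and patterns that never match are left out of the result.

```python
import re


def text_search(filename, patterns):
    """It searches the file filename and returns a dictionary of search
    result, showing patterns with number of occurrences"""
    dic = {}
    # Assumption: the file is small enough to read at once and is UTF-8.
    with open(filename, encoding="utf-8") as handle:
        text = handle.read()
    for pattern in patterns:
        # Assumption: empty patterns are ignored.
        if not pattern:
            continue
        # Escape the literal pieces and join them with a group meaning
        # "one or more non-whitespace characters" for every "*".
        regex = r"\S+?".join(re.escape(piece) for piece in pattern.split("*"))
        # Key the result by the matched text, as in the example output.
        for match in re.finditer(regex, text):
            found = match.group(0)
            dic[found] = dic.get(found, 0) + 1
    return dic
```

With the three-line "File 1.txt" from the example, "is" is found in "is" and "Fish", and "gi*l" matches "girl" twice and "gill" once. If the wildcard is meant to span spaces as well, the \S+? group would need to change.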