
Database System Concepts
7th Edition
ISBN: 9780078022159
Author: Abraham Silberschatz, Henry F. Korth, S. Sudarshan
Publisher: McGraw-Hill Education
Question
I need help with the following Python assignment.
Background
In Lectures 5 and 6, we learned Python programming skills for processing biological
sequences and pattern searching. In this assignment, you are required to write Python programs
to practice processing biological sequences and pattern searching.
Question 1 (3%)
Write a function that checks whether an input sequence is a valid protein sequence. The function
template is given below. The function should return True if the input sequence is a
valid protein sequence, and False otherwise. The following is the list of valid amino acid
symbols:
"A", "C", "D", "E", "F", "G", "H", "I", "L", "M", "N", "P", "Q", "R", "S",
"T", "V", "W", "Y"
    def validate_protein(protein_seq):
        """Checks if protein sequence is valid. Returns True if sequence is
        valid, or False otherwise."""
        # To be completed...
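One possible way to complete the Question 1 template is sketched below. It uses the symbol list exactly as printed in the question, which has 19 entries and does not include "K"; whether that omission is intended is not stated, so the set is easy to adjust. Accepting lowercase input and rejecting the empty string are assumptions of this sketch, not requirements from the assignment.

```python
# Symbols copied from the list in Question 1.
# Note: "K" is absent from that list, so it is treated as invalid here.
VALID_AMINO_ACIDS = set("ACDEFGHILMNPQRSTVWY")


def validate_protein(protein_seq):
    """Checks if protein sequence is valid. Returns True if sequence is
    valid, or False otherwise."""
    # Assumption: lowercase symbols are accepted by normalising to uppercase.
    seq = protein_seq.upper()
    # Assumption: an empty string is not a valid protein sequence.
    if not seq:
        return False
    # Every symbol must belong to the allowed set.
    return all(symbol in VALID_AMINO_ACIDS for symbol in seq)


if __name__ == "__main__":
    print(validate_protein("MACDEFG"))
    print(validate_protein("MACXEFG"))
```

A set is used for the allowed symbols so that each membership check is constant time, regardless of sequence length.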
Question 2 (5%)
Write a function that, given a sequence as its first argument, detects whether there are repeated
sub-sequences of size k (the second argument of the function). The result should be a dictionary
where keys are sub-sequences and values are the number of times they occur (at least 2). The
function template is given below. (Hint: you can make use of the function
"search_all_occurrences" shown in Lecture 6.)
    def number_of_repeated_subseq(seq, k):
        """Return a dictionary where keys are sub-sequences of size k and
        values are number of times they occur (at least 2)"""
        dic = {}
        # To be completed...
        return dic
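The Question 2 template could be filled in along the following lines. The lecture's "search_all_occurrences" is not reproduced in the question, so the helper below is a hypothetical stand-in with an assumed signature (sub-sequence first, sequence second) that returns the start index of every occurrence. Counting overlapping occurrences is also an assumption of this sketch.

```python
def search_all_occurrences(sub_seq, seq):
    """Hypothetical stand-in for the Lecture 6 helper: return the start
    indices of every occurrence of sub_seq in seq, overlaps included."""
    positions = []
    start = seq.find(sub_seq)
    while start != -1:
        positions.append(start)
        # Advance by one so overlapping occurrences are also found.
        start = seq.find(sub_seq, start + 1)
    return positions


def number_of_repeated_subseq(seq, k):
    """Return a dictionary where keys are sub-sequences of size k and
    values are number of times they occur (at least 2)"""
    dic = {}
    if k <= 0:
        return dic
    # Slide a window of size k over the sequence.
    for i in range(len(seq) - k + 1):
        sub = seq[i:i + k]
        if sub in dic:
            continue  # already counted
        count = len(search_all_occurrences(sub, seq))
        if count >= 2:
            dic[sub] = count
    return dic


if __name__ == "__main__":
    print(number_of_repeated_subseq("ATATAT", 2))
```

In "ATATAT" with k = 2, "AT" starts at positions 0, 2 and 4 and "TA" at positions 1 and 3, so both are reported; a sub-sequence that occurs only once is left out of the dictionary.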
Question 3 (7%)
Write a function that performs a text search in a text file. The function takes the filename and
a list of strings as inputs. The list of strings contains the patterns to be searched for within the
text file. A "*" character within a string is a wildcard, which can stand for any run of unknown
characters of length greater than zero. The function returns a dictionary where keys are the
patterns in the list of strings being searched, and values are the number of times they occur.
Your function should have the following template:
    def text_search(filename, patterns):
        """It searches the file filename and returns a dictionary of search
        result, showing patterns with number of occurrences"""
        dic = {}
        # To be completed...
        return dic
For example, suppose a file named "File 1.txt" contains the following text:
Mary is a girl.
They are girls.
Fish has gills.
Calling the following lines of code,
    strings = ["is", "gi*l"]
    print(text_search("File 1.txt", strings))
produces the following output:
    {'is': 2, 'girl': 2, 'gill': 1}
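A sketch of one way to implement Question 3 with the standard `re` module follows. It makes several assumptions that the assignment does not settle: the second example pattern is read as "gi*l" (the transcription shows a digit, but the output keys end in the letter l); each "*" is matched non-greedily against one or more characters; matching is done line by line, so a wildcard never spans a line break; and, following the example output rather than the prose, the dictionary keys are the matched strings instead of the raw patterns. The helper name `pattern_to_regex` is introduced here for illustration only.

```python
import re


def pattern_to_regex(pattern):
    """Hypothetical helper: turn a pattern with '*' wildcards into a
    compiled regular expression."""
    # Escape the literal pieces so characters such as '.' stay literal.
    pieces = [re.escape(piece) for piece in pattern.split("*")]
    # Assumption: each '*' stands for one or more characters, matched
    # non-greedily so the shortest possible string is taken.
    return re.compile(".+?".join(pieces))


def text_search(filename, patterns):
    """It searches the file filename and returns a dictionary of search
    result, showing patterns with number of occurrences"""
    dic = {}
    with open(filename, encoding="utf-8") as handle:
        lines = handle.read().splitlines()
    for pattern in patterns:
        regex = pattern_to_regex(pattern)
        # Assumption: search line by line, counting non-overlapping matches.
        for line in lines:
            for match in regex.finditer(line):
                # Keys are the matched strings, as in the example output.
                key = match.group(0)
                dic[key] = dic.get(key, 0) + 1
    return dic
```

On the three example lines, "is" is found in "is" and inside "Fish", while "gi*l" matches "girl" twice and "gill" once, which agrees with the output shown in the assignment. A pattern that never matches simply does not appear in the result.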