Week 8 Discussion

School

University of Maryland Global Campus (UMGC)

Course

617

Subject

Biology

Date

Apr 3, 2024

Type

docx

Pages

3

Uploaded by DeanGullMaster970

Write a program that uses a regular expression to extract the accession number of a sequence in a FASTA file and print it to the screen. For instance, if the sequence file contains:

    >AF145348.1 Glycine max peroxidase (Prx2b) mRNA, complete cds
    ATATTTTATTTGCACTACTACTACTACTTTATTTAAGCAGAAGCAGGCGAGTGAGGAAGAGCGAAGAGTG
    AAGAGAATGGCTCCCAAGGGTTTAACCTTTCTGGCTGTGTTAATATGCGTTTCAGCACTGTCACTGAGTC
    CTTCTGTTGCGGGGGAAGGGCAAAATAATGGCCTTGTTATGAACTTCTACAAGGAATCATGCCCTCAGGC
    TGAAGACATCATCACAGAACAAGTCAAGCTTCTCTACAAGCGCCACAAGAACACTGCTTTCTCCTGGCTC
    AGGAACATCTTCCATGACTGTGCTGTTCAGAGTTGTGATGCTTCACTGTTGCTGGACTCCACAAGAAGGA
    GCTTGTCTGAGAAGGAAACAGATAGAAGCTTTGGGTTGAGAAATTTCAGGTACATTGAGACCATCAAAGA

The output should be:

    AF145348.1

    import re

    def extract_accession_number(fasta_file):
        # Read the whole file, then look for the first header line.
        with open(fasta_file, 'r') as file:
            data = file.read()
        # A FASTA header starts with '>'; the accession number is the run of
        # non-whitespace characters that immediately follows it.
        match = re.search(r'^>(\S+)', data, re.MULTILINE)
        if match:
            accession_number = match.group(1)
            print(accession_number)
        else:
            print("No accession number found in the FASTA file.")

    fasta_file = "fasta_file.txt"
    extract_accession_number(fasta_file)

Write a program that produces the following output when searching for the word RNA in the paragraph below:

Output:

    (RNAi) ends at position 49
    (dsRNA) ends at position 211
    (ssRNAs) ends at position 374
    mRNAs. ends at position 431

Search Text (you can assign this to a string in your program):

"Several rapidly developing RNA interference (RNAi) methodologies hold the promise to selectively inhibit gene expression in mammals. RNAi is an innate cellular process activated when a double-stranded RNA (dsRNA) molecule of greater than 19 duplex nucleotides enters the cell, causing the degradation of not only the invading dsRNA molecule, but also single-stranded (ssRNAs) RNAs of identical sequences, including endogenous mRNAs."

    text = """Several rapidly developing RNA interference (RNAi) methodologies hold the promise to selectively inhibit gene expression in mammals. RNAi is an innate cellular process activated when a double-stranded RNA (dsRNA) molecule of greater than 19 duplex nucleotides enters the cell, causing the degradation of not only the invading dsRNA molecule, but also single-stranded (ssRNAs) RNAs of identical sequences, including endogenous mRNAs."""

    def find_rna_variants(text):
        # Variants are listed by hand; str.find returns the first occurrence
        # of each one, or -1 if it is absent.
        variants = ["RNAi", "dsRNA", "ssRNAs", "mRNAs"]
        for variant in variants:
            index = text.find(variant)
            if index != -1:
                # Position just past the last letter of the variant.
                end_index = index + len(variant)
                print(f"{variant} ends at position {end_index}")

    find_rna_variants(text)
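The find-based program above prints the same positions as the expected output, but its labels omit the surrounding punctuation ("RNAi" rather than "(RNAi)"). One way to reproduce the labels as well is to match whole whitespace-delimited tokens with a regular expression. The sketch below assumes the intended rule is "RNA with at least one non-space character on each side"; the pattern \S+RNA\S+ and the helper name find_rna_tokens are illustrative choices, not part of the assignment. It reports the zero-based index of each token's last character.

```python
import re

# Search text from the assignment, split across lines only for readability.
text = (
    "Several rapidly developing RNA interference (RNAi) methodologies hold "
    "the promise to selectively inhibit gene expression in mammals. RNAi is "
    "an innate cellular process activated when a double-stranded RNA (dsRNA) "
    "molecule of greater than 19 duplex nucleotides enters the cell, causing "
    "the degradation of not only the invading dsRNA molecule, but also "
    "single-stranded (ssRNAs) RNAs of identical sequences, including "
    "endogenous mRNAs."
)


def find_rna_tokens(text):
    """Return (token, last_index) pairs for tokens that embed 'RNA'.

    Assumed rule: 'RNA' must have at least one non-space character before
    and after it, so bare 'RNA', 'RNAi', 'dsRNA' and 'RNAs' are skipped.
    """
    results = []
    for match in re.finditer(r"\S+RNA\S+", text):
        # match.end() is one past the last character, so subtract 1.
        results.append((match.group(), match.end() - 1))
    return results


for token, position in find_rna_tokens(text):
    print(f"{token} ends at position {position}")
```

Using re.finditer avoids maintaining a hand-written list of variants, at the cost of depending on the assumed matching rule; a different rule would need a different pattern.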