Enter your name and EID here
This homework is due on Apr. 23, 2019 at 4:00pm. Please submit as a PDF file on Canvas. Before submission, please re-run all cells by clicking "Kernel" and selecting "Restart & Run All."
Problem 1 (4 pts): Write Python code that takes a string of the form "https://website.com" or "https://website.com/page1", extracts the name of the website (indicated here by "website"), and prints it. Make sure you extract only the part between "https://" and ".com".
# You will need re to solve this problem
import re
test_string1 = "https://github.com"
test_string2 = "https://twitter.com/dariyasydykova"
# Your code goes here
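One possible approach to Problem 1 is sketched below; it is not the only valid solution. The helper name `extract_site_name` is hypothetical, and the pattern assumes the URL starts with "https://" and that the host ends in ".com". Under those assumptions a subdomain such as "www." would be returned as part of the name.

```python
import re


def extract_site_name(url):
    """Return the text between "https://" and ".com", or None if absent.

    Hypothetical helper for illustration only.
    """
    # [^/]+? lazily matches host characters, so the capture stops at the
    # first ".com" and can never run past a "/" into the page path.
    match = re.search(r"https://([^/]+?)\.com", url)
    return match.group(1) if match else None


for url in ("https://github.com", "https://twitter.com/dariyasydykova"):
    print(extract_site_name(url))
```

Returning `None` for a non-matching string, rather than raising an error, is a design choice made for this sketch; the problem statement does not specify what should happen for other inputs.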
Problem 2 (6 pts): We will work with the E. coli genome. First, we download it:
from Bio import Entrez
Entrez.email = "your email goes here"
# Download E. coli K12 genome:
download_handle = Entrez.efetch(db="nucleotide", id="CP009685", rettype="gb", retmode="text")
data = download_handle.read()
download_handle.close()
# Store data into file "Ecoli_K12.gb":
with open("Ecoli_K12.gb", "w") as out_handle:
    out_handle.write(data)
Write code that loops over all features in the E. coli genome and counts the number of tRNAs and rRNAs it contains. Use regular expressions to find the answer.
# You will need re and SeqIO to solve this problem
import re
from Bio import SeqIO
# Your code goes here
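A minimal sketch of one way to approach Problem 2 follows. It assumes that tRNA and rRNA features can be recognized by their feature type string (`feature.type` on a Biopython `SeqFeature`), which is one interpretation of the task; matching on qualifiers such as the product name would be another. The function names `count_rna_types` and `count_rna_in_genbank` are hypothetical.

```python
import re
from collections import Counter

# Matches exactly "tRNA" or "rRNA"; other types such as "mRNA" or
# "tmRNA" are deliberately excluded by the anchors and character class.
RNA_TYPE = re.compile(r"^[tr]RNA$")


def count_rna_types(feature_types):
    """Count how often "tRNA" and "rRNA" occur in an iterable of type strings."""
    counts = Counter()
    for feature_type in feature_types:
        if RNA_TYPE.match(feature_type):
            counts[feature_type] += 1
    return counts


def count_rna_in_genbank(path):
    """Count tRNA and rRNA features in a GenBank file (requires Biopython)."""
    # Imported here so the counting logic above can be used without Biopython.
    from Bio import SeqIO

    feature_types = (
        feature.type
        for record in SeqIO.parse(path, "genbank")
        for feature in record.features
    )
    return count_rna_types(feature_types)
```

With the genome file downloaded above, calling `count_rna_in_genbank("Ecoli_K12.gb")` returns a `Counter` whose `"tRNA"` and `"rRNA"` entries hold the two counts.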