Lab Worksheet 12

Problem 1: Use Biopython to download 10 influenza hemagglutinin sequences, as we did in the Class 21 worksheet. Print the list of GenBank identifiers, then fetch all of the records and save them to a file called "influenza_HA.gb".

In [1]:
# You will need Entrez and SeqIO to solve this problem
from Bio import Entrez, SeqIO

# NCBI asks for a contact email address with every Entrez request
Entrez.email = "your email goes here"

# Your code goes here
    

Problem 2: Restriction enzymes cut DNA at specific recognition motifs (short sequence patterns, usually fewer than 10 nucleotides long). Some restriction enzymes recognize degenerate motifs. That is, they recognize several closely related motifs that differ at only 1 or 2 nucleotide positions.

Using your sequence file from Problem 1 and regular expressions, determine whether any of the influenza sequences contain the following restriction sites:

  • EcoRI: GAATTC
  • BisI: GCNGC, where N represents any nucleotide
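
As a hint, here is a minimal sketch of how a degenerate motif can be turned into a regular expression. It runs on a made-up toy sequence rather than on the influenza records, and the names `IUPAC_TO_REGEX`, `site_to_regex`, `find_sites`, and `toy_sequence` are hypothetical helpers invented for this illustration, not part of Biopython or of the class code. The idea is that each degenerate letter (such as N) becomes a character class listing the nucleotides it stands for.

```python
import re

# Standard IUPAC nucleotide codes, each mapped to a regex fragment.
# (This table is written out by hand for illustration.)
IUPAC_TO_REGEX = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}

def site_to_regex(site):
    """Translate a recognition site such as 'GCNGC' into a regex string."""
    return "".join(IUPAC_TO_REGEX[base] for base in site.upper())

def find_sites(site, sequence):
    """Return the 0-based start positions of non-overlapping matches."""
    pattern = re.compile(site_to_regex(site))
    return [match.start() for match in pattern.finditer(sequence.upper())]

# Toy sequence made up for this example (not real influenza data)
toy_sequence = "ATGAATTCGGCAGCTTGCTGCA"

ecori_hits = find_sites("GAATTC", toy_sequence)
bisi_hits = find_sites("GCNGC", toy_sequence)

print(site_to_regex("GCNGC"))  # GC[ACGT]GC
print(ecori_hits)              # [2]
print(bisi_hits)               # [9, 16]
```

Note that `finditer` reports non-overlapping matches only. That is enough to decide whether a sequence contains a site at all, but counting every occurrence of a site that can overlap itself would need a different approach, such as a lookahead pattern.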

In [2]:
# You'll need the re module to solve this problem
import re

# Your code goes here