# Homework 11 Solutions¶

Enter your name and EID here

Problem 2 of this homework is due on April 24, 2018 at 7:00pm. Please submit Part 2 as a PDF file on Canvas. Before submission, please re-run all cells by clicking "Kernel" and selecting "Restart & Run All."

Problem 1 is due, on paper, at the beginning of lab on April 25, 2018. Problem 1 should not contain any code!

Problem 1 (5 points): Using Smith-Waterman (not Needleman-Wunsch!), align the following two sequences by hand:

ACAGC
ACAAGT

Draw out a score matrix, with the back-tracing arrows, using the following scoring function:

Match: +2
Mismatch: -2
Gap: -1

After you have filled out your score matrix, be sure to write out the final alignment.

Alignment:

ACA-G
ACAAG

Alignment:

AC-AG
ACAAG

Both alignments have a score of 7.

Problem 2 (5 points): Modify the code from the Lab 13 Worksheet, Part 1 so that it runs the Smith-Waterman algorithm. Several helper functions are provided for you below. Your function final should produce the matrix of scores only. You do not need to do back-tracing. Use the same scoring function as in Problem 1.

Run the sequences from Problem 1 through your function and print the output using print_matrix().

In [3]:
# Use these values to calculate scores
match_award = 2
mismatch_penalty = -2
gap_penalty = -1

# Make a score matrix with these two sequences
seq1 = "ACAGC"
seq2 = "ACAAGT"

# Here is a helper function to print out matrices
def print_matrix(mat):
# Loop over all rows
for i in range(0, len(mat)):
print("[", end = "")
# Loop over each column in row i
for j in range(0, len(mat[i])):
# Print out the value in row i, column j
print(mat[i][j], end = "")
# Only add a tab if we're not in the last column
if j != len(mat[i]) - 1:
print("\t", end = "")
print("]\n")

# A function for making a matrix of zeroes
def zeros(rows, cols):
# Define an empty list
retval = []
# Set up the rows of the matrix
for x in range(rows):
# For each row, add an empty list
retval.append([])
# Set up the columns in each row
for y in range(cols):
# Add a zero to each column in each row
retval[-1].append(0)
# Return the matrix of zeros
return retval

# A function for determining the score between any two bases in alignment
def match_score(alpha, beta):
if alpha == beta:
return match_award
elif alpha == '-' or beta == '-':
return gap_penalty
else:
return mismatch_penalty

# The function that actually fills out a matrix of scores
def smith_waterman(seq1, seq2):

# length of two sequences
n = len(seq1)
m = len(seq2)

# Generate matrix of zeros to store scores
score = zeros(m+1, n+1)

# Calculate score table

# Your code goes here

# Fill out first column
for i in range(0, m + 1):
score[i][0] = 0

# Fill out first row
for j in range(0, n + 1):
score[0][j] = 0

# Fill out all other values in the score matrix
for i in range(1, m + 1):
for j in range(1, n + 1):
# Calculate the score by checking the top, left, and diagonal cells
match = score[i - 1][j - 1] + match_score(seq1[j-1], seq2[i-1])
delete = score[i - 1][j] + gap_penalty
insert = score[i][j - 1] + gap_penalty
# Record the maximum score from the three possible scores calculated above
score[i][j] = max(0, match, delete, insert)

return score

print_matrix(smith_waterman(seq1, seq2))

[0	0	0	0	0	0]

[0	2	1	2	1	0]

[0	1	4	3	2	3]

[0	2	3	6	5	4]

[0	2	2	5	4	3]

[0	1	1	4	7	6]

[0	0	0	3	6	5]


In [ ]: