{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Lab Worksheet 13\n",
    "\n",
    "For this assignment, your goal is to implement the Needleman-Wunsch algorithm in Python. You can read more about the Needleman-Wunsch algorithm on [Wikipedia](https://en.wikipedia.org/wiki/Needleman–Wunsch_algorithm). The Wikipedia page contains psuedo-code which you might find helpful.\n",
    "\n",
    "## Part 1\n",
    "\n",
    "Write a function that takes two sequences as input, and returns a matrix of scores as we saw in Class 25. You **do not** have to do the back-tracing, just fill out the matrix. \n",
    "\n",
    "To get you started, a matrix can be represented in Python as a list of lists. Let's say we want to make a matrix that looks like this:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<table>\n",
    "<tr>\n",
    "<td>1</td>\n",
    "<td>3</td>\n",
    "<td>5</td>\n",
    "<td>7</td>\n",
    "</tr>\n",
    "<tr>\n",
    "<td>2</td>\n",
    "<td>3</td>\n",
    "<td>4</td>\n",
    "<td>5</td>\n",
    "</tr>\n",
    "<tr>\n",
    "<td>5</td>\n",
    "<td>2</td>\n",
    "<td>20</td>\n",
    "<td>3</td>\n",
    "</tr>\n",
    "</table>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1\t3\t5\t7]\n",
      "\n",
      "[2\t3\t4\t5]\n",
      "\n",
      "[5\t2\t20\t3]\n",
      "\n",
      "The value in the 2nd row and the 0th column is: 5\n"
     ]
    }
   ],
   "source": [
    "# Here's how to make the matrix above from a list of lists\n",
    "my_matrix = []\n",
    "# Fill out the 0th row\n",
    "my_matrix.append([1, 3, 5, 7])\n",
    "# Fill out the 1st row\n",
    "my_matrix.append([2, 3, 4, 5])\n",
    "# Fill out the 2nd row\n",
    "my_matrix.append([5, 2, 20, 3])\n",
    "\n",
    "# Here is a helper function to print out matrices\n",
    "def print_matrix(mat):\n",
    "    # Loop over all rows\n",
    "    for i in range(0, len(mat)):\n",
    "        print(\"[\", end = \"\")\n",
    "        # Loop over each column in row i\n",
    "        for j in range(0, len(mat[i])):\n",
    "            # Print out the value in row i, column j\n",
    "            print(mat[i][j], end = \"\")\n",
    "            # Only add a tab if we're not in the last column\n",
    "            if j != len(mat[i]) - 1:\n",
    "                print(\"\\t\", end = \"\")\n",
    "        print(\"]\\n\")\n",
    "\n",
    "print_matrix(my_matrix)\n",
    "\n",
    "# To retrieve the value from the 2nd row, in the 0th column, is relatively simple:\n",
    "print(\"The value in the 2nd row and the 0th column is:\", my_matrix[2][0])\n",
    "\n",
    "# The format is always my_matrix[row][column]. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Part 1 Hints\n",
    "\n",
    "Break the problem down into as many small steps as possible. Here are a few hints:\n",
    "\n",
    "* Before you calculate any scores, make an empty matrix of the appropriate size using the `zeros()` function defined below.\n",
    "* Fill out the 0th row and 0th column before you calculate any other scores.\n",
    "* The `max()` function will return the maximum value from a list of values. For example `max(1,7,3)` will return `7`.\n",
    "* Make liberal use of the `range()` function.\n",
    "* Use the `print_matrix()` function to print out your matrix as frequently as possible. Always make sure that your code is doing what you think it's doing!\n",
    "* Remember, in Python, we start counting from 0.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[0\t0\t0\t0\t0\t0\t0]\n",
      "\n",
      "[0\t0\t0\t0\t0\t0\t0]\n",
      "\n",
      "[0\t0\t0\t0\t0\t0\t0]\n",
      "\n",
      "[0\t0\t0\t0\t0\t0\t0]\n",
      "\n",
      "[0\t0\t0\t0\t0\t0\t0]\n",
      "\n",
      "[0\t0\t0\t0\t0\t0\t0]\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# Use these values to calculate scores\n",
    "gap_penalty = -1\n",
    "match_award = 1\n",
    "mismatch_penalty = -1\n",
    "\n",
    "# Make a score matrix with these two sequences\n",
    "seq1 = \"ATTACA\"\n",
    "seq2 = \"ATGCT\"\n",
    "\n",
    "# A function for making a matrix of zeroes\n",
    "def zeros(rows, cols):\n",
    "    # Define an empty list\n",
    "    retval = []\n",
    "    # Set up the rows of the matrix\n",
    "    for x in range(rows):\n",
    "        # For each row, add an empty list\n",
    "        retval.append([])\n",
    "        # Set up the columns in each row\n",
    "        for y in range(cols):\n",
    "            # Add a zero to each column in each row\n",
    "            retval[-1].append(0)\n",
    "    # Return the matrix of zeros\n",
    "    return retval\n",
    "\n",
    "# A function for determining the score between any two bases in alignment\n",
    "def match_score(alpha, beta):\n",
    "    if alpha == beta:\n",
    "        return match_award\n",
    "    elif alpha == '-' or beta == '-':\n",
    "        return gap_penalty\n",
    "    else:\n",
    "        return mismatch_penalty\n",
    "\n",
    "# The function that actually fills out a matrix of scores\n",
    "def needleman_wunsch(seq1, seq2):\n",
    "    \n",
    "    # length of two sequences\n",
    "    n = len(seq1) \n",
    "    m = len(seq2)\n",
    "    \n",
    "    # Generate matrix of zeros to store scores\n",
    "    score = zeros(m+1, n+1)\n",
    "    \n",
    "    ########################\n",
    "    # Your code starts here\n",
    "    ########################\n",
    "    \n",
    "    # Use the following steps as a guide to calculate the score matrix\n",
    "    \n",
    "    # 1. Fill out first column\n",
    "    \n",
    "    # 2. Fill out first row\n",
    "    \n",
    "    # 3. Fill out all other values in the score matrix\n",
    "\n",
    "    # Return the final score matrix\n",
    "    return score\n",
    "\n",
    "# Test out the needleman_wunsch() function\n",
    "print_matrix(needleman_wunsch(seq1, seq2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Part 2: If that was easy...\n",
    "\n",
    "Modify your code from Part 1 to back-trace through the score matrix and print out the final alignment. **HINT:** For the back-tracing, you'll want to use a `while` loop (or several of them)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "def needleman_wunsch(seq1, seq2):\n",
    "    \n",
    "    # Store length of two sequences\n",
    "    n = len(seq1)  \n",
    "    m = len(seq2)\n",
    "    \n",
    "    # Generate matrix of zeros to store scores\n",
    "    score = zeros(m+1, n+1)\n",
    "   \n",
    "    # Calculate score table\n",
    "    \n",
    "    # Create variables to store alignment\n",
    "    align1 = \"\"\n",
    "    align2 = \"\"\n",
    "    \n",
    "    # Back-trace from bottom right of the score matrix and compute the alignment \n",
    "    \n",
    "    return(align1, align2)\n",
    "\n",
    "output1, output2 = needleman_wunsch(seq1, seq2)\n",
    "\n",
    "print(output1 + \"\\n\" + output2)\n",
    "    \n",
    "    \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}