Lab Worksheet 9 Solutions

Problem 1: The list mylist defined below contains 10 entries, some of which are numbers (integers and floats) and some of which are strings. For this question, you will determine the mean value for all numeric entries in this list. To accomplish this goal, write a for-loop that iterates over each entry in the list. Your code should print a final statement that reads, "The mean of all numbers in the list is X" (where X has been properly replaced with the mean).

Hint - when solving this question, these functions may be useful: type(), len(), and sum().
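
As a quick illustration of the three hinted functions, here is a minimal sketch on a made-up list (the name sample and its values are hypothetical, not part of the worksheet). It also shows why the strings have to be skipped: sum() cannot add a string to a number.

```python
sample = [4, 2.5, "six"]  # hypothetical values, not from the worksheet

# type() reports the class of a value, so it can tell numbers from strings.
kinds = [type(item) for item in sample]

# len() gives the number of entries; sum() adds up an all-numeric list.
numeric_part = [4, 2.5]
total = sum(numeric_part)
count = len(numeric_part)
print("total = {}, count = {}".format(total, count))

# sum() raises a TypeError when the list mixes numbers and strings.
try:
    sum(sample)
    mixed_sum_failed = False
except TypeError:
    mixed_sum_failed = True
```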

In [1]:
mylist = ["hello", 89.21, -3, "goodbye", 21, 0.0056, -12.34, "thank you", "please", 999.44409]

# Two solutions are given below. Either one, or any other working approach, is fine!

#### One solution #####
# Save all numbers in mylist to a new list
numbers = []
for entry in mylist:
    if type(entry) is not str:
        numbers.append(entry)
        
# Determine mean of the numbers
mean_of_numbers = float(sum(numbers)) / len(numbers)

# Print mean
print "The mean of all numbers in the list is", mean_of_numbers


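#### Another solution ####
# Keep a running sum and a running count of the numeric entries.
# Starting both at 0. (a float) avoids integer division in Python 2.
thesum = 0.
thesize = 0.
for entry in mylist:
    # Only integers and floats contribute to the mean
    if type(entry) == int or type(entry) == float:
        thesum += entry
        thesize += 1
print "The mean of all numbers in the list is", thesum / thesize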
    
The mean of all numbers in the list is 182.386615
The mean of all numbers in the list is 182.386615
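
Both solutions above assume the list contains at least one number; with no numeric entries, the final division would raise a ZeroDivisionError. The sketch below is one possible way to wrap the same for-loop in a function that handles that case. The function name mean_of_numeric_entries is hypothetical, and the use of isinstance() in place of type() is a substitution, not the approach the hint suggests. The print call is written so that it runs under both Python 2.7 and Python 3.

```python
def mean_of_numeric_entries(entries):
    """Return the mean of the int and float entries, skipping everything else."""
    total = 0.0
    count = 0
    for entry in entries:
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(entry, (int, float)) and not isinstance(entry, bool):
            total += entry
            count += 1
    if count == 0:
        raise ValueError("no numeric entries to average")
    return total / count


mylist = ["hello", 89.21, -3, "goodbye", 21, 0.0056, -12.34, "thank you", "please", 999.44409]
result = mean_of_numeric_entries(mylist)
print("The mean of all numbers in the list is {}".format(result))
```

The result should agree with the value printed above, up to floating-point rounding in the last digits.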



Problem 2: Write a function to calculate the counts of A's, C's, G's, and T's in a DNA sequence. Your function should take a single argument (a string containing a DNA sequence) and return a dictionary of nucleotide counts. For example, if the argument "ACGTACGT" is provided, the function should return this dictionary: {"A":2, "C":2, "G":2, "T":2}. Once your function has been written, run it on the dna_string variable provided below, and print the returned dictionary.

Hint - use the string method .count() as part of your solution.
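
A minimal sketch of how .count() behaves, using the example sequence from the problem statement (the variable names here are illustrative only). Note that the method is case-sensitive and returns 0 for a character that never appears.

```python
example = "ACGTACGT"  # the example sequence from the problem statement

# .count() returns how many times the given substring occurs.
a_count = example.count("A")
print("A appears {} times".format(a_count))

# The match is case-sensitive, so a lowercase sequence has no "A".
lowercase_hits = "acgt".count("A")

# A character that is absent simply gives a count of zero.
absent = example.count("N")
```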

In [2]:
# Three functions that accomplish the same task are given below. Any one of them, or any other working option, is fully acceptable.
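def count_nucleotides1(dna):
    # Tally each character as it is seen; only characters present in dna get a key
    nuc_dict = {}
    for d in dna:
        if d in nuc_dict:
            nuc_dict[d] += 1
        else:
            nuc_dict[d] = 1
    return nuc_dict

def count_nucleotides2(dna):
    # Count each nucleotide explicitly with the string method .count()
    nuc_dict = {}
    nuc_dict["A"] = dna.count("A")
    nuc_dict["C"] = dna.count("C")
    nuc_dict["G"] = dna.count("G")
    nuc_dict["T"] = dna.count("T")
    return nuc_dict

def count_nucleotides3(dna):
    # Same idea as above, but loop over the four nucleotides instead of repeating the line
    nuc_dict = {}
    for nuc in ["A", "C", "G", "T"]:
        nuc_dict[nuc] = dna.count(nuc)
    return nuc_dict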


# Variable to call the function on:
dna_string = "ATCGAGCTATACCGATACAGGCTGGTATAAAAGATTC"
print count_nucleotides1(dna_string)
print count_nucleotides2(dna_string)
print count_nucleotides3(dna_string)
{'A': 13, 'C': 7, 'T': 9, 'G': 8}
{'A': 13, 'C': 7, 'T': 9, 'G': 8}
{'A': 13, 'C': 7, 'T': 9, 'G': 8}
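
The three functions agree on dna_string, but they can differ on other inputs: count_nucleotides1 adds a key for every character it sees and omits nucleotides that never occur, while the other two always return exactly the four keys A, C, G, and T. As one more option, here is a sketch built on collections.Counter from the standard library. The function name count_nucleotides_counter is hypothetical, and restricting the result to the four nucleotides is an assumption about the desired behavior.

```python
from collections import Counter


def count_nucleotides_counter(dna):
    """Return counts for A, C, G, and T only, using 0 for any that are absent."""
    counts = Counter(dna)  # tallies every character in a single pass
    # A Counter returns 0 for missing keys, so absent nucleotides still appear.
    return {nuc: counts[nuc] for nuc in "ACGT"}


dna_string = "ATCGAGCTATACCGATACAGGCTGGTATAAAAGATTC"
nuc_counts = count_nucleotides_counter(dna_string)
print(nuc_counts)
```

For dna_string this gives the same four counts as above (the order in which the keys are printed may differ).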