Last week, we talked about how to use Entrez to access genomic information from the NCBI database. This week, we're focusing on how to use Entrez and Medline to search the PubMed (literature) database. For this module, a list of important abbreviations and their meanings can be found here: https://www.nlm.nih.gov/bsd/mms/medlineelements.html
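The abbreviations on that page are the field tags used in MEDLINE-format text, where each line starts with a tag (padded to four characters), then `- `, then the value. Indented lines continue the previous field. The sketch below is a simplified, standard-library illustration of that layout. It is not Biopython's parser, and the sample record is a made-up placeholder, not a real publication. Unlike this sketch, which stores every field as a list, `Medline.parse()` returns some fields (such as TI and AB) as single strings.

```python
# Simplified illustration of the MEDLINE tag-value layout (not Biopython's parser).
# SAMPLE is a made-up placeholder, not a real PubMed record.
SAMPLE = """\
PMID- 00000000
TI  - A placeholder title that wraps
      onto a second line.
AU  - Doe J
AU  - Roe R
SO  - Placeholder Journal. 2020;1(1):1-2.

PMID- 00000001
TI  - Another placeholder title.
AU  - Doe J
"""


def parse_medline_text(text):
    """Parse MEDLINE-format text into a list of {tag: [values]} dicts."""
    records = []
    current = {}
    last_tag = None
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current record.
            if current:
                records.append(current)
                current = {}
                last_tag = None
            continue
        if line.startswith("      ") and last_tag is not None:
            # Indented lines continue the most recent value.
            current[last_tag][-1] += " " + line.strip()
            continue
        tag, _, value = line.partition("- ")
        tag = tag.strip()
        # Repeated tags (for example AU) accumulate in a list.
        current.setdefault(tag, []).append(value.strip())
        last_tag = tag
    if current:
        records.append(current)
    return records


records = parse_medline_text(SAMPLE)
for key, value in records[0].items():
    print(key, "->", value)
```

Repeated tags such as AU (one line per author) are the reason list-valued fields appear in parsed records.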
Problem 1:
(a) Download the Medline record for the publication with PubMed ID 32191846 and parse it with the Medline.parse() function. Then print all of the key-value pairs in that record.
(b) Use an Entrez esearch query of the pubmed database to find out how many publications "Marcotte EM" wrote in 2020.
(c) From the results of part (b), compile a dictionary of all the publication titles and abstracts for "Marcotte EM" in 2020. Print each publication title, followed by that paper's abstract.
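For part (b), the search term can combine PubMed field tags: `[AU]` restricts a term to the author field and `[DP]` to the date of publication. The sketch below only builds the term and the corresponding E-utilities URL with the standard library, and it sends no request. The helper names `build_pubmed_term` and `build_esearch_url` are hypothetical, and whether `[DP]` is the right date field for your purposes is an assumption worth checking against the PubMed documentation.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities esearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_term(author, year):
    """Combine an author tag and a publication-date tag (hypothetical helper)."""
    return f"{author}[AU] AND {year}[DP]"


def build_esearch_url(term, email, retmax=20):
    """Build, but do not send, an esearch request URL (hypothetical helper)."""
    params = {"db": "pubmed", "term": term, "retmax": retmax, "email": email}
    return f"{ESEARCH_URL}?{urlencode(params)}"


term = build_pubmed_term("Marcotte EM", 2020)
url = build_esearch_url(term, "your.email@utexas.edu")
print(term)
print(url)
```

With Biopython, the same term string would be passed as `Entrez.esearch(db="pubmed", term=term)`. After `Entrez.read(handle)`, the `"Count"` entry holds the total number of matches, and `"IdList"` holds at most `retmax` PubMed IDs.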
# problem 1a
from Bio import Entrez, Medline
Entrez.email = "your.email@utexas.edu"
# your code here
# hint--you'll need this code after running `Entrez.efetch`:
#records = Medline.parse(handle)
#record = list(records)[0]
# problem 1b
from Bio import Entrez
Entrez.email = "your.email@utexas.edu"
# your code here
# problem 1c
from Bio import Entrez, Medline
Entrez.email = "your.email@utexas.edu"
# your code here
Problem 2: From the results of Problem 1(b), compile a dictionary that maps each publication title to its author list (AU), source (SO), and abstract (AB) for "Marcotte EM" in 2020. Then, from that dictionary, print each publication title, followed by that paper's author list, source, and abstract.
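One possible shape for such a dictionary is sketched below. It assumes each parsed record behaves like a Python dict keyed by field tag, and it uses made-up placeholder records in place of real PubMed results. Some records have no abstract, so the hypothetical helper `summarize_records` uses `.get()` with a default instead of indexing directly.

```python
def summarize_records(records):
    """Map each title to its AU, SO, and AB fields (hypothetical helper)."""
    summary = {}
    for record in records:
        title = record.get("TI", "(no title)")
        summary[title] = {
            "AU": record.get("AU", []),
            "SO": record.get("SO", ""),
            # Not every record carries an abstract, so supply a default.
            "AB": record.get("AB", "(no abstract available)"),
        }
    return summary


# Made-up placeholder records standing in for parsed Medline records.
placeholder_records = [
    {
        "TI": "Placeholder title one.",
        "AU": ["Doe J", "Roe R"],
        "SO": "Placeholder Journal. 2020.",
        "AB": "Placeholder abstract.",
    },
    {
        "TI": "Placeholder title two.",
        "AU": ["Doe J"],
        "SO": "Placeholder Journal. 2020.",
    },
]

summary = summarize_records(placeholder_records)
for title, fields in summary.items():
    print(title)
    print(", ".join(fields["AU"]))
    print(fields["SO"])
    print(fields["AB"])
    print()
```

Keying by title is convenient for printing, but two records with identical titles would overwrite each other. Keying by PMID would avoid that.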
# problem 2
from Bio import Entrez, Medline
Entrez.email = "your.email@utexas.edu"
# your code here