April 14, 2020
Every time we download data through the Entrez module, we can interact with the results in two ways. First, we can treat the handle we obtain as an ordinary file handle and store or process the raw data provided by Entrez. Second, we can process the data with an existing Biopython module. The latter is generally preferable if an appropriate module exists. However, the number of available choices can be confusing.
As an example, we will again download the GenBank record with the ID "KT220438", which contains an influenza hemagglutinin (HA) gene. We will consider four different ways of looking at the data. First, we use retmode="text" in Entrez.efetch(), download the raw data, and print it. We get a regular GenBank file as output:
from Bio import Entrez, SeqIO
Entrez.email = "wilke@austin.utexas.edu" # put your email here
# Download sequence record for genbank id KT220438 (HA from influenza A)
# Using text mode
handle = Entrez.efetch(db="nucleotide", id="KT220438", rettype="gb", retmode="text")
record = handle.read() # read file directly
print(record)
handle.close()
LOCUS KT220438 1701 bp cRNA linear VRL 20-JUL-2015 DEFINITION Influenza A virus (A/NewJersey/NHRC_93219/2015(H3N2)) segment 4 hemagglutinin (HA) gene, complete cds. ACCESSION KT220438 VERSION KT220438.1 KEYWORDS . SOURCE Influenza A virus (A/New Jersey/NHRC_93219/2015(H3N2)) ORGANISM Influenza A virus (A/New Jersey/NHRC_93219/2015(H3N2)) Viruses; Riboviria; Negarnaviricota; Polyploviricotina; Insthoviricetes; Articulavirales; Orthomyxoviridae; Alphainfluenzavirus. REFERENCE 1 (bases 1 to 1701) AUTHORS Sitz,C.R., Thammavong,H.L., Balansay-Ames,M.S., Hawksworth,A.W., Myers,C.A. and Brice,G.T. TITLE GEISS Influenza Surveillance Response Program JOURNAL Unpublished REFERENCE 2 (bases 1 to 1701) AUTHORS Sitz,C.R., Thammavong,H.L., Balansay-Ames,M.S., Hawksworth,A.W., Myers,C.A. and Brice,G.T. TITLE Direct Submission JOURNAL Submitted (29-JUN-2015) Operational Infectious Diseases, Naval Health Research Center, 140 Sylvester Rd., San Diego, CA 92106, USA COMMENT ##Assembly-Data-START## Sequencing Technology :: Sanger dideoxy sequencing ##Assembly-Data-END## FEATURES Location/Qualifiers source 1..1701 /organism="Influenza A virus (A/New Jersey/NHRC_93219/2015(H3N2))" /mol_type="viral cRNA" /strain="A/NewJersey/NHRC_93219/2015" /serotype="H3N2" /isolation_source="nasopharyngeal swab" /host="Homo sapiens" /db_xref="taxon:1682360" /segment="4" /lab_host="MDCK" /country="USA: New Jersey" /collection_date="17-Jan-2015" gene 1..1701 /gene="HA" CDS 1..1701 /gene="HA" /function="receptor binding and fusion protein" /codon_start=1 /product="hemagglutinin" /protein_id="AKQ43545.1" /translation="MKTIIALSYILCLVFAQKIPGNDNSTATLCLGHHAVPNGTIVKT ITNDRIEVTNATELVQNSSIGEICDSPHQILDGENCTLIDALLGDPQCDGFQNKKWDL FVERSKAYSNCYPYDVPDYASLRSLVASSGTLEFNNESFNWTGVTQNGTSSACIRRSS SSFFSRLNWLTHLNYTYPALNVTMPNNEQFDKLYIWGVHHPGTDKDQIFLYAQSSGRI TVSTKRSQQAVIPNIGSRPRIRDIPSRISIYWTIVKPGDILLINSTGNLIAPRGYFKI RSGKSSIMRSDAPIGKCKSECITPNGSIPNDKPFQNVNRITYGACPRYVKHSTLKLAT 
GMRNVPEKQTRGIFGAIAGFIENGWEGMVDGWYGFRHQNSEGRGQAADLKSTQAAIDQ INGKLNRLIGKTNEKFHQIEKEFSEVEGRIQDLEKYVEDTKIDLWSYNAELLVALENQ HTXDLTDSEMNKLFEKTKKQLRENAEDMGNGCFKIYHKCDNACIGSIRNGTYDHNVYR DEALNNRFQIKGVELKSGYKDWILWISXAISCFLLCVALLGFIMWACQKGNIRCNICI " mat_peptide 49..1035 /gene="HA" /product="HA1" mat_peptide 1036..1698 /gene="HA" /product="HA2" ORIGIN 1 atgaagacta tcattgcttt gagctacatt ctatgtctgg ttttcgctca aaaaattcct 61 ggaaatgaca atagcacggc aacgctgtgc cttgggcacc atgcagtacc aaacggaacg 121 atagtgaaaa caatcacaaa tgaccgaatt gaagttacta atgctactga gctggttcag 181 aattcctcaa taggtgaaat atgcgacagt cctcatcaga tccttgatgg agaaaactgc 241 acactaatag atgctctatt gggagaccct cagtgtgatg gctttcaaaa taagaaatgg 301 gacctttttg ttgaacgaag caaagcctac agcaactgct acccttatga tgtgccggat 361 tatgcctccc ttaggtcact agttgcctca tccggcacac tggagtttaa caatgaaagc 421 ttcaattgga ctggagtcac tcaaaacgga acaagttctg cttgcataag gagatctagt 481 agtagtttct ttagtagatt aaattggttg acccacttaa actacacata cccagcattg 541 aacgtgacta tgccaaacaa tgaacaattt gacaaattgt acatttgggg ggttcaccac 601 ccgggtacgg acaaggacca aatcttcctg tatgctcaat catcaggaag aatcacagta 661 tctaccaaaa gaagccaaca agctgtaatc ccaaatatcg gatctagacc cagaataagg 721 gatatcccta gcagaataag catctattgg acaatagtaa aaccgggaga catacttttg 781 attaacagca cagggaatct aattgctcct aggggttact tcaaaatacg aagtgggaaa 841 agctcaataa tgagatcaga tgcacccatt ggcaaatgca agtctgaatg catcactcca 901 aatggaagca ttcccaatga caaaccattc caaaatgtaa acaggatcac atacggggcc 961 tgtcccagat atgttaagca tagcactcta aaattggcaa caggaatgcg aaatgtacca 1021 gagaaacaaa ctagaggcat atttggcgca atagcgggtt tcatagaaaa tggttgggag 1081 ggaatggtgg atggttggta cggtttcagg catcaaaatt ctgagggaag aggacaagca 1141 gcagatctca aaagcactca agcagcaatc gatcaaatca atgggaagct gaatcgattg 1201 atcgggaaaa ccaacgagaa attccatcag attgaaaaag aattctcaga agtagaagga 1261 agaattcagg accttgagaa atatgttgag gacactaaaa tagatctctg gtcatacaac 1321 gcggagcttc ttgttgccct ggagaaccaa catacarttg atctaactga ctcagaaatg 1381 aacaaactgt ttgaaaaaac aaagaagcaa 
ctgagggaaa atgctgagga tatgggaaat 1441 ggttgtttca aaatatacca caaatgtgac aatgcctgca taggatcaat aagaaatgga 1501 acttatgacc acaatgtgta cagggatgaa gcattaaaca accggttcca gatcaaggga 1561 gttgagctga agtcagggta caaagattgg atcctatgga tttcctytgc catatcatgt 1621 tttttgcttt gtgttgcttt gttggggttc atcatgtggg cctgccaaaa gggcaacatt 1681 aggtgcaaca tttgcatttg a //
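If we do choose to process the raw text ourselves, we need to know the layout of the GenBank format. As a minimal sketch, the snippet below pulls a few fields out of the LOCUS line by splitting on whitespace. The `locus_line` string is copied from the record above, and `parse_locus_line` is a hypothetical helper written for this illustration, not part of Biopython. It assumes the fields appear in exactly this order, which is one reason an existing parser is usually the better choice.

```python
# Hypothetical helper for illustration only; not part of Biopython.
def parse_locus_line(line):
    """Split a GenBank LOCUS line into a few named fields.

    Assumes the whitespace-separated layout seen in the record above:
    LOCUS <name> <length> bp <molecule> <topology> <division> <date>
    """
    parts = line.split()
    if len(parts) != 8 or parts[0] != "LOCUS" or parts[3] != "bp":
        raise ValueError("unexpected LOCUS line layout: " + line)
    return {
        "name": parts[1],
        "length": int(parts[2]),  # sequence length in bases
        "molecule": parts[4],
        "topology": parts[5],
        "division": parts[6],
        "date": parts[7],
    }

# LOCUS line taken from the raw GenBank output shown above
locus_line = "LOCUS KT220438 1701 bp cRNA linear VRL 20-JUL-2015"
locus = parse_locus_line(locus_line)
print(locus["name"], locus["length"], locus["date"])
```

Anything beyond such simple fields (features, qualifiers, the sequence itself) quickly becomes tedious to extract by hand.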
We can also, as we have done before, process this file using the SeqIO module:
# Download sequence record for genbank id KT220438 (HA from influenza A)
# Using text mode
handle = Entrez.efetch(db="nucleotide", id="KT220438", rettype="gb", retmode="text")
record = SeqIO.read(handle, "genbank") # parse with SeqIO
print(record)
handle.close()
ID: KT220438.1
Name: KT220438
Description: Influenza A virus (A/NewJersey/NHRC_93219/2015(H3N2)) segment 4 hemagglutinin (HA) gene, complete cds
Number of features: 5
/molecule_type=cRNA
/topology=linear
/data_file_division=VRL
/date=20-JUL-2015
/accessions=['KT220438']
/sequence_version=1
/keywords=['']
/source=Influenza A virus (A/New Jersey/NHRC_93219/2015(H3N2))
/organism=Influenza A virus (A/New Jersey/NHRC_93219/2015(H3N2))
/taxonomy=['Viruses', 'Riboviria', 'Negarnaviricota', 'Polyploviricotina', 'Insthoviricetes', 'Articulavirales', 'Orthomyxoviridae', 'Alphainfluenzavirus']
/references=[Reference(title='GEISS Influenza Surveillance Response Program', ...), Reference(title='Direct Submission', ...)]
/structured_comment=OrderedDict([('Assembly-Data', OrderedDict([('Sequencing Technology', 'Sanger dideoxy sequencing')]))])
Seq('ATGAAGACTATCATTGCTTTGAGCTACATTCTATGTCTGGTTTTCGCTCAAAAA...TGA', IUPACAmbiguousDNA())
In addition to text mode, we can also download the data in XML mode. XML is a structured data format that allows for easy machine-processing of complex data files. If we just print the raw data, though, it doesn't look appealing:
# Download sequence record for genbank id KT220438 (HA from influenza A)
# Using XML mode
handle = Entrez.efetch(db="nucleotide", id="KT220438", rettype="gb", retmode="xml")
record = handle.read() # read file directly
print(record)
handle.close()
<?xml version="1.0" encoding="UTF-8" ?> <!DOCTYPE GBSet PUBLIC "-//NCBI//NCBI GBSeq/EN" "https://www.ncbi.nlm.nih.gov/dtd/NCBI_GBSeq.dtd"> <GBSet> <GBSeq> <GBSeq_locus>KT220438</GBSeq_locus> <GBSeq_length>1701</GBSeq_length> <GBSeq_strandedness>single</GBSeq_strandedness> <GBSeq_moltype>cRNA</GBSeq_moltype> <GBSeq_topology>linear</GBSeq_topology> <GBSeq_division>VRL</GBSeq_division> <GBSeq_update-date>20-JUL-2015</GBSeq_update-date> <GBSeq_create-date>20-JUL-2015</GBSeq_create-date> <GBSeq_definition>Influenza A virus (A/NewJersey/NHRC_93219/2015(H3N2)) segment 4 hemagglutinin (HA) gene, complete cds</GBSeq_definition> <GBSeq_primary-accession>KT220438</GBSeq_primary-accession> <GBSeq_accession-version>KT220438.1</GBSeq_accession-version> <GBSeq_other-seqids> <GBSeqid>gb|KT220438.1|</GBSeqid> <GBSeqid>gi|887493048</GBSeqid> </GBSeq_other-seqids> <GBSeq_source>Influenza A virus (A/New Jersey/NHRC_93219/2015(H3N2))</GBSeq_source> <GBSeq_organism>Influenza A virus (A/New Jersey/NHRC_93219/2015(H3N2))</GBSeq_organism> <GBSeq_taxonomy>Viruses; Riboviria; Negarnaviricota; Polyploviricotina; Insthoviricetes; Articulavirales; Orthomyxoviridae; Alphainfluenzavirus</GBSeq_taxonomy> <GBSeq_references> <GBReference> <GBReference_reference>1</GBReference_reference> <GBReference_position>1..1701</GBReference_position> <GBReference_authors> <GBAuthor>Sitz,C.R.</GBAuthor> <GBAuthor>Thammavong,H.L.</GBAuthor> <GBAuthor>Balansay-Ames,M.S.</GBAuthor> <GBAuthor>Hawksworth,A.W.</GBAuthor> <GBAuthor>Myers,C.A.</GBAuthor> <GBAuthor>Brice,G.T.</GBAuthor> </GBReference_authors> <GBReference_title>GEISS Influenza Surveillance Response Program</GBReference_title> <GBReference_journal>Unpublished</GBReference_journal> </GBReference> <GBReference> <GBReference_reference>2</GBReference_reference> <GBReference_position>1..1701</GBReference_position> <GBReference_authors> <GBAuthor>Sitz,C.R.</GBAuthor> <GBAuthor>Thammavong,H.L.</GBAuthor> <GBAuthor>Balansay-Ames,M.S.</GBAuthor> 
<GBAuthor>Hawksworth,A.W.</GBAuthor> <GBAuthor>Myers,C.A.</GBAuthor> <GBAuthor>Brice,G.T.</GBAuthor> </GBReference_authors> <GBReference_title>Direct Submission</GBReference_title> <GBReference_journal>Submitted (29-JUN-2015) Operational Infectious Diseases, Naval Health Research Center, 140 Sylvester Rd., San Diego, CA 92106, USA</GBReference_journal> </GBReference> </GBSeq_references> <GBSeq_comment>##Assembly-Data-START## ; Sequencing Technology :: Sanger dideoxy sequencing ; ##Assembly-Data-END##</GBSeq_comment> <GBSeq_feature-table> <GBFeature> <GBFeature_key>source</GBFeature_key> <GBFeature_location>1..1701</GBFeature_location> <GBFeature_intervals> <GBInterval> <GBInterval_from>1</GBInterval_from> <GBInterval_to>1701</GBInterval_to> <GBInterval_accession>KT220438.1</GBInterval_accession> </GBInterval> </GBFeature_intervals> <GBFeature_quals> <GBQualifier> <GBQualifier_name>organism</GBQualifier_name> <GBQualifier_value>Influenza A virus (A/New Jersey/NHRC_93219/2015(H3N2))</GBQualifier_value> </GBQualifier> <GBQualifier> <GBQualifier_name>mol_type</GBQualifier_name> <GBQualifier_value>viral cRNA</GBQualifier_value> </GBQualifier> <GBQualifier> <GBQualifier_name>strain</GBQualifier_name> <GBQualifier_value>A/NewJersey/NHRC_93219/2015</GBQualifier_value> </GBQualifier> <GBQualifier> <GBQualifier_name>serotype</GBQualifier_name> <GBQualifier_value>H3N2</GBQualifier_value> </GBQualifier> <GBQualifier> <GBQualifier_name>isolation_source</GBQualifier_name> <GBQualifier_value>nasopharyngeal swab</GBQualifier_value> </GBQualifier> <GBQualifier> <GBQualifier_name>host</GBQualifier_name> <GBQualifier_value>Homo sapiens</GBQualifier_value> </GBQualifier> <GBQualifier> <GBQualifier_name>db_xref</GBQualifier_name> <GBQualifier_value>taxon:1682360</GBQualifier_value> </GBQualifier> <GBQualifier> <GBQualifier_name>segment</GBQualifier_name> <GBQualifier_value>4</GBQualifier_value> </GBQualifier> <GBQualifier> <GBQualifier_name>lab_host</GBQualifier_name> 
<GBQualifier_value>MDCK</GBQualifier_value> </GBQualifier> <GBQualifier> <GBQualifier_name>country</GBQualifier_name> <GBQualifier_value>USA: New Jersey</GBQualifier_value> </GBQualifier> <GBQualifier> <GBQualifier_name>collection_date</GBQualifier_name> <GBQualifier_value>17-Jan-2015</GBQualifier_value> </GBQualifier> </GBFeature_quals> </GBFeature> <GBFeature> <GBFeature_key>gene</GBFeature_key> <GBFeature_location>1..1701</GBFeature_location> <GBFeature_intervals> <GBInterval> <GBInterval_from>1</GBInterval_from> <GBInterval_to>1701</GBInterval_to> <GBInterval_accession>KT220438.1</GBInterval_accession> </GBInterval> </GBFeature_intervals> <GBFeature_quals> <GBQualifier> <GBQualifier_name>gene</GBQualifier_name> <GBQualifier_value>HA</GBQualifier_value> </GBQualifier> </GBFeature_quals> </GBFeature> <GBFeature> <GBFeature_key>CDS</GBFeature_key> <GBFeature_location>1..1701</GBFeature_location> <GBFeature_intervals> <GBInterval> <GBInterval_from>1</GBInterval_from> <GBInterval_to>1701</GBInterval_to> <GBInterval_accession>KT220438.1</GBInterval_accession> </GBInterval> </GBFeature_intervals> <GBFeature_quals> <GBQualifier> <GBQualifier_name>gene</GBQualifier_name> <GBQualifier_value>HA</GBQualifier_value> </GBQualifier> <GBQualifier> <GBQualifier_name>function</GBQualifier_name> <GBQualifier_value>receptor binding and fusion protein</GBQualifier_value> </GBQualifier> <GBQualifier> <GBQualifier_name>codon_start</GBQualifier_name> <GBQualifier_value>1</GBQualifier_value> </GBQualifier> <GBQualifier> <GBQualifier_name>transl_table</GBQualifier_name> <GBQualifier_value>1</GBQualifier_value> </GBQualifier> <GBQualifier> <GBQualifier_name>product</GBQualifier_name> <GBQualifier_value>hemagglutinin</GBQualifier_value> </GBQualifier> <GBQualifier> <GBQualifier_name>protein_id</GBQualifier_name> <GBQualifier_value>AKQ43545.1</GBQualifier_value> </GBQualifier> <GBQualifier> <GBQualifier_name>translation</GBQualifier_name> 
<GBQualifier_value>MKTIIALSYILCLVFAQKIPGNDNSTATLCLGHHAVPNGTIVKTITNDRIEVTNATELVQNSSIGEICDSPHQILDGENCTLIDALLGDPQCDGFQNKKWDLFVERSKAYSNCYPYDVPDYASLRSLVASSGTLEFNNESFNWTGVTQNGTSSACIRRSSSSFFSRLNWLTHLNYTYPALNVTMPNNEQFDKLYIWGVHHPGTDKDQIFLYAQSSGRITVSTKRSQQAVIPNIGSRPRIRDIPSRISIYWTIVKPGDILLINSTGNLIAPRGYFKIRSGKSSIMRSDAPIGKCKSECITPNGSIPNDKPFQNVNRITYGACPRYVKHSTLKLATGMRNVPEKQTRGIFGAIAGFIENGWEGMVDGWYGFRHQNSEGRGQAADLKSTQAAIDQINGKLNRLIGKTNEKFHQIEKEFSEVEGRIQDLEKYVEDTKIDLWSYNAELLVALENQHTXDLTDSEMNKLFEKTKKQLRENAEDMGNGCFKIYHKCDNACIGSIRNGTYDHNVYRDEALNNRFQIKGVELKSGYKDWILWISXAISCFLLCVALLGFIMWACQKGNIRCNICI</GBQualifier_value> </GBQualifier> </GBFeature_quals> </GBFeature> <GBFeature> <GBFeature_key>mat_peptide</GBFeature_key> <GBFeature_location>49..1035</GBFeature_location> <GBFeature_intervals> <GBInterval> <GBInterval_from>49</GBInterval_from> <GBInterval_to>1035</GBInterval_to> <GBInterval_accession>KT220438.1</GBInterval_accession> </GBInterval> </GBFeature_intervals> <GBFeature_quals> <GBQualifier> <GBQualifier_name>gene</GBQualifier_name> <GBQualifier_value>HA</GBQualifier_value> </GBQualifier> <GBQualifier> <GBQualifier_name>product</GBQualifier_name> <GBQualifier_value>HA1</GBQualifier_value> </GBQualifier> <GBQualifier> <GBQualifier_name>peptide</GBQualifier_name> <GBQualifier_value>QKIPGNDNSTATLCLGHHAVPNGTIVKTITNDRIEVTNATELVQNSSIGEICDSPHQILDGENCTLIDALLGDPQCDGFQNKKWDLFVERSKAYSNCYPYDVPDYASLRSLVASSGTLEFNNESFNWTGVTQNGTSSACIRRSSSSFFSRLNWLTHLNYTYPALNVTMPNNEQFDKLYIWGVHHPGTDKDQIFLYAQSSGRITVSTKRSQQAVIPNIGSRPRIRDIPSRISIYWTIVKPGDILLINSTGNLIAPRGYFKIRSGKSSIMRSDAPIGKCKSECITPNGSIPNDKPFQNVNRITYGACPRYVKHSTLKLATGMRNVPEKQTR</GBQualifier_value> </GBQualifier> </GBFeature_quals> </GBFeature> <GBFeature> <GBFeature_key>mat_peptide</GBFeature_key> <GBFeature_location>1036..1698</GBFeature_location> <GBFeature_intervals> <GBInterval> <GBInterval_from>1036</GBInterval_from> <GBInterval_to>1698</GBInterval_to> <GBInterval_accession>KT220438.1</GBInterval_accession> </GBInterval> </GBFeature_intervals> 
<GBFeature_quals> <GBQualifier> <GBQualifier_name>gene</GBQualifier_name> <GBQualifier_value>HA</GBQualifier_value> </GBQualifier> <GBQualifier> <GBQualifier_name>product</GBQualifier_name> <GBQualifier_value>HA2</GBQualifier_value> </GBQualifier> <GBQualifier> <GBQualifier_name>peptide</GBQualifier_name> <GBQualifier_value>GIFGAIAGFIENGWEGMVDGWYGFRHQNSEGRGQAADLKSTQAAIDQINGKLNRLIGKTNEKFHQIEKEFSEVEGRIQDLEKYVEDTKIDLWSYNAELLVALENQHTXDLTDSEMNKLFEKTKKQLRENAEDMGNGCFKIYHKCDNACIGSIRNGTYDHNVYRDEALNNRFQIKGVELKSGYKDWILWISXAISCFLLCVALLGFIMWACQKGNIRCNICI</GBQualifier_value> </GBQualifier> </GBFeature_quals> </GBFeature> </GBSeq_feature-table> <GBSeq_sequence>atgaagactatcattgctttgagctacattctatgtctggttttcgctcaaaaaattcctggaaatgacaatagcacggcaacgctgtgccttgggcaccatgcagtaccaaacggaacgatagtgaaaacaatcacaaatgaccgaattgaagttactaatgctactgagctggttcagaattcctcaataggtgaaatatgcgacagtcctcatcagatccttgatggagaaaactgcacactaatagatgctctattgggagaccctcagtgtgatggctttcaaaataagaaatgggacctttttgttgaacgaagcaaagcctacagcaactgctacccttatgatgtgccggattatgcctcccttaggtcactagttgcctcatccggcacactggagtttaacaatgaaagcttcaattggactggagtcactcaaaacggaacaagttctgcttgcataaggagatctagtagtagtttctttagtagattaaattggttgacccacttaaactacacatacccagcattgaacgtgactatgccaaacaatgaacaatttgacaaattgtacatttggggggttcaccacccgggtacggacaaggaccaaatcttcctgtatgctcaatcatcaggaagaatcacagtatctaccaaaagaagccaacaagctgtaatcccaaatatcggatctagacccagaataagggatatccctagcagaataagcatctattggacaatagtaaaaccgggagacatacttttgattaacagcacagggaatctaattgctcctaggggttacttcaaaatacgaagtgggaaaagctcaataatgagatcagatgcacccattggcaaatgcaagtctgaatgcatcactccaaatggaagcattcccaatgacaaaccattccaaaatgtaaacaggatcacatacggggcctgtcccagatatgttaagcatagcactctaaaattggcaacaggaatgcgaaatgtaccagagaaacaaactagaggcatatttggcgcaatagcgggtttcatagaaaatggttgggagggaatggtggatggttggtacggtttcaggcatcaaaattctgagggaagaggacaagcagcagatctcaaaagcactcaagcagcaatcgatcaaatcaatgggaagctgaatcgattgatcgggaaaaccaacgagaaattccatcagattgaaaaagaattctcagaagtagaaggaagaattcaggaccttgagaaatatgttgaggacactaaaatagatctctggtcatacaacgcggagcttcttgttgccctggagaa
ccaacatacarttgatctaactgactcagaaatgaacaaactgtttgaaaaaacaaagaagcaactgagggaaaatgctgaggatatgggaaatggttgtttcaaaatataccacaaatgtgacaatgcctgcataggatcaataagaaatggaacttatgaccacaatgtgtacagggatgaagcattaaacaaccggttccagatcaagggagttgagctgaagtcagggtacaaagattggatcctatggatttcctytgccatatcatgttttttgctttgtgttgctttgttggggttcatcatgtgggcctgccaaaagggcaacattaggtgcaacatttgcatttga</GBSeq_sequence> </GBSeq> </GBSet>
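Because the XML is structured, even a general-purpose XML parser can pull individual fields out of it. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module on a heavily trimmed copy of the record above. It is meant only to illustrate what "machine-processing" looks like; it is not how Biopython handles these files.

```python
import xml.etree.ElementTree as ET

# A heavily trimmed copy of the XML shown above (most elements removed).
xml_text = """<GBSet>
  <GBSeq>
    <GBSeq_locus>KT220438</GBSeq_locus>
    <GBSeq_length>1701</GBSeq_length>
    <GBSeq_moltype>cRNA</GBSeq_moltype>
    <GBSeq_feature-table>
      <GBFeature>
        <GBFeature_key>gene</GBFeature_key>
        <GBFeature_location>1..1701</GBFeature_location>
      </GBFeature>
      <GBFeature>
        <GBFeature_key>mat_peptide</GBFeature_key>
        <GBFeature_location>49..1035</GBFeature_location>
      </GBFeature>
    </GBSeq_feature-table>
  </GBSeq>
</GBSet>"""

root = ET.fromstring(xml_text)  # root is the <GBSet> element
gbseq = root.find("GBSeq")      # the single sequence record in the set
locus = gbseq.findtext("GBSeq_locus")
length = int(gbseq.findtext("GBSeq_length"))

# Collect (key, location) pairs for every feature in the feature table
feature_locations = [
    (f.findtext("GBFeature_key"), f.findtext("GBFeature_location"))
    for f in gbseq.iter("GBFeature")
]
print(locus, length, feature_locations)
```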
The advantage of XML mode is that the generic Entrez.parse() function can parse XML files returned from Entrez.efetch(). Also, some modules in Biopython cannot work with files obtained in text mode; they can only work with files obtained in XML mode. The documentation will generally tell you for each module what kind of input it expects.
Reading the XML record for KT220438 with Entrez.parse() gives us the following:
# Download sequence record for genbank id KT220438 (HA from influenza A)
handle = Entrez.efetch(db="nucleotide", id="KT220438", rettype="gb", retmode="xml")
parsed = Entrez.parse(handle)
record = list(parsed)[0] # We need to convert the parsed contents into a list. Here we want just the 0th element.
handle.close()
print(record)
DictElement({'GBSeq_locus': 'KT220438', 'GBSeq_length': '1701', 'GBSeq_strandedness': 'single', 'GBSeq_moltype': 'cRNA', 'GBSeq_topology': 'linear', 'GBSeq_division': 'VRL', 'GBSeq_update-date': '20-JUL-2015', 'GBSeq_create-date': '20-JUL-2015', 'GBSeq_definition': 'Influenza A virus (A/NewJersey/NHRC_93219/2015(H3N2)) segment 4 hemagglutinin (HA) gene, complete cds', 'GBSeq_primary-accession': 'KT220438', 'GBSeq_accession-version': 'KT220438.1', 'GBSeq_other-seqids': ['gb|KT220438.1|', 'gi|887493048'], 'GBSeq_source': 'Influenza A virus (A/New Jersey/NHRC_93219/2015(H3N2))', 'GBSeq_organism': 'Influenza A virus (A/New Jersey/NHRC_93219/2015(H3N2))', 'GBSeq_taxonomy': 'Viruses; Riboviria; Negarnaviricota; Polyploviricotina; Insthoviricetes; Articulavirales; Orthomyxoviridae; Alphainfluenzavirus', 'GBSeq_references': [DictElement({'GBReference_reference': '1', 'GBReference_position': '1..1701', 'GBReference_authors': ['Sitz,C.R.', 'Thammavong,H.L.', 'Balansay-Ames,M.S.', 'Hawksworth,A.W.', 'Myers,C.A.', 'Brice,G.T.'], 'GBReference_title': 'GEISS Influenza Surveillance Response Program', 'GBReference_journal': 'Unpublished'}, attributes={}), DictElement({'GBReference_reference': '2', 'GBReference_position': '1..1701', 'GBReference_authors': ['Sitz,C.R.', 'Thammavong,H.L.', 'Balansay-Ames,M.S.', 'Hawksworth,A.W.', 'Myers,C.A.', 'Brice,G.T.'], 'GBReference_title': 'Direct Submission', 'GBReference_journal': 'Submitted (29-JUN-2015) Operational Infectious Diseases, Naval Health Research Center, 140 Sylvester Rd., San Diego, CA 92106, USA'}, attributes={})], 'GBSeq_comment': '##Assembly-Data-START## ; Sequencing Technology :: Sanger dideoxy sequencing ; ##Assembly-Data-END##', 'GBSeq_feature-table': [DictElement({'GBFeature_key': 'source', 'GBFeature_location': '1..1701', 'GBFeature_intervals': [DictElement({'GBInterval_from': '1', 'GBInterval_to': '1701', 'GBInterval_accession': 'KT220438.1'}, attributes={})], 'GBFeature_quals': [DictElement({'GBQualifier_name': 
'organism', 'GBQualifier_value': 'Influenza A virus (A/New Jersey/NHRC_93219/2015(H3N2))'}, attributes={}), DictElement({'GBQualifier_name': 'mol_type', 'GBQualifier_value': 'viral cRNA'}, attributes={}), DictElement({'GBQualifier_name': 'strain', 'GBQualifier_value': 'A/NewJersey/NHRC_93219/2015'}, attributes={}), DictElement({'GBQualifier_name': 'serotype', 'GBQualifier_value': 'H3N2'}, attributes={}), DictElement({'GBQualifier_name': 'isolation_source', 'GBQualifier_value': 'nasopharyngeal swab'}, attributes={}), DictElement({'GBQualifier_name': 'host', 'GBQualifier_value': 'Homo sapiens'}, attributes={}), DictElement({'GBQualifier_name': 'db_xref', 'GBQualifier_value': 'taxon:1682360'}, attributes={}), DictElement({'GBQualifier_name': 'segment', 'GBQualifier_value': '4'}, attributes={}), DictElement({'GBQualifier_name': 'lab_host', 'GBQualifier_value': 'MDCK'}, attributes={}), DictElement({'GBQualifier_name': 'country', 'GBQualifier_value': 'USA: New Jersey'}, attributes={}), DictElement({'GBQualifier_name': 'collection_date', 'GBQualifier_value': '17-Jan-2015'}, attributes={})]}, attributes={}), DictElement({'GBFeature_key': 'gene', 'GBFeature_location': '1..1701', 'GBFeature_intervals': [DictElement({'GBInterval_from': '1', 'GBInterval_to': '1701', 'GBInterval_accession': 'KT220438.1'}, attributes={})], 'GBFeature_quals': [DictElement({'GBQualifier_name': 'gene', 'GBQualifier_value': 'HA'}, attributes={})]}, attributes={}), DictElement({'GBFeature_key': 'CDS', 'GBFeature_location': '1..1701', 'GBFeature_intervals': [DictElement({'GBInterval_from': '1', 'GBInterval_to': '1701', 'GBInterval_accession': 'KT220438.1'}, attributes={})], 'GBFeature_quals': [DictElement({'GBQualifier_name': 'gene', 'GBQualifier_value': 'HA'}, attributes={}), DictElement({'GBQualifier_name': 'function', 'GBQualifier_value': 'receptor binding and fusion protein'}, attributes={}), DictElement({'GBQualifier_name': 'codon_start', 'GBQualifier_value': '1'}, attributes={}), 
DictElement({'GBQualifier_name': 'transl_table', 'GBQualifier_value': '1'}, attributes={}), DictElement({'GBQualifier_name': 'product', 'GBQualifier_value': 'hemagglutinin'}, attributes={}), DictElement({'GBQualifier_name': 'protein_id', 'GBQualifier_value': 'AKQ43545.1'}, attributes={}), DictElement({'GBQualifier_name': 'translation', 'GBQualifier_value': 'MKTIIALSYILCLVFAQKIPGNDNSTATLCLGHHAVPNGTIVKTITNDRIEVTNATELVQNSSIGEICDSPHQILDGENCTLIDALLGDPQCDGFQNKKWDLFVERSKAYSNCYPYDVPDYASLRSLVASSGTLEFNNESFNWTGVTQNGTSSACIRRSSSSFFSRLNWLTHLNYTYPALNVTMPNNEQFDKLYIWGVHHPGTDKDQIFLYAQSSGRITVSTKRSQQAVIPNIGSRPRIRDIPSRISIYWTIVKPGDILLINSTGNLIAPRGYFKIRSGKSSIMRSDAPIGKCKSECITPNGSIPNDKPFQNVNRITYGACPRYVKHSTLKLATGMRNVPEKQTRGIFGAIAGFIENGWEGMVDGWYGFRHQNSEGRGQAADLKSTQAAIDQINGKLNRLIGKTNEKFHQIEKEFSEVEGRIQDLEKYVEDTKIDLWSYNAELLVALENQHTXDLTDSEMNKLFEKTKKQLRENAEDMGNGCFKIYHKCDNACIGSIRNGTYDHNVYRDEALNNRFQIKGVELKSGYKDWILWISXAISCFLLCVALLGFIMWACQKGNIRCNICI'}, attributes={})]}, attributes={}), DictElement({'GBFeature_key': 'mat_peptide', 'GBFeature_location': '49..1035', 'GBFeature_intervals': [DictElement({'GBInterval_from': '49', 'GBInterval_to': '1035', 'GBInterval_accession': 'KT220438.1'}, attributes={})], 'GBFeature_quals': [DictElement({'GBQualifier_name': 'gene', 'GBQualifier_value': 'HA'}, attributes={}), DictElement({'GBQualifier_name': 'product', 'GBQualifier_value': 'HA1'}, attributes={}), DictElement({'GBQualifier_name': 'peptide', 'GBQualifier_value': 'QKIPGNDNSTATLCLGHHAVPNGTIVKTITNDRIEVTNATELVQNSSIGEICDSPHQILDGENCTLIDALLGDPQCDGFQNKKWDLFVERSKAYSNCYPYDVPDYASLRSLVASSGTLEFNNESFNWTGVTQNGTSSACIRRSSSSFFSRLNWLTHLNYTYPALNVTMPNNEQFDKLYIWGVHHPGTDKDQIFLYAQSSGRITVSTKRSQQAVIPNIGSRPRIRDIPSRISIYWTIVKPGDILLINSTGNLIAPRGYFKIRSGKSSIMRSDAPIGKCKSECITPNGSIPNDKPFQNVNRITYGACPRYVKHSTLKLATGMRNVPEKQTR'}, attributes={})]}, attributes={}), DictElement({'GBFeature_key': 'mat_peptide', 'GBFeature_location': '1036..1698', 'GBFeature_intervals': [DictElement({'GBInterval_from': '1036', 'GBInterval_to': '1698', 
'GBInterval_accession': 'KT220438.1'}, attributes={})], 'GBFeature_quals': [DictElement({'GBQualifier_name': 'gene', 'GBQualifier_value': 'HA'}, attributes={}), DictElement({'GBQualifier_name': 'product', 'GBQualifier_value': 'HA2'}, attributes={}), DictElement({'GBQualifier_name': 'peptide', 'GBQualifier_value': 'GIFGAIAGFIENGWEGMVDGWYGFRHQNSEGRGQAADLKSTQAAIDQINGKLNRLIGKTNEKFHQIEKEFSEVEGRIQDLEKYVEDTKIDLWSYNAELLVALENQHTXDLTDSEMNKLFEKTKKQLRENAEDMGNGCFKIYHKCDNACIGSIRNGTYDHNVYRDEALNNRFQIKGVELKSGYKDWILWISXAISCFLLCVALLGFIMWACQKGNIRCNICI'}, attributes={})]}, attributes={})], 'GBSeq_sequence': 'atgaagactatcattgctttgagctacattctatgtctggttttcgctcaaaaaattcctggaaatgacaatagcacggcaacgctgtgccttgggcaccatgcagtaccaaacggaacgatagtgaaaacaatcacaaatgaccgaattgaagttactaatgctactgagctggttcagaattcctcaataggtgaaatatgcgacagtcctcatcagatccttgatggagaaaactgcacactaatagatgctctattgggagaccctcagtgtgatggctttcaaaataagaaatgggacctttttgttgaacgaagcaaagcctacagcaactgctacccttatgatgtgccggattatgcctcccttaggtcactagttgcctcatccggcacactggagtttaacaatgaaagcttcaattggactggagtcactcaaaacggaacaagttctgcttgcataaggagatctagtagtagtttctttagtagattaaattggttgacccacttaaactacacatacccagcattgaacgtgactatgccaaacaatgaacaatttgacaaattgtacatttggggggttcaccacccgggtacggacaaggaccaaatcttcctgtatgctcaatcatcaggaagaatcacagtatctaccaaaagaagccaacaagctgtaatcccaaatatcggatctagacccagaataagggatatccctagcagaataagcatctattggacaatagtaaaaccgggagacatacttttgattaacagcacagggaatctaattgctcctaggggttacttcaaaatacgaagtgggaaaagctcaataatgagatcagatgcacccattggcaaatgcaagtctgaatgcatcactccaaatggaagcattcccaatgacaaaccattccaaaatgtaaacaggatcacatacggggcctgtcccagatatgttaagcatagcactctaaaattggcaacaggaatgcgaaatgtaccagagaaacaaactagaggcatatttggcgcaatagcgggtttcatagaaaatggttgggagggaatggtggatggttggtacggtttcaggcatcaaaattctgagggaagaggacaagcagcagatctcaaaagcactcaagcagcaatcgatcaaatcaatgggaagctgaatcgattgatcgggaaaaccaacgagaaattccatcagattgaaaaagaattctcagaagtagaaggaagaattcaggaccttgagaaatatgttgaggacactaaaatagatctctggtcatacaacgcggagcttcttgttgccctggagaaccaacatacarttgatctaactgactcagaaatgaacaaactgtttgaaaaaacaaaga
agcaactgagggaaaatgctgaggatatgggaaatggttgtttcaaaatataccacaaatgtgacaatgcctgcataggatcaataagaaatggaacttatgaccacaatgtgtacagggatgaagcattaaacaaccggttccagatcaagggagttgagctgaagtcagggtacaaagattggatcctatggatttcctytgccatatcatgttttttgctttgtgttgctttgttggggttcatcatgtgggcctgccaaaagggcaacattaggtgcaacatttgcatttga'}, attributes={})
While this output may not look useful, we now have a set of nested dictionaries and lists that we can interrogate using standard Python techniques:
print(list(record.keys())) # print out all the keys in the dictionary
['GBSeq_locus', 'GBSeq_length', 'GBSeq_strandedness', 'GBSeq_moltype', 'GBSeq_topology', 'GBSeq_division', 'GBSeq_update-date', 'GBSeq_create-date', 'GBSeq_definition', 'GBSeq_primary-accession', 'GBSeq_accession-version', 'GBSeq_other-seqids', 'GBSeq_source', 'GBSeq_organism', 'GBSeq_taxonomy', 'GBSeq_references', 'GBSeq_comment', 'GBSeq_feature-table', 'GBSeq_sequence']
print(record['GBSeq_sequence']) # print out the sequence
atgaagactatcattgctttgagctacattctatgtctggttttcgctcaaaaaattcctggaaatgacaatagcacggcaacgctgtgccttgggcaccatgcagtaccaaacggaacgatagtgaaaacaatcacaaatgaccgaattgaagttactaatgctactgagctggttcagaattcctcaataggtgaaatatgcgacagtcctcatcagatccttgatggagaaaactgcacactaatagatgctctattgggagaccctcagtgtgatggctttcaaaataagaaatgggacctttttgttgaacgaagcaaagcctacagcaactgctacccttatgatgtgccggattatgcctcccttaggtcactagttgcctcatccggcacactggagtttaacaatgaaagcttcaattggactggagtcactcaaaacggaacaagttctgcttgcataaggagatctagtagtagtttctttagtagattaaattggttgacccacttaaactacacatacccagcattgaacgtgactatgccaaacaatgaacaatttgacaaattgtacatttggggggttcaccacccgggtacggacaaggaccaaatcttcctgtatgctcaatcatcaggaagaatcacagtatctaccaaaagaagccaacaagctgtaatcccaaatatcggatctagacccagaataagggatatccctagcagaataagcatctattggacaatagtaaaaccgggagacatacttttgattaacagcacagggaatctaattgctcctaggggttacttcaaaatacgaagtgggaaaagctcaataatgagatcagatgcacccattggcaaatgcaagtctgaatgcatcactccaaatggaagcattcccaatgacaaaccattccaaaatgtaaacaggatcacatacggggcctgtcccagatatgttaagcatagcactctaaaattggcaacaggaatgcgaaatgtaccagagaaacaaactagaggcatatttggcgcaatagcgggtttcatagaaaatggttgggagggaatggtggatggttggtacggtttcaggcatcaaaattctgagggaagaggacaagcagcagatctcaaaagcactcaagcagcaatcgatcaaatcaatgggaagctgaatcgattgatcgggaaaaccaacgagaaattccatcagattgaaaaagaattctcagaagtagaaggaagaattcaggaccttgagaaatatgttgaggacactaaaatagatctctggtcatacaacgcggagcttcttgttgccctggagaaccaacatacarttgatctaactgactcagaaatgaacaaactgtttgaaaaaacaaagaagcaactgagggaaaatgctgaggatatgggaaatggttgtttcaaaatataccacaaatgtgacaatgcctgcataggatcaataagaaatggaacttatgaccacaatgtgtacagggatgaagcattaaacaaccggttccagatcaagggagttgagctgaagtcagggtacaaagattggatcctatggatttcctytgccatatcatgttttttgctttgtgttgctttgttggggttcatcatgtgggcctgccaaaagggcaacattaggtgcaacatttgcatttga
features = record['GBSeq_feature-table'] # extract all the features
for feature in features: # loop over features and print feature key and feature location
    print(feature['GBFeature_key'] + ": " + feature['GBFeature_location'])
source: 1..1701
gene: 1..1701
CDS: 1..1701
mat_peptide: 49..1035
mat_peptide: 1036..1698
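The qualifiers of each feature are stored as a list of name/value pairs, which is awkward to query directly. As a sketch, we can flatten that list into an ordinary dictionary. The example below uses a hand-written excerpt with the same nesting as the parsed record, with plain dicts and lists standing in for the `DictElement` objects; `qualifiers_as_dict` is a hypothetical helper, not a Biopython function.

```python
# A hand-written excerpt with the same nesting as the parsed record above.
# Plain dicts and lists stand in for the DictElement objects.
cds_feature = {
    "GBFeature_key": "CDS",
    "GBFeature_location": "1..1701",
    "GBFeature_quals": [
        {"GBQualifier_name": "gene", "GBQualifier_value": "HA"},
        {"GBQualifier_name": "product", "GBQualifier_value": "hemagglutinin"},
        {"GBQualifier_name": "protein_id", "GBQualifier_value": "AKQ43545.1"},
    ],
}

def qualifiers_as_dict(feature):
    """Flatten a feature's list of name/value qualifiers into one dict."""
    return {
        q["GBQualifier_name"]: q["GBQualifier_value"]
        for q in feature["GBFeature_quals"]
    }

quals = qualifiers_as_dict(cds_feature)
print(quals["product"])
print(quals["protein_id"])
```

Note that if a qualifier name occurs more than once within a feature, this simple version keeps only the last value.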
So far we have only downloaded specific records from Entrez. In addition to downloading records, however, we can also run searches directly from Python. Any query that you can do on the Entrez website (https://www.ncbi.nlm.nih.gov/) can also be executed directly from Python. This allows you to find a large number of records all at once and process them in an automated fashion.
For example, the search term "influenza a virus texas h1n1 hemagglutinin complete cds" can be run on the Entrez website. A direct link to the search results is here:
https://www.ncbi.nlm.nih.gov/nuccore/?term=influenza+a+virus+texas+h1n1+hemagglutinin+complete+cds
The Python code below runs a search of the same kind from a script, using the search term "sars-cov2 complete cds" and limiting the number of search hits returned to the first 10.
# let's do a search for complete genomes of the SARS-COV2 virus
handle = Entrez.esearch(
    db="nucleotide", # database to search
    term="sars-cov2 complete cds", # search term
    retmax=10 # maximum number of results that are returned
)
record = Entrez.read(handle)
handle.close()
gi_list = record["IdList"] # list of genbank identifiers found
print(gi_list)
['1829138230', '1829138218', '1829138206', '1829138194', '1829138182', '1829138170', '1829138158', '1829138146', '1829138134', '1829138121']
Note that even though NCBI is phasing out sequence GI numbers, the esearch() function for now still returns GI numbers (numerical sequence identifiers without version information).
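Behind the scenes, a search like this is a web request to NCBI's esearch E-utility, with the arguments passed as URL parameters. The sketch below builds such a URL with the standard library only, to show how db, term, and retmax map onto the request. The helper name `build_esearch_url` is made up for this illustration, and Biopython adds further parameters (such as the email address we set earlier), so this is not the exact request that Entrez.esearch() sends.

```python
from urllib.parse import urlencode

# Base address of NCBI's esearch E-utility
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def build_esearch_url(db, term, retmax):
    """Return an esearch URL for the given database, term, and hit limit.

    Hypothetical helper for illustration; not part of Biopython.
    """
    # urlencode escapes the values, e.g. spaces in the term become '+'
    query = urlencode({"db": db, "term": term, "retmax": retmax})
    return ESEARCH_URL + "?" + query

url = build_esearch_url("nucleotide", "sars-cov2 complete cds", 10)
print(url)
```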
We can download all the GenBank records in the list of identifiers using the Entrez.efetch() function. This function provides us with a handle that needs to be processed with SeqIO.parse(). (We used SeqIO.read() previously, which reads a single record. SeqIO.parse() reads multiple records. See the Biopython SeqIO documentation for details.)
handle = Entrez.efetch(db="nucleotide", id=gi_list, rettype="gb", retmode="text")
records = SeqIO.parse(handle, "genbank")
for record in records:
    print(record.description)
handle.close() # important, close the handle only after you have iterated over the records. Otherwise you will get an error!
Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/MI-SC2-0009/2020 ORF1ab polyprotein (ORF1ab), surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds
Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/MI-SC2-0008/2020 ORF1ab polyprotein (ORF1ab), surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds
Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/MI-SC2-0007/2020 ORF1ab polyprotein (ORF1ab), surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds
Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/MI-SC2-0006/2020 ORF1ab polyprotein (ORF1ab), surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds
Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/MI-SC2-0005/2020 ORF1ab polyprotein (ORF1ab), surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds
Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/MI-SC2-0004/2020 ORF1ab polyprotein (ORF1ab), surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds
Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/MI-SC2-0003/2020 ORF1ab polyprotein (ORF1ab), surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds
Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/MI-SC2-0002/2020 ORF1ab polyprotein (ORF1ab), surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds
Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/MI-SC2-0001/2020 ORF1ab polyprotein (ORF1ab), surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds
Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/UNC_200189/2020, complete genome
As another example, let's search the "pubmed" database (database of scientific publications) for papers from 2015 written by "Wilke CO". The exact search term we need to use is the following: "Wilke CO[Author] AND 2015[Date - Publication]"
You can click here to see the result from that search online.
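Search terms of this kind combine field tags such as [Author] and [Date - Publication] with the operator AND. When several similar searches are needed, the term can be assembled by a small helper. The following is a minimal sketch; the function name build_pubmed_term is hypothetical and not part of Biopython.

```python
def build_pubmed_term(author, year):
    """Combine an author name and a publication year into one PubMed search term."""
    return f"{author}[Author] AND {year}[Date - Publication]"


# Reproduces the search term used in this section.
term = build_pubmed_term("Wilke CO", 2015)
print(term)
```

The resulting string can be passed as the term argument of Entrez.esearch().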
handle = Entrez.esearch(
db="pubmed", # database to search
term="Wilke CO[Author] AND 2015[Date - Publication]", # search term
retmax=10 # number of results that are returned
)
record = Entrez.read(handle)
handle.close()
# search returns PubMed IDs (pmids)
pmid_list = record["IdList"]
print(pmid_list)
['26770819', '26468068', '26430238', '26397960', '26355089', '26275208', '26020774', '25999509', '25787027', '25737813']
Just like with genes and genomes, we can download the records corresponding to these identifiers. In this case, the records are literature references. We'll print the author(s), title, and reference (source) of each one.
from Bio import Medline
handle = Entrez.efetch(db="pubmed", id=pmid_list, rettype="medline", retmode="text")
records = Medline.parse(handle)
for record in records:
    print(record['AU']) # author list
    print(record['TI']) # title
    print(record['SO']) # source (reference)
    print()
handle.close()
['Meyer AG', 'Spielman SJ', 'Bedford T', 'Wilke CO']
Time dependence of evolutionary metrics during the 2009 pandemic influenza virus outbreak.
Virus Evol. 2015 Jan;1(1). doi: 10.1093/ve/vev006. Epub 2015 Jan 1.

['Meyer AG', 'Wilke CO']
The utility of protein structure as a predictor of site-wise dN/dS varies widely among HIV-1 proteins.
J R Soc Interface. 2015 Oct 6;12(111):20150579. doi: 10.1098/rsif.2015.0579.

['Wilke CO']
Evolutionary paths of least resistance.
Proc Natl Acad Sci U S A. 2015 Oct 13;112(41):12553-4. doi: 10.1073/pnas.1517390112. Epub 2015 Oct 1.

['Spielman SJ', 'Wilke CO']
Pyvolve: A Flexible Python Module for Simulating Sequences along Phylogenies.
PLoS One. 2015 Sep 23;10(9):e0139047. doi: 10.1371/journal.pone.0139047. eCollection 2015.

['Kerr SA', 'Jackson EL', 'Lungu OI', 'Meyer AG', 'Demogines A', 'Ellington AD', 'Georgiou G', 'Wilke CO', 'Sawyer SL']
Computational and Functional Analysis of the Virus-Receptor Interface Reveals Host Range Trade-Offs in New World Arenaviruses.
J Virol. 2015 Nov;89(22):11643-53. doi: 10.1128/JVI.01408-15. Epub 2015 Sep 9.

['Houser JR', 'Barnhart C', 'Boutz DR', 'Carroll SM', 'Dasgupta A', 'Michener JK', 'Needham BD', 'Papoulas O', 'Sridhara V', 'Sydykova DK', 'Marx CJ', 'Trent MS', 'Barrick JE', 'Marcotte EM', 'Wilke CO']
Controlled Measurement and Comparative Analysis of Cellular Components in E. coli Reveals Broad Regulatory Changes in Response to Glucose Starvation.
PLoS Comput Biol. 2015 Aug 14;11(8):e1004400. doi: 10.1371/journal.pcbi.1004400. eCollection 2015 Aug.

['Meyer AG', 'Wilke CO']
Geometric Constraints Dominate the Antigenic Evolution of Influenza H3N2 Hemagglutinin.
PLoS Pathog. 2015 May 28;11(5):e1004940. doi: 10.1371/journal.ppat.1004940. eCollection 2015 May.

['Kachroo AH', 'Laurent JM', 'Yellman CM', 'Meyer AG', 'Wilke CO', 'Marcotte EM']
Evolution. Systematic humanization of yeast genes reveals conserved functions and genetic modularity.
Science. 2015 May 22;348(6237):921-5. doi: 10.1126/science.aaa0769.

['Echave J', 'Jackson EL', 'Wilke CO']
Relationship between protein thermodynamic constraints and variation of evolutionary rates among sites.
Phys Biol. 2015 Mar 19;12(2):025002. doi: 10.1088/1478-3975/12/2/025002.

['Spielman SJ', 'Kumar K', 'Wilke CO']
Comprehensive, structurally-informed alignment and phylogeny of vertebrate biogenic amine receptors.
PeerJ. 2015 Feb 17;3:e773. doi: 10.7717/peerj.773. eCollection 2015.
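Each record returned by Medline.parse() behaves like a dictionary keyed by field codes such as AU, TI, and SO. A field that is absent from a PubMed entry is simply a missing key, so a direct lookup like record['AU'] can fail with a KeyError. The sketch below uses plain dictionaries in place of real Medline records and falls back to placeholder text; the function name format_citation and the placeholder strings are assumptions made for illustration.

```python
def format_citation(record):
    """Return 'Authors. Title Source' for one Medline-style record (a dict)."""
    # .get() supplies a fallback when a field is missing from the record.
    authors = ", ".join(record.get("AU", ["[no authors listed]"]))
    title = record.get("TI", "[no title]")
    source = record.get("SO", "[no source]")
    return f"{authors}. {title} {source}"


# One record copied from the output above, written as a plain dict.
example = {
    "AU": ["Spielman SJ", "Wilke CO"],
    "TI": "Pyvolve: A Flexible Python Module for Simulating Sequences along Phylogenies.",
    "SO": "PLoS One. 2015 Sep 23;10(9):e0139047. doi: 10.1371/journal.pone.0139047. eCollection 2015.",
}
print(format_citation(example))

# A made-up record without author or source fields still formats cleanly.
print(format_citation({"TI": "A record without an author field."}))
```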
Problem 1:
Use the following code to download the genbank record KT220438 in XML and parse it with the Entrez.parse() function:
# Download sequence record for genbank id KT220438 (HA from influenza A)
handle = Entrez.efetch(db="nucleotide", id="KT220438", rettype="gb", retmode="xml")
parsed = Entrez.parse(handle)
record = list(parsed)[0] # Convert the parsed contents into a list and take element 0.
handle.close()
Then:
(a) Print out the value for the key GBSeq_definition.
(b) Find the CDS feature and print out all its qualifiers. Note that qualifiers are provided under the keyword GBFeature_quals.
# Problem 1a
print(record['GBSeq_definition'])
Influenza A virus (A/NewJersey/NHRC_93219/2015(H3N2)) segment 4 hemagglutinin (HA) gene, complete cds
# Problem 1b
features = record['GBSeq_feature-table'] # extract all the features
for feature in features: # loop over features and find CDS feature
    if feature['GBFeature_key']=='CDS':
        CDS_feature = feature
        break
qualifiers = CDS_feature['GBFeature_quals']
for q in qualifiers:
    print(q['GBQualifier_name'] + ": " + q['GBQualifier_value'])
gene: HA
function: receptor binding and fusion protein
codon_start: 1
transl_table: 1
product: hemagglutinin
protein_id: AKQ43545.1
translation: MKTIIALSYILCLVFAQKIPGNDNSTATLCLGHHAVPNGTIVKTITNDRIEVTNATELVQNSSIGEICDSPHQILDGENCTLIDALLGDPQCDGFQNKKWDLFVERSKAYSNCYPYDVPDYASLRSLVASSGTLEFNNESFNWTGVTQNGTSSACIRRSSSSFFSRLNWLTHLNYTYPALNVTMPNNEQFDKLYIWGVHHPGTDKDQIFLYAQSSGRITVSTKRSQQAVIPNIGSRPRIRDIPSRISIYWTIVKPGDILLINSTGNLIAPRGYFKIRSGKSSIMRSDAPIGKCKSECITPNGSIPNDKPFQNVNRITYGACPRYVKHSTLKLATGMRNVPEKQTRGIFGAIAGFIENGWEGMVDGWYGFRHQNSEGRGQAADLKSTQAAIDQINGKLNRLIGKTNEKFHQIEKEFSEVEGRIQDLEKYVEDTKIDLWSYNAELLVALENQHTXDLTDSEMNKLFEKTKKQLRENAEDMGNGCFKIYHKCDNACIGSIRNGTYDHNVYRDEALNNRFQIKGVELKSGYKDWILWISXAISCFLLCVALLGFIMWACQKGNIRCNICI
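The qualifiers arrive as a list of small dictionaries, each holding a name and a value. If individual qualifiers need to be looked up by name, it can be convenient to collapse that list into a single dictionary. The sketch below works on a hand-written list with the same shape as the parsed output; the function name qualifiers_to_dict is hypothetical, and the empty-string fallback is an assumption meant to cover qualifiers that carry no value.

```python
def qualifiers_to_dict(qualifiers):
    """Collapse a list of GBQualifier entries into a name -> value dictionary."""
    result = {}
    for q in qualifiers:
        # Fall back to an empty string if a qualifier has no value entry.
        result[q["GBQualifier_name"]] = q.get("GBQualifier_value", "")
    return result


# Hand-written stand-in for CDS_feature['GBFeature_quals'], using values from above.
cds_quals = [
    {"GBQualifier_name": "gene", "GBQualifier_value": "HA"},
    {"GBQualifier_name": "codon_start", "GBQualifier_value": "1"},
    {"GBQualifier_name": "product", "GBQualifier_value": "hemagglutinin"},
    {"GBQualifier_name": "protein_id", "GBQualifier_value": "AKQ43545.1"},
]

lookup = qualifiers_to_dict(cds_quals)
print(lookup["product"])
print(lookup["protein_id"])
```

Note that a dictionary keeps only one value per name, so a feature that repeats a qualifier name would lose all but the last occurrence.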
Problem 2:
(a) Use an Entrez esearch query of the pubmed database to find out how many publications "Spielman SJ" wrote in 2015.
(b) From the results of part (a), compile a combined list of all the co-authors of "Spielman SJ" in 2015.
# Problem 2a
handle = Entrez.esearch(
db="pubmed", # database to search
term="Spielman SJ[Author] AND 2015[Date - Publication]", # search term
retmax=10 # number of results that are returned
)
record = Entrez.read(handle)
handle.close()
# search returns PubMed IDs (pmids)
pmid_list = record["IdList"]
print("Publications found:", pmid_list)
print("Number of publications:", len(pmid_list))
Publications found: ['26770819', '26397960', '25737813', '25576365']
Number of publications: 4
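Counting the entries of the ID list gives the right answer here because fewer than retmax=10 papers matched. In general, the ID list is capped at retmax, whereas the "Count" field of the search result reports the total number of matches. A minimal sketch of a check for this situation follows; the function name search_is_truncated is hypothetical, and the second dictionary contains made-up numbers for illustration only.

```python
def search_is_truncated(result):
    """Return True if the search matched more records than the ID list contains."""
    return int(result["Count"]) > len(result["IdList"])


# Stand-in for the parsed esearch result of Problem 2a.
complete = {"Count": "4", "IdList": ["26770819", "26397960", "25737813", "25576365"]}

# Made-up result: 25 matches, but only 10 IDs returned because of retmax=10.
capped = {"Count": "25", "IdList": ["0"] * 10}

print(search_is_truncated(complete))
print(search_is_truncated(capped))
```

If the check returns True, either retmax should be increased or the total should be read from "Count" directly.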
# Problem 2b
from Bio import Medline
handle = Entrez.efetch(db="pubmed", id=pmid_list, rettype="medline", retmode="text")
records = Medline.parse(handle)
coauthors = [] # start with empty list of coauthors
for record in records:
    au_list = record['AU']
    for author in au_list:
        if author != "Spielman SJ" and author not in coauthors:
            coauthors.append(author)
handle.close()
print('Co-authors of "Spielman SJ" in 2015:')
for author in coauthors:
    print(" ", author)
Co-authors of "Spielman SJ" in 2015:
  Meyer AG
  Bedford T
  Wilke CO
  Kumar K
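The same loop can be extended to count how often each co-author appears instead of only listing the names. The sketch below uses collections.Counter on plain author lists; the function name count_coauthors is hypothetical. The three lists are the Spielman SJ papers that also appear in the "Wilke CO" output earlier in this section; the fourth paper found in Problem 2a is left out because its author list is not shown here.

```python
from collections import Counter


def count_coauthors(author_lists, focal_author):
    """Count how many papers each co-author shares with focal_author."""
    counts = Counter()
    for authors in author_lists:
        # Skip the focal author; count everyone else once per paper.
        counts.update(a for a in authors if a != focal_author)
    return counts


author_lists = [
    ["Meyer AG", "Spielman SJ", "Bedford T", "Wilke CO"],
    ["Spielman SJ", "Wilke CO"],
    ["Spielman SJ", "Kumar K", "Wilke CO"],
]

counts = count_coauthors(author_lists, "Spielman SJ")
for author, n in counts.most_common():
    print(author, n)
```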
Problem 3:
For larger searches, NCBI wants you to use the WebEnv method to download all your search results. This is explained in the Biopython tutorial here. Rewrite the SARS-CoV-2 search from the section "Running search queries through Entrez" in such a way that it uses the WebEnv method. For this downloading method, it makes sense to write all the results into a file and then read the results back in.
handle = Entrez.esearch(
db="nucleotide", # database to search
term="sars-cov2 complete cds", # search term
usehistory="y" # this switches on the WebEnv method
)
results = Entrez.read(handle)
handle.close()
# Because we ran the search with usehistory="y", the results now contain two additional
# pieces of information, the WebEnv session cookie and the QueryKey:
webenv = results["WebEnv"]
query_key = results["QueryKey"]
# We also get the number of search results:
count = int(results["Count"])
# We now download the results and store them in a local file
# Downloading happens in batches, let's download 20 results at a time:
batch_size = 20
out_handle = open("sars-cov2.gb", "w")
for start in range(0, count, batch_size):
    end = min(count, start+batch_size)
    print("Downloading records %i through %i" % (start+1, end))
    fetch_handle = Entrez.efetch(
        db="nucleotide", rettype="gb", retmode="text",
        retstart=start, retmax=batch_size,
        webenv=webenv, query_key=query_key
    )
    data = fetch_handle.read()
    fetch_handle.close()
    out_handle.write(data)
out_handle.close()
Downloading records 1 through 20
Downloading records 21 through 40
Downloading records 41 through 60
Downloading records 61 through 80
Downloading records 81 through 100
Downloading records 101 through 120
Downloading records 121 through 140
Downloading records 141 through 160
Downloading records 161 through 180
Downloading records 181 through 200
Downloading records 201 through 220
Downloading records 221 through 240
Downloading records 241 through 260
Downloading records 261 through 280
Downloading records 281 through 300
Downloading records 301 through 320
Downloading records 321 through 340
Downloading records 341 through 360
Downloading records 361 through 380
Downloading records 381 through 400
Downloading records 401 through 420
Downloading records 421 through 440
Downloading records 441 through 460
Downloading records 461 through 467
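The batching arithmetic in the download loop can be checked on its own, without contacting NCBI. The sketch below isolates it in a small generator; the name batch_ranges is hypothetical. Each pair is (start, end), where start is the zero-based value passed as retstart and end is the one-based number of the last record in the batch.

```python
def batch_ranges(count, batch_size):
    """Yield (start, end) pairs that cover `count` records in steps of `batch_size`."""
    for start in range(0, count, batch_size):
        # The last batch may be shorter than batch_size.
        yield start, min(count, start + batch_size)


# Same numbers as in the download above: 467 records, 20 per batch.
batches = list(batch_ranges(467, 20))
print(len(batches))
print(batches[0])
print(batches[-1])
```

With 467 records, the final batch starts at offset 460 and contains only seven records, which matches the last line of the download log.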
# We can now open the file we created and parse all the records we have previously stored
in_handle = open("sars-cov2.gb", "r")
records = SeqIO.parse(in_handle, "genbank")
for record in records:
    print(record.description)
in_handle.close()
Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/MI-SC2-0009/2020 ORF1ab polyprotein (ORF1ab), surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/MI-SC2-0008/2020 ORF1ab polyprotein (ORF1ab), surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/MI-SC2-0007/2020 ORF1ab polyprotein (ORF1ab), surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/MI-SC2-0006/2020 ORF1ab polyprotein (ORF1ab), surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/MI-SC2-0005/2020 ORF1ab polyprotein (ORF1ab), surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/MI-SC2-0004/2020 
ORF1ab polyprotein (ORF1ab), surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/MI-SC2-0003/2020 ORF1ab polyprotein (ORF1ab), surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/MI-SC2-0002/2020 ORF1ab polyprotein (ORF1ab), surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/MI-SC2-0001/2020 ORF1ab polyprotein (ORF1ab), surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/UNC_200189/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/UNC_200181/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/UNC_200191/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/TX_2967/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/TX_2817/2020, complete genome Severe acute respiratory 
syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/TX_2039/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/RI_0556/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/OR_5430/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/NY_2929/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/NH_0008/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/NH_0004/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/IL_1375/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/IL_1293/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/GA_1445/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/GA_1320/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/GA_1299/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/FL_6318/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/AZ_4811/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/KOR/BA-ACH_2719/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/KOR/BA-ACH_2718/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/KOR/BA-ACH_2604/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/UNC_200173/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/UF-2/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS 
CoV-2/human/USA/UF-1/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/Wuhan_IME-BJ07/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/Wuhan_IME-BJ05/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/Wuhan_IME-BJ04/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/Wuhan_IME-BJ03/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/Wuhan_IME-BJ02/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/Wuhan_IME-BJ01/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/Wuhan_IME-WH05/2019, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/Wuhan_IME-WH04/2019, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/Wuhan_IME-WH03/2019, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/Wuhan_IME-WH02/2019, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/Wuhan_IME-WH01/2019, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW466/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW465/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW464/2020 ORF1ab polyprotein (ORF1ab) and ORF1a polyprotein (ORF1ab) genes, partial cds; and surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds Severe acute respiratory syndrome coronavirus 2 isolate 
SARS-CoV-2/human/USAWA-UW463/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW461/2020 ORF1ab polyprotein (ORF1ab) and ORF1a polyprotein (ORF1ab) genes, partial cds; and surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW460/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW457/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW456/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW455/2020 ORF1ab polyprotein (ORF1ab) and ORF1a polyprotein (ORF1ab) genes, partial cds; and surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW453/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW452/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW451/2020 ORF1ab polyprotein (ORF1ab), ORF1a polyprotein (ORF1ab), surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW450/2020, complete genome Severe acute respiratory syndrome 
coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW449/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW448/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW447/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW446/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW445/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW444/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW443/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW442/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW441/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW439/2020 ORF1ab polyprotein (ORF1ab) and ORF1a polyprotein (ORF1ab) genes, partial cds; and surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW438/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW437/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW436/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW435/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW434/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW433/2020, complete 
genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW432/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USACT-UW430/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW429/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USACT-UW428/2020 ORF1ab polyprotein (ORF1ab) and ORF1a polyprotein (ORF1ab) genes, partial cds; surface glycoprotein (S) and ORF3a protein (ORF3a) genes, complete cds; envelope protein (E) gene, partial cds; and membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USACT-UW427/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW426/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW425/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW424/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USACT-UW423/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW422/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW421/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USACT-UW420/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW419/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW418/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW417/2020, complete genome Severe acute 
respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW416/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW415/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW414/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW413/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USACT-UW412/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USACT-UW411/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW410/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAOR-UW409/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW408/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW407/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW406/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW405/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW404/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW403/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW402/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW401/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW400/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW399/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 
SARS-CoV-2/human/USAUNKNOWN-UW398/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW397/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW396/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW395/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW394/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW393/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW392/2020 ORF1ab polyprotein (ORF1ab) and ORF1a polyprotein (ORF1ab) genes, partial cds; and surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USAWA-UW391/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/ESP/Valencia26/2020 ORF1ab polyprotein (ORF1ab) gene, complete cds; ORF1a polyprotein (ORF1ab) gene, partial cds; surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), and ORF6 protein (ORF6) genes, complete cds; ORF7a protein (ORF7a) and ORF7b (ORF7b) genes, partial cds; and ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/ESP/Valencia23/2020 ORF1ab polyprotein (ORF1ab) gene, complete cds; ORF1a polyprotein (ORF1ab) gene, partial cds; and surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), 
nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/ESP/Valencia25/2020 ORF1ab polyprotein (ORF1ab) gene, complete cds; ORF1a polyprotein (ORF1ab) gene, partial cds; surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), and ORF6 protein (ORF6) genes, complete cds; ORF7a protein (ORF7a) and ORF7b (ORF7b) genes, partial cds; and ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/ESP/Valencia22/2020 ORF1ab polyprotein (ORF1ab), ORF1a polyprotein (ORF1ab), surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/ESP/Valencia20/2020 ORF1ab polyprotein (ORF1ab), ORF1a polyprotein (ORF1ab), surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/ESP/Valencia19/2020 ORF1ab polyprotein (ORF1ab), ORF1a polyprotein (ORF1ab), surface glycoprotein (S), ORF3a protein (ORF3a), envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/ESP/Valencia18/2020 ORF1ab polyprotein (ORF1ab), ORF1a polyprotein (ORF1ab), surface glycoprotein (S), ORF3a protein (ORF3a), 
envelope protein (E), membrane glycoprotein (M), ORF6 protein (ORF6), ORF7a protein (ORF7a), ORF7b (ORF7b), ORF8 protein (ORF8), nucleocapsid phosphoprotein (N), and ORF10 protein (ORF10) genes, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/ESP/Valencia16/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/ESP/Valencia15/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/ESP/Valencia14/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/ESP/Valencia11/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/ESP/Valencia12/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/ESP/Valencia17/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/ESP/Valencia13/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/ISR/ISR_IT0320/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/ISR/ISR_JP0320/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/TX_2020/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/FL_5091/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/FL_5125/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/OR_2656/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/GA_2742/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/GA_2741/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA_5030/2020, complete genome Severe acute respiratory syndrome coronavirus 2 
isolate SARS-CoV-2/human/USA/CA_2602/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/RI_0520/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW390/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW389/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW388/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW386/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW385/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW384/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW383/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW380/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW379/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW378/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW376/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW375/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW374/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW373/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW372/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW371/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW370/2020, complete 
genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW369/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW368/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW367/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW366/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW365/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW364/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW363/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW362/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW361/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW360/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW359/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW358/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW357/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW356/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW355/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW354/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW353/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW352/2020, complete genome Severe acute respiratory syndrome coronavirus 2 
isolate SARS-CoV-2/human/USA/WA-UW351/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW350/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW349/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW348/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW346/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW345/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW344/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW343/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW342/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW341/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW340/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW339/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW338/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW337/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW336/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW335/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW334/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW333/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW332/2020, complete 
genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW331/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW330/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW328/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW326/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW325/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW324/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW323/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW322/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW320/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW319/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW317/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW315/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW314/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW313/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW310/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW308/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW307/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW305/2020, complete genome Severe acute respiratory syndrome coronavirus 2 
isolate SARS-CoV-2/human/USA/WA-UW304/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW303/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW301/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW300/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW299/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW298/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-NH23/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-NH13/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-NH24/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-NH22/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-NH21/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-NH20/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-NH19/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-NH18/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-NH17/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-NH14/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-NH12/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-NH11/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-NH10/2020, complete genome Severe 
acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-NH9/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-NH8/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-NH7/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-NH6/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-NH5/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-NH4/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-NH3/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-NH2/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/PER/Peru-10/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-Cov-2/human/PAK/Manga1/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW297/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW296/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW295/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW294/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW292/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW291/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW290/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW288/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 
SARS-CoV-2/human/USA/WA-UW287/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW285/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW284/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW282/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW279/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW277/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW276/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW275/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW274/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW272/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW271/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW269/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW268/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW266/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW265/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW264/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW262/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW261/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW260/2020, complete genome 
Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW259/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW258/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW257/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW256/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW255/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW254/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW253/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW252/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW251/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW249/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW247/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW245/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW244/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW243/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/CZB-RR057-015/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/CZB-RR057-014/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/CZB-RR057-013/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/CZB-RR057-011/2020, complete genome Severe acute respiratory syndrome 
coronavirus 2 isolate SARS-CoV-2/human/USA/CZB-RR057-007/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/CZB-RR057-006/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/CZB-RR057-005/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/Wuhan_YB012504/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/Wuhan_YB012506/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/Wuhan_YB012602/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/Wuhan_YB012605/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/Wuhan_YB012611/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/Wuhan_OS52/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/ESP/Valencia003/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/HZ-91/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/HZ-90/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/HZ-79/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/HZ-638/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/HZ-62/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/HZ-60/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/HZ-576/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/HZ-551/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 
SARS-CoV-2/human/CHN/HZ-49/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/HZ-48/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/HZ-481/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/HZ-477/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/HZ-185/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/HZ-178/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/HZ-162/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW242/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW241/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW235/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW234/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW240/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW239/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW238/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW237/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW236/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW233/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW232/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW231/2020, complete genome Severe acute 
respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW230/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW229/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW228/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW227/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW225/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW224/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW223/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW222/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW221/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW220/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW219/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW218/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW217/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW216/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW215/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW214/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW213/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW212/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 
SARS-CoV-2/human/USA/WA-UW211/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW210/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW209/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW207/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW205/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW204/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW203/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW202/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW201/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW200/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW199/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW198/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW197/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW196/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW195/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW194/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW193/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW192/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/PAK/Gilgit1/2020, complete genome 
Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/ESP/Valencia8/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/ESP/Valencia7/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/ESP/Valencia5/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/KMS1/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/VNM/nCoV-19-02S/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/VNM/nCoV-19-01S/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/PC00101P/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/TWN/CGMH-CGU-01/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/IQTC03/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/IQTC04/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/IQTC02/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/WH-09/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA7-UW4/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA6-UW3/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA4-UW2/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA3-UW1/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/SH01/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA2/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 
SARS-CoV-2/human/CHN/235/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/233/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/231/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/105/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/BRA/SP02/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/VIE/NIHE/2020 ORF3a (orf3a) gene, partial cds; envelope protein (E), membrane glycoprotein (M), and ORF6 protein (ORF6) genes, complete cds; and ORF7a protein (ORF7a) gene, partial cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/VIE/NIHE/2020 ORF8 protein (ORF8) gene, partial cds; and nucleocapsid phosphoprotein (N) gene, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/IQTC01/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/SWE/01/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/HS_194/2020 nucleocapsid phosphoprotein (N) gene, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/HS_92/2020 nucleocapsid phosphoprotein (N) gene, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/HS_86/2020 nucleocapsid phosphoprotein (N) gene, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/HS_84/2020 nucleocapsid phosphoprotein (N) gene, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/HS_64/2020 nucleocapsid phosphoprotein (N) gene, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/HS_46/2020 nucleocapsid phosphoprotein (N) gene, complete cds Severe acute respiratory syndrome coronavirus 2 isolate 
SARS-CoV-2/human/CHN/HS_38/2020 nucleocapsid phosphoprotein (N) gene, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/HS_18/2020 nucleocapsid phosphoprotein (N) gene, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/HS_17/2020 nucleocapsid phosphoprotein (N) gene, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/HS_8/2020 nucleocapsid phosphoprotein (N) gene, complete cds Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/NPL/61-TW/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/NTU02/TWN/human/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/NTU01/TWN/human/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/ITA/INMI1/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/IND/166/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/Yunnan-01/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/IND/29/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-WA1/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate WIV07, complete genome Severe acute respiratory syndrome coronavirus 2 isolate WIV06, complete genome Severe acute respiratory syndrome coronavirus 2 isolate WIV05, complete genome Severe acute respiratory syndrome coronavirus 2 isolate WIV04, complete genome Severe acute respiratory syndrome coronavirus 2 isolate WIV02, complete genome Severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1, complete genome Severe acute respiratory syndrome coronavirus 2 isolate USA/MN1-MDH1/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate USA/MN2-MDH2/2020, complete genome Severe 
acute respiratory syndrome coronavirus 2 isolate USA/MN3-MDH3/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CruiseA-26/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CruiseA-25/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CruiseA-24/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CruiseA-23/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CruiseA-22/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CruiseA-21/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CruiseA-19/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CruiseA-6/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CruiseA-5/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CruiseA-4/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CruiseA-3/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CruiseA-2/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CruiseA-1/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CruiseA-18/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CruiseA-17/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CruiseA-16/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CruiseA-15/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CruiseA-14/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 
2019-nCoV/USA-CruiseA-13/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CruiseA-9/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CruiseA-12/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CruiseA-11/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CruiseA-10/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CruiseA-8/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CruiseA-7/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CA9/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-TX1/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CA8/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CA7/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CA6/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-IL2/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-MA1/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-WI1/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate SNU01, complete genome Severe acute respiratory syndrome coronavirus 2 isolate HZ-1, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CA5/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CA4/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CA3/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-WA1-F6/2020, complete genome Severe acute respiratory 
syndrome coronavirus 2 isolate 2019-nCoV/USA-WA1-A12/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate BetaCoV/Wuhan/IPBCAMS-WH-05/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate BetaCoV/Wuhan/IPBCAMS-WH-04/2019, complete genome Severe acute respiratory syndrome coronavirus 2 isolate BetaCoV/Wuhan/IPBCAMS-WH-03/2019, complete genome Severe acute respiratory syndrome coronavirus 2 isolate BetaCoV/Wuhan/IPBCAMS-WH-02/2019, complete genome Severe acute respiratory syndrome coronavirus 2 isolate BetaCoV/Wuhan/IPBCAMS-WH-01/2019, complete genome Severe acute respiratory syndrome coronavirus 2 isolate Australia/VIC01/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-AZ1/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CA2/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-CA1/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV/USA-IL1/2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV WHU02, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV WHU01, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV_HKU-SZ-005b_2020, complete genome Severe acute respiratory syndrome coronavirus 2 isolate 2019-nCoV_HKU-SZ-002a_2020, complete genome
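A listing this long is hard to read by eye, so it can help to summarize it programmatically. The following is a minimal sketch, assuming the record titles are already available as a Python list of strings. The variable `descriptions` and the helpers `classify()` and `isolate_name()` are hypothetical names introduced here for illustration, and the three sample titles are copied from the output above.

```python
import re
from collections import Counter

# Hypothetical sample: a few titles copied from the listing above.
descriptions = [
    "Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/WA-UW390/2020, complete genome",
    "Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/CHN/HS_194/2020 nucleocapsid phosphoprotein (N) gene, complete cds",
    "Severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1, complete genome",
]

def classify(description):
    """Label a title as 'complete genome' or 'other' based on its suffix."""
    if description.endswith(", complete genome"):
        return "complete genome"
    return "other"

def isolate_name(description):
    """Return the token following 'isolate ', or None if there is none.

    Assumption: the isolate name contains no spaces. A name such as
    '2019-nCoV WHU02' would be cut off at the space.
    """
    match = re.search(r"isolate (\S+?)(?:,|\s|$)", description)
    return match.group(1) if match else None

# Tally the two categories and collect the isolate names.
counts = Counter(classify(d) for d in descriptions)
isolates = [isolate_name(d) for d in descriptions]

print(counts)
print(isolates)
```

Applied to the full list of titles, the same tally would show how many of the records are complete genomes and how many cover only individual genes.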