CSC334 Lab1



Back to CSC334 Lab Page


Retrieving a Nucleotide (DNA) sequence from the NCBI database

In this lab we retrieve a DNA sequence in FASTA format that we can use for various experiments.

Procedure

  • Point your browser to www.ncbi.nlm.nih.gov
  • Select Nucleotide from the drop-down menu, and enter escherichia coli in the search box.
  • Click on the first link, in our case AB426820, and select FASTA in the display box. You should get something like this (note that the first line may wrap around on your screen, but it is a single line and should not contain a carriage return):
>gi|194306025|dbj|AB426820.1| Escherichia coli ompT mRNA for outer membrane protease T, partial cds, strain: JCM 5491
TGGGAATAGTCCTGACAACCCCTATTGCGATCAGCTCTTTTGCTTCTACCGAGACTTTATCGTTTACTCC
TGACAACATAAATGCGGACATTAGTCTTGGAACTCTGAGCGGAAAAACAAAAGAGCGTGTTTATCTAGCC 
GAAGAAGGAGGCCGAAAGGTCAGTCAACTTGACTGGAAATTCAATAACGCTGCAATTATTAAAGGTGCAA
TTAATTGGGATTTGATGCCCCAGATATCTATCGGGGCTGCTGGCTGGACAACTCTCGGTAGCCGAGGTGG  
CAATATGGTCGATCGGGACTGGATGGATTCCAGTAACCCCGGAACCTGGACGGATGAAAGTAGACACCCT 
GATACACAACTCAATTATGCCAACGAATTTGATCTGAATATCAGAGGCTGGCTCCCCAACGAACCCAATT
ACCGCCTGGGACTCATGGCCGGATATCAGGAAAGCCGTTATAGCTTTACAGCCAGAGGGGGTTCCTATAT
CTACAGTTCTGAGGAGGGATTCAGAGATGATATCGGCTCCTTCCCGAATGGAGAAAGAGCAATCGGCTAC
AAACAACGTTTTAAAATGCCCTACATTGGCTTGACTGGAAGTTATCGTTATGAAGATTTTGAGCTAGGTG
GTACATTTAAATACAGCGGCTGGGTGGAAGCATTTGATAACGATGAACACTATGACCCAGGAAAAAGAAT
CACTTATCGCAGTAAAGTCAAAGACCAAAATTACTATTCTGTTGCAGTCAATGCAGGTTATTACGTAACG
CCTAATGCAAAAGTTTATATTGAAGGCGCATGGAATCGGGTTACGAATAAAAAAGGTGATACTTCACTTT
ATGATCACAATGATAACACTTCTGACTACAGCAAAAATGGTGCAGGCATAGAAAACTATAACTTCATCAC
TACTGCTGGTC
  • That's it: you have your first FASTA nucleotide sequence from the E. coli bacterium. You can store it in a text file, or copy and paste it into a program or a Web form, depending on what you want to do with it.
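
If you paste the sequence into a program, the first thing to do is usually to separate the header line from the sequence lines. The following is a minimal sketch of how that could be done in Python. The function name parse_fasta is our own choice and not part of any library, and the example string contains only the first two sequence lines of the record above.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()          # drop trailing blanks and carriage returns
        if not line:
            continue                 # skip empty lines
        if line.startswith(">"):
            # A new header line: store the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Header plus the first two sequence lines of AB426820 (shortened for the example).
example = """>gi|194306025|dbj|AB426820.1| Escherichia coli ompT mRNA for outer membrane protease T, partial cds, strain: JCM 5491
TGGGAATAGTCCTGACAACCCCTATTGCGATCAGCTCTTTTGCTTCTACCGAGACTTTATCGTTTACTCC
TGACAACATAAATGCGGACATTAGTCTTGGAACTCTGAGCGGAAAAACAAAAGAGCGTGTTTATCTAGCC
"""

header, sequence = parse_fasta(example)[0]
print(header.split("|")[3])   # accession field of the header
print(len(sequence))          # number of bases read
```

The sequence lines are joined into one string, so the line breaks in the FASTA file do not matter to the rest of your program.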

Getting the Protein translation

Follow the same steps as above, but this time select GenBank as the display option for the same AB426820 record. Note, in the middle of the display, the /translation entry, which shows the protein that the DNA sequence translates to:

   /translation="GIVLTTPIAISSFASTETLSFTPDNINADISLGTLSGKTKERVY
                    LAEEGGRKVSQLDWKFNNAAIIKGAINWDLMPQISIGAAGWTTLGSRGGNMVDRDWMD
                    SSNPGTWTDESRHPDTQLNYANEFDLNIRGWLPNEPNYRLGLMAGYQESRYSFTARGG
                    SYIYSSEEGFRDDIGSFPNGERAIGYKQRFKMPYIGLTGSYRYEDFELGGTFKYSGWV
                    EAFDNDEHYDPGKRITYRSKVKDQNYYSVAVNAGYYVTPNAKVYIEGAWNRVTNKKGD
                    TSLYDHNDNTSDYSKNGAGIENYNFITTAG"
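
To see where this translation comes from, you can translate the DNA yourself. The sketch below assumes the standard genetic code and is only an illustration, not the method NCBI uses. The names CODON_TABLE and translate are our own. The offset of 2 is also an assumption: it is the reading frame that makes the first line of the FASTA sequence match the beginning of the translation shown above.

```python
# Standard genetic code, listed with the bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna, offset=0):
    """Translate DNA into one-letter amino acids, starting at position offset.

    Stop codons are shown as '*', unknown codons as 'X'.
    An incomplete codon at the end of the string is ignored.
    """
    dna = dna.upper()
    protein = []
    for pos in range(offset, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[pos:pos + 3], "X"))
    return "".join(protein)


# First sequence line of the AB426820 FASTA record.
first_line = "TGGGAATAGTCCTGACAACCCCTATTGCGATCAGCTCTTTTGCTTCTACCGAGACTTTATCGTTTACTCC"
print(translate(first_line, offset=2))   # GIVLTTPIAISSFASTETLSFT
```

The printed string is the start of the /translation entry. Trying offsets 0 and 1 instead gives different proteins, which is a quick way to see why the reading frame matters.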

References/Misc. Links

  • Check out the PubMed Tutorials on the NCBI main page to learn how to use the databases.












© D. Thiebaut 2008