Finding Hidden Messages in DNA (Bioinformatics I)

Start Date: 07/05/2020

Course Type: Common Course

Course Link: https://www.coursera.org/learn/dna-analysis

About Course

Named a top 50 MOOC of all time by Class Central! This course begins a series of classes illustrating the power of computing in modern biology. Please join us on the frontier of bioinformatics to look for hidden messages in DNA without ever needing to put on a lab coat. In the first half of the course, we investigate DNA replication, and ask the question, where in the genome does DNA replication begin? We will see that we can answer this question for many bacteria using only some straightforward algorithms to look for hidden messages in the genome. In the second half of the course, we examine a different biological question, when we ask which DNA patterns play the role of molecular clocks. The cells in your body manage to maintain a circadian rhythm, but how is this achieved on the level of DNA? Once again, we will see that by knowing which hidden messages to look for, we can start to understand the amazingly complex language of DNA. Perhaps surprisingly, we will apply randomized algorithms, which roll dice and flip coins in order to solve problems. Finally, you will get your hands dirty and apply existing software tools to find recurring biological motifs within genes that are responsible for helping Mycobacterium tuberculosis go "dormant" within a host for many years before causing an active infection.
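
The description above does not say which "straightforward algorithms" are meant. As a minimal sketch, assuming two simple techniques often applied to this kind of question (counting the most frequent k-mers in a region, and locating the minimum of the running G-minus-C skew), the idea could look like this in Python. The function names `frequent_kmers` and `min_skew_positions` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def frequent_kmers(text: str, k: int) -> list[str]:
    """Return the k-mers that occur most often in text (overlaps allowed)."""
    if k <= 0 or k > len(text):
        return []
    # Slide a window of width k across the text and count every substring.
    counts = Counter(text[i:i + k] for i in range(len(text) - k + 1))
    best = max(counts.values())
    return sorted(kmer for kmer, n in counts.items() if n == best)


def min_skew_positions(genome: str) -> list[int]:
    """Return the positions where the running (#G - #C) count is lowest."""
    skew = [0]  # skew[i] is the G-minus-C count over the first i bases
    for base in genome:
        step = 1 if base == "G" else -1 if base == "C" else 0
        skew.append(skew[-1] + step)
    lowest = min(skew)
    return [i for i, value in enumerate(skew) if value == lowest]


if __name__ == "__main__":
    print(frequent_kmers("ACGTTGCATGTCGCATGATGCATGAGAGCT", 4))  # → ['CATG', 'GCAT']
    print(min_skew_positions("CCGGAC"))  # → [2]
```

Both functions take plain strings, so they can be tried on any short sequence before being applied to a full bacterial genome.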

Course Syllabus

Welcome to class!

This course will focus on two questions at the forefront of modern computational biology. The algorithmic approaches we will use to solve them appear in parentheses:

  1. Weeks 1-2: Where in the Genome Does DNA Replication Begin? (Algorithmic Warmup)
  2. Weeks 3-4: Which DNA Patterns Play the Role of Molecular Clocks? (Randomized Algorithms)
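
The syllabus names randomized algorithms for Weeks 3-4 but does not say which ones. The sketch below assumes one well-known possibility, randomized motif search with random restarts: pick a random k-mer from each sequence, build a profile from those k-mers, replace each k-mer with the most probable one under that profile, and repeat while the score improves. All function names are hypothetical, and the code assumes the sequences contain only uppercase A, C, G and T.

```python
import random


def profile_from(motifs: list[str]) -> list[dict[str, float]]:
    """Column-wise base frequencies, with +1 pseudocounts to avoid zeros."""
    total = len(motifs) + 4
    profile = []
    for j in range(len(motifs[0])):
        column = [m[j] for m in motifs]
        profile.append({b: (column.count(b) + 1) / total for b in "ACGT"})
    return profile


def most_probable_kmer(seq: str, k: int, profile: list[dict[str, float]]) -> str:
    """Return the k-mer of seq with the highest probability under profile."""
    best, best_p = seq[:k], -1.0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        p = 1.0
        for j, base in enumerate(kmer):
            p *= profile[j][base]
        if p > best_p:
            best, best_p = kmer, p
    return best


def score(motifs: list[str]) -> int:
    """Count bases that differ from each column's most common base."""
    total = 0
    for j in range(len(motifs[0])):
        column = [m[j] for m in motifs]
        total += len(column) - max(column.count(b) for b in "ACGT")
    return total


def randomized_motif_search(dna: list[str], k: int, rng: random.Random) -> list[str]:
    """One run: start from random k-mers and iterate until the score stops improving."""
    best = []
    for seq in dna:
        start = rng.randrange(len(seq) - k + 1)  # the "roll of the dice"
        best.append(seq[start:start + k])
    while True:
        profile = profile_from(best)
        motifs = [most_probable_kmer(seq, k, profile) for seq in dna]
        if score(motifs) < score(best):
            best = motifs
        else:
            return best


def best_of_restarts(dna: list[str], k: int, restarts: int = 200, seed: int = 0) -> list[str]:
    """Repeat the randomized search and keep the lowest-scoring motif set."""
    rng = random.Random(seed)
    runs = (randomized_motif_search(dna, k, rng) for _ in range(restarts))
    return min(runs, key=score)


if __name__ == "__main__":
    dna = ["TTACCTTAAC", "GATGTCTGTC", "ACGGCGTTAG", "CCCTAACGAG", "CGTCAGAGGT"]
    print(best_of_restarts(dna, 4))
```

A single run can easily settle on a poor motif set, which is why the sketch restarts many times and keeps the best result. The fixed seed only makes the output repeatable.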

Week 5 will consist of a Bioinformatics Application Challenge in which you will get to apply software for finding DNA motifs to a real biological dataset.

Each of the two chapters in the course is accompanied by a Bioinformatics Cartoon created by Randall Christopher and serving as a chapter header in the Specialization's bestselling print companion. You can find the first chapter's cartoon at the bottom of this message. What does a cryptic message leading to buried treasure have to do with biology? We hope you will join us to find out!

Phillip and Pavel

Course Introduction

Finding Hidden Messages in DNA (Bioinformatics I)

DNA, the building block of life, is everywhere. We are 99.9% certain that you will find DNA on the objects you use every day, regardless of their size or type. But how do we find DNA in samples that might have been contaminated by someone? What happens if we contaminate a sample and get a "DNA signature" from it? How does DNA work, and where does it all come from? Find out in this course.

This course, Bioinformatics I, is the first in a series. The second course is "Genome Sequencing (Bioinformatics II)" and the third is "Comparing Genes, Proteins, and Genomes (Bioinformatics III)". Please note that the free version of this class gives you access to all of the instructional videos and handouts. The peer feedback and quizzes are only available in the paid version.

For example, when a genome is sequenced, some people's own samples may be missing, which might affect the content of the genome. How do we find missing samples? DNA sequencing is the process of determining the order of nucleotides in the genetic material that is passed down to the next generation. Nucleotides are the building blocks of DNA.

Course Tag

Bioinformatics, Bioinformatics Algorithms, Algorithms, Python Programming

Related Wiki Topic

Article: Example
Bioinformatics: Bioinformatics is a field of science that is similar to but distinct from biological computation, and it is often considered synonymous with computational biology. Biological computation uses bioengineering and biology to build biological computers, whereas bioinformatics uses computation to better understand biology. Bioinformatics and computational biology involve the analysis of biological data, particularly DNA, RNA, and protein sequences. The field of bioinformatics experienced explosive growth starting in the mid-1990s, driven largely by the Human Genome Project and by rapid advances in DNA sequencing technology.
Bioinformatics: Common activities in bioinformatics include mapping and analyzing DNA and protein sequences, aligning DNA and protein sequences to compare them, and creating and viewing 3-D models of protein structures.
Hidden message: The information in hidden messages is not immediately noticeable; it must be discovered or uncovered, and interpreted before it can be known. Hidden messages include backwards audio messages, hidden visual messages, and symbolic or cryptic codes such as a crossword or cipher. There are many legitimate examples of hidden messages, though many are imaginings.
Hidden message: Hidden messages can be created in visual mediums with techniques such as hidden text and steganography.
Hidden message: A hidden message is information that is not immediately noticeable, and that must be discovered or uncovered and interpreted before it can be known. Hidden messages include backwards audio messages, hidden visual messages and symbolic or cryptic codes such as a crossword or cipher. Although there are many legitimate examples of hidden messages created with techniques such as backmasking and steganography, many so-called hidden messages are merely fanciful imaginings or apophany.
Closest string: In bioinformatics, the closest string problem is an intensively studied facet of the problem of finding signals in DNA.
Bioinformatics: Bioinformatics has become an important part of many areas of biology. In experimental molecular biology, bioinformatics techniques such as image and signal processing allow extraction of useful results from large amounts of raw data. In the field of genetics and genomics, it aids in sequencing and annotating genomes and their observed mutations. It plays a role in the text mining of biological literature and the development of biological and gene ontologies to organize and query biological data. It also plays a role in the analysis of gene and protein expression and regulation. Bioinformatics tools aid in the comparison of genetic and genomic data and more generally in the understanding of evolutionary aspects of molecular biology. At a more integrative level, it helps analyze and catalogue the biological pathways and networks that are an important part of systems biology. In structural biology, it aids in the simulation and modeling of DNA, RNA, and proteins, as well as biomolecular interactions.
Bioinformatics: Bioinformatics is an interdisciplinary field that develops methods and software tools for understanding biological data. As an interdisciplinary field of science, bioinformatics combines computer science, statistics, mathematics, and engineering to analyze and interpret biological data. Bioinformatics has been used for "in silico" analyses of biological queries using mathematical and statistical techniques.
Bioinformatics: A bioinformatics workflow management system is a specialized form of a workflow management system designed specifically to compose and execute a series of computational or data manipulation steps, or a workflow, in a bioinformatics application. Such systems are designed to
Bioinformatics: MOOC platforms also provide online certifications in bioinformatics and related disciplines, including Coursera's Bioinformatics Specialization (UC San Diego) and Genomic Data Science Specialization (Johns Hopkins) as well as EdX's Data Analysis for Life Sciences XSeries (Harvard).
Secret Messages: "Secret Messages", as its title suggests, was littered with hidden messages in the form of backmasking, some obvious and others less so. This was Jeff Lynne's second tongue-in-cheek response to allegations of hidden Satanic messages in earlier Electric Light Orchestra LPs by Christian fundamentalists which led up to early 1980s American congressional hearings (a similar response had been made by Lynne on the "Face the Music" album, during the intro to the "Fire on High" track). In Britain, the back cover of "Secret Messages" has a mock warning about the hidden messages. Word of the album's impending release in the United States caused enough of a furore to cause CBS Records to delete the cover blurb there.
Bioinformatics: Software tools for bioinformatics range from simple command-line tools to more complex graphical programs and standalone web services available from various bioinformatics companies or public institutions.
Complementary DNA: The term "cDNA" is also used, typically in a bioinformatics context, to refer to an mRNA transcript's sequence, expressed as DNA bases (GCAT) rather than RNA bases (GCAU).
Bioinformatics: Basic bioinformatics services are classified by the EBI into three categories: SSS (Sequence Search Services), MSA (Multiple Sequence Alignment), and BSA (Biological Sequence Analysis). The availability of these service-oriented bioinformatics resources demonstrates the applicability of web-based bioinformatics solutions, and they range from a collection of standalone tools with a common data format under a single, standalone or web-based interface, to integrative, distributed and extensible bioinformatics workflow management systems.
Bioinformatics: The range of open-source software packages includes titles such as Bioconductor, BioPerl, Biopython, BioJava, BioJS, BioRuby, Bioclipse, EMBOSS, .NET Bio, Orange with its bioinformatics add-on, Apache Taverna, UGENE and GenoCAD. To maintain this tradition and create further opportunities, the non-profit Open Bioinformatics Foundation has supported the annual Bioinformatics Open Source Conference (BOSC) since 2000.
Bioinformatics: Historically, the term "bioinformatics" did not mean what it means today. Paulien Hogeweg and Ben Hesper coined it in 1970 to refer to the study of information processes in biotic systems. This definition placed bioinformatics as a field parallel to biophysics (the study of physical processes in biological systems) or biochemistry (the study of chemical processes in biological systems).
UCPH Bioinformatics Centre: The center is headed by Anders Krogh, who pioneered the use of hidden Markov models in bioinformatics, together with David Haussler. The center further consists of six postdocs and about 17 PhD students.
Bioinformatics: Databases are essential for bioinformatics research and applications. Many databases exist, covering various information types: for example, DNA and protein sequences, molecular structures, phenotypes and biodiversity. Databases may contain empirical data (obtained directly from experiments), predicted data (obtained from analysis), or, most commonly, both. They may be specific to a particular organism, pathway or molecule of interest. Alternatively, they can incorporate data compiled from multiple other databases. These databases vary in their format, way of accession and whether they are public or not.
DNA: In a paper published in "Nature" in January 2013, scientists from the European Bioinformatics Institute and Agilent Technologies proposed a mechanism to use DNA's ability to code information as a means of digital data storage. The group was able to encode 739 kilobytes of data into DNA code, synthesize the actual DNA, then sequence the DNA and decode the information back to its original form, with a reported 100% accuracy. The encoded information consisted of text files and audio files. A prior experiment was published in August 2012. It was conducted by researchers at Harvard University, where the text of a 54,000-word book was encoded in DNA.
Bioinformatics: A bioinformatics tool, BPGA, can be used to characterize the pan-genome of bacterial species.