Whole genome sequencing of bacterial genomes - tools and applications

Start Date: 07/05/2020

Course Type: Common Course

Course Link: https://www.coursera.org/learn/wgs-bacteria

About Course

This course will cover the topic of whole genome sequencing (WGS) of bacterial genomes, which is becoming more and more relevant for the medical sector. WGS technology and applications are high on the international political agenda, as the classical methods are being replaced by WGS technology. Bioinformatic tools are therefore extremely important for allowing people working in this sector to analyze the data and obtain results that can be interpreted and used for different purposes.

The course will give learners a basis to understand and become acquainted with WGS applications in the surveillance of bacteria, including species identification, typing, and characterization of antimicrobial resistance and virulence traits, as well as plasmid characterization. It will also give learners the opportunity to learn about online tools and what they can be used for, through demonstrations of some of these tools and exercises to be solved with freely available WGS analysis tools.

By the end of this course you should be able to:

1. Describe the general principles in typing of bacteria
2. Give examples of the applications of whole genome sequencing to surveillance of bacterial pathogens and antimicrobial resistance
3. Apply genomic tools for sub-typing and surveillance
4. Define the concept of next-generation sequencing (NGS) and describe the sequencing data from NGS
5. Describe how to do de novo assembly from raw reads to contigs
6. Enumerate the methods behind the tools for species identification, MLST typing and resistance gene detection
7. Apply the tools for species identification, MLST typing and resistance gene detection in real cases of other bacterial and pathogen genomes
8. Describe the methods behind the tools for Salmonella and E. coli typing, plasmid replicon detection and plasmid typing
9. Utilize the tools for Salmonella and E. coli typing, plasmid replicon detection and plasmid typing in real cases of other bacterial and pathogen genomes
10. Explain the concept of, and be able to use, the integrated bacterial analysis pipeline for batch analysis and typing of genomic data
11. Demonstrate how to construct a phylogenetic tree based on SNPs
12. Apply the phylogenetic tool to construct phylogenetic trees and explain the relatedness of bacterial or pathogen strains
13. Describe how to create your own sequence database
14. Utilize the MyDbFinder tool to detect genetic markers of interest from whole genome sequencing

Course Introduction

Biologists have long been aware of the enormous opportunities provided by modern genetics through the discovery of genes and microRNAs that are expressed in different parts of the genome. The sequencing of bacterial genomes has become increasingly important in two respects: first, the analysis of bacterial genomes is an essential tool for characterizing bacterial strains and, second, it supports the analysis of the genomes of resistant bacteria. In this course, we will take a closer look at the whole genome sequencing techniques that are used to analyze the genomes of bacteria. We will also discuss the tools and applications of whole genome sequencing for diagnostics, sequence analysis, and mapping data, and explore the major challenges that pathologists face as they try to read the entire genome of a pathogen.

Topics covered: whole genome sequencing; whole genome sequencing and assembly; analysis of bacterial genomes; bacterial resistance.

Course Tag

Nucleotide Antimicrobial Genome Microbiology

Related Wiki Topic

Article Example
Whole genome sequencing The first bacterial and archaeal genomes, including that of H. influenzae, were sequenced by shotgun sequencing. In 1996 the first eukaryotic genome (Saccharomyces cerevisiae) was sequenced. S. cerevisiae, a model organism in biology, has a genome of only around 12 million nucleotide pairs, and was the first unicellular eukaryote to have its whole genome sequenced. The first multicellular eukaryote, and animal, to have its whole genome sequenced was the nematode worm Caenorhabditis elegans in 1998. Eukaryotic genomes are sequenced by several methods, including shotgun sequencing of short DNA fragments and sequencing of larger DNA clones from DNA libraries such as bacterial artificial chromosomes (BACs) and yeast artificial chromosomes (YACs).
Whole genome sequencing Whole genome sequencing should not be confused with DNA profiling, which only determines the likelihood that genetic material came from a particular individual or group, and does not contain additional information on genetic relationships, origin or susceptibility to specific diseases. In addition, whole genome sequencing should not be confused with methods that sequence specific subsets of the genome - such methods include whole exome sequencing (1% of the genome) or SNP genotyping (<0.1% of the genome). Almost all truly complete genomes are of microbes; the term "full genome" is thus sometimes used loosely to mean "greater than 95%". The remainder of this article focuses on nearly complete human genomes.
Bacterial genome Right now, we have genome sequences from 50 different bacterial phyla and 11 different archaeal phyla. Second-generation sequencing has yielded many draft genomes (close to 90% of bacterial genomes in GenBank are currently not complete); third-generation sequencing might eventually yield a complete genome in a few hours. The genome sequences reveal much diversity in bacteria. Analysis of over 2000 Escherichia coli genomes reveals an E. coli core genome of about 3100 gene families and a total of about 89,000 different gene families. Genome sequences show that parasitic bacteria have 500-1200 genes, free-living bacteria have 1500-7500 genes, and archaea have 1500-2700 genes.
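The distinction between a core genome and the total set of gene families in the excerpt above can be illustrated with plain set operations. This is a minimal sketch, assuming each genome has already been reduced to a set of gene-family identifiers; the function name `core_and_pan` and the toy family names are hypothetical and do not come from any particular tool.

```python
from typing import Dict, Set, Tuple


def core_and_pan(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (core, pan) gene-family sets for a collection of genomes.

    core: families present in every genome (set intersection).
    pan:  families present in at least one genome (set union).
    """
    if not genomes:
        return set(), set()
    family_sets = list(genomes.values())
    core = set.intersection(*family_sets)
    pan = set.union(*family_sets)
    return core, pan


# Toy data: hypothetical gene-family identifiers, not real annotations.
toy_genomes = {
    "strain_A": {"famA", "famB", "famC", "famD"},
    "strain_B": {"famA", "famB", "famE"},
    "strain_C": {"famA", "famB", "famC", "famF"},
}

core, pan = core_and_pan(toy_genomes)
print(sorted(core))  # families shared by all three strains
print(len(pan))      # number of distinct families across all strains
```

As more genomes are added, the intersection can only shrink or stay the same while the union can only grow or stay the same, which is consistent with a small core and a much larger total count across thousands of genomes.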
Bacterial genome Bacterial genomes are generally smaller and less variable in size between species than the genomes of animals and single-cell eukaryotes. Bacterial genomes can range in size from about 130 kbp to over 14 Mbp. Recent advances in sequencing technology led to the discovery of a high correlation between the number of genes and the genome size of bacteria.
Whole genome sequencing Whole genome sequencing has established the mutation frequency for whole human genomes. The mutation frequency in the whole genome between generations for humans (parent to child) is about 70 new mutations per generation. An even lower level of variation was found when whole genome sequencing was used to compare blood cells from a pair of 100-year-old monozygotic (identical) twins. Only 8 somatic differences were found, though somatic variation occurring in less than 20% of blood cells would be undetected.
Bacterial Phylodynamics Whether to sequence the whole genome or selected genomic regions, and which sequencing technique to use, are important experimental choices in phylodynamic analysis. Whole genome sequencing is often performed on bacterial genomes, although depending on the design of the study, many different methods can be utilized for phylodynamic analysis. Bacterial genomes are much larger and have a slower evolutionary rate than RNA viruses, limiting studies on bacterial phylodynamics. The advancement of sequencing technology has made bacterial phylodynamics possible, but proper preparation of the whole bacterial genomes is mandatory.
Whole genome sequencing Other technologies are emerging, including nanopore technology. Though nanopore sequencing technology is still being refined, its portability and potential capability of generating long reads are of relevance to whole-genome sequencing applications.
Whole genome sequencing Whole genome sequencing (also known as WGS, full genome sequencing, complete genome sequencing, or entire genome sequencing) is the process of determining the complete DNA sequence of an organism's genome at a single time. This entails sequencing all of an organism's chromosomal DNA as well as DNA contained in the mitochondria and, for plants, in the chloroplast.
Whole genome sequencing The DNA sequencing methods used in the 1970s and 1980s were manual, for example Maxam-Gilbert sequencing and Sanger sequencing. The shift to more rapid, automated sequencing methods in the 1990s finally allowed the sequencing of whole genomes.
Whole genome sequencing Sequencing of nearly an entire human genome was first accomplished in 2000 partly through the use of shotgun sequencing technology. While full genome shotgun sequencing for small (4000–7000 base pair) genomes was already in use in 1979, broader application benefited from pairwise end sequencing, known colloquially as "double-barrel shotgun sequencing". As sequencing projects began to take on longer and more complicated genomes, multiple groups began to realize that useful information could be obtained by sequencing both ends of a fragment of DNA. Although sequencing both ends of the same fragment and keeping track of the paired data was more cumbersome than sequencing a single end of two distinct fragments, the knowledge that the two sequences were oriented in opposite directions and were about the length of a fragment apart from each other was valuable in reconstructing the sequence of the original target fragment.
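The paired-end idea described above, two reads in opposite orientations separated by roughly one fragment length, can be sketched in a few lines. This is an illustrative toy under stated assumptions: the fragment is a plain string over A, C, G, T, there are no sequencing errors, and the helper names `reverse_complement` and `paired_end_reads` are hypothetical rather than taken from any sequencing toolkit.

```python
from typing import Tuple

# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def paired_end_reads(fragment: str, read_len: int) -> Tuple[str, str]:
    """Simulate one read pair from a single DNA fragment.

    Read 1 is the first `read_len` bases of the forward strand.
    Read 2 is the first `read_len` bases of the reverse strand, i.e. the
    reverse complement of the fragment's last `read_len` bases.
    """
    if read_len <= 0 or read_len > len(fragment):
        raise ValueError("read_len must be between 1 and the fragment length")
    read1 = fragment[:read_len]
    read2 = reverse_complement(fragment[-read_len:])
    return read1, read2


# Toy 20-base fragment (made up for illustration).
fragment = "ACGTTGCAAGGCTTAACCGG"
read1, read2 = paired_end_reads(fragment, read_len=5)

# Unsequenced bases between the two reads.
inner_distance = len(fragment) - 2 * 5
print(read1, read2, inner_distance)
```

Knowing that `read2` must map to the opposite strand, about one fragment length downstream of `read1`, is the constraint an assembler can use to place reads that would otherwise be ambiguous.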
Bacterial genome Obligate bacterial symbionts or pathogens have the smallest genomes and the fewest pseudogenes of the three groups. The relationship between life-styles of bacteria and genome size raises questions as to the mechanisms of bacterial genome evolution. Researchers have developed several theories to explain the patterns of genome size evolution amongst bacteria.
Whole genome sequencing Single cell genome sequencing is being tested as a method of preimplantation genetic diagnosis, wherein a cell from the embryo created by in vitro fertilization is taken and analyzed before embryo transfer into the uterus. After implantation, cell-free fetal DNA can be taken by simple venipuncture from the mother and used for whole genome sequencing of the fetus.
Bacterial genome A striking discovery by Cole et al. described massive amounts of gene decay when comparing the leprosy bacillus to ancestral bacteria. Studies have since shown that several bacteria have smaller genome sizes than their ancestors did. Over the years, researchers have proposed several theories to explain the general trend of bacterial genome decay and the relatively small size of bacterial genomes. Compelling evidence indicates that the apparent degradation of bacterial genomes is due to a deletional bias.
Whole genome sequencing Whole genome sequencing has largely been used as a research tool, but is currently being introduced to clinics. In the future of personalized medicine, whole genome sequence data will be an important tool to guide therapeutic intervention. The tool of gene sequencing at SNP level is also used to pinpoint functional variants from association studies and improve the knowledge available to researchers interested in evolutionary biology, and hence may lay the foundation for predicting disease susceptibility and drug response.
Bacterial genome One theory predicts that bacteria have smaller genomes due to a selective pressure on genome size to ensure faster replication. The theory is based upon the logical premise that smaller bacterial genomes will take less time to replicate. Subsequently, smaller genomes will be selected preferentially due to enhanced fitness. A study done by Mira et al. indicated little to no correlation between genome size and doubling time. The data indicates that selection is not a suitable explanation for the small sizes of bacterial genomes. Still, many researchers believe there is some selective pressure on bacteria to maintain small genome size.
Whole genome sequencing The first nearly complete human genomes sequenced were those of two Americans of predominantly Northwestern European ancestry in 2007 (J. Craig Venter at 7.5-fold coverage, and James Watson at 7.4-fold). This was followed in 2008 by sequencing of an anonymous Han Chinese man (at 36-fold), a Yoruban man from Nigeria (at 30-fold), and a female Caucasian leukemia patient (at 33- and 14-fold coverage for tumor and normal tissues). Steve Jobs was among the first 20 people to have their whole genome sequenced, reportedly for the cost of $100,000. , there are 69 nearly complete human genomes publicly available.
Whole genome bisulfite sequencing Whole genome bisulfite sequencing (WGBS) is a next-generation sequencing technology used to determine the DNA methylation status of single cytosines by treating the DNA with sodium bisulfite before sequencing. Sodium bisulfite is a chemical compound that converts unmethylated cytosines into uracil. Cytosines that are not converted into uracil are methylated. After sequencing, the unmethylated cytosines appear as thymines.
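The conversion logic in the excerpt above can be made concrete with a small simulation. This is a rough sketch under simplifying assumptions: a single strand, complete conversion of every unmethylated cytosine, no sequencing errors, and a read already aligned base-for-base to the reference. The function names `bisulfite_convert` and `call_methylation` are hypothetical and are not part of any WGBS software.

```python
from typing import List, Set


def bisulfite_convert(seq: str, methylated: Set[int]) -> str:
    """Simulate the read seen after bisulfite treatment and sequencing.

    Unmethylated C becomes U and is read as T; methylated C stays C.
    `methylated` holds 0-based positions of methylated cytosines.
    """
    out = []
    for i, base in enumerate(seq):
        if base == "C" and i not in methylated:
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


def call_methylation(reference: str, read: str) -> List[int]:
    """Return 0-based positions where a reference C is still read as C."""
    if len(reference) != len(read):
        raise ValueError("reference and read must have the same length")
    return [
        i
        for i, (ref_base, read_base) in enumerate(zip(reference, read))
        if ref_base == "C" and read_base == "C"
    ]


# Toy reference with cytosines at positions 1, 4 and 5 (made up).
reference = "ACGTCCGA"
read = bisulfite_convert(reference, methylated={1, 5})
print(read)
print(call_methylation(reference, read))
```

Comparing the treated read against the untreated reference is what makes the call possible: a C-to-T difference marks an unmethylated cytosine, while a retained C marks a methylated one.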
Shotgun sequencing The first genome sequenced by shotgun sequencing was that of cauliflower mosaic virus, published in 1981. However, whole genome shotgun sequencing for small (4000- to 7000-base-pair) genomes had been suggested already in 1979.
Whole genome sequencing In research, whole-genome sequencing can be used in a Genome-Wide Association Study (GWAS) - a project aiming to determine the genetic variant or variants associated with a disease or some other phenotype.
Cancer genome sequencing Cancer genome sequencing is the whole genome sequencing of a single, homogeneous or heterogeneous group of cancer cells. It is a biochemical laboratory method for the characterization and identification of the DNA or RNA sequences of cancer cell(s).