Start Date: 09/15/2019
Course Type: Common Course
Course Link: https://www.coursera.org/learn/building-deep-learning-models-with-tensorflow

Article | Example |
---|---|
TensorFlow | TensorFlow is Google Brain's second-generation machine learning system, released as open source software on November 9, 2015. While the reference implementation runs on single devices, TensorFlow can run on multiple CPUs and GPUs (with optional CUDA extensions for general-purpose computing on graphics processing units). TensorFlow is available on 64-bit Linux, macOS, and mobile computing platforms including Android and iOS. |
Deep learning | Compound hierarchical-deep models compose deep networks with non-parametric Bayesian models. Features can be learned using deep architectures such as DBNs, DBMs, deep autoencoders, convolutional variants, ssRBMs, deep coding networks, DBNs with sparse feature learning, recursive neural networks, conditional DBNs, and denoising autoencoders. This provides a better representation, allowing faster learning and more accurate classification with high-dimensional data. However, these architectures are poor at learning novel classes from few examples, because all network units are involved in representing the input (a "distributed representation") and must be adjusted together (many degrees of freedom). Limiting the degrees of freedom reduces the number of parameters to learn, which makes it easier to learn new classes from few examples. Hierarchical Bayesian (HB) models allow learning from few examples, for example in computer vision, statistics, and cognitive science. |
Deep learning | A deep Q-network (DQN) is a type of deep learning model developed at Google DeepMind that combines a deep convolutional neural network with Q-learning, a form of reinforcement learning. Unlike earlier reinforcement learning agents, DQNs can learn directly from high-dimensional sensory inputs. Preliminary results were presented in 2014, with a paper published in February 2015 in Nature. The application discussed in this paper is limited to Atari 2600 gaming, although it has implications for other applications. However, well before this work, a number of reinforcement learning models had already applied deep learning approaches. |
Deep learning | Deep learning (also known as deep structured learning, hierarchical learning or deep machine learning) is a class |
TensorFlow | TensorFlow is an open source software library for machine learning across a range of tasks. Google developed it to meet its needs for systems capable of building and training neural networks to detect and decipher patterns and correlations, analogous to the learning and reasoning that humans use. It is currently used for both research and production in Google products, often replacing its closed-source predecessor, DistBelief. TensorFlow was originally developed by the Google Brain team for internal Google use before being released under the Apache 2.0 open source license on November 9, 2015. |
Deep learning | These definitions have in common (1) multiple layers of nonlinear processing units and (2) the supervised or unsupervised learning of feature representations in each layer, with the layers forming a hierarchy from low-level to high-level features. The composition of a layer of nonlinear processing units used in a deep learning algorithm depends on the problem to be solved. Layers that have been used in deep learning include hidden layers of an artificial neural network and sets of complicated propositional formulas. They may also include latent variables organized layer-wise in deep generative models such as the nodes in Deep Belief Networks and Deep Boltzmann Machines. |
Keras | Keras is the second-fastest-growing deep learning framework after Google's TensorFlow, and the third largest after TensorFlow and Caffe. |
Deep learning | The real impact of deep learning in industry apparently began in the early 2000s, when CNNs already processed an estimated 10% to 20% of all the checks written in the US, according to Yann LeCun. Industrial applications of deep learning to large-scale speech recognition started around 2010. In late 2009, Li Deng invited Geoffrey Hinton to work with him and colleagues at Microsoft Research in Redmond, Washington to apply deep learning to speech recognition. They co-organized the 2009 NIPS Workshop on Deep Learning for Speech Recognition. The workshop was motivated by the limitations of deep generative models of speech, and the possibility that the big-compute, big-data era warranted a serious try of deep neural nets (DNN). It was believed that pre-training DNNs using generative models of deep belief nets (DBN) would overcome the main difficulties of neural nets encountered in the 1990s. However, early in this research at Microsoft, it was discovered that DNNs trained without pre-training, but with large amounts of training data and especially with correspondingly large, context-dependent output layers, produced error rates dramatically lower than the then-state-of-the-art GMM-HMM systems and also lower than more advanced generative model-based speech recognition systems. This finding was verified by several other major speech recognition research groups. Further, the nature of recognition errors produced by the two types of systems was found to be characteristically different, |
TensorFlow | In May 2016, Google announced its tensor processing unit (TPU), a custom ASIC built specifically for machine learning and tailored for TensorFlow. The TPU is a programmable AI accelerator designed to provide high throughput of low-precision arithmetic (e.g., 8-bit), and oriented toward using or running models rather than training them. Google announced that it had been running TPUs inside its data centers for more than a year, and had found them to deliver an order of magnitude better-optimized performance per watt for machine learning. |
Deep learning | Deep learning exploits this idea of hierarchical explanatory factors where higher level, more abstract concepts are learned from the lower level ones. These architectures are often constructed with a greedy layer-by-layer method. Deep learning helps to disentangle these abstractions and pick out which features are useful for learning. |
Deep learning | A main criticism of deep learning concerns the lack of theory surrounding many of the methods. Learning in the most common deep architectures is implemented using gradient descent; while gradient descent has been understood for a while now, the theory surrounding other algorithms, such as contrastive divergence, is less clear (e.g., does it converge? If so, how fast? What is it approximating?). Deep learning methods are often viewed as a black box, with most confirmations done empirically rather than theoretically. |
Deep learning | According to a historical survey, the first functional deep learning networks with many layers were published by Alexey Grigorevich Ivakhnenko and V. G. Lapa in 1965. The learning algorithm was called the Group Method of Data Handling, or GMDH. GMDH features fully automatic structural and parametric optimization of models. The activation functions of the network nodes are Kolmogorov-Gabor polynomials that permit additions and multiplications. |
TensorFlow | TensorFlow computations are expressed as stateful dataflow graphs. The name TensorFlow derives from the operations which such neural networks perform on multidimensional data arrays. These multidimensional arrays are referred to as "tensors". In June 2016, Google's Jeff Dean stated that 1,500 repositories on GitHub mentioned TensorFlow, of which only 5 were from Google. |
Deep learning | Computational deep learning is closely related to a class of theories of brain development (specifically, neocortical development) proposed by cognitive neuroscientists in the early 1990s. An approachable summary of this work is Elman et al.'s 1996 book "Rethinking Innateness" (see also: Shrager and Johnson; Quartz and Sejnowski). As these developmental theories were also instantiated in computational models, they are technical predecessors of purely computationally motivated deep learning models. These developmental models share the interesting property that various proposed learning dynamics in the brain (e.g., a wave of nerve growth factor) conspire to support the self-organization of just the sort of inter-related neural networks used in the later, purely computational deep learning models. Such computational neural networks seem analogous to a view of the brain's neocortex as a hierarchy of filters in which each layer captures some of the information in the operating environment, and then passes the remainder, as well as a modified base signal, to other layers further up the hierarchy. This process yields a self-organizing stack of transducers, well-tuned to their operating environment. As described in The New York Times in 1995: "...the infant's brain seems to organize itself under the influence of waves of so-called trophic-factors ... different regions of the brain become connected sequentially, with one layer of tissue maturing before another and so on until the whole brain is mature." |
Deep learning | Recommendation systems have used deep learning to extract meaningful deep features for a latent factor model for content-based music recommendation. Recently, a more general approach for learning user preferences from multiple domains using multiview deep learning has been introduced. The model uses a hybrid collaborative and content-based approach and enhances recommendations in multiple tasks. |
Deep learning | If there is a lot of learnable predictability in the incoming data sequence, then the highest-level RNN can use supervised learning to easily classify even deep sequences with very long time intervals between important events. In 1993, such a system already solved a "Very Deep Learning" task that requires more than 1000 subsequent layers in an RNN unfolded in time. |
Deep learning | The first general, working learning algorithm for supervised deep feedforward multilayer perceptrons was published by Ivakhnenko and Lapa in 1965. A 1971 paper described a deep network with 8 layers trained by the Group Method of Data Handling algorithm, which is still popular in the current millennium. These ideas were implemented in a computer identification system, "Alpha", which demonstrated the learning process. Other working deep learning architectures, specifically those built from artificial neural networks (ANN), date back to the Neocognitron introduced by Kunihiko Fukushima in 1980. The ANNs themselves date back even further. The challenge was how to train networks with multiple layers. |
Deep learning | Deep learning is part of a broader family of machine learning methods based on learning representations of data. An observation (e.g., an image) can be represented in many ways such as a vector of intensity values per pixel, or in a more abstract way as a set of edges, regions of particular shape, etc. Some representations are better than others at simplifying the learning task (e.g., face recognition or facial expression recognition). One of the promises of deep learning is replacing handcrafted features with efficient algorithms for unsupervised or semi-supervised feature learning and hierarchical feature extraction. |
TensorFlow | TensorFlow provides a Python API, as well as less thoroughly documented C++, Java, and Go APIs. |
Deep learning | The initial success of deep learning in speech recognition, however, was based on small-scale TIMIT tasks. The results shown in the table below are for automatic speech recognition on the popular TIMIT data set. This is a common data set used for initial evaluations of deep learning architectures. The entire set contains 630 speakers from eight major dialects of American English, where each speaker reads 10 sentences. Its small size allows many configurations to be tried effectively. More importantly, the TIMIT task concerns phone-sequence recognition, which, unlike word-sequence recognition, allows very weak "language models" and thus the weaknesses in acoustic modeling aspects of speech recognition can be more easily analyzed. Such analysis on TIMIT by Li Deng and collaborators around 2009-2010, contrasting the GMM (and other generative models of speech) vs. DNN models, stimulated early industrial investment in deep learning for speech recognition from small to large scales, eventually leading to pervasive and dominant use in that industry. That analysis was done with comparable performance (less than 1.5% in error rate) between discriminative DNNs and generative models. The error rates listed below, including these early results and measured as percent phone error rates (PER), have been summarized over a time span of the past 20 years: |
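
The last excerpt reports results as percent phone error rates (PER). As a rough illustration of how such a figure is typically computed, the sketch below uses the standard definition: the Levenshtein edit distance (substitutions, deletions, and insertions) between a reference phone sequence and a recognized one, divided by the reference length. The function names `edit_distance` and `phone_error_rate` and the sample phone labels are hypothetical and chosen for illustration only; they are not taken from TIMIT or from any published result.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences of phone labels."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning ref[:i] into an empty hypothesis costs i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: a reference phone is missing
                curr[j - 1] + 1,     # insertion: an extra phone was recognized
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def phone_error_rate(ref, hyp):
    """Percent phone error rate: 100 * edit distance / reference length."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


# Illustrative (made-up) phone sequences: one substitution and one deletion.
reference = ["sil", "k", "ae", "t", "sil"]
hypothesis = ["sil", "k", "ah", "t"]
print(phone_error_rate(reference, hypothesis))  # 2 errors over 5 phones
```

Because insertions count as errors, a PER computed this way can exceed 100% when the recognizer outputs many spurious phones.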