Building Resilient Streaming Analytics Systems on GCP

Start Date: 02/23/2020

Course Type: Common Course

Course Link: https://www.coursera.org/learn/streaming-analytics-systems-gcp

About Course

Note: This is a new course with updated content from the previous version of this Specialization. Processing streaming data is becoming increasingly popular because streaming enables businesses to get real-time metrics on business operations. This course covers how to build streaming data pipelines on Google Cloud Platform. It introduces Cloud Pub/Sub for handling incoming streaming data. The course also covers how to apply aggregations and transformations to streaming data using Cloud Dataflow, and how to store processed records in BigQuery or Cloud Bigtable for analysis. Learners get hands-on experience building streaming data pipeline components on Google Cloud Platform using Qwiklabs.
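
To make the idea of aggregating streaming data concrete, here is a minimal sketch of fixed-window aggregation written in plain Python. It is not Cloud Dataflow or Apache Beam code; the event layout (timestamp, key, value), the 60-second window size, and the function names are all assumptions chosen for illustration. In a pipeline like the one the course describes, the events would arrive through Cloud Pub/Sub and the per-window totals would be written to BigQuery or Cloud Bigtable.

```python
from collections import defaultdict

# Assumed window size in seconds; the course description does not specify one.
WINDOW_SECONDS = 60


def window_start(timestamp, size=WINDOW_SECONDS):
    """Return the start time of the fixed window that contains timestamp."""
    return timestamp - (timestamp % size)


def aggregate_by_window(events, size=WINDOW_SECONDS):
    """Sum event values per (window start, key).

    events is an iterable of (timestamp_seconds, key, value) tuples,
    a hypothetical layout standing in for decoded streaming messages.
    """
    totals = defaultdict(int)
    for timestamp, key, value in events:
        totals[(window_start(timestamp, size), key)] += value
    return dict(totals)


# Hypothetical sample events: (event time in seconds, sensor id, reading).
events = [
    (5, "sensor-a", 2),
    (42, "sensor-a", 3),
    (61, "sensor-a", 4),
    (75, "sensor-b", 1),
]

result = aggregate_by_window(events)
for (start, key), total in sorted(result.items()):
    print(f"window starting at {start}s, {key}: total={total}")
```

The sketch groups by event time rather than arrival time, so an event that arrives out of order still lands in the window its timestamp belongs to. A managed streaming service additionally has to decide when a window is complete, which this example does not attempt.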

Course Syllabus

Cloud Dataflow Streaming Features
Summary

Course Introduction

Building Resilient Streaming Analytics Systems on GCP

GCP is a trademark or registered trademark of Google Inc.

This course builds on the knowledge and skills that you’ve acquired in previous courses in this specialization. It shows you how to use the GCP cloud computing platform to analyze large streams of data. It covers GCP features, standards, and how to use subscription models to get the most out of your data plan. It covers installation steps for streaming analytics using the Hadoop and Spark frameworks. It also covers setting up nNLP systems in nNLP settings, to reduce latency in your data network.

By the end of this course you should be able to:
- Use a simple command-line interface to deploy nNLP systems to a Hadoop cluster
- Use Hadoop and Spark cores to analyze large streams of data
- Set up nNLP systems in your nNLP cluster
- Analyze large streams of data with Spark

Note: This is an advanced course, and therefore, we assume basic computer science, mathematics, and statistics will be high on your degree requirements.

You will need:
- A computer with a strong Intel Core i5 or equivalent processor and 8 GB RAM. For benchmarking, we use the Hadoop framework.
- At least 1 GB of disk space free. More RAM will speed up your computer.
- A web browser with a stable connection. We’ll use a fast internet connection for benchmark

Related Wiki Topic

Article Example
Comparison of streaming media systems This is a comparison of streaming media systems. A more complete list of streaming media systems is also available.
Resilient control systems Such performance characteristics exist with both time and data integrity. Time, both in terms of delay of mission and communications latency, and data, in terms of corruption or modification, are normalizing factors. In general, the idea is to base the metric on “what is expected” and not necessarily the actual initiator to the degradation. Considering time as a metrics basis, resilient and un-resilient systems can be observed in Fig. 2.
List of streaming media systems This is a list of streaming media systems with articles. A more detailed comparison of streaming media systems is also available.
Software analytics Software Analytics focuses on the trinity of software systems, software users, and the software development process.
Software analytics Software Analytics refers to analytics specific to software systems and related software development processes. It aims at describing, predicting, and improving development, maintenance, and management of complex software systems. Methods and techniques of software analytics typically rely on gathering, analyzing, and visualizing information found in the manifold data sources in the scope of software systems and their software development processes; software analytics "turns it into actionable insight to inform better decisions related to software".
Analytics Web analytics allows marketers to collect session-level information about interactions on a website using an operation called sessionization. Google Analytics is an example of a popular free analytics tool that marketers use for this purpose. Those interactions provide web analytics information systems with the information necessary to track the referrer, search keywords, identify IP address, and track activities of the visitor. With this information, a marketer can improve marketing campaigns, website creative content, and information architecture.
Resilient control systems In our modern society, computerized or digital control systems have been used to reliably automate many of the industrial operations that we take for granted, from the power plant to the automobiles we drive. However, the complexity of these systems and how the designers integrate them, the roles and responsibilities of the humans that interact with the systems, and the cyber security of these highly networked systems have led to a new paradigm in research philosophy for next generation control systems. Resilient Control Systems consider all of these elements and those disciplines that contribute to a more effective design, such as cognitive psychology, computer science, and control engineering, to develop interdisciplinary solutions. These solutions consider such things as how to tailor the control system operating displays to best enable the user to make an accurate and reproducible response, how to design in cyber security protections such that the system defends itself from attack by changing its behaviors, and how to better integrate widely distributed computer control systems to prevent cascading failures that result in disruptions to critical industrial operations. In the context of cyber-physical systems, resilient control systems are an aspect that focuses on the unique interdependencies of a control system, as compared to information technology computer systems and networks, due to its importance in operating our critical industrial operations.
GCP Applied Technologies GCP Applied Technologies was established as a subsidiary of W.R. Grace & Co. in Columbia, Maryland in 2015. Its parent company spun off GCP Applied Technologies on January 28, 2016.
Google Analytics On September 29, 2011, Google Analytics launched Real Time analytics.
Adaptive bitrate streaming MPEG-DASH is a technology related to Adobe Systems HTTP Dynamic Streaming, Apple Inc. HTTP Live Streaming (HLS) and Microsoft Smooth Streaming. DASH is based on Adaptive HTTP streaming (AHS) in 3GPP Release 9 and on HTTP Adaptive Streaming (HAS) in Open IPTV Forum Release 2.
Comparison of streaming media systems The following tables compare general and technical information for a number of streaming media systems both audio and video. Please see the individual systems' linked articles for further information.
Continuous analytics Continuous analytics then is the extension of the continuous delivery software development model to the big data analytics development team. The goal of the continuous analytics practitioner then is to find ways to incorporate writing analytics code and installing big data software into the agile development model of automatically running unit and functional tests and building the environment system with automated tools.
Analytics Organizations may apply analytics to business data to describe, predict, and improve business performance. Specifically, areas within analytics include predictive analytics, prescriptive analytics, enterprise decision management, retail analytics, store assortment and stock-keeping unit optimization, marketing optimization and marketing mix modeling, web analytics, sales force sizing and optimization, price and promotion modeling, predictive science, credit risk analysis, and fraud analytics. Since analytics can require extensive computation (see big data), the algorithms and software used for analytics harness the most current methods in computer science, statistics, and mathematics.
Analytics Analytics is the discovery, interpretation, and communication of meaningful patterns in data. Especially valuable in areas rich with recorded information, analytics relies on the simultaneous application of statistics, computer programming and operations research to quantify performance.
Resilient control systems While much of the current critical infrastructure is controlled by a web of interconnected control systems, with architectures termed either distributed control systems (DCS) or supervisory control and data acquisition (SCADA), the application of control is moving toward a more decentralized state. In moving to a smart grid, the complex interconnected nature of individual homes, commercial facilities and diverse power generation and storage creates both an opportunity and a challenge in ensuring that the resulting system is more resilient to threats. The ability to operate these systems to achieve a global optimum for multiple considerations, such as overall efficiency, stability and security, will require mechanisms to holistically design complex networked control systems. Multi-agent methods suggest a mechanism to tie a global objective to distributed assets, allowing for management and coordination of assets for optimal benefit and semi-autonomous, but constrained controllers that can react rapidly to maintain resilience for rapidly changing conditions.
Learning analytics In some prominent cases like the inBloom disaster, even fully functional systems have been shut down due to lack of trust in the data collection by governments, stakeholders and civil rights groups. Since then, the Learning Analytics community has extensively studied legal conditions in a series of expert workshops on 'Ethics & Privacy 4 Learning Analytics' that constitute the use of trusted Learning Analytics. Drachsler & Greller released an 8-point checklist named DELICATE that is based on the intensive studies in this area to demystify the ethics and privacy discussions around Learning Analytics.
Web analytics There are at least two categories of web analytics: "off-site" and "on-site" web analytics.
Software analytics Software analytics represents a base component of software diagnosis that generally aims at generating findings, conclusions, and evaluations about software systems and their implementation, composition, behavior, and evolution. Software analytics frequently uses and combines approaches and techniques from statistics, prediction analysis, data mining, and scientific visualization. For example, software analytics can map data by means of software maps that allow for interactive exploration.
Resilient (album) Resilient is the fifteenth studio album by Running Wild, released on 2 October 2013 via Steamhammer Records.
Dynamic Adaptive Streaming over HTTP DASH is a technology related to Adobe Systems HTTP Dynamic Streaming, Apple Inc. HTTP Live Streaming (HLS) and Microsoft Smooth Streaming. DASH is based on Adaptive HTTP streaming (AHS) in 3GPP Release 9 and on HTTP Adaptive Streaming (HAS) in Open IPTV Forum Release 2.