Serverless Data Analysis with Google BigQuery and Cloud Dataflow en Español

Start Date: 02/16/2020

Course Type: Common Course

About Course

This one-week, on-demand accelerated course builds on Google Cloud Platform Big Data and Machine Learning Fundamentals. Through a combination of instructor-led presentations, demonstrations, and hands-on labs, participants learn to carry out no-ops data pipeline processing, analysis, and storage.

Prerequisites:
• Google Cloud Platform Big Data and Machine Learning Fundamentals
• Experience using a SQL-like query language to analyze data
• Knowledge of Python or Java

Google Account notes:
• Google services are not available in China.

Course Syllabus

Module 2: Autoscaling data processing pipelines with Dataflow

Article Example
BigQuery BigQuery provides external access to the Dremel technology, a scalable, interactive "ad hoc" query system for analysis of read-only nested data. To use data in BigQuery, it must first be uploaded to Google Storage and then imported using the BigQuery HTTP API. BigQuery requires all requests to be authenticated, supporting a number of Google-proprietary mechanisms as well as OAuth.
Serverless computing Google has released an alpha version of its serverless platform, which is called Google Cloud Functions, and supports Node.js.
BigQuery BigQuery is a RESTful web service that enables interactive analysis of massively large datasets working in conjunction with Google Storage. It is an "Infrastructure as a Service" (IaaS) that may be used complementarily with MapReduce.
Apache Hadoop Google also offers connectors for using other Google Cloud Platform products with Hadoop, such as a Google Cloud Storage connector for using Google Cloud Storage and a Google BigQuery connector for using Google BigQuery.
Information capital Google: Google is working on the development of BigQuery, described as the first cloud-based big data processing platform.
Global Database of Events, Language, and Tone The dataset is also available on Google Cloud Platform and can be accessed using Google BigQuery.
BigQuery After a limited testing period in 2010, BigQuery became generally available in November 2011 at the Google Atmosphere conference.
Google Cloud Dataproc Google Cloud Dataproc (Cloud Dataproc) is a cloud-based managed Spark and Hadoop service offered on Google Cloud Platform. Cloud Dataproc utilizes many Google Cloud Platform technologies such as Google Compute Engine and Google Cloud Storage to offer fully managed clusters running popular data processing frameworks such as Apache Hadoop and Apache Spark.
Google Cloud Platform Google Cloud Platform is part of a suite of enterprise services from Google Cloud and provides a set of modular cloud-based services with a host of development tools, such as hosting and computing, cloud storage, data storage, translation APIs, and prediction APIs.
Google Cloud Dataproc Cloud Dataproc includes many open source packages used for data processing, including items from the Spark and Hadoop ecosystem, and open source tools to connect these frameworks with other Google Cloud Platform products.
Google Cloud Print Google Cloud Print integrates with the mobile versions of Gmail and Google Docs, allowing users to print from their mobile devices. Google Cloud Print is listed as a printer option in the Print Preview page of Google's Web browser, Google Chrome, in Chrome 16 and higher. "Legacy", also called "classic", printers (those without cloud printing capabilities) are supported through a "Cloud Print Connector" integrated with Google Chrome versions 9 and higher.
Google Cloud Messaging Google Cloud Messaging functions using server APIs and SDKs, both maintained by Google. GCM can send push notifications, deep-linking commands, and application data. Larger messages can be sent with up to 4 KB of payload data.
Talend In January 2016, Talend joined Cloudera, data Artisans, Google, Cask and PayPal to propose Google’s Cloud Dataflow to the Apache Software Foundation.
Data analysis Data integration is a precursor to data analysis, and data analysis is closely linked to data visualization and data dissemination. The term "data analysis" is sometimes used as a synonym for data modeling.
Google Cloud Datastore Google Cloud Datastore (Cloud Datastore) is a highly scalable, fully managed NoSQL database service offered by Google on the Google Cloud Platform. Cloud Datastore is built upon Google's Bigtable and Megastore technology.
Dataflow programming In computer programming, dataflow programming is a programming paradigm that models a program as a directed graph of the data flowing between operations, thus implementing dataflow principles and architecture. Dataflow programming languages share some features of functional languages, and were generally developed in order to bring some functional concepts to a language more suitable for numeric processing. Some authors use the term Datastream instead of Dataflow to avoid confusion with Dataflow Computing or Dataflow architecture, based on an indeterministic machine paradigm. Dataflow programming was pioneered by Jack Dennis and his graduate students at MIT in the 1960s.
Global Database of Events, Language, and Tone In May 2014, the Google Cloud Platform blog announced that the entire GDELT dataset would be available as a public dataset in Google BigQuery.
Google Cloud Dataproc Cloud Dataproc is a Platform as a Service (PaaS) product designed to combine the Spark and Hadoop frameworks with many common cloud computing patterns. Cloud Dataproc separates compute and storage, which is a relatively common design for many cloud Hadoop offerings. Cloud Dataproc uses Google Compute Engine virtual machines for compute and Google Cloud Storage for file storage. Cloud Dataproc has a set of control and integration mechanisms that coordinate the lifecycle and management of clusters. Cloud Dataproc is integrated with the YARN application manager to make managing and using clusters easier.
Data analysis Analysis of data, also known as data analytics, is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision-making. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, in different business, science, and social science domains.
Dataflow programming Dataflow programs are represented in different ways. A traditional program is usually represented as a series of text instructions, which is reasonable for describing a serial system which pipes data between small, single-purpose tools that receive, process, and return. Dataflow programs start with an input, perhaps the command line parameters, and illustrate how that data is used and modified. The flow of data is explicit, often visually illustrated as a line or pipe.
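
The dataflow programming excerpts above describe a program as a directed graph in which data flows explicitly between operations. The following is a minimal sketch of that idea in plain Python, not an example of the Cloud Dataflow service or its SDK. The function name run_dataflow and the node names (text, words, count, upper) are hypothetical and chosen only for illustration; the sketch assumes Python 3.9 or later for the standard-library graphlib module.

```python
from graphlib import TopologicalSorter


def run_dataflow(nodes, edges, inputs):
    """Evaluate a small dataflow graph (illustrative helper, not a real API).

    nodes:  maps an operation name to the function that computes it
    edges:  maps an operation name to the names it reads from, in argument order
    inputs: maps source names to their initial values
    """
    values = dict(inputs)
    # Order the graph so every operation runs after the operations it reads from.
    graph = {name: edges.get(name, []) for name in nodes}
    for name in TopologicalSorter(graph).static_order():
        if name in values:
            continue  # source values are supplied, not computed
        args = [values[upstream] for upstream in edges.get(name, [])]
        values[name] = nodes[name](*args)
    return values


# Hypothetical graph: text -> words -> (count, upper)
nodes = {
    "words": lambda text: text.split(),
    "count": lambda words: len(words),
    "upper": lambda words: [w.upper() for w in words],
}
edges = {
    "words": ["text"],
    "count": ["words"],
    "upper": ["words"],
}

result = run_dataflow(nodes, edges, {"text": "data flows along edges"})
print(result["count"])
print(result["upper"])
```

Because each operation declares only what it reads, the execution order is derived from the graph rather than from the order in which the operations were written, which is the property the excerpts call an explicit flow of data.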