Pattern Discovery in Data Mining

Start Date: 03/24/2019

Course Type: Common Course


About Course

Learn the general concepts of data mining along with basic methodologies and applications. Then dive into one subfield of data mining: pattern discovery. Learn in-depth concepts, methods, and applications of pattern discovery in data mining. We will also introduce methods for data-driven phrase mining and some interesting applications of pattern discovery. This course gives you the opportunity to practice scalable pattern discovery methods on massive transactional data, discuss pattern evaluation measures, and study methods for mining diverse kinds of patterns, sequential patterns, and sub-graph patterns.

Course Syllabus

Module 1 consists of two lessons. Lesson 1 covers the general concepts of pattern discovery. This includes the basic concepts of frequent patterns, closed patterns, max-patterns, and association rules. Lesson 2 covers three major approaches for mining frequent patterns. We will learn the downward closure (or Apriori) property of frequent patterns and three major categories of methods for mining frequent patterns: the Apriori algorithm, the method that explores vertical data format, and the pattern-growth approach. We will also discuss how to directly mine the set of closed patterns.
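
As an illustration of the downward closure property and the Apriori approach described above, here is a minimal Python sketch. It is not taken from the course materials: the function names `apriori` and `closed_itemsets`, the absolute `min_support` count, and the toy transactions are all assumptions made for this example. Note that `closed_itemsets` simply filters the frequent itemsets afterwards, which is a simplification and not the direct closed-pattern mining that the lesson covers.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    min_support is an absolute count (an assumption for this sketch).
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step (downward closure): every (k-1)-subset of a
            # frequent k-itemset must itself be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count the surviving candidates against the data.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


def closed_itemsets(frequent):
    """Keep frequent itemsets that have no proper superset with equal support."""
    return {s: c for s, c in frequent.items()
            if not any(s < t and c == ct for t, ct in frequent.items())}


# Hypothetical toy data for illustration only.
transactions = [{"a", "b"}, {"a", "b"}, {"a", "b", "c"}, {"c"}]
frequent = apriori(transactions, min_support=2)
closed = closed_itemsets(frequent)
```

On this toy data, `{a}` and `{b}` are frequent but not closed, because `{a, b}` occurs in exactly the same three transactions.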



Course Tag

Streams, Sequential Pattern Mining, Data Mining Algorithms, Data Mining

Related Wiki Topic

Article Example
K-optimal pattern discovery K-optimal pattern discovery is a data mining technique that provides an alternative to the frequent pattern discovery approach that underlies most association rule learning techniques.
Domain driven data mining Data-driven pattern mining and knowledge discovery in databases face the challenge that the discovered outputs are often not actionable. In the era of big data, it is critical to effectively discover actionable insights from complex data and environments. A significant paradigm shift is the evolution from data-driven pattern mining to domain-driven actionable knowledge discovery. Domain driven data mining aims to enable the discovery and delivery of actionable knowledge and actionable insights.
Examples of data mining In the context of pattern mining as a tool to identify terrorist activity, the National Research Council provides the following definition: "Pattern-based data mining looks for patterns (including anomalous data patterns) that might be associated with terrorist activity — these patterns might be regarded as small signals in a large ocean of noise." Pattern mining includes new areas such as Music Information Retrieval (MIR), where patterns seen in both the temporal and non-temporal domains are imported to classical knowledge discovery search methods.
Sequential pattern mining Sequential pattern mining is a topic of data mining concerned with finding statistically relevant patterns between data examples where the values are delivered in a sequence. It is usually presumed that the values are discrete, and thus time series mining is closely related, but usually considered a different activity. Sequential pattern mining is a special case of structured data mining.
Examples of data mining In the context of combating terrorism, two particularly plausible methods of data mining are "pattern mining" and "subject-based data mining".
Data mining In the academic community, the major forums for research started in 1995, when the First International Conference on Data Mining and Knowledge Discovery (KDD-95) was held in Montreal under AAAI sponsorship. It was co-chaired by Usama Fayyad and Ramasamy Uthurusamy. A year later, in 1996, Usama Fayyad launched the Kluwer journal Data Mining and Knowledge Discovery as its founding Editor-in-Chief. Later he started the SIGKDD newsletter SIGKDD Explorations. The KDD International conference became the primary highest-quality conference in data mining, with an acceptance rate of research paper submissions below 18%. The journal Data Mining and Knowledge Discovery is the primary research journal of the field.
Data mining In the 1960s, statisticians used terms like "Data Fishing" or "Data Dredging" to refer to what they considered the bad practice of analyzing data without an a priori hypothesis. The term "Data Mining" appeared around 1990 in the database community. For a short time in the 1980s, the phrase "database mining"™ was used, but because it was trademarked by HNC, a San Diego-based company, to pitch their Database Mining Workstation, researchers turned to "data mining" instead. Other terms used include Data Archaeology, Information Harvesting, Information Discovery, Knowledge Extraction, etc. Gregory Piatetsky-Shapiro coined the term "Knowledge Discovery in Databases" for the first workshop on the same topic (KDD-1989), and this term became more popular in the AI and machine learning community. However, the term data mining became more popular in the business and press communities. Currently, Data Mining and Knowledge Discovery are used interchangeably.
Data stream mining Data stream mining can be considered a subfield of data mining, machine learning, and knowledge discovery.
Data discovery Data discovery is a user-driven process of searching for patterns or specific items in a data set. Data discovery applications use visual tools such as geographical maps, pivot-tables, and heat-maps to make the process of finding patterns or specific items rapid and intuitive. Data discovery may leverage statistical and data mining techniques to accomplish these goals.
Group method of data handling GMDH is used in such fields as data mining, knowledge discovery, prediction, complex systems modeling, optimization and pattern recognition.
Wrapper (data mining) unsupervised pattern mining. Automated extraction is possible because most Web data objects follow fixed
Data mining Data mining is the computational process of discovering patterns in large data sets involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. It is an interdisciplinary subfield of computer science. The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use. Aside from the raw analysis step, it involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating. Data mining is the analysis step of the "knowledge discovery in databases" process, or KDD.
Data mining The actual data mining task is the automatic or semi-automatic analysis of large quantities of data to extract previously unknown, interesting patterns such as groups of data records (cluster analysis), unusual records (anomaly detection), and dependencies (association rule mining, sequential pattern mining). This usually involves using database techniques such as spatial indices. These patterns can then be seen as a kind of summary of the input data, and may be used in further analysis or, for example, in machine learning and predictive analytics. For example, the data mining step might identify multiple groups in the data, which can then be used to obtain more accurate prediction results by a decision support system. Neither the data collection, data preparation, nor result interpretation and reporting is part of the data mining step, but do belong to the overall KDD process as additional steps.
Data Mining and Knowledge Discovery Data Mining and Knowledge Discovery is a triannual peer-reviewed scientific journal focusing on data mining. It is published by Springer Science+Business Media. The editor-in-chief is Geoffrey I. Webb. It was started in 1996 and launched in 1997 by Kluwer Academic Publishers (later becoming Springer), with Usama Fayyad as founding Editor-in-Chief. The first editorial provides a summary of why it was started.
K-optimal pattern discovery Frequent pattern discovery techniques find all patterns for which there are sufficiently frequent examples in the sample data. In contrast, k-optimal pattern discovery techniques find the "k" patterns that optimize a user-specified measure of interest. The parameter "k" is also specified by the user.
K-optimal pattern discovery In contrast to k-optimal rule discovery and frequent pattern mining techniques, subgroup discovery focuses on mining interesting patterns with respect to a specified target property of interest. This includes, for example, binary, nominal, or numeric attributes, but also more complex target concepts such as correlations between several variables. Background knowledge like constraints and ontological relations can often be successfully applied for focusing and improving the discovery results.
Examples of data mining There are several critical research challenges in geographic knowledge discovery and data mining. Miller and Han offer the following list of emerging research topics in the field:
Structure mining Structure mining or structured data mining is the process of finding and extracting useful information from semi-structured data sets. Graph mining, sequential pattern mining and molecule mining are special cases of structured data mining.
Educational data mining Of the general categories of methods mentioned, prediction, clustering and relationship mining are considered universal methods across all types of data mining; however, Discovery with Models and Distillation of Data for Human Judgment are considered more prominent approaches within educational data mining.
Domain driven data mining Domain driven data mining is a data mining methodology for discovering actionable knowledge and delivering actionable insights from complex data and behaviors in a complex environment. It studies the corresponding foundations, frameworks, algorithms, models, architectures, and evaluation systems for actionable knowledge discovery.
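
The K-optimal pattern discovery entries above contrast threshold-based frequent pattern discovery with returning the "k" patterns that optimize a user-specified measure. The following Python sketch illustrates that idea under several assumptions: patterns are restricted to item pairs, the measure is leverage (chosen here only as an example), candidates are enumerated by brute force, and the names `support`, `leverage`, and `k_optimal_pairs` are hypothetical. Practical k-optimal techniques search the pattern space far more efficiently than this.

```python
import heapq
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def leverage(pair, transactions):
    """Example measure of interest: supp(A, B) - supp(A) * supp(B)."""
    a, b = (frozenset([x]) for x in pair)
    return (support(frozenset(pair), transactions)
            - support(a, transactions) * support(b, transactions))


def k_optimal_pairs(transactions, k, measure):
    """Return the k item pairs with the highest value of measure.

    No minimum-support threshold is applied; the user supplies k and
    the measure instead.
    """
    items = sorted(set().union(*transactions))
    pairs = [frozenset(p) for p in combinations(items, 2)]
    return heapq.nlargest(k, pairs, key=lambda p: measure(p, transactions))


# Hypothetical toy data for illustration only.
transactions = [frozenset(t) for t in
                [{"a", "b"}, {"a", "b"}, {"c"}, {"c", "d"}]]
top2 = k_optimal_pairs(transactions, k=2, measure=leverage)
```

Here the pair `{c, d}` is returned even though it appears in only one transaction, which a frequency threshold of two or more would have discarded.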