-
Notifications
You must be signed in to change notification settings - Fork 0
/
cep.tex
88 lines (82 loc) · 4.2 KB
/
cep.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
\subsection{Complex Event Processing}\label{sec:cep} % Angela
\begin{figure}[!h]
\begin{lstlisting}[morekeywords={stream,param,output}]
stream<Alert> Alerts = MatchRegex(Calls) {
param
partitionBy: caller;
predicates : {
tooFarTooFast =
geoDist(First(loc), Last(loc)) >= 10.0
&& timeDist(First(ts), Last(ts)) <= 60.0; };
pattern : ".+ tooFarTooFast";
output
Alerts: who=caller, where=Last(loc), when=Last(ts);
}
\end{lstlisting}
\vspace*{-4mm}
\caption{\label{fig:cep}CEP example.}
\end{figure}
Complex event processing (CEP) uses patterns over simple events to
detect higher-level, \emph{complex}, events that may comprise multiple
simple events. CEP can be considered either as an alternative to
stream processing or as a special case of stream processing. The
latter consideration has led to the definition of CEP operators in
streaming languages. For example, the \textsf{MatchRegex}~\cite{hirzel_2012}
operator implements CEP in the library of the SPL
language~\cite{hirzel_schneider_gedik_2017} (Section~\ref{sec:big}). MatchRegex was introduced by Hirzel in~2012,
influenced by the \mbox{\textsf{M{\small{}ATCH}-R{\small{}ECOGNIZE}}} proposal for extending ANSI
SQL~\cite{zemke_et_al_2007}. Compared to its SQL counterpart,
MatchRegex is simplified, syntactically concise, and easy to deploy as
a library operator. MatchRegex is implemented via code generation and
translates to an automaton for space- and time-efficient incremental
computation of aggregates. However, it omits other
functionalities beyond pattern matching, such as joins and reporting
tasks. Figure~\ref{fig:cep} shows an example for detecting a complex
event when simple phone-call events occur over 10~miles apart within
60~seconds. Line~8 defines the regular expression, where the period
(\lstinline{.}) matches any simple event; the plus (\lstinline{+})
indicates at-least-once repetition; and \lstinline{tooFarTooFast} is a
simple event defined via a predicate in \mbox{Lines 5--7}. The
\lstinline{First} and \lstinline{Last} functions reference
corresponding simple events in the overall match: in this case, the
start of the sequence matched by \lstinline{.+} and the simple event matched by
\lstinline{tooFarTooFast}.
%% 2006
One of the earliest languages for complex event queries on real-time
streams was \textsf{SASE}~\cite{WuDR06}. The language was designed to translate
to algebraic operators, but did not yet support aggregation or regular
expressions with Kleene closure, as used in Figure~\ref{fig:cep}.
%% 2007
The \textsf{Cayuga} Event Language offered aggregation and Kleene closure, but
did so in a hand-crafted syntax instead of familiar regular expression
syntax~\cite{demers_et_al_2007}.
%% 2007
The closest predecessor of MatchRegex was the \textsc{Match-Recognize}
proposal for extending SQL with pattern recognition in relational
rows~\cite{zemke_et_al_2007}. It used regular-expression syntax as
well as aggregations. Like MatchRegex, it is embedded in a host
language that supports orthogonal features via operators such as
joins.
%% 2008
Another take on CEP using regular expressions was
\textsf{EventScript}~\cite{cohen_kalleberg_2008}, which allowed the patterns to
be interspersed with action blocks.
%% 2010
While most CEP pattern matching is inherently sequential, Chandramouli
et al.\ generalized it for out-of-order data
streams~\cite{chandramouli_goldstein_maier_2010}, a topic further
discussed in Section~\ref{sec:veracity}.
%% Most CEP pattern matching is inherently sequential, but Chandramouli et
%% al.\ explored how to generalize it for out-of-order data
%% streams~\cite{chandramouli_goldstein_maier_2010}. Their work
%% anticipates the veracity challenge, further discussed in
%% Section~\ref{sec:veracity} of this survey.
%new part on support of CEP in Microsoft Trill, Apache Flink and Esper
% we can remove the previous statement in case of lack of space
Recently, CEP is also supported by several big-data streaming engines,
such as Trill~\cite{chandramouli_et_al_2014}, Esper~\cite{Esper18},
and Flink~\cite{carbone_et_al_2015}, the latter
exhibiting a CEP library since its early 1.0 version.
Indeed, the high throughput
and low latency nature of these engines make them suitable for CEP's real
time analytics.