-
Notifications
You must be signed in to change notification settings - Fork 0
/
sql.tex
90 lines (84 loc) · 4.18 KB
/
sql.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
\subsection{Relational Streaming}\label{sec:sql} % Sherif
\begin{figure}[!h]
\begin{lstlisting}[morekeywords={Select,IStream,As,From,Range,Slide}]
Select IStream( Max(len) As mxl,
MaxCount(len) As num,
ArgMax(len, caller) As who )
From Calls[Range 24 Hours Slide 1 Minute]
\end{lstlisting}
\vspace*{-4mm}
\caption{\label{fig:cql}CQL code example.}
\end{figure}
In 2004, Arasu et al.\ at Stanford introduced \textsf{CQL} (for Continuous
Query Language)~\cite{arasu_widom_2004}. CQL has been designed as an
SQL-based declarative language for implementing continuous queries
against streams of data, such as the LinearRoad
benchmark~\cite{arasu_et_al_2004}. The design was influenced by the
\textsf{TelegraphCQ} system, which proposed an SQL-based language with a
focus on expressive windowing
constructs~\cite{chandrasekaran_et_al_2003}.
Figure~\ref{fig:cql} illustrates a CQL code example that
uses a time-based sliding window (per minute within the last 24 hours) over
phone calls to return the maximum phone call length along with its
count and caller information.
\begin{figure}[!h]
\centerline{\includegraphics[scale=0.45]{cqlops.pdf}}
\vspace*{-4mm}
\caption{\label{fig:cqlops}CQL algebra operators.}
\end{figure}
The semantics of CQL are
based on two phases of data, \emph{streams} and \emph{relations}.
As Figure~\ref{fig:cqlops} illustrates, CQL
supports three classes of operators over these types. First,
\emph{stream-to-relation} operators freeze a stream into a relation.
These operators are based on
\emph{windows} that, at any point of time, contain a
historical snapshot of a recent portion of the stream. CQL includes
time-based and tuple-based windows, both with optional
partitioning. Second, \emph{relation-to-relation} operators
turn relations into another relation. These operators are expressed
using standard SQL syntax and come from traditional relational
algebra, such as select~($\sigma$), project~($\pi$),
group-by-aggregate~($\gamma$), and join~($\bowtie$).
Third, \emph{rela\-tion-to-stream} operators thaw a relation
back into a stream. CQL supports three operators of this class:
\texttt{IStream}, \texttt{DStream}, and \texttt{RStream} (to capture inserts, deletes, or the entire
relation).
Streaming SQL dialects were preceded by temporal relational models
such as the one by Jensen and Snodgrass in the early
'90s~\cite{jensen1994temporal}. In their model, each temporal relation
has two main dimensions: a valid time record and transaction time.
Besides TelegraphCQ, another CQL predecessor was
\textsf{GSQL}~\cite{cranor_et_al_2003}. In addition to the standard SQL
operators (e.g., $\sigma$, $\pi$, $\gamma$,~$\bowtie$), GSQL supported
a \emph{merge} operator that combines streams from multiple sources in
order as specified by ordered attributes. GSQL supported joins as long
as it could determine a window from ordered attributes and join predicates.
CQL has influenced the design of many systems, for example,
\textsf{StreamInsight}~\cite{ali_et_al_2009} and
\textsf{StreamBase}~\cite{seyfer_tibbetts_mishkin_2011}.
%
Jain et al.\ described an approach to unify two different proposed SQL
extensions for streams~\cite{jain_et_al_2008}. The first, by
Oracle, was CQL-based and used a time-based execution model that
could model simultaneity. The second, by
StreamBase, used a tuple-based execution model that provided
a way to react to primitive events as soon as they are seen by the
system.
%
\textsf{SECRET} goes beyond Jain et al.'s work to comprehensively
understand the results of various window-based queries (e.g., time-
and tuple-based windows)~\cite{botan_et_al_2010}.
%% Jain et al.\ captured ordering and simultaneity through partial orders
%% on batches of tuples using a new \emph{Spread} operator.
Zou et al.\ showed how to turn a stream of queries into a stream query
by streaming their parameters~\cite{zou_et_al_2010}.
%
Chandramouli et al.\ presented \textsf{TiMR}, which implemented
temporal queries over the MapReduce
framework~\cite{chandramouli2012temporal}.
%
And finally, Soul\'{e} et al.~\cite{soule_et_al_2016} presented a
type system and small-step operational semantics for CQL via
translation to the \textsf{Brooklet} stream-processing
calculus~\cite{soule_et_al_2010}.