By Nesime Tatbul, Intel Labs and MIT
Managing high-speed data streams generated in real time is an integral part of today’s big data applications. In a wide range of domains from social media to financial trading, there is a growing need to seamlessly support incremental processing as new data is generated. At the same time, the system must ingest some or all of this data into a persistent store for on-demand transaction or analytical processing. In particular, for applications with high-velocity data updating a large persistent state, there is need to manage streaming and non-streaming state side by side, in a way to ensure transactional integrity and performance. Examples include leader-board maintenance, multi-player online games, real-time financial trading, and online advertising.
A decade ago, we (and others) built stream processing engines. Ours was called Aurora/Borealis, which was commercialized as StreamBase (now part of TIBCO). Customer applications were never pure stream processing. They always required storage as a way to record the past for comparison with the present. Thus, the obvious solution was to bolt on a DBMS, but the solutions were very unsatisfying because they ignored the need for consistency between streaming results and stored state.
For applications with high-velocity data updating large persistent state, there is a need to manage streaming and non-streaming state side by side, in a way to ensure transactional integrity and performance. Our goal is to fill this gap by building a single system that can scale to support stream and transaction processing under the same hood.
Today’s stream processing systems still lack the required transactional robustness, while OLTP databases do not provide native support for data-driven processing. In this work, our goal is to fill this gap by building a single system that can scale to support stream and transaction processing under the same hood. We believe that modern distributed main-memory OLTP platforms such as H-Store provide a suitable foundation for building such a system, since: first, they are more light-weight than their traditional disk-based counterparts; second, like streaming engines, they offer lower latency via in-memory processing; and last but not least, they provide strong support for state and transaction management. Thus, we introduce S-Store, a streaming OLTP system that realizes our goal by extending H-Store with streams.
H-Store is a high-performance, in-memory, distributed OLTP system designed for shared-nothing clusters. It targets OLTP workloads with short-lived transactions, which are pre-defined as parametric stored procedures (i.e., SQL queries embedded in Java programs) that get instantiated by client requests at run time. Data is carefully partitioned and transactions that can be executed on a single partition are run serially on that partition, eliminating the need for locks and latches. S-Store builds on H-Store’s architecture, reusing its core data management primitives such as tables and stored procedures. In addition, it introduces a number of new primitives for supporting stream processing natively. S-Store’s key extensions include:
New constructs for streams, windows, triggers, and workflows: Streams differentiate continuously streaming state from regular stored state; windows define finite chunks of state over (possibly unbounded) streams; triggers are used to indicate computations to be invoked for newly generated data; and workflows refer to pipelines of dependent transactions triggered by common streaming input(s).
Stream-oriented transaction model: We define a streaming transaction in S-Store as a stored procedure with a streaming input that is invoked by newly arriving stream tuple(s) on that input. Thus, streaming transactions are data-driven, whereas regular OLTP transactions are operation-driven.
Uniform state management: H-Store’s in-memory tables are used for representing all states, including streams and windows, making state access both efficient and transactionally safe. Furthermore, S-Store provides automatic garbage collection mechanisms for tuples that expire from stream or window state.
Data-driven processing via triggers: Special insert triggers are defined on stream or window state in order to enable push-based, data-driven processing in S-Store. There are two types of triggers: engine triggers at the query level (SQL) and front-end triggers at the stored procedure level (Java). The former enable continuous processing for a given transaction, while the latter enable composing workflows of transactions.
S-Store follows H-Store’s three-layer transaction execution stack that consists of the client application, the partition engine, and the execution engine. The native stream processing extensions described above help S-Store use this stack in the most efficient way by embedding the required functionality in the proper engine layers, thereby avoiding redundant computations, communication across the layers, and the need to poll for new data.
Our first experimental study with a streaming OLTP benchmark has shown that S-Store significantly outperforms H-Store in terms of transaction throughput. This benchmark involved a streaming input in addition to regular tables, and a stored procedure with a sliding window aggregation along with regular table operations. S-Store’s native stream processing constructs described above led to a simpler, data-driven implementation with fewer SQL commands and round trips between the engine layers, boosting the performance.
We are extending this study towards more complex benchmark scenarios and comparison against other state-of-the-art alternatives. Building an integrated system for transaction and stream processing such as S-Store raises a number of interesting research challenges from workload distribution and management to transaction modeling and execution. We will report on how we have addressed these challenges in our future blog posts.
S-Store is a joint project of Intel Labs, MIT, Brown, and Carnegie Mellon University. We will be making an official open-source release in the future. In the meantime, please feel free to contact me (Nesime Tatbul) with any questions or comments. We are particularly interested in hearing about novel use cases.