
Pravega is a storage system for data streams with an innovative design and an attractive set of features for meeting today’s stream processing requirements (e.g., event ordering, scalability, and performance). The project has plenty of documentation and great blog posts that explain every technical aspect of Pravega in detail. But, if you are […]

Traditional cache solutions treat each entry as an immutable blob of data, which poses problems for the append-heavy ingestion workloads that are common in Pravega. Each Event appended to a Stream would either require its own cache entry or need an expensive read-modify-write operation to be included in the Cache. To enable high-performance ingestion of […]

The ability to pipeline Events to the Segment Store is a key technique that the Pravega Client uses to achieve high throughput, even when dealing with small writes. A Writer appends an Event to its corresponding Segment as soon as it is received, without waiting for previous ones to be acknowledged. To guarantee ordering and […]

Pravega Watermarking Support, by Tom Kaitchuck and Flavio Junqueira. Motivation: Stream processing broadly refers to the ability to ingest data from unbounded sources and to process that data as it is ingested. The data can be user-generated, as in social networks or other online applications, or machine-generated, as in server telemetry or sensor samples from IoT and […]

This blog post provides an overview of how Apache Flink and the Pravega connector work under the hood to provide end-to-end exactly-once semantics for streaming data pipelines. Overview: Pravega [4] is a storage system that exposes the Stream as a storage primitive for continuous and unbounded data. A Pravega stream is a durable, elastic, append-only, unbounded sequence of […]

The Pravega Segment Store Service is a subsystem that lies at the heart of the entire Pravega deployment. It is the main access point for managing Stream Segments, providing the ability to modify and read their contents. The Pravega Client communicates with the Pravega Stream Controller to identify which Segments need to be used (for […]

Pravega allows state to be shared consistently across multiple cooperating processes distributed in a cluster by using a State Synchronizer. This blog details how to use the State Synchronizer [1] to build and maintain consistency in a distributed application. State Synchronizer: In distributed systems, state frequently needs to be shared across multiple instances […]

Several of the difficulties with tailing a data stream boil down to the dynamics of the source and of the stream processor. For example, if the source increases its production rate in an unplanned manner, then the ingestion system must be able to accommodate such a change. The same happens if a processor […]

Introduction: Reading and writing are the most basic functionality that Pravega offers. Applications ingest data by writing to one or more Pravega streams and consume it by reading from one or more streams. To implement applications correctly with Pravega, however, it is crucial that the developer is aware of some additional functionality that complements […]