Wallaroo Core Concepts
A Wallaroo application consists of a pipeline with one or more sources. A pipeline takes in data from external systems, performs a series of computations based on that data, and optionally produces outputs which are sent to an external system.
Here is a diagram illustrating a single, linear pipeline:
External | Wallaroo Wallaroo| External System ──>| Source ─> C1 ─> C2 ─> C3 ─> Sink |──> System A | | B
An external data source sends data (say, over TCP) to an internal Wallaroo source. The Wallaroo source decodes that stream of bytes, transforming it into a stream of internal data types that are sent to a series of computations (C1, C2, and C3). Each computation takes an input and produces an output. Finally, C3 sends its output to a Wallaroo sink, which encodes that output as a series of bytes and sends it over TCP to an external system.
A Wallaroo application can have multiple sources. Each source might send data down a chain of computations. These chains can then be merged, say at a downstream state computation. For example:
External | Wallaroo System ──>| Source ─> C1 ─> C2 -\ A | \ \ Wallaroo| External -> C5 -> Sink |--> System / | C External | Wallaroo / System ──>| Source ─> C3 ─> C4 ─/ B |
The outputs of both C2 and C4 are sent along to C5, where they will arrive in a non-deterministic interleaving. The outputs of C5 are then sent to a sink and ultimately sent out over TCP to an external system.
- State -- Accumulated result of data stored over the course of time
- Stateless Computation -- Code that transforms an input of some type
Into an output of some type
Noneif the input should be filtered out).
- State Computation -- Code that takes an input type
Inand a state object of some type
State, operates on that input and state (possibly making state updates), and optionally producing an output of some type
- Source -- Ingress (input) point for data from external systems into a Wallaroo application.
- Sink -- Egress (output) point from a Wallaroo application to external systems.
- Decoder -- Code that transforms a stream of bytes from an external system into a series of application input types.
- Encoder -- Code that transforms an application output type into bytes for sending to an external system.
- Pipeline -- A sequence of computations and/or state computations originating from one or more sources and terminating in a sink.
- Stage -- A source, stateless computation, state computation, and sink each represents a single logical stage in a Wallaroo pipeline. The linear pipeline above--starting from a source, going through computations C1, C2, and C3, and terminating at a sink--has 5 logical stages (1 source, 3 computations, and 1 sink).
- Topology -- A graph of how all sources, sinks, and computations are connected within an application.
- API -- Wallaroo provides APIs for implementing all of the above concepts.