Lambda Architecture Definition
Lambda Architecture is a data-processing architecture designed to handle massive quantities of data by taking advantage of both batch stream-processing methods to design a robust, scalable and fault-tolerance (human and machine) big data systems.
Lamba Architecture tries tries also balancing between the latency & Accuracy.
Lambda Architecture Layers
Lambda Architecture Properties:
- A paradigm for Big Data
- In data processing for balance on throughput , latency, fault-tolerance and scalable.
- For modern data warehouse
Applying the Lambda Architecture with Spark, Kafka, and Cassandra
The toolings are the following:
- Spark Data Frame & Spark SQL in addition to Spark’s Data Source API to load, store and manipulate data.
- Spark Streaming & Spark-Kafka Integration techniques -> for reliability and speed
- Develop a Kafka Data Producer -> to simulate the real-time data stream feed into streaming application.
- Stateful Spark Streaming Application -> to preserve global state and use memory efficiently with approximate algorithms.
- Errors & Code updates -> when we build a stateful Spark streaming application and a production application isn’t complete without the ability to handle errors and code updates.
- Persist Data to Cassandra & HDFS -> for working with the scalable NoSQL database and persist the data to Cassandra and HDFS.
Your Text Here
Lambda Architecture on Azure, Google and AWS