Data stream characteristics (infinite, unbounded, out-of-order), Lambda vs. Kappa architectures, Time concepts (event time, processing time, ingestion time, watermarks), Windowing strategies (tumbling, hopping, sliding, session), Late data handling and allowed lateness, Exactly-once vs. at-least-once semantics, Fault tolerance in streaming systems.
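To make the interplay of event time, watermarks, tumbling windows, and allowed lateness concrete, here is a minimal pure-Python sketch. It is not tied to any particular engine: the constants, the function names, and the rule that the watermark trails the largest event time seen by a fixed bound are all illustrative assumptions, and real systems differ in the exact boundary conditions.

```python
from collections import defaultdict

# All three constants are hypothetical values chosen for illustration.
WINDOW_SIZE = 10        # seconds covered by each tumbling window
MAX_OUT_OF_ORDER = 3    # seconds the watermark trails the max event time seen
ALLOWED_LATENESS = 5    # extra seconds a window accepts late events


def window_start(event_time, size=WINDOW_SIZE):
    """Return the start of the tumbling window containing event_time."""
    return event_time - (event_time % size)


def process(events):
    """Count events per tumbling window, keyed by window start.

    events is an iterable of (event_time, payload) pairs in arrival
    order, which may differ from event-time order.
    Returns (counts, dropped).
    """
    counts = defaultdict(int)
    dropped = []
    watermark = float("-inf")
    for event_time, payload in events:
        # The watermark only moves forward.
        watermark = max(watermark, event_time - MAX_OUT_OF_ORDER)
        start = window_start(event_time)
        window_end = start + WINDOW_SIZE
        if window_end + ALLOWED_LATENESS <= watermark:
            # The window has been finalized; the event is too late.
            dropped.append((event_time, payload))
        else:
            counts[start] += 1
    return dict(counts), dropped


events = [(1, "a"), (4, "b"), (12, "c"), (8, "d"), (30, "e"), (9, "f")]
counts, dropped = process(events)
```

In this run the event at time 8 arrives after the watermark has reached 9, so it is late, but its window is still within the allowed lateness and it is counted. The event at time 9 arrives after the watermark has jumped to 27 and is dropped.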
Kafka architecture (topics, partitions, leaders/followers), Producer/consumer APIs and configurations, Kafka Streams DSL vs. Processor API, ksqlDB (formerly KSQL) for SQL-based stream processing, Kafka Connect for data integration, MirrorMaker and cluster federation, Schema Registry and Avro/Protobuf serialization, Exactly-once guarantees with transactions.
Flink execution model (DataStream API; legacy DataSet API), Stateful stream processing and keyed state, Checkpointing and savepoints, Event-time processing with watermarks, Complex Event Processing (CEP), Table/SQL API for unified batch/streaming, Flink ML for machine learning and Gelly for graph processing, Side outputs and dynamic scaling.
Pattern detection (MATCH_RECOGNIZE, CEP patterns), Sessionization and funnel analysis, Real-time aggregations and materialized views, Join strategies (stream-stream, stream-table), Time-windowed joins and interval joins, Anomaly detection (isolation forests, statistical methods), Real-time dashboards (Grafana + Prometheus).
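As a small illustration of sessionization, the sketch below groups one user's event timestamps into sessions separated by an inactivity gap. The function name and the 30-second default gap are hypothetical, and the input is sorted up front, so this is a batch-style approximation of what a streaming engine would do incrementally with session windows.

```python
def sessionize(timestamps, gap=30):
    """Group timestamps into sessions.

    A new session starts whenever the distance to the previous event
    exceeds gap (same time unit as the timestamps).
    """
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap:
            # Close enough to the previous event: extend the session.
            sessions[-1].append(ts)
        else:
            # Gap exceeded, or first event: open a new session.
            sessions.append([ts])
    return sessions


sessions = sessionize([100, 10, 25, 130, 200], gap=30)
```

Each inner list is one session. Funnel analysis would then check, per session, whether the events of interest occur in the expected order.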
Stream-table duality and changelog semantics, Upserts and primary key handling, Change Data Capture (CDC) with Debezium, Streaming ETL/ELT pipelines, Backpressure handling and resource management, Multi-tenancy and resource isolation, Monitoring and observability (metrics, tracing), Deployment strategies (Kubernetes operators).
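The stream-table duality can be illustrated by replaying a changelog into a table. The sketch below is a minimal pure-Python version under stated assumptions: records are (key, value) pairs, a value of None is treated as a delete marker (a tombstone, following the convention of log-compacted topics), and the function name and sample data are hypothetical.

```python
def materialize(changelog):
    """Replay a changelog of (key, value) records into a table.

    Later records for a key overwrite earlier ones (upsert semantics).
    A value of None is treated as a tombstone and deletes the key.
    """
    table = {}
    for key, value in changelog:
        if value is None:
            table.pop(key, None)   # delete; ignore keys that are absent
        else:
            table[key] = value     # insert or update
    return table


changelog = [
    ("u1", "Berlin"),
    ("u2", "Paris"),
    ("u1", "Rome"),   # update for u1
    ("u2", None),     # tombstone for u2
]
table = materialize(changelog)
```

Replaying the same changelog from the start always yields the same table, which is why a changelog is enough to rebuild state after a failure.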