GeeCON 2024: Viktor Gamov - How do you query a stream?

If you have embraced Apache Kafka as the core of your data infrastructure, you have most likely integrated event-driven services that communicate with one another through topics, connected legacy systems through an ecosystem of connectors, and responded more or less in real time to things happening in the world outside your software. Immutable logs of events form a more robust backbone than the one-database-to-rule-them-all of your deep monolithic past. Your stack is more evolvable, more responsive, and easier to reason about. However, now that everything is a stream, you might face a challenge: how do you query things? Although you can probably name at least one or two ways off the top of your head, it's time to think through how to make the choice. In this talk, we'll examine the solutions currently in use for asking questions about the contents of a topic, including Kafka Streams, the various streaming SQL implementations, your favourite relational database, your favourite data lake, and real-time analytics databases like Apache Pinot. There is no single correct answer to the question, so as responsible builders of systems, we must understand our options and the trade-offs they present. You'll leave this talk even more satisfied that you've embraced Kafka as the heart of your system, and ready to deploy the right choice for querying the logs that hold your data.