One of the main use cases for Apache Kafka is building reliable and flexible data pipelines. Kafka Connect, which is part of Apache Kafka, enables the integration of data from multiple sources, including Oracle, Hadoop, S3 and Elasticsearch. Building on Kafka's Streams API, KSQL from Confluent enables stream processing and data transformations using a SQL-like language. This presentation will briefly recap the purpose of Kafka, and then dive into Kafka Connect with practical examples of data pipelines that can be built with it. We'll explore two options for data transformation and processing: pluggable Single Message Transformations and the newly announced KSQL for powerful query-based stream processing.
Gwen is a principal data architect at Confluent, helping customers achieve success with their Apache Kafka implementations. She has 15 years of experience working with code and customers to build scalable data architectures, integrating relational and big data technologies. She currently specializes in building reliable, real-time data processing pipelines using Apache Kafka. Gwen is an author of "Kafka: The Definitive Guide" and "Hadoop Application Architectures", and a frequent presenter at industry conferences. Gwen is also a committer on the Apache Kafka and Apache Sqoop projects. When Gwen isn't coding or building data pipelines, you can find her pedaling on her bike, exploring the roads and trails of California and beyond.