Introducing KSQL

What is it for?

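Previously we treated Kafka as a data hub: a pipe for moving data that then got loaded into a database. KSQL lets us treat Kafka itself as the database and read, write, and transform data directly in it.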

From a generic point of view, KSQL is what you should use when transformations, integrations, and analytics need to happen on the fly, while the data is streaming. KSQL provides a way to keep Kafka as the single data hub: there is no need to take data out, transform it, and re-insert it into Kafka. Every transformation can be done inside Kafka using SQL. Kafka plus KSQL turns the database inside out.

Features

  • Solves the main problem of providing a SQL interface over Kafka, without the need for external languages like Python or Java.
  • Continuous queries: with KSQL, transformations run continuously as new data arrives in the Kafka topic.
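
To make the idea of a continuous query concrete, here is a minimal Python sketch of the concept. It is not KSQL and does not talk to Kafka; the `continuous_count` function and the page ids are hypothetical stand-ins for a query and for records arriving on a topic. The point is only that the result is updated once per incoming event rather than computed once over a finished data set.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple


def continuous_count(events: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Emit an updated count for a key every time a new event for it arrives."""
    counts: Counter = Counter()
    for key in events:          # with a real stream this loop would never end
        counts[key] += 1
        yield key, counts[key]  # the result is refreshed per event, not per batch


# Hypothetical page ids standing in for records read from a Kafka topic.
incoming = ["home", "cart", "home", "home"]
updates = list(continuous_count(incoming))
print(updates)
```

Each element of `updates` is the latest count for the key that just arrived, which is roughly how a continuously running aggregation behaves from the consumer's point of view.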

Use cases

Typical use cases include real-time analytics, security and anomaly detection, online data integration, and general application development.

How do you use it?

Key concepts

streams and tables

  • A Stream is a sequence of structured data. Once an event has been introduced into a stream, it is immutable.
  • A Table, on the other hand, represents the current state derived from the events coming from a stream.

A topic in Apache Kafka can be represented as either a STREAM or a TABLE in KSQL, depending on the intended semantics of the processing on the topic.

Example

For instance, if you want to read the data in a topic as a series of independent values, you would use CREATE STREAM. An example of such a stream is a topic that captures page view events, where each page view event is unrelated to and independent of the others. If, on the other hand, you want to read the data in a topic as an evolving collection of updatable values, you would use CREATE TABLE. An example of a topic that should be read as a TABLE in KSQL is one that captures user metadata, where each event represents the latest metadata for a particular user id, be it the user's name, address, or preferences.
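
The difference between the two readings can be sketched in a few lines of Python. This is an illustration of the semantics only, not KSQL syntax; the helper names `as_stream` and `as_table` and the sample records are assumptions made up for the example.

```python
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, str]  # (user_id, value); a hypothetical record layout

# Hypothetical records in arrival order, standing in for two Kafka topics.
page_views: List[Record] = [("alice", "/home"), ("bob", "/cart"), ("alice", "/home")]
user_updates: List[Record] = [("alice", "Paris"), ("bob", "Rome"), ("alice", "Berlin")]


def as_stream(records: Iterable[Record]) -> List[Record]:
    """Stream reading: every record is an independent, immutable fact."""
    return list(records)


def as_table(records: Iterable[Record]) -> Dict[str, str]:
    """Table reading: a later record for a key replaces the earlier one."""
    table: Dict[str, str] = {}
    for key, value in records:
        table[key] = value
    return table


stream = as_stream(page_views)
table = as_table(user_updates)
print(len(stream), table)
```

Read as a stream, all three page views are kept, including the repeated one. Read as a table, only the latest city per user survives.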

How it works

KSQL enables the definition of streams and tables via a simple SQL dialect. Streams and tables coming from different sources can be joined directly in KSQL, which makes it possible to combine and transform data on the fly.
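
As a rough illustration of joining a stream with a table on the fly, the Python sketch below enriches page views with the user metadata known at the moment each view arrives. It models the idea only and makes no claim about KSQL's actual join semantics; the event layout, the `join_views_with_users` function, and the choice to drop views with no matching user (an inner join) are all assumptions.

```python
from typing import Dict, Iterable, List, Tuple

Event = Tuple[str, str, str]  # (kind, user_id, value); layout is hypothetical


def join_views_with_users(events: Iterable[Event]) -> List[Tuple[str, str, str]]:
    """Enrich each page view with the user's city as known when the view arrives."""
    users: Dict[str, str] = {}              # table side: latest city per user
    enriched: List[Tuple[str, str, str]] = []
    for kind, user_id, value in events:
        if kind == "user":
            users[user_id] = value          # upsert into the table
        elif kind == "view" and user_id in users:
            # Stream side: look up the current table state (inner join).
            enriched.append((user_id, value, users[user_id]))
    return enriched


# Hypothetical interleaving of two topics in arrival order.
events = [
    ("user", "alice", "Paris"),
    ("view", "alice", "/home"),
    ("user", "alice", "Berlin"),
    ("view", "alice", "/cart"),
    ("view", "carol", "/home"),  # no metadata for carol, so this view is dropped
]
enriched = join_views_with_users(events)
print(enriched)
```

The second view picks up the updated city because the table had changed by the time it arrived, which is what "on the fly" means here.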

Each stream or table created in KSQL is stored in a separate topic, so the usual connectors or scripts can be used to extract information from it.

In practice

standalone and client-server mode

KSQL can run in either standalone or client-server mode. Standalone mode is aimed at development and testing scenarios, while client-server mode supports production environments.

Syntax Reference

What’s Next for KSQL?

  • Now: releasing KSQL as a developer preview to start building the community around it and gathering feedback
  • Plan: add several more capabilities as we work with the open source community to turn it into a production-ready system
    Note: the planned work ranges from improving the quality, stability, and operability of KSQL to supporting a richer SQL grammar, including further aggregation functions and point-in-time SELECT on continuous tables, i.e., quick lookups against what has been computed so far, in addition to the current functionality of continuously computing results off of a stream.