Kafka Summit San Francisco 2017 (Pipeline track)

Billions of Messages a Day – Yelp’s Real-time Data Pipeline

by Justin Cunningham, Technical Lead, Software Engineering, Yelp
video, slide
Yelp moved quickly into building out a comprehensive service-oriented architecture, and before long had over 100 data-owning production services. Distributing data across an organization creates a number of issues, particularly around the cost of joining disparate data sources, which dramatically increases the complexity of bulk data applications. Straightforward solutions like bulk data APIs and sharing data snapshots have significant drawbacks. Yelp’s Data Pipeline makes it easier for these services to communicate with each other, provides a framework for real-time data processing, and facilitates high-performance bulk data applications, making large SOAs easier to work with. The Data Pipeline provides a series of guarantees that make it easy to create universal data producers and consumers that can be mashed up into interesting real-time data flows. We’ll show how a few simple services at Yelp lay the foundation that powers everything from search to our experimentation framework.

Body Armor for Distributed System

by Michael Egorov, Co-founder and CTO, NuCypher
video, slide
We show a way to make Kafka end-to-end encrypted: data is only ever decrypted on the producer and consumer side, never on the broker. Importantly, every Kafka client has its own encryption key; there is no pre-shared encryption key. Our approach can be compared to TLS extended to more than two connected parties.

DNS for Data: The Need for a Stream Registry

by Praveen Hirsave, Director Cloud Engineering, HomeAway
video, slide
As organizations increasingly adopt streaming platforms such as Kafka, the need for visibility and discovery has become paramount. With the advent of self-service streaming and analytics, improving overall speed, not only time-to-signal but also time to production, is becoming the difference between winners and losers. Beyond having Kafka at the core of a successful streaming platform, there is a need for a stream registry. Come to this session to find out how HomeAway is solving this with a “just right” approach to governance.

Efficient Schemas in Motion with Kafka and Schema Registry

by Pat Patterson, Community Champion, StreamSets Inc.
video, slide
Apache Avro allows data to be self-describing, but carries an overhead when used with message queues such as Apache Kafka. Confluent’s open source Schema Registry integrates with Kafka to allow Avro schemas to be passed ‘by reference’, minimizing overhead, and can be used with any application that uses Avro. Learn about Schema Registry, using it with Kafka, and leveraging it in your application.
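
As a rough illustration of what passing a schema "by reference" means, the sketch below frames a message the way Confluent's serializers are documented to: a single magic byte (0), a 4-byte big-endian schema ID, then the Avro-encoded payload. The function names `frame` and `unframe` are hypothetical, the payload is treated as opaque bytes, and no registry lookup is performed; this is a minimal sketch of the framing only, not the Schema Registry client API.

```python
import struct
from typing import Tuple

MAGIC_BYTE = 0   # leading byte of the framed message
HEADER_SIZE = 5  # 1 magic byte + 4-byte schema ID


def frame(schema_id: int, avro_payload: bytes) -> bytes:
    """Prefix an already Avro-encoded payload with the magic byte and schema ID."""
    # ">bI" = big-endian, one signed byte, one unsigned 32-bit integer (no padding)
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + avro_payload


def unframe(message: bytes) -> Tuple[int, bytes]:
    """Split a framed message into (schema_id, avro_payload)."""
    if len(message) < HEADER_SIZE:
        raise ValueError("message too short to contain a schema reference")
    magic, schema_id = struct.unpack(">bI", message[:HEADER_SIZE])
    if magic != MAGIC_BYTE:
        raise ValueError("unexpected magic byte: %d" % magic)
    return schema_id, message[HEADER_SIZE:]


if __name__ == "__main__":
    # b"\x04hi" is the Avro binary encoding of the string "hi";
    # schema ID 42 is a made-up value for illustration.
    framed = frame(42, b"\x04hi")
    schema_id, payload = unframe(framed)
    print(schema_id, payload, len(framed) - len(payload))
```

Each message carries a fixed five bytes of schema information regardless of how large the schema itself is; a consumer would use the ID to fetch (and cache) the full schema from the registry before decoding the payload.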

From Scaling Nightmare to Stream Dream: Real-time Stream Processing at Scale

by Amy Boyle, Software Engineer, New Relic
video, slide
On the events pipeline team at New Relic, Kafka is the thread that stitches our microservice architecture together. We receive billions of monitoring events an hour, which customers rely on us to alert on in real time. Learn how, facing more than tenfold growth in the system, we avoided a costly scaling nightmare by switching to a streaming system based on Kafka. We follow a DevOps philosophy at New Relic, so I have a personal stake in how well our systems perform: if evaluation deadlines are missed, I lose sleep and customers lose trust. Without necessarily setting out to do so from the start, we’ve gone all in on Kafka, using it as the backbone of an event-driven pipeline, as a datastore, and for streaming updates to the system. Hear about what worked for us, what challenges we faced, and how we continue to scale our applications.

How Blizzard Used Kafka to Save Our Pipeline (and Azeroth)

by Jeff Field, Systems Engineer, Blizzard
video, slide
When Blizzard started sending gameplay data to Hadoop in 2013, we went through several iterations before settling on Flume instances in many data centers around the world that read from RabbitMQ and write to central Flume instances in our Los Angeles data center. While this worked at first, by 2015 we were having trouble scaling to the number of events required. This is how we used Kafka to save our pipeline.

Kafka Connect Best Practices – Advice from the Field

by Randall Hauch, Engineer, Confluent
video, slide
This talk will review the Kafka Connect framework and discuss building data pipelines using the library of available connectors. We’ll deploy several data integration pipelines and demonstrate:

best practices for configuring, managing, and tuning the connectors
tools to monitor data flow through the pipeline
using Kafka Streams applications to transform or enhance the data in flight.

One Data Center is Not Enough: Scaling Apache Kafka Across Multiple Data Centers

by Gwen Shapira, Product Manager, Confluent
video, slide
You have made the transition from single machines and one-off solutions to distributed infrastructure in your data center powered by Apache Kafka. But what if one data center is not enough? In this session, we review resilient data pipelines with Apache Kafka that span multiple data centers. We provide an overview of best practices and common patterns including key areas such as architecture and data replication as well as disaster scenarios and failure handling.
