Druid Single-Node Installation and Offline Data Import

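1. Overview

This quick install targets a single server. Most settings can be left at their defaults, and data is stored on the server's local disk. The purpose of the quick install is to make it easy to understand, and to walk through, the development workflow for big-data analytics on Druid.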

2. Requirements

  • Java 8 or higher

  • Linux, Mac OS X, or other Unix-like OS (Windows is not supported)

  • 8G of RAM

  • 2 vCPUs

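3. Installing ZooKeeper

This guide uses a single-node installation. A distributed installation requires changes to the corresponding Druid configuration; a single-node one does not. ZooKeeper listens on port 2181 by default.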

curl http://www.gtlib.gatech.edu/pub/apache/zookeeper/zookeeper-3.4.10/zookeeper-3.4.10.tar.gz -o zookeeper-3.4.10.tar.gz

tar -xzf zookeeper-3.4.10.tar.gz
cd zookeeper-3.4.10
cp conf/zoo_sample.cfg conf/zoo.cfg
./bin/zkServer.sh start

➜  zookeeper-3.4.10 jps
10565 QuorumPeerMain
17832 Jps
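
The jps listing confirms that the QuorumPeerMain process exists, but not that it accepts connections on port 2181. A minimal Python sketch of such a check follows. The helper name port_open is made up for illustration, and the host and port are assumptions based on the default single-node setup described above.

```python
import socket

def port_open(host="localhost", port=2181, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable host names.
        return False

if __name__ == "__main__":
    print("ZooKeeper reachable:", port_open())
```

The check only proves that something is listening on the port; it does not verify that the listener is ZooKeeper or that it is healthy.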

4. Installing Druid

curl -O http://static.druid.io/artifacts/releases/druid-0.12.3-bin.tar.gz
tar -xzf druid-0.12.3-bin.tar.gz
cd druid-0.12.3

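Directory layout after extraction:

LICENSE - the license file.
bin/ - quickstart scripts.
conf/* - configuration for a clustered installation (including Hadoop).
conf-quickstart/* - configuration for the quickstart.
extensions/* - Druid extensions.
hadoop-dependencies/* - Druid Hadoop dependencies.
lib/* - core Druid packages.
quickstart/* - quickstart example files and data.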

5. Preparing to Start Druid

Two things must be done before starting the Druid services:

  1. Start ZooKeeper.
  2. Change to the Druid root directory and run bin/init.

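6. Starting the Druid Services

Start the five Druid processes, each in its own terminal window. Because this is a single-node setup, all of them run on the same server; in a large distributed cluster, several Druid processes can also share one server. The five processes to start are: historical, broker, coordinator, overlord, and middleManager.

Start the historical: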

java `cat conf-quickstart/druid/historical/jvm.config | xargs` -cp "conf-quickstart/druid/_common:conf-quickstart/druid/historical:lib/*" io.druid.cli.Main server historical

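Note the difference from the official site: the Druid installation directory has no examples directory, so the commands here use conf-quickstart paths instead of the examples/conf paths that appear in official commands such as this one: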

java `cat examples/conf/druid/coordinator/jvm.config | xargs` -cp "examples/conf/druid/_common:examples/conf/druid/_common/hadoop-xml:examples/conf/druid/coordinator:lib/*" io.druid.cli.Main server coordinator

Start the broker:

java `cat conf-quickstart/druid/broker/jvm.config | xargs` -cp "conf-quickstart/druid/_common:conf-quickstart/druid/broker:lib/*" io.druid.cli.Main server broker

Start the coordinator:

java `cat conf-quickstart/druid/coordinator/jvm.config | xargs` -cp "conf-quickstart/druid/_common:conf-quickstart/druid/coordinator:lib/*" io.druid.cli.Main server coordinator

Start the overlord:

java `cat conf-quickstart/druid/overlord/jvm.config | xargs` -cp "conf-quickstart/druid/_common:conf-quickstart/druid/overlord:lib/*" io.druid.cli.Main server overlord

Start the middleManager:

java `cat conf-quickstart/druid/middleManager/jvm.config | xargs` -cp "conf-quickstart/druid/_common:conf-quickstart/druid/middleManager:lib/*" io.druid.cli.Main server middleManager
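
The five launch commands differ only in the service name. As a minimal sketch, assuming the conf-quickstart layout used above, the argument list can be built in Python rather than retyped for each service. The function name launch_argv and the demo directory are hypothetical; the class name and classpath pattern are copied from the commands above.

```python
import shlex
import tempfile
from pathlib import Path

SERVICES = ["historical", "broker", "coordinator", "overlord", "middleManager"]

def launch_argv(service, druid_home=".", conf_root="conf-quickstart/druid"):
    """Build the java argument list for one quickstart service."""
    if service not in SERVICES:
        raise ValueError(f"unknown service: {service}")
    jvm_config = Path(druid_home) / conf_root / service / "jvm.config"
    # Like `cat jvm.config | xargs`: flatten the file into separate JVM flags.
    jvm_flags = shlex.split(jvm_config.read_text())
    classpath = f"{conf_root}/_common:{conf_root}/{service}:lib/*"
    return ["java", *jvm_flags, "-cp", classpath,
            "io.druid.cli.Main", "server", service]

# Demo against a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as home:
    conf_dir = Path(home) / "conf-quickstart/druid/broker"
    conf_dir.mkdir(parents=True)
    (conf_dir / "jvm.config").write_text("-server\n-Xms1g\n-Xmx1g\n")
    print(shlex.join(launch_argv("broker", druid_home=home)))
```

In a real installation the list would be passed to subprocess.Popen with cwd set to the Druid root directory, because the classpath entries are relative. Rejecting unknown names also catches slips such as "overload" for "overlord" before a JVM is started.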

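7. Druid Console

If the services above started successfully, the following console is available:

    1. Open http://localhost:8090/console.html to see the status of the tasks that batch-import data into Druid. Refresh the console from time to time; a task status of SUCCESS means the task completed successfully, as shown in the screenshots below: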
druid-console.png
druid-006.png

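8. Importing Offline Data into Druid

The ingestion spec, saved as quickstart/wikiticker-index_local.json (the file posted by the curl command further down), is: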

{ "type" : "index", 
  "spec" : {
    "ioConfig" : {
      "type" : "index",
      "firehose" : {
        "type" : "local",
        "baseDir" : "/Users/zzy/Documents/zzy/software/druid-0.12.3/quickstart",
        "filter" : "wikiticker-2015-09-12-sampled.json.gz"
      }
    },
    "dataSchema" : {
      "dataSource" : "wikiticker",
      "granularitySpec" : {
        "type" : "uniform",
        "segmentGranularity" : "day",
        "queryGranularity" : "none",
        "intervals" : ["2015-09-12/2015-09-13"]
      },
      "parser" : {
        "type" : "string",
        "parseSpec" : {
          "format" : "json",
          "dimensionsSpec" : {
            "dimensions" : [
              "channel",
              "cityName",
              "comment",
              "countryIsoCode",
              "countryName",
              "isAnonymous",
              "isMinor",
              "isNew",
              "isRobot",
              "isUnpatrolled",
              "metroCode",
              "namespace",
              "page",
              "regionIsoCode",
              "regionName",
              "user"
            ]
          },
          "timestampSpec" : {
            "format" : "auto",
            "column" : "time"
          }
        }
      },
      "metricsSpec" : [
        {
          "name" : "count",
          "type" : "count"
        },
        {
          "name" : "added",
          "type" : "longSum",
          "fieldName" : "added"
        },
        {
          "name" : "deleted",
          "type" : "longSum",
          "fieldName" : "deleted"
        },
        {
          "name" : "delta",
          "type" : "longSum",
          "fieldName" : "delta"
        },
        {
          "name" : "user_unique",
          "type" : "hyperUnique",
          "fieldName" : "user"
        }
      ]
    },
    "tuningConfig" : {
      "type" : "index",
      "partitionsSpec" : {
        "type" : "hashed",
        "targetPartitionSize" : 5000000
      },
      "jobProperties" : {}
    }
  }
}

Note: baseDir should preferably be an absolute path.
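
A relative or mistyped baseDir only shows up after the task has been submitted and has failed. As a rough sketch, the firehose section of the spec can be checked locally first. The function name check_local_firehose is made up for illustration, and the wildcard matching is an approximation: it looks only at the top level of baseDir and uses Python's fnmatch rather than Druid's own matcher.

```python
import fnmatch
import os

def check_local_firehose(spec):
    """Return a list of problems with the local firehose of an index task spec."""
    firehose = spec["spec"]["ioConfig"]["firehose"]
    base_dir = firehose.get("baseDir", "")
    pattern = firehose.get("filter", "*")
    if not os.path.isabs(base_dir):
        return [f"baseDir is not absolute: {base_dir!r}"]
    if not os.path.isdir(base_dir):
        return [f"baseDir does not exist: {base_dir!r}"]
    # fnmatch only approximates the wildcard matching Druid applies to "filter".
    if not fnmatch.filter(os.listdir(base_dir), pattern):
        return [f"no file in {base_dir!r} matches {pattern!r}"]
    return []

# Example: a relative baseDir is reported before anything is submitted.
example = {"spec": {"ioConfig": {"firehose": {
    "type": "local",
    "baseDir": "quickstart",
    "filter": "wikiticker-2015-09-12-sampled.json.gz",
}}}}
print(check_local_firehose(example))
```

For a spec stored in a file, load it with json.load and pass the resulting dictionary to the function.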

Submit the task with curl:
curl -X 'POST' -H 'Content-Type:application/json' -d @quickstart/wikiticker-index_local.json localhost:8090/druid/indexer/v1/task

The terminal prints:

➜  druid-0.12.3 curl -X 'POST' -H 'Content-Type:application/json' -d @quickstart/wikiticker-index_local.json localhost:8090/druid/indexer/v1/task
{"task":"index_wikiticker_2018-11-27T03:33:42.307Z"}%
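
The task ID in this response can also be used to poll the overlord over HTTP instead of refreshing a console. The sketch below is a hedged example: it assumes the overlord's task status endpoint at /druid/indexer/v1/task/{taskId}/status and assumes that the response nests the state under status.status; the helper names and the polling intervals are made up for illustration.

```python
import json
import time
import urllib.parse
import urllib.request

OVERLORD = "http://localhost:8090"  # assumption: overlord on the port used above

def status_url(task_id, overlord=OVERLORD):
    """URL of the overlord's status endpoint for one task."""
    quoted = urllib.parse.quote(task_id, safe="")
    return f"{overlord}/druid/indexer/v1/task/{quoted}/status"

def task_state(payload):
    """Extract the state string (e.g. RUNNING, SUCCESS, FAILED) from a status response."""
    # Assumed response shape: {"task": ..., "status": {"status": ..., ...}}
    return payload["status"]["status"]

def wait_for_task(task_id, poll_seconds=5, timeout_seconds=600):
    """Poll until the task reaches SUCCESS or FAILED, or raise TimeoutError."""
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        with urllib.request.urlopen(status_url(task_id)) as resp:
            state = task_state(json.load(resp))
        if state in ("SUCCESS", "FAILED"):
            return state
        time.sleep(poll_seconds)
    raise TimeoutError(f"task {task_id} still running after {timeout_seconds}s")
```

Calling wait_for_task with the ID printed by curl returns the final state once the task finishes.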

Check the task status in the overlord console at http://localhost:8090/console.html

druid-007.png

The task status is FAILED.

druid-008.png
druid-009.png

The task log shows the following error:

2018-11-27T03:10:43,416 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[AbstractTask{id='index_wikiticker_2018-11-27T03:10:39.850Z', groupId='index_wikiticker_2018-11-27T03:10:39.850Z', taskResource=TaskResource{availabilityGroup='index_wikiticker_2018-11-27T03:10:39.850Z', requiredCapacity=1}, dataSource='wikiticker', context={}}]
java.lang.IllegalStateException: Failed to create directory within 10000 attempts (tried 1543288243332-0 to 1543288243332-9999)
  at com.google.common.io.Files.createTempDir(Files.java:600) ~[guava-16.0.1.jar:?]
  at io.druid.segment.indexing.RealtimeTuningConfig.createNewBasePersistDirectory(RealtimeTuningConfig.java:58) ~[druid-server-0.12.3.jar:0.12.3]
  at io.druid.segment.indexing.RealtimeTuningConfig.makeDefaultTuningConfig(RealtimeTuningConfig.java:68) ~[druid-server-0.12.3.jar:0.12.3]
  at io.druid.segment.realtime.FireDepartment.<init>(FireDepartment.java:62) ~[druid-server-0.12.3.jar:0.12.3]
  at io.druid.indexing.common.task.IndexTask.generateAndPublishSegments(IndexTask.java:572) ~[druid-indexing-service-0.12.3.jar:0.12.3]
  at io.druid.indexing.common.task.IndexTask.run(IndexTask.java:264) ~[druid-indexing-service-0.12.3.jar:0.12.3]
  at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:444) [druid-indexing-service-0.12.3.jar:0.12.3]
  at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:416) [druid-indexing-service-0.12.3.jar:0.12.3]
  at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_171]
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_171]
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_171]
  at java.lang.Thread.run(Thread.java:748) [?:1.8.0_171]
2018-11-27T03:10:43,420 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_wikiticker_2018-11-27T03:10:39.850Z] status changed to [FAILED].
2018-11-27T03:10:43,423 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_wikiticker_2018-11-27T03:10:39.850Z",
  "status" : "FAILED",
  "duration" : 109
}

Fix: create the temporary directory manually, in this case var/tmp, from the Druid installation directory:

mkdir -p var/tmp

➜  druid-0.12.3 ll var/tmp
total 0
drwxr-xr-x  2 zzy  staff  64 Nov 27 11:33 1543289625953-0
➜  druid-0.12.3 pwd
/Users/zzy/Documents/zzy/software/druid-0.12.3

Note: create it under the Druid directory, not under the filesystem root!
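
The stack trace fails inside Guava's Files.createTempDir, which creates its directory under the JVM's java.io.tmpdir. The sketch below assumes that the service's jvm.config sets that property with a flag such as -Djava.io.tmpdir=var/tmp, which would explain why the directory has to exist relative to the Druid directory. That flag is an assumption to verify against your own jvm.config, and the helper name ensure_tmpdir is hypothetical.

```python
from pathlib import Path

def ensure_tmpdir(jvm_config, druid_home="."):
    """Create the directory named by -Djava.io.tmpdir in jvm_config, if any.

    Returns the directory path, or None when the flag is absent.
    """
    prefix = "-Djava.io.tmpdir="
    for flag in Path(jvm_config).read_text().split():
        if flag.startswith(prefix):
            # A relative value is resolved against the directory java is started from.
            tmpdir = Path(druid_home) / flag[len(prefix):]
            tmpdir.mkdir(parents=True, exist_ok=True)
            return tmpdir
    return None
```

Running it once per service configuration before starting the processes avoids creating the directory by hand after a task has already failed.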

Once the local data has loaded successfully, the coordinator page shows a new datasource named wikiticker.

druid-010.png

Query the data:

curl -L -H'Content-Type: application/json' -XPOST --data-binary @quickstart/wikiticker-top-pages.json 'http://localhost:8082/druid/v2/?pretty'

The response:

➜  druid-0.12.3 curl -L -H'Content-Type: application/json' -XPOST --data-binary @quickstart/wikiticker-top-pages.json http://localhost:8082/druid/v2/\?pretty
[ {
  "timestamp" : "2015-09-12T00:46:58.771Z",
  "result" : [ {
    "edits" : 33,
    "page" : "Wikipedia:Vandalismusmeldung"
  }, {
    "edits" : 28,
    "page" : "User:Cyde/List of candidates for speedy deletion/Subpage"
  }, {
    "edits" : 27,
    "page" : "Jeremy Corbyn"
  }, {
    "edits" : 21,
    "page" : "Wikipedia:Administrators' noticeboard/Incidents"
  }, {
    "edits" : 20,
    "page" : "Flavia Pennetta"
  }, {
    "edits" : 18,
    "page" : "Total Drama Presents: The Ridonculous Race"
  }, {
    "edits" : 18,
    "page" : "User talk:Dudeperson176123"
  }, {
    "edits" : 18,
    "page" : "Wikipédia:Le Bistro/12 septembre 2015"
  }, {
    "edits" : 17,
    "page" : "Wikipedia:In the news/Candidates"
  }, {
    "edits" : 17,
    "page" : "Wikipedia:Requests for page protection"
  }, {
    "edits" : 16,
    "page" : "Utente:Giulio Mainardi/Sandbox"
  }, {
    "edits" : 16,
    "page" : "Wikipedia:Administrator intervention against vandalism"
  }, {
    "edits" : 15,
    "page" : "Anthony Martial"
  }, {
    "edits" : 13,
    "page" : "Template talk:Connected contributor"
  }, {
    "edits" : 12,
    "page" : "Chronologie de la Lorraine"
  }, {
    "edits" : 12,
    "page" : "Wikipedia:Files for deletion/2015 September 12"
  }, {
    "edits" : 12,
    "page" : "Гомосексуальный образ жизни"
  }, {
    "edits" : 11,
    "page" : "Constructive vote of no confidence"
  }, {
    "edits" : 11,
    "page" : "Homo naledi"
  }, {
    "edits" : 11,
    "page" : "Kim Davis (county clerk)"
  }, {
    "edits" : 11,
    "page" : "Vorlage:Revert-Statistik"
  }, {
    "edits" : 11,
    "page" : "Конституция Японской империи"
  }, {
    "edits" : 10,
    "page" : "The Naked Brothers Band (TV series)"
  }, {
    "edits" : 10,
    "page" : "User talk:Buster40004"
  }, {
    "edits" : 10,
    "page" : "User:Valmir144/sandbox"
  } ]
} ]%
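
The contents of quickstart/wikiticker-top-pages.json are not shown in this article. As an assumption, a native topN query of roughly the following shape would produce a result like the one above: the 25 pages with the highest summed count metric, reported as edits. The Python sketch builds that query and posts it to the broker; the function names are made up for illustration.

```python
import json
import urllib.request

def top_pages_query(datasource="wikiticker", threshold=25):
    """Native topN query: pages ranked by the sum of the ingested count metric."""
    return {
        "queryType": "topN",
        "dataSource": datasource,
        "intervals": ["2015-09-12/2015-09-13"],
        "granularity": "all",
        "dimension": "page",
        "metric": "edits",
        "threshold": threshold,
        "aggregations": [
            {"type": "longSum", "name": "edits", "fieldName": "count"}
        ],
    }

def post_query(query, broker="http://localhost:8082"):
    """POST a native query to the broker and return the decoded JSON result."""
    req = urllib.request.Request(
        f"{broker}/druid/v2/?pretty",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The aggregation sums the count metric defined in metricsSpec rather than counting stored rows, so the result stays correct if rollup has merged several edits into one row.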

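Run a Druid SQL query. Note that the query below reads from a table named wikipedia, while the data imported above is in the wikiticker datasource; use the name of your own datasource.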

SELECT page, COUNT(*) AS Edits FROM wikipedia WHERE "__time" BETWEEN TIMESTAMP '2015-09-12 00:00:00' AND TIMESTAMP '2015-09-13 00:00:00' GROUP BY page ORDER BY Edits DESC LIMIT 10;
cat quickstart/wikipedia-top-pages-sql.json
{
  "query":"SELECT page, COUNT(*) AS Edits FROM wikipedia WHERE \"__time\" BETWEEN TIMESTAMP '2015-09-12 00:00:00' AND TIMESTAMP '2015-09-13 00:00:00' GROUP BY page ORDER BY Edits DESC LIMIT 10"
}

Run the command:

curl -X 'POST' -H 'Content-Type:application/json' -d @quickstart/wikipedia-top-pages-sql.json http://localhost:8082/druid/v2/sql

Result:

[{"page":"Wikipedia:Vandalismusmeldung","Edits":33},
{"page":"User:Cyde/List of candidates for speedy deletion/Subpage","Edits":28},
{"page":"Jeremy Corbyn","Edits":27},
{"page":"Wikipedia:Administrators' noticeboard/Incidents","Edits":21},
{"page":"Flavia Pennetta","Edits":20},
{"page":"Total Drama Presents: The Ridonculous Race","Edits":18},
{"page":"User talk:Dudeperson176123","Edits":18},
{"page":"Wikipédia:Le Bistro/12 septembre 2015","Edits":18},
{"page":"Wikipedia:In the news/Candidates","Edits":17},
{"page":"Wikipedia:Requests for page protection","Edits":17}]
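
The request file shown earlier escapes the double quotes around __time by hand. As a small sketch, the same body can be generated with Python's json module, which applies the escaping automatically. The table name is kept exactly as in the article's query, and the helper name sql_request_body is hypothetical.

```python
import json

SQL = (
    "SELECT page, COUNT(*) AS Edits FROM wikipedia "
    'WHERE "__time" BETWEEN '
    "TIMESTAMP '2015-09-12 00:00:00' AND TIMESTAMP '2015-09-13 00:00:00' "
    "GROUP BY page ORDER BY Edits DESC LIMIT 10"
)

def sql_request_body(query):
    """Serialize a SQL string into a JSON body with a single "query" field."""
    return json.dumps({"query": query})

print(sql_request_body(SQL))
```

The printed text can be saved to a file and posted with the same curl command as above.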

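For more queries, see the official site's Tutorial: Querying data.

That completes the single-node Druid installation and the offline data import. More articles on Druid will follow; feel free to follow along and exchange ideas.

References: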

http://yangyangmyself.iteye.com/blog/2321487

http://druid.io/docs/latest/tutorials/index.html

https://blog.csdn.net/paicmis/article/details/72625404

imply
