DataHub 0.8.14.1: Installation Troubleshooting and Data Ingestion

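Whenever you create a new yml file, be sure to run it through a YAML format validator. When I checked port 8080, it also appeared to return an error: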

{"exceptionClass":"com.linkedin.restli.server.RestLiServiceException","stackTrace":"com.linkedin.restli.server.RestLiServiceException [HTTP Status:404]\n\tat com.linkedin.restli.server.RestLiServiceException.fromThrowable(RestLiServiceException.java:315)\n\tat com.linkedin.restli.server.BaseRestLiServer.buildPreRoutingError(BaseRestLiServer.java:158)\n\tat com.linkedin.restli.server.RestRestLiServer.buildPreRoutingRestException(RestRestLiServer.java:203)\n\tat com.linkedin.restli.server.RestRestLiServer.handleResourceRequest(RestRestLiServer.java:177)\n\tat com.linkedin.restli.server.RestRestLiServer.doHandleRequest(RestRestLiServer.java:164)\n\tat com.linkedin.restli.server.RestRestLiServer.handleRequest(RestRestLiServer.java:120)\n\tat com.linkedin.restli.server.RestLiServer.handleRequest(RestLiServer.java:132)\n\tat com.linkedin.restli.server.DelegatingTransportDispatcher.handleRestRequest(DelegatingTransportDispatcher.java:70)\n\tat com.linkedin.r2.filter.transport.DispatcherRequestFilter.onRestRequest(DispatcherRequestFilter.java:70)\n\tat com.linkedin.r2.filter.TimedRestFilter.onRestRequest(TimedRestFilter.java:72)\n\tat com.linkedin.r2.filter.FilterChainIterator$FilterChainRestIterator.doOnRequest(FilterChainIterator.java:146)\n\tat com.linkedin.r2.filter.FilterChainIterator$FilterChainRestIterator.doOnRequest(FilterChainIterator.java:132)\n\tat com.linkedin.r2.filter.FilterChainIterator.onRequest(FilterChainIterator.java:62)\n\tat com.linkedin.r2.filter.TimedNextFilter.onRequest(TimedNextFilter.java:55)\n\tat com.linkedin.r2.filter.transport.ServerQueryTunnelFilter.onRestRequest(ServerQueryTunnelFilter.java:58)\n\tat com.linkedin.r2.filter.TimedRestFilter.onRestRequest(TimedRestFilter.java:72)\n\tat com.linkedin.r2.filter.FilterChainIterator$FilterChainRestIterator.doOnRequest(FilterChainIterator.java:146)\n\tat com.linkedin.r2.filter.FilterChainIterator$FilterChainRestIterator.doOnRequest(FilterChainIterator.java:132)\n\tat 
com.linkedin.r2.filter.FilterChainIterator.onRequest(FilterChainIterator.java:62)\n\tat com.linkedin.r2.filter.TimedNextFilter.onRequest(TimedNextFilter.java:55)\n\tat com.linkedin.r2.filter.message.rest.RestFilter.onRestRequest(RestFilter.java:50)\n\tat com.linkedin.r2.filter.TimedRestFilter.onRestRequest(TimedRestFilter.java:72)\n\tat com.linkedin.r2.filter.FilterChainIterator$FilterChainRestIterator.doOnRequest(FilterChainIterator.java:146)\n\tat com.linkedin.r2.filter.FilterChainIterator$FilterChainRestIterator.doOnRequest(FilterChainIterator.java:132)\n\tat com.linkedin.r2.filter.FilterChainIterator.onRequest(FilterChainIterator.java:62)\n\tat com.linkedin.r2.filter.FilterChainImpl.onRestRequest(FilterChainImpl.java:96)\n\tat com.linkedin.r2.filter.transport.FilterChainDispatcher.handleRestRequest(FilterChainDispatcher.java:75)\n\tat com.linkedin.r2.util.finalizer.RequestFinalizerDispatcher.handleRestRequest(RequestFinalizerDispatcher.java:61)\n\tat com.linkedin.r2.transport.http.server.HttpDispatcher.handleRequest(HttpDispatcher.java:101)\n\tat com.linkedin.r2.transport.http.server.AbstractR2Servlet.service(AbstractR2Servlet.java:105)\n\tat javax.servlet.http.HttpServlet.service(HttpServlet.java:790)\n\tat com.linkedin.restli.server.spring.ParallelRestliHttpRequestHandler.handleRequest(ParallelRestliHttpRequestHandler.java:63)\n\tat org.springframework.web.context.support.HttpRequestHandlerServlet.service(HttpRequestHandlerServlet.java:73)\n\tat javax.servlet.http.HttpServlet.service(HttpServlet.java:790)\n\tat org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:852)\n\tat org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:544)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\n\tat org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:536)\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)\n\tat 
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)\n\tat org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1581)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1307)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)\n\tat org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:482)\n\tat org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1549)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1204)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n\tat org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221)\n\tat org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)\n\tat org.eclipse.jetty.server.Server.handle(Server.java:494)\n\tat org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:374)\n\tat org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:268)\n\tat org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)\n\tat org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)\n\tat org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)\n\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)\n\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)\n\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)\n\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)\n\tat 
org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:367)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:782)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:918)\n\tat java.lang.Thread.run(Thread.java:748)\nCaused by: com.linkedin.restli.server.RoutingException\n\tat com.linkedin.restli.internal.server.RestLiRouter.process(RestLiRouter.java:111)\n\tat com.linkedin.restli.server.BaseRestLiServer.getRoutingResult(BaseRestLiServer.java:139)\n\tat com.linkedin.restli.server.RestRestLiServer.handleResourceRequest(RestRestLiServer.java:173)\n\t... 62 more\n","status":404}
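One possible reading of the response above: the service answered with a JSON body and HTTP status 404, and the trace bottoms out in a RoutingException, so something was listening on the port and the requested path simply matched no Rest.li resource. Below is a minimal sketch, using only the Python standard library, that separates "the port answers with an HTTP error" from "nothing is listening". `probe` is a hypothetical helper name, and which URL to probe is an assumption.

```python
import urllib.error
import urllib.request


def probe(url, timeout=5.0):
    """Return (reachable, http_status) for a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return True, resp.status
    except urllib.error.HTTPError as err:
        # The server answered, just with an error status such as 404.
        # HTTPError is a subclass of URLError, so it must be caught first.
        return True, err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout: nothing usable answered.
        return False, None
```

Calling `probe("http://localhost:8080")` (the address used as the sink server in this post) and getting `(True, 404)` back would suggest the container is up and only the path is wrong; `(False, None)` would point at the container or the port mapping instead.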

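I stopped digging at this point. File-based ingestion works fine, but ingesting from sources such as MySQL seemed to fail because of the Docker problem above, and I did not know how to debug it. Packing up for now.

Loading the sample data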

python3 -m datahub docker ingest-sample-data

Loading MySQL data

python3 -m datahub ingest -c /app/datahub_yml/test_mysql.yml

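This failed, but the cause appears to be unrelated to the error above: it was a yml formatting problem (see the attached screenshot). Pay close attention to how the yml file is parsed.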

 python3 -m datahub ingest -c /app/datahub_yml/test_mysql.yml
10 validation errors for PipelineConfig
source
  none is not an allowed value (type=type_error.none.not_allowed)
allow
  extra fields not permitted (type=value_error.extra)
config
  extra fields not permitted (type=value_error.extra)
database
  extra fields not permitted (type=value_error.extra)
database_alias
  extra fields not permitted (type=value_error.extra)
host_port
  extra fields not permitted (type=value_error.extra)
password
  extra fields not permitted (type=value_error.extra)
schema_pattern
  extra fields not permitted (type=value_error.extra)
type
  extra fields not permitted (type=value_error.extra)
username
  extra fields not permitted (type=value_error.extra)
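The error list above is consistent with a recipe whose nesting was lost: `source` parsed as empty, and the keys that belong beneath it (`type`, `config`, `host_port`, and so on) landed at the top level, where they are rejected as extra fields. A rough pre-flight check along those lines, as a sketch only: `check_recipe` is a hypothetical helper and not part of the datahub CLI, and the set of allowed top-level keys is an assumption. It takes the already-parsed mapping, so any YAML loader (for example PyYAML's `yaml.safe_load`) can feed it.

```python
# Assumed set of top-level recipe keys; adjust to the datahub version in use.
TOP_LEVEL_KEYS = {"source", "sink", "transformers", "run_id"}


def check_recipe(recipe):
    """Return a list of human-readable problems found in a parsed recipe."""
    if not isinstance(recipe, dict):
        return ["recipe must be a mapping at the top level"]
    problems = []
    for section in ("source", "sink"):
        value = recipe.get(section)
        if value is None:
            problems.append(f"{section}: empty or missing (check indentation)")
        elif not isinstance(value, dict) or "type" not in value:
            problems.append(f"{section}: expected a mapping with a 'type' key")
    for key in sorted(set(recipe) - TOP_LEVEL_KEYS, key=str):
        problems.append(f"{key}: unexpected top-level key (should it be nested?)")
    return problems


if __name__ == "__main__":
    # What a recipe can look like after parsing when the indentation was lost.
    flattened = {
        "source": None,
        "type": "mysql",
        "config": None,
        "username": "datahub",
        "password": "datahub",
        "sink": {"type": "datahub-rest",
                 "config": {"server": "http://localhost:8080"}},
    }
    for line in check_recipe(flattened):
        print(line)
```

This does not replace the validation that `datahub ingest` performs; it only flags the one mistake that bit me here before the CLI is invoked.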

The official template, which targets the default built-in MySQL:

---
# see https://datahubproject.io/docs/metadata-ingestion/source_docs/mysql for complete documentation
source:
  type: "mysql"
  config:
    username: datahub
    password: datahub

# see https://datahubproject.io/docs/metadata-ingestion/sink_docs/datahub for complete documentation
sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:8080"
[Screenshot: 微信截图_20210923162800.png]
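Since indentation was what broke my yml, one way to sidestep it is to build the recipe as a nested Python dict and hand it to the ingestion framework's `Pipeline` class directly. The sketch below mirrors the template's nesting; `build_mysql_recipe` and `run_recipe` are hypothetical helper names, the config keys are the ones that appear in the error output earlier in this post, and any concrete host, database, or credentials passed in are assumptions about your setup.

```python
def build_mysql_recipe(host_port, database, username, password,
                       server="http://localhost:8080"):
    """Build a recipe dict with the same nesting as the yml template."""
    return {
        "source": {
            "type": "mysql",
            "config": {
                "host_port": host_port,
                "database": database,
                "username": username,
                "password": password,
            },
        },
        "sink": {
            "type": "datahub-rest",
            "config": {"server": server},
        },
    }


def run_recipe(recipe):
    """Run one ingestion; needs the datahub package with the mysql plugin."""
    # Imported lazily so the recipe can be built and inspected without datahub.
    from datahub.ingestion.run.pipeline import Pipeline

    pipeline = Pipeline.create(recipe)
    pipeline.run()
    pipeline.raise_from_status()
```

For example, `run_recipe(build_mysql_recipe("localhost:3306", "datahub", "datahub", "datahub"))` would target a local MySQL, assuming that address and those credentials match your instance.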

Deleting metadata

List the recent ingestion runs

python3 -m datahub ingest list-runs
No ~/.datahubenv file found, generating one for you...
+--------------------------------------+--------+---------------------+
| runId                                |   rows | created at          |
+======================================+========+=====================+
| befcb37e-1c44-11ec-997c-000c297f660f |   2948 | 2021-09-23 08:06:16 |
+--------------------------------------+--------+---------------------+
| 80d3f1c6-1c43-11ec-9382-000c297f660f |    250 | 2021-09-23 07:54:54 |
+--------------------------------------+--------+---------------------+
| no-run-id-provided                   |     19 | 2021-09-23 04:00:48 |
+--------------------------------------+--------+---------------------+

python3 -m datahub ingest rollback --run-id no-run-id-provided

python3 -m datahub ingest rollback --run-id no-run-id-provided
This will permanently delete data from DataHub. Do you want to continue? [y/N]: y
rolling back deletes the entities created by a run and reverts the updated aspects
this rollback deleted 0 entities and rolled back 19 aspects
showing first 19 of 19 aspects reverted by this run
+-------------------------------+------------------+---------------------+
| urn                           | aspect name      | created at          |
+===============================+==================+=====================+
| urn:li:dataPlatform:postgres  | dataPlatformInfo | 2021-09-23 04:00:47 |
+-------------------------------+------------------+---------------------+
| urn:li:dataPlatform:presto    | dataPlatformInfo | 2021-09-23 04:00:47 |
+-------------------------------+------------------+---------------------+
| urn:li:dataPlatform:teradata  | dataPlatformInfo | 2021-09-23 04:00:47 |
+-------------------------------+------------------+---------------------+
| urn:li:dataPlatform:voldemort | dataPlatformInfo | 2021-09-23 04:00:47 |
+-------------------------------+------------------+---------------------+
| urn:li:dataPlatform:snowflake | dataPlatformInfo | 2021-09-23 04:00:47 |
+-------------------------------+------------------+---------------------+
| urn:li:dataPlatform:redshift  | dataPlatformInfo | 2021-09-23 04:00:47 |
+-------------------------------+------------------+---------------------+
| urn:li:dataPlatform:mssql     | dataPlatformInfo | 2021-09-23 04:00:47 |
+-------------------------------+------------------+---------------------+
| urn:li:dataPlatform:bigquery  | dataPlatformInfo | 2021-09-23 04:00:47 |
+-------------------------------+------------------+---------------------+
| urn:li:dataPlatform:druid     | dataPlatformInfo | 2021-09-23 04:00:47 |
+-------------------------------+------------------+---------------------+
| urn:li:dataPlatform:looker    | dataPlatformInfo | 2021-09-23 04:00:47 |
+-------------------------------+------------------+---------------------+
| urn:li:dataPlatform:feast     | dataPlatformInfo | 2021-09-23 04:00:47 |
+-------------------------------+------------------+---------------------+
| urn:li:dataPlatform:sagemaker | dataPlatformInfo | 2021-09-23 04:00:47 |
+-------------------------------+------------------+---------------------+
| urn:li:dataPlatform:glue      | dataPlatformInfo | 2021-09-23 04:00:47 |
+-------------------------------+------------------+---------------------+
| urn:li:dataPlatform:redash    | dataPlatformInfo | 2021-09-23 04:00:48 |
+-------------------------------+------------------+---------------------+
| urn:li:dataPlatform:athena    | dataPlatformInfo | 2021-09-23 04:00:48 |
+-------------------------------+------------------+---------------------+
| urn:li:dataPlatform:mongodb   | dataPlatformInfo | 2021-09-23 04:00:47 |
+-------------------------------+------------------+---------------------+
| urn:li:dataPlatform:mysql     | dataPlatformInfo | 2021-09-23 04:00:47 |
+-------------------------------+------------------+---------------------+
| urn:li:dataPlatform:oracle    | dataPlatformInfo | 2021-09-23 04:00:47 |
+-------------------------------+------------------+---------------------+
| urn:li:dataPlatform:pinot     | dataPlatformInfo | 2021-09-23 04:00:47 |
+-------------------------------+------------------+---------------------+
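Once several runs have accumulated, picking the run id by eye gets tedious. A small sketch that parses table text in the shape printed by `list-runs` earlier in this post, so the ids can be filtered in a script. `parse_runs` is a hypothetical helper, and the three-column layout is an assumption based only on the output I saw; other datahub versions may print something different.

```python
def parse_runs(table_text):
    """Extract (run_id, rows, created_at) tuples from list-runs table text."""
    runs = []
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders and any non-table messages
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 3 or not cells[1].isdigit():
            continue  # skip the header row
        runs.append((cells[0], int(cells[1]), cells[2]))
    return runs
```

The table text could come from capturing the command's output, for example with `subprocess.run(..., capture_output=True, text=True)`; the rollback itself still asks for confirmation and deletes data permanently, so I would keep that step manual.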

The remaining problem is that Greenplum cannot be ingested, and I have not found the cause. MySQL now works.
