Importing data from MySQL into Hive with Sqoop

A project required syncing data from MySQL into Hive. I had used Sqoop for this before, so I'm recording the steps here for future reference.
The command is as follows:

sqoop import --connect jdbc:mysql://100.98.97.156:3306/volte_eop_prod \
  --username root --password 123456 \
  --table dw_wy_drop_customized_drilldown_table_daily \
  --direct --fields-terminated-by "\t" --lines-terminated-by "\n" \
  --delete-target-dir --hive-import --create-hive-table \
  --hive-database test --hive-table test1 \
  --num-mappers 1

Parameter breakdown
  • --delete-target-dir: if you re-run the import, the job would normally fail because the target HDFS path already exists. With this option Sqoop deletes the existing target directory before importing, so repeated imports no longer error out.

  • --num-mappers: the number of map tasks; tune it to your data volume and cluster.

  • --create-hive-table: creates the Hive table from the MySQL table structure (the job fails if the Hive table already exists).

  • --direct: a MySQL-specific option that speeds up the transfer by using the mysqldump fast path instead of plain JDBC.

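As the Sqoop log below warns, passing --password in plain text on the command line is insecure. A safer variant reads the password from a protected file with --password-file; here is a sketch, where /user/ericsson/mysql.pwd is a made-up path to a file containing only the password:

sqoop import --connect jdbc:mysql://100.98.97.156:3306/volte_eop_prod \
  --username root --password-file /user/ericsson/mysql.pwd \
  --table dw_wy_drop_customized_drilldown_table_daily \
  --direct --fields-terminated-by "\t" --lines-terminated-by "\n" \
  --delete-target-dir --hive-import --create-hive-table \
  --hive-database test --hive-table test1 \
  --num-mappers 1

The password file lives on HDFS by default (use a file:// prefix for a local file) and should be readable only by the submitting user; alternatively, -P prompts for the password interactively.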
Run output
[ericsson@dlbdn3 runtu]$ sqoop import --connect jdbc:mysql://100.98.97.156:3306/volte_eop_prod --username root --password 123456 --table dw_wy_drop_customized_drilldown_table_daily --direct  --fields-terminated-by "\t" --lines-terminated-by "\n" --delete-target-dir --hive-import --create-hive-table --hive-database test --hive-table test1 --num-mappers 1
Warning: /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/12/13 17:55:49 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.11.0
18/12/13 17:55:49 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/12/13 17:55:49 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
18/12/13 17:55:49 INFO tool.CodeGenTool: Beginning code generation
18/12/13 17:55:49 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dw_wy_drop_customized_drilldown_table_daily` AS t LIMIT 1
18/12/13 17:55:49 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dw_wy_drop_customized_drilldown_table_daily` AS t LIMIT 1
18/12/13 17:55:49 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce
Note: /tmp/sqoop-ericsson/compile/0f7e6d0f0c9ff6fc9fffb7d3d6412651/dw_wy_drop_customized_drilldown_table_daily.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
18/12/13 17:55:52 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-ericsson/compile/0f7e6d0f0c9ff6fc9fffb7d3d6412651/dw_wy_drop_customized_drilldown_table_daily.jar
18/12/13 17:55:54 INFO tool.ImportTool: Destination directory dw_wy_drop_customized_drilldown_table_daily deleted.
18/12/13 17:55:54 INFO manager.DirectMySQLManager: Beginning mysqldump fast path import
18/12/13 17:55:54 INFO mapreduce.ImportJobBase: Beginning import of dw_wy_drop_customized_drilldown_table_daily
18/12/13 17:55:54 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
18/12/13 17:55:54 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
18/12/13 17:55:54 INFO client.RMProxy: Connecting to ResourceManager at dlbdn3/192.168.123.4:8032
18/12/13 17:55:56 INFO db.DBInputFormat: Using read commited transaction isolation
18/12/13 17:55:56 INFO mapreduce.JobSubmitter: number of splits:1
18/12/13 17:55:57 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1543800485319_1069
18/12/13 17:55:57 INFO impl.YarnClientImpl: Submitted application application_1543800485319_1069
18/12/13 17:55:57 INFO mapreduce.Job: The url to track the job: http://dlbdn3:8088/proxy/application_1543800485319_1069/
18/12/13 17:55:57 INFO mapreduce.Job: Running job: job_1543800485319_1069
18/12/13 17:56:05 INFO mapreduce.Job: Job job_1543800485319_1069 running in uber mode : false
18/12/13 17:56:05 INFO mapreduce.Job:  map 0% reduce 0%
18/12/13 17:56:15 INFO mapreduce.Job:  map 100% reduce 0%
18/12/13 17:56:15 INFO mapreduce.Job: Job job_1543800485319_1069 completed successfully
18/12/13 17:56:15 INFO mapreduce.Job: Counters: 32
    File System Counters
        FILE: Number of bytes read=0
        FILE: Number of bytes written=153436
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=87
        HDFS: Number of bytes written=562
        HDFS: Number of read operations=4
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters 
        Launched map tasks=1
        Other local map tasks=1
        Total time spent by all maps in occupied slots (ms)=6137
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=6137
        Total vcore-milliseconds taken by all map tasks=6137
        Total megabyte-milliseconds taken by all map tasks=6284288
    Map-Reduce Framework
        Map input records=1
        Map output records=6
        Input split bytes=87
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=48
        CPU time spent (ms)=1510
        Physical memory (bytes) snapshot=328478720
        Virtual memory (bytes) snapshot=1694789632
        Total committed heap usage (bytes)=824180736
        Peak Map Physical memory (bytes)=328478720
        Peak Map Virtual memory (bytes)=1694789632
    File Input Format Counters 
        Bytes Read=0
    File Output Format Counters 
        Bytes Written=562
18/12/13 17:56:15 INFO mapreduce.ImportJobBase: Transferred 562 bytes in 21.2223 seconds (26.4815 bytes/sec)
18/12/13 17:56:15 INFO mapreduce.ImportJobBase: Retrieved 6 records.
18/12/13 17:56:15 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dw_wy_drop_customized_drilldown_table_daily` AS t LIMIT 1
18/12/13 17:56:15 WARN hive.TableDefWriter: Column DATE_TIME had to be cast to a less precise type in Hive
18/12/13 17:56:15 INFO hive.HiveImport: Loading uploaded data into Hive

Logging initialized using configuration in jar:file:/opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/jars/hive-common-1.1.0-cdh5.11.0.jar!/hive-log4j.properties
OK
Time taken: 3.832 seconds
Loading data to table test.test1
Table test.test1 stats: [numFiles=1, totalSize=562]
OK
Time taken: 0.691 seconds
[ericsson@dlbdn3 runtu]$ 
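One line in the output worth noting is the hive.TableDefWriter warning: the MySQL DATE_TIME column is mapped to STRING in Hive, which Sqoop reports as a cast to a less precise type. If you would rather store it as a Hive TIMESTAMP, the mapping can be overridden by appending the following to the import command (a sketch; verify that the imported values actually parse as Hive timestamps):

--map-column-hive DATE_TIME=timestamp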
With the --direct option, the log confirms that Sqoop takes the mysqldump fast path:

18/12/13 17:55:54 INFO manager.DirectMySQLManager: Beginning mysqldump fast path import

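The direct path relies on the mysqldump client binary being available on the nodes that run the map tasks; a quick sanity check on a node (a sketch, assuming shell access):

command -v mysqldump && mysqldump --version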
Sqoop also generates a Java class named after the table:
[ericsson@dlbdn3 runtu]$ ll
total 32
-rw-rw-r-- 1 ericsson ericsson 32198 Dec 13 17:37 dw_wy_drop_customized_drilldown_table_daily.java
[ericsson@dlbdn3 runtu]$
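Finally, a quick check that the data actually landed in Hive (a sketch run from the same gateway node; beeline against your HiveServer2 URL works just as well):

hive -e "SELECT COUNT(*) FROM test.test1; SELECT * FROM test.test1 LIMIT 5;"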