【Sqoop】Sqoop 1.4.7 安装

一、Sqoop 介绍

Sqoop 是一款用于 hadoop 和关系型数据库之间数据导入导出的工具。可以通过 Sqoop 把数据从数据库(比如 mysql,oracle)导入到 hdfs 中;也可以把数据从 hdfs 中导出到关系型数据库中。

通过将 Sqoop 的操作命令转化为 Hadoop 的 MapReduce 作业(通常只涉及到 Map 任务)进行导入导出,即 Sqoop 生成的 Job 主要是并发运行 MapTask 实现数据并行传输以提升数据传送速度和效率,如果使用 Shell 脚本来实现多线程数据传送则存在很大的难度。

Sqoop2(Sqoop1.99.7)需要在 Hadoop 安装目录下的配置文件中设置代理,属于重量级嵌入安装,文中使用 Sqoop1(Sqoop1.4.7)。

二、Sqoop 安装

安装前提:

已经具备了Java和hadoop的环境。

下载安装包:

http://mirror.bit.edu.cn/apache/sqoop/1.4.7/

解压:

<pre spellcheck="false" class="md-fences md-end-block ty-contain-cm modeLoaded" lang="shell" cid="n13" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: var(--monospace); --select-text-font-color: #fff; font-size: 0.9rem; line-height: 1.71429em; display: block; break-inside: avoid; text-align: left; white-space: normal; background-image: inherit; background-position: inherit; background-size: inherit; background-repeat: inherit; background-attachment: inherit; background-origin: inherit; background-clip: inherit; background-color: rgb(218, 218, 218); position: relative !important; margin-bottom: 3em; margin-left: 2em; padding-left: 1ch; padding-right: 1ch; width: inherit; color: rgb(31, 9, 9); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">sudo tar -zxv -f sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz
sudo mv sqoop-1.4.7.bin__hadoop-2.6.0 sqoop</pre>

配置环境变量:

<pre spellcheck="false" class="md-fences md-end-block ty-contain-cm modeLoaded" lang="sh" cid="n15" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: var(--monospace); --select-text-font-color: #fff; font-size: 0.9rem; line-height: 1.71429em; display: block; break-inside: avoid; text-align: left; white-space: normal; background-image: inherit; background-position: inherit; background-size: inherit; background-repeat: inherit; background-attachment: inherit; background-origin: inherit; background-clip: inherit; background-color: rgb(218, 218, 218); position: relative !important; margin-bottom: 3em; margin-left: 2em; padding-left: 1ch; padding-right: 1ch; width: inherit; color: rgb(31, 9, 9); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">export SQOOP_HOME=/opt/sqoop
export PATH=PATH:SQOOP_HOME/bin</pre>

使配置生效:

<pre spellcheck="false" class="md-fences md-end-block ty-contain-cm modeLoaded" lang="shell" cid="n17" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: var(--monospace); --select-text-font-color: #fff; font-size: 0.9rem; line-height: 1.71429em; display: block; break-inside: avoid; text-align: left; white-space: normal; background-image: inherit; background-position: inherit; background-size: inherit; background-repeat: inherit; background-attachment: inherit; background-origin: inherit; background-clip: inherit; background-color: rgb(218, 218, 218); position: relative !important; margin-bottom: 3em; margin-left: 2em; padding-left: 1ch; padding-right: 1ch; width: inherit; color: rgb(31, 9, 9); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">source /etc/profile</pre>

上传 mysql 驱动至 Sqoop lib 下:

<pre spellcheck="false" class="md-fences md-end-block ty-contain-cm modeLoaded" lang="shell" cid="n19" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: var(--monospace); --select-text-font-color: #fff; font-size: 0.9rem; line-height: 1.71429em; display: block; break-inside: avoid; text-align: left; white-space: normal; background-image: inherit; background-position: inherit; background-size: inherit; background-repeat: inherit; background-attachment: inherit; background-origin: inherit; background-clip: inherit; background-color: rgb(218, 218, 218); position: relative !important; margin-bottom: 3em; margin-left: 2em; padding-left: 1ch; padding-right: 1ch; width: inherit; color: rgb(31, 9, 9); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">cp mysql-connector-java-5.1.46.jar /opt/sqoop/lib/</pre>

修改 Sqoop 配置:

<pre spellcheck="false" class="md-fences md-end-block ty-contain-cm modeLoaded" lang="shell" cid="n21" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: var(--monospace); --select-text-font-color: #fff; font-size: 0.9rem; line-height: 1.71429em; display: block; break-inside: avoid; text-align: left; white-space: normal; background-image: inherit; background-position: inherit; background-size: inherit; background-repeat: inherit; background-attachment: inherit; background-origin: inherit; background-clip: inherit; background-color: rgb(218, 218, 218); position: relative !important; margin-bottom: 3em; margin-left: 2em; padding-left: 1ch; padding-right: 1ch; width: inherit; color: rgb(31, 9, 9); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">cd /opt/sqoop/conf/
cp sqoop-env-template.sh sqoop-env.sh
vim sqoop-env.sh</pre>

<pre spellcheck="false" class="md-fences md-end-block ty-contain-cm modeLoaded" lang="sh" cid="n22" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: var(--monospace); --select-text-font-color: #fff; font-size: 0.9rem; line-height: 1.71429em; display: block; break-inside: avoid; text-align: left; white-space: normal; background-image: inherit; background-position: inherit; background-size: inherit; background-repeat: inherit; background-attachment: inherit; background-origin: inherit; background-clip: inherit; background-color: rgb(218, 218, 218); position: relative !important; margin-bottom: 3em; margin-left: 2em; padding-left: 1ch; padding-right: 1ch; width: inherit; color: rgb(31, 9, 9); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">#Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/opt/hadoop

Set path to where hadoop-*-core.jar is available

export HADOOP_MAPRED_HOME=/opt/hadoop

set the path to where bin/hbase is available

export HBASE_HOME=/opt/hbase

Set the path to where bin/hive is available

export HIVE_HOME=/opt/hive

Set the path for where zookeper config dir is

export ZOOCFGDIR=/opt/zookeeper/conf</pre>

注意:这里的 HADOOP_COMMON_HOME 和 HADOOP_MAPRED_HOME 配成一个就行了,因为现在安装的 hadoop 的开源的版本,但是在 hadoop 的商业版本中这两个配置是分别安装在不同的目录下的。

三、验证安装是否成功

执行 sqoop version,看到版本信息,即安装成功。

<pre spellcheck="false" class="md-fences md-end-block ty-contain-cm modeLoaded" lang="" cid="n26" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: var(--monospace); --select-text-font-color: #fff; font-size: 0.9rem; line-height: 1.71429em; display: block; break-inside: avoid; text-align: left; white-space: normal; background-image: inherit; background-position: inherit; background-size: inherit; background-repeat: inherit; background-attachment: inherit; background-origin: inherit; background-clip: inherit; background-color: rgb(218, 218, 218); position: relative !important; margin-bottom: 3em; margin-left: 2em; padding-left: 1ch; padding-right: 1ch; width: inherit; color: rgb(31, 9, 9); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
Sqoop 1.4.7
git commit id 2328971411f57f0cb683dfb79d19d4d19d185dd8
Compiled by maugli on Thu Dec 21 15:59:58 STD 2017</pre>

四、测试

4.1、mysql 整个表导入 hdfs

运行下面的指令前确保 mysql 中存在 test db 和 mysql_input table。

<pre spellcheck="false" class="md-fences md-end-block ty-contain-cm modeLoaded" lang="shell" cid="n30" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: var(--monospace); --select-text-font-color: #fff; font-size: 0.9rem; line-height: 1.71429em; display: block; break-inside: avoid; text-align: left; white-space: normal; background-image: inherit; background-position: inherit; background-size: inherit; background-repeat: inherit; background-attachment: inherit; background-origin: inherit; background-clip: inherit; background-color: rgb(218, 218, 218); position: relative !important; margin-bottom: 3em; margin-left: 2em; padding-left: 1ch; padding-right: 1ch; width: inherit; color: rgb(31, 9, 9); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">sqoop import
--connect jdbc:mysql://master:3306/test
--username root
--password xxx
--table mysql_input
--target-dir /user/sqoop/data_from_mysql</pre>

运行后有报错:

<pre spellcheck="false" class="md-fences md-end-block ty-contain-cm modeLoaded" lang="" cid="n32" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: var(--monospace); --select-text-font-color: #fff; font-size: 0.9rem; line-height: 1.71429em; display: block; break-inside: avoid; text-align: left; white-space: normal; background-image: inherit; background-position: inherit; background-size: inherit; background-repeat: inherit; background-attachment: inherit; background-origin: inherit; background-clip: inherit; background-color: rgb(218, 218, 218); position: relative !important; margin-bottom: 3em; margin-left: 2em; padding-left: 1ch; padding-right: 1ch; width: inherit; color: rgb(31, 9, 9); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">ERROR tool.ImportTool: Import failed: No primary key could be found for table mysql_input. Please specify one with --split-by or perform a sequential import with '-m 1'.</pre>

因为没有主键,又没有指定切分的列,Sqoop 不知道用多少个 map 进行数据同步,可以通过设置 split-by 和 -m 1来规避。

当-m 设置的值大于1时,split-by 必须设置字段。

4.2、mysql 表按查询条件导入 hdfs

<pre spellcheck="false" class="md-fences md-end-block ty-contain-cm modeLoaded" lang="shell" cid="n36" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: var(--monospace); --select-text-font-color: #fff; font-size: 0.9rem; line-height: 1.71429em; display: block; break-inside: avoid; text-align: left; white-space: normal; background-image: inherit; background-position: inherit; background-size: inherit; background-repeat: inherit; background-attachment: inherit; background-origin: inherit; background-clip: inherit; background-color: rgb(218, 218, 218); position: relative !important; margin-bottom: 3em; margin-left: 2em; padding-left: 1ch; padding-right: 1ch; width: inherit; color: rgb(31, 9, 9); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">sqoop import
--connect jdbc:mysql://master:3306/test
--username root
--password xxx
--query 'select * from mysql_input where size>1 and $CONDITIONS'
--target-dir /user/sqoop/data_from_mysql
-m 1</pre>

©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 219,928评论 6 509
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 93,748评论 3 396
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 166,282评论 0 357
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 59,065评论 1 295
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 68,101评论 6 395
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 51,855评论 1 308
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 40,521评论 3 420
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 39,414评论 0 276
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 45,931评论 1 319
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 38,053评论 3 340
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 40,191评论 1 352
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 35,873评论 5 347
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 41,529评论 3 331
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 32,074评论 0 23
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 33,188评论 1 272
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 48,491评论 3 375
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 45,173评论 2 357

推荐阅读更多精彩内容