This article describes how to use Sqoop to import MySQL data into Hive and HBase.
Main contents:
- 1. Download
- 2. Installation
- 3. Common commands
1. Download
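The tarball used throughout this article is sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz. A minimal sketch of fetching it, assuming the Apache release archive layout (substitute a local mirror if preferred):
# Assumed download URL (Apache archive)
wget https://archive.apache.org/dist/sqoop/1.4.7/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz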
2. Installation
2.1. Extract the archive
tar -zxvf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz -C /opt/soft
2.2. Set environment variables
cd conf/
mv sqoop-env-template.sh sqoop-env.sh
vi sqoop-env.sh
Add the following configuration:
#Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/opt/soft/hadoop-2.7.3
#Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/opt/soft/hadoop-2.7.3
#set the path to where bin/hbase is available
export HBASE_HOME=/opt/soft/hbase-1.2.6
#Set the path to where bin/hive is available
export HIVE_HOME=/opt/soft/apache-hive-1.2.2-bin
#Set the path for where zookeeper config dir is
export ZOOCFGDIR=/opt/soft/zookeeper-3.4.10
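Optionally, put Sqoop itself on the PATH so it can be invoked as plain sqoop, as in section 3.3 below. This is an assumption rather than part of the original steps; add it to ~/.bashrc or /etc/profile:
# Hypothetical convenience entries; the path matches the install location used in this article
export SQOOP_HOME=/opt/soft/sqoop-1.4.7.bin__hadoop-2.6.0
export PATH=$PATH:$SQOOP_HOME/bin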
2.3. Copy the MySQL JDBC driver
cp /opt/soft-install/mysql-connector-java-5.1.32.jar /opt/soft/sqoop-1.4.7.bin__hadoop-2.6.0/lib/
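With the driver in place, a quick smoke test (a sketch; the host and credentials are the ones used later in this article) confirms that Sqoop can reach MySQL:
# Should list the databases on hadoop1 if the JDBC driver is loaded correctly
sqoop list-databases \
--connect jdbc:mysql://hadoop1:3306/mysql \
--username root \
--password root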
2.4. Copy the Hive jars
Sqoop needs these Hive classes on its own classpath for --hive-import to work; without them the Hive step typically fails with a ClassNotFoundException for classes such as HiveConf.
cp /opt/soft/apache-hive-1.2.2-bin/lib/hive-shims-0.23-1.2.2.jar /opt/soft/sqoop-1.4.7.bin__hadoop-2.6.0/lib/
cp /opt/soft/apache-hive-1.2.2-bin/lib/hive-common-1.2.2.jar /opt/soft/sqoop-1.4.7.bin__hadoop-2.6.0/lib/
cp /opt/soft/apache-hive-1.2.2-bin/lib/hive-shims-common-1.2.2.jar /opt/soft/sqoop-1.4.7.bin__hadoop-2.6.0/lib/
3. Common commands
3.1. List the available commands
[hadoop@hadoop1 sqoop-1.4.7.bin__hadoop-2.6.0]$ ./bin/sqoop help
Warning: /opt/soft/sqoop-1.4.7.bin__hadoop-2.6.0//../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /opt/soft/sqoop-1.4.7.bin__hadoop-2.6.0//../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
19/02/14 17:46:41 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
usage: sqoop COMMAND [ARGS]
Available commands:
codegen Generate code to interact with database records
create-hive-table Import a table definition into Hive
eval Evaluate a SQL statement and display the results
export Export an HDFS directory to a database table
help List available commands
import Import a table from a database to HDFS
import-all-tables Import tables from a database to HDFS
import-mainframe Import datasets from a mainframe server to HDFS
job Work with saved jobs
list-databases List available databases on a server
list-tables List available tables in a database
merge Merge results of incremental imports
metastore Run a standalone Sqoop metastore
version Display version information
See 'sqoop help COMMAND' for information on a specific command.
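As the output suggests, each subcommand has its own help text, for example:
./bin/sqoop help import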
3.2. Check the version
[hadoop@hadoop1 sqoop-1.4.7.bin__hadoop-2.6.0]$ ./bin/sqoop version
Warning: /opt/soft/sqoop-1.4.7.bin__hadoop-2.6.0//../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /opt/soft/sqoop-1.4.7.bin__hadoop-2.6.0//../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
19/02/14 17:46:51 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
Sqoop 1.4.7
git commit id 2328971411f57f0cb683dfb79d19d4d19d185dd8
Compiled by maugli on Thu Dec 21 15:59:58 STD 2017
3.3. Import MySQL data into HDFS
sqoop import \
--connect jdbc:mysql://hadoop1:3306/mysql \
--username root \
--password root \
--target-dir /data/input/SysCodeType.txt \
--query 'select id,code_type_num,code_type_name from sys_code_type where $CONDITIONS and is_deleted=0' \
--split-by id \
--fields-terminated-by '\t' \
-m 1
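Note that a free-form --query must contain the literal $CONDITIONS placeholder, which Sqoop replaces with its split predicates; the single quotes keep the shell from expanding it. After the job finishes, the result can be checked on HDFS (a sketch; part-m-00000 is the usual output file of a single-mapper import):
hdfs dfs -ls /data/input/SysCodeType.txt
hdfs dfs -cat /data/input/SysCodeType.txt/part-m-00000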
3.4. Import MySQL data into Hive
Specify the field and line delimiters, enable hive-import, overwrite any existing data, create the Hive table automatically, name the target database and table, and delete the intermediate HDFS staging directory. Note that --create-hive-table makes the job fail if the table already exists, so drop that flag when re-running the import and rely on --hive-overwrite instead.
sqoop import \
--connect jdbc:mysql://hadoop1:3306/mysql \
--username root \
--password root \
--table sys_code_type \
--fields-terminated-by "\t" \
--lines-terminated-by "\n" \
--hive-import \
--hive-overwrite \
--create-hive-table \
--delete-target-dir \
--hive-database test \
--hive-table sys_code_type
Check the result in Hive:
hive> use test;
OK
Time taken: 1.599 seconds
hive> show tables;
OK
person
student
sys_code_type
Time taken: 1.212 seconds, Fetched: 3 row(s)
hive> select * from sys_code_type;
OK
1029197104365404162 XXBXLXB schoolRunType 学校办学类型表 学校办学类型表 0 null null 1027740701250162689 2019-01-21 11:29:06.0 4 null null
The data has been imported into the Hive table.
3.5. Import MySQL data into HBase
First, create the table in the HBase shell:
create 'sys_code_type','f1'
Then import the data into HBase:
sqoop import \
--connect jdbc:mysql://hadoop1:3306/auto_study \
--username root \
--password root \
--table sys_code_type \
--hbase-table sys_code_type \
--column-family f1 \
--hbase-row-key id
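A scan in the HBase shell (a sketch; table and column family as created above) confirms the import:
scan 'sys_code_type', {LIMIT => 1}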