hive
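Oozie invokes the hive action here; note that this is not the hive2 action.
Oozie requires workflow.xml to be uploaded to an HDFS directory; script.q is uploaded to the same directory as workflow.xml.
Note that job.properties stays on the local server and is passed as the config.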
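The upload step above can be sketched as follows. This is a minimal sketch, assuming workflow.xml and script.q are in the current local directory and that the target directory matches oozie.wf.application.path in job.properties; the user name hadoop and the helper name upload_cmds are assumptions made for illustration. The helper only prints the hdfs commands so they can be reviewed before being run.

```shell
#!/bin/bash
# Assumption: APP_DIR matches oozie.wf.application.path (user "hadoop", examplesRoot "zy").
APP_DIR="/user/hadoop/zy/hive"

# Hypothetical helper: print the HDFS commands that stage the workflow files.
upload_cmds() {
  echo "hdfs dfs -mkdir -p $1"
  echo "hdfs dfs -put -f workflow.xml script.q $1"
}

upload_cmds "$APP_DIR"
```

Piping the output to bash (upload_cmds "$APP_DIR" | bash) would run the commands. job.properties is deliberately absent from the list because it stays local.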
Command to create (submit and run) an Oozie job:
oozie job -oozie http://localhost:11000/oozie -config job.properties -run
Command to kill an Oozie job:
oozie job -oozie http://localhost:11000/oozie -kill [jobId]
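The [jobId] used by -kill comes from the "job: <id>" line that the -run command prints. A minimal sketch of capturing it and then checking the job status with -info, assuming the same Oozie URL as above; the helper names parse_job_id and submit_and_check are hypothetical.

```shell
OOZIE_URL="http://localhost:11000/oozie"

# Extract the job ID from the "job: <id>" line printed by `oozie job -run`.
parse_job_id() {
  sed -n 's/^job: //p'
}

# Submit the workflow, then print its current status.
submit_and_check() {
  local job_id
  job_id=$(oozie job -oozie "$OOZIE_URL" -config job.properties -run | parse_job_id)
  echo "submitted: $job_id"
  oozie job -oozie "$OOZIE_URL" -info "$job_id"
}

# Example: submit_and_check
```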
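When invoking the hive action from Oozie, only the attached files below are needed. Note that no lib jars have to be uploaded to HDFS, because job.properties sets oozie.use.system.libpath=true and the Oozie system share lib is used instead.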
job.properties
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# The two addresses below can be looked up in the HDFS configuration
nameNode=hdfs://kxjgCffI-Master1.jcloud.local:8020
jobTracker=kxjgCffI-Master1.jcloud.local:8050
queueName=default
examplesRoot=zy
oozie.use.system.libpath=true
# HDFS directory that contains workflow.xml
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/hive
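To see which HDFS directory oozie.wf.application.path points at, the substitution can be done by hand. The sketch below assumes the job is submitted by the user hadoop (the value of ${user.name} depends on who runs the oozie command); SUBMIT_USER and the helper name resolve_app_path are hypothetical.

```shell
# Values copied from job.properties; SUBMIT_USER is an assumption.
NAME_NODE="hdfs://kxjgCffI-Master1.jcloud.local:8020"
EXAMPLES_ROOT="zy"
SUBMIT_USER="hadoop"

# Mirror ${nameNode}/user/${user.name}/${examplesRoot}/hive.
resolve_app_path() {
  echo "$1/user/$2/$3/hive"
}

resolve_app_path "$NAME_NODE" "$SUBMIT_USER" "$EXAMPLES_ROOT"
# prints hdfs://kxjgCffI-Master1.jcloud.local:8020/user/hadoop/zy/hive
```

The nameNode value itself can be read on a cluster node with hdfs getconf -confKey fs.defaultFS.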
workflow.xml
<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<workflow-app xmlns="uri:oozie:workflow:0.2" name="hive-wf">
<start to="hive-node"/>
<action name="hive-node">
<hive xmlns="uri:oozie:hive-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
</configuration>
<script>script.q</script>
</hive>
<ok to="end"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>Hive failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>
</workflow-app>
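script.q: demo Hive commands that create a database, create an internal (managed) table, create an external table, and load the external table's data into the internal table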
--
-- Licensed to the Apache Software Foundation (ASF) under one
-- or more contributor license agreements. See the NOTICE file
-- distributed with this work for additional information
-- regarding copyright ownership. The ASF licenses this file
-- to you under the Apache License, Version 2.0 (the
-- "License"); you may not use this file except in compliance
-- with the License. You may obtain a copy of the License at
--
-- http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.
--
DROP DATABASE IF EXISTS people1 cascade;
CREATE DATABASE people1;
DROP TABLE IF EXISTS people1.student1inner;
CREATE TABLE people1.student1inner(id int,name string,sex string);
DROP TABLE IF EXISTS people1.student1out;
CREATE EXTERNAL TABLE people1.student1out(id int,name string,sex string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' location '/user/hadoop/examples/output-data/sqoop-mysql';
INSERT INTO people1.student1inner(id,name,sex) select id,name,sex from people1.student1out;
Hive operation notes
Create a database:
CREATE DATABASE IF NOT EXISTS [DATABASE_NAME];
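Drop a database:
DROP DATABASE IF EXISTS [DATABASE_NAME];
Note: if the database still contains tables, this statement fails and the database is not dropped.
Force-drop a database together with its tables: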
DROP DATABASE IF EXISTS [DATABASE_NAME] CASCADE;
Create an external table:
CREATE EXTERNAL TABLE [DATABASE_NAME].[TABLENAME](id int,name string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' location '/user/hadoop/examples/output-data/sqoop-mysql';
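Note: the [DATABASE_NAME] prefix is optional.
/user/hadoop/examples/output-data/sqoop-mysql is the HDFS location of the data files backing the external table; TERMINATED BY ',' specifies the field delimiter used in those files.
Sample file: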
1,zhangsan
2,lisi
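A file like the sample above can be produced and staged as follows. This is a minimal sketch; the local path /tmp/students.csv and the helper name write_sample are assumptions, and the hdfs upload is left commented out because it needs a running cluster.

```shell
# Hypothetical local path for the sample data file.
SAMPLE_FILE="/tmp/students.csv"

# Write two comma-separated rows matching the (id int, name string) columns.
write_sample() {
  printf '1,zhangsan\n2,lisi\n' > "$1"
}

write_sample "$SAMPLE_FILE"
cat "$SAMPLE_FILE"

# To make the rows visible to the external table, copy the file into its location:
# hdfs dfs -put -f "$SAMPLE_FILE" /user/hadoop/examples/output-data/sqoop-mysql/
```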
Create an internal (managed) table:
CREATE TABLE people1.student1inner(id int,name string,sex string);
Load data from one table into another:
INSERT INTO people1.student1inner(id,name,sex) select id,name,sex from people1.student1out;
The statement above requires that both tables already exist.
Drop a table:
DROP TABLE student1out;
ssh
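When Oozie runs a shell action, the .sh file must be uploaded to HDFS, and the script is executed on an arbitrarily chosen host in the cluster.
If not every host in the cluster can run the script successfully (for example, a script that creates tables in the MySQL instance on the master node), do not use the shell action.
In that case the ssh action is likely a better fit, because it lets you specify which host runs the command and which user runs it.
For job.properties, refer to the hive section above.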
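The ssh action typically works only if the account running the Oozie server can log in to the target host without a password prompt. A minimal sketch of a connectivity check, assuming it is run as that account; the helper name check_ssh_login is hypothetical.

```shell
# Hypothetical helper: verify non-interactive SSH login to the host used in <host>.
check_ssh_login() {
  # BatchMode=yes makes ssh fail instead of prompting for a password.
  if ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true; then
    echo "ok: $1"
  else
    echo "failed: $1"
    return 1
  fi
}

# Example: check_ssh_login hadoop@kxjgCffI-Master1.jcloud.local
```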
workflow-ssh.xml
<workflow-app xmlns="uri:oozie:workflow:0.2" name="ssh-wf">
<start to="ssh-createmysql"/>
<action name="ssh-createmysql">
<ssh xmlns="uri:oozie:ssh-action:0.2">
<host>hadoop@kxjgCffI-Master1.jcloud.local</host>
<command>sh /home/hadoop/oozie/createmysql.sh</command>
</ssh>
<ok to="ssh-hiveoption"/>
<error to="fail"/>
</action>
<action name="ssh-hiveoption">
<ssh xmlns="uri:oozie:ssh-action:0.2">
<host>hadoop@kxjgCffI-Master1.jcloud.local</host>
<command>sh /home/hadoop/oozie/hiveoption.sh</command>
</ssh>
<ok to="end"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>ssh failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>
</workflow-app>
createmysql.sh: creates the MySQL database and table, and inserts sample data
#!/bin/bash
DATABASE="people"
TABLE="students"
#delete database;
mysql -u root << EOF
DROP DATABASE IF EXISTS $DATABASE;
EOF
#create database
mysql -u root << EOF
CREATE DATABASE IF NOT EXISTS $DATABASE CHARACTER SET UTF8;
EOF
echo "create database $DATABASE"
#create table
mysql -u root $DATABASE << EOF
CREATE TABLE IF NOT EXISTS $TABLE(id bigint(8) unsigned primary key Auto_Increment,name text,sex text) Engine InnoDB;
EOF
echo "create table $TABLE"
#insert data
mysql -u root $DATABASE << EOF
INSERT INTO $TABLE (name,sex) VALUES ("zhangsan","man");
INSERT INTO $TABLE (name,sex) VALUES ("lisi","man");
INSERT INTO $TABLE (name,sex) VALUES ("wangwu","man");
INSERT INTO $TABLE (name,sex) VALUES ("zhaoliu","woman");
EOF
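createmysql.sh depends on the shell expanding $DATABASE and $TABLE inside the heredocs, which happens because the EOF delimiter is unquoted. The sketch below substitutes cat for mysql so the rendered SQL can be inspected without a database; the helper name render_sql is hypothetical.

```shell
#!/bin/bash
DATABASE="people"
TABLE="students"

# Print the SQL that the heredoc would send to mysql.
render_sql() {
  cat << EOF
CREATE DATABASE IF NOT EXISTS $DATABASE CHARACTER SET UTF8;
INSERT INTO $TABLE (name,sex) VALUES ("zhangsan","man");
EOF
}

render_sql
```

Quoting the delimiter (<< 'EOF') would disable expansion and send the literal text $DATABASE to mysql.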