Hive notes

Setting up the Hadoop, Hive, and MySQL environment, with a capture of each step

1. Hadoop install

sunyonggang@gg01:~/hadoop-2.6.0$ ./sbin/start-dfs.sh
Starting namenodes on [gg01]
gg01: starting namenode, logging to /home/sunyonggang/hadoop-2.6.0/logs/hadoop-sunyonggang-namenode-gg01.out
ggg03: starting datanode, logging to /home/sunyonggang/hadoop-2.6.0/logs/hadoop-sunyonggang-datanode-ggg03.out
ggg02: starting datanode, logging to /home/sunyonggang/hadoop-2.6.0/logs/hadoop-sunyonggang-datanode-ggg02.out
Starting secondary namenodes [gg01]
gg01: starting secondarynamenode, logging to /home/sunyonggang/hadoop-2.6.0/logs/hadoop-sunyonggang-secondarynamenode-gg01.out
sunyonggang@gg01:~/hadoop-2.6.0$ ./sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /home/sunyonggang/hadoop-2.6.0/logs/yarn-sunyonggang-resourcemanager-gg01.out
ggg02: starting nodemanager, logging to /home/sunyonggang/hadoop-2.6.0/logs/yarn-sunyonggang-nodemanager-ggg02.out
ggg03: starting nodemanager, logging to /home/sunyonggang/hadoop-2.6.0/logs/yarn-sunyonggang-nodemanager-ggg03.out
sunyonggang@gg01:~/hadoop-2.6.0$ jps
1915 NameNode
2118 SecondaryNameNode
2260 ResourceManager
2514 Jps
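
The jps listing above is the check that the master daemons came up. A minimal sketch of the same check in Python, assuming the expected set on gg01 is exactly the three daemons shown in the transcript (the function names are hypothetical, introduced only for illustration):

```python
# Daemons the transcript shows on the master node gg01 (assumed to be the full expected set).
EXPECTED_MASTER_DAEMONS = {"NameNode", "SecondaryNameNode", "ResourceManager"}


def running_daemons(jps_output: str) -> set:
    """Return the process names listed in jps output ("<pid> <name>" per line)."""
    names = set()
    for line in jps_output.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].isdigit():
            names.add(parts[1])
    return names


def missing_daemons(jps_output: str, expected=EXPECTED_MASTER_DAEMONS) -> set:
    """Return the expected daemons that do not appear in the jps output."""
    return set(expected) - running_daemons(jps_output)


# The jps output captured above on gg01.
sample = """1915 NameNode
2118 SecondaryNameNode
2260 ResourceManager
2514 Jps
"""
print(missing_daemons(sample))  # an empty set means every expected daemon is listed
```

The DataNode and NodeManager processes run on ggg02 and ggg03, so they would have to be checked with jps on those hosts.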

2. MySQL install

sunyonggang@gg01:~$ sudo service mysql status
mysql start/running, process 7483
sunyonggang@gg01:~$ mysql -u root -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 43
Server version: 5.5.49-0ubuntu0.14.04.1 (Ubuntu)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
+--------------------+
3 rows in set (0.00 sec)

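3. Hive install (note: for this step, comment out the bind-address line in my.cnf, i.e. #bind-address = 127.0.0.1, so that MySQL is not restricted to connections on localhost)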

sunyonggang@gg01:/etc/mysql$ hive
16/04/26 19:14:49 WARN conf.HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect.  Use hive.hmshandler.retry.* instead

Logging initialized using configuration in jar:file:/home/sunyonggang/apache-hive-0.13.1-bin/lib/hive-common-0.13.1.jar!/hive-log4j.properties
hive> show databases;
OK
default
Time taken: 0.487 seconds, Fetched: 1 row(s)
hive>
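
The bind-address note in the heading of this step is easy to get wrong, so here is a small sketch that checks whether a MySQL option file still has an active bind-address line. It only inspects text passed to it; the function name and the two sample strings are hypothetical, and it assumes the usual option-file rules (lines starting with # or ; are comments).

```python
def has_active_bind_address(cnf_text: str) -> bool:
    """Return True if any non-comment line in the option-file text sets bind-address."""
    for raw in cnf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#") or line.startswith(";"):
            continue
        key = line.split("=", 1)[0].strip()
        # MySQL accepts both the dash and the underscore spelling of option names.
        if key in ("bind-address", "bind_address"):
            return True
    return False


# Hypothetical file contents before and after the change described above.
before = "[mysqld]\nbind-address = 127.0.0.1\n"
after = "[mysqld]\n#bind-address = 127.0.0.1\n"

print(has_active_bind_address(before))
print(has_active_bind_address(after))
```

MySQL has to be restarted after the file is edited for the change to take effect.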

Using the data from the previous lesson, create a table and load the data

hive> create external table sg(time char(14),
    >         cookie char(32),
    >         keyword varchar(256),
    >         rank int,
    >         click int,
    >         ref varchar(512)
    >     ) row format delimited fields terminated by '\t' location '/hdfs/user/data';
OK
Time taken: 0.05 seconds
hive> select * from sg limit 5;
OK
20111230000005  57375476989eea12893c0c3811607bcf    奇艺高清    1   1   http://www.qiyi.com/
20111230000005  66c5bb7774e31d0a22278249b26bc83a    凡人修仙传   3   1   http://www.booksky.org/BookDetail.aspx?BookID=1050804&Level=1
20111230000007  b97920521c78de70ac38e3713f524b50    本本联盟    1   1   http://www.bblianmeng.com/
20111230000008  6961d0c97fe93701fc9c0d861d096cd9    华南师范大学图书馆   1   1   http://lib.scnu.edu.cn/
20111230000008  f2f5a21c764aebde1e8afcc2871e086f    在线代理    2   1   http://proxyie.cn/
Time taken: 0.239 seconds, Fetched: 5 row(s)
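
To make the column layout concrete, here is a minimal Python sketch that splits one tab-delimited line into the six columns declared in the create table statement above. The class and function names are hypothetical. Note that this sketch is stricter than Hive: it raises on a malformed line, whereas Hive would return NULL for a field it cannot convert.

```python
from typing import NamedTuple


class SgRecord(NamedTuple):
    """One row of the sg table, in the column order of the table definition."""
    time: str
    cookie: str
    keyword: str
    rank: int
    click: int
    ref: str


def parse_line(line: str) -> SgRecord:
    """Split a tab-delimited log line and convert rank and click to int."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 6:
        raise ValueError(f"expected 6 tab-separated fields, got {len(fields)}")
    time, cookie, keyword, rank, click, ref = fields
    return SgRecord(time, cookie, keyword, int(rank), int(click), ref)


# First sample row from the query result above, with tabs as the delimiter.
row = "20111230000005\t57375476989eea12893c0c3811607bcf\t奇艺高清\t1\t1\thttp://www.qiyi.com/"
record = parse_line(row)
print(record.keyword, record.rank, record.click)
```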

Top 10 most popular query keywords

  1. Group by keyword and sort by occurrence count in descending order
  2. SQL statement and result
hive> select count(*) as times, keyword from sg group by keyword order by times desc limit 10;
Total MapReduce CPU Time Spent: 1 minutes 25 seconds 60 msec
OK
77627   百度
36564   baidu
29598   人体艺术
23306   4399小游戏
20847   优酷
20677   qq空间
19205   新亮剑
17842   馆陶县县长闫宁的父亲
16612   公安卖萌
15212   百度一下 你就知道
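
For reference, the same group, count, sort, and limit logic can be sketched in plain Python with collections.Counter. This is only an illustration of what the query computes, not how Hive executes it. The function name is hypothetical, and the three toy lines are made up for the example and are not taken from the log.

```python
from collections import Counter


def top_values(lines, column, n=10):
    """Count the values in one tab-delimited column and return the n most frequent.

    Returns (times, value) pairs, mirroring the column order of the query output.
    """
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > column:
            counts[fields[column]] += 1
    return [(times, value) for value, times in counts.most_common(n)]


# Toy lines (hypothetical data) in the sg layout: time, cookie, keyword, rank, click, ref.
toy_lines = [
    "20111230000001\tc1\t百度\t1\t1\thttp://example.com/a",
    "20111230000002\tc2\tbaidu\t1\t1\thttp://example.com/b",
    "20111230000003\tc1\t百度\t2\t1\thttp://example.com/c",
]

print(top_values(toy_lines, column=2))  # column 2 is keyword
```

Calling the same function with column=1 counts rows per cookie instead of per keyword.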

Top 10 users by number of queries

  1. Group by cookie, sort by occurrence count in descending order, and take the top 10
  2. SQL statement and result
hive> select count(*) as times, cookie from sg group by cookie order by times desc limit 10;  
Total MapReduce CPU Time Spent: 1 minutes 36 seconds 110 msec
OK
20385   ac65768b987c20b3b25cd35612f61892
11653   9faa09e57c277063e6eb70d178df8529
11528   02a8557754445a9b1b22a37b40d6db38
2571    cc7063efc64510c20bcdd604e12a3b26
2355    b64b0ec03efd0ca9cef7642c4921658b
1292    7a28a70fe4aaff6c35f8517613fb5c67
1277    b1e371de5729cdda9270b7ad09484c4f
1241    f656e28e7c3e10c2b733e6b68385d5a2
1181    7eab8caf9708d68e6964220e2f89e80d
1120    c72ce1164bcd263ba1f69292abdfdf7c
Time taken: 187.017 seconds, Fetched: 10 row(s)

Records where the result ranked 1st in the search results but was the 2nd one clicked

  1. Filter on rank = 1 and click = 2
  2. SQL statement and result
hive> select count(*) from sg where rank=1 and click=2; 
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1461738251729_0004, Tracking URL = http://gg01:8088/proxy/application_1461738251729_0004/
Kill Command = /home/sunyonggang/hadoop-2.6.0/bin/hadoop job  -kill job_1461738251729_0004
Hadoop job information for Stage-1: number of mappers: 5; number of reducers: 1
2016-04-27 14:46:23,653 Stage-1 map = 0%,  reduce = 0%
2016-04-27 14:47:07,452 Stage-1 map = 33%,  reduce = 0%, Cumulative CPU 16.51 sec
2016-04-27 14:47:11,596 Stage-1 map = 47%,  reduce = 0%, Cumulative CPU 19.3 sec
2016-04-27 14:47:12,622 Stage-1 map = 67%,  reduce = 0%, Cumulative CPU 20.03 sec
2016-04-27 14:47:13,677 Stage-1 map = 73%,  reduce = 0%, Cumulative CPU 20.66 sec
2016-04-27 14:47:18,928 Stage-1 map = 87%,  reduce = 0%, Cumulative CPU 23.04 sec
2016-04-27 14:47:19,964 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 23.69 sec
2016-04-27 14:47:23,044 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 24.86 sec
MapReduce Total cumulative CPU time: 24 seconds 860 msec
Ended Job = job_1461738251729_0004
MapReduce Jobs Launched: 
Job 0: Map: 5  Reduce: 1   Cumulative CPU: 24.86 sec   HDFS Read: 1147077971 HDFS Write: 7 SUCCESS
Total MapReduce CPU Time Spent: 24 seconds 860 msec
OK
200885
Time taken: 248.801 seconds, Fetched: 1 row(s)
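
The filter above can be sketched in Python as a single pass over the tab-delimited lines. The function name is hypothetical and the toy lines are made up for the example. Lines whose rank or click cannot be parsed are skipped, on the assumption that Hive would read them as NULL, which never satisfies the filter.

```python
def count_rank1_click2(lines):
    """Count lines where rank (4th column) is 1 and click (5th column) is 2."""
    total = 0
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # rank or click is missing
        try:
            rank, click = int(fields[3]), int(fields[4])
        except ValueError:
            continue  # not an integer, so it cannot match the filter
        if rank == 1 and click == 2:
            total += 1
    return total


# Toy lines (hypothetical data) in the sg layout: time, cookie, keyword, rank, click, ref.
toy_lines = [
    "20111230000001\tc1\tkw1\t1\t2\thttp://example.com/a",
    "20111230000002\tc2\tkw2\t1\t1\thttp://example.com/b",
    "20111230000003\tc3\tkw3\t2\t2\thttp://example.com/c",
    "20111230000004\tc4\tkw4\t1\t2\thttp://example.com/d",
]

print(count_rank1_click2(toy_lines))
```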