
1. Creating a table in Hive.
> create table w1 (wban int, date string, stationtype string)
> row format delimited
> fields terminated by ',';
2. Loading data into table w1.
> load data inpath '/usr/hdfs/weather/201301hourly.txt' into table w1;
3. Creating a partitioned table.
> create table part1(date string, stationtype string)
> partitioned by (wban int)
> row format delimited fields terminated by ',';
4. Dynamic partition properties.
> SET hive.exec.dynamic.partition = true;
> SET hive.exec.dynamic.partition.mode = nonstrict;
5. Populating the partitioned table from w1 (static partition).
> insert overwrite table part1 partition(wban='03011')
> select date, stationtype from w1 where wban='03011';
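Step 5 names the partition value explicitly, so it is a static-partition insert and does not exercise the properties set in step 4. A dynamic-partition version might look like the sketch below, assuming w1 and part1 are defined as in steps 1 and 3. The backticks around date are a precaution in case the Hive release in use treats it as a reserved word.

```sql
-- Sketch only: dynamic-partition load of part1 from w1.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

-- No value is given for wban in the PARTITION clause, so Hive creates
-- one partition per distinct wban found in the query result.
-- The partition column must come last in the SELECT list.
INSERT OVERWRITE TABLE part1 PARTITION (wban)
SELECT `date`, stationtype, wban
FROM w1;

-- List the partitions that were created.
SHOW PARTITIONS part1;
```

If w1 contains many distinct wban values, the insert may run into the limits set by hive.exec.max.dynamic.partitions and hive.exec.max.dynamic.partitions.pernode, which may then need to be raised.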
6. For auto join conversion
> set hive.auto.convert.join=true;
7. Map join in Hive.
---> A map join loads the smaller table into an in-memory hash map and matches its keys
against the rows of the larger table as they are streamed through.
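To check whether a join is actually executed as a map join, the query plan can be inspected with EXPLAIN. The sketch below assumes that mont2 (used in step 9 but not defined in these notes) is the smaller table. The threshold shown is Hive's documented default and is set here only to make the parameter visible.

```sql
-- Let Hive convert eligible joins to map joins automatically.
SET hive.auto.convert.join = true;
-- Tables below this size (in bytes) are candidates for the in-memory hash side.
SET hive.mapjoin.smalltable.filesize = 25000000;

-- A "Map Join Operator" in the plan indicates that the conversion happened.
EXPLAIN
SELECT count(*)
FROM w1 JOIN mont2 ON (w1.wban = mont2.wban);
```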
8. Creating a table to hold the join result.
> create table mpj (wban int)
> row format delimited
> fields terminated by ',';
9. Join query
> insert overwrite table mpj
> select count(*) from
> w1 JOIN mont2 on (w1.wban = mont2.wban);
10. OUTER JOINS
→ Left Outer Join:
> create table monthly (WBAN int, YearMonth string, AvgMaxTemp string)
> row format delimited
> fields terminated by ',';
> load data local inpath '/home/vis/Documents/weather/201301monthly.txt' into table monthly;
> create table hourly (WBAN int, Date string, Stationtype string)
> row format delimited
> fields terminated by ',';
Left outer join query:
> select monthly.wban, hourly.wban from monthly LEFT OUTER JOIN hourly
> ON (monthly.wban=hourly.wban)
> where hourly.date='20130101';
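Note that a WHERE condition on a column of the right-hand table (hourly.date here) discards the rows where that column is NULL, that is, the monthly rows without a match, so the query above returns the same rows as an inner join. If the unmatched monthly rows should be kept, one option is to move the condition into the ON clause. A sketch with the same tables, again with backticks around date as a precaution:

```sql
-- The date filter is part of the join condition, so monthly rows with no
-- matching hourly row on that date still appear, with hourly.wban as NULL.
SELECT monthly.wban, hourly.wban
FROM monthly
LEFT OUTER JOIN hourly
  ON (monthly.wban = hourly.wban AND hourly.`date` = '20130101');
```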
> create table lftjoin(wban int, stationtype string)
> row format delimited fields terminated by ',';
> insert overwrite table lftjoin
> select hourly.wban, hourly.stationtype from hourly
> LEFT OUTER JOIN monthly ON (monthly.wban=hourly.wban)
> where hourly.date='20130104';
11. Right Outer JOIN
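The notes break off at this heading. By symmetry with the left outer join in step 10, a right outer join on the same tables could look like the following sketch. It is an illustration, not the original author's query.

```sql
-- Keeps every row of hourly (the right-hand table); monthly.wban is NULL
-- where no monthly row matches.
SELECT monthly.wban, hourly.wban
FROM monthly
RIGHT OUTER JOIN hourly
  ON (monthly.wban = hourly.wban);
```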
