Create an External Hive Partitioned Table (hands-on explanation)

------------------------
Step 1: Create an external table. The partition data lives in subdirectories of the HDFS location /user/hadoop/country/. From the table definition's point of view, each subdirectory under country is a partition, as shown below.

hive> CREATE EXTERNAL TABLE pd_det (
  dept string,
  cost int,
  id int,
  asmb_city string,
  asmb_ct string,
  retail int
)
PARTITIONED BY (country string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\001'
STORED AS TEXTFILE
LOCATION '/user/hadoop/country/';
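Once partitions exist, a listing of the table location might look like the sketch below (the directory names and permissions are illustrative). HDFS commands can be run straight from the Hive shell with dfs.

hive> dfs -ls /user/hadoop/country/;
drwxr-xr-x   - hadoop supergroup  0 /user/hadoop/country/country=india
drwxr-xr-x   - hadoop supergroup  0 /user/hadoop/country/country=usa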

Step 2: Whenever a new directory is added in HDFS under /user/hadoop/country/, an ALTER TABLE ... ADD PARTITION command must be executed.
a) For example, if a new directory named 'country=usa' is added under hdfs://localhost:8020/user/hadoop/country, it represents a new partition of pd_det with the partition column country set to usa. The data files in hdfs://localhost:8020/user/hadoop/country/country=usa will then be read through the table.
b) For the table pd_det to recognize the partition, execute an ALTER TABLE command as shown below.

hive> ALTER TABLE pd_det ADD PARTITION (country='usa');
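To verify that the metastore now knows about the partition, you can list the table's partitions and run a quick query against it (a sketch; the columns come from the definition above and the LIMIT is arbitrary):

hive> SHOW PARTITIONS pd_det;
hive> SELECT dept, cost, retail FROM pd_det WHERE country='usa' LIMIT 10;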

Step 3: To create data files inside a partition directory from a table already present in Hive, execute:

hive> INSERT OVERWRITE DIRECTORY '/user/hadoop/country/country=india'
SELECT dept, cost, id, asmb_city, asmb_ct, retail
FROM pd_temp
WHERE make = 'india';
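Note that writing the files does not by itself register the partition with the metastore. As in Step 2, add it explicitly:

hive> ALTER TABLE pd_det ADD PARTITION (country='india');

or, to discover every unregistered country=value directory at once (assuming the layout above), run a metastore repair:

hive> MSCK REPAIR TABLE pd_det;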

Step 4: If the directory name does not follow the country=value convention used above, specify its location explicitly in the ALTER statement:

hive> ALTER TABLE pd_det
ADD IF NOT EXISTS PARTITION (country='mexico')
LOCATION '/user/hadoop/country/mexico';
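To confirm that the partition points at the custom directory, you can inspect its metadata; the Location field in the output shows the path it reads from:

hive> DESCRIBE FORMATTED pd_det PARTITION (country='mexico');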
