Loading a Flat File & Converting a Date Variable In a Flat File Into Hive Date Format Field In a Table.(Hands on explanation)
--------------------------Step 1:Create a Stage table
hive>create table emp_stage
(fn string , ln string,cur_wk string, prv_wk string, stdt Array<string>,enddt Array<string>,eid bigint)
row format delimited
fields terminated by ','
collection items terminated by '/'
stored as textfile;
Note: Sample of flat file data
First_ Name , last_name, Currently_ working, Previously_working,Start_date,End_date,Employee_id
John,Wade,Washigton , ,1/8/2014,,2938
Franklin,Josephs,Detriot,washington ,11/12/2013,2/12/2015,2913
Elizabeth,Reynolds,Cleveland, ,4/30/2011,,2891
Arthur,Gibbons,fortwayne, ,6/28/2015, ,400
Franklin,Josephs,washington, ,2/12/2015, ,2913
But the date format of hive is YYYY-MM-DD
Step 2:
hive>Load data inpath
'/user/hadoop/Emp_details.csv'
into table emp_stage;
Step 3: Insert overwrite directory
'/user/hadoop/date_temp'
select fn , ln ,cur_wk , prv_wk,
concat(stdt[2],'-',stdt[0],'-',stdt[1]),
concat(enddt[2],'-',enddt[0],'-',enddt[1]),
eid from emp_stage;
step4:create table emp_det
(fn string , ln string,cur_wk string, prv_wk string, stdt date,enddt date,eid bigint)
row format delimited
fields terminated by '\001'
collection items terminated by '-'
stored as textfile;
step5:load data inpath
'/user/hadoop/date_temp'
into table emp_det;
step 6:To display in hive date format in hive cli
select cast(t3.st_dt as date), cast(t3.end_dt as date) from (select concat(stdt[2],'-',stdt[0],'-',stdt[1]) st_dt , concat(enddt[2],'-',enddt[0],'-',enddt[1]) end_dt from emp_stage)t3;