Loading a Flat File & Converting a Date Variable In a Flat File Into Hive Date Format Field In a Table.

 Loading a Flat File & Converting  a Date Variable In a Flat File Into Hive Date Format Field In a Table.(Hands on explanation)

--------------------------
Step 1:Create a Stage table

hive>create table emp_stage
(fn string , ln string,cur_wk string, prv_wk string, stdt Array<string>,enddt Array<string>,eid bigint)
 row format delimited
 fields terminated by ','
collection items terminated by '/'
 stored as textfile;

Note: Sample of flat file data
First_ Name , last_name, Currently_ working, Previously_working,Start_date,End_date,Employee_id
John,Wade,Washigton , ,1/8/2014,,2938

Franklin,Josephs,Detriot,washington ,11/12/2013,2/12/2015,2913

Elizabeth,Reynolds,Cleveland, ,4/30/2011,,2891

Arthur,Gibbons,fortwayne, ,6/28/2015, ,400

Franklin,Josephs,washington, ,2/12/2015, ,2913

But the date format of hive is YYYY-MM-DD

Step 2:
hive>Load data inpath
 '/user/hadoop/Emp_details.csv'
 into table emp_stage;

Step 3: Insert overwrite directory
'/user/hadoop/date_temp'
select fn , ln ,cur_wk ,  prv_wk,
 concat(stdt[2],'-',stdt[0],'-',stdt[1]),
 concat(enddt[2],'-',enddt[0],'-',enddt[1]),
 eid from emp_stage;

step4:create table emp_det
(fn string , ln string,cur_wk string, prv_wk string, stdt date,enddt date,eid bigint)
row format delimited
 fields terminated by '\001'
collection items terminated by '-'
stored as textfile;

step5:load data inpath
 '/user/hadoop/date_temp'
 into table emp_det;

step 6:To display in hive date format in hive cli 


select cast(t3.st_dt as date), cast(t3.end_dt as date)  from (select concat(stdt[2],'-',stdt[0],'-',stdt[1]) st_dt , concat(enddt[2],'-',enddt[0],'-',enddt[1]) end_dt from emp_stage)t3;

No comments:

Post a Comment