Loading a GZIP or BZIP2 compressed tar file into a Hive table, using CTAS and LIKE (hands-on explanation)
------------------------------
Step 1: Set a few Hive properties
set hive.exec.compress.output=true;
set io.seqfile.compression.type=BLOCK;
Step 2: Compress the file with GZIP
tar -cvzf /home/hadoop/Desktop/customer.tar.gz /home/hadoop/Desktop/Customer.csv
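One thing worth checking before loading: tar wraps the CSV in an archive with its own header block, so what Hive sees after decompressing a .tar.gz is the tar stream, not just the rows. The sketch below compares this with plain gzip. The scratch directory and the two sample rows are made up for illustration; only the file name Customer.csv comes from the step above.

```shell
#!/bin/sh
# Hypothetical scratch directory and sample rows, for illustration only.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'John,Doe,retail,5551234,1,NY\nJane,Roe,online,5555678,2,LA\n' > Customer.csv

# tar + gzip, as in Step 2: the gzip stream contains a tar archive.
tar -czf customer.tar.gz Customer.csv

# Plain gzip: the gzip stream contains only the CSV rows.
gzip -c Customer.csv > Customer.csv.gz

# First line seen after decompression in each case.
# The tar header is padded with NUL bytes, which are stripped before printing.
tar_first=$(gzip -dc customer.tar.gz | head -n 1 | tr -d '\0' | cut -c1-12)
gz_first=$(gzip -dc Customer.csv.gz | head -n 1)

echo "tar.gz first line starts with: $tar_first"
echo "csv.gz first line is:          $gz_first"
```

If the first row of customer_gz shows the file name and other header text in the fn column, loading the plain Customer.csv.gz instead is one way to avoid it.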
Step 3: Create a text-format table
create table customer_gz
(fn string, ln string, cat string, ph string, cid int, add array<string>)
row format delimited
fields terminated by ','
collection items terminated by ','
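stored as textfile;
Hint: the field delimiter and the collection delimiter are both ',' here, so each row is split into columns on ',' first and the add array ends up with a single element. Use a different collection delimiter if the array should hold several values.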
Step 4: Load the compressed file into the table
load data local inpath
'/home/hadoop/Desktop/customer.tar.gz'
into table customer_gz;
Step 5: Create a sequence-file table from it using CTAS
create table customer_gz_seq
stored as sequencefile as
select * from customer_gz;
Step 6: Query the new table
select * from customer_gz_seq;
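Hint: a GZIP-compressed text file is not splittable in Hadoop, so a single mapper reads the whole file and the parallel processing power of the cluster goes unused. (BZIP2 files are splittable, but slower to decompress.) It is therefore better to load the data into a sequence-file table, which stays splittable with block compression.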
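The Hive statements from Steps 1, 3, 4 and 5 can also be run in one go from the shell. This is only a sketch: the script file is a temporary file created here for illustration, and it assumes the hive command-line client is on the PATH (if it is not, the script just reports where the statements were written).

```shell
#!/bin/sh
# Hypothetical temporary script file; the statements mirror Steps 1, 3, 4 and 5.
hql_file=$(mktemp)

cat > "$hql_file" <<'EOF'
-- Property names in lowercase, compression type in uppercase.
set hive.exec.compress.output=true;
set io.seqfile.compression.type=BLOCK;

create table customer_gz
(fn string, ln string, cat string, ph string, cid int, add array<string>)
row format delimited
fields terminated by ','
collection items terminated by ','
stored as textfile;

load data local inpath '/home/hadoop/Desktop/customer.tar.gz'
into table customer_gz;

create table customer_gz_seq
stored as sequencefile as
select * from customer_gz;
EOF

# Run only where the hive CLI is installed; otherwise report the script path.
if command -v hive >/dev/null 2>&1; then
  hive -f "$hql_file"
else
  echo "hive CLI not found; statements written to $hql_file"
fi
```

Keeping the set statements in the same script as the CTAS matters, because they only apply to the session in which they are issued.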
-----
To copy only a table definition, without any data, create the table with LIKE as shown below.
hive> create table customer_gz_seq_bckup LIKE customer_gz_seq;
Hint: no other clause may appear between the new table name and LIKE.
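To confirm that LIKE copied the definition but no rows, the statement can be followed by a row count. A small sketch, again assuming the hive command-line client is available; like_stmt is a made-up helper name, not part of Hive.

```shell
#!/bin/sh
# like_stmt is a hypothetical helper: prints a CREATE TABLE ... LIKE statement
# for the given new table name ($1) and existing table name ($2).
like_stmt() {
  printf 'create table %s like %s;' "$1" "$2"
}

stmt=$(like_stmt customer_gz_seq_bckup customer_gz_seq)
echo "$stmt"

# Run only where the hive CLI is installed. The count is expected to be 0,
# because LIKE copies the table definition and not the data.
if command -v hive >/dev/null 2>&1; then
  hive -e "$stmt select count(*) from customer_gz_seq_bckup;"
fi
```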