Loading a tab-separated file from HDFS into HBase, and accessing the HBase table from a Hive table (cross reference): a hands-on explanation.
--------------------------------------------
Load the tab-separated file into HDFS:
cli>hdfs dfs -put /home/hadoop/Desktop/Customer.csv
______
Go to $HBASE_HOME/bin
The table hb_cust1 must exist in HBase before you execute the import command below. Create it with two column families, nm and pd:
HBASE_HOME/bin>hbase shell
hbase>create 'hb_cust1' ,'nm','pd'
hbase>list
hbase>describe 'hb_cust1'
hbase>exit
Then execute the following command (entered as a single line, with no break inside the column list):
HBASE_HOME/bin>./hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=nm:fname,nm:lname,pd:cat,pd:ph,HBASE_ROW_KEY,pd:addr 'hb_cust1' hdfs://localhost:8020/user/hadoop/pigout/part-m-00000
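The column list in the import command is positional: the first field of each line goes to nm:fname, the second to nm:lname, and so on, with the fifth field used as the row key. A minimal sketch of that pairing, run locally with awk and no Hadoop involved, is shown here. The function name show_mapping is made up for this illustration, and the sample record is the first line of the data file with its tabs written out explicitly.

```shell
# Column list copied from the ImportTsv command.
columns='nm:fname,nm:lname,pd:cat,pd:ph,HBASE_ROW_KEY,pd:addr'

# show_mapping (hypothetical helper): label each tab-separated field of one
# record with the column it would be loaded into.
show_mapping() {
  printf '%s\n' "$1" | awk -F'\t' -v cols="$columns" '
    BEGIN { n = split(cols, name, ",") }
    {
      # A record with the wrong number of fields cannot be mapped.
      if (NF != n) { printf "expected %d fields, got %d\n", n, NF; exit 1 }
      for (i = 1; i <= NF; i++) printf "%-14s %s\n", name[i], $i
    }'
}

# First record of the data file.
show_mapping "$(printf 'John\tWade\tA\t416-776-9281\t2938\tBigcity|12345')"
```

Running the same check over a few lines of the real file before the import is a cheap way to catch a record with a missing or extra field.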
-----------------------------------------------------------------------------------
The data file (fields are tab-separated; the fifth field is the customer ID, which becomes the HBase row key):
John Wade A 416-776-9281 2938 Bigcity|12345
Franklin Josephs I 905-560-9887 2913 Mediumcity|67890
Elizabeth Reynolds A 647-908-8865 2891 Smallcity|98765
Arthur Gibbons A 416-238-7765 400 Bigcity|12345
Lisa Scott I 416-824-8866 402 Bigcity|12345
Cliff Bowden A 905-994-9982 3772 Mediumcity|67890
Jean Dexter A 416-774-2339 210 Bigcity|12345
Roger Vickers A 647-232-9987 234 Smallcity|98765
Hubert Lexter A 416-972-2348 109 Bigcity|12345
Jack Merdec A 905-216-0989 54 Mediumcity|67890
Jovan Melcic A 905-324-4456 92 Mediumcity|67890
Melanie Ilyenko I 416-988-7723 404 Bigcity|12345
Jillian Panelo A 905-498-8872 12 Mediumcity|67890
Michael Junielle A 416-209-9987 43 Bigcity|12345
Rodson Mello A 416-996-7654 220 Bigcity|12345
Erick Jansen A 905-498-8211 93 Mediumcity|67890
Grant Getnet A 416-298-3445 332 Bigcity|12345
Roscoe Banhent A 647-829-0223 338 Smallcity|98765
Allen Serghert A 416-344-0992 324 Bigcity|12345
John Merdec I 416-922-2331 55 Bigcity|12345
Franklin Melcic A 905-561-2330 647 Mediumcity|67890
Elizabeth Ilyenko A 416-354-2778 102 Bigcity|12345
Arthur Panelo A 905-276-9987 227 Mediumcity|67890
Lisa Junielle A 416-352-9992 323 Bigcity|12345
Cliff Mello A 416-210-9997 47 Bigcity|12345
Jean Jansen A 905-244-4588 431 Mediumcity|67890
Roger Getnet I 416-309-9982 24 Bigcity|12345
Hubert Banhent A 416-526-8888 19 Bigcity|12345
Gerhart Serghert A 905-561-9234 773 Mediumcity|67890
Lambert Givens A 416-209-8223 587 Bigcity|12345
Panelo Wade I 905-298-9992 922 Mediumcity|67890
Junielle Josephs A 905-737-9088 433 Mediumcity|67890
Mello Reynolds A 905-245-4431 64 Mediumcity|67890
Jansen Gibbons A 416-298-3881 895 Bigcity|12345
Getnet Scott A 905-309-8221 1993 Mediumcity|67890
Banhent Bowden A 416-989-0223 720 Bigcity|12345
Serghert Dexter A 416-823-0991 830 Bigcity|12345
Merdec Vickers I 416-298-0908 176 Bigcity|12345
Melcic Lexter A 416-823-4443 128 Bigcity|12345
Ilyenko Merdec A 416-293-8771 97 Bigcity|12345
Roscoe Morris A 905-455-3221 322 Mediumcity|67890
Allen Jaskobec A 416-299-0202 7 Bigcity|12345
John Jarkin A 416-622-0991 11 Bigcity|12345
Franklin Drill A 647-309-2331 37 Smallcity|98765
Elizabeth Metzer A 416-322-9001 85 Bigcity|12345
Arthur Balgne A 905-311-1211 331 Mediumcity|67890
Lisa Hetzer A 416-980-3229 482 Bigcity|12345
Cliff Brandson I 416-559-0223 3221 Bigcity|12345
Jean Kulinski A 905-409-8823 452 Mediumcity|67890
Roger Marjory A 647-290-6776 557 Smallcity|98765
Hubert Mentz A 416-577-8233 312 Bigcity|12345
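Because the fifth field is used as the row key, two lines with the same customer ID would land in the same HBase row, and a default scan would show only one set of values for it. The sketch below checks a file for repeated keys before loading. The function name duplicate_keys and the two-line sample file are assumptions made for the illustration; point the function at your own copy of the data file.

```shell
# duplicate_keys (hypothetical helper): print every value that occurs more
# than once in field 5 of a tab-separated file.
duplicate_keys() {
  cut -f5 "$1" | sort | uniq -d
}

# Small sample built from two records of the data file.
sample=$(mktemp)
printf 'John\tWade\tA\t416-776-9281\t2938\tBigcity|12345\n' >  "$sample"
printf 'Lisa\tScott\tI\t416-824-8866\t402\tBigcity|12345\n' >> "$sample"

# Prints nothing here, because the two keys (2938 and 402) are distinct.
duplicate_keys "$sample"
rm -f "$sample"
```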
---------------------------------------------------------------------------
Once the import has finished, go to the HBase shell and scan the table to see the rows:
hbase>list
hbase>scan 'hb_cust1'
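The scan does not return rows in file order. HBase keeps rows sorted by the bytes of the row key, and ImportTsv stores the key as the text that appeared in the file, so the customer IDs come back in string order rather than numeric order. A small local sketch of that ordering, using a handful of keys from the data file (scan_order is a name made up for this example):

```shell
# scan_order (hypothetical helper): sort the given keys byte-wise, which is
# the order HBase uses for row keys.
scan_order() {
  printf '%s\n' "$@" | LC_ALL=C sort
}

scan_order 2938 400 54 7 109 12
# prints, one per line: 109 12 2938 400 54 7
```

If numeric ordering matters, a common approach is to zero-pad the key before loading; that is a design choice outside the steps shown in this post.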
--------------------------------------------------------------------------
Because the data is already present in the HBase table, you can access it from Hive only through an external table declaration.
Follow these steps:
1)Go to $HIVE_HOME/bin
2)$HIVE_HOME/bin>hive (shell)
3)At hive shell
hive>create external table hbtohive_customers(first_name string, last_name string, category string, phone_number string, customer_id int, address Array<String>)
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES("hbase.columns.mapping"="nm:fname,nm:lname,pd:cat,pd:ph,:key,pd:addr")
> TBLPROPERTIES("hbase.table.name"="hb_cust1");
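The entries in hbase.columns.mapping are matched to the Hive columns by position, so the two lists must have the same length and the same order; :key marks the Hive column that receives the HBase row key. The sketch below pairs the two lists from the statement above so a mismatch is easy to spot. It is only a local awk check, and pair_columns is a name invented for it.

```shell
# Both lists are copied from the create external table statement.
hive_cols='first_name,last_name,category,phone_number,customer_id,address'
hbase_map='nm:fname,nm:lname,pd:cat,pd:ph,:key,pd:addr'

# pair_columns (hypothetical helper): print each Hive column next to the
# HBase mapping entry in the same position.
pair_columns() {
  awk -v h="$1" -v m="$2" 'BEGIN {
    nh = split(h, hc, ","); nm = split(m, mc, ",")
    if (nh != nm) { print "column count mismatch"; exit 1 }
    for (i = 1; i <= nh; i++) print hc[i], "->", mc[i]
  }'
}

pair_columns "$hive_cols" "$hbase_map"
```

The fifth pair, customer_id and :key, lines up with HBASE_ROW_KEY being the fifth entry in the ImportTsv column list.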
________________________________________________
At the hive prompt, the following query displays the results:
hive>select * from hbtohive_customers;