Hive Left Semi Join / IN clause with subquery (hands-on explanation)

______________
Step 1: Describe the two tables
hive> describe sales;
OK
cust_id                 int                                        
prod_no                 int                                        
qty                     int                                        
datestring              string                                     
sales_id                int                                        
Time taken: 0.104 seconds, Fetched: 5 row(s)
hive> describe customer_temp;
FAILED: SemanticException [Error 10001]: Table not found customer_temp
hive> describe customers_temp;
OK
first_name              string                                     
last_name               string                                     
category                string                                     
phone_number            string                                     
customer_id             int                                        
address                 array<string>                              
Time taken: 0.097 seconds, Fetched: 6 row(s)

Step 2: Using a subquery with the IN clause
select c.first_name, c.phone_number from customers_temp c where c.customer_id in (select cust_id from sales);
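
The same filter can also be written with a correlated EXISTS subquery. This is only a sketch of an equivalent form, assuming a Hive version that supports subqueries in the WHERE clause (0.13 or later); it is not part of the original steps.

```sql
-- Keep a customer only if at least one sales row refers to it.
select c.first_name, c.phone_number
from customers_temp c
where exists (select 1 from sales s where s.cust_id = c.customer_id);
```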

Step 3: Using left semi join
select c.first_name, c.phone_number from customers_temp c left semi join sales s on c.customer_id = s.cust_id;
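
Two points about left semi join are easier to see in code. First, a plain inner join returns one row per matching sales row, so a customer with several sales is repeated, while the semi join returns that customer once. Second, in a left semi join the right-hand table (sales) can be referenced only in the ON clause, not in the select list or the where clause. One way to filter on a sales column is to push the filter into a subquery on the right side. The sketch below assumes the tables from Step 1; the threshold qty > 10 is a made-up value for illustration.

```sql
-- Inner join: a customer with three sales rows appears three times.
select c.first_name, c.phone_number
from customers_temp c
join sales s on c.customer_id = s.cust_id;

-- Left semi join against a filtered right side:
-- customers with at least one sale where qty > 10 (hypothetical threshold).
select c.first_name, c.phone_number
from customers_temp c
left semi join (select cust_id from sales where qty > 10) s
on c.customer_id = s.cust_id;
```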

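Hint: Both queries return the same result. Each customer that has at least one row in sales appears once, no matter how many sales rows match.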
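
To check the equivalence on a small data set, something like the following can be tried. The tables demo_customers and demo_sales, their columns, and all the values are hypothetical and simplified (they are not the tables described in Step 1), and insert ... values assumes Hive 0.14 or later.

```sql
-- Hypothetical demo tables, simplified for illustration.
create table demo_customers (customer_id int, first_name string, phone_number string);
create table demo_sales (sales_id int, cust_id int, qty int);

insert into table demo_customers values
  (1, 'Asha', '555-0101'),
  (2, 'Ben',  '555-0102'),
  (3, 'Chen', '555-0103');

-- Customer 1 has two sales, customer 2 has none, customer 3 has one.
insert into table demo_sales values
  (10, 1, 2),
  (11, 1, 5),
  (12, 3, 1);

-- IN clause with subquery
select c.first_name, c.phone_number
from demo_customers c
where c.customer_id in (select cust_id from demo_sales);

-- Left semi join
select c.first_name, c.phone_number
from demo_customers c
left semi join demo_sales s on c.customer_id = s.cust_id;
```

Both queries should list Asha and Chen once each (row order is not guaranteed) and leave out Ben, who has no sales.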
