Jan 06, 2019, 02:08 PM
Prerequisites:
>> Enable the Hive interactive server (HiveServer2 Interactive) in Hive

>> Get the following details from Hive for Spark

spark.hadoop.hive.llap.daemon.service.hosts @llap0
spark.sql.hive.hiveserver2.jdbc.url jdbc:hive2://c420-node2.squadron-labs.com:2181,c420-node3.squadron-labs.com:2181,c420-node4.squadron-labs.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-interactive
spark.datasource.hive.warehouse.load.staging.dir /tmp
spark.datasource.hive.warehouse.metastoreUri thrift://c420-node3.squadron-labs.com:9083
spark.security.credentials.hiveserver2.enabled false

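The JDBC URL above is just the ZooKeeper quorum followed by the service-discovery parameters. A minimal shell sketch that assembles it so the quorum is written only once; the variable names (ZK_QUORUM, ZK_NAMESPACE, HS2_JDBC_URL) are hypothetical, and the host names are the ones used in this post:

```shell
# Hypothetical variable names; replace the hosts with your own ZooKeeper quorum.
ZK_QUORUM="c420-node2.squadron-labs.com:2181,c420-node3.squadron-labs.com:2181,c420-node4.squadron-labs.com:2181"
ZK_NAMESPACE="hiveserver2-interactive"

# Quorum + discovery mode + namespace, in the same order as the value listed above.
HS2_JDBC_URL="jdbc:hive2://${ZK_QUORUM}/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=${ZK_NAMESPACE}"

echo "spark.sql.hive.hiveserver2.jdbc.url ${HS2_JDBC_URL}"
```

The same ZK_QUORUM value can be reused for spark.hadoop.hive.zookeeper.quorum, which appears in the later steps.
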
Basic testing:

1) Create a table employee in Hive and load some data
 e.g.:
 Create table
 ----------------

```
CREATE TABLE IF NOT EXISTS employee ( eid int, name String, salary String, destination String)
COMMENT 'Employee details'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;
```
 Load the data.txt file into HDFS
 ---------------
```
1201,Gopal,45000,Technical manager
1202,Manisha,45000,Proof reader
1203,Masthanvali,40000,Technical writer
1204,Kiran,40000,Hr Admin
1205,Kranthi,30000,Op Admin
```

```
LOAD DATA INPATH '/tmp/data.txt' OVERWRITE INTO TABLE employee;
```

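Staging the file can be sketched in shell as follows. This assumes the hdfs CLI is on the PATH and that the current user may write to /tmp in HDFS; it writes the sample rows to a local data.txt and copies them to the /tmp/data.txt path used by the LOAD DATA statement:

```shell
# Write the sample rows to a local file.
cat > data.txt <<'EOF'
1201,Gopal,45000,Technical manager
1202,Manisha,45000,Proof reader
1203,Masthanvali,40000,Technical writer
1204,Kiran,40000,Hr Admin
1205,Kranthi,30000,Op Admin
EOF

# Copy to HDFS only where the hdfs client is installed; -f overwrites an existing file.
if command -v hdfs >/dev/null 2>&1; then
  hdfs dfs -put -f data.txt /tmp/data.txt
fi
```

Because LOAD DATA INPATH (without LOCAL) moves the file out of /tmp, the copy step has to be repeated before loading again.
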
2) kinit as the spark user and run

spark-shell --master yarn \
  --conf "spark.security.credentials.hiveserver2.enabled=false" \
  --conf "spark.sql.hive.hiveserver2.jdbc.url=jdbc:hive2://c420-node2.squadron-labs.com:2181,c420-node3.squadron-labs.com:2181,c420-node4.squadron-labs.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-interactive;principal=hive/_HOST@HWX.COM" \
  --conf "spark.datasource.hive.warehouse.metastoreUri=thrift://c420-node3.squadron-labs.com:9083" \
  --conf "spark.datasource.hive.warehouse.load.staging.dir=/tmp/" \
  --conf "spark.hadoop.hive.llap.daemon.service.hosts=@llap0" \
  --conf "spark.hadoop.hive.zookeeper.quorum=c420-node2.squadron-labs.com:2181,c420-node3.squadron-labs.com:2181,c420-node4.squadron-labs.com:2181" \
  --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.1.0-187.jar

Note: spark.security.credentials.hiveserver2.enabled should be set to false for YARN client deploy mode and to true for YARN cluster deploy mode (by default). This setting is required on a Kerberized cluster.

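The note above amounts to a lookup from deploy mode to a flag value. A minimal shell sketch of that lookup; the helper name hs2_credentials_enabled and the DEPLOY_MODE variable are hypothetical and not part of Spark or HWC:

```shell
# Hypothetical helper: maps a YARN deploy mode to the value of
# spark.security.credentials.hiveserver2.enabled described in the note.
hs2_credentials_enabled() {
  case "$1" in
    client)  echo "false" ;;
    cluster) echo "true" ;;
    *)       echo "unknown deploy mode: $1" >&2; return 1 ;;
  esac
}

DEPLOY_MODE="client"  # spark-shell always runs in client mode
echo "--conf spark.security.credentials.hiveserver2.enabled=$(hs2_credentials_enabled "$DEPLOY_MODE")"
```

A wrapper script could pass the printed option to spark-shell or spark-submit so that the flag always matches the chosen deploy mode.
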
3) Run the following code in the Scala shell to view the table data
import com.hortonworks.hwc.HiveWarehouseSession
val hive = HiveWarehouseSession.session(spark).build()
hive.execute("show tables").show
hive.executeQuery("select * from employee").show

4) To apply the common properties by default, add the following settings to the Ambari Spark2 custom configuration


spark.sql.hive.hiveserver2.jdbc.url=jdbc:hive2://c420-node2.squadron-labs.com:2181,c420-node3.squadron-labs.com:2181,c420-node4.squadron-labs.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-interactive;principal=hive/_HOST@HWX.COM
spark.datasource.hive.warehouse.metastoreUri=thrift://c420-node3.squadron-labs.com:9083
spark.datasource.hive.warehouse.load.staging.dir=/tmp/
spark.hadoop.hive.llap.daemon.service.hosts=@llap0
spark.hadoop.hive.zookeeper.quorum=c420-node2.squadron-labs.com:2181,c420-node3.squadron-labs.com:2181,c420-node4.squadron-labs.com:2181

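After Ambari has pushed the change, it can be worth checking that all five properties actually reached the client configuration. A minimal shell sketch, assuming the defaults end up in spark-defaults.conf under /usr/hdp/current/spark2-client/conf; the function name missing_hwc_keys is hypothetical:

```shell
# Hypothetical helper: prints each HWC property from step 4 that is NOT set in the given file.
missing_hwc_keys() {
  conf_file="$1"
  for key in \
    spark.sql.hive.hiveserver2.jdbc.url \
    spark.datasource.hive.warehouse.metastoreUri \
    spark.datasource.hive.warehouse.load.staging.dir \
    spark.hadoop.hive.llap.daemon.service.hosts \
    spark.hadoop.hive.zookeeper.quorum
  do
    # Escape the dots so they match literally, then look for "key<space>" or "key=".
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    grep -Eq "^[[:space:]]*${pattern}([[:space:]]|=)" "$conf_file" || echo "$key"
  done
}

# Assumed location of the Spark2 client defaults; skipped when the file is absent.
CONF_FILE=/usr/hdp/current/spark2-client/conf/spark-defaults.conf
if [ -f "$CONF_FILE" ]; then
  missing_hwc_keys "$CONF_FILE"
fi
```

No output means every property is present; any printed key still has to be passed with --conf.
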
5) Run spark-shell with only the remaining options

spark-shell --master yarn --conf "spark.security.credentials.hiveserver2.enabled=false" --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.1.0-187.jar
Note: The common properties are now read from the Spark default properties.

6) Run the following code in the Scala shell to view the Hive table data
```
import com.hortonworks.hwc.HiveWarehouseSession
val hive = HiveWarehouseSession.session(spark).build()
hive.execute("show tables").show
hive.executeQuery("select * from employee").show
```

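The same check can be scripted instead of typed into the shell. A minimal sketch, assuming the common properties from step 4 are already in the Spark defaults and that spark-shell forwards the Scala REPL's -i option to preload a file; the file name hwc_check.scala is hypothetical:

```shell
# Write the Scala statements from step 6 to a file, ending with an explicit exit
# so that the shell does not stay open after the queries have run.
cat > hwc_check.scala <<'EOF'
import com.hortonworks.hwc.HiveWarehouseSession
val hive = HiveWarehouseSession.session(spark).build()
hive.execute("show tables").show
hive.executeQuery("select * from employee").show
System.exit(0)
EOF

# Run it only where spark-shell is installed; the options are the ones from step 5.
if command -v spark-shell >/dev/null 2>&1; then
  spark-shell --master yarn \
    --conf "spark.security.credentials.hiveserver2.enabled=false" \
    --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.1.0-187.jar \
    -i hwc_check.scala
fi
```

As in step 2, a valid Kerberos ticket for the spark user is needed before running it on a Kerberized cluster.
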
7) To integrate HWC with Livy2

 a) Add the following property in Custom livy2-conf
livy.file.local-dir-whitelist=/usr/hdp/current/hive_warehouse_connector/
 b) Add hive-site.xml to /usr/hdp/current/spark2-client/conf on all cluster nodes.

 c) In the livy2 interpreter settings, add the following

livy.spark.hadoop.hive.llap.daemon.service.hosts @llap0
livy.spark.security.credentials.hiveserver2.enabled true
livy.spark.sql.hive.hiveserver2.jdbc.url jdbc:hive2://c420-node2.squadron-labs.com:2181,c420-node3.squadron-labs.com:2181,c420-node4.squadron-labs.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-interactive
livy.spark.sql.hive.hiveserver2.jdbc.url.principal hive/_HOST@HWX.COM
livy.spark.yarn.security.credentials.hiveserver2.enabled true
livy.spark.jars file:///usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.1.0-187.jar

 d) Restart the livy2 interpreter

 e) In the first paragraph, add
 %livy2
import com.hortonworks.hwc.HiveWarehouseSession
val hive = HiveWarehouseSession.session(spark).build()

 f) In the second paragraph, add
 %livy2
hive.executeQuery("select * from employee").show
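
Most of the interpreter properties in step c) are ordinary Spark settings with a livy. prefix, which is how the Zeppelin Livy interpreter passes Spark configuration through. A minimal shell sketch of that renaming; the function name to_livy_props is hypothetical:

```shell
# Hypothetical helper: reads "key value" lines on stdin and prefixes every
# property name that starts with "spark." with "livy.".
to_livy_props() {
  sed -e 's/^spark\./livy.spark./'
}

to_livy_props <<'EOF'
spark.hadoop.hive.llap.daemon.service.hosts @llap0
spark.security.credentials.hiveserver2.enabled true
EOF
```

This prints the first two properties from step c). The value of spark.security.credentials.hiveserver2.enabled is true here, unlike in the spark-shell steps.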