RHive Setup and Execution

hellotheresy 2017. 4. 12. 16:03

Overview 

 : A page for installing and testing RHive ( R ↔ Hive ) 


Environment

: HDP 2.5 Stack ( Hive 1.2, Hadoop 2.7 ) 

: R 3.2.3


Source download 


> git clone https://github.com/nexr/RHive.git
> yum install ant

Build

> ant build 



The README file reads as follows. 

## Install RHive
1. Requirements
- ant (in order to build java files)
2. Installing RHive
1. Download source code: <code>git clone https://github.com/nexr/RHive.git</code>
2. Change your working directory: <code>cd RHive</code>
3. Set the environment variables HIVE_HOME and HADOOP_HOME:
<code>export HIVE_HOME=/path/to/your/hive/directory</code>
<code>export HADOOP_HOME=/path/to/your/hadoop/directory</code>
4. Build java files using ant: <code>ant build</code>
5. Build RHive: <code>R CMD build RHive</code>
6. Install RHive: <code>R CMD INSTALL RHive_<VERSION>.tar.gz</code>

## Loading RHive and connecting to Hive
1. Set the environment variables HIVE_HOME and HADOOP_HOME:
- Set the environment variables:
<code>export HIVE_HOME=/path/to/your/hive/directory</code>
<code>export HADOOP_HOME=/path/to/your/hadoop/directory</code>
<code>export HADOOP_CONF_DIR=/path/to/your/hadoop/conf/directory</code>
- Or, add environment variables into Renviron
<code>HIVE_HOME=/path/to/your/hive/directory</code>
<code>HADOOP_HOME=/path/to/your/hadoop/directory</code>
<code>HADOOP_CONF_DIR=/path/to/your/hadoop/conf/directory</code>
2. launch R
<pre><code>library(RHive)
rhive.connect(host, port, hiveServer2)</code></pre>

## Tutorials
- [RHive user guide](https://github.com/nexr/RHive/wiki/User-Guide)

## Requirements
- Java 1.6
- R 2.13.0
- Rserve 0.6-0
- rJava 0.9-0
- Hadoop 0.20.x (x >= 1)

→ The environment variables need to be set before building. 


 Environment variables

> export HIVE_HOME=/usr/hdp/current/hive-client

> export HADOOP_HOME=/usr/hdp/current/hadoop-client
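
Before rebuilding, it can help to confirm that both variables are set and point at real directories. The following is a minimal sketch, not part of RHive: `check_home_dir` is a hypothetical helper name, and it only checks that the named variable is non-empty and refers to an existing directory.

```shell
# Hypothetical helper (not part of RHive): succeed only if the variable
# named by $1 is set and points to an existing directory.
check_home_dir() {
    var_name="$1"
    # Indirect lookup of the variable whose name is stored in var_name.
    eval "var_value=\${$var_name:-}"
    if [ -z "$var_value" ]; then
        echo "$var_name is not set" >&2
        return 1
    fi
    if [ ! -d "$var_value" ]; then
        echo "$var_name is not a directory: $var_value" >&2
        return 1
    fi
    echo "$var_name=$var_value"
}

# Example use before the build:
#   check_home_dir HIVE_HOME && check_home_dir HADOOP_HOME && ant build
```

Chaining the checks with `&&` means `ant build` only starts when both directories exist, so a missing variable shows up as a short message rather than a build failure.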



 Re-Build

> ant build 




  > R CMD build ./RHive


Installing the RHive package 

 > R CMD INSTALL ./RHive_2.0-0.10.tar.gz


 : This requires rJava and Rserve to be installed first 



 > R 

 > install.packages("rJava") 

 > install.packages("Rserve") 

 > q() 

 > R CMD INSTALL ./RHive_2.0-0.10.tar.gz

 : The package can also be installed from within R 

 > R 

 > install.packages("./RHive_2.0-0.10.tar.gz", repos=NULL) 
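
The tarball name used above hardcodes the version produced by this particular build. As a sketch, the file could instead be located through the `RHive_<VERSION>.tar.gz` pattern given in the README. `find_rhive_tarball` is a hypothetical helper, and it assumes the tarball sits directly in the given directory and that the shell's `test` supports `-nt` (bash and dash do).

```shell
# Hypothetical helper: print the most recently modified RHive_*.tar.gz in a
# directory (default: current directory); fail if none exists.
find_rhive_tarball() {
    search_dir="${1:-.}"
    newest=""
    for candidate in "$search_dir"/RHive_*.tar.gz; do
        # If the glob matched nothing, the literal pattern is skipped here.
        [ -e "$candidate" ] || continue
        if [ -z "$newest" ] || [ "$candidate" -nt "$newest" ]; then
            newest="$candidate"
        fi
    done
    if [ -z "$newest" ]; then
        echo "no RHive_*.tar.gz found in $search_dir" >&2
        return 1
    fi
    echo "$newest"
}

# Example use:
#   R CMD INSTALL "$(find_rhive_tarball .)"
```

Picking by modification time rather than by name avoids sorting version strings, where `0.10` would otherwise sort before `0.9`.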



Running an RHive example


> export HIVE_HOME=/usr/hdp/current/hive-client

> export HADOOP_HOME=/usr/hdp/current/hadoop-client


> su - hdfs

> R 

> Sys.setenv(HIVE_HOME="/usr/hdp/current/hive-client")

> Sys.setenv(HADOOP_HOME="/usr/hdp/current/hadoop-client")
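
Since `su - hdfs` starts a login shell, the earlier exports may not carry over, which would explain why the variables are set again inside R. The Renviron option from the README avoids repeating this in every session. A minimal sketch, assuming the per-user file `~/.Renviron` (which R reads at startup) is the intended target; `add_renviron_var` is a hypothetical helper name.

```shell
# Hypothetical helper: append NAME=value to an Renviron-style file unless
# a line defining NAME already exists there.
add_renviron_var() {
    env_file="$1"
    var_name="$2"
    var_value="$3"
    if [ -f "$env_file" ] && grep -q "^${var_name}=" "$env_file"; then
        return 0    # already defined; leave the existing line untouched
    fi
    printf '%s=%s\n' "$var_name" "$var_value" >> "$env_file"
}

# Example use, run as the user that will start R:
#   add_renviron_var "$HOME/.Renviron" HIVE_HOME /usr/hdp/current/hive-client
#   add_renviron_var "$HOME/.Renviron" HADOOP_HOME /usr/hdp/current/hadoop-client
```

The helper never overwrites an existing definition, so running it twice leaves the file unchanged.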


library(rJava)

library(Rserve)

library(RHive)


rhive.connect()

rhive.query("select count(*) from customer")

rhive.query("select count(*) from tpch.supplier")

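> rhive.query(set hive.execution.engine=mr")
Error: unexpected symbol in "rhive.query(set hive.execution.engine"
> rhive.query("set hive.execution.engine=mr")
Error: java.sql.SQLException: The query did not generate a result set!

: The first call fails because the opening quote is missing. The second fails because rhive.query expects a result set, and a set statement does not return one.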


Execution result




Changing the connection user

rhive.connect("localhost", user="hdfs")

rhive.query("select count(*) from tpch.supplier")








After setting the user, checked whether the query runs on Tez → confirmed working 


Loading Hive data into R 

resultDF <- rhive.query("select * from tpch.supplier limit 10")

summary(resultDF)


