1. MySQL preparation
yum install -y mariadb-server
systemctl start mariadb
systemctl enable mariadb
Run the following to secure the installation (this is where you set the root password, e.g. root@123):
mysql_secure_installation
Log in and allow root to connect from any host:
mysql -uroot -proot@123
GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY 'root@123' WITH GRANT OPTION;
FLUSH PRIVILEGES;
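To confirm the grant took effect, you can try a remote login from one of the other nodes (node3 is assumed here to be the MySQL host, with the password set above):
mysql -h node3 -uroot -proot@123 -e "SELECT 1;"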
2. Install Hive
2.1 Upload apache-hive-3.1.3-bin.tar.gz
Perform the following on node3.
Upload apache-hive-3.1.3-bin.tar.gz to /opt/bigdata and extract it there with tar -zxvf apache-hive-3.1.3-bin.tar.gz.
Configure HIVE_HOME:
export JAVA_HOME=/opt/bigdata/jdk1.8.0_461
export HADOOP_HOME=/opt/bigdata/hadoop-3.2.2
export HIVE_HOME=/opt/bigdata/apache-hive-3.1.3-bin
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin
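These exports are assumed to go into /etc/profile (adjust if you use a different shell profile); reload it and check that the hive command resolves:
source /etc/profile
which hive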
2.2 Configuration
cd /opt/bigdata/apache-hive-3.1.3-bin/conf
cp hive-env.sh.template hive-env.sh
vim hive-env.sh
Append the following at the end:
HADOOP_HOME=/opt/bigdata/hadoop-3.2.2
Create a new file in the same directory:
vim hive-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- MySQL connection settings for the metastore (node3 resolves via /etc/hosts) -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://node3:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false&amp;useUnicode=true&amp;characterEncoding=UTF-8</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>root@123</value>
  </property>
  <!-- Metastore service address -->
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://node3:9083</value>
  </property>
  <!-- Client connection timeout (seconds) -->
  <property>
    <name>hive.metastore.client.socket.timeout</name>
    <value>300</value>
  </property>
</configuration>
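Note that inside hive-site.xml the & separators in the JDBC URL must be written as &amp;, as above; a bare & is invalid XML and makes the file unparseable.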
2.3 Add the MySQL driver to Hive's lib directory
Copy mysql-connector-java-5.1.47.jar into /opt/bigdata/apache-hive-3.1.3-bin/lib.
2.4 Distribute
scp -r apache-hive-3.1.3-bin root@node1:/opt/bigdata/
scp -r apache-hive-3.1.3-bin root@node2:/opt/bigdata/
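If node1 and node2 will also run Hive commands, remember to add the same HIVE_HOME and PATH entries to their shell profiles.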
2.5 Start (on node3)
Initialize the metastore schema:
cd /opt/bigdata/apache-hive-3.1.3-bin/bin
./schematool -dbType mysql -initSchema
Note: this step may fail with:
Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(Ljava/lang/String;Ljava/lang/Object;)V
Cause:
Hive 3.1.3 bundles an old Guava (guava-19.0.jar) in its lib directory, while Hadoop 3.2.2 ships a much newer guava-27.0-jre.jar; the version mismatch triggers the NoSuchMethodError.
Fix:
1. Delete (or back up) the old guava jar under Hive's lib directory.
2. Replace it with Hadoop's guava-27.0-jre.jar (or a newer version); see the commands below.
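With the stock tarball layouts, that boils down to the following (jar names assumed; adjust to whatever versions you actually find in the two lib directories):
# remove the old Guava bundled with Hive
rm /opt/bigdata/apache-hive-3.1.3-bin/lib/guava-19.0.jar
# copy over the newer Guava shipped with Hadoop
cp /opt/bigdata/hadoop-3.2.2/share/hadoop/common/lib/guava-27.0-jre.jar /opt/bigdata/apache-hive-3.1.3-bin/lib/
Then rerun ./schematool -dbType mysql -initSchema.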
Start the Hive metastore:
cd /opt/bigdata/apache-hive-3.1.3-bin/bin
nohup hive --service metastore &
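A quick sanity check is to confirm the metastore is listening on the Thrift port from hive-site.xml; startup output lands in nohup.out:
ss -tnlp | grep 9083
tail nohup.out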
3. Install Spark 3
3.1 Upload spark-3.3.2-bin-hadoop3.tgz
Perform the following on node1.
Upload spark-3.3.2-bin-hadoop3.tgz to /opt/bigdata and extract it there with tar -zxvf spark-3.3.2-bin-hadoop3.tgz.
Configure SPARK_HOME:
export JAVA_HOME=/opt/bigdata/jdk1.8.0_461
export HADOOP_HOME=/opt/bigdata/hadoop-3.2.2
export SPARK_HOME=/opt/bigdata/spark-3.3.2-bin-hadoop3
export HIVE_HOME=/opt/bigdata/apache-hive-3.1.3-bin
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$SPARK_HOME/bin:$SPARK_HOME/sbin:$HIVE_HOME/bin
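As with Hive, reload the profile these exports live in and verify Spark is on the PATH:
source /etc/profile
spark-submit --version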
3.2 Configuration
Copy hive-site.xml into Spark's conf directory:
cp /opt/bigdata/apache-hive-3.1.3-bin/conf/hive-site.xml /opt/bigdata/spark-3.3.2-bin-hadoop3/conf
Configure spark-env.sh:
mv spark-env.sh.template spark-env.sh
vim spark-env.sh
export JAVA_HOME=/opt/bigdata/jdk1.8.0_461
export HADOOP_CONF_DIR=/opt/bigdata/hadoop-3.2.2/etc/hadoop
export YARN_CONF_DIR=/opt/bigdata/hadoop-3.2.2/etc/hadoop
export SPARK_HISTORY_OPTS="
-Dspark.history.ui.port=18080
-Dspark.history.fs.logDirectory=hdfs://node1:8020/directory
-Dspark.history.retainedApplications=30"
Configure spark-defaults.conf:
mv spark-defaults.conf.template spark-defaults.conf
vim spark-defaults.conf
spark.eventLog.enabled            true
spark.eventLog.dir                hdfs://node1:8020/directory
spark.yarn.historyServer.address  node1:18080
spark.history.ui.port             18080
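Both spark.eventLog.dir and the history server point at hdfs://node1:8020/directory, and Spark will not create that directory for you; make it once up front:
hdfs dfs -mkdir -p /directory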
3.3 Distribute
scp -r spark-3.3.2-bin-hadoop3 root@node2:/opt/bigdata/
scp -r spark-3.3.2-bin-hadoop3 root@node3:/opt/bigdata/
3.4 Start the history server
Start it on node1:
start-history-server.sh
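Once it is up, the history UI should be reachable at http://node1:18080 (the port configured via spark.history.ui.port above).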
3.5 Testing
Spark WordCount test:
spark-submit --class org.apache.spark.examples.JavaWordCount --master yarn --deploy-mode cluster /opt/bigdata/spark-3.3.2-bin-hadoop3/examples/jars/spark-examples_2.12-3.3.2.jar /input/README.txt
To print the submission arguments and resolved properties, add -v:
spark-submit --class org.apache.spark.examples.JavaWordCount --master yarn --deploy-mode cluster -v /opt/bigdata/spark-3.3.2-bin-hadoop3/examples/jars/spark-examples_2.12-3.3.2.jar /input/README.txt
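The job reads its input from HDFS, so /input/README.txt must exist there first. One way to seed it, assuming the README.txt at the top of the Hadoop install directory:
hdfs dfs -mkdir -p /input
hdfs dfs -put /opt/bigdata/hadoop-3.2.2/README.txt /input/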
Spark SQL on Hive test:
[root@node1 bin]# ./spark-sql --master yarn
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
25/07/20 22:40:00 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
25/07/20 22:40:01 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
Spark master: yarn, Application Id: application_1752971759164_0005
spark-sql> show databases;
default
journey
Time taken: 1.147 seconds, Fetched 2 row(s)
spark-sql> use journey;
Time taken: 0.048 seconds
spark-sql> show tables;
t_user
Time taken: 0.081 seconds, Fetched 1 row(s)
spark-sql> select * from t_user;
25/07/20 22:40:25 WARN SessionState: METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory.
1 xx1
2 xx2
3 xx3
Time taken: 1.618 seconds, Fetched 3 row(s)
3.6 Spark Standalone deployment
vim workers
localhost
vim spark-env.sh
export JAVA_HOME=/Users/xxx/Documents/InstallProgram/jdk1.8.0_462.jdk/Contents/Home
export HADOOP_CONF_DIR=/Users/xxx/bigdata/hadoop-3.3.4/etc/hadoop
SPARK_MASTER_HOST=localhost
SPARK_MASTER_PORT=7077
SPARK_MASTER_WEBUI_PORT=8080
SPARK_WORKER_CORES=2
SPARK_WORKER_MEMORY=4g
SPARK_WORKER_WEBUI_PORT=8081
export SPARK_HISTORY_OPTS="
-Dspark.history.ui.port=18080
-Dspark.history.fs.logDirectory=hdfs://localhost:8020/directory
-Dspark.history.retainedApplications=30"
vim spark-defaults.conf
spark.master            spark://localhost:7077
spark.eventLog.enabled  true
spark.eventLog.dir      hdfs://localhost:8020/directory
spark.history.ui.port   18080
Go to $SPARK_HOME/sbin and start the master, a worker, and the history server:
./start-master.sh
./start-worker.sh spark://localhost:7077
./start-history-server.sh
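To check the standalone cluster end to end, the bundled SparkPi example can be submitted against the master (jar path taken from the layout above):
spark-submit --class org.apache.spark.examples.SparkPi --master spark://localhost:7077 $SPARK_HOME/examples/jars/spark-examples_2.12-3.3.2.jar 10
The master UI at http://localhost:8080 should then show the worker and the finished application.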