DataSphere
Contents
Useful links
https://linkis.staged.apache.org/zh-CN/faq/main/
https://github.com/WeBankFinTech/DataSphereStudio/blob/master/README-ZH.md
https://linkis.apache.org/zh-CN/docs/latest/introduction/ https://github.com/apache/incubator-linkis/discussions/3628
Common commands
echo $(hostname -I)
ifconfig ifconfig virbr0 down
curl https://bootstrap.pypa.io/pip/2.7/get-pip.py -o get-pip.py && python get-pip.py && python -m pip install "matplotlib<3.0"
tail -f /home/hadoop/linkis/logs/linkis-cg-entrance.log
tail -f /home/hadoop/linkis/logs/linkis-mg-gateway.log
tail -f /home/hadoop/linkis/logs/linkis-cg-engineconnmanager.log
tail -f /home/hadoop/linkis/logs/linkis-ps-metadatamanager.log
tail -f /home/hadoop/linkis/logs/linkis-cg-linkismanager.log
tail -f /home/hadoop/linkis/logs/linkis-ps-cs.log
tail -f /home/hadoop/linkis/logs/linkis-cg-engineplugin.log
tail -f /home/hadoop/linkis/logs/linkis-ps-publicservice.log
tail -f /home/hadoop/linkis/logs/linkis-mg-eureka.log
tail -f /home/hadoop/linkis/logs/linkis-ps-data-source-manager.log
tail -f /home/hadoop/dss/logs/dss-apiservice-server.log
tail -f /home/hadoop/dss/logs/dss-flow-execution-server.log
tail -f /home/hadoop/dss/logs/dss-framework-orchestrator-server.log
tail -f /home/hadoop/dss/logs/dss-framework-project-server.log
tail -f /home/hadoop/dss/logs/dss-guide-server.log
tail -f /home/hadoop/dss/logs/dss-scriptis-server.log
tail -f /home/hadoop/dss/logs/dss-workflow-server.log
sh /home/hadoop/linkis/sbin/linkis-daemon.sh start mg-eureka
sh /home/hadoop/linkis/sbin/linkis-daemon.sh start ps-publicservice
sh /home/hadoop/linkis/sbin/linkis-daemon.sh start cg-entrance
sh /home/hadoop/linkis/sbin/linkis-daemon.sh start mg-gateway
sh /home/hadoop/linkis/sbin/linkis-daemon.sh start cg-engineconnmanager
sh /home/hadoop/linkis/sbin/linkis-daemon.sh start ps-metadatamanager
sh /home/hadoop/linkis/sbin/linkis-daemon.sh start cg-linkismanager
sh /home/hadoop/linkis/sbin/linkis-daemon.sh start ps-cs
sh /home/hadoop/linkis/sbin/linkis-daemon.sh start cg-engineplugin
sh /home/hadoop/linkis/sbin/linkis-daemon.sh start ps-data-source-manager
Linkis 1.3.0 + DSS 1.1.1 Ansible one-click installation script
https://github.com/apache/incubator-linkis/discussions/3944
3.1 Pre-installation setup
### Install ansible
yum -y install epel-release ansible telnet tar sed zip unzip less vim curl sudo krb5-workstation sssd crontabs net-tools wget openssh-server openssh-clients initscripts dos2unix epel-release python-pip glibc-common
### Set up passwordless SSH
ssh-keygen -t rsa
ssh-copy-id root@192.168.74.134
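After the key has been copied, a quick check that passwordless login works (same host as above):
ssh root@192.168.74.134 hostname
# should print the remote hostname without asking for a password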
3.2 Deploy Linkis + DSS
### Unpack the installation package
tar zxvf dss-linkis-ansible.tar.gz
cd dss-linkis-ansible
# Directory layout
dss-linkis-ansible
├── ansible.cfg # ansible configuration file
├── hosts # hosts and variable configuration
├── playbooks # playbooks
├── README.md # documentation
└── roles # role configuration
### Configure the deployment host (note: ansible_ssh_host must not be set to 127.0.0.1)
vi hosts
[deploy]
dss-service ansible_ssh_host=192.168.74.134 ansible_ssh_port=22
# One-click install of Linkis + DSS
ansible-playbook playbooks/all.yml
When the playbook finishes, open http://192.168.74.134 to view the info page, which lists the access URLs and the account/password of every service.
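The page can also be checked from the shell (a minimal sketch, assuming nginx serves the info page on port 80 of the deploy host):
curl -I http://192.168.74.134
# an HTTP 200/30x response means nginx and the info page are up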
3.3 Deploy other services
# Install DolphinScheduler
ansible-playbook playbooks/dolphinscheduler.yml
### Note: the DolphinScheduler scheduling system must be installed before any of the services below
# Install Visualis
ansible-playbook playbooks/visualis.yml
# Install Qualitis
ansible-playbook playbooks/qualitis.yml
# Install Streamis
ansible-playbook playbooks/streamis.yml
# Install Exchangis
ansible-playbook playbooks/exchangis.yml
Software | Version | Install path | Test / connection command |
---|---|---|---|
MySQL | mysql-5.6 | /usr/local/mysql | mysql -h 127.0.0.1 -uroot -p123456 |
JDK | jdk1.8.0_171 | /usr/local/java | java -version |
Python | python 2.7.5 | /usr/lib64/python2.7 | python -V |
Nginx | nginx/1.20.1 | /etc/nginx | nginx -t |
Hadoop | hadoop-2.7.2 | /opt/hadoop | hdfs dfs -ls / |
Hive | hive-2.3.3 | /opt/hive | hive -e "show databases" |
Spark | spark-2.4.3 | /opt/spark | spark-sql -e "show databases" |
dss | dss-1.1.1 | /home/hadoop/dss | http://192.168.74.134:8085 |
Linkis | linkis-1.3.0 | /home/hadoop/linkis | http://192.168.74.134:8188 |
zookeeper | 3.4.6 | /usr/local/zookeeper | none |
DolphinScheduler | 1.3.9 | /opt/dolphinscheduler | http://192.168.74.134:12345/dolphinscheduler |
Visualis | 1.0.0 | /opt/visualis-server | http://192.168.74.134:9088 |
Qualitis | 0.9.2 | /opt/qualitis | http://192.168.74.134:8090 |
Streamis | 0.2.0 | /opt/streamis | http://192.168.74.134:9188 |
Sqoop | 1.4.6 | /opt/sqoop | sqoop |
Exchangis | 1.0.0 | /opt/exchangis | http://192.168.74.134:8028 |
rm -rf /usr/local/mysql
rm -rf /opt/hadoop
rm -rf /opt/hive
rm -rf /opt/spark
rm -rf /home/hadoop/dss
rm -rf /home/hadoop/linkis
rm -rf /usr/local/zookeeper
rm -rf /opt/dolphinscheduler
rm -rf /opt/visualis-server
rm -rf /opt/qualitis
rm -rf /opt/streamis
rm -rf /opt/sqoop
rm -rf /opt/exchangis
vm
JDK_VERSION=1.8.0-openjdk
JDK_BUILD_REVISION=1.8.0.332.b09-1.el7_9
MYSQL_JDBC_VERSION=8.0.28
HADOOP_VERSION=2.7.2
HIVE_VERSION=2.3.3
SPARK_VERSION=2.4.3
SPARK_HADOOP_VERSION=2.7
FLINK_VERSION=1.12.2
ZOOKEEPER_VERSION=3.5.9
LINKIS_VERSION=0.0.0
HADOOP_HOME=/opt/ldh/current/hadoop
HIVE_AUX_JARS_PATH=/opt/ldh/current/hive/lib
HIVE_HOME=/opt/ldh/current/hive
SPARK_HOME=/opt/ldh/current/spark
FLINK_HOME=/opt/ldh/current/flink
ZOOKEEPER_HOME=/opt/ldh/current/zookeeper
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
#JAVA_HOME /etc/alternatives/jre
JAVA_HOME=/usr/lib/jvm/java-${JDK_VERSION}-${JDK_BUILD_REVISION}.x86_64
PATH=$JAVA_HOME/bin:$PATH:$HADOOP_HOME/bin:$HIVE_HOME/bin:$SPARK_HOME/bin:$FLINK_HOME/bin:$ZOOKEEPER_HOME/bin
HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
HIVE_CONF_DIR=${HIVE_HOME}/conf
SPARK_CONF_DIR=${SPARK_HOME}/conf
FLINK_CONF_DIR=${FLINK_HOME}/conf
ZOOCFGDIR=${ZOOKEEPER_HOME}/conf
ZOO_LOG_DIR=/var/log/zookeeper
yum install -y telnet tar sed zip unzip less vim curl sudo krb5-workstation sssd crontabs net-tools wget openssh-server openssh-clients initscripts dos2unix \
epel-release python-pip glibc-common \
java-${JDK_VERSION}-${JDK_BUILD_REVISION} \
java-${JDK_VERSION}-devel-${JDK_BUILD_REVISION} \
mysql nginx \
&& yum clean all
curl https://bootstrap.pypa.io/pip/2.7/get-pip.py -o get-pip.py && python get-pip.py && python -m pip install "matplotlib<3.0"
ssh-keygen
ssh-copy-id -i ~/.ssh/id_rsa.pub root@127.0.0.1
ssh-copy-id -i ~/.ssh/id_rsa.pub root@0.0.0.0
unalias cp
mkdir /data/ldh
sudo ln -s /data/ldh /opt/ldh
mkdir -p /opt/ldh/${LINKIS_VERSION} \
&& mkdir -p /opt/ldh/current \
&& mkdir -p /data \
&& chmod 777 -R /data
cd /opt/ldh/${LINKIS_VERSION}
tar xvf /data/barkup/ldh-tars/hadoop-${HADOOP_VERSION}.tar.gz
tar xvf /data/barkup/ldh-tars/apache-hive-${HIVE_VERSION}-bin.tar.gz
tar xvf /data/barkup/ldh-tars/spark-${SPARK_VERSION}-bin-hadoop${SPARK_HADOOP_VERSION}.tgz
tar xvf /data/barkup/ldh-tars/flink-${FLINK_VERSION}-bin-scala_2.11.tgz
tar xvf /data/barkup/ldh-tars/apache-zookeeper-${ZOOKEEPER_VERSION}-bin.tar.gz
mkdir -p /opt/common/extendlib/
cp /data/barkup/ldh-tars/mysql-connector-java-8.0.28.jar /opt/common/extendlib/
mkdir -p /etc/ldh \
&& mkdir -p /var/log/hadoop && chmod 777 -R /var/log/hadoop \
&& mkdir -p /var/log/hive && chmod 777 -R /var/log/hive \
&& mkdir -p /var/log/spark && chmod 777 -R /var/log/spark \
&& mkdir -p /var/log/flink && chmod 777 -R /var/log/flink \
&& mkdir -p /var/log/zookeeper && chmod 777 -R /var/log/zookeeper \
&& ln -s /opt/ldh/${LINKIS_VERSION}/hadoop-${HADOOP_VERSION} /opt/ldh/current/hadoop \
&& ln -s /opt/ldh/${LINKIS_VERSION}/apache-hive-${HIVE_VERSION}-bin /opt/ldh/current/hive \
&& ln -s /opt/ldh/${LINKIS_VERSION}/spark-${SPARK_VERSION}-bin-hadoop${SPARK_HADOOP_VERSION} /opt/ldh/current/spark \
&& ln -s /opt/ldh/${LINKIS_VERSION}/flink-${FLINK_VERSION} /opt/ldh/current/flink \
&& ln -s /opt/ldh/${LINKIS_VERSION}/apache-zookeeper-${ZOOKEEPER_VERSION}-bin /opt/ldh/current/zookeeper
mkdir -p ${HADOOP_CONF_DIR} && chmod 777 -R ${HADOOP_CONF_DIR} \
&& mkdir -p ${HIVE_CONF_DIR} && chmod 777 -R ${HIVE_CONF_DIR} \
&& mkdir -p ${SPARK_CONF_DIR} && chmod 777 -R ${SPARK_CONF_DIR} \
&& mkdir -p ${FLINK_CONF_DIR} && chmod 777 -R ${FLINK_CONF_DIR} \
&& mkdir -p ${ZOOCFGDIR} && chmod 777 -R ${ZOOCFGDIR} \
&& mkdir -p ${ZOO_LOG_DIR} && chmod 777 -R ${ZOO_LOG_DIR}
##hadoop
\cp -fr /data/source/ling-boot/dockerfile/files/core-site.xml ${HADOOP_CONF_DIR}
\cp -fr /data/source/ling-boot/dockerfile/files/hdfs-site.xml ${HADOOP_CONF_DIR}
\cp -fr /data/source/ling-boot/dockerfile/files/yarn-site.xml ${HADOOP_CONF_DIR}
\cp -fr /data/source/ling-boot/dockerfile/files/mapred-site.xml ${HADOOP_CONF_DIR}
\cp -fr /data/source/ling-boot/dockerfile/files/hive-site.xml ${HADOOP_CONF_DIR}
\cp -fr /data/source/ling-boot/dockerfile/files/hadoop-env.sh ${HADOOP_CONF_DIR}
#spark
#files/spark-env.sh ${SPARK_CONF_DIR}
\cp -fr /data/source/ling-boot/dockerfile/files/spark-defaults.conf ${SPARK_CONF_DIR}
\cp -fr /data/source/ling-boot/dockerfile/files/workers ${SPARK_CONF_DIR}
\cp -fr /data/source/ling-boot/dockerfile/files/hive-site.xml ${SPARK_CONF_DIR}
\cp -fr /data/source/ling-boot/dockerfile/files/zoo.cfg ${ZOOCFGDIR}
# after creating the soft links
\cp -fr /data/barkup/ldh-tars/mysql-connector-java-${MYSQL_JDBC_VERSION}.jar /opt/ldh/current/hive/lib/
\cp -fr /data/barkup/ldh-tars/mysql-connector-java-${MYSQL_JDBC_VERSION}.jar /opt/ldh/current/spark/jars/
mkdir -p ${HADOOP_HOME}/hadoopinfra/hdfs/namenode && mkdir -p ${HADOOP_HOME}/hadoopinfra/hdfs/datanode
#files/entry-point-ldh.sh /usr/bin/start-all.sh
echo "export CURRENT_HOME=/opt/ldh/current \
export HADOOP_HOME=/opt/ldh/current/hadoop \
export HADOOP_CONF_DIR=/opt/ldh/current/hadoop/etc/hadoop \
export HIVE_HOME=/opt/ldh/current/hive \
export HIVE_CONF_DIR=/opt/ldh/current/hive/conf \
export HIVE_AUX_JARS_PATH=/opt/ldh/current/hive/lib \
export SPARK_HOME=/opt/ldh/current/spark \
export SPARK_CONF_DIR=/opt/ldh/current/spark/conf \
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.332.b09-1.el7_9.x86_64 \
export PATH=$JAVA_HOME/bin:$PATH:$HADOOP_HOME/bin:$HIVE_HOME/bin:$SPARK_HOME/bin \
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar \
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native \
export HADOOP_OPTS=\"-Djava.library.path=$HADOOP_HOME/lib\"">>/etc/profile
source /etc/profile
unzip -d /opt/ldh/${LINKIS_VERSION}/dss-${DSS_VERSION} /data/barkup/ldh-tars/dss_linkis_one-click_install_20221201.zip \
&& ln -s /opt/ldh/${LINKIS_VERSION}/dss-${DSS_VERSION}/dss_linkis_one-click_install_20221201 /opt/ldh/current/dss
DSS_HOME=/opt/ldh/current/dss
DSS_CONF_DIR=${DSS_HOME}/conf
\cp -fr /data/source/ling-boot/dockerfile/files/config.sh ${DSS_CONF_DIR}
\cp -fr /data/source/ling-boot/dockerfile/files/db.sh ${DSS_CONF_DIR}
\cp -fr /data/source/ling-boot/dockerfile/files/dss_init.sh ${DSS_HOME}/bin
\cp -fr /data/source/ling-boot/dockerfile/files/dss_start_all.sh ${DSS_HOME}/bin/dss_start_all.sh_template
\cp -fr /data/source/ling-boot/dockerfile/files/dss_start_sleep.sh ${DSS_HOME}/bin/dss_start_all.sh
chmod +x -R ${DSS_HOME}/bin && chmod -R 600 /etc/ssh
\cp -fr /data/source/ling-boot/dockerfile/files/entry-point-ldh.sh /usr/bin/start-all.sh
\cp -fr /data/source/ling-boot/dockerfile/files/dss_init.sh /opt/ldh/current/dss/bin/dss_init.sh
chmod +x /usr/bin/start-all.sh
chmod +x /opt/ldh/current/dss/bin/dss_init.sh
sh /opt/ldh/current/dss/bin/dss_init.sh
# Edit /etc/nginx/nginx.conf
sed -i 's/user nginx;/user root;/g' /etc/nginx/nginx.conf
sed -i 's/sudo systemctl start nginx/nginx/g' /opt/ldh/current/dss/bin/start-all.sh
sed -i 's/sudo systemctl stop nginx/nginx -s stop/g' /opt/ldh/current/dss/bin/stop-all.sh
mv /opt/ldh/current/dss/bin/dss_start_all.sh /opt/ldh/current/dss/bin/dss_start_all.sh_sleep
cp /opt/ldh/current/dss/bin/dss_start_all.sh_template /opt/ldh/current/dss/bin/dss_start_all.sh
rm -rf apache-linkis-1.1.1-incubating-bin.tar.gz
rm -rf wedatasphere-dss-1.1.1-dist.tar.gz
rm -rf wedatasphere-dss-web-1.1.1-dist.zip
yum clean all
sh /opt/ldh/current/dss/bin/dss_start_all.sh
sed -i 's/dss_ipaddr=${dss_ipaddr\/\/ \/}/dss_ipaddr=192.168.74.134/g' /opt/ldh/current/dss/bin/start-all.sh
sed -i 's/192.168.74.134192.168.122.1172.17.0.1/192.168.74.134/g' /etc/nginx/conf.d/dss.conf
sed -i 's/192.168.74.134192.168.122.1172.17.0.1/192.168.74.134/g' /opt/ldh/current/dss/linkis/conf/linkis-env.sh
sed -i 's/192.168.74.134192.168.122.1172.17.0.1/192.168.74.134/g' /opt/ldh/current/dss/dss/conf/dss.properties
sed -i 's/192.168.74.134192.168.122.1172.17.0.1/192.168.74.134/g' /opt/ldh/current/dss/dss/conf/application-dss.yml
sed -i 's/192.168.74.134192.168.122.1172.17.0.1/192.168.74.134/g' /opt/ldh/current/dss/dss/conf/config.sh
sed -i 's/192.168.74.134192.168.122.1172.17.0.1/192.168.74.134/g' /opt/ldh/current/dss/linkis/conf/application-eureka.yml
sed -i 's/192.168.74.134192.168.122.1172.17.0.1/192.168.74.134/g' /opt/ldh/current/dss/linkis/conf/application-linkis.yml
sed -i 's/192.168.74.134192.168.122.1172.17.0.1/192.168.74.134/g' /opt/ldh/current/dss/linkis/conf/linkis.properties
ln -s /opt/ldh/current/dss/bin/dss_start_all.sh /dss_start_all.sh
vi /opt/ldh/current/dss/linkis/conf/linkis-ps-publicservice.properties
hive.meta.url=jdbc:mysql://rm-8vbe87b5295dz08zhxo.mysql.zhangbei.rds.aliyuncs.com:3306/hive?useSSL=false
hive.meta.user=lingcloud
hive.meta.password=Wb19831010!
sh /opt/ldh/current/dss/linkis/sbin/linkis-daemon.sh start ps-publicservice
Images
cd /data/source/ling-boot/dockerfile/
docker build -f Dockerfile_centos7_base -t registry.cn-shanghai.aliyuncs.com/ling/centos7-base:0.6 .
docker build -f Dockerfile_centos7_ldh_base -t registry.cn-shanghai.aliyuncs.com/ling/centos7-ldh-base:0.1 .
docker build -f Dockerfile_centos7_ldh -t registry.cn-shanghai.aliyuncs.com/ling/centos7-ldh:0.1 .
docker build -f Dockerfile_centos7_dss -t registry.cn-shanghai.aliyuncs.com/ling/centos7-dss:0.1 .
docker run -d --name dsstemplate registry.cn-shanghai.aliyuncs.com/ling/centos7-dss:0.1
docker exec -it dsstemplate bash
/usr/sbin/sshd -D &
ssh-keygen
ssh-copy-id -i ~/.ssh/id_rsa.pub root@127.0.0.1
ssh localhost
ssh 0.0.0.0
sh /opt/ldh/current/dss/bin/dss_init.sh
docker commit -m="customer dss" -a="bo.wang" xxxxx registry.cn-shanghai.aliyuncs.com/ling/centos7-dss:c.0.2
docker run -d -p 9001:9001 -p 50070:50070 -p 9600:9600 -p 8080:8080 -p 8085:8085 \
--name dsstest registry.cn-shanghai.aliyuncs.com/ling/centos7-dss:c.0.2
docker logs -f dsstest
docker exec -it dsstest bash
cd /opt/ldh/current/dss/linkis/logs
tail -f linkis-ps-publicservice.log
docker login --username=102010cncger@sina.com registry.cn-shanghai.aliyuncs.com
docker push registry.cn-shanghai.aliyuncs.com/ling/centos7-dss:c.1.0
Build
Backend
cd ../incubator-linkis
mvn -N install
mvn -Dmaven.test.skip=true clean install
linkis-dist/target/
git branch -a
linkis
git checkout -b release-1.1.1 remotes/origin/release-1.1.1
dss
git checkout -b branch-1.1.1 remotes/origin/branch-1.1.1
Frontend
cd ../incubator-linkis/linkis-web
npm install -g cnpm --registry=https://registry.npm.taobao.org
Then run the following command instead of npm install:
cnpm install
Install windows-build-tools (administrator privileges required)
$ cnpm install --global --production windows-build-tools
Install node-gyp
$ cnpm install --global node-gyp
2. If the build fails, clean up as follows and run it again
# In the project working directory, delete node_modules
$ rm -rf node_modules
# Delete package-lock.json
$ rm -rf package-lock.json
# Clear the npm cache
$ cnpm cache clear --force
# Re-download the dependencies
$ cnpm install
Installation
https://zhuanlan.zhihu.com/p/555062985
https://zhuanlan.zhihu.com/p/556259593
docker cp /ling-cloud/apache-hive-2.3.3-bin.tar.gz 53969e6b563c:/home/hadoop
docker cp /ling-cloud/hadoop-2.7.2.tar.gz 53969e6b563c:/home/hadoop
docker cp /ling-cloud/dss_linkis_one-click_install_20220704.zip 53969e6b563c:/home/hadoop
docker cp /ling-cloud/spark-2.4.3-bin-hadoop2.7.tgz 53969e6b563c:/home/hadoop
docker cp /ling-cloud/spark-2.4.3-bin-without-hadoop.tgz 53969e6b563c:/home/hadoop
hadoop
docker run -i -t registry.cn-shanghai.aliyuncs.com/ling/centos7-base:0.3 /bin/bash
netstat -tunlp
If port 22 is not listed, start sshd with /usr/sbin/sshd -D &
Run as root:
sudo useradd hadoop
sudo echo "hadoop ALL=(ALL) NOPASSWD: NOPASSWD: ALL">>/etc/sudoers
ssh-keygen
ssh-copy-id -i ~/.ssh/id_rsa.pub root@127.0.0.1
After configuring, test it: if no password is required, the setup succeeded.
ssh localhost
sudo rpm -ivh http://repo.mysql.com/yum/mysql-5.5-community/el/6/x86_64/mysql-community-release-el6-5.noarch.rpm
#sudo yum install mysql-community-client mysql-community-devel mysql-community-server php-mysql
sudo rpm -ivh http://nginx.org/packages/centos/7/noarch/RPMS/nginx-release-centos-7-0.el7.ngx.noarch.rpm
sudo yum install -y java-1.8.0-openjdk-1.8.0.342.b07-1.el7_9 java-1.8.0-openjdk-devel-1.8.0.342.b07-1.el7_9 nginx dos2unix mysql initscripts
sudo systemctl enable nginx
sudo systemctl start nginx
Run as the hadoop user:
su hadoop
cd ~
wget https://archive.apache.org/dist/hadoop/common/hadoop-2.7.2/hadoop-2.7.2.tar.gz
tar xvf hadoop-2.7.2.tar.gz
sudo mkdir -p /opt/hadoop
sudo mv hadoop-2.7.2 /opt/hadoop/
sudo vim /etc/profile
export HADOOP_HOME=/opt/hadoop/hadoop-2.7.2
export HIVE_CONF_DIR=/opt/hive/apache-hive-2.3.3-bin/conf
export HIVE_AUX_JARS_PATH=/opt/hive/apache-hive-2.3.3-bin/lib
export HIVE_HOME=/opt/hive/apache-hive-2.3.3-bin
export SPARK_HOME=/opt/spark/spark-2.4.3-bin-without-hadoop
export HADOOP_CONF_DIR=/opt/hadoop/hadoop-2.7.2/etc/hadoop
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.342.b07-1.el7_9.x86_64
export PATH=$JAVA_HOME/bin:$PATH:$HADOOP_HOME/bin:$HIVE_HOME/bin:$SPARK_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
source /etc/profile
sudo vi /etc/hosts
Add the entry: 127.0.0.1 namenode
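The same entry can be appended non-interactively (equivalent to the manual edit above):
echo "127.0.0.1 namenode" | sudo tee -a /etc/hosts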
Configure Hadoop
mkdir -p /opt/hadoop/hadoop-2.7.2/hadoopinfra/hdfs/namenode
mkdir -p /opt/hadoop/hadoop-2.7.2/hadoopinfra/hdfs/datanode
vi /opt/hadoop/hadoop-2.7.2/etc/hadoop/core-site.xml
Modify core-site.xml as follows:
<!-- Put site-specific property overrides in this file. -->
<configuration>
<!-- 指定HDFS中NameNode的地址 -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://127.0.0.1:9000</value>
</property>
<!-- 指定Hadoop运行时产生文件的存储目录 -->
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop/hadoop-2.7.2/data/tmp</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>*</value>
</property>
</configuration>
Modify the Hadoop HDFS directory configuration
vi /opt/hadoop/hadoop-2.7.2/etc/hadoop/hdfs-site.xml
Modify hdfs-site.xml as follows:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/opt/hadoop/hadoop-2.7.2/hadoopinfra/hdfs/namenode</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/opt/hadoop/hadoop-2.7.2/hadoopinfra/hdfs/datanode</value>
</property>
</configuration>
Modify the Hadoop YARN configuration
vi /opt/hadoop/hadoop-2.7.2/etc/hadoop/yarn-site.xml
Modify yarn-site.xml as follows:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
<description>Whether virtual memory limits will be enforced for containers</description>
</property>
<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>4</value>
<description>Ratio between virtual memory to physical memory when setting memory limits for containers</description>
</property>
</configuration>
Modify the MapReduce configuration
cp /opt/hadoop/hadoop-2.7.2/etc/hadoop/mapred-site.xml.template /opt/hadoop/hadoop-2.7.2/etc/hadoop/mapred-site.xml
vi /opt/hadoop/hadoop-2.7.2/etc/hadoop/mapred-site.xml
Modify mapred-site.xml as follows:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Modify the Hadoop environment file
vi /opt/hadoop/hadoop-2.7.2/etc/hadoop/hadoop-env.sh
Set JAVA_HOME:
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.342.b07-1.el7_9.x86_64/
Initialize Hadoop
hdfs namenode -format
Run as root:
/opt/hadoop/hadoop-2.7.2/sbin/start-dfs.sh
/opt/hadoop/hadoop-2.7.2/sbin/start-yarn.sh
Temporarily stop the firewall
sudo systemctl stop firewalld
Access Hadoop from a browser
The default Hadoop web port is 50070.
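The same can be checked from the shell (assuming the daemons were started on this host):
curl -I http://127.0.0.1:50070
# the NameNode web UI should answer; jps should also list NameNode and DataNode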
hive
wget https://archive.apache.org/dist/hive/hive-2.3.3/apache-hive-2.3.3-bin.tar.gz
tar xvf apache-hive-2.3.3-bin.tar.gz
sudo mkdir -p /opt/hive
sudo mv apache-hive-2.3.3-bin /opt/hive/
Modify the configuration files
cd /opt/hive/apache-hive-2.3.3-bin/conf/
sudo cp hive-env.sh.template hive-env.sh
sudo cp hive-default.xml.template hive-site.xml
sudo cp hive-log4j2.properties.template hive-log4j2.properties
sudo cp hive-exec-log4j2.properties.template hive-exec-log4j2.properties
Create the directories in Hadoop and set their permissions
hadoop fs -mkdir -p /data/hive/warehouse
hadoop fs -mkdir /data/hive/tmp
hadoop fs -mkdir /data/hive/log
hadoop fs -chmod -R 777 /data/hive/warehouse
hadoop fs -chmod -R 777 /data/hive/tmp
hadoop fs -chmod -R 777 /data/hive/log
hadoop fs -mkdir -p /spark-eventlog
Modify the Hive configuration file
sudo cp hive-site.xml hive-site.xml_barkup
sudo vi hive-site.xml
The configuration file is as follows:
sudo tee /opt/hive/apache-hive-2.3.3-bin/conf/hive-site.xml <<-'EOF'
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?><!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
--><configuration>
<property>
<name>hive.exec.scratchdir</name>
<value>hdfs://127.0.0.1:9000/data/hive/tmp</value>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>hdfs://127.0.0.1:9000/data/hive/warehouse</value>
</property>
<property>
<name>hive.querylog.location</name>
<value>hdfs://127.0.0.1:9000/data/hive/log</value>
</property>
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://rm-8vbe87b5295dz08zhxo.mysql.zhangbei.rds.aliyuncs.com:3306/hive?useSSL=false</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>lingcloud</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>Wb19831010!</value>
</property>
<property>
<name>system:java.io.tmpdir</name>
<value>/tmp/hive/java</value>
</property>
<property>
<name>system:user.name</name>
<value>hadoop</value>
</property>
<property>
<name>hive.exec.local.scratchdir</name>
<value>/opt/hive/apache-hive-2.3.3-bin/tmp/${system:user.name}</value>
<description>Local scratch space for Hive jobs</description>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/opt/hive/apache-hive-2.3.3-bin/tmp/${hive.session.id}_resources</value>
<description>Temporary local directory for added resources in the remote file system.</description>
</property>
<property>
<name>hive.server2.logging.operation.log.location</name>
<value>/opt/hive/apache-hive-2.3.3-bin/tmp/root/operation_logs</value>
<description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>
</configuration>
EOF
Configure the MySQL JDBC driver for Hive
cd /opt/hive/apache-hive-2.3.3-bin/lib/
wget https://downloads.mysql.com/archives/get/p/3/file/mysql-connector-java-5.1.49.tar.gz
tar xvf mysql-connector-java-5.1.49.tar.gz
cp mysql-connector-java-5.1.49/mysql-connector-java-5.1.49.jar .
Configure environment variables
sudo vi /opt/hive/apache-hive-2.3.3-bin/conf/hive-env.sh
export HADOOP_HOME=/opt/hadoop/hadoop-2.7.2
export HIVE_CONF_DIR=/opt/hive/apache-hive-2.3.3-bin/conf
export HIVE_AUX_JARS_PATH=/opt/hive/apache-hive-2.3.3-bin/lib
Create the corresponding database (see the example below)
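For example, the metadata database can be created ahead of schematool like this (a sketch; it reuses the RDS instance and the database name hive from the hive-site.xml above):
mysql -h rm-8vbe87b5295dz08zhxo.mysql.zhangbei.rds.aliyuncs.com -P 3306 -ulingcloud -p \
  -e "CREATE DATABASE IF NOT EXISTS hive DEFAULT CHARACTER SET utf8;"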
Initialize the schema
/opt/hive/apache-hive-2.3.3-bin/bin/schematool -dbType mysql -initSchema
After initialization completes, update the MySQL connection info: configure the MySQL IP, port, and the database that holds the metadata.
nohup hive --service metastore >> metastore.log 2>&1 &
nohup hive --service hiveserver2 >> hiveserver2.log 2>&1 &
Verify the installation
hive -e "show databases"
Spark
su hadoop
wget https://archive.apache.org/dist/spark/spark-2.4.3/spark-2.4.3-bin-without-hadoop.tgz
tar xvf spark-2.4.3-bin-without-hadoop.tgz
sudo mkdir -p /opt/spark
sudo mv spark-2.4.3-bin-without-hadoop /opt/spark/
Configure the Spark environment variables and create the config files from the templates
cd /opt/spark/spark-2.4.3-bin-without-hadoop/conf/
cp spark-env.sh.template spark-env.sh
cp spark-defaults.conf.template spark-defaults.conf
cp metrics.properties.template metrics.properties
cp workers.template workers
Configure the program's environment variables
vi spark-env.sh
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.342.b07-1.el7_9.x86_64
export HADOOP_HOME=/opt/hadoop/hadoop-2.7.2
export HADOOP_CONF_DIR=/opt/hadoop/hadoop-2.7.2/etc/hadoop
export SPARK_DIST_CLASSPATH=$(/opt/hadoop/hadoop-2.7.2/bin/hadoop classpath)
export SPARK_MASTER_HOST=127.0.0.1
export SPARK_MASTER_PORT=7077
export SPARK_HISTORY_OPTS="-Dspark.history.ui.port=18080 -Dspark.history.retainedApplications=50 -Dspark.history.fs.logDirectory=hdfs://127.0.0.1:9000/spark-eventlog"
Modify the default configuration file
vi spark-defaults.conf
spark.master spark://127.0.0.1:7077
spark.eventLog.enabled true
spark.eventLog.dir hdfs://127.0.0.1:9000/spark-eventlog
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.driver.memory 3g
spark.eventLog.enabled true
spark.eventLog.dir hdfs://127.0.0.1:9000/spark-eventlog
spark.eventLog.compress true
Configure the worker nodes
vi workers
127.0.0.1
Configure Hive
cp /opt/hive/apache-hive-2.3.3-bin/conf/hive-site.xml /opt/spark/spark-2.4.3-bin-without-hadoop/conf
Verify the application
su root
/opt/spark/spark-2.4.3-bin-without-hadoop/sbin/start-all.sh
The default port for the cluster web UI is 8080.
Verify the installation
spark-sql -e "show databases"
If the following error appears:
Error: Failed to load class org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.
Failed to load main class org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.
You need to build Spark with -Phive and -Phive-thriftserver.
The root cause is that the "without-hadoop" Spark build does not ship the Hive support jars. Suggestions online are either to build Spark yourself with Hive support, or to drop the required jars straight into the jars directory; the first is too much work and the second did not work for me, so I used a third approach: download the same Spark version bundled with Hadoop and overwrite the original jars directory with its jars.
wget https://archive.apache.org/dist/spark/spark-2.4.3/spark-2.4.3-bin-hadoop2.7.tgz
tar xvf spark-2.4.3-bin-hadoop2.7.tgz
cp -rf spark-2.4.3-bin-hadoop2.7/jars/ /opt/spark/spark-2.4.3-bin-without-hadoop/
If it complains about a missing MySQL driver, copy mysql-connector-java-5.1.49/mysql-connector-java-5.1.49.jar into Spark's jars directory.
If the driver is not available locally, run the following:
cd /opt/spark/spark-2.4.3-bin-without-hadoop/jars
wget https://downloads.mysql.com/archives/get/p/3/file/mysql-connector-java-5.1.49.tar.gz
tar xvf mysql-connector-java-5.1.49.tar.gz
cp mysql-connector-java-5.1.49/mysql-connector-java-5.1.49.jar .
DataSphere Studio
sudo rpm -ivh http://repo.mysql.com/yum/mysql-5.5-community/el/6/x86_64/mysql-community-release-el6-5.noarch.rpm
sudo yum install mysql-community-client mysql-community-devel mysql-community-server php-mysql
Prepare the installation package
DataSphereStudio 1.1.0
https://github.com/WeBankFinTech/DataSphereStudio/releases/tag/1.1.0
unzip -d dss dss_linkis_one-click_install_20220704.zip
sudo yum -y install epel-release
sudo yum install -y python-pip
#python -m pip install matplotlib
python -m pip install "matplotlib<3.0"
Modify the configuration
Edit config.sh and db.sh under the xx/dss_linkis/conf directory.
### deploy user
deployUser=hadoop
### Linkis_VERSION
LINKIS_VERSION=1.1.1
### DSS Web
DSS_NGINX_IP=127.0.0.1
DSS_WEB_PORT=8085
### DSS VERSION
DSS_VERSION=1.1.0
############## ############## other default Linkis configuration: start ############## ##############
### Specifies the user workspace, which is used to store the user's script files and log files.
### Generally local directory
##file:// required
WORKSPACE_USER_ROOT_PATH=file:///tmp/linkis/
### User's root hdfs path
##hdfs:// required
HDFS_USER_ROOT_PATH=hdfs:///tmp/linkis
### Path to store job ResultSet:file or hdfs path
##hdfs:// required
RESULT_SET_ROOT_PATH=hdfs:///tmp/linkis
### Path to store started engines and engine logs, must be local
ENGINECONN_ROOT_PATH=/appcom/tmp
#ENTRANCE_CONFIG_LOG_PATH=hdfs:///tmp/linkis/ ##hdfs:// required
###HADOOP CONF DIR #/appcom/config/hadoop-config
HADOOP_CONF_DIR=/opt/hadoop/hadoop-2.7.2/etc/hadoop
###HIVE CONF DIR #/appcom/config/hive-config
HIVE_CONF_DIR=/opt/hive/apache-hive-2.3.3-bin/conf
###SPARK CONF DIR #/appcom/config/spark-config
SPARK_CONF_DIR=/opt/spark/spark-2.4.3-bin-without-hadoop/conf
# for install
LINKIS_PUBLIC_MODULE=lib/linkis-commons/public-module
##YARN REST URL spark engine required
YARN_RESTFUL_URL=http://127.0.0.1:8088
## Engine version conf
#SPARK_VERSION
SPARK_VERSION=2.4.3
##HIVE_VERSION
HIVE_VERSION=2.3.3
PYTHON_VERSION=python2
## LDAP is for enterprise authorization, if you just want to have a try, ignore it.
#LDAP_URL=ldap://localhost:1389/
#LDAP_BASEDN=dc=webank,dc=com
#LDAP_USER_NAME_FORMAT=cn=%s@xxx.com,OU=xxx,DC=xxx,DC=com
################### The install Configuration of all Linkis's Micro-Services #####################
#
# NOTICE:
# 1. If you just wanna try, the following micro-service configuration can be set without any settings.
# These services will be installed by default on this machine.
# 2. In order to get the most complete enterprise-level features, we strongly recommend that you install
# the following microservice parameters
#
### EUREKA install information
### You can access it in your browser at the address below:http://${EUREKA_INSTALL_IP}:${EUREKA_PORT}
### Microservices Service Registration Discovery Center
LINKIS_EUREKA_INSTALL_IP=127.0.0.1
LINKIS_EUREKA_PORT=9600
#LINKIS_EUREKA_PREFER_IP=true
### Gateway install information
#LINKIS_GATEWAY_INSTALL_IP=127.0.0.1
LINKIS_GATEWAY_PORT=9001
### ApplicationManager
#LINKIS_MANAGER_INSTALL_IP=127.0.0.1
LINKIS_MANAGER_PORT=9101
### EngineManager
#LINKIS_ENGINECONNMANAGER_INSTALL_IP=127.0.0.1
LINKIS_ENGINECONNMANAGER_PORT=9102
### EnginePluginServer
#LINKIS_ENGINECONN_PLUGIN_SERVER_INSTALL_IP=127.0.0.1
LINKIS_ENGINECONN_PLUGIN_SERVER_PORT=9103
### LinkisEntrance
#LINKIS_ENTRANCE_INSTALL_IP=127.0.0.1
LINKIS_ENTRANCE_PORT=9104
### publicservice
#LINKIS_PUBLICSERVICE_INSTALL_IP=127.0.0.1
LINKIS_PUBLICSERVICE_PORT=9105
### cs
#LINKIS_CS_INSTALL_IP=127.0.0.1
LINKIS_CS_PORT=9108
########## End of Linkis microservice configuration #####
################### The install Configuration of all DataSphereStudio's Micro-Services #####################
#
# NOTICE:
# 1. If you just wanna try, the following micro-service configuration can be set without any settings.
# These services will be installed by default on this machine.
# 2. In order to get the most complete enterprise-level features, we strongly recommend that you install
# the following microservice parameters
#
### DSS_SERVER
### This service is used to provide dss-server capability.
### project-server
#DSS_FRAMEWORK_PROJECT_SERVER_INSTALL_IP=127.0.0.1
#DSS_FRAMEWORK_PROJECT_SERVER_PORT=9002
### orchestrator-server
#DSS_FRAMEWORK_ORCHESTRATOR_SERVER_INSTALL_IP=127.0.0.1
#DSS_FRAMEWORK_ORCHESTRATOR_SERVER_PORT=9003
### apiservice-server
#DSS_APISERVICE_SERVER_INSTALL_IP=127.0.0.1
#DSS_APISERVICE_SERVER_PORT=9004
### dss-workflow-server
#DSS_WORKFLOW_SERVER_INSTALL_IP=127.0.0.1
#DSS_WORKFLOW_SERVER_PORT=9005
### dss-flow-execution-server
#DSS_FLOW_EXECUTION_SERVER_INSTALL_IP=127.0.0.1
#DSS_FLOW_EXECUTION_SERVER_PORT=9006
###dss-scriptis-server
#DSS_SCRIPTIS_SERVER_INSTALL_IP=127.0.0.1
#DSS_SCRIPTIS_SERVER_PORT=9008
###dss-data-api-server
#DSS_DATA_API_SERVER_INSTALL_IP=127.0.0.1
#DSS_DATA_API_SERVER_PORT=9208
###dss-data-governance-server
#DSS_DATA_GOVERNANCE_SERVER_INSTALL_IP=127.0.0.1
#DSS_DATA_GOVERNANCE_SERVER_PORT=9209
###dss-guide-server
#DSS_GUIDE_SERVER_INSTALL_IP=127.0.0.1
#DSS_GUIDE_SERVER_PORT=9210
########## End of DSS microservice configuration #####
############## ############## other default configuration ############## ##############
## java application default jvm memory
export SERVER_HEAP_SIZE="512M"
## send-email settings; only affects the send-mail feature in DSS workflows
EMAIL_HOST=smtp.163.com
EMAIL_PORT=25
EMAIL_USERNAME=xxx@163.com
EMAIL_PASSWORD=xxxxx
EMAIL_PROTOCOL=smtp
### Save the file path exported by the orchestrator service
ORCHESTRATOR_FILE_PATH=/appcom/tmp/dss
### Save DSS flow execution service log path
EXECUTION_LOG_PATH=/appcom/tmp/dss
Install with the script
tee /home/hadoop/dss/conf/db.sh <<-'EOF'
### for DSS-Server and Eventchecker APPCONN
MYSQL_HOST=rm-8vbe87b5295dz08zhxo.mysql.zhangbei.rds.aliyuncs.com
MYSQL_PORT=3306
MYSQL_DB=dss
MYSQL_USER=lingcloud
MYSQL_PASSWORD=Wb19831010!
# mainly used together with Scriptis; by default it points at the settings in the config files under $HIVE_CONF_DIR
HIVE_META_URL=jdbc:mysql://rm-8vbe87b5295dz08zhxo.mysql.zhangbei.rds.aliyuncs.com:3306/hive?useSSL=false # Hive metastore DB URL
HIVE_META_USER=lingcloud # Hive metastore DB user
HIVE_META_PASSWORD=Wb19831010! # Hive metastore DB password
EOF
cd xx/dss_linkis/bin
sh install.sh
Wait for the install script to finish, then go into the linkis directory and adjust the corresponding configuration files.
Modify linkis-ps-publicservice.properties, otherwise the Hive tables will not show up when the database list is refreshed:
linkis.metadata.hive.permission.with-login-user-enabled=false
Copy the missing jars
cp /opt/hive/apache-hive-2.3.3-bin/lib/datanucleus-* ~/dss/linkis/lib/linkis-engineconn-plugins/hive/dist/v2.3.3/lib
cp /opt/hive/apache-hive-2.3.3-bin/lib/*jdo* ~/dss/linkis/lib/linkis-engineconn-plugins/hive/dist/v2.3.3/lib
Start after the installation completes
sh start-all.sh
Once startup completes, check the Eureka registration page.
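Besides opening the page in a browser, registration can be checked from the shell (a sketch; assumes LINKIS_EUREKA_PORT=9600 from config.sh and the standard Eureka REST endpoint):
curl -s http://127.0.0.1:9600/eureka/apps | grep -o "<application>" | wc -l
# a non-zero count means services have registered with Eureka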
Useful links
Linkis 1.0.2 installation and usage guide: https://www.jianshu.com/p/d0e8b605c4ce
WeDataSphere FAQ (covering DSS, Linkis, etc.): https://docs.qq.com/doc/DSGZhdnpMV3lTUUxq
systemctl stop firewalld
systemctl stop firewalld.service # stop the firewall
systemctl disable firewalld.service # disable the firewall at boot
Installation preparation
yum -y install yum-utils
yum-config-manager --disable mysql80-community
yum-config-manager --enable mysql57-community
yum repolist enabled | grep mysql
yum install -y mysql-community-server
yum install -y telnet tar sed dos2unix unzip expect
http://nginx.org/en/linux_packages.html#RHEL-CentOS
touch /etc/yum.repos.d/nginx.repo
vi /etc/yum.repos.d/nginx.repo
[nginx-stable]
name=nginx stable repo
baseurl=http://nginx.org/packages/centos/$releasever/$basearch/
gpgcheck=1
enabled=1
gpgkey=https://nginx.org/keys/nginx_signing.key
module_hotfixes=true
[nginx-mainline]
name=nginx mainline repo
baseurl=http://nginx.org/packages/mainline/centos/$releasever/$basearch/
gpgcheck=1
enabled=0
gpgkey=https://nginx.org/keys/nginx_signing.key
module_hotfixes=true
yum install yum-utils
yum install -y nginx
whereis nginx
perl
https://www.perl.org/get.html (Perl is an nginx dependency for the project deployment; check whether it is already installed)
wget https://www.cpan.org/src/5.0/perl-5.34.1.tar.gz
tar -xzf perl-5.34.1.tar.gz
cd perl-5.34.1
mv /usr/bin/perl /usr/bin/perl.bak
./Configure -des -Dprefix=/usr/local/perl
make && make install
perl -v
ln -s /usr/local/perl/bin/perl /usr/bin/perl
mysql
https://www.cnblogs.com/milton/p/15418572.html
wget -i -c http://dev.mysql.com/get/mysql57-community-release-el7-10.noarch.rpm
yum -y install mysql57-community-release-el7-10.noarch.rpm
yum -y install mysql-community-server
b. Then download it manually
wget http://dev.mysql.com/get/mysql-community-release-el7-5.noarch.rpm
c. Install the database's rpm package
rpm -ivh mysql-community-release-el7-5.noarch.rpm
d. Install mysql-server
yum install mysql-server
(2) Uninstall mysql-community-release-el7-5.noarch
rpm -e --nodeps mysql-community-release-el7-5.noarch
The CentOS 8 repo images are no longer maintained; after reinstalling CentOS everything worked.
wget https://cdn.mysql.com//Downloads/MySQL-8.0/mysql-8.0.28-1.el8.x86_64.rpm-bundle.tar
tar -xvf mysql-8.0.28-1.el8.x86_64.rpm-bundle.tar
rpm -ivh mysql-community-common-8.0.28-1.el8.x86_64.rpm
rpm -ivh mysql-community-client-plugins-8.0.28-1.el8.x86_64.rpm
rpm -ivh mysql-community-libs-8.0.28-1.el8.x86_64.rpm
rpm -ivh mysql-community-client-8.0.28-1.el8.x86_64.rpm
rpm -ivh mysql-community-icu-data-files-8.0.28-1.el8.x86_64.rpm
rpm -ivh mysql-community-server-8.0.28-1.el8.x86_64.rpm
Checking the log with mysqld --console showed that the problem was the data directory. After deleting the data directory manually, run mysqld --initialize-insecure so that the data directory and its files are regenerated, then use mysqld -install to reinstall the service.
rm -rf /var/lib/mysql
mysqld --initialize-insecure
mysqld -install
tail -f /var/log/mysqld.log
Install Hadoop 2.7.2
https://blog.csdn.net/qq_44665283/article/details/121329554
mkdir /datasphere
cd /datasphere
wget https://archive.apache.org/dist/hadoop/common/hadoop-2.7.2/hadoop-2.7.2.tar.gz
tar -zxvf hadoop-2.7.2.tar.gz -C /datasphere
vi /datasphere/hadoop-2.7.2/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/datasphere/jdk1.8.0_91
vi /datasphere/hadoop-2.7.2/etc/hadoop/core-site.xml
<configuration>
<!-- 指定HDFS老大(namenode)的通信地址 -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://192.168.74.135:9000</value>
</property>
<!-- 指定hadoop运行时产生文件的存储路径 -->
<property>
<name>hadoop.tmp.dir</name>
<value>/datasphere/hadoop-2.7.2/tmp</value>
</property>
</configuration>
vi /datasphere/hadoop-2.7.2/etc/hadoop/hdfs-site.xml
<configuration>
<!-- 设置hdfs副本数量 -->
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
Passwordless SSH login
- Go to the root home directory:
cd /root
- Generate a key pair:
ssh-keygen -t rsa
- Press Enter three times
- Append the public key to the first node's authorized keys:
ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.74.135
- Answer yes
- Enter the first node's login password (this copies this node's public key to the first node)
Configure environment variables
vim /etc/profile
export HADOOP_HOME=/datasphere/hadoop-2.7.2/
PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin:$MAVEN_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
source /etc/profile
Start and stop HDFS
Format the NameNode before the very first start (run this only once):
hdfs namenode -format
Start HDFS:
start-dfs.sh
(9) Open port 50070
Add the port permanently:
firewall-cmd --add-port=50070/tcp --permanent
firewall-cmd --reload
(10) Configure and start YARN
1. Configure mapred-site.xml
cd /datasphere/hadoop-2.7.2/etc/hadoop/
mv mapred-site.xml.template mapred-site.xml
vim mapred-site.xml
<configuration>
<!-- 通知框架MR使用YARN -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
2. Configure yarn-site.xml
<configuration>
<!-- reducer取数据的方式是mapreduce_shuffle -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
3. Start YARN
start-yarn.sh
Open it in the browser (open port 8088 in the firewall):
firewall-cmd --add-port=8088/tcp --permanent
firewall-cmd --reload
At this point the single-node Hadoop setup is complete.
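A simple sanity check (run as the user that started the daemons):
jps
# should list NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager
hdfs dfs -ls /
# confirms HDFS is answering requests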
Install Hive 2.3.3
https://blog.csdn.net/qq_44665283/article/details/121147347
Download from:
http://archive.apache.org/dist/hive/hive-2.3.3/
wget http://archive.apache.org/dist/hive/hive-2.3.3/apache-hive-2.3.3-bin.tar.gz
tar -zxvf apache-hive-2.3.3-bin.tar.gz -C /datasphere
mv /datasphere/apache-hive-2.3.3-bin /datasphere/hive-2.3.3
1. Unpack and configure environment variables
- Configure the environment variables:
sudo vi /etc/profile
Append at the end:
export HIVE_HOME=/datasphere/hive-2.3.3
export PATH=$PATH:$HIVE_HOME/bin
Reload so the environment variables take effect:
source /etc/profile
2. Configure the Hive files
2.1 Modify hive-env.sh
cp hive-env.sh.template hive-env.sh
Uncomment # HADOOP_HOME=${bin}/../../hadoop and change it to HADOOP_HOME=/datasphere/hadoop-2.7.2
Uncomment # export HIVE_CONF_DIR= and change it to HIVE_CONF_DIR=/datasphere/hive-2.3.3/conf
2.2 Modify hive-log4j2.properties
Change the Hive log directory to /datasphere/hive-2.3.3/logs
cp hive-log4j2.properties.template hive-log4j2.properties
vi hive-log4j2.properties
Find property.hive.log.dir = ${sys:java.io.tmpdir}/${sys:user.name}
Change it to property.hive.log.dir = /datasphere/hive-2.3.3/logs
3. Configure MySQL as the metastore
By default Hive keeps its metadata in the embedded Derby database, but in production MySQL is normally used to store the Hive metadata.
Install MySQL, then copy mysql-connector-java-5.1.47.jar into $HIVE_HOME/lib (see the example below).
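For example, the driver jar can be fetched the same way as earlier in these notes (a sketch; the archive URL pattern is assumed to hold for 5.1.47 as well):
cd /tmp
wget https://downloads.mysql.com/archives/get/p/3/file/mysql-connector-java-5.1.47.tar.gz
tar xvf mysql-connector-java-5.1.47.tar.gz
cp mysql-connector-java-5.1.47/mysql-connector-java-5.1.47.jar /datasphere/hive-2.3.3/lib/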
3.2 Modify the configuration file
Parameter reference: https://cwiki.apache.org/confluence/display/Hive/AdminManual+MetastoreAdmin
Copy hive-default.xml.template to hive-site.xml, remove the entries inside configuration, and add the MySQL connection settings instead.
cp hive-default.xml.template hive-site.xml
touch hive-site.xml
vi hive-site.xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<!--Hive作业的HDFS根目录位置 -->
<property>
<name>hive.exec.scratchdir</name>
<value>/user/hive/tmp</value>
</property>
<!--Hive作业的HDFS根目录创建写权限 -->
<property>
<name>hive.scratch.dir.permission</name>
<value>733</value>
</property>
<!--hdfs上hive元数据存放位置 -->
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
</property>
<!--连接数据库地址,名称 -->
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://rm-8vbe87b5295dz08zhxo.mysql.zhangbei.rds.aliyuncs.com:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<!--连接数据库驱动 -->
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.cj.jdbc.Driver</value>
</property>
<!--连接数据库用户名称 -->
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>lingcloud</value>
</property>
<!--连接数据库用户密码 -->
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>Wb19831010!</value>
</property>
<!--客户端显示当前查询表的头信息 -->
<property>
<name>hive.cli.print.header</name>
<value>true</value>
</property>
<!--客户端显示当前数据库名称信息 -->
<property>
<name>hive.cli.print.current.db</name>
<value>true</value>
</property>
</configuration>
3.3 Create the hive user and password in MySQL
CREATE DATABASE hive;
USE hive;
CREATE USER 'hive'@'localhost' IDENTIFIED BY 'hive';
GRANT ALL ON hive.* TO 'hive'@'localhost' IDENTIFIED BY 'hive';
GRANT ALL ON hive.* TO 'hive'@'%' IDENTIFIED BY 'hive';
FLUSH PRIVILEGES;
4.1 Initialize the database. Starting with Hive 2.1, the schematool command below must be run as an initialization step; here "mysql" is used as the db type.
schematool -dbType mysql -initSchema
After it succeeds, you can use Navicat Premium to check whether the hive metadata database was created successfully.
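Without Navicat, the same check can be done from the mysql client (a sketch; connection details as in hive-site.xml above):
mysql -h rm-8vbe87b5295dz08zhxo.mysql.zhangbei.rds.aliyuncs.com -P 3306 -ulingcloud -p \
  -e "USE hive; SHOW TABLES;"
# metastore tables such as DBS and TBLS should be listed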
4.2 Start the Hive client
Start the Hadoop services and use the Hive CLI (hive --service cli has the same effect as hive); type the following in a terminal:
hive
Install Spark
https://spark.apache.org/downloads.html
a. Download
wget https://dlcdn.apache.org/spark/spark-3.0.3/spark-3.0.3-bin-hadoop2.7.tgz
b. Unpack the archive
tar -zxvf spark-3.0.3-bin-hadoop2.7.tgz
c. Modify spark-env.sh
cp spark-env.sh.template spark-env.sh
Append the following at the end:
export JAVA_HOME=/datasphere/jdk1.8.0_91
export SPARK_MASTER_IP=192.168.74.135
export SPARK_WORKER_MEMORY=2g
export SPARK_WORKER_CORES=2
export SPARK_WORKER_INSTANCES=1
d. Configure environment variables
vim /etc/profile
export SPARK_HOME=/datasphere/spark-3.0.3-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin
source /etc/profile
e. Start
./sbin/start-master.sh
Install DSS
As the hadoop user:
vim /etc/profile
export JAVA_HOME=/datasphere/jdk1.8.0_91
export JRE_HOME=$JAVA_HOME/jre
export JAVA_BIN=$JAVA_HOME/bin
export JAVA_LIB=$JAVA_HOME/lib
export CLASSPATH=.$CLASSPATH:$JAVA_LIB/tools.jar:$JAVA_LIB/dt.jar
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin:
export HADOOP_HOME=/datasphere/hadoop-2.7.2/
export PATH=$PATH:$MAVEN_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HIVE_HOME=/datasphere/hive-2.3.3
export PATH=$PATH:$HIVE_HOME/bin
export SPARK_HOME=/datasphere/spark-3.0.3-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HIVE_CONF_DIR=$HIVE_HOME/conf
export FLINK_CONF_DIR=$FLINK_HOME/conf
export FLINK_LIB_DIR=$FLINK_HOME/lib
export SPARK_CONF_DIR=$SPARK_HOME/conf
source /etc/profile
unzip -o DSS-Linkis全家桶20220223.zip -d dss
If there are problems, remove the user first: userdel -r hadoop
adduser hadoop
passwd hadoop
usermod -a -G hadoop hadoop
cat /etc/passwd | grep hadoop
useradd hadoop -g hadoop
vi /etc/sudoers
hadoop ALL=(ALL) NOPASSWD: ALL
vim /home/hadoop/.bashrc
Same contents as /etc/profile above.
Check the environment
ENGINECONN_ROOT_PATH is a local directory that must be created and authorized in advance (chmod -R 777 <directory>); for Linkis 1.0.2 this is not necessary, since the scripts create and authorize it automatically.
HDFS_USER_ROOT_PATH is a path on HDFS that must be created and authorized in advance (hadoop fs -chmod -R 777 <directory>). See the example below.
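For example, with the values used in config.sh earlier (ENGINECONN_ROOT_PATH=/appcom/tmp, HDFS_USER_ROOT_PATH=hdfs:///tmp/linkis; adjust to your own settings):
sudo mkdir -p /appcom/tmp && sudo chmod -R 777 /appcom/tmp
hadoop fs -mkdir -p /tmp/linkis && hadoop fs -chmod -R 777 /tmp/linkis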
yum install -y gcc zlib
sh bin/checkEnv.sh
dnf install python3
alternatives --set python /usr/bin/python3
dnf install python2
alternatives --set python /usr/bin/python2
pip install --upgrade pip
python -m pip install matplotlib
To reset the default python selection, run: alternatives --auto python
Configuration
vi conf/db.sh
MYSQL_HOST=rm-8vbe87b5295dz08zhxo.mysql.zhangbei.rds.aliyuncs.com
MYSQL_PORT=3306
MYSQL_DB=dss
MYSQL_USER=lingcloud
MYSQL_PASSWORD=Wb19831010!
## Hive configuration
HIVE_HOST=rm-8vbe87b5295dz08zhxo.mysql.zhangbei.rds.aliyuncs.com
HIVE_PORT=3306
HIVE_DB=hive
HIVE_USER=lingcloud
HIVE_PASSWORD=Wb19831010!
vi conf/config.sh
###HADOOP CONF DIR #/appcom/config/hadoop-config
HADOOP_CONF_DIR=/datasphere/hadoop-2.7.2/etc/hadoop
###HIVE CONF DIR #/appcom/config/hive-config
HIVE_CONF_DIR=/datasphere/hive-2.3.3/conf
###SPARK CONF DIR #/appcom/config/spark-config
SPARK_CONF_DIR=/datasphere/spark-3.0.3-bin-hadoop2.7/conf
Startup
Start Hadoop:
start-dfs.sh
Run the install script only on the first run, never afterwards:
/datasphere/dss/bin/install.sh
Start:
/datasphere/dss/bin/start-all.sh
Stop:
/datasphere/dss/bin/stop-all.sh
Start individual services
cd /datasphere/dss/dss/sbin
sh dss-daemon.sh start dss-framework-project-server
sh dss-daemon.sh start dss-framework-orchestrator-server
cd /datasphere/dss/linkis/logs
cd /datasphere/dss/dss/logs
tail -f /datasphere/dss/dss/logs/dss-framework-project-server.out
tail -f /datasphere/dss/linkis/logs/linkis-ps-publicservice.log
tail -f /datasphere/dss/dss/logs/dss-framework-orchestrator-server.out
Troubleshooting
javafx.util.Pair cannot be found (the class is not shipped with OpenJDK)
rpm -qa | grep java
rpm -e --nodeps java
java -version
Install the Oracle JDK (see the sketch below)
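A sketch of the remaining steps, assuming the Oracle JDK 8 rpm has already been downloaded manually from Oracle (the filename below is only illustrative):
rpm -ivh jdk-8u321-linux-x64.rpm
# illustrative filename; use the rpm you actually downloaded
java -version
# should now report the Oracle runtime, e.g. "Java(TM) SE Runtime Environment"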