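Official documentation: https://superset.apache.org/docs/installation/installing-superset-from-scratch
Installing on CentOS ran into various problems that I could not resolve, so I installed on Ubuntu Server instead, which went much more smoothly.
The steps below mostly follow the official instructions. The one extra point is that pip must be upgraded to the latest version.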
In Ubuntu 20.04 the following command will ensure that the required dependencies are installed:
sudo apt-get install build-essential libssl-dev libffi-dev python3-dev python3-pip libsasl2-dev libldap2-dev
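# Important: be sure to upgrade pip, otherwise later steps fail with errors.
# The venv created below carries its own copy of pip, so repeat this after activating it if that copy is old.
pip install --upgrade pip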
pip install virtualenv
# virtualenv is shipped in Python 3.6+ as venv instead of pyvenv.
# See https://docs.python.org/3.6/library/venv.html
python3 -m venv venv
. venv/bin/activate
You can exit the environment by running deactivate on the command line
Installing and Initializing Superset
First, start by installing apache-superset:
pip install apache-superset
Then, you need to initialize the database:
superset db upgrade
Finish installing by running through the following commands:
# Create an admin user in your metadata database (use `admin` as username to be able to load the examples)
export FLASK_APP=superset
superset fab create-admin
# Load some data to play with
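superset load-examples
# If loading fails, see the load-examples troubleshooting notes further down: the data has to be downloaded and served from your own HTTP server.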
# Create default roles and permissions
superset init
# To start a development web server on port 8088, use -p to bind to another port
superset run -p 8088 --with-threads --reload --debugger
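# Or bind to a specific address with -h so that other machines can reach it (replace with your server's IP)
superset run -p 8088 -h 192.168.1.149 --with-threads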
Connecting to databases:
MySQL: pip install mysqlclient fails with an error; use pip install mysql-connector-python instead.
MSSQL: pip install pymssql
SQLAlchemy URI formats:
mssql+pymssql://<Username>:<Password>@<Host>:<Port-default:1433>/<Database Name>/?Encrypt=yes
mysql+mysqlconnector://{username}:{password}@{host}/{database}
Example:
mssql+pymssql://sa:123123123@192.168.1.254/Test
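Passwords that contain characters such as @ or / break these URIs unless they are percent-encoded. The sketch below assembles a URI in the formats listed above using only the standard library. The helper name build_uri and the sample credentials are made up for illustration and are not part of Superset.

```python
from urllib.parse import quote_plus


def build_uri(dialect_driver, username, password, host, database, port=None):
    """Assemble a SQLAlchemy-style URI, percent-encoding the credentials.

    build_uri is a hypothetical helper for these notes, not a Superset API.
    """
    netloc = f"{quote_plus(username)}:{quote_plus(password)}@{host}"
    if port is not None:
        netloc += f":{port}"
    return f"{dialect_driver}://{netloc}/{database}"


if __name__ == "__main__":
    # Sample values only; substitute your own host, user and password.
    print(build_uri("mssql+pymssql", "sa", "p@ss/word", "192.168.1.254", "Test", port=1433))
    print(build_uri("mysql+mysqlconnector", "superset", "secret", "192.168.1.254", "Test"))
```

Paste the printed string into the SQLAlchemy URI field when adding the database in Superset.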
Other notes:
UserWarning: Could not import the lzma module. Your installed Python is incomplete
On CentOS, install the missing headers: sudo yum install -y xz-devel (the Ubuntu equivalent is sudo apt-get install liblzma-dev)
Then recompile Python from source:
cd Python-3.8*/
./configure --enable-optimizations
sudo make altinstall
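After rebuilding, it may be worth confirming that the new interpreter really has lzma support before reinstalling Superset. A minimal check, run with the rebuilt Python (the function name lzma_available is just for this note):

```python
def lzma_available():
    """Return True if this interpreter can import and use the lzma module."""
    try:
        import lzma
    except ImportError:
        # Raised when Python was compiled without the xz/liblzma headers.
        return False
    # Round-trip a few bytes to confirm the compiled extension works.
    data = b"superset"
    return lzma.decompress(lzma.compress(data)) == data


if __name__ == "__main__":
    print("lzma available:", lzma_available())
```

If this prints False, the development headers were probably not installed before running ./configure.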
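Fixing a failed superset load-examples
The example data is hosted on GitHub, and the download may be blocked or time out. Download it in advance from: https://github.com/apache-superset/examples-data/archive/refs/heads/master.zip
Next, start your own HTTP server. Python's built-in python -m http.server [port] sets one up quickly. Pick any free port, usually a four-digit number such as 8000; avoid 8088 if Superset runs on the same machine, since the steps above use it. The steps are: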
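Step 1: In a terminal, change to the directory you want to serve over HTTP, for example the folder where the downloaded examples archive was extracted.
Step 2: Run python -m http.server [port]. Once it prints a "Serving HTTP on ..." message, the server is running.
Step 3: Open http://(your machine's IP):[port] in a browser.
You should see a listing of that directory (python -m http.server serves directory listings by default when the directory has no index.html).
The HTTP server is now set up.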
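The same server can also be started from a script, which avoids having to cd into the data directory first. This is a rough equivalent of python -m http.server built on the standard library (Python 3.7+). The names make_server and serve are chosen for this note only.

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_server(directory, port, host="0.0.0.0"):
    """Create (but do not start) an HTTP server that serves files from `directory`."""
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer((host, port), handler)


def serve(directory, port):
    """Serve `directory` until interrupted with Ctrl+C."""
    server = make_server(directory, port)
    print(f"Serving {directory} on port {port}")
    try:
        server.serve_forever()
    except KeyboardInterrupt:
        pass
    finally:
        server.server_close()


# Example call (the path is a placeholder for wherever the archive was extracted):
# serve("/path/to/extracted/examples", 8000)
```

Binding to 0.0.0.0 makes the files reachable from other machines on the network, so stop the server once the examples are loaded.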
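Next, edit the BASE_URL setting in examples/helpers.py.
Change the original BASE_URL = "https://github.com/apache-superset/examples-data/blob/master/"
to BASE_URL = "http://(your machine's IP):[port]/examples-data-master/"
[Note: the .gz data files must sit directly under BASE_URL, with no intermediate sub-directory.]
Save the file.
Then retry the original command: superset load-examples [some database errors may still appear; search online for those.]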
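Before rerunning the loader, it can save time to confirm that a data file is reachable directly under the new BASE_URL. The sketch below assumes that each download URL is simply BASE_URL followed by the file name, as the note above implies. The helper names and the file name in the example call are illustrative, so substitute a .gz file that actually exists in your extracted folder.

```python
import urllib.request


def data_url(base_url, filename):
    """Build the URL of one data file, assuming plain BASE_URL + file name."""
    if not base_url.endswith("/"):
        raise ValueError("BASE_URL must end with '/'")
    return base_url + filename


def is_reachable(url, timeout=5):
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection errors, timeouts and HTTP errors such as 404.
        return False


# Example call (host, port and file name are placeholders):
# print(is_reachable(data_url("http://192.168.1.149:8000/examples-data-master/", "some_file.json.gz")))
```

A False result usually means either that the server is not running or that BASE_URL points one directory level too high or too low.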
Installing as a service (not verified)
The built-in web server (superset run, called superset runserver in older versions) is meant for development, and a dedicated server such as gunicorn is strongly recommended for production. The main catch is that inside a virtualenv the superset executable lives at $VENV_PATH/bin/superset. In general, applications that install a command-line entry point (superset, airflow, and so on) place it in $VENV_PATH/bin, and the easiest way to find the path of any command on Linux is which, for example which superset.
This is the systemd service file that I use in production; I hope it is useful:
[Unit]
Description = Apache Superset Webserver Daemon
After = network.target
[Service]
PIDFile = /home/superset/superset-webserver.PIDFile
User = superset
Group = superset
Environment=SUPERSET_HOME=/home/superset
Environment=PYTHONPATH=/home/superset
WorkingDirectory = /home/superset
ExecStart = /home/superset/venv/bin/python3.7 /home/superset/venv/bin/gunicorn --workers 8 --worker-class gevent --bind 0.0.0.0:8888 --pid /home/superset/superset-webserver.PIDFile superset:app
ExecStop = /bin/kill -s TERM $MAINPID
[Install]
WantedBy=multi-user.target