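LightGBM (Light Gradient Boosting Machine) is a fast, distributed, high-performance gradient boosting framework (GBDT, GBRT, GBM, or MART) based on decision tree algorithms. It can be used for ranking, classification, and many other machine learning tasks.
Project repository: https://github.com/Microsoft/LightGBM
Open source | LightGBM: 1,000+ GitHub stars within three days, outperforming existing boosting tools.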
LightGBM
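LightGBM is a gradient boosting framework that uses tree-based learning algorithms. It is designed to be distributed and efficient, with the following advantages:
Faster training
Lower memory usage
Better accuracy
Support for parallel learning
Ability to handle large-scale data
Environment
Operating system: macOS Sierra 10.12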
iMac (21.5-inch, Late 2015)
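Build process
LightGBM depends on OpenMP for compilation, which Apple's Clang does not support, so use GCC / G++ instead. The official installation guide gives the following commands: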
    brew install cmake
    brew install gcc --without-multilib
    git clone --recursive https://github.com/Microsoft/LightGBM ; cd LightGBM
    mkdir build ; cd build
    cmake -DCMAKE_CXX_COMPILER=g++-6 -DCMAKE_C_COMPILER=gcc-6 ..
    make -j

The output I got from running each command follows.

1. Install cmake with brew. On CentOS use yum install cmake; on Ubuntu use apt-get install cmake.
    $ brew install cmake
    Updating Homebrew...
    ==> Downloading https://homebrew.bintray.com/bottles/cmake-3.7.2.sierra.bottle.tar.gz
    ######################################################################## 100.0%
    ==> Pouring cmake-3.7.2.sierra.bottle.tar.gz
    ==> Caveats
    Emacs Lisp files have been installed to:
      /usr/local/share/emacs/site-lisp/cmake
    ==> Summary
    /usr/local/Cellar/cmake/3.7.2: 2,143 files, 29MB
    avenMac:conf aven$ cmake
    Usage

      cmake [options] <path-to-source>
      cmake [options] <path-to-existing-build>

    Specify a source directory to (re-)generate a build system for it in the
    current working directory.  Specify an existing build directory to
    re-generate its build system.

    Run 'cmake --help' for more information.
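The OpenMP requirement noted under the build process can be checked before installing anything else. The following is a minimal sketch, assuming a POSIX-like shell with mktemp available; supports_openmp is a hypothetical helper name introduced here for illustration and is not part of LightGBM, Homebrew, or CMake. It tries to compile a tiny OpenMP program with the given compiler and reports whether the -fopenmp flag was accepted.

```shell
#!/usr/bin/env bash
# supports_openmp (hypothetical helper): returns 0 if the given compiler
# can build a small program with -fopenmp, non-zero otherwise
# (including when the compiler is not installed).
supports_openmp() {
    local cc="$1"
    local tmpdir status
    tmpdir="$(mktemp -d)" || return 2
    cat > "$tmpdir/probe.c" <<'EOF'
#include <omp.h>
int main(void) { return omp_get_max_threads() > 0 ? 0 : 1; }
EOF
    "$cc" -fopenmp "$tmpdir/probe.c" -o "$tmpdir/probe" >/dev/null 2>&1
    status=$?
    rm -rf "$tmpdir"
    return "$status"
}

# cc is the system default compiler; gcc-6 is the Homebrew GCC used in this post.
for cc in cc gcc-6; do
    if supports_openmp "$cc"; then
        echo "$cc: -fopenmp accepted"
    else
        echo "$cc: -fopenmp rejected or compiler not found"
    fi
done
```

On the machine described in this post, the expectation is that the default cc (Apple's Clang) fails the probe and gcc-6 passes once it has been installed, but the result depends on the toolchain actually present.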
2. Install gcc. On CentOS use yum install gcc; on Ubuntu use apt-get install gcc. You may also need to install g++.
    $ brew install gcc --without-multilib
    Updating Homebrew...
    ==> Auto-updated Homebrew!
    Updated 1 tap (homebrew/core).
    ==> Updated Formulae
    aws-sdk-cpp docker go pari qjson tarsnap-gui
    awscli dspdfviewer godep pdf2htmlex qscintilla2 tomcat@7
    certbot flow knot-resolver percona-server qtads transcrypt
    chapel gammaray kobalt pngquant quazip urh
    cockatrice gearboy liblastfm pushpin qwt yarn
    codequery gearsystem logtalk qbs qwtpolar zurl
    color-code git-cola mono qca qxmpp
    dbus gitlab-ci-multi-runner nano qcachegrind rabbitmq
    djview4 gnupg paket qjackctl riemann
    ==> Renamed Formulae
    pyqt5 -> pyqt qt5 -> qt
    ==> Installing dependencies for gcc: gmp, mpfr, libmpc, isl
    ==> Installing gcc dependency: gmp
    ==> Downloading https://homebrew.bintray.com/bottles/gmp-6.1.2.sierra.bottle.1.tar.gz
    ######################################################################## 100.0%
    ==> Pouring gmp-6.1.2.sierra.bottle.1.tar.gz
    /usr/local/Cellar/gmp/6.1.2: 18 files, 3.1MB
    ==> Installing gcc dependency: mpfr
    ==> Downloading https://homebrew.bintray.com/bottles/mpfr-3.1.5.sierra.bottle.tar.gz
    ######################################################################## 100.0%
    ==> Pouring mpfr-3.1.5.sierra.bottle.tar.gz
    /usr/local/Cellar/mpfr/3.1.5: 25 files, 3.6MB
    ==> Installing gcc dependency: libmpc
    ==> Downloading https://homebrew.bintray.com/bottles/libmpc-1.0.3.sierra.bottle.tar.gz
    ######################################################################## 100.0%
    ==> Pouring libmpc-1.0.3.sierra.bottle.tar.gz
    /usr/local/Cellar/libmpc/1.0.3: 11 files, 345.6KB
    ==> Installing gcc dependency: isl
    ==> Downloading https://homebrew.bintray.com/bottles/isl-0.18.sierra.bottle.tar.gz
    ######################################################################## 100.0%
    ==> Pouring isl-0.18.sierra.bottle.tar.gz
    /usr/local/Cellar/isl/0.18: 80 files, 3.8MB
    ==> Installing gcc --without-multilib
    ==> Using the sandbox
    ==> Downloading https://ftpmirror.gnu.org/gcc/gcc-6.3.0/gcc-6.3.0.tar.bz2
    ==> Downloading from http://mirrors.ustc.edu.cn/gnu/gcc/gcc-6.3.0/gcc-6.3.0.tar.bz2
    ######################################################################## 100.0%
    ==> Downloading https://raw.githubusercontent.com/Homebrew/formula-patches/e9e0ee09389a54cc4c8fe1c24ebca3cd765ed0ba/gcc/6.1.0-jit.patch
    ######################################################################## 100.0%
    ==> Patching
    ==> Applying 6.1.0-jit.patch
    patching file gcc/jit/Make-lang.in
    ==> ../configure --build=x86_64-apple-darwin16.4.0 --prefix=/usr/local/Cellar/gcc/6.3.0_1 --libdir=/usr/local/Cellar/gcc/6.3.0_1/lib/gcc/6 --enable-la
    ==> make bootstrap
    ==> make install
    /usr/local/Cellar/gcc/6.3.0_1: 1,358 files, 237.8MB, built in 63 minutes 14 seconds

3. Clone the project source
    $ git clone --recursive https://github.com/Microsoft/LightGBM ; cd LightGBM
    Cloning into 'LightGBM'...
    remote: Counting objects: 5315, done.
    remote: Compressing objects: 100% (31/31), done.
    remote: Total 5315 (delta 5), reused 2 (delta 2), pack-reused 5282
    Receiving objects: 100% (5315/5315), 4.06 MiB | 21.00 KiB/s, done.
    Resolving deltas: 100% (3644/3644), done.
    Submodule 'include/boost/compute' (https://github.com/boostorg/compute) registered for path 'compute'
    Cloning into '/Users/aven/software/LightGBM/compute'...
    Submodule path 'compute': checked out '1380a04582080bbe2364352b336270bc4bfa3025'
    avenMac:LightGBM aven$ pwd
    /Users/aven/software/LightGBM

4. Create the build directory
    $ mkdir build ; cd build

5. Run cmake
    $ cmake -DCMAKE_CXX_COMPILER=g++-6 -DCMAKE_C_COMPILER=gcc-6 ..
    -- The C compiler identification is GNU 6.3.0
    -- The CXX compiler identification is GNU 6.3.0
    -- Checking whether C compiler has -isysroot
    -- Checking whether C compiler has -isysroot - yes
    -- Checking whether C compiler supports OSX deployment target flag
    -- Checking whether C compiler supports OSX deployment target flag - yes
    -- Check for working C compiler: /usr/local/bin/gcc-6
    -- Check for working C compiler: /usr/local/bin/gcc-6 -- works
    -- Detecting C compiler ABI info
    -- Detecting C compiler ABI info - done
    -- Detecting C compile features
    -- Detecting C compile features - done
    -- Checking whether CXX compiler has -isysroot
    -- Checking whether CXX compiler has -isysroot - yes
    -- Checking whether CXX compiler supports OSX deployment target flag
    -- Checking whether CXX compiler supports OSX deployment target flag - yes
    -- Check for working CXX compiler: /usr/local/bin/g++-6
    -- Check for working CXX compiler: /usr/local/bin/g++-6 -- works
    -- Detecting CXX compiler ABI info
    -- Detecting CXX compiler ABI info - done
    -- Detecting CXX compile features
    -- Detecting CXX compile features - done
    -- Try OpenMP C flag = [-fopenmp]
    -- Performing Test OpenMP_FLAG_DETECTED
    -- Performing Test OpenMP_FLAG_DETECTED - Success
    -- Try OpenMP CXX flag = [-fopenmp]
    -- Performing Test OpenMP_FLAG_DETECTED
    -- Performing Test OpenMP_FLAG_DETECTED - Success
    -- Found OpenMP: -fopenmp
    -- Configuring done
    CMake Warning (dev):
      Policy CMP0042 is not set: MACOSX_RPATH is enabled by default.  Run "cmake
      --help-policy CMP0042" for policy details.  Use the cmake_policy command to
      set the policy and suppress this warning.

      MACOSX_RPATH is not specified for the following targets:

       _lightgbm

    This warning is for project developers.  Use -Wno-dev to suppress it.

    -- Generating done
    -- Build files have been written to: /Users/aven/software/LightGBM/build

6. Build
    $ make -j
    Scanning dependencies of target _lightgbm
    Scanning dependencies of target lightgbm
    [ 2%] Building CXX object CMakeFiles/_lightgbm.dir/src/c_api.cpp.o
    [ 6%] Building CXX object CMakeFiles/_lightgbm.dir/src/application/application.cpp.o
    [ 8%] Building CXX object CMakeFiles/lightgbm.dir/src/main.cpp.o
    [ 8%] Building CXX object CMakeFiles/_lightgbm.dir/src/boosting/boosting.cpp.o
    [ 10%] Building CXX object CMakeFiles/lightgbm.dir/src/boosting/boosting.cpp.o
    [ 14%] Building CXX object CMakeFiles/lightgbm.dir/src/application/application.cpp.o
    [ 16%] Building CXX object CMakeFiles/_lightgbm.dir/src/boosting/gbdt.cpp.o
    [ 18%] Building CXX object CMakeFiles/lightgbm.dir/src/boosting/gbdt.cpp.o
    [ 20%] Building CXX object CMakeFiles/lightgbm.dir/src/io/dataset.cpp.o
    [ 20%] Building CXX object CMakeFiles/lightgbm.dir/src/io/bin.cpp.o
    [ 22%] Building CXX object CMakeFiles/_lightgbm.dir/src/io/bin.cpp.o
    [ 24%] Building CXX object CMakeFiles/_lightgbm.dir/src/io/config.cpp.o
    [ 30%] Building CXX object CMakeFiles/_lightgbm.dir/src/io/dataset.cpp.o
    [ 30%] Building CXX object CMakeFiles/lightgbm.dir/src/io/config.cpp.o
    [ 30%] Building CXX object CMakeFiles/lightgbm.dir/src/io/dataset_loader.cpp.o
    [ 32%] Building CXX object CMakeFiles/_lightgbm.dir/src/io/dataset_loader.cpp.o
    [ 38%] Building CXX object CMakeFiles/_lightgbm.dir/src/io/metadata.cpp.o
    [ 40%] Building CXX object CMakeFiles/_lightgbm.dir/src/metric/dcg_calculator.cpp.o
    [ 40%] Building CXX object CMakeFiles/_lightgbm.dir/src/io/parser.cpp.o
    [ 40%] Building CXX object CMakeFiles/_lightgbm.dir/src/io/tree.cpp.o
    [ 44%] Building CXX object CMakeFiles/lightgbm.dir/src/io/tree.cpp.o
    [ 46%] Building CXX object CMakeFiles/_lightgbm.dir/src/network/linkers_mpi.cpp.o
    [ 72%] Building CXX object CMakeFiles/lightgbm.dir/src/metric/metric.cpp.o
    [ 58%] Building CXX object CMakeFiles/_lightgbm.dir/src/network/linkers_socket.cpp.o
    [ 72%] Building CXX object CMakeFiles/lightgbm.dir/src/network/network.cpp.o
    [ 62%] Building CXX object CMakeFiles/lightgbm.dir/src/io/parser.cpp.o
    [ 72%] Building CXX object CMakeFiles/_lightgbm.dir/src/metric/metric.cpp.o
    [ 72%] Building CXX object CMakeFiles/_lightgbm.dir/src/network/network.cpp.o
    [ 72%] Building CXX object CMakeFiles/_lightgbm.dir/src/objective/objective_function.cpp.o
    [ 72%] Building CXX object CMakeFiles/lightgbm.dir/src/network/linkers_mpi.cpp.o
    [ 72%] Building CXX object CMakeFiles/lightgbm.dir/src/objective/objective_function.cpp.o
    [ 72%] Building CXX object CMakeFiles/lightgbm.dir/src/metric/dcg_calculator.cpp.o
    [ 72%] Building CXX object CMakeFiles/_lightgbm.dir/src/network/linker_topo.cpp.o
    [ 72%] Building CXX object CMakeFiles/lightgbm.dir/src/network/linker_topo.cpp.o
    [ 72%] Building CXX object CMakeFiles/lightgbm.dir/src/io/metadata.cpp.o
    [ 72%] Building CXX object CMakeFiles/lightgbm.dir/src/network/linkers_socket.cpp.o
    [ 74%] Building CXX object CMakeFiles/_lightgbm.dir/src/treelearner/data_parallel_tree_learner.cpp.o
    [ 76%] Building CXX object CMakeFiles/_lightgbm.dir/src/treelearner/serial_tree_learner.cpp.o
    [ 78%] Building CXX object CMakeFiles/_lightgbm.dir/src/treelearner/feature_parallel_tree_learner.cpp.o
    [ 80%] Building CXX object CMakeFiles/_lightgbm.dir/src/treelearner/gpu_tree_learner.cpp.o
    [ 82%] Building CXX object CMakeFiles/lightgbm.dir/src/treelearner/voting_parallel_tree_learner.cpp.o
    [ 84%] Building CXX object CMakeFiles/_lightgbm.dir/src/treelearner/tree_learner.cpp.o
    [ 86%] Building CXX object CMakeFiles/lightgbm.dir/src/treelearner/serial_tree_learner.cpp.o
    [ 88%] Building CXX object CMakeFiles/lightgbm.dir/src/treelearner/tree_learner.cpp.o
    [ 94%] Building CXX object CMakeFiles/lightgbm.dir/src/treelearner/gpu_tree_learner.cpp.o
    [ 94%] Building CXX object CMakeFiles/lightgbm.dir/src/treelearner/data_parallel_tree_learner.cpp.o
    [ 94%] Building CXX object CMakeFiles/lightgbm.dir/src/treelearner/feature_parallel_tree_learner.cpp.o
    [ 96%] Building CXX object CMakeFiles/_lightgbm.dir/src/treelearner/voting_parallel_tree_learner.cpp.o
    [100%] Linking CXX shared library ../lib_lightgbm.so
    [100%] Linking CXX executable ../lightgbm
    [100%] Built target lightgbm
    [100%] Built target _lightgbm

The build pauses for a while at 96%. Running "make -j" with no number places no limit on the number of parallel jobs, so it can saturate the CPU. If you want the build to have less impact on the machine, run "make -j1" instead, which uses a single core and will not max out the CPU.
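Between unlimited "make -j" and single-core "make -j1" there is a middle ground: passing an explicit job count. The sketch below is one way to pick that number, assuming getconf _NPROCESSORS_ONLN is available (it is on most Linux and macOS systems). pick_jobs is a hypothetical helper name, and "half the cores" is an arbitrary policy chosen for illustration, not a LightGBM recommendation. The script only prints the command it would suggest; it does not start a build.

```shell
#!/usr/bin/env bash
# pick_jobs (hypothetical helper): given a core count, print a job count
# that uses half the cores, with a minimum of one job.
pick_jobs() {
    local cores="$1"
    local jobs=$(( cores / 2 ))
    if [ "$jobs" -lt 1 ]; then
        jobs=1
    fi
    echo "$jobs"
}

# Ask the system for the number of online cores; fall back to 1 if unknown.
cores="$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"
cores="${cores:-1}"

# Print the suggested command instead of running it.
echo "make -j$(pick_jobs "$cores")"
```

On a four-core machine this prints "make -j2", which keeps roughly half the CPU free while still building faster than a single job.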
After a successful build, two files are generated in the parent directory: lib_lightgbm.so, a library file, and lightgbm, an executable.
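A quick way to confirm that both build products are present is to test for them from inside the build directory. This is a minimal sketch; check_artifacts is a hypothetical helper name, and the file names are the two mentioned above.

```shell
#!/usr/bin/env bash
# check_artifacts (hypothetical helper): report whether lib_lightgbm.so and
# lightgbm exist in the given directory; returns non-zero if either is missing.
check_artifacts() {
    local root="$1"
    local missing=0
    local f
    for f in lib_lightgbm.so lightgbm; do
        if [ -f "$root/$f" ]; then
            echo "found:   $root/$f"
        else
            echo "missing: $root/$f"
            missing=1
        fi
    done
    return "$missing"
}

# From inside build/, the checkout root is the parent directory.
check_artifacts .. || echo "build products not found; re-run cmake and make"
```

If either file is reported missing, the make output is the first place to look for the step that failed.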
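Related downloads:
lib_lightgbm.so library file and lightgbm executable
Related reading:
Installing and using Microsoft's open-source distributed high-performance GB framework LightGBM: Python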
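====================Document information=======================
Copyright notice: non-commercial reposting is permitted; keep the attribution and cite the source
Attribution (BY): testcs_dn(微wx笑)
Source: [无知人生,记录点滴](http://blog.csdn.NET/testcs_dn)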