Xgbfi: Feature Importance Analysis (an XGBoost Extension)

star2017 · 1 year ago · 322 views

Xgbfi

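Xgbfi analyzes the importance of each feature in a trained XGBoost model. You can, of course, also use an fmap (feature map) file to inspect the features.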
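
For reference, an fmap file is a plain-text file that XGBoost uses to attach names to feature indices in a model dump. Each line holds a feature index, a name, and a type (`q` for quantitative, `i` for indicator, `int` for integer). The sketch below writes such a file; the helper name `write_fmap`, the temporary output location, and the choice to replace spaces in names are illustrative assumptions, not part of Xgbfi.

```python
import tempfile
from pathlib import Path


def write_fmap(feature_names, path, feature_type="q"):
    """Write one 'index<TAB>name<TAB>type' line per feature and return the lines."""
    lines = []
    for index, name in enumerate(feature_names):
        # Precaution (an assumption of this sketch): avoid spaces in names,
        # since whitespace may be treated as a field separator.
        safe_name = str(name).replace(" ", "_")
        lines.append(f"{index}\t{safe_name}\t{feature_type}")
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines


# Hypothetical output location for this example
fmap_path = Path(tempfile.gettempdir()) / "featmap.txt"
fmap_lines = write_fmap(["sepal length (cm)", "sepal width (cm)"], fmap_path)
print("\n".join(fmap_lines))
```

A file like this can be passed to XGBoost when dumping a model, for example through the `fmap` argument of `Booster.dump_model`.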

What is Xgbfi?

Xgbfi is an XGBoost model dump parser that ranks features, as well as feature interactions, by several metrics.
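
To make "model dump parser" more concrete, here is a minimal sketch of what parsing a text dump can look like. It assumes the text format XGBoost writes when statistics are enabled, where a split line looks like `0:[f2<2.45] yes=1,no=2,missing=1,gain=72.2968,cover=66.6667`. The sample dump and the helper `total_gain_per_feature` are made up for illustration; this is not Xgbfi's actual implementation, and it only ranks single features by total gain.

```python
import re
from collections import defaultdict

# A split line: node id, "[feature<threshold]", then comma-separated statistics.
SPLIT_RE = re.compile(r"^\s*\d+:\[([^<\]]+)<[^\]]+\].*?gain=([-+0-9.eE]+)")


def total_gain_per_feature(dump_text):
    """Sum the gain of every split per feature and sort by descending gain."""
    gains = defaultdict(float)
    for line in dump_text.splitlines():
        match = SPLIT_RE.match(line)
        if match is None:
            continue  # skips leaf lines ("1:leaf=...") and "booster[n]:" headers
        gains[match.group(1)] += float(match.group(2))
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Invented sample with two small trees
SAMPLE_DUMP = """booster[0]:
0:[f2<2.5] yes=1,no=2,missing=1,gain=70.0,cover=60.0
  1:leaf=0.4,cover=20.0
  2:[f3<1.5] yes=3,no=4,missing=3,gain=30.0,cover=40.0
    3:leaf=-0.2,cover=25.0
    4:leaf=0.1,cover=15.0
booster[1]:
0:[f2<4.5] yes=1,no=2,missing=1,gain=12.5,cover=50.0
  1:leaf=0.3,cover=30.0
  2:leaf=-0.1,cover=20.0
"""

ranking = total_gain_per_feature(SAMPLE_DUMP)
print(ranking)
```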

Siblings

Xgbfir: a Python port

Metrics

  • Gain: Total gain of each feature or feature interaction
  • FScore: Number of possible splits taken on a feature or feature interaction
  • wFScore: Number of possible splits taken on a feature or feature interaction, weighted by the probability of the splits taking place
  • Average wFScore: wFScore divided by FScore
  • Average Gain: Gain divided by FScore
  • Expected Gain: Total gain of each feature or feature interaction weighted by the probability to gather the gain
  • Average Tree Index
  • Average Tree Depth
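
A small worked example may help relate these metrics to each other. The sketch below aggregates a handful of hypothetical split records, each holding a feature, a gain, and a probability. The record layout and the numbers are invented for illustration, and using one and the same probability for both wFScore and Expected Gain is a simplifying assumption; how Xgbfi derives its probabilities internally is not shown here.

```python
from collections import defaultdict

# Hypothetical splits: (feature, gain, probability associated with the split)
splits = [
    ("age", 10.0, 1.0),
    ("age", 4.0, 0.5),
    ("income", 6.0, 0.5),
    ("age", 2.0, 0.25),
]


def summarize(split_records):
    """Aggregate per-feature Gain, FScore, wFScore, Expected Gain and averages."""
    totals = defaultdict(
        lambda: {"Gain": 0.0, "FScore": 0, "wFScore": 0.0, "Expected Gain": 0.0}
    )
    for feature, gain, probability in split_records:
        row = totals[feature]
        row["Gain"] += gain                         # total gain
        row["FScore"] += 1                          # one more split on this feature
        row["wFScore"] += probability               # split count weighted by probability
        row["Expected Gain"] += gain * probability  # gain weighted by probability
    for row in totals.values():
        row["Average wFScore"] = row["wFScore"] / row["FScore"]
        row["Average Gain"] = row["Gain"] / row["FScore"]
    return dict(totals)


metrics = summarize(splits)
for feature, row in metrics.items():
    print(feature, row)
```

With these numbers, `age` has three splits (FScore 3) with a total Gain of 16.0, while its Expected Gain is lower (12.5) because two of its splits carry a probability below 1.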

Additional features

  • Leaf Statistics
  • Split Value Histograms

Notes on the metrics:

[Figure: explanation of the metrics (7f8a56c2a5cbc02e51ff8a728a2e67e9.png)]

Installing the Python package

Using pip

You can install using the pip package manager by running

pip install xgbfir

From source

Clone the repo and install:

git clone https://github.com/limexp/xgbfir.git
cd xgbfir
sudo python setup.py install

Or download the source code by pressing 'Download ZIP' on the GitHub repository page. Then navigate to the extracted directory and run

sudo python setup.py install
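
After either installation route, a quick check from Python can confirm that the packages are importable. This is only a convenience sketch; the helper name `is_installed` is an illustrative choice.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the top-level module can be found on the current Python path."""
    return importlib.util.find_spec(module_name) is not None


for name in ("xgboost", "xgbfir"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```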

Quick start

from sklearn.datasets import load_iris, load_boston
import xgboost as xgb
import xgbfir

# Regression example. Note: load_boston was removed in scikit-learn 1.2,
# so this part needs an older scikit-learn release.
boston = load_boston()
# Train an XGBoost regressor with default parameters
xgb_rmodel = xgb.XGBRegressor().fit(boston['data'], boston['target'])
# Save the report to an Excel file with proper feature names
xgbfir.saveXgbFI(xgb_rmodel, feature_names=boston.feature_names, OutputXlsxFile='bostonFI.xlsx')

# Classification example: load the iris dataset
iris = load_iris()
# Train an XGBoost classifier with default parameters
xgb_cmodel = xgb.XGBClassifier().fit(iris['data'], iris['target'])
# Save the report to an Excel file with proper feature names
xgbfir.saveXgbFI(xgb_cmodel, feature_names=iris.feature_names, OutputXlsxFile='irisFI.xlsx')

Now take a look at the generated Excel file:

[Figure: screenshot of the generated Excel file (503bc31f7481aff5eee11dfab37668ff.png)]
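
If you would rather inspect the workbook from Python than open it in a spreadsheet program, the following sketch lists the worksheets in an `.xlsx` file using only the standard library (an `.xlsx` file is a ZIP archive whose `xl/workbook.xml` names the sheets). The helper `list_sheet_names` is an illustrative name, and the sheet names you get back depend on the xgbfir version that produced the file.

```python
import os
import xml.etree.ElementTree as ET
import zipfile

# XML namespace used by SpreadsheetML workbook files
MAIN_NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def list_sheet_names(xlsx_path):
    """Return the worksheet names stored in an .xlsx file, in workbook order."""
    with zipfile.ZipFile(xlsx_path) as archive:
        root = ET.fromstring(archive.read("xl/workbook.xml"))
    return [sheet.get("name") for sheet in root.iter(f"{MAIN_NS}sheet")]


# Only runs if the quick-start script has already produced the file
if os.path.exists("irisFI.xlsx"):
    print(list_sheet_names("irisFI.xlsx"))
```

Once the sheet names are known, a library such as pandas can load an individual sheet into a table for sorting and filtering.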

References

https://github.com/limexp/xgbfir

https://github.com/Far0n/xgbfi

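This article comes from 算法之道. The views expressed do not represent the position of 一起大数据-技术文章心得. If you reprint it, please credit the source: https://www.deeplearn.me/2375.html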
