tsudalab/DT-sampler

DT-Sampler

Details

You can find more details about DT-Sampler at https://arxiv.org/abs/2307.13333.

Abstract

DT-sampler is an ensemble model based on decision tree sampling. Unlike random forest, DT-sampler samples decision trees uniformly from a given space, which yields more stable results and higher interpretability than random forest. DT-sampler has only two key parameters: #node and threshold. #node constrains the size of the decision trees that DT-sampler generates, and threshold sets a minimum training accuracy that every sampled tree must reach.
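To make the role of threshold and of uniform sampling concrete, the toy sketch below enumerates a tiny space of one-split trees by brute force, keeps those whose training accuracy is at least `threshold`, and draws uniformly from what remains. It is an illustration only: DT-sampler encodes trees as a SAT problem rather than enumerating them, and the dataset, the `stump_accuracy` helper, and the restriction to single-split trees are all assumptions made here.

```python
import random

# Toy binary dataset (hypothetical): rows are samples, columns are binary features.
X = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
]
y = [1, 1, 0, 0, 1, 0]  # labels coincide with feature 0

def stump_accuracy(feature, positive_when):
    """Training accuracy of a one-split tree that predicts 1
    exactly when X[i][feature] == positive_when."""
    hits = sum(
        (row[feature] == positive_when) == bool(label)
        for row, label in zip(X, y)
    )
    return hits / len(y)

threshold = 0.6  # minimum training accuracy a tree must reach

# Enumerate every candidate tree in this (tiny) space.
space = [(f, v) for f in range(3) for v in (0, 1)]

# The "high accuracy space": candidates that meet the threshold.
high_accuracy_space = [t for t in space if stump_accuracy(*t) >= threshold]

# Uniform sampling: every accepted tree is equally likely to be drawn.
rng = random.Random(0)
sampled = [rng.choice(high_accuracy_space) for _ in range(5)]
```

Brute-force enumeration is feasible only for very small spaces, which is why a SAT encoding combined with a uniform SAT sampler is used instead.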

① Encode the construction of decision trees as a SAT problem.
② Use a SAT sampler to uniformly sample multiple satisfying assignments from the high-accuracy space.
③ Decode the satisfying assignments back into decision trees.
④ Estimate the training accuracy distribution of the decision trees in the high accuracy space.
⑤ Measure feature importance by calculating the emergence probability of each feature.
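
Steps ④ and ⑤ can be illustrated with a small sketch. It assumes each sampled tree has already been decoded into the set of feature indices it splits on, together with its training accuracy. This representation and the sample values are hypothetical and are not DT-sampler's internal format. Emergence probability is read here as the fraction of sampled trees in which a feature appears.

```python
from collections import Counter

# Hypothetical decoded samples: (features used by the tree, training accuracy).
sampled_trees = [
    ({0, 2}, 0.90),
    ({0, 1}, 0.85),
    ({0}, 0.90),
    ({1, 2}, 0.80),
]

def accuracy_distribution(trees):
    """Empirical distribution of training accuracy over the sampled trees."""
    counts = Counter(acc for _, acc in trees)
    return {acc: n / len(trees) for acc, n in sorted(counts.items())}

def emergence_probability(trees, n_features):
    """Fraction of sampled trees in which each feature appears at least once."""
    counts = Counter(f for features, _ in trees for f in features)
    return [counts[f] / len(trees) for f in range(n_features)]

acc_dist = accuracy_distribution(sampled_trees)
importance = emergence_probability(sampled_trees, n_features=3)
```

Because the trees are drawn uniformly, these sample frequencies serve as estimates of the corresponding quantities over the whole high-accuracy space, and the estimates tighten as more trees are sampled.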

Requirements

matplotlib == 3.6.3
numpy == 1.21.0
pandas == 1.5.3
pyunigen == 2.5.2
scikit_learn == 1.2.1
scipy == 1.11.1
z3_solver == 4.12.1.0

Quick Start

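...
# Placeholders: n_nodes stands for #node, n_trees for #tree, and seed for the sampler's random seed.
dt_sampler = DT_sampler(X_train, y_train, n_nodes, threshold, "./cnf/cnf_name.cnf")
# The keyword name "seed" is assumed here; check the signature of run() in the repository.
dt_sampler.run(n_trees, method="unigen", seed=seed)
...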

Contact

Chao Huang ([email protected])
Department of Computational Biology and Medical Science
The University of Tokyo
