bsmart.scans.ContourGP
Active Learning Scan to find a decision boundary using Gaussian Processes
Entropy gain taken from https://github.com/diana-hep/excursion by Lukas Heinrich, Gilles Louppe and Kyle Cranmer (unfortunately, that code no longer works).
Adapted by Mark Goodsell to replace the integration over fixed grids with integration using p(c | z), and to sample from a smaller choice of points (both changes save a lot of time), plus some improvements in robustness.
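To illustrate what p(c | z) means here, the following is a minimal sketch and not the BSMArt implementation. It assumes the classes are the intervals between consecutive thresholds (for example below and above 0.0) and that the Gaussian Process prediction at each point is a normal distribution with the returned mean and standard deviation. The helper names `class_probabilities` and `class_entropy`, the toy function and the use of `scipy.stats.norm` are all assumptions made for illustration. The sketch computes the entropy of the class probabilities, which is the quantity the entropy gain is built from, not the gain itself.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF


def class_probabilities(mean, std, thresholds):
    """p(c | z): probability that a normal prediction falls in each threshold interval."""
    mean = np.asarray(mean, dtype=float)[:, None]
    std = np.maximum(np.asarray(std, dtype=float), 1e-12)[:, None]
    edges = np.asarray(thresholds, dtype=float)[None, :]
    cdf = norm.cdf((edges - mean) / std)
    return np.diff(cdf, axis=1)  # shape (n_points, n_classes)


def class_entropy(probs):
    """Shannon entropy of the class probabilities at each point."""
    p = np.clip(probs, 1e-12, 1.0)
    return -np.sum(p * np.log(p), axis=1)


def toy_function(x):
    """Hypothetical target: the contour f(x) = 0 is the unit circle."""
    return x[:, 0] ** 2 + x[:, 1] ** 2 - 1.0


rng = np.random.default_rng(0)
x_train = rng.uniform(-2.0, 2.0, size=(20, 2))
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-6, normalize_y=True)
gp.fit(x_train, toy_function(x_train))

# Entropy is largest where the GP cannot tell which side of the boundary a point is on.
x_candidates = rng.uniform(-2.0, 2.0, size=(200, 2))
mean, std = gp.predict(x_candidates, return_std=True)
entropy = class_entropy(class_probabilities(mean, std, [-np.inf, 0.0, np.inf]))
most_uncertain = x_candidates[np.argmax(entropy)]
```

Points with high entropy lie where the predicted value is close to the threshold relative to its uncertainty, which is why they are natural candidates for the next evaluation.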
Settings
Setup: {
"Type": "ContourGP",
"Points": default 60, int -> number of points to check
"Initial Points": default 5, int -> number of initial random points to use for initialisation
"Noise": default None, float -> signal noise to include in the Gaussian Process
"Grid Size": default 100, int -> number of points to include in the integration grids
"Sample Size": default 50, int -> number of samples to check for the maximum entropy gain
"Sample Multiplier": default 5, int -> the sample is selected from a pool of (this number * the sample size) random points;
the points kept for checking are those closest in value to the decision boundary
"Function": if present, this specifies a function of variables and/or observables to solve for f(x) = threshold (see below)
if not present, we construct a likelihood function from the observables in the usual way.
One very useful option is to set "SCALING": "USER" in the observable if it is, e.g., a signal strength or (signal strength - 1).
"Threshold": default 0.0, float -> value to subtract from the above function so that the contour we are looking for solves
f(x) = 0
"Fail Value": default 10.0, float -> value to assign to failed points
"Run Name": string -> name of the run
}
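The interplay of "Sample Size" and "Sample Multiplier" can be sketched as follows. This is a hedged illustration, not the BSMArt code: the function name `select_candidates`, the uniform sampling within `bounds`, and the use of a predicted function value (with the threshold already subtracted) to measure closeness to the boundary are assumptions.

```python
import numpy as np


def select_candidates(predict_value, bounds, sample_size=50, multiplier=5, rng=None):
    """Draw sample_size * multiplier random points and keep the sample_size
    points whose predicted value is closest to the boundary at 0."""
    rng = np.random.default_rng(rng)
    bounds = np.asarray(bounds, dtype=float)  # shape (ndim, 2): lower and upper limits
    n_pool = sample_size * multiplier
    pool = rng.uniform(bounds[:, 0], bounds[:, 1], size=(n_pool, bounds.shape[0]))
    distance = np.abs(predict_value(pool))  # |f(x) - threshold|, threshold pre-subtracted
    keep = np.argsort(distance)[:sample_size]
    return pool[keep]


def toy_value(x):
    """Hypothetical stand-in for the Gaussian Process prediction: unit circle contour."""
    return x[:, 0] ** 2 + x[:, 1] ** 2 - 1.0


candidates = select_candidates(toy_value, [(-2.0, 2.0), (-2.0, 2.0)],
                               sample_size=50, multiplier=5, rng=0)
```

With the defaults, 250 random points are drawn and only the 50 nearest the contour are passed on to the more expensive entropy-gain evaluation, which is where the time saving comes from.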
Information
BSMArt Name: ContourGP
- Requires:
scikit-learn
numpy
pandas
Settings:
Points: Int
Initial Points: Int
Noise: Float
Grid Size: Int
Sample Size: Int
Sample Multiplier: Int
Input_CSV_File: Path
Fail Value: Float
Threshold: Float
Function: String
- class bsmart.scans.ContourGP.NewScan(inputs, log)
Bases: Scan
Scanner class for Grid Scans
- bsmart.scans.ContourGP.getgrids(GP, ndim=2, npoints=50, thresholds=[-inf, 0.0, inf])
Get grids via a shooting algorithm according to p(c | z).