bsmart.scans.ContourGP

Active Learning Scan to find a decision boundary using Gaussian Processes

Entropy gain taken from https://github.com/diana-hep/excursion by Lukas Heinrich, Gilles Louppe and Kyle Cranmer (unfortunately, that code no longer works).

Adapted by Mark Goodsell to replace the integration over fixed grids with integration using p(c | z), and to sample from a smaller choice of points (both changes save a lot of time), plus some improvements in robustness.

Settings

Setup: {
    "Type": "ContourGP",
    "Points": default 60, int -> number of points to check
    "Initial Points": default 5, int -> number of initial random points to use for initialisation
    "Noise": default None, float -> signal noise to include in the Gaussian Process
    "Grid Size": default 100, int -> number of points to include in the integration grids
    "Sample Size": default 50, int -> number of samples to check for the maximum entropy gain
    "Sample Multiplier": default 5, int -> the samples are selected from (this number * Sample Size) random points,
                         keeping those closest in value to the decision boundary
    "Function": string -> if present, a function f(x) of variables and/or observables; the scan solves f(x) = threshold (see below).
                If not present, a likelihood function is constructed from the observables in the usual way.
                One very useful option is to set "SCALING": "USER" in the observable if it is e.g. signal strength or signal strength - 1.
    "Threshold": default 0.0, float -> value subtracted from the above function, so that the contour we are looking for solves
                 f(x) - threshold = 0
    "Fail Value": default 10.0, float -> value to assign to failed points
    "Run Name": string -> name of the run
}
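
As an illustration, the settings above could be collected in Python as follows. This is a minimal sketch only: the names DEFAULTS and build_setup are hypothetical and not part of BSMArt, the run name is an invented example value, and the defaults are simply those listed in the block above. How BSMArt itself reads and validates the block is not shown here.

```python
import json

# Defaults as listed in the Settings block above.
# DEFAULTS is a hypothetical container for this example, not BSMArt code.
DEFAULTS = {
    "Points": 60,
    "Initial Points": 5,
    "Noise": None,
    "Grid Size": 100,
    "Sample Size": 50,
    "Sample Multiplier": 5,
    "Threshold": 0.0,
    "Fail Value": 10.0,
}


def build_setup(user_settings):
    """Merge user-supplied settings over the documented defaults."""
    setup = {"Type": "ContourGP", **DEFAULTS}
    setup.update(user_settings)
    return setup


# Example values chosen for illustration only.
setup = build_setup({"Points": 200, "Threshold": 1.0, "Run Name": "contour_test"})
print(json.dumps({"Setup": setup}, indent=4))
```

Settings that are not given explicitly keep the defaults, so only the values that differ need to be written out.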

Information

BSMArt Name: ContourGP

Requires:
  • scikit-learn

  • numpy

  • pandas

Settings:

  • Points: Int

  • Initial Points: Int

  • Noise: Float

  • Grid Size: Int

  • Sample Size: Int

  • Sample Multiplier: Int

  • Input_CSV_File: Path

  • Fail Value: Float

  • Threshold: Float

  • Function: String

class bsmart.scans.ContourGP.NewScan(inputs, log)

Bases: Scan

Scanner class for ContourGP scans

get_random_point()

postprocess(Point, observables, data_point, temp_dir, log, lock=None)

Get the function as a product of the likelihoods

run()

bsmart.scans.ContourGP.approx_mi_vec(mu, cov, threshold_start, threshold_end)

bsmart.scans.ContourGP.getgrids(GP, ndim=2, npoints=50, thresholds=[-inf, 0.0, inf])

Get grids via a shooting algorithm according to p(c | z)
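
The class probabilities p(c | z) can be illustrated for a single point. The sketch below assumes that the Gaussian Process posterior at that point is a normal distribution with mean mu and standard deviation sigma, so that the probability of each class is the Gaussian probability mass between consecutive thresholds. The function names are hypothetical, and this is not the code used by getgrids.

```python
import math


def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution, with explicit handling of infinite arguments."""
    if x == math.inf:
        return 1.0
    if x == -math.inf:
        return 0.0
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def class_probabilities(mu, sigma, thresholds=(-math.inf, 0.0, math.inf)):
    """Probability that the function value lies in each interval between thresholds."""
    cdf = [normal_cdf(t, mu, sigma) for t in thresholds]
    return [hi - lo for lo, hi in zip(cdf[:-1], cdf[1:])]


def entropy(probs):
    """Shannon entropy in nats, skipping zero-probability classes."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# A point whose predicted mean sits on the boundary, and one far from it.
on_boundary = class_probabilities(mu=0.0, sigma=1.0)
far_away = class_probabilities(mu=5.0, sigma=1.0)
print(on_boundary, entropy(on_boundary))
print(far_away, entropy(far_away))
```

A point whose predicted mean lies on the boundary has the maximal class entropy for two classes, while a point far from the boundary has almost none.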

bsmart.scans.ContourGP.getnewcandidates(GP, ndim=2, npoints=100, multiplier=6)

Get a set of test candidates, selected from those nearest to the boundary
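
The selection controlled by the "Sample Size" and "Sample Multiplier" settings can be sketched as follows: draw npoints * multiplier random points, then keep the npoints whose predicted value is closest to the boundary. Here select_candidates and predict_mean are hypothetical names, predict_mean is a toy stand-in for the Gaussian Process mean prediction, and sampling in the unit hypercube is an assumption made for this example. It is not the implementation of getnewcandidates.

```python
import random


def select_candidates(predict_mean, ndim=2, npoints=100, multiplier=6,
                      threshold=0.0, seed=0):
    """Keep the npoints random points whose predicted value is closest to threshold."""
    rng = random.Random(seed)
    # Draw a larger pool of random points in the unit hypercube (assumption).
    pool = [[rng.random() for _ in range(ndim)]
            for _ in range(npoints * multiplier)]
    # Order the pool by distance of the prediction from the boundary value.
    pool.sort(key=lambda x: abs(predict_mean(x) - threshold))
    return pool[:npoints]


def predict_mean(x):
    # Toy stand-in for the GP mean: zero on a circle of radius 0.5.
    return x[0] ** 2 + x[1] ** 2 - 0.25


candidates = select_candidates(predict_mean, npoints=10, multiplier=6)
for point in candidates:
    print(point, predict_mean(point))
```

Only the retained points then need the more expensive entropy gain evaluation, which is where the time saving comes from.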

bsmart.scans.ContourGP.info_gain(x_candidate, gp, thresholds, meanXs, noise=0.0)

bsmart.scans.ContourGP.warn(*args, **kwargs)