What are parameter importances? How can I access them?

What are parameter importances?

A parameter’s importance value is a measure of that parameter’s capacity to predict optimized metric values. It is estimated by fitting an ExtraTreesRegressor to the observed data and reading the feature_importances_ of the fitted model, which reflect how much splitting on each parameter reduces impurity across the trees. If you have two optimized metrics, the importances are computed independently for each metric. Parameters assigned high importance are more likely to have good values concentrated in specific regions, which is what gives them high predictive capacity. If good and bad metric values occur over the full parameter domain, the importances will likely show less signal.
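
To make this concrete, here is a minimal sketch of the idea using scikit-learn, assuming the observed parameter assignments and metric values are available as a pandas DataFrame; it is illustrative only, not SigOpt’s exact implementation.

import pandas as pd
from sklearn.ensemble import ExtraTreesRegressor

def estimate_importances(observations: pd.DataFrame, parameter_names, metric_names):
    """Fit one ExtraTreesRegressor per metric and return its impurity-based
    feature_importances_, keyed by parameter name."""
    X = observations[parameter_names].to_numpy()
    importances = {}
    for metric in metric_names:
        # Each metric gets its own independent fit and importance estimate.
        model = ExtraTreesRegressor(n_estimators=100, random_state=0)
        model.fit(X, observations[metric].to_numpy())
        importances[metric] = dict(zip(parameter_names, model.feature_importances_))
    return importances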

Below is a scenario displaying parameter data alongside the associated importance values.

We can see the reverse situation below, where the location of the optimum is the same and the shape along each dimension is the same, but x1 has a much more prominent impact on the value of f than x2 does.
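
As a self-contained toy version of this scenario (the parameter names, function, and scales below are invented for illustration), fitting an ExtraTreesRegressor to samples of a function dominated by x1 concentrates nearly all of the impurity-based importance on x1:

import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

rng = np.random.default_rng(0)
x1 = rng.uniform(-1, 1, 300)
x2 = rng.uniform(-1, 1, 300)
# Same parabolic shape and optimum along each dimension, very different scales
f = -10.0 * x1 ** 2 - 0.1 * x2 ** 2

model = ExtraTreesRegressor(n_estimators=100, random_state=0)
model.fit(np.column_stack([x1, x2]), f)
print(dict(zip(["x1", "x2"], model.feature_importances_)))  # x1 close to 1, x2 close to 0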

How can I access parameter importances in Python?

import sigopt

conn = sigopt.Connection(client_token=API_TOKEN)
metric_importances = conn.experiments(EXPERIMENT_ID).metric_importances().fetch()
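
Assuming the fetched object is a paginated list whose entries expose a metric name and a mapping from parameter name to importance value (the field names below should be checked against the API reference for your client version), the results could be read like this:

for metric_importance in metric_importances.data:
    # `metric` and `importances` are assumed field names; verify against the API docs.
    print(metric_importance.metric, metric_importance.importances)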

What should I know about this visualization’s reliability?

The importance results are biased because the data-generating distribution being modeled depends strongly on SigOpt’s search strategy, and there is also variance in their computation. Below is an example with only 30 points from an actual SigOpt optimization.

While the importances can still be close, having only 30 points introduces substantial variance into the computed importance quantities. When the experiment above was rerun 20 times, the importance bars varied as depicted below (the red region shows the range between the maximum and minimum importance for each parameter).

Visualizing effect of sparsity on importance variance
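
The effect can be reproduced qualitatively on synthetic data (the function, noise level, and model settings below are invented and are not the experiment shown above): refitting an ExtraTreesRegressor on repeated 30-point samples shows how widely the importances can swing.

import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

def sample_importances(n_points, rng):
    # Draw a fresh small "experiment" and compute impurity-based importances.
    x = rng.uniform(-1, 1, size=(n_points, 2))
    f = -3.0 * x[:, 0] ** 2 - 1.0 * x[:, 1] ** 2 + rng.normal(0, 0.1, n_points)
    model = ExtraTreesRegressor(n_estimators=100, random_state=0)
    model.fit(x, f)
    return model.feature_importances_

rng = np.random.default_rng(0)
runs = np.array([sample_importances(30, rng) for _ in range(20)])
print("min importances:", runs.min(axis=0))  # analogous to the lower edge of the red region
print("max importances:", runs.max(axis=0))  # analogous to the upper edge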
