
Conversation

matthewdeng (Contributor) commented Dec 3, 2025

Description

Replace GPy with scikit-learn's GaussianProcessRegressor in the PB2 (Population Based Bandits) scheduler.

Additional information

  • Rewrote TV_SquaredExp kernel to implement sklearn's Kernel interface instead of GPy's Kern
  • Replaced GPy.models.GPRegression with sklearn.gaussian_process.GaussianProcessRegressor (see the sketch below)
  • Removed gpy from tune-test-requirements.txt
  • Updated documentation to only require scikit-learn (which is already a Ray dependency)
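
A minimal sketch of the swap, for readers skimming the diff (here kernel, X, and y stand for the PB2 kernel and training data; the optimizer and alpha values match the review thread below):

    # Before (GPy, removed):
    #   m = GPy.models.GPRegression(X, y, kernel)
    #   m.optimize()
    # After (scikit-learn):
    from sklearn.gaussian_process import GaussianProcessRegressor

    m = GaussianProcessRegressor(
        kernel=kernel, optimizer="fmin_l_bfgs_b", alpha=1e-10
    )
    m.fit(X, y)  # hyperparameter optimization happens inside fit()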

Testing

Re-ran existing PB2 tests in test_trial_scheduler_pbt.py to validate the change.

Signed-off-by: Matthew Deng <[email protected]>
matthewdeng marked this pull request as ready for review December 9, 2025 19:49
matthewdeng requested review from a team as code owners December 9, 2025 19:49
matthewdeng added the go (add ONLY when ready to merge, run all tests) label Dec 9, 2025
Signed-off-by: Matthew Deng <[email protected]>
matthewdeng removed the go (add ONLY when ready to merge, run all tests) label Dec 9, 2025
Signed-off-by: Matthew Deng <[email protected]>
TimothySeah (Contributor) left a comment:

I'm not super familiar with Gaussian processes or their implementations in scikit-learn/GPy. Also, the GPy -> scikit-learn migration does not appear to be a one-for-one replacement. It might be worth getting a review from someone more familiar and/or having a good testing plan to ensure no regressions.


    try:
    -   m.optimize()
    +   m = GaussianProcessRegressor(
Contributor:

Is m = GPy.models.GPRegression(X, y, kernel) equivalent to

m = GaussianProcessRegressor(
    kernel=kernel, optimizer="fmin_l_bfgs_b", alpha=1e-10
)
m.fit(X, y)

? It might be worth double-checking whether this matters.

matthewdeng (Contributor Author):

Yeah, these are the default values of GaussianProcessRegressor.
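
For reference, a quick way to confirm those defaults in a current scikit-learn release:

    import inspect

    from sklearn.gaussian_process import GaussianProcessRegressor

    sig = inspect.signature(GaussianProcessRegressor.__init__)
    print(sig.parameters["optimizer"].default)  # fmin_l_bfgs_b
    print(sig.parameters["alpha"].default)      # 1e-10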


    def __init__(
    -   self, input_dim, variance=1.0, lengthscale=1.0, epsilon=0.0, active_dims=None
    +   self,
Contributor:

Do input_dim and active_dims no longer matter?

matthewdeng (Contributor Author):

Yeah, sklearn's Kernel doesn't need these passed in at construction; it determines the dimensionality from the data at call time.
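
As a rough illustration (a hypothetical toy kernel, not the actual TV_SquaredExp code), sklearn's Kernel interface reads the dimensionality off the inputs inside __call__:

    import numpy as np
    from sklearn.gaussian_process.kernels import Kernel

    class LinearToyKernel(Kernel):
        # Hypothetical example: no input_dim/active_dims at construction;
        # the input dimensionality is whatever X.shape[1] is at call time.
        def __call__(self, X, Y=None, eval_gradient=False):
            X = np.atleast_2d(X)
            Y = X if Y is None else np.atleast_2d(Y)
            K = X @ Y.T  # a simple linear kernel for illustration
            if eval_gradient:
                # No hyperparameters, so the gradient tensor is empty.
                return K, np.empty((X.shape[0], X.shape[0], 0))
            return K

        def diag(self, X):
            return np.einsum("ij,ij->i", X, X)

        def is_stationary(self):
            return False

    K = LinearToyKernel()(np.random.rand(5, 3))  # works for any input dimension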

    if Y is None:
        Y = X

    epsilon = np.clip(self.epsilon, 1e-5, 0.5)
Contributor:

Where did the 1e-5 lower bound come from?

matthewdeng (Contributor Author):

It's consistent with epsilon_bounds. I think sklearn should already handle this, but I kept it for consistency with the previous logic.
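
Spelled out (the (1e-5, 0.5) bounds are taken from the clip in the diff and assumed to mirror epsilon_bounds):

    import numpy as np

    EPSILON_BOUNDS = (1e-5, 0.5)  # assumed to match the kernel's epsilon_bounds

    # sklearn's optimizer already keeps hyperparameters inside their bounds,
    # so the explicit clip is defensive, preserving the old GPy behavior:
    epsilon = np.clip(0.7, *EPSILON_BOUNDS)
    print(epsilon)  # 0.5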

    def __call__(self, X, Y=None, eval_gradient=False):
        X = np.atleast_2d(X)
        if Y is None:
            Y = X
Contributor:

Do we need to do Y = np.copy(X) since it was X2 = np.copy(X) before?

matthewdeng (Contributor Author):

No, because we replaced the in-place modification of

        X = X[:, 1:]
        X2 = X2[:, 1:]

with

        X_spatial = X[:, 1:]
        Y_spatial = Y[:, 1:]
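
A small sketch of why the alias is safe:

    import numpy as np

    X = np.arange(6.0).reshape(3, 2)
    Y = X  # alias instead of np.copy(X)

    # New names are bound rather than rebinding X/Y, and the slices are
    # only ever read, so nothing mutates the shared buffer:
    X_spatial = X[:, 1:]
    Y_spatial = Y[:, 1:]
    assert np.shares_memory(X, X_spatial)  # a view, but used read-only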

    return self.variance * np.ones(X.shape[0])
    if eval_gradient:
        K_gradient_variance = K
    dist2 = np.square(euclidean_distances(X_spatial, Y_spatial))
TimothySeah (Contributor) commented Dec 9, 2025:

Not sure why this is no longer divided by lengthscale; see the previous code: dist2 = np.square(euclidean_distances(X, X2)) / self.lengthscale

IIUC, __call__ in the scikit-learn implementation is similar to K + update_gradients_full in the GPy implementation, but the update_gradients_full portion seems different.

matthewdeng (Contributor Author):

Yeah, this is because sklearn parameterizes kernel hyperparameters in log space, so the gradient is taken with respect to log(lengthscale).
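
To spell out the chain rule (a sketch; the exponential form exp(-dist2 / lengthscale) is assumed from the kernel's definition):

    import numpy as np

    # If K = variance * exp(-dist2 / lengthscale), then
    #   dK/d(lengthscale)     = K * dist2 / lengthscale**2
    #   dK/d(log lengthscale) = lengthscale * dK/d(lengthscale)
    #                         = K * dist2 / lengthscale
    # sklearn expects gradients w.r.t. log-hyperparameters, so the raw
    # (undivided) dist2 is what the new code wants.
    lengthscale = 2.0
    dist2 = np.array([[1.0, 4.0]])
    K = np.exp(-dist2 / lengthscale)
    dK_dlog_lengthscale = K * dist2 / lengthscale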

    m.fit(X, y)
    -   scores.append(m.log_likelihood())
    +   scores.append(m.log_marginal_likelihood_value_)
Contributor:

Log likelihood isn't quite the same as log marginal likelihood, but maybe that doesn't affect this algorithm in a meaningful way?

matthewdeng (Contributor Author):

GPy's implementation returns _log_marginal_likelihood, so it's the same thing.
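
For reference, the sklearn side on a toy fit:

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor

    # After fit(), sklearn caches the log marginal likelihood of the
    # optimized hyperparameters, the same quantity GPy returned from
    # m.log_likelihood() (its internal _log_marginal_likelihood).
    rng = np.random.default_rng(0)
    X, y = rng.random((10, 2)), rng.random(10)
    m = GaussianProcessRegressor().fit(X, y)
    print(m.log_marginal_likelihood_value_)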

ray-gardener (bot) added the tune (Tune-related issues) and docs (An issue or change related to documentation) labels Dec 10, 2025
matthewdeng added the go (add ONLY when ready to merge, run all tests) label Dec 10, 2025
TimothySeah (Contributor) left a comment:

LGTM, but it might be worth adding testing notes to the PR description.

matthewdeng merged commit 9d75c1e into ray-project:master Dec 11, 2025
6 checks passed

Labels

docs (An issue or change related to documentation), go (add ONLY when ready to merge, run all tests), tune (Tune-related issues)
