Gaussian Process Optimization using GPy

Overview

End of maintenance for GPyOpt

Dear GPyOpt community!

We would like to acknowledge the obvious. The core team of GPyOpt has moved on, and over the past months we weren't giving the package nearly as much attention as it deserves. Instead of dragging our feet and giving people only occasional replies and no new features, we feel the time has come to officially declare the end of GPyOpt maintenance.

We would like to thank the community that has formed around GPyOpt. Without your interest, discussions, bug fixes and pull requests the package would never be as successful as it is. We hope we were able to provide you with a useful tool to aid your research and work.

From now on we won't be participating in the issues, merging PRs or developing any new functions. All existing PRs will be closed (but not the issues). The repo itself is not closing though, so feel free to start new discussion threads and forks. We are also still around, so may drop an occasional comment here or there. But no promises.

Finally, if you feel really enthusiastic and would like to take over the package, feel free to drop both of us an email, and who knows, maybe you'll be the one(s) carrying the GPyOpt to new heights!

Sincerely yours, Andrei Paleyes and Javier Gonzalez

GPyOpt

Gaussian process optimization using GPy. Performs global optimization with different acquisition functions. Among other functionalities, it is possible to use GPyOpt to optimize physical experiments (sequentially or in batches) and tune the parameters of Machine Learning algorithms. It is able to handle large data sets via sparse Gaussian process models.

licence develstat covdevel Research software impact

Citation

@Misc{gpyopt2016,
author = {The GPyOpt authors},
title = {{GPyOpt}: A Bayesian Optimization framework in python},
howpublished = {\url{http://github.com/SheffieldML/GPyOpt}},
year = {2016}
}

Getting started

Installing with pip

The simplest way to install GPyOpt is using pip. ubuntu users can do:

sudo apt-get install python-pip
pip install gpyopt

If you'd like to install from source, or want to contribute to the project (e.g. by sending pull requests via github), read on. Clone the repository in GitHub and add it to your $PYTHONPATH.

git clone https://github.com/SheffieldML/GPyOpt.git
cd GPyOpt
python setup.py develop

Dependencies:

  • GPy
  • paramz
  • numpy
  • scipy
  • matplotlib
  • DIRECT (optional)
  • cma (optional)
  • pyDOE (optional)
  • sobol_seq (optional)

You can install dependencies by running:

pip install -r requirements.txt

Funding Acknowledgements

  • BBSRC Project No BB/K011197/1 "Linking recombinant gene sequence to protein product manufacturability using CHO cell genomic resources"

  • See GPy funding Acknowledgements

Comments
  • pre-computed search space

    pre-computed search space

    I forked this project and did some modifications to pre-computed search space for the case of discrete variables with constraints. What is the process to get green light for this changes to be merged to master (i.e. pull request approval process)?

    opened by pavel-rev 16
  • different results vs. runs

    different results vs. runs

    I run optimization and get different results. Sometimes it hits the optimum (I have independent "slow" exhaustive search to know the optimum). Sometimes it does not hit it. Are there known rules of thumb wrt parameters to get more consistent results?

    opened by pavel-rev 13
  • No module named task.cost

    No module named task.cost

    Hi everyone! I still have the problem to use the new version of GPyOpt with the following error:

    import GPyOpt Traceback (most recent call last): File "", line 1, in File "build/bdist.macosx-10.5-x86_64/egg/GPyOpt/init.py", line 4, in

    File "build/bdist.macosx-10.5-x86_64/egg/GPyOpt/core/init.py", line 4, in

    File "build/bdist.macosx-10.5-x86_64/egg/GPyOpt/core/bo.py", line 7, in ImportError: No module named task.cost

    Any idea to resolve this problem? THANKS

    opened by calm85 13
  • ImportError: No module named 'core'

    ImportError: No module named 'core'

    I tried to run the first example from the manual: http://nbviewer.ipython.org/github/SheffieldML/GPyOpt/blob/master/manual/GPyOpt_reference_manual.ipynb

    The line:

    import GPyOpt
    

    fails with the following error:

    ImportError: No module named 'core'
    

    Happens on Python 3.4 / Windows 8 x64.

    I think the problem is Python 3.x, but I cannot use another version because of other packages... any way to fix this?

    opened by stmax82 12
  • pip install issue

    pip install issue

    ± pip install gpy gpyopt --user --upgrade
    Collecting gpy
      Downloading GPy-1.8.5.tar.gz (856kB)
        100% |████████████████████████████████| 860kB 1.3MB/s 
    Collecting gpyopt
      Using cached GPyOpt-1.2.1.tar.gz
        Complete output from command python setup.py egg_info:
        Traceback (most recent call last):
          File "<string>", line 1, in <module>
          File "/tmp/ahundt/pip-build-fcsazi50/gpyopt/setup.py", line 6, in <module>
            from GPyOpt.__version__ import __version__
          File "/tmp/ahundt/pip-build-fcsazi50/gpyopt/GPyOpt/__init__.py", line 7, in <module>
            from GPyOpt.core.task.space import Design_space
          File "/tmp/ahundt/pip-build-fcsazi50/gpyopt/GPyOpt/core/__init__.py", line 4, in <module>
            from .bo import BO
          File "/tmp/ahundt/pip-build-fcsazi50/gpyopt/GPyOpt/core/bo.py", line 9, in <module>
            from ..util.duplicate_manager import DuplicateManager
          File "/tmp/ahundt/pip-build-fcsazi50/gpyopt/GPyOpt/util/duplicate_manager.py", line 5, in <module>
            from ..core.task.space import Design_space
          File "/tmp/ahundt/pip-build-fcsazi50/gpyopt/GPyOpt/core/task/__init__.py", line 4, in <module>
            from .objective import SingleObjective
          File "/tmp/ahundt/pip-build-fcsazi50/gpyopt/GPyOpt/core/task/objective.py", line 8, in <module>
            import GPy
        ImportError: No module named 'GPy'
        
        ----------------------------------------
    Command "python setup.py egg_info" failed with error code 1 in /tmp/ahundt/pip-build-fcsazi50/gpyopt/
    
    
    opened by ahundt 10
  • Optimization chooses and gets stuck at an infeasible point

    Optimization chooses and gets stuck at an infeasible point

    Hi GPyOpt developers. I have an issue where GPyOpt chooses an infeasible next point when the number of variables in my problem exceeds 8 and then immediately converges to a suboptimal infeasible point (with respect to the inequality constraints, the chosen point is still within the domain). What I am doing is optimizing multiple l2-norm regularization parameters, an n dimensional vector denoted x, where the unknown function f(x) is the solution to a regularized logistic regression with respect to some weights, w, under the inequality constraints x_1 <= x_2 <= x_3 <= ... <= x_n. The purpose of these constraints is to reduce the computation time since any permutation of the elements of x will result in the same value of f(x). Here is an example of how I have the bounds and constraints set up for an n = 5 problem before being passed into BayesianOptimization():

    domain = [{'domain': (0.0, 1.0), 'type': 'continuous', 'name': x_1_b'},
                      {'domain': (0.0, 1.0), 'type': 'continuous', 'name': 'x_2_b'},
                      {'domain': (0.0, 1.0), 'type': 'continuous', 'name': 'x_3_b'},
                      {'domain': (0.0, 1.0), 'type': 'continuous', 'name': 'x_4_b'},
                      {'domain': (0.0, 1.0), 'type': 'continuous', 'name': 'x_5_b'}]
    
    constraints = [{'constrain': 'x[:,0] - x[:,1]', 'name': 'x_1_2_c'},
                           {'constrain': 'x[:,1] - x[:,2]', 'name': 'x_2_3_c'},
                           {'constrain': 'x[:,2] - x[:,3]', 'name': 'x_3_4_c'},
                           {'constrain': 'x[:,3] - x[:,4]', 'name': 'x_4_5_c'}]
    

    Other information: I am using the matern32 kernel (this phenomenon occurs with matern52 as well) with the maximum likelihood update to the Gaussian process hyperparameters and the lower confidence bound acquisition function (default values except jitter is 1E-4 to protect against singularity of the kernel). The optimization has worked fine with these constraints when n < 9 and also worked for n > 8 when the constraints were removed.

    opened by jkaardal 10
  • plot_acquisition() plots the wrong y-values

    plot_acquisition() plots the wrong y-values

    Hi,

    I have a fairly complex function that should always return a number >=0. Print statements in this function indicate this is satisfied. Running opt.fx_opt gives a value close to 0, which should be the true minimum. However, opt.plot_acquisition shows negative numbers at the minimum. This shouldn't be an artifact of the underlying gaussian process model, as even the points in red where GPyOpt evaluates the function is negative. Why is this the case?

    acquisition

    opened by tawe141 8
  • Using physical data not a

    Using physical data not a "known" function

    Hello, I'm new to python and gaussian processes, so I need a little help/insight. I am trying to use GPyOpt with physical x, y data and not a "known" function. The examples all use a known function. I am reading in a .csv file and I want to use bayesian optimization with the test data to let me know what experiment to run next (obtain next x,y data set). Please any advice and help is appreciated. Thank you.

    opened by will-colea 8
  • Fix bad comparison against None using ==

    Fix bad comparison against None using ==

    While running GPyOpt on python3 I started to see the following failure:

    Traceback (most recent call last):
      File "/Users/mattlavin/Projects/GPyOpt/GPyOpt/testing/test_parallelization.py", line 102, in test_run
        unittest_result = run_eval(problem_config= self.problem_config, f_inits= self.f_inits, method_config=m_c, name=name, outpath=self.outpath, time_limit=None, unittest = self.is_unittest)
      File "/Users/mattlavin/Projects/GPyOpt/GPyOpt/testing/driver.py", line 47, in run_eval
        verbosity       = m_c['verbosity'])
      File "/Users/mattlavin/Projects/GPyOpt/GPyOpt/methods/bayesian_optimization.py", line 458, in run_optimization
        super(BayesianOptimization, self).run_optimization(max_iter = max_iter, max_time = max_time,  eps = eps, verbosity=verbosity, save_models_parameters = save_models_parameters, report_file = report_file, evaluations_file= evaluations_file, models_file=models_file)
      File "/Users/mattlavin/Projects/GPyOpt/GPyOpt/core/bo.py", line 103, in run_optimization
        self._update_model()
      File "/Users/mattlavin/Projects/GPyOpt/GPyOpt/core/bo.py", line 198, in _update_model
        self._save_model_parameter_values()
      File "/Users/mattlavin/Projects/GPyOpt/GPyOpt/core/bo.py", line 209, in _save_model_parameter_values
        if self.model_parameters_iterations == None:
    ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
    

    It turns out that an array was being compared against None with == instead of is switching to is avoided the ambiguous type coercion.

    opened by mdlavin 8
  • Matern kernel

    Matern kernel

    I understand that "kernel" will become deprecated in the new version. How can one specify the Gpy Matern ARD 5/2 kernel with the newest API?

    GPyOpt.methods.bayesian_optimization.BayesianOptimization(f, domain=None, constrains=None, cost_withGradients=None, model_type='GP', X=None, Y=None, initial_design_numdata=None, initial_design_type='random', acquisition_type='EI', normalize_Y=True, exact_feval=False, acquisition_optimizer_type='lbfgs', model_update_interval=1, evaluator_type='sequential', batch_size=1, num_cores=1, verbosity=True, verbosity_model=False, bounds=None, **kwargs)

    opened by ghost 8
  • Bayesian updating of hyperparameters

    Bayesian updating of hyperparameters

    Dear Developers, We were able to customise our kernel and BO.model, and used it with run_optimisation. The BO procedure works well, but we noticed that if we update our model with each acquisition, this also updates the hyperparameters of the GP process (maximum likelihood) and often results in overfitting. Increasing model update interval was not a good idea either. We cannot find how to a) impose upper/lower numerical bounds on the hyperparameter search and b) refit the model with each acquisitions but update the model hyperparameters more slowly. Ideally, we would use Bayesian updating of hyperparameters (with priors on the hyperparameters), is this available? This is currently the biggest issue for us. Many thanks for your assistance, Milica.

    question 
    opened by milicasan 8
  • New Acquisition function

    New Acquisition function

    I am trying to write a new acquisition function using the GpyOpt package as provided in https://nbviewer.org/github/SheffieldML/GPyOpt/blob/master/manual/GPyOpt_creating_new_aquisitions.ipynb. After completing the writing of my acquisition function, I am unable to run Bayesian optimization loop using my new acquisition function. Please give me some suggestions regarding how to run my new acquisition function.

    opened by craju06 0
  • How to check lengthscales when ARD=True.

    How to check lengthscales when ARD=True.

    I do Bayesian optimisation in two dimensions. When 'ARD' is set to 'True', printing kernel outputs '(2, )'. I would like to check the respective lengthscale in 2D, is there a way to do this?

    opened by ktakihara2000 2
  • Optimization terminates earlier than max_iter

    Optimization terminates earlier than max_iter

    I often see that optimization terminates earlier than max_iter. I set the option of 'de_duplication' to "False" and eps to zero (or a negative value), so I do not think the consecutive occurrence of the same values of x is the reason.

    I would appreciate it if someone could tell me what is causing this problem and how to solve it.

    opened by takagi-ya 1
  • constrained bayesian optimization

    constrained bayesian optimization

    Hi,

    Is it possible to model constrained via Gaussian process? (especially implicit constrained, like max tracking error, which is not the constrained of the hyperparameter) Then output the best parameter based on cost and constraint?

    Thanks, and have a nice day!

    opened by merdan-9 0
  • Cannot use the batch_size>1 with local penalization evaluator

    Cannot use the batch_size>1 with local penalization evaluator

    Hey all,

    According to the issues I read in this repository, the batch_size>1 only works with the local_penalization evaluator, and not with sequential or any other evaluator model.

    I have set up a Bayesian Optimization model where I use a GP surrogate model m trained on my dataset as my objective function, as follows:

    def obj_func(X):
        out,_ = m.predict(X)
        return(out)
    
    bo_step = GPyOpt.methods.BayesianOptimization(f = obj_func, domain = bounds,
                                                        model_type='GP',normalize_Y = False,
                                                        evaluator_type = 'local_penalization',
                                                        acquisition_type='EI',batch_size=5, maximize=True, eps=1e-8)
    

    when I run the optimization, I only get a single prediction instead of batch_size=5:

    bo_step.run_optimization(max_iter=5)

    If I use the external object evaluation example as a starting point and use bo_step.suggest_next_locations() functions, I get 5 suggestions, but it seems that it does not really maximize my objective function (below). However I am not sure if I can/should use this object since I already have a surrogate model function fit into my dataset.

    x_next = bo_step.suggest_next_locations()

    Any help or suggestion on this is highly appreciated.

    Best,

    Bulut

    opened by blttkgl 1
  • Can the optimizer stop when best_y is smaller than a specific value?

    Can the optimizer stop when best_y is smaller than a specific value?

    Hi, thanks for your contributions.

    I notice that the optimizer will finish when the iteration number is over than 'max_iter', or the time is over 'max_time'. But I wonder that can I set a specific number, if the best_y in the optimization is smaller than this value, it means that a optimal result has been found, then BO can stop searching.

    Hoping for your reply, thanks!

    opened by Seal-o-O 0
Releases(v1.2.6)
  • v1.2.6(Mar 19, 2020)

    • Small fix in description.
    • coloring acquisition plot by the step of the objective evaluation.
    • Issue #94: Fix constraint violation in anchor generation.
    • Added the ability to set the mean function.
    • Added x and y labels for plotted graphs.
    • Correct typo in RFModel docstring.
    • Pull request: fix branch for Issue-244.
    • Implementing Probability of Feasibility (PoF) constraint handling.
    • Fixed obvious errors in CMA and DIRECT optimizers, added unit tests.
    • Caught mismatch between code and spec: constraints.
    • Fix a broken link in the web page
    • Fix a broken link in the footer.
    • Allow users to choose lhs sampling criteria
    • correct square root in lower confidence bound acquisition
    • Remove unused and add required imports
    • Typo in jupyter command
    • Avoid cost function ignored warning with LCB acquisition
    Source code(tar.gz)
    Source code(zip)
Owner
Sheffield Machine Learning Software
Software from the Sheffield machine learning group and collaborators.
Sheffield Machine Learning Software
LiuAlgoTrader is a scalable, multi-process ML-ready framework for effective algorithmic trading

LiuAlgoTrader is a scalable, multi-process ML-ready framework for effective algorithmic trading. The framework simplify development, testing, deployment, analysis and training algo trading strategies

Amichay Oren 458 Dec 24, 2022
Automated Time Series Forecasting

AutoTS AutoTS is a time series package for Python designed for rapidly deploying high-accuracy forecasts at scale. There are dozens of forecasting mod

Colin Catlin 652 Jan 03, 2023
Python factor analysis library (PCA, CA, MCA, MFA, FAMD)

Prince is a library for doing factor analysis. This includes a variety of methods including principal component analysis (PCA) and correspondence anal

Max Halford 915 Dec 31, 2022
MIT-Machine Learning with Python–From Linear Models to Deep Learning

MIT-Machine Learning with Python–From Linear Models to Deep Learning | One of the 5 courses in MIT MicroMasters in Statistics & Data Science Welcome t

2 Aug 23, 2022
List of Data Science Cheatsheets to rule the world

Data Science Cheatsheets List of Data Science Cheatsheets to rule the world. Table of Contents Business Science Business Science Problem Framework Dat

Favio André Vázquez 11.7k Dec 30, 2022
Implementation of deep learning models for time series in PyTorch.

List of Implementations: Currently, the reimplementation of the DeepAR paper(DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks

Yunkai Zhang 275 Dec 28, 2022
Tangram makes it easy for programmers to train, deploy, and monitor machine learning models.

Tangram Website | Discord Tangram makes it easy for programmers to train, deploy, and monitor machine learning models. Run tangram train to train a mo

Tangram 1.4k Jan 05, 2023
WAGMA-SGD is a decentralized asynchronous SGD for distributed deep learning training based on model averaging.

WAGMA-SGD is a decentralized asynchronous SGD based on wait-avoiding group model averaging. The synchronization is relaxed by making the collectives externally-triggerable, namely, a collective can b

Shigang Li 6 Jun 18, 2022
Automatically build ARIMA, SARIMAX, VAR, FB Prophet and XGBoost Models on Time Series data sets with a Single Line of Code. Now updated with Dask to handle millions of rows.

Auto_TS: Auto_TimeSeries Automatically build multiple Time Series models using a Single Line of Code. Now updated with Dask. Auto_timeseries is a comp

AutoViz and Auto_ViML 519 Jan 03, 2023
XManager: A framework for managing machine learning experiments 🧑‍🔬

XManager is a platform for packaging, running and keeping track of machine learning experiments. It currently enables one to launch experiments locally or on Google Cloud Platform (GCP). Interaction

DeepMind 620 Dec 27, 2022
Machine Learning for Time-Series with Python.Published by Packt

Machine-Learning-for-Time-Series-with-Python Become proficient in deriving insights from time-series data and analyzing a model’s performance Links Am

Packt 124 Dec 28, 2022
A handy tool for common machine learning models' hyper-parameter tuning.

Common machine learning models' hyperparameter tuning This repo is for a collection of hyper-parameter tuning for "common" machine learning models, in

Kevin Hu 2 Jan 27, 2022
AutoX是一个高效的自动化机器学习工具,它主要针对于表格类型的数据挖掘竞赛。 它的特点包括: 效果出色、简单易用、通用、自动化、灵活。

English | 简体中文 AutoX是什么? AutoX一个高效的自动化机器学习工具,它主要针对于表格类型的数据挖掘竞赛。 它的特点包括: 效果出色: AutoX在多个kaggle数据集上,效果显著优于其他解决方案(见效果对比)。 简单易用: AutoX的接口和sklearn类似,方便上手使用。

4Paradigm 431 Dec 28, 2022
虚拟货币(BTC、ETH)炒币量化系统项目。在一版本的基础上加入了趋势判断

🎉 第二版本 🎉 (现货趋势网格) 介绍 在第一版本的基础上 趋势判断,不在固定点位开单,选择更优的开仓点位 优势: 🎉 简单易上手 安全(不用将api_secret告诉他人) 如何启动 修改app目录下的authorization文件

幸福村的码农 250 Jan 07, 2023
Machine learning that just works, for effortless production applications

Machine learning that just works, for effortless production applications

Elisha Yadgaran 16 Sep 02, 2022
Learn Machine Learning Algorithms by doing projects in Python and R Programming Language

Learn Machine Learning Algorithms by doing projects in Python and R Programming Language. This repo covers all aspect of Machine Learning Algorithms.

Ravi Chaubey 6 Oct 20, 2022
Dual Adaptive Sampling for Machine Learning Interatomic potential.

DAS Dual Adaptive Sampling for Machine Learning Interatomic potential. How to cite If you use this code in your research, please cite this using: Hong

6 Jul 06, 2022
A Python library for detecting patterns and anomalies in massive datasets using the Matrix Profile

matrixprofile-ts matrixprofile-ts is a Python 2 and 3 library for evaluating time series data using the Matrix Profile algorithms developed by the Keo

Target 696 Dec 26, 2022
A Python implementation of GRAIL, a generic framework to learn compact time series representations.

GRAIL A Python implementation of GRAIL, a generic framework to learn compact time series representations. Requirements Python 3.6+ numpy scipy tslearn

3 Nov 24, 2021
A single Python file with some tools for visualizing machine learning in the terminal.

Machine Learning Visualization Tools A single Python file with some tools for visualizing machine learning in the terminal. This demo is composed of t

Bram Wasti 35 Dec 29, 2022