Multi-Output Gaussian Process Toolkit


Paper - API Documentation - Tutorials & Examples

The Multi-Output Gaussian Process Toolkit is a Python toolkit for training and interpreting Gaussian process models with multiple data channels. It builds upon PyTorch to provide an easy way to train multi-output models effectively on CPUs and GPUs. The main authors are Taco de Wolff, Alejandro Cuevas, and Felipe Tobar as part of the Center for Mathematical Modelling at the University of Chile.

Installation

With Anaconda installed on your system, open a command prompt and create a virtual environment:

conda create -n myenv python=3.7
conda activate myenv

where myenv is the name of your environment and the Python version is 3.6 or above. Next, install the toolkit, which automatically installs the necessary dependencies such as PyTorch.

pip install mogptk

In order to upgrade to a new version of MOGPTK or any of its dependencies, use --upgrade as follows:

pip install --upgrade mogptk

For developers of the library, or for users who need the latest changes, we recommend cloning the master or develop branch of the git repository and running the following command inside the repository folder:

pip install --upgrade -e .

See Tutorials & Examples to get started.
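
As a quick check that the installation works end to end, here is a minimal sketch that fits a single-output spectral mixture (SM) model to a synthetic signal. The data, the choice of Q, and the training settings are illustrative only:

    import numpy as np
    import mogptk

    # synthetic signal: two sinusoids plus noise
    t = np.linspace(0.0, 20.0, 500)
    y = np.sin(2.0*np.pi*0.2*t) + 0.5*np.sin(2.0*np.pi*1.0*t) \
        + 0.1*np.random.normal(size=len(t))

    data = mogptk.Data(t, y)

    model = mogptk.SM(data, Q=2)          # single-output spectral mixture
    model.init_parameters(method='BNSE')  # initialize from a spectral estimate
    model.train(iters=500, verbose=True)
    model.predict()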

Introduction

This repository provides a toolkit to perform multi-output GP regression with kernels that are designed to utilize correlation information among channels in order to better model signals. The toolkit is mainly targeted at time series, and includes plotting functions for the case of a single input with multiple outputs (time series with several channels).

The main kernel is the Multi-Output Spectral Mixture (MOSM) kernel, which correlates every pair of data points (irrespective of their channel of origin) to model the signals. The kernel is specified in detail in the following publication: G. Parra, F. Tobar, Spectral Mixture Kernels for Multi-Output Gaussian Processes, Advances in Neural Information Processing Systems, 2017. Proceedings link: https://papers.nips.cc/paper/7245-spectral-mixture-kernels-for-multi-output-gaussian-processes

The kernel learns the cross-channel correlations of the data, so it is particularly well-suited for the task of signal reconstruction in the event of sporadic data loss. All other included kernels can be derived from the Multi Output Spectral Mixture kernel by restricting some parameters or applying some transformations.
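
As a sketch of this reconstruction use case, the snippet below trains a MOSM model on two correlated synthetic channels, removing the tail of one channel to simulate sensor failure so that it must be recovered from its correlation with the other channel. The signal shapes and settings are illustrative only:

    import numpy as np
    import mogptk

    x = np.linspace(0.0, 6.0, 100)
    f = np.sin(4.0*np.pi*x) + 0.1*np.random.normal(size=len(x))
    g = 0.5*f + 0.2*x + 0.05*np.random.normal(size=len(x))  # correlated with f

    dataset = mogptk.DataSet(
        mogptk.Data(x, f, name="f"),
        mogptk.Data(x, g, name="g"),
    )
    dataset[1].remove_range(start=4.0)  # simulate data loss on channel "g"

    model = mogptk.MOSM(dataset, Q=2)
    model.init_parameters(method='BNSE')
    model.train(iters=500)
    model.predict()  # the gap in "g" is reconstructed using channel "f"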

One of the main advantages of the present toolkit is its GPU support, which enables the user to train models through PyTorch, speeding up computations significantly. It also includes sparse-variational GP regression functionality to decrease computation time even further.

See MOGPTK: The Multi-Output Gaussian Process Toolkit for our publication in Neurocomputing.

Tutorials

00 - Quick Start: Short notebook showing the basic use of the toolkit.

01 - Data Loading: Functionality to load CSVs and DataFrames while using formatters for dates.

02 - Data Preparation: Handle data, remove observations to simulate sensor failure, and apply transformations to the data.

03 - Parameter Initialization: Initialize parameters using different methods, for single-output regression with the spectral mixture kernel and for the multi-output case with the MOSM kernel.

04 - Model Training: Training of models while keeping certain parameters fixed.

05 - Error Metrics: Obtain different metrics in order to compare models.

06 - Custom Kernels and Mean Functions: Use or create custom kernels, and train custom mean functions (see the sketch following this list).
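
As a reference for the custom mean function tutorial, the following sketch is adapted from the user discussion further below: it subclasses mogptk.gpr.Mean and holds trainable coefficients for a quadratic trend. The column convention (X[:, 0] is the channel index, X[:, 1] the input value) is as reported by users below; treat this as an illustrative sketch rather than canonical usage:

    import mogptk

    class QuadraticMean(mogptk.gpr.Mean):
        def __init__(self):
            super(QuadraticMean, self).__init__()
            # trainable coefficients of a quadratic trend
            self.coefficients = mogptk.gpr.Parameter([0.0, 0.0, 0.0])

        def __call__(self, X):
            # X[:, 0] is the channel index, X[:, 1] the input value
            coefs = self.coefficients()
            return coefs[0] + coefs[1]*X[:, 1] + coefs[2]*X[:, 1]**2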

Examples

Airline passengers: Regression using a single-output spectral mixture on the yearly number of passengers of an airline.

Seasonal CO2 of Mauna-Loa: Regression using a single-output spectral mixture on the CO2 concentration at Mauna-Loa over many years.

Currency Exchange: Model training, interpretation and comparison on a dataset of 11 currency exchange rates (against the dollar) from 2017 and 2018. These 11 channels are fitted with the MOSM, SM-LMC, CSM, and CONV kernels and their results are compared and interpreted.

Gold, Oil, NASDAQ, USD-index: Using the commodity prices for gold and oil, together with the NASDAQ index and the USD index against a basket of other currencies, we train multiple models to find correlations between these macroeconomic indicators.

Human Activity Recognition: Using the Inertial Measurement Unit (IMU) of an Apple iPhone 4, the accelerometer, gyroscope and magnetometer 3D data were recorded for different activities resulting in nine channels.

Bramblemet tidal waves: Tidal wave data set of four locations in the south of England. We model the tidal wave periods of approximately 12.5 hours using different multi-output Gaussian processes.

Documentation

See the API documentation for the full documentation of our toolkit, including usage and examples of functions and classes.

Authors

  • Taco de Wolff
  • Alejandro Cuevas
  • Felipe Tobar

Users

This is a list of users of this toolbox; feel free to add your project!

Contributing

We accept and encourage contributions to the toolkit in the form of pull requests (PRs), bug reports, and discussions (GitHub issues). It is advised to start an open discussion before proposing large PRs. For small PRs we suggest that they address only one issue or add one new feature. All PRs should keep documentation and notebooks up to date.

Citing

Please use our publication at arXiv to cite our toolkit: MOGPTK: The Multi-Output Gaussian Process Toolkit. We recommend the following BibTeX entry:

@article{mogptk,
    author = {T. {de Wolff} and A. {Cuevas} and F. {Tobar}},
    title = {{MOGPTK: The Multi-Output Gaussian Process Toolkit}},
    journal = "Neurocomputing",
    year = "2020",
    issn = "0925-2312",
    doi = "10.1016/j.neucom.2020.09.085",
    url = "https://github.com/GAMES-UChile/mogptk"
}

License

Released under the MIT license.

Comments
  • Question about Normalizing X dependent on #channels, and Related Issue with Mean Functions

    Hi,

    Sorry for the massive number of requests, but your package is exactly what I need for my current projects, so I'm just trying to make sure everything is compatible. I really appreciate the work you've put into this and the prompt responses.

    I tried to modify the code for a mean function in your example here to my problem, but the mean functions are not performing as expected.

    I tried to come up with a reproducible example which seems to show the issue, where normalizing is occurring and I'm not sure when, where, or why.

    Here is the example:

    import numpy as np
    import mogptk

    # generate data
    n_points = 100
    x = np.linspace(0.0, 6.0, n_points)
    
    f = np.sin(x*4.0*np.pi) + 2*x - 0.2*x**2 + 0.1*np.random.normal(size=len(x))
    g = np.sin(1/(x+0.1))*x + 0.2*x - 0.01*(x-3)**3 + 0.2*f + 0.025*np.random.normal(size=len(x))
    data = mogptk.DataSet(
        mogptk.Data(x, f, name="f"),
        mogptk.Data(x, g, name="g")
    )
    
    # declare model
    model = mogptk.MOSM(data, Q=2)
    

    Now, when I look at the X tensor associated with this model, the data is normalized:

    model.model.X
    tensor([[   0.0000,    0.0000],
            [   0.0000,   10.1010],
            [   0.0000,   20.2020],
            [   0.0000,   30.3030],
            [   0.0000,   40.4040],
            [   0.0000,   50.5051],
    ...
            [   0.0000,  959.5960],
            [   0.0000,  969.6970],
            [   0.0000,  979.7980],
            [   0.0000,  989.8990],
            [   0.0000, 1000.0000],
            [   1.0000,    0.0000],
            [   1.0000,   10.1010],
            [   1.0000,   20.2020],
            [   1.0000,   30.3030],
            [   1.0000,   40.4040],
            [   1.0000,   50.5051],
            [   1.0000,   60.6061],
            [   1.0000,   70.7071],
            [   1.0000,   80.8081],
            [   1.0000,   90.9091],
            [   1.0000,  101.0101],
            [   1.0000,  111.1111],
            [   1.0000,  121.2121],
            [   1.0000,  131.3131],
            [   1.0000,  141.4141],
    ...
            [   1.0000,  959.5960],
            [   1.0000,  969.6970],
            [   1.0000,  979.7980],
            [   1.0000,  989.8990],
            [   1.0000, 1000.0000]], device='cuda:0', dtype=torch.float64)
    

    where I believe the first column is the channel, and the second is the x variable, which was initialized with x = np.linspace(0.0, 6.0, n_points).

    In contrast, if the data only has one channel, the behavior is much different:

    import numpy as np
    import mogptk

    n_points = 100
    x = np.linspace(0.0, 6.0, n_points)
    f = np.sin(x*4.0*np.pi) + 2*x - 0.2*x**2 + 0.1*np.random.normal(size=len(x))
    
    data = mogptk.Data(x, f, name="f")
    
    kernel = mogptk.gpr.MaternKernel(input_dims=1)
    mo_kernel = mogptk.gpr.IndependentMultiOutputKernel(kernel)
    model = mogptk.Model(data, mo_kernel)
    

    Now,

    model.model.X
    tensor([[0.0000, 0.0000],
            [0.0000, 0.0606],
            [0.0000, 0.1212],
            [0.0000, 0.1818],
            [0.0000, 0.2424],
     ...
            [0.0000, 5.6970],
            [0.0000, 5.7576],
            [0.0000, 5.8182],
            [0.0000, 5.8788],
            [0.0000, 5.9394],
            [0.0000, 6.0000]], device='cuda:0', dtype=torch.float64)
    

    The data isn't normalized.

    This seems to pose a problem when declaring a mean function, since following the example, we are told to define something like

    
    class Mean(mogptk.gpr.Mean):
        def __init__(self):
            super(Mean, self).__init__()
            self.coefficients = mogptk.gpr.Parameter([0.0, 0.0, 0.0])
    
        def __call__(self, X):
            coefs = self.coefficients()
            return coefs[0] + coefs[1] * X[:, 1] + coefs[2] * X[:, 1] ** 2
    

    As a result, I get absurd predictions when there are multiple channels, due to the mean function not seeming to agree on what is normalized and what isn't. It works perfectly fine when there is one channel, which leads me to believe that the normalizing that occurs with multiple channels is at play.

    I suppose I don't actually have a concrete question, but I hope you can see the issue and fix it. Basically, normalizing seems to occur under the hood when there are multiple channels, but not when there are single channels, and it causes issues in prediction with multiple channels.

    For reference, I'm attaching the picture for the multiple-channel prediction.

    Also, this is occurring in my own project where I tried to implement a mean function.

    (attached image: predictions with multiple channels)

    Lastly, if there are multiple inputs, could you verify that for a user-defined mean function we should use X[:, 1], X[:, 2], ..., X[:, p] to refer to the respective columns of our input data? This seems to work (noting that X[:, 0] refers to the channel), but I'd like to make sure.

    Thanks so much!

    opened by jimmyrisk 9
  • Support for when dim(X)>1?

    Hi,

    I just found this package and it's amazingly clean which I really like.

    However, I'm not finding much on how to implement anything for when dim(X)>1.

    As an example, my data set is mortality related, with two inputs (age, calendar year). It's useful to be able to predict annually (one year ahead) so for example, given a data set for ages [50,85] and calendar years [1990,2018], I may want to leave out 2018 for prediction assessment.

    On a similar note, is there any 1d plotting functionality? For example, could I fix the year to be 2018, and put age (in [50, 85]) on the x axis? Or, for a fixed age, plot mortality rates with year on the x axis?

    If there are any examples or documentation on how to do this, please point me in that direction (I cannot find any right now).

    Best, Jimmy

    opened by jimmyrisk 9
  • multi-input and multi-output problem

    It seems that this package is geared toward one-dimensional input. How can I write code to solve a multi-input and multi-output problem? For example, the shape of the input is (32, 64) and the output is also (32, 64); after training, how do I predict on test data with shape (1, 64)? Would you be willing to share a demo that solves this?

    opened by Fangwq 6
  • "RuntimeError: CUDA out of memory" Occurring in MOGPTK, not in GPyTorch

    When I fit an LMC to my data set using GPyTorch, I have no issues, but when I try to fit the same model using MOGPTK, I get a memory error. I understand this is a different package than GPyTorch, but given the common use of Torch, I would expect it to work without any memory issues.

    Here is the error I am obtaining, after running model.train(...):

    Starting optimization using Adam
    ‣ Model: SM-LMC
    ‣ Channels: 10
    ‣ Mixtures: 6
    ‣ Training points: 20
    ‣ Parameters: 94
    Traceback (most recent call last):
      File "<input>", line 4, in <module>
      File "C:\Users\jrisk\anaconda3\envs\gp-mort\lib\site-packages\mogptk\model.py", line 224, in train
        print('‣ Initial loss: {:.3g}'.format(self.loss()))
      File "C:\Users\jrisk\anaconda3\envs\gp-mort\lib\site-packages\mogptk\model.py", line 144, in loss
        return self.model.loss().detach().cpu().item()
      File "C:\Users\jrisk\anaconda3\envs\gp-mort\lib\site-packages\mogptk\gpr\model.py", line 182, in loss
        loss = -self.log_marginal_likelihood() - self.log_prior()
      File "C:\Users\jrisk\anaconda3\envs\gp-mort\lib\site-packages\mogptk\gpr\model.py", line 219, in log_marginal_likelihood
        K = self.kernel(self.X) + self.noise()*torch.eye(self.X.shape[0], device=config.device, dtype=config.dtype)  # NxN
      File "C:\Users\jrisk\anaconda3\envs\gp-mort\lib\site-packages\mogptk\gpr\kernel.py", line 17, in __call__
        return self.K(X1,X2)
      File "C:\Users\jrisk\anaconda3\envs\gp-mort\lib\site-packages\mogptk\gpr\kernel.py", line 169, in K
        k = self.Ksub(i, j, x1[i], x1[j])
      File "C:\Users\jrisk\anaconda3\envs\gp-mort\lib\site-packages\mogptk\gpr\multioutput.py", line 127, in Ksub
        kernels = torch.stack([kernel(X1,X2) for kernel in self.kernels], dim=2)  # NxMxQ
      File "C:\Users\jrisk\anaconda3\envs\gp-mort\lib\site-packages\mogptk\gpr\multioutput.py", line 127, in <listcomp>
        kernels = torch.stack([kernel(X1,X2) for kernel in self.kernels], dim=2)  # NxMxQ
      File "C:\Users\jrisk\anaconda3\envs\gp-mort\lib\site-packages\mogptk\gpr\kernel.py", line 17, in __call__
        return self.K(X1,X2)
      File "C:\Users\jrisk\anaconda3\envs\gp-mort\lib\site-packages\mogptk\gpr\singleoutput.py", line 134, in K
        return self.weight() * torch.prod(exp * cos, dim=2)
    RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 11.00 GiB total capacity; 8.21 GiB already allocated; 18.70 MiB free; 8.70 GiB reserved in total by PyTorch)
    

    It's worth noting that this works without issue when there are 4 channels (instead of 10). I tried 6 channels but had the same memory issue. It also occurred when trying MOSM.

    Here is my full code. I can't quite provide a reproducible example yet. For reference, mort is a data frame with 1015 rows and 12 columns (two predictors, and ten responses (corresponding to the channels)).

    import numpy as np
    from matplotlib import pyplot as plt
    from HMD_data import getMortData_new
    import mogptk
    
    countries = ["'CAN'", "'NOR'" , "'AUS'", "'BLR'", "'USA'"]
    mort = getMortData_new(countries)
    
    x_names = list(mort.columns[0:2])
    y_names = list(mort.columns[2:])
    mortData = mogptk.LoadDataFrame(mort, x_col = x_names, y_col = y_names)
    mortData.transform(mogptk.TransformStandard())
    
    rem_start = [0, 2016]
    rem_end = [0, 2018]
    
    for i, channel in enumerate(mortData):
        channel.remove_range(rem_start, rem_end)
        #channel.transform(mogptk.TransformDetrend(degree=1))
        channel.transform(mogptk.TransformNormalize())
    
    modelMOSM = mogptk.MOSM(mortData, Q=6)
    modelMOSM.init_parameters()
    modelMOSM.train(iters=2000, verbose=True, plot=True, error='MAE')
    

    modelMOSM.init_parameters() seems to work fine.

    Side note: is training points=20 accurate?? MortData has 1015 rows...

    opened by jimmyrisk 6
  • BNSE initialization is fragile

    Here's a test code that I used to find one problem:

    #!/usr/bin/env python3
    
    import numpy as np
    import mogptk
    
    Q = 2
    num_channels = 3
    seed = 3428943
    num_inp_comps = 3
    
    min_channel_size = 10
    max_channel_size = 70
    
    print(f"""Q = {Q}
    num_channels = {num_channels}
    seed = {seed}
    num_inp_comps = {num_inp_comps}
    
    min_channel_size = {min_channel_size}
    max_channel_size = {max_channel_size}
    """)
    
    # Using old-style NumPy RNG usage, since mogptk uses it under the
    # hood.
    np.random.seed(seed)
    
    dataset = mogptk.DataSet()
    
    for channel_num in range(num_channels):
        channel_size = np.random.randint(min_channel_size,
                                         max_channel_size + 1)
    
        x = np.pi*np.random.random_sample((channel_size, num_inp_comps))
        
        y = (np.sin(x).prod(axis = 1) +
             np.random.standard_normal(channel_size))
        
        curr_data = mogptk.Data(x, y, name = str(channel_num))
    
        dataset.append(curr_data)
    
    model = mogptk.MOSM(dataset, Q = Q)
    model.init_parameters(method = "BNSE")
    

    The results of running this are as follows:

    Traceback (most recent call last):
      File "./test-bnse-w-multi-inp-dataset.py", line 43, in <module>
        model.init_parameters(method = "BNSE")
      File "/Users/jjramsey/venvs/gpflow2/lib/python3.7/site-packages/mogptk/mosm.py", line 107, in init_parameters
        self.set_parameter(q, 'magnitude', magnitude[:, q])
      File "/Users/jjramsey/venvs/gpflow2/lib/python3.7/site-packages/mogptk/model.py", line 308, in set_parameter
        kern[key].assign(val)
      File "/Users/jjramsey/venvs/gpflow2/lib/python3.7/site-packages/gpflow/base.py", line 228, in assign
        unconstrained_value = self.validate_unconstrained_value(value, self.dtype)
      File "/Users/jjramsey/venvs/gpflow2/lib/python3.7/site-packages/gpflow/base.py", line 199, in validate_unconstrained_value
        return tf.debugging.assert_all_finite(unconstrained_value, message=message)
      File "/Users/jjramsey/venvs/gpflow2/lib/python3.7/site-packages/tensorflow_core/python/ops/numerics.py", line 67, in verify_tensor_all_finite_v2
        verify_input = array_ops.check_numerics(x, message=message)
      File "/Users/jjramsey/venvs/gpflow2/lib/python3.7/site-packages/tensorflow_core/python/ops/gen_array_ops.py", line 902, in check_numerics
        _ops.raise_from_not_ok_status(e, name)
      File "/Users/jjramsey/venvs/gpflow2/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 6606, in raise_from_not_ok_status
        six.raise_from(core._status_to_exception(e.code, message), None)
      File "<string>", line 3, in raise_from
    tensorflow.python.framework.errors_impl.InvalidArgumentError: gpflow.Parameter: the value to be assigned is incompatible with this parameter's transform (the corresponding unconstrained value has NaN or Inf) and hence cannot be assigned. : Tensor had NaN values [Op:CheckNumerics]
    

    Unfortunately, reducing num_inp_comps to 1 doesn't fix this problem.

    I've also had different errors with another dataset, which look like this:

    Traceback (most recent call last):
      File "./mk_gp_mogptk.py", line 98, in <module>
        model.init_parameters(method = args.init)
      File "/Users/jjramsey/venvs/gpflow2/lib/python3.7/site-packages/mogptk/mosm.py", line 88, in init_parameters
        amplitudes, means, variances = self.dataset.get_bnse_estimation(self.Q)
      File "/Users/jjramsey/venvs/gpflow2/lib/python3.7/site-packages/mogptk/dataset.py", line 419, in get_bnse_estimation
        channel_amplitudes, channel_means, channel_variances = channel.get_bnse_estimation(Q, n)
      File "/Users/jjramsey/venvs/gpflow2/lib/python3.7/site-packages/mogptk/data.py", line 1052, in get_bnse_estimation
        amplitudes, positions, variances = bnse.get_freq_peaks()
      File "/Users/jjramsey/venvs/gpflow2/lib/python3.7/site-packages/mogptk/bnse.py", line 111, in get_freq_peaks
        dx = x[1]-x[0]
    IndexError: index 1 is out of bounds for axis 0 with size 1
    

    Unfortunately, I'm not at liberty to release that other dataset, and I haven't been able to find a test dataset that reproduces the above error.

    opened by jjramsey 6
  • Prediction with partial observations

    It's a great feature to allow missing values (e.g., remove_range) in the training. I'm wondering if it is supported in prediction/inference? For example, in prediction I may have certain channel values available but other channels to be predicted. Thanks.

    opened by zhh210 5
  • Error not displayed

    I have followed the exact steps as in the example for 'Error metrics'. When I type in the command for error, I get the following error:

    'list' object has no attribute 'shape'

    opened by pabhidnya15 4
  • get_parameters() not returning anything

    I'm trying to extract my parameters from a model, but the get_parameters() command doesn't seem to do anything. Here's an example using one of the given examples:

    import numpy as np
    import mogptk

    n_points = 500
    frequencies = [0.2, 1.0, 2.0]
    amplitudes = [1.0, 0.5, 0.5]
    
    t = np.linspace(0.0, 20.0, n_points)
    y = np.zeros(n_points)
    for i in range(3):
        y += amplitudes[i] * np.sin(2.0*np.pi * frequencies[i] * t)
    y += np.random.normal(scale=0.4, size=n_points)
    
    # data class
    data = mogptk.Data(t, y)
    data.remove_range(start=10.0)
    data.plot();
    
    model = mogptk.SM(data, Q=3)
    model.predict()
    
    # initialize params
    method = 'BNSE'
    # method = 'LS'
    # method = 'IPS'
    
    model.init_parameters(method=method)
    model.predict()
    model.train(iters=100, lr=0.5, verbose=True)
    model.predict()
    
    params = model.get_parameters()
    

    I can print the parameters fine, but I haven't found a way to extract the information. The get_parameters() function returns None, so there are no values to extract.

    opened by ff1201 4
  • Reproduce example from NIPS 2017 paper

    Hi, I was wondering if there is an example or tutorial about using mogptk to reproduce the synthetic example shown in Fig 2 of the Spectral Mixture Kernels for Multi-Output Gaussian Processes paper. This is the figure

    (figure: Fig. 2 from the NIPS 2017 paper)

    I'm almost sure all the relevant methods are already implemented in the MOSM model, just wondering if there is a script that could be adapted more easily to reproduce this analysis. In particular, I'm still looking for an easy way to plot the auto and cross covariances of the model.

    Thanks!

    opened by j-faria 4
  • Train mean function

    Can any of the models currently implemented have a user-defined mean function? If yes, can the mean function have trainable parameters (ideally per channel)?

    Thank you. This is a fantastic package, congratulations!

    opened by j-faria 4
  • Error with get_lombscargle_estimation

    Hi, when running the LS initialisation method, I get the following error:

    UnboundLocalError                         Traceback (most recent call last)
    ~\AppData\Local\Temp\ipykernel_15852\3290155114.py in <module>
          6 
          7 # initialize parameters of kernel using BNSE
    ----> 8 model.init_parameters(method='LS')
    
    c:\users\ff120\documents\mini_project_2\mogptk\mogptk\models\mosm.py in init_parameters(self, method, sm_init, sm_method, sm_iters, sm_params, sm_plot)
         85             amplitudes, means, variances = self.dataset.get_bnse_estimation(self.Q)
         86         elif method.lower() == 'ls':
    ---> 87             amplitudes, means, variances = self.dataset.get_lombscargle_estimation(self.Q)
         88         else:
         89             amplitudes, means, variances = self.dataset.get_sm_estimation(self.Q, method=sm_init, optimizer=sm_method, iters=sm_iters, params=sm_params, plot=sm_plot)
    
    c:\users\ff120\documents\mini_project_2\mogptk\mogptk\dataset.py in get_lombscargle_estimation(self, Q, n)
        587         variances = []
        588         for channel in self.channels:
    --> 589             channel_amplitudes, channel_means, channel_variances = channel.get_lombscargle_estimation(Q, n)
        590             amplitudes.append(channel_amplitudes)
        591             means.append(channel_means)
    
    c:\users\ff120\documents\mini_project_2\mogptk\mogptk\data.py in get_lombscargle_estimation(self, Q, n)
        918                 amplitudes = amplitudes[:Q]
        919                 positions = positions[:Q]
    --> 920                 stddevs = stddevs[:Q]
        921 
        922             n = len(amplitudes)
    
    UnboundLocalError: local variable 'stddevs' referenced before assignment
    

    Code used to obtain error:

    import numpy as np
    import mogptk

    n_points = 100
    t = np.linspace(0.0, 6.0, n_points)
    
    y1 = np.sin(6.0*t) + 0.2*np.random.normal(size=len(t))
    y2 = np.sin(6.0*t + 2.0) + 0.2*np.random.normal(size=len(t))
    y3 = np.sin(6.0*t) - np.sin(4.0*t) + 0.2*np.random.normal(size=len(t))
    y4 = 3.0*np.sin(6.0 * (t-2.0)) + 0.3*np.random.normal(size=len(t))
    
    dataset = mogptk.DataSet(
        mogptk.Data(t, y1, name='First channel'),
        mogptk.Data(t, y2, name='Second channel'),
        mogptk.Data(t, y3, name='Third channel'),
        mogptk.Data(t, y4, name='Fourth channel')
    )
    
    for data in dataset:
        data.remove_randomly(pct=0.4)
    
    dataset[0].remove_range(start=2.0)
    
    model = mogptk.MOSM(dataset, Q=2)
    
    model.init_parameters(method='LS')
    

    I haven't had this error before when using LS, but in the code it seems the variances and stddevs are mixed up.

    opened by ff1201 2
  • Plot PSD uncertainties of SM/MOSM models

    We're drawing the PSD using the trained model parameters, but is it possible to draw the "posterior" including the data with uncertainty bars much like BNSE computes for the SM and MOSM models?

    opened by tdewolff 0
  • Memory issues

    Check memory usage since it seems to use much more than expected. With N data points we'd expect a usage of about NxN for the kernel matrix (with large N).

    opened by tdewolff 0
  • MOHSM has Cholesky problems

    The following example sometimes fails, @maltamiranomontero can you take a look? There is probably a problem in the model or in the parameter bounds (although I don't see anything odd in the parameter values when it failed). This happens with two channels, and I think the covariance matrix is wrong (but I'm not sure): some values are larger than the diagonal, which makes the matrix not positive semi-definite.

    import numpy as np
    import pandas as pd
    import mogptk
    
    t = np.linspace(1.0, 200.0, 200)
    
    data = pd.read_csv("data_BM.csv")
    dataset = mogptk.DataSet(t, [np.array(data['a']), np.array(data['b']), np.array(data['c']), np.array(data['d'])], names=['Channel 1','Channel 2', 'Channel 3','Channel 4'])
    dataset.transform(mogptk.TransformNormalize())
    dataset.rescale_x()
    
    for data in dataset:
        data.remove_randomly(pct=0.80)
    dataset = dataset[0:2]
     
    model = mogptk.MOHSM(dataset, Q=1, P=1, rescale_x=True)
    model.init_parameters('BNSE')
    model.print_parameters()
    model.train(iters=1000, lr=0.1, verbose=False)
    model.predict()

    Attachment: [data_BM.csv](https://github.com/GAMES-UChile/mogptk/files/8525944/data_BM.csv)


    ERROR: torch.linalg_cholesky: The factorization could not be completed because the input is not positive-definite (the leading minor of order 41 is not positive-definite).

    opened by tdewolff 0
  • Confidence interval with non-Gaussian likelihoods

    The variance of non-Gaussian likelihoods is not useful when predicting y values since the likelihood may be non-symmetric. Instead we should return the confidence interval or certain quantiles. Even better, return an approximation by sampling with MCMC.

    See: https://arxiv.org/pdf/1206.6391.pdf ?

    See https://github.com/GPflow/GPflow/issues/1345

    idea 
    opened by tdewolff 0
Releases (v0.3.2)
  • v0.3.1 (Jul 17, 2022)

    • Fix conversions to/from GPU
    • Fix error on plot_losses()
    • Rename gpr.PhiKernel as gpr.FunctionKernel
    • Add kernel shortcuts such as mogptk.Kernels.SpectralMixture
    • Include end point when calling Data.remove_range()
    • Fix input dimensions for AddKernel and MulKernel
    • Add sigma and figsize arguments to Model.plot_prediction()
  • v0.3.0 (Jun 1, 2022)

    Features

    • Support for variational and sparse models
    • Support for multi output (heterogeneous) likelihoods, i.e. different likelihoods for each channel
    • New models: Snelson, OpperArchambeau, Titsias, Hensman
    • New kernels: Constant, White, Exponential, LocallyPeriodic, Cosine, Sinc
    • New likelihoods: StudentT, Exponential, Laplace, Bernoulli, Beta, Gamma, Poisson, Weibull, LogLogistic, LogGaussian, ChiSquared
    • New mean functions: Constant and Linear
    • Allow kernels to be added and multiplied (i.e. K1 + K2 or K1 * K2)
    • Data and DataSet now accept more data types as input, such as pandas series
    • Data, DataSet, and Model plot functionalities return the figure and axes to allow customization
    • Support sampling (prior or posterior) from the model
    • Add the MOHSM kernel: multi-output harmonic spectral mixture kernel (Altamirano 2021)
    • Parameters can be pegged to other parameters, essentially removing them from training
    • Exact model supports training with known data point variances and draws their error bars in plots

    Improvements

    • Jitter added to the diagonal before calculating the Cholesky is now relative to the average value of the diagonal; this improves numeric stability for all kernels irrespective of the actual numerical magnitude of the values
    • Kernels now implement K_diag that returns the kernel diagonal for better performance
    • BNSE initialization method has been reimplemented with improved performance and stability
    • Parameter initialization for all models from different initialization methods has been much improved
    • Inducing point initialization now supports random, grid, or density
    • New SpectralMixture (in addition to Spectral) and MultiOutputSpectralMixture (in addition to MultiOutputSpectral) kernels with higher performance
    • Allow mixing of single-output and multi-output kernels using active
    • All plotting functions have been restyled
    • Model training allows custom error function for calculation at each iteration
    • Support single and cross lengthscales for the SquaredExponential, RationalQuadratic, Periodic, LocallyPeriodic kernels
    • Add AIC and BIC methods to model
    • Add model.plot_correlation()

    Changes

    • Remove rescale_x
    • Parameter.trainable => Parameter.train
    • Kernels are by default initialized deterministically and not random, however the models (MOSM, MOHSM, CONV, CSM, SM-LMC, and SM) are still initialized randomly by default
    • Plotting predictions happens from the model, not the data: model.plot_prediction() instead of model.predict(); data.plot()
  • v0.2.5 (Sep 8, 2021)

  • v0.2.4 (Jul 26, 2021)

    • Set maximum frequency to Nyquist in MOSM, CSM, SM-LMC, and SM; fixes #21
    • Improve CholeskyException messaging
    • Update the GONU example
    • Fix Sigmoid.backward, fixes #25
    • Add support for multiple input dimensions for remove_range, fixes #24
    • Fix SM model initialization for IPS
    • Data now permits different dtypes per input dimension for X, LoadFunction now works for multi input dimensions, upgrading time delta for datetime64 now fixed
    • Change X from (n,input_dims) to [(n,)] * input_dims
    • Add dim to functions to specify input dimension
    • Fix example 06
    • Fix old import path, fixes #27
    • Reuse torch.eye in log_marginal_likelihood
    • Make rescale_x optional for models, see #28; return losses and errors from train()
  • v0.2.3 (Dec 18, 2020)

    • Add the MSE and sMAPE error measures
    • Fix returning tensors from the GPU back to the CPU
    • Fix repeated use of a dataset by properly deepcopying it
    • Add console output for training
    • Fix the LBFGS optimizer
    • Add the plot_losses function to the Model class to plot losses/errors after train separately
  • v0.2.2 (Dec 10, 2020)

  • v0.2.0 (Nov 23, 2020)

Owner

GAMES: Grupo de Aprendizaje de Máquinas, infErencia y Señales, Universidad de Chile