Python wrapper of LSODA (solving ODEs) which can be called from within numba functions.

Overview

numbalsoda

numbalsoda is a python wrapper to the LSODA method in ODEPACK, which is for solving ordinary differential equation initial value problems. LSODA was originally written in Fortran. numbalsoda is a wrapper to a C++ re-write of the original code: https://github.com/dilawar/libsoda

numbalsoda also wraps the dop853 explicit Runge-Kutta method from this repository: https://github.com/jacobwilliams/dop853

This package is very similar to scipy.integrate.solve_ivp (see here), when you set method = 'LSODA' or method = DOP853. But, scipy.integrate.solve_ivp invokes the python interpreter every time step which can be slow. Also, scipy.integrate.solve_ivp can not be used within numba jit-compiled python functions. In contrast, numbalsoda never invokes the python interpreter during integration and can be used within a numba compiled function which makes numbalsoda a lot faster than scipy for most problems, and achieves similar performance to Julia's DifferentialEquations.jl in some cases (see benchmark folder).

Installation

Conda:

conda install -c conda-forge numbalsoda

Pip:

python -m pip install numbalsoda

Basic usage

from numbalsoda import lsoda_sig, lsoda, dop853
from numba import njit, cfunc
import numpy as np

@cfunc(lsoda_sig)
def rhs(t, u, du, p):
    du[0] = u[0]-u[0]*u[1]
    du[1] = u[0]*u[1]-u[1]*p[0]

funcptr = rhs.address # address to ODE function
u0 = np.array([5.,0.8]) # Initial conditions
data = np.array([1.0]) # data you want to pass to rhs (data == p in the rhs).
t_eval = np.linspace(0.0,50.0,1000) # times to evaluate solution

# integrate with lsoda method
usol, success = lsoda(funcptr, u0, t_eval, data = data)

# integrate with dop853 method
usol1, success1 = dop853(funcptr, u0, t_eval, data = data)

# usol = solution
# success = True/False

The variables u, du and p in the rhs function are pointers to an array of floats. Therefore, operations like np.sum(u) or len(u) will not work. However, you can use the function nb.carray() to make a numpy array out of the pointers. For example:

import numba as nb

@cfunc(lsoda_sig)
def rhs(t, u, du, p):
    u_ = nb.carray(u, (2,))
    p_ = nb.carray(p, (1,))
    # ... rest of rhs goes here using u_ and p_

Above, u_ and p_ are numpy arrays build out of u and p, and so functions like np.sum(u_) will work.

Also, note lsoda can be called within a jit-compiled numba function (see below). This makes it much faster than scipy if a program involves many integrations in a row.

@njit
def test():
    usol, success = lsoda(funcptr, u0, t_eval, data = data)
    return usol
usol = test() # this works!

@njit
def test_sp():
    sol = solve_ivp(f_scipy, t_span, u0, t_eval = t_eval, method='LSODA')
    return sol
sol = test_sp() # this does not work :(

Passing data to the right-hand-side function

In the examples shown above, we passed a an single array of floats to the right-hand-side function:

# ...
data = np.array([1.0])
usol, success = lsoda(funcptr, u0, t_eval, data = data)

However, sometimes you might want to pass more data types than just floats. For example, you might want to pass several integers, an array of floats, and an array of integers. One way to achieve this is with generating the cfunc using a function like this:

def make_lsoda_func(param1, param2, param3):
    @cfunc(lsoda_sig)
    def rhs(t, x, du, p):
        # Here param1, param2, and param3
        # can be accessed.
        du[0] = param1*t
        # etc...
    return rhs
    
rhs = make_lsoda_func(10.0, 5, 10000)
funcptr = rhs.address
# etc...

The only drawback of this approach is if you want to do many successive integrations where the parameters change because it would required re-compiling the cfunc between each integration. This could be slow.

But! It is possible to pass arbitrary parameters without re-compiling the cfunc, but it is a little tricky. The notebook passing_data_to_rhs_function.ipynb gives an example that explains how.

Comments
  • Installation with pip fails

    Installation with pip fails

    Good evening,

    I am working on a project involving the long-time integration of an ODE, and the results you show definitely look very promising, but I cannot install the package ...

    I am working on mac os X v12.4 with a M1 chip. Python v3.10.8, pip v22.3.1, setuptools v65.5.1.

    To really test only the install of numbalsoda and remove all possible side effects from pre-installed packages, I put myself in a new virtualenv. πŸ”΄ I am getting the following error when doing pip install numbalsoda:

    pip install numbalsoda
    Collecting numbalsoda
      Downloading numbalsoda-0.3.4.tar.gz (241 kB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 241.3/241.3 kB 2.8 MB/s eta 0:00:00
      Installing build dependencies ... done
      Getting requirements to build wheel ... done
      Preparing metadata (pyproject.toml) ... done
    Collecting numpy
      Downloading numpy-1.23.5-cp310-cp310-macosx_11_0_arm64.whl (13.4 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.4/13.4 MB 9.5 MB/s eta 0:00:00
    Collecting numba
      Downloading numba-0.56.4-cp310-cp310-macosx_11_0_arm64.whl (2.4 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.4/2.4 MB 9.4 MB/s eta 0:00:00
    Requirement already satisfied: setuptools in ./py3env/lib/python3.10/site-packages (from numba->numbalsoda) (65.5.1)
    Collecting llvmlite<0.40,>=0.39.0dev0
      Downloading llvmlite-0.39.1-cp310-cp310-macosx_11_0_arm64.whl (23.1 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 23.1/23.1 MB 10.3 MB/s eta 0:00:00
    Building wheels for collected packages: numbalsoda
      Building wheel for numbalsoda (pyproject.toml) ... error
      error: subprocess-exited-with-error
      
      Γ— Building wheel for numbalsoda (pyproject.toml) did not run successfully.
      β”‚ exit code: 1
      ╰─> [73 lines of output]
          /private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-build-env-tb3aezox/overlay/lib/python3.10/site-packages/setuptools/dist.py:770: UserWarning: Usage of dash-separated 'description-file' will not be supported in future versions. Please use the underscore name 'description_file' instead
            warnings.warn(
          /private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-build-env-tb3aezox/overlay/lib/python3.10/site-packages/setuptools/config/setupcfg.py:508: SetuptoolsDeprecationWarning: The license_file parameter is deprecated, use license_files instead.
            warnings.warn(msg, warning_class)
          
          
          --------------------------------------------------------------------------------
          -- Trying "Ninja" generator
          --------------------------------
          ---------------------------
          ----------------------
          -----------------
          ------------
          -------
          --
          Not searching for unused variables given on the command line.
          -- The C compiler identification is AppleClang 13.1.6.13160021
          -- Detecting C compiler ABI info
          -- Detecting C compiler ABI info - done
          -- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang - skipped
          -- Detecting C compile features
          -- Detecting C compile features - done
          -- The CXX compiler identification is AppleClang 13.1.6.13160021
          -- Detecting CXX compiler ABI info
          -- Detecting CXX compiler ABI info - done
          -- Check for working CXX compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang++ - skipped
          -- Detecting CXX compile features
          -- Detecting CXX compile features - done
          -- Configuring done
          -- Generating done
          -- Build files have been written to: /private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-install-s6p_6_f6/numbalsoda_aa6a60982d73407590bdf13f5c02eb57/_cmake_test_compile/build
          --
          -------
          ------------
          -----------------
          ----------------------
          ---------------------------
          --------------------------------
          -- Trying "Ninja" generator - success
          --------------------------------------------------------------------------------
          
          Configuring Project
            Working directory:
              /private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-install-s6p_6_f6/numbalsoda_aa6a60982d73407590bdf13f5c02eb57/_skbuild/macosx-12.0-arm64-3.10/cmake-build
            Command:
              cmake /private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-install-s6p_6_f6/numbalsoda_aa6a60982d73407590bdf13f5c02eb57 -G Ninja -DCMAKE_INSTALL_PREFIX:PATH=/private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-install-s6p_6_f6/numbalsoda_aa6a60982d73407590bdf13f5c02eb57/_skbuild/macosx-12.0-arm64-3.10/cmake-install -DPYTHON_VERSION_STRING:STRING=3.10.8 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-build-env-tb3aezox/overlay/lib/python3.10/site-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/Users/tbridel/Documents/1-CODES/py3env/bin/python -DPYTHON_INCLUDE_DIR:PATH=/opt/homebrew/opt/[email protected]/Frameworks/Python.framework/Versions/3.10/include/python3.10 -DPYTHON_LIBRARY:PATH=/opt/homebrew/opt/[email protected]/Frameworks/Python.framework/Versions/3.10/lib/libpython3.10.dylib -DPython_EXECUTABLE:PATH=/Users/tbridel/Documents/1-CODES/py3env/bin/python -DPython_ROOT_DIR:PATH=/Users/tbridel/Documents/1-CODES/py3env -DPython_INCLUDE_DIR:PATH=/opt/homebrew/opt/[email protected]/Frameworks/Python.framework/Versions/3.10/include/python3.10 -DPython_FIND_REGISTRY:STRING=NEVER -DPython3_EXECUTABLE:PATH=/Users/tbridel/Documents/1-CODES/py3env/bin/python -DPython3_ROOT_DIR:PATH=/Users/tbridel/Documents/1-CODES/py3env -DPython3_INCLUDE_DIR:PATH=/opt/homebrew/opt/[email protected]/Frameworks/Python.framework/Versions/3.10/include/python3.10 -DPython3_FIND_REGISTRY:STRING=NEVER -DCMAKE_MAKE_PROGRAM:FILEPATH=/private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-build-env-tb3aezox/overlay/lib/python3.10/site-packages/ninja/data/bin/ninja -DSKBUILD=ON -DCMAKE_BUILD_TYPE:STRING=Release -DCMAKE_OSX_DEPLOYMENT_TARGET:STRING=12.0 -DCMAKE_OSX_ARCHITECTURES:STRING=arm64
          
          CMake Error at /private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-build-env-tb3aezox/overlay/lib/python3.10/site-packages/cmake/data/share/cmake-3.25/Modules/CMakeDetermineFortranCompiler.cmake:33 (message):
            Could not find compiler set in environment variable FC:
          
            /usr/local/bin/gfortran.
          Call Stack (most recent call first):
            CMakeLists.txt:3 (project)
          
          
          CMake Error: CMAKE_Fortran_COMPILER not set, after EnableLanguage
          CMake Error: CMAKE_CXX_COMPILER not set, after EnableLanguage
          -- Configuring incomplete, errors occurred!
          See also "/private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-install-s6p_6_f6/numbalsoda_aa6a60982d73407590bdf13f5c02eb57/_skbuild/macosx-12.0-arm64-3.10/cmake-build/CMakeFiles/CMakeOutput.log".
          Traceback (most recent call last):
            File "/private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-build-env-tb3aezox/overlay/lib/python3.10/site-packages/skbuild/setuptools_wrap.py", line 632, in setup
              env = cmkr.configure(
            File "/private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-build-env-tb3aezox/overlay/lib/python3.10/site-packages/skbuild/cmaker.py", line 334, in configure
              raise SKBuildError(
          
          An error occurred while configuring with CMake.
            Command:
              cmake /private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-install-s6p_6_f6/numbalsoda_aa6a60982d73407590bdf13f5c02eb57 -G Ninja -DCMAKE_INSTALL_PREFIX:PATH=/private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-install-s6p_6_f6/numbalsoda_aa6a60982d73407590bdf13f5c02eb57/_skbuild/macosx-12.0-arm64-3.10/cmake-install -DPYTHON_VERSION_STRING:STRING=3.10.8 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-build-env-tb3aezox/overlay/lib/python3.10/site-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/Users/tbridel/Documents/1-CODES/py3env/bin/python -DPYTHON_INCLUDE_DIR:PATH=/opt/homebrew/opt/[email protected]/Frameworks/Python.framework/Versions/3.10/include/python3.10 -DPYTHON_LIBRARY:PATH=/opt/homebrew/opt/[email protected]/Frameworks/Python.framework/Versions/3.10/lib/libpython3.10.dylib -DPython_EXECUTABLE:PATH=/Users/tbridel/Documents/1-CODES/py3env/bin/python -DPython_ROOT_DIR:PATH=/Users/tbridel/Documents/1-CODES/py3env -DPython_INCLUDE_DIR:PATH=/opt/homebrew/opt/[email protected]/Frameworks/Python.framework/Versions/3.10/include/python3.10 -DPython_FIND_REGISTRY:STRING=NEVER -DPython3_EXECUTABLE:PATH=/Users/tbridel/Documents/1-CODES/py3env/bin/python -DPython3_ROOT_DIR:PATH=/Users/tbridel/Documents/1-CODES/py3env -DPython3_INCLUDE_DIR:PATH=/opt/homebrew/opt/[email protected]/Frameworks/Python.framework/Versions/3.10/include/python3.10 -DPython3_FIND_REGISTRY:STRING=NEVER -DCMAKE_MAKE_PROGRAM:FILEPATH=/private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-build-env-tb3aezox/overlay/lib/python3.10/site-packages/ninja/data/bin/ninja -DSKBUILD=ON -DCMAKE_BUILD_TYPE:STRING=Release -DCMAKE_OSX_DEPLOYMENT_TARGET:STRING=12.0 -DCMAKE_OSX_ARCHITECTURES:STRING=arm64
            Source directory:
              /private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-install-s6p_6_f6/numbalsoda_aa6a60982d73407590bdf13f5c02eb57
            Working directory:
              /private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-install-s6p_6_f6/numbalsoda_aa6a60982d73407590bdf13f5c02eb57/_skbuild/macosx-12.0-arm64-3.10/cmake-build
          Please see CMake's output for more information.
          [end of output]
      
      note: This error originates from a subprocess, and is likely not a problem with pip.
      ERROR: Failed building wheel for numbalsoda
    Failed to build numbalsoda
    ERROR: Could not build wheels for numbalsoda, which is required to install pyproject.toml-based projects
    

    πŸ”΄ This error also occurs with numbalsoda v0.3.3.

    🟒 The pip install runs fine for numbalsoda v0.2.1 however.

    Could you please advise @Nicholaswogan ?

    Thank you very much in advance ! Best, T. Bridel-Bertomeu

    opened by tbridel 5
  • [lsoda] 500 steps taken before reaching tout

    [lsoda] 500 steps taken before reaching tout

    I am running a simulation and if I understand correctly the maximum number of steps (500) is performed before the integration finishes. Is there a way to change this parameter?

    Thanks

    opened by nmourdo 5
  • Can I use 2d-array of 'u0'?

    Can I use 2d-array of 'u0'?

    I want to use the 2d-shape initial values of the multiple-particles system. But, I think I cannot use the slicing or indexing.

    def swing_lsoda(network):
        """
        rhs = swing_cover(network)
        funcptr = rhs.address
        ...
        data = np.array([1.0])
        usol, success = lsoda(funcptr, u0, t_eval, data = data)
        """
        @nb.cfunc(lsoda_sig) # network
        def rhs(t, u, du, p): # p0=m, p1=gamma, p2=P, p3=K
            # u is 2d-array 
            Interaction = p[3] * SinIntE(network, u[0])
            du[0] = u[1] #Thetas of nodes
            du[1] = 1/p[0]*(p[2] - p[1]*u[1] - Interaction) #Omega 
    

    How can I try it?

    opened by kimyoungjin06 4
  • Support for other integrators

    Support for other integrators

    Hi, thanks for this great package!

    I know this package is only supposed to deal with the lsoda integrator, but I was wondering if you could at least add support for an explicit integrator as well, maybe rk23. The reason I ask is that I want to solve a very large system of ODEs (>10^6), and because of that, implicit integrators are not really suitable (since they require a matrix solve Ax=b at each step), so I'm stuck with scipy's rk23 or rk45, which are fine, except that because the integration involves too many time steps, I think solve_ivp's explicit while loop is becoming the bottle-neck.

    opened by ma-sadeghi 4
  • Numba cache and iterative solving

    Numba cache and iterative solving

    Hi,

    I am trying to use NumbaLSODA to solve many systems iteratively. However, it seems that I can't out the function in the numba cache and so I have a big overhead at each iteration. I was wondering if you have an advice about it.

    This is my code. First I put all numba related function inside a file

    numba_func.py

    NOPYTHON = True
    CACHE = True
    PARALLEL = True
    
    @nb.cfunc(lsoda_sig, nopython=NOPYTHON, cache=CACHE)
    def rhs(x, i, di, p):
        di[0] = ...
        di[1] = ...
        di[2] = ...
    
    funcptr = rhs.address
    
    @nb.njit('(float64)(float64, float64, float64, int16)', nopython=NOPYTHON, parallel=PARALLEL, cache=CACHE)
    def solve(a, b, c, funcptr):
    
        w0 = np.array([a, b, ...], dtype=np.float64)
        p  = np.array([c], dtype=np.float64)
    
        x = np.linspace(0, 100, 500)
    
        usol, success = lsoda(funcptr, w0, x, data=p)
        
        return usol[-1][1]
    
    

    Then I use another file to solve the systems one after the others

    
    from numba_func import solve, funcptr
    
    gs = []
    for a, b, c in zip(as, bs, cs):
        gs = np.append(gs, solve(a, b, c, funcptr))
    

    I got the following warning:

    NumbaWarning: Cannot cache compiled function "solve" as it uses dynamic globals (such as ctypes pointers and large global arrays)
    

    I guess the idea is to pass the variable funcptr correctly so that numba is happy but so far I failed...

    opened by edumur 4
  • Strange results when t_eval has repeat values

    Strange results when t_eval has repeat values

    Thanks for the great package!

    Summary: I'm trying to switch over from using scipy's odeint to NumbaLSODA as it seems much better! Sometimes the t_eval array I supply has repeated times at the end and this produces some strange output. This was okay with odeint though, would it be possible to allow it for NumbaLSODA too? I think it is trying to integrate an infinitely small timestep rather than just evaluating the same timestep multiple times.

    Details: I'm trying to add this to LEGWORK so that we can speed up how we evolve the orbits of binary stars due to gravitational wave emission. However, if the evolution gets too close to the merger then LSODA produces a bunch of warnings about the timesteps being too small (since the derivatives get so large). To avoid this, I set any timesteps that are too close to the merger equal to the last allowable timestep - this leads to repeated values and the weird results that I see. (I don't just trim the array since I do this in a 2D array with a collection of binaries and so each individual binary might have a different length array which would wouldn't work in a 2D array I think.)

    MWE: Evolve binary evolution due to gravitational wave emission using Peters (1964)

    from numba import cfunc
    import numba as nb
    import numpy as np
    from NumbaLSODA import lsoda_sig, lsoda
    
    @cfunc(lsoda_sig)
    def de_dt(t, e, de, data):
        e_ = nb.carray(e, (1,))
        beta, c_0 = nb.carray(data, (2,))
        de[0] = -19 / 12 * beta / c_0**4 * (e_[0]**(-29.0 / 19.0) * (1 - e_[0]**2)**(3.0/2.0)) \
            / (1 + (121./304.) * e_[0]**2)**(1181./2299.)
    
    ecc = 0.5
    beta = 2.47e22
    c_0 = 1.677e10
    t_merge = 1.81e17
    
    funcptr = de_dt.address
    u0 = np.array([ecc])
    data = np.array([beta, c_0])
    t_eval = np.array([0.0, 0.1, 0.2, 0.5, 0.8, 0.9, 0.9, 0.9]) * t_merge
    
    ecc_evol, _ = lsoda(funcptr, u0, t_eval=t_eval, data=data)
    print(ecc_evol)
    

    This outputs

    [[ 5.00000000e-01]
     [ 4.83244343e-01]
     [ 4.64840518e-01]
     [ 3.95592532e-01]
     [ 2.82404708e-01]
     [ 2.15245963e-01]
     [-1.36311572e+57]
     [-1.36311572e+57]]
    

    So you can see that as soon as the timesteps repeat, you get this hugely small number (and eccentricity should be between 0 and 1)

    opened by TomWagg 3
  • Cannot import dop853 from numbalsoda

    Cannot import dop853 from numbalsoda

    Hi! I've just installed your library and got this error when running the code shown in the readme. ImportError: cannot import name 'dop853' from 'numbalsoda' Could I be missing a library? Thanks in advance

    opened by samirop 2
  • do you have a tip how to workaround the global variable?

    do you have a tip how to workaround the global variable?

    Hi,

    thanks for the great code, it's very fast.

    I'm struggling though with a need for a global variable, or maybe an argument that is updated by the ODE itself

    @nb.cfunc(lsoda_sig)
    def particle_ode(t, y_, dydt, p_):
        y = nb.carray(y_,(3,))
        p = nb.carray(p_,(11,))
        rho1, rho, nu , Cd, g, rhop, d, zu, zl, Vc0, trec = p
       
        global tzl # <---- here 
    
                               
        zp   = y[0]                # particle position      [m]
        V    = y[1]                # particle velocity      [m/s]
        tzl =  y[2] 
        Cam = 0.5                  # added mass coefficient [-]
    
        # determine Vc
        if (y[0]<= zl):
            Vc = Vc0
            tzl = t     # < ----- here I need to adjust this as we move through with y[0]
        else:
            Vc = Vc0 * np.exp(-1*((t-tzl)/trec))
        FS = (rho(zp) - rho1)*Vc*g
    
        # force balance on the particle 
        dzpdt = V
        dVdt  = ( (rhop-rho(zp)) * Vp(d) * g \
                - 0.5 *Cd(V*d/nu(zp))* rho(zp) * Ap(d) * np.abs(V) * V \
                - FS \
                ) / (rhop*Vp(d) + Cam*rho(zp)*Vp(d))
    
        dydt = [dzpdt, dVdt, 0.0 ]
    funcptr = particle_ode.address
    
    opened by alexlib 2
  • Unexpected behavior of NumbaLSODA and nb.cufnc in parallel mode

    Unexpected behavior of NumbaLSODA and nb.cufnc in parallel mode

    Hi There, Thanks for sharing this package. I am seeing up to 20x speedup in my code. As I am simulating the evolution of multiple initial conditions, I want to speed up the code further by using parallel=True flag in Numba. I adapted the example you provided in Stackoverflow for my code to use NumbaLSODA with parallel=True flag in Numba. However, I observed that the speedup was marginal even with 8 cores . In some cases, my parallel code was actually slower (!).

    After debugging on the 2 state example using the code below, I narrowed the reason down to the way the parameters in the rhs function are defined. if the parameters are either global variables or passed using the data argument in lsoda, the parallel = True speeds up the code almost linearly versus the number of cores. However, if the parameters are defined locally as in f_local below, the speed up is marginal. In fact, the parallel version using f_local is slower than the series version using f_global or f_param.

    I thought that local declarations of variables is a good coding practice and it speeds up code execution, but this does not seem to be the case with Numba and NumbaLSODA. I do not know if this behavior is caused by cfunc in Numba or NumbaLSODA. It was definitely unexpected and I thought I will bring it to your notice. Do you know the reason for this behavior? Are local constants not compiled just as global constants? I am unable to dig deeper into the reason and was wondering if you could help. Best Amar

    from NumbaLSODA import lsoda_sig, lsoda
    from matplotlib import pyplot as plt
    import numpy as np
    import numba as nb
    import time
    
    a_glob=np.array([1.5,1.5])
    
    @nb.cfunc(lsoda_sig)
    def f_global(t, u_, du, p): # variable a is global
        u = nb.carray(u_, (2,))
        du[0] = u[0]-u[0]*u[1]*a_glob[0]
        du[1] = u[0]*u[1]-u[1]*a_glob[1]
    
        
    @nb.cfunc(lsoda_sig)
    def f_local(t, u_, du, p): # variable a is local
        u = nb.carray(u_, (2,))
        a = np.array([1.5,1.5]) 
        du[0] = u[0]-u[0]*u[1]*a[0]
        du[1] = u[0]*u[1]-u[1]*a[1]
    
        
    @nb.cfunc(lsoda_sig)
    def f_param(t, u_, du, p): # pass in a as a paameter
        u = nb.carray(u_, (2,))
        du[0] = u[0]-u[0]*u[1]*p[0]
        du[1] = u[0]*u[1]-u[1]*p[1]
    
    funcptr_glob = f_global.address
    funcptr_local = f_local.address
    funcptr_param = f_param.address
    t_eval = np.linspace(0.0,20.0,201)
    np.random.seed(0)
    a = np.array([1.5,1.5])
    
    @nb.njit(parallel=True)
    def main(n, flag):
    #     a = np.array([1.5,1.5])
        u1 = np.empty((n,len(t_eval)), np.float64)
        u2 = np.empty((n,len(t_eval)), np.float64)
        for i in nb.prange(n):
            u0 = np.empty((2,), np.float64)
            u0[0] = np.random.uniform(4.5,5.5)
            u0[1] = np.random.uniform(0.7,0.9)
            if flag ==1: # global
                usol, success = lsoda(funcptr_glob, u0, t_eval, rtol = 1e-8, atol = 1e-8)
            if flag ==2: # local
                usol, success = lsoda(funcptr_local, u0, t_eval, rtol = 1e-8, atol = 1e-8)
            if flag ==3: # param
                usol, success = lsoda(funcptr_param, u0, t_eval, data = a, rtol = 1e-8, atol = 1e-8)
            u1[i] = usol[:,0]
            u2[i] = usol[:,1]
        return u1, u2
    
    @nb.njit(parallel=False)
    def main_series(n, flag): # same function as above but with parallel flag = False
    #     a = np.array([1.5,1.5])
    u1 = np.empty((n,len(t_eval)), np.float64)
    u2 = np.empty((n,len(t_eval)), np.float64)
    for i in nb.prange(n):
        u0 = np.empty((2,), np.float64)
        u0[0] = np.random.uniform(4.5,5.5)
        u0[1] = np.random.uniform(0.7,0.9)
        if flag ==1: # global
            usol, success = lsoda(funcptr_glob, u0, t_eval, rtol = 1e-8, atol = 1e-8)
        if flag ==2: # local
            usol, success = lsoda(funcptr_local, u0, t_eval, rtol = 1e-8, atol = 1e-8)
        if flag ==3: # param
            usol, success = lsoda(funcptr_param, u0, t_eval, data = a, rtol = 1e-8, atol = 1e-8)
        u1[i] = usol[:,0]
        u2[i] = usol[:,1]
    return u1, u2
    
    n = 100
    # calling first time for JIT compiling
    u1, u2 = main(n,1)
    u1, u2 = main(n,2)
    u1, u2 = main(n,3)
    
    u1, u2 = main_series(n,1)
    u1, u2 = main_series(n,1)
    u1, u2 = main_series(n,1)
    
    # Running code for large number 
    n = 10000
    a1 = time.time()
    u1, u2 = main(n,1) # global
    a2 = time.time()
    print(a2 - a1) # this is fast
    
    
    a1 = time.time()
    u1, u2 = main(n,2) # local
    a2 = time.time()
    print(a2 - a1) # this is slow
    
    a1 = time.time()
    u1, u2 = main(n,3) # param
    a2 = time.time()
    print(a2 - a1) # this is fast and almost identical performance as global
    
    a1 = time.time()
    u1, u2 = main_series(n,1) # global
    a2 = time.time()
    print(a2 - a1) # this is faster than local + parallel
    
    a1 = time.time()
    u1, u2 = main_series(n,2) # local
    a2 = time.time()
    print(a2 - a1) # this is slow
    
    a1 = time.time()
    u1, u2 = main_series(n,3) # param
    a2 = time.time()
    print(a2 - a1) # this is fast and almost identical performance as global
    
    opened by amar-iastate 2
  • Easier passing data to rhs

    Easier passing data to rhs

    Thanks for the package, some 400x faster than plain Python for the system I'm looking at. I just wanted to point out that Numba functions close over their local scope so passing arbitrary (constant?) data can be much easier than in your example. Mine looks like this:

    def make_lsoda_func(c,R,dstar,E,F,N):
        @cfunc(lsoda_sig)
        def rhs(t, x, du, p):
            codim3(t,x,c,R,dstar,E,F,N,du)
        return rhs
    
    opened by maedoc 2
  • Potential interpolant issue

    Potential interpolant issue

    Hi @Nicholaswogan,

    Great work on this repository, it looks like you managed a consistent and quite high speed-up compared to SciPy. I have been working with LSODA recently using solve_ivp, and I have encountered an issue relative to the interpolant (PR at SciPy). I am wondering if this implementation of LSODA also has a similar issue, as it looks to me that the root of the problem lies within the FORTRAN implementation. In particular, LSODA::intdy seems very similar to me to DINTDY in ODEPACK, which could well mean that the issue is present. Let me know if you can have a look.

    opened by Patol75 2
  • Addition of a Runge-Kutta version

    Addition of a Runge-Kutta version

    Good afternoon @Nicholaswogan.

    As discussed, here is a PR for the addition of the RK45 explicit solver. I still have a few things to add like the comparison to Scipy and the performance tests, but you can already have a look.

    Note that I only implemented Fehlberg RK45 here, but I wrote it in a sufficient general manner to implement any explicit RK described by its Butcher tableau.

    To-Do list

    • [x] It seems there is a bug with machine-epsilon requested absolute tolerance (atol = 1e-30): the timesteps get infinitesimally smaller and the solver never finishes.
    opened by tbridel 1
  • Adding other Runge-Kutta time integrators

    Adding other Runge-Kutta time integrators

    Hye again !

    Could you please share the steps one would have to take to add a new Runge-Kutta integrator to the project ?

    I'm happy to implement it push a PR at some point, but I wanted to know if you had maybe a process that would make it easier to add a new Butcher-tableau based RK integrator ?

    Thanks ! T. Bridel-Bertomeu

    opened by tbridel 2
  • Adaptive timestepping and hijacking the

    Adaptive timestepping and hijacking the "step" method

    Hi again @Nicholaswogan !

    I was wondering if you ever thought about the adaptive timestepping capability that exists in Scipy LSODA ? Is anything like this possible with your implementation ?

    Thanks ! Best, T. Bridel-Bertomeu

    opened by tbridel 12
  • Installation via pip produces error when importing numbalsoda

    Installation via pip produces error when importing numbalsoda

    Hi Nick, thank you very much for building this Python package, it definitely appears very promising!

    I installed the latest version of numbalsoda (0.3.4) via pip and the installation was successful. I am using Python 3.9.7 and conda version 22.9.0.

    However, when trying to import it, I get the following error:

    'The procedure entry point ZSt28​__throw​_bad_array_new_lengthv could not be located in the dynamic link library C:\Users...\numbalsoda\liblsoda.dll'

    Installing numbalsoda via conda (conda install -c conda-forge numbalsoda) solves the issue and the package is successfully imported into the Python script. Conda installs version 0.3.2, and thus I was wondering if the difference in the versions may play a role in this issue.

    Thank you very much!

    opened by kf120 0
  • Backward in time using numbalsoda

    Backward in time using numbalsoda

    Hi

    I am using numblsoda to solve couples of first order differential equations. I need to solve those equations backward in time for some particular situations. I could not figure out how I can do that. I tried to use decreasing time step but it does not work as it works for solve_ivp and odeint.

    for instance in this code how can I go backward in time?

    from numbalsoda import lsoda_sig, lsoda, dop853
    from numba import njit, cfunc
    import numpy as np
    
    @cfunc(lsoda_sig)
    def rhs(t, u, du, p):
        du[0] = u[0]-u[0]*u[1]
        du[1] = u[0]*u[1]-u[1]*p[0]
    
    funcptr = rhs.address # address to ODE function
    u0 = np.array([5.,0.8]) # Initial conditions
    data = np.array([1.0]) # data you want to pass to rhs (data == p in the rhs).
    t_eval = np.arrange(0.0,50.0,0.1) # times to evaluate solution
    
    
    usol, success = lsoda(funcptr, u0, t_eval, data = data)
    

    I tried t_eval = np.arrange(50, 0, -0.1) but it does not work. It would be great if someone could help me to fix this issue.

    opened by meysam-motaharfar 1
  • Make NumPy types for u0 and usol configurable

    Make NumPy types for u0 and usol configurable

    First of all, awesome code! I got a speedup of an order of magnitude basically for free!

    But, as far as I can tell, u0 and usol are required to be of type np.float64. However, often times np.float32 is more than enough.

    Maybe I just don't see the option to change it, but if it doesn't exist, it'd be nice if we could change it.

    Thank you!

    opened by juhannc 1
Releases(v0.3.5)
Owner
Nick Wogan
Planetary Scientist, PhD Student, developing models of atmospheric evolution.
Nick Wogan
RDA: Robust Domain Adaptation via Fourier Adversarial Attacking

RDA: Robust Domain Adaptation via Fourier Adversarial Attacking Updates 08/2021: check out our domain adaptation for video segmentation paper Domain A

17 Nov 30, 2022
Repo for CReST: A Class-Rebalancing Self-Training Framework for Imbalanced Semi-Supervised Learning

CReST in Tensorflow 2 Code for the paper: "CReST: A Class-Rebalancing Self-Training Framework for Imbalanced Semi-Supervised Learning" by Chen Wei, Ki

Google Research 75 Nov 01, 2022
A simple editor for captions in .SRT file extension

WaySRT A simple editor for captions in .SRT file extension The program doesn't use any external dependecies, just run: python way_srt.py {file_name.sr

Gustavo Lopes 3 Nov 16, 2022
RLMeta is a light-weight flexible framework for Distributed Reinforcement Learning Research.

RLMeta rlmeta - a flexible lightweight research framework for Distributed Reinforcement Learning based on PyTorch and moolib Installation To build fro

Meta Research 281 Dec 22, 2022
Fast and robust clustering of point clouds generated with a Velodyne sensor.

Depth Clustering This is a fast and robust algorithm to segment point clouds taken with Velodyne sensor into objects. It works with all available Velo

Photogrammetry & Robotics Bonn 957 Dec 21, 2022
ICLR 2021 i-Mix: A Domain-Agnostic Strategy for Contrastive Representation Learning

Introduction PyTorch code for the ICLR 2021 paper [i-Mix: A Domain-Agnostic Strategy for Contrastive Representation Learning]. @inproceedings{lee2021i

Kibok Lee 68 Nov 27, 2022
Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm

DeCLIP Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm. Our paper is available in arxiv Updates ** Ou

Sense-GVT 470 Dec 30, 2022
Distilled coarse part of LoFTR adapted for compatibility with TensorRT and embedded divices

Coarse LoFTR TRT Google Colab demo notebook This project provides a deep learning model for the Local Feature Matching for two images that can be used

Kirill 46 Dec 24, 2022
HairCLIP: Design Your Hair by Text and Reference Image

Overview This repository hosts the official PyTorch implementation of the paper: "HairCLIP: Design Your Hair by Text and Reference Image". Our single

322 Jan 06, 2023
Voice of Pajlada with model and weights.

Pajlada TTS Stripped down version of ForwardTacotron (https://github.com/as-ideas/ForwardTacotron) with pretrained weights for Pajlada's (https://gith

6 Sep 03, 2021
Official code repository for the EMNLP 2021 paper

Integrating Visuospatial, Linguistic and Commonsense Structure into Story Visualization PyTorch code for the EMNLP 2021 paper "Integrating Visuospatia

Adyasha Maharana 23 Dec 19, 2022
Regulatory Instruments for Fair Personalized Pricing.

Fair pricing Source code for WWW 2022 paper Regulatory Instruments for Fair Personalized Pricing. Installation Requirements Linux with Python = 3.6 p

Renzhe Xu 6 Oct 26, 2022
[ICCV2021] Safety-aware Motion Prediction with Unseen Vehicles for Autonomous Driving

Safety-aware Motion Prediction with Unseen Vehicles for Autonomous Driving Safety-aware Motion Prediction with Unseen Vehicles for Autonomous Driving

Xuanchi Ren 44 Dec 03, 2022
Styled Handwritten Text Generation with Transformers (ICCV 21)

⚑ Handwriting Transformers [PDF] Ankan Kumar Bhunia, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan & Mubarak Shah Abstract: We

Ankan Kumar Bhunia 85 Dec 22, 2022
On Generating Extended Summaries of Long Documents

ExtendedSumm This repository contains the implementation details and datasets used in On Generating Extended Summaries of Long Documents paper at the

Georgetown Information Retrieval Lab 76 Sep 05, 2022
[AAAI 2021] MVFNet: Multi-View Fusion Network for Efficient Video Recognition

MVFNet: Multi-View Fusion Network for Efficient Video Recognition (AAAI 2021) Overview We release the code of the MVFNet (Multi-View Fusion Network).

Wenhao Wu 114 Nov 27, 2022
Public repository created to store my custom-made tools for Just Dance (UbiArt Engine)

Woody's Just Dance Tools Public repository created to store my custom-made tools for Just Dance (UbiArt Engine) Development and updates Almost all of

Wodson de Andrade 8 Dec 24, 2022
Code for the Paper: Alexandra Lindt and Emiel Hoogeboom.

Discrete Denoising Flows This repository contains the code for the experiments presented in the paper Discrete Denoising Flows [1]. To give a short ov

Alexandra Lindt 3 Oct 09, 2022
Official PyTorch implementation of "AASIST: Audio Anti-Spoofing using Integrated Spectro-Temporal Graph Attention Networks"

AASIST This repository provides the overall framework for training and evaluating audio anti-spoofing systems proposed in 'AASIST: Audio Anti-Spoofing

Clova AI Research 56 Jan 02, 2023
an implementation of softmax splatting for differentiable forward warping using PyTorch

softmax-splatting This is a reference implementation of the softmax splatting operator, which has been proposed in Softmax Splatting for Video Frame I

Simon Niklaus 338 Dec 28, 2022