
xwrf

A lightweight interface for reading output from the Weather Research and Forecasting (WRF) model into xarray Dataset objects. The primary objective of xwrf is to replicate crucial I/O functionality from the wrf-python package in a way that is more convenient for users and provides seamless integration with the rest of the Pangeo software stack.


This code is highly experimental! Let the buyer beware ⚠️ ;)

Installation

xwrf may be installed with pip:

python -m pip install git+https://github.com/NCAR/xwrf.git

What is it?

The native WRF output files are not CF-compliant, which makes them awkward NetCDF files to use with tools like xarray. This package provides a simple interface for reading WRF output files into xarray Dataset objects using xarray's flexible and extensible I/O backend API. For example, the following code reads in a WRF output file:

In [1]: import xarray as xr

In [2]: path = "./tests/sample-data/wrfout_d03_2012-04-22_23_00_00_subset.nc"

In [3]: ds = xr.open_dataset(path, engine="xwrf")

In [4]: # or

In [5]: # ds = xr.open_dataset(path, engine="wrf")

In [6]: ds
Out[6]:
<xarray.Dataset>
Dimensions:  (Time: 1, south_north: 546, west_east: 480)
Coordinates:
    XLONG    (south_north, west_east) float32 ...
    XLAT     (south_north, west_east) float32 ...
Dimensions without coordinates: Time, south_north, west_east
Data variables:
    Q2       (Time, south_north, west_east) float32 ...
    PSFC     (Time, south_north, west_east) float32 ...
Attributes: (12/86)
    TITLE:                            OUTPUT FROM WRF V3.3.1 MODEL
    START_DATE:                      2012-04-20_00:00:00
    SIMULATION_START_DATE:           2012-04-20_00:00:00
    WEST-EAST_GRID_DIMENSION:        481
    SOUTH-NORTH_GRID_DIMENSION:      547
    BOTTOM-TOP_GRID_DIMENSION:       32
    ...                              ...
    NUM_LAND_CAT:                    24
    ISWATER:                         16
    ISLAKE:                          -1
    ISICE:                           24
    ISURBAN:                         1
    ISOILWATER:                      14

In addition to xr.open_dataset, xwrf also allows reading in multiple WRF output files at once via the xr.open_mfdataset function:

ds = xr.open_mfdataset(list_of_files, engine="xwrf", parallel=True,
                       concat_dim="Time", combine="nested")

Why not just a preprocess function?

One could achieve the same functionality with a preprocess function. However, wrf-python implements some additional I/O features under the hood that we think are worth implementing as part of a backend engine rather than a regular preprocess function.

Comments
  • First Release Blog Post

    First Release Blog Post

    Description

    I think that once we have a first release of xwrf, we should write a blog post demonstrating its use. It would be great if one of our WRF expert collaborators could spearhead this blog. Any volunteers?

    Implementation

    Personally, I think that a Jupyter Notebook is a good medium for a demonstration, and the notebook can be easily converted to a markdown doc for a blog-post.

    Tests

    N/A

    Questions

Before embarking on this, though, we need to complete the features that we want in the first release. That said, I wouldn't want to delay the release for too long. Earlier is better, even if incomplete.

    enhancement 
    opened by kmpaul 32
  • Implementation of salem-style x, y, and z coordinates

    Implementation of salem-style x, y, and z coordinates

    Change Summary

    As alluded to in #2, including dimension coordinates in the grid mapping/projection space is a key feature for integrating with other tools in the ecosystem like metpy and xgcm. In this (draft) PR, I've combined code ported from salem with some of my own one-off scripts and what already exists in xwrf to meet this goal. In particular, this introduces a pyproj dependency (for CRS handling and transforming the domain center point from lon/lat to easting/northing). Matching the assumptions already present in xwrf and salem, this implementation assumes we do not have a moving domain (which simplifies things greatly). Also, this implements the c_grid_axis_shift attr as-needed, so xgcm should be able to interpret our coords automatically, eliminating the need for direct handling (like #5) in xwrf.

    ~~Also, because it existed in salem and my scripts alongside the dimension coordinate handling, I also included my envisioned diagnostic field calculations. These are deliberately limited to only those four fields that require WRF-specific handling:~~

    • ~~ 'T' going to potential temperature has a magic number offset of 300 K~~
    • ~~ 'P' and 'PB' combine to form pressure, and are not otherwise used~~
    • ~~ 'PH' and 'PHB' combine to form geopotential, and are not otherwise used~~
    • ~~ Geopotential to geopotential height conversion depends on a particular value of g (9.81 m s**-2) that may not match the value used elsewhere~~
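    For reference, the four relations in the list above can be sketched as plain functions (a sketch of the standard WRF formulas, not the PR's code; the 300 K offset and g value are the ones noted above):

    ```python
    G = 9.81  # m s**-2, the value of g noted above

    def potential_temperature(T):
        # WRF's 'T' is perturbation potential temperature; add the 300 K base.
        return T + 300.0

    def pressure(P, PB):
        # Full pressure = perturbation ('P') + base state ('PB').
        return P + PB

    def geopotential(PH, PHB):
        # Full geopotential = perturbation ('PH') + base state ('PHB').
        return PH + PHB

    def geopotential_height(PH, PHB):
        # Dividing geopotential by g gives geopotential height.
        return geopotential(PH, PHB) / G
    ```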

    ~~Unless I'm missing something, any other diagnostics should be derivable using these or other existing fields in a non-WRF-specific way (and so, fit outside of xwrf). If the netcdf4 backend already handles Dask chunks, then this should "just work" as it is currently written. However, I'm not sure how this should behave with respect to lazy-loading when chunks are not specified, so that is definitely a discussion to have in relation to #10.~~

    ~~Right now, no tests are included, as this is just a draft implementation to get the conversation started on how we want to approach these features. So, please do share your thoughts and ask questions!~~

    Related issue number

    • Closes #3
    • Closes #11

    Checklist

    • [x] Unit tests for the changes exist
    • [x] Tests pass on CI
    • [ ] Documentation reflects the changes where applicable
    enhancement 
    opened by jthielen 31
  • First Release?

    First Release?

    Now that we have xwrf in a usable state, should we consider cutting its first release soon (later this week or next week)? We already have the infrastructure in place for automatically publishing the package to PyPI. One missing piece is the documentation. The infrastructure for authoring the docs is already in place (uses markdown via myst + furo theme, and the current template follows this documentation system guide). I am opening this issue to keep track of other outstanding issues that need to be addressed before the first release. Feel free to add to this list (cc @ncar-xdev/xwrf)

    • [x] Update documentation
    • [x] Publish to PyPI
    • [x] Publish to conda-forge
    opened by andersy005 27
  • Tutorial

    Tutorial

    Change Summary

    Tutorial showing xWRF usage.

    Related issue number

    • Towards #69

    Checklist

    • [x] Unit tests for the changes exist
    • [x] Tests pass on CI
    • [x] Documentation reflects the changes where applicable
    opened by lpilz 20
  • Tutorial on xWRF

    Tutorial on xWRF

    What is your issue?

    The aim of this issue is to track the progress in creating a tutorial for xWRF. Here the start of a list of features which are to be presented. Please feel free to add to this list - I'll work on implementing this over coming days.

    • [x] general parsing/coordinate transformation (what does xwrf do?)
    • [x] interface to metpy via unit CF-conventions and pint
    • [x] destaggering data using xgcm
    • [x] vertically interpolating data using xgcm
    • [x] plotting
    opened by lpilz 17
  • Update of tutorials for v0.0.2

    Update of tutorials for v0.0.2

    Change Summary

    Added a tutorial for using xgcm with dask-data.

    Related issue number

    Closes #69

    Checklist

    • [x] Documentation reflects the changes where applicable
    documentation 
    opened by lpilz 13
  • First draft

    First draft "destagger" function

    Change Summary

    Here's an attempt at a "destaggering" function. This is based on the function in WRF-python (https://github.com/NCAR/wrf-python/blob/22fb45c54f5193b849fdff0279445532c1a6c89f/src/wrf/destag.py).

    I've tested it on the "west_east_stag" and "south_north_stag" dimensions. The function takes an xarray DataArray and guesses the name of the staggered dimension (it ends in "_stag"). If there is more than one (I don't think there are in WRF?), a NotImplementedError is raised.

    I'm also not sure if this should ultimately look like this at all, but I wanted to go ahead and throw this code out there.
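    The logic described above might be sketched roughly like this (an illustrative sketch, not the PR's actual code):

    ```python
    import xarray as xr

    def destagger(da):
        """Average adjacent values along the single *_stag dimension."""
        stag_dims = [d for d in da.dims if d.endswith("_stag")]
        if len(stag_dims) != 1:
            raise NotImplementedError("expected exactly one staggered dimension")
        dim = stag_dims[0]
        # Midpoint average of neighboring staggered points.
        left = da.isel({dim: slice(None, -1)})
        right = da.isel({dim: slice(1, None)})
        out = 0.5 * (left + right)
        # Rename e.g. "west_east_stag" -> "west_east".
        return out.rename({dim: dim.removesuffix("_stag")})
    ```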

    Related issue number

    This is related to issue #35

    Checklist

    I don't have any unit tests to check this -- I'm open to ideas on how to make unit tests (do they need to be on "real" data?) Maybe that's a separate issue.

    • [ ] Unit tests for the changes exist
    • [ ] Tests pass on CI
    • [ ] Documentation reflects the changes where applicable

    I'm new to collaborating on open-source projects, and writing code for wide usage, so any feedback is welcome!

    enhancement 
    opened by bsu-wrudisill 13
  • [MISC]: Curate sample datasets

    [MISC]: Curate sample datasets

    What is your issue?

    We currently don't have great sample datasets to use for testing and documentation. It's worth curating exemplar, small datasets. We could emulate the approach used by fatiando/ensaio or xarray's tutorial module. These datasets should probably be hosted in a separate GitHub repository.

    • Option 1: A separate data package (xwrf_data)
    import xwrf_data
    import xwrf
    import xarray as xr
    
    fname = xwrf_data.fetch_foo_dataset()
    ds = xr.open_dataset(fname).wrf.diag_and_destagger()
    
    • Option 2: Tutorial module within xwrf
    import xwrf
    import xarray as xr
    
    ds = xwrf.tutorial.open_dataset('foo_dataset').wrf.diag_and_destagger()
    

    Cc @ncar-xdev/xwrf

    enhancement 
    opened by andersy005 12
  • Division of Features in Top-Level API

    Division of Features in Top-Level API

    While detailed API discussions will be ongoing based on https://github.com/NCAR/xwrf/discussions/13 and other issues/discussions that follow from that, https://github.com/NCAR/xwrf/pull/14#issuecomment-977066277 and https://github.com/NCAR/xwrf/pull/14#issuecomment-977157649 raised a more high-level API point that would be good to clear up first: what features go into the xwrf backend, and what goes elsewhere (such as a .wrf accessor)?

    Original comments:


    If so, I think this means we can't have direct Dask operations within the backend, but would rather need to design custom backend arrays that play nicely with the Dask chunking xarray itself does, or re-evaluate the approach for derived quantities so that they are outside the backend. Perhaps the intake-esm approach could help in that regard at least?

    Wouldn't creating custom backend arrays be overkill? Assuming we want to support reading files via the Python-netCDF4 library, we might be able to write a custom data store that borrows from xarray's NetCDF4DataStore: https://github.com/pydata/xarray/blob/5db40465955a30acd601d0c3d7ceaebe34d28d11/xarray/backends/netCDF4_.py#L291. With this custom datastore, we would have more control over what to do with variables, dimensions, attrs before passing them to xarray. Wouldn't this suffice for the data loading (without the derived quantities)?

    I think there's value in keeping the backend plugin simple (e.g. performing simple tasks such as decoding coordinates, fixing attributes/metadata, etc) and everything else outside the backend. Deriving quantities doesn't seem simple enough to warrant having this functionality during the data loading.

    Some of the benefits of deriving quantities outside the backend are that this approach:

    (1) doesn't obfuscate what's going on, (2) gives users the opportunity to fix aspects of the dataset that might be missed by xwrf during data loading before passing this cleaned dataset to the functionality for deriving quantities. (3) removes the requirement for deriving quantities to be a lazy operation i.e. if your dataset is in memory, deriving the quantity is done eagerly...

    Originally posted by @andersy005 in https://github.com/NCAR/xwrf/issues/14#issuecomment-977066277


    Some of the benefits of deriving quantities outside the backend are that this approach:

    Also, wouldn't it be beneficial for deriving quantities to be backend-agnostic? I'm imagining cases in which the data have been post-processed and saved in a different format (e.g. Zarr) and you still want to be able to use the same code for deriving quantities on the fly.

    Originally posted by @andersy005 in https://github.com/NCAR/xwrf/issues/14#issuecomment-977072366


    Deriving quantities doesn't seem simple enough to warrant having this functionality during the data loading.

    This sounds like it factors directly into the "keep the solutions as general as possible (so that maybe also MPAS can profit from it)" discussion. However, I feel that we have to think about the user-perspective too. I don't have any set opinions on this and we should definitely discuss this maybe in a larger group too. Here some thoughts on this so far:

    I think the reason users like wrf-python is because it is an easy one-stop-shop for getting wrf output to work with python - this is especially true because lots of users are scientists and not software engineers or programmers. I personally take from this point that it would be prudent to keep the UX as easy as possible. I think this is what the Backend-approach does really well. Basically users just have to add the engine='xwrf' kwarg and then it just works (TM). Meaning that it provides the users with CF-compliant de-WRFified meteo data. Also, given that the de-WRFification of the variable data is not too difficult (it's basically just adding fields for three variables), I think the overhead in complexity wouldn't be too great. However, while I do see that it breaks the conceptual barrier between data loading (and decoding etc.) and computation, this breakage would be required in order to provide the user with meteo data rather than raw wrf fields.

    @andersy005 do you already have some other ideas on how one could handle this elegantly?

    Also, should we move this discussion to a separate issue maybe?

    Originally posted by @lpilz in https://github.com/NCAR/xwrf/issues/14#issuecomment-977157649

    opened by jthielen 10
  • Coordinate UX

    Coordinate UX

    I think this is pretty straightforward, as we just need the lat, lon and time coordinates; all others can be discarded. Unstaggering will be done in the variable initialization. However, we should be aware of moving-nest runs and keep the time-dependence of lat and lon for those occasions.
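    A rough sketch of that coordinate trimming (the coordinate names and the helper are illustrative assumptions):

    ```python
    import xarray as xr

    # Illustrative WRF names for the lat, lon, and time coordinates to keep.
    KEEP = {"XLAT", "XLONG", "XTIME"}

    def trim_coords(ds):
        """Drop every coordinate except lat, lon, and time."""
        extra = [c for c in ds.coords if c not in KEEP]
        return ds.reset_coords(extra, drop=True)
    ```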

    enhancement 
    opened by lpilz 9
  • Create xWRF logo

    Create xWRF logo

    What is your issue?

    It would be nice to have a minimalistic logo for the project. Does anyone have, or know someone with, design skills? :) This would be good for the overall branding of the project once we start advertising it after the first release.

    • https://github.com/ncar-xdev/xwrf/issues/51

    Cc @ncar-xdev/xwrf

    opened by andersy005 8
  • [Bug]: ValueError when using MetPy to calculate geostrophic winds

    [Bug]: ValueError when using MetPy to calculate geostrophic winds

    What happened?

    I'm trying to use the MetPy function mpcalc.geostrophic_wind() to calculate geostrophic winds from a wrfout file.

    I'm getting "ValueError: Must provide dx/dy arguments or input DataArray with latitude/longitude coordinates", along with a warning, "warnings.warn('More than one ' + axis + ' coordinate present for variable'".

    I don't know what's causing the problem.

    Minimal Complete Verifiable Example

    import metpy.calc as mpcalc
    import xarray as xr
    import xwrf
    
    # Open the NetCDF file
    filename = "wrfout_d01_2016-10-04_12:00:00"
    ds = xr.open_dataset(filename).xwrf.postprocess()
    
    # Extract the geopotential height
    z = ds['geopotential_height']
    
    # Compute the geostrophic wind
    geo_wind_u, geo_wind_v = mpcalc.geostrophic_wind(z)
    

    Relevant log output

    /mnt/iusers01/fatpou01/sees01/w34926hb/.conda/envs/metpy_env/lib/python3.9/site-packages/metpy/xarray.py:355: UserWarning: More than one latitude coordinate present for variable "geopotential_height".
      warnings.warn('More than one ' + axis + ' coordinate present for variable'
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/mnt/iusers01/fatpou01/sees01/w34926hb/.conda/envs/metpy_env/lib/python3.9/site-packages/metpy/xarray.py", line 1508, in wrapper
        raise ValueError('Must provide dx/dy arguments or input DataArray with '
    ValueError: Must provide dx/dy arguments or input DataArray with latitude/longitude coordinates.
    

    Environment

    System Information
    ------------------
    xWRF commit : None
    python      : 3.9.13 | packaged by conda-forge | (main, May 27 2022, 16:58:50)
    [GCC 10.3.0]
    python-bits : 64
    OS          : Linux
    OS-release  : 3.10.0-1127.19.1.el7.x86_64
    machine     : x86_64
    processor   : x86_64
    byteorder   : little
    LC_ALL      : None
    LANG        : en_GB.UTF-8
    LOCALE      : ('en_GB', 'UTF-8')
    
    Installed Python Packages
    -------------------------
    cf_xarray   : 0.7.5
    dask        : 2022.11.0
    donfig      : 0.7.0
    matplotlib  : 3.6.2
    metpy       : 1.3.1
    netCDF4     : 1.6.2
    numpy       : 1.23.5
    pandas      : 1.5.1
    pint        : 0.20.1
    pooch       : v1.6.0
    pyproj      : 3.4.0
    xarray      : 2022.11.0
    xgcm        : 0.8.0
    xwrf        : 0.0.2
    

    Anything else we need to know?

    No response

    bug waiting for response 
    opened by starforge 3
  • [MISC]: Plot in metpy tutorial is missing

    [MISC]: Plot in metpy tutorial is missing

    What is your issue?

    On https://xwrf.readthedocs.io/en/latest/tutorials/metpy.html, the Skew-T plot is missing. @andersy005 is this an intermittent sphinx issue or do we have a misconfiguration somewhere?

    opened by lpilz 1
  • More comprehensive unit harmonization

    More comprehensive unit harmonization

    Change Summary

    Unit harmonization is improved by:

    • using a better map parsed from WRF Registries (yes, all of them, but not WPS)
      • translations are generated manually using a custom external tool
      • includes all versions from WRFv4.0 onwards
      • makes bracket cleaning superfluous
    • extracting this map from the config yaml to avoid clutter

    Related issue number

    Checklist

    • [x] Unit tests for the changes exist
    • [x] Tests pass on CI
    • [x] Documentation reflects the changes where applicable
    enhancement 
    opened by lpilz 4
  • [FEATURE]: Add functionality to organize WRF data into a DataTree

    [FEATURE]: Add functionality to organize WRF data into a DataTree

    Description

    WRF output can easily have a couple hundred data variables in a dataset, which is not ideal for interactive exploration of a dataset's contents. With DataTree, we would have a tree-like hierarchical data structure for xarray which could be used for this.

    From @lpilz in https://github.com/xarray-contrib/xwrf/issues/10:

    • Which diagnostics do we want to provide and do we want to expose them in a DataTree eventually?

    One suggestion might be:

    DataTree("root")
    |-- DataNode("2d_variables")
    |   |-- DataArrayNode("sea_surface_temperature")
    |   |-- DataArrayNode("surface_temperature")
    |   |-- DataArrayNode("surface_air_pressure")
    |   |-- DataArrayNode("air_pressure_at_sea_level")
    |   |-- DataArrayNode("air_temperature_at_2m") (?)
    |   ....
    |-- DataNode("3d_variables")
        |-- DataArrayNode("air_temperature")
        |-- DataArrayNode("air_pressure")
        |-- DataArrayNode("northward_wind")
        |-- DataArrayNode("eastward_wind")
        ....
    

    Implementation

    This would likely become a new accessor method, such as .xwrf.organize().
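    One possible first step for such an organize method, sketched with a plain dict of Datasets standing in for the eventual DataTree nodes (the grouping rule and names are hypothetical):

    ```python
    import xarray as xr

    def group_by_ndim(ds):
        """Partition data variables by dimensionality; each group would
        become one node of the proposed DataTree (e.g. '2d_variables')."""
        groups = {}
        for name, var in ds.data_vars.items():
            groups.setdefault(f"{var.ndim}d_variables", []).append(name)
        return {key: ds[names] for key, names in groups.items()}
    ```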

    Tests

    After xwrf.postprocess(), we have a post-processed dataset (with likely many data variables). Then, after xwrf.organize(), we would have a DataTree with (a yet-to-be-decided) tree-like grouping of data variables. Calling xwrf.organize() without xwrf.postprocess() would fail.

    Questions

    What form of hierarchy would we want to have, and how deep?

    • 2d_variables vs. 3d_variables?
    • semantic grouping of variables, such as thermodynamic, grid_metrics, kinematic, accumulated, etc.?
    • Parse the WRF Registry somehow and assign groups based on that?
    • some other strategy?
    enhancement 
    opened by jthielen 0
  • [META]: Support for unexpected/non-pristine wrfout datasets

    [META]: Support for unexpected/non-pristine wrfout datasets

    What is your issue?

    As encountered in #36 and https://github.com/xarray-contrib/xwrf-data/pull/34 (and perhaps elsewhere), several unexpected factors (old versions, tweaked registries, subsetting, etc.) can result in xWRF's standard functionality being unsupported or failing. While this is definitely not something to prioritize for immediate releases, it would still be nice to make as much of xWRF's functionality as possible available to users whose WRF datasets "break" xWRF's norms. So, I propose this as a meta-issue to

    • track such unexpected/non-pristine examples
    • work towards features to enable extended compatibility and/or custom application of atomized functionality outside of the standard postprocess()
    • discuss any high-level design strategies to improve the experience of xWRF in these situations

    Running list of sub-issues

    (feel free to add/modify)

    • [ ] Missing latitude/longitude coordinates (xref #36)
      • Could be addressed by (one or both of)
        • Convenience methods to merge in coordinates from geo_em files
        • Recompute lat/lon from projection coordinates
    • [ ] Dataset grid definition attributes partially invalid due to spatial subsetting prior to postprocessing (xref https://github.com/xarray-contrib/xwrf-data/pull/34; local issue TBD)
      • Could be addressed by (one or both of)
        • Reference lat/lon being derived from XLAT/XLONG corner(s) rather than CEN_LON/CEN_LAT attrs
        • Require user input of needed info if some sanity check fails (which would also lead to support for completely missing attrs, not just CEN_LON/CEN_LAT being rendered invalid)
    enhancement 
    opened by jthielen 0
  • [MISC]: More careful consideration of different xarray options

    [MISC]: More careful consideration of different xarray options

    What is your issue?

    Test expected results under different xarray options

    In the spirit of improving the quality of our tests (xref #60), it would be nice to implement tests where different relevant xarray options are enabled (using set_options as a context manager). This would likely make it easier to catch issues like #96 .
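    For example, a test exercising behavior under a non-default option might look like this (xr.set_options is real xarray API; the tested behavior is illustrative, not an actual xwrf test):

    ```python
    import xarray as xr

    def test_keep_attrs_roundtrip():
        # Attribute propagation through arithmetic depends on the
        # keep_attrs option; pin it explicitly for the test.
        da = xr.DataArray([1.0, 2.0], dims="x", attrs={"units": "K"})
        with xr.set_options(keep_attrs=True):
            assert (da + 1).attrs.get("units") == "K"
    ```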

    Xarray options in issue reports

    Not sure the best way to do this (bundle into xwrf.show_versions()? Add another copy-paste box to the issue template?), but it could help with debugging if we knew the state of xarray.get_options.

    maintenance 
    opened by jthielen 0
Releases(v0.0.2)
  • v0.0.2(Sep 21, 2022)

    What's Changed

    • Add destaggering functionality by @jthielen in https://github.com/xarray-contrib/xwrf/pull/93
    • Fix destagger attrs by @lpilz in https://github.com/xarray-contrib/xwrf/pull/97
    • Fix staggered coordinate destaggering for dataarray destagger method by @jthielen in https://github.com/xarray-contrib/xwrf/pull/101
    • Added earth-relative wind field calculation to base diagnostics by @lpilz in https://github.com/xarray-contrib/xwrf/pull/100
    • Clean up _destag_variable with respect to types and terminology by @jthielen in https://github.com/xarray-contrib/xwrf/pull/103
    • Changed wrfout file (cf. xwrf-data/#34) by @lpilz in https://github.com/xarray-contrib/xwrf/pull/102
    • More unit harmonization by @lpilz in https://github.com/xarray-contrib/xwrf/pull/105
    • Fixing a further coords attrs fail. by @lpilz in https://github.com/xarray-contrib/xwrf/pull/107
    • Clear c_grid_axis_shift from attrs when destaggering by @jthielen in https://github.com/xarray-contrib/xwrf/pull/106
    • Update of tutorials for v0.0.2 by @lpilz in https://github.com/xarray-contrib/xwrf/pull/89

    Full Changelog: https://github.com/xarray-contrib/xwrf/compare/v0.0.1...v0.0.2

    Source code(tar.gz)
    Source code(zip)
  • v0.0.1(Sep 9, 2022)

    This is the first packaged release of xWRF (a lightweight interface for working with the Weather Research and Forecasting (WRF) model output in xarray). Features in this release include:

    • An xwrf Dataset accessor with a postprocess method that can perform the following operations:
      • Rename dimensions to match the CF conventions.
      • Rename variables to match the CF conventions.
      • Rename variable attributes to match the CF conventions.
      • Convert units to Pint-friendly units.
      • Decode times.
      • Include projection coordinates.
      • Collapse time dimension.
    • A tutorial module with several sample datasets
    • Documentation with several examples/tutorials

    Thank you to the following contributors for their efforts towards this release!

    • @andersy005
    • @lpilz
    • @jthielen
    • @kmpaul
    • @dcherian
    • @jukent

    Full Changelog: https://github.com/xarray-contrib/xwrf/commits/v0.0.1

    Source code(tar.gz)
    Source code(zip)
Owner
National Center for Atmospheric Research
NCAR is sponsored by the National Science Foundation and managed by the University Corporation for Atmospheric Research.