A Python package for modular causal inference analysis and model evaluations

Last update: Dec 19, 2022

Overview

Causal Inference 360

A Python package for inferring causal effects from observational data.

Description

Causal inference analysis enables estimating the causal effect of an intervention on some outcome from real-world non-experimental observational data.

This package provides a suite of causal methods, under a unified scikit-learn-inspired API. It implements meta-algorithms that allow plugging in arbitrarily complex machine learning models. This modular approach supports highly-flexible causal modelling. The fit-and-predict-like API makes it possible to train on one set of examples and estimate an effect on the other (out-of-bag), which allows for a more "honest"¹ effect estimation.

The package also includes an evaluation suite. Since most causal models use machine learning models internally, we can diagnose poorly performing models by re-interpreting known ML evaluations from a causal perspective.

If you use the package, please consider citing Shimoni et al., 2019:

Reference

@article{causalevaluations,
  title={An Evaluation Toolkit to Guide Model Selection and Cohort Definition in Causal Inference},
  author={Shimoni, Yishai and Karavani, Ehud and Ravid, Sivan and Bak, Peter and Ng, Tan Hung and Alford, Sharon Hensley and Meade, Denise and Goldschmidt, Yaara},
  journal={arXiv preprint arXiv:1906.00442},
  year={2019}
}

¹ Borrowing Wager & Athey terminology of avoiding overfit.

Installation

pip install causallib

Usage

The package is imported using the name causallib. Each causal model requires an internal machine-learning model. causallib supports any model that has a sklearn-like fit-predict API (note some models might require a predict_proba implementation). For example:

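from sklearn.linear_model import LogisticRegression
from causallib.estimation import IPW
from causallib.datasets import load_nhefs

# Load the NHEFS example data: covariates X, treatment a, outcome y
data = load_nhefs()
# IPW wraps any sklearn-like classifier as its propensity model
ipw = IPW(LogisticRegression())
ipw.fit(data.X, data.a)
# One estimated population outcome per treatment value
potential_outcomes = ipw.estimate_population_outcome(data.X, data.a, data.y)
# Effect of treatment (1) relative to control (0)
effect = ipw.estimate_effect(potential_outcomes[1], potential_outcomes[0])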

Comprehensive Jupyter Notebook examples can be found in the examples directory.

Community support

We use the Slack workspace at causallib.slack.com for informal communication. We encourage you to ask questions regarding causal-inference modelling or usage of causallib that don't necessarily merit opening an issue on GitHub.

Use this invite link to join causallib on Slack.

Approach to causal-inference

Some key points on how we address causal-inference estimation

1. Emphasis on potential outcome prediction

A causal effect may be the desired outcome. However, every effect is defined by two potential (counterfactual) outcomes. We adopt this two-step approach by separating the effect-estimating step from the potential-outcome-prediction step. A beneficial consequence of this approach is that it better supports multi-treatment problems where "effect" is not well-defined.
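
A minimal sketch of this two-step separation in plain Python. It is not causallib's implementation: the potential outcomes are assumed to be predicted already (the numbers are made up), and `effect_between` is a hypothetical helper that contrasts any pair of them.

```python
# Step 1 (assumed done elsewhere): one predicted population outcome per treatment value.
# The values are illustrative, not results from any dataset.
potential_outcomes = {"control": 0.20, "drug_a": 0.35, "drug_b": 0.50}


def effect_between(outcomes, treated, reference):
    """Step 2: contrast two potential outcomes as a difference and a ratio."""
    y1, y0 = outcomes[treated], outcomes[reference]
    return {"diff": y1 - y0, "ratio": y1 / y0}


# With more than two treatments there is no single "effect";
# the user picks which pair of potential outcomes to contrast.
a_vs_control = effect_between(potential_outcomes, "drug_a", "control")
b_vs_a = effect_between(potential_outcomes, "drug_b", "drug_a")
print(a_vs_control, b_vs_a)
```

Keeping the potential outcomes as the primary result means any contrast (difference, ratio, or a pairwise comparison among several treatments) can be computed afterwards without refitting.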

2. Stratified average treatment effect

The causal inference literature devotes special attention to the population on which the effect is estimated. For example, ATE (average treatment effect on the entire sample), ATT (average treatment effect on the treated), etc. By allowing out-of-bag estimation, we leave this specification to the user. For example, ATE is achieved by model.estimate_population_outcome(X, a) and ATT is done by stratifying on the treated: model.estimate_population_outcome(X.loc[a==1], a.loc[a==1])
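
The difference between the two populations can be illustrated without the library. In this sketch, `predict_outcome` is a hypothetical, already-fitted outcome model whose treatment effect depends on the covariate, and the data are made up; the only point is that averaging over everyone and averaging over the treated give different numbers.

```python
from statistics import mean

# Toy cohort: one covariate x and a binary treatment a per individual (made-up values).
x = [0, 0, 1, 1, 1, 0]
a = [0, 1, 1, 1, 0, 0]


def predict_outcome(x_i, a_i):
    """Hypothetical already-fitted outcome model; the treatment effect grows with x."""
    return 1 + 2 * x_i + a_i * (1 + x_i)


def average_effect(xs):
    """Average difference between the two potential outcomes over the given individuals."""
    treated_mean = mean(predict_outcome(x_i, 1) for x_i in xs)
    control_mean = mean(predict_outcome(x_i, 0) for x_i in xs)
    return treated_mean - control_mean


ate = average_effect(x)  # everyone in the sample
att = average_effect([x_i for x_i, a_i in zip(x, a) if a_i == 1])  # treated only
print(ate, att)
```

Here the treated have higher values of x on average, so the effect among the treated is larger than the effect in the whole sample.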

3. Families of causal inference models

We distinguish between two types of models:

Weight models: weight the data to balance between the treatment and control groups, and then estimate the potential outcome by using a weighted average of the observed outcome. Inverse Probability of Treatment Weighting (IPW or IPTW) is the best-known example of such models.
Direct outcome models: use the covariates (features) and treatment assignment to build a model that predicts the outcome directly. The model can then be used to predict the outcome under any assignment of treatment values, specifically the potential outcome under assignment of all controls or all treated.
These models are usually known as Standardization models, and it should be noted that, currently, they are the only ones able to generate individual effect estimation (otherwise known as CATE).
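
The two families can be compared on a toy dataset in plain Python. This is a sketch under simplifying assumptions, not the package's code: per-stratum frequencies and per-stratum means stand in for the machine learning models, the weighted average is normalized by the sum of weights, and all names and values are illustrative.

```python
from collections import defaultdict

# Toy observational data: (covariate x, treatment a, outcome y); values are made up.
data = [(0, 0, 1.0), (0, 0, 3.0), (0, 1, 4.0), (0, 0, 2.0),
        (1, 1, 6.0), (1, 1, 8.0), (1, 0, 5.0), (1, 1, 7.0)]


def fit_propensity(rows):
    """Weight-model ingredient: P(a=1 | x), estimated as the treated fraction per stratum."""
    treated, total = defaultdict(int), defaultdict(int)
    for x, a, _ in rows:
        treated[x] += a
        total[x] += 1
    return {x: treated[x] / total[x] for x in total}


def ipw_outcome(rows, propensity, treatment):
    """Weighted average of observed outcomes among those who received `treatment`."""
    num = den = 0.0
    for x, a, y in rows:
        if a == treatment:
            p = propensity[x] if a == 1 else 1 - propensity[x]
            num += y / p
            den += 1 / p
    return num / den


def standardized_outcome(rows, treatment):
    """Direct outcome model: predict y for everyone under `treatment`, then average."""
    sums, counts = defaultdict(float), defaultdict(int)
    for x, a, y in rows:
        if a == treatment:
            sums[x] += y
            counts[x] += 1
    stratum_mean = {x: sums[x] / counts[x] for x in counts}
    return sum(stratum_mean[x] for x, _, _ in rows) / len(rows)


propensity = fit_propensity(data)
ipw_effect = ipw_outcome(data, propensity, 1) - ipw_outcome(data, propensity, 0)
std_effect = standardized_outcome(data, 1) - standardized_outcome(data, 0)
print(ipw_effect, std_effect)
```

With a single binary covariate both stand-in models are saturated, so the two estimates coincide; with real covariates and learned models they generally differ.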

4. Confounders and DAGs

One of the most important steps in causal inference analysis is to have proper selection on both dimensions of the data to avoid introducing bias:

On rows: thoughtfully choosing the right inclusion/exclusion criteria for individuals in the data.
On columns: thoughtfully choosing what covariates (features) act as confounders and should be included in the analysis.

This is a place where domain expert knowledge is required and cannot be fully and truly automated by algorithms. This package assumes that the data provided to the model fit the criteria. However, filtering can be applied in real-time using a scikit-learn pipeline estimator that chains preprocessing steps (that can filter rows and select columns) with a causal model at the end.
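
As a rough illustration of chaining a row filter and a column selection before a causal model, here is a plain-Python sketch. The record fields, the age criterion, and the helper names are all hypothetical; in practice these steps would live in a scikit-learn pipeline as described above.

```python
# Hypothetical cohort records; field names are illustrative only.
records = [
    {"age": 34, "smoker": 1, "weight": 70.0, "favorite_color": "red", "treated": 1},
    {"age": 17, "smoker": 0, "weight": 55.0, "favorite_color": "blue", "treated": 0},
    {"age": 52, "smoker": 1, "weight": 82.5, "favorite_color": "green", "treated": 0},
]


def select_rows(rows, criterion):
    """Rows: keep only individuals that meet the inclusion criterion."""
    return [row for row in rows if criterion(row)]


def select_columns(rows, confounders):
    """Columns: keep only the covariates chosen as confounders."""
    return [{name: row[name] for name in confounders} for row in rows]


def prepare(rows, criterion, confounders, treatment="treated"):
    """Chain both steps and return (X, a) ready to hand to a causal model."""
    cohort = select_rows(rows, criterion)
    return select_columns(cohort, confounders), [row[treatment] for row in cohort]


X, a = prepare(records, lambda row: row["age"] >= 18, ["age", "smoker", "weight"])
print(X, a)
```

Which criterion and which confounders to pass remains a domain decision; the code only applies the choice consistently.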

Comments

'module' object is not callable

I am trying to run several of the causallib examples:

%matplotlib inline
from causallib.evaluation import evaluate
import matplotlib.pyplot as plt

evaluation_results = evaluate(ipw, X, a, y)

Whenever I import evaluate from causallib.evaluation, I get the same error. How can I solve this problem?

TypeError Traceback (most recent call last)
Input In [20], in <cell line: 5>()
      2 from causallib.evaluation import evaluate
      3 import matplotlib.pyplot as plt
----> 5 evaluation_results = evaluate(ipw, X, a, y)
      6 fig, ax = plt.subplots(figsize=(6, 6))
      7 evaluation_results.plot_covariate_balance(kind="love", ax=ax)

TypeError: 'module' object is not callable
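
This error generally means that the imported name is bound to a module rather than to a function. The cause in this report is not confirmed; the v0.9.0 release notes mention an evaluation interface change, so the installed version may matter. The check below is a generic sketch that uses a standard-library module as a stand-in, and `describe` is a hypothetical helper.

```python
import inspect
import json.decoder as imported_name  # stand-in: a name that happens to be a module


def describe(obj):
    """Say whether an imported name can be called directly."""
    if inspect.ismodule(obj):
        return f"module {obj.__name__}: call a function inside it instead"
    return "callable" if callable(obj) else "not callable"


print(describe(imported_name))
try:
    imported_name()  # calling the module itself reproduces the error message
except TypeError as err:
    message = str(err)
print(message)
```

Running the same check on the name imported from the package shows whether it resolved to a function or to a submodule.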

opened by woonjeung 8

PyPi package fails to install and run

Creating a python 3.6.0 virtual environment and running pip install causallib succeeds, but then trying to import any modules fails. E.g. from causallib.estimation import IPW:

...

~/.pyenv/versions/3.6.0/envs/venv360/lib/python3.6/site-packages/pandas/core/frame.py in <module>
     86 from pandas.core.arrays.datetimelike import DatetimeLikeArrayMixin as DatetimeLikeArray
     87 from pandas.core.arrays.sparse import SparseFrameAccessor
---> 88 from pandas.core.generic import NDFrame, _shared_docs
     89 from pandas.core.index import (
     90     Index,

~/.pyenv/versions/3.6.0/envs/venv360/lib/python3.6/site-packages/pandas/core/generic.py in <module>
     69 from pandas.core.ops import _align_method_FRAME
     70 
---> 71 from pandas.io.formats.format import DataFrameFormatter, format_percentiles
     72 from pandas.io.formats.printing import pprint_thing
     73 from pandas.tseries.frequencies import to_offset

~/.pyenv/versions/3.6.0/envs/venv360/lib/python3.6/site-packages/pandas/io/formats/format.py in <module>
     45 from pandas.core.indexes.datetimes import DatetimeIndex
     46 
---> 47 from pandas.io.common import _expand_user, _stringify_path
     48 from pandas.io.formats.printing import adjoin, justify, pprint_thing
     49 

~/.pyenv/versions/3.6.0/envs/venv360/lib/python3.6/site-packages/pandas/io/common.py in <module>
      7 from http.client import HTTPException  # noqa
      8 from io import BytesIO
----> 9 import lzma
     10 import mmap
     11 import os

~/.pyenv/versions/3.6.0/lib/python3.6/lzma.py in <module>
     25 import io
     26 import os
---> 27 from _lzma import *
     28 from _lzma import _encode_filter_properties, _decode_filter_properties
     29 import _compression

ModuleNotFoundError: No module named '_lzma'

It looks like there are versioning issues with pandas.
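
The last frame of the traceback is Python's own lzma.py failing to import the `_lzma` extension, which usually points to an interpreter built without lzma support rather than to pandas itself. A quick standard-library check, with a hypothetical helper name:

```python
import importlib


def can_import(module_name):
    """Return True if the interpreter can import the named module."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


# False for "lzma" suggests the interpreter was built without the lzma extension.
print({name: can_import(name) for name in ("lzma", "bz2", "zlib")})
```

If `lzma` is reported as unavailable, the same import error would appear with any package that imports it, independent of causallib.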

opened by BoltzmannBrain 6

Issues with Categorical Data

I'm working on survey data, trying to figure out the cause-and-effect relationship between a person's responses to the survey questions and their final preference towards the product. The data is entirely categorical. When using Causal Inference 360's evaluation plots, I got the following error:

/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:106: UserWarning: metric precision could not be evaluated
  warnings.warn(f"metric {metric_name} could not be evaluated")
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:107: UserWarning: Target is multiclass but average='binary'. Please choose another average setting, one of [None, 'micro', 'macro', 'weighted'].
  warnings.warn(str(v))
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:106: UserWarning: metric recall could not be evaluated
  warnings.warn(f"metric {metric_name} could not be evaluated")
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:107: UserWarning: Target is multiclass but average='binary'. Please choose another average setting, one of [None, 'micro', 'macro', 'weighted'].
  warnings.warn(str(v))
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:106: UserWarning: metric f1 could not be evaluated
  warnings.warn(f"metric {metric_name} could not be evaluated")
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:107: UserWarning: Target is multiclass but average='binary'. Please choose another average setting, one of [None, 'micro', 'macro', 'weighted'].
  warnings.warn(str(v))
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:106: UserWarning: metric roc_auc could not be evaluated
  warnings.warn(f"metric {metric_name} could not be evaluated")
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:107: UserWarning: multi_class must be in ('ovo', 'ovr')
  warnings.warn(str(v))
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:106: UserWarning: metric avg_precision could not be evaluated
  warnings.warn(f"metric {metric_name} could not be evaluated")
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:107: UserWarning: multiclass format is not supported
  warnings.warn(str(v))
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:106: UserWarning: metric hinge could not be evaluated
  warnings.warn(f"metric {metric_name} could not be evaluated")
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:107: UserWarning: The shape of pred_decision cannot be 1d arraywith a multiclass target. pred_decision shape must be (n_samples, n_classes), that is (1977, 3). Got: (1977,)
  warnings.warn(str(v))
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:106: UserWarning: metric brier could not be evaluated
  warnings.warn(f"metric {metric_name} could not be evaluated")
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:107: UserWarning: Only binary classification is supported. The type of the target is multiclass.
  warnings.warn(str(v))
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:106: UserWarning: metric roc_curve could not be evaluated
  warnings.warn(f"metric {metric_name} could not be evaluated")
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:107: UserWarning: multiclass format is not supported
  warnings.warn(str(v))
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:106: UserWarning: metric pr_curve could not be evaluated
  warnings.warn(f"metric {metric_name} could not be evaluated")
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:107: UserWarning: multiclass format is not supported
  warnings.warn(str(v))

KeyError Traceback (most recent call last)
in
      3
      4 eval_results = evaluate(ipw, X, a, y)
----> 5 eval_results.plot_all()
      6 eval_results.plot_covariate_balance(kind="love");

8 frames
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/plots/mixins.py in plot_all(self, phase)
    343 """
    344 phases_to_plot = self.predictions.keys() if phase is None else [phase]
--> 345 multipanel_plot = {
    346 plotted_phase: self._make_multipanel_evaluation_plot(
    347 plot_names=self.all_plot_names, phase=plotted_phase

/usr/local/lib/python3.8/dist-packages/causallib/evaluation/plots/mixins.py in (.0)
    344 phases_to_plot = self.predictions.keys() if phase is None else [phase]
    345 multipanel_plot = {
--> 346 plotted_phase: self._make_multipanel_evaluation_plot(
    347 plot_names=self.all_plot_names, phase=plotted_phase
    348 )

/usr/local/lib/python3.8/dist-packages/causallib/evaluation/plots/mixins.py in _make_multipanel_evaluation_plot(self, plot_names, phase)
    353 def _make_multipanel_evaluation_plot(self, plot_names, phase):
    354 phase_fig, phase_axes = plots.get_subplots(len(plot_names))
--> 355 named_axes = {
    356 name: self._make_single_panel_evaluation_plot(name, phase, ax)
    357 for name, ax in zip(plot_names, phase_axes.ravel())

/usr/local/lib/python3.8/dist-packages/causallib/evaluation/plots/mixins.py in (.0)
    354 phase_fig, phase_axes = plots.get_subplots(len(plot_names))
    355 named_axes = {
--> 356 name: self._make_single_panel_evaluation_plot(name, phase, ax)
    357 for name, ax in zip(plot_names, phase_axes.ravel())
    358 }

/usr/local/lib/python3.8/dist-packages/causallib/evaluation/plots/mixins.py in _make_single_panel_evaluation_plot(self, plot_name, phase, ax, **kwargs)
    379 plot_func = plots.lookup_name(plot_name)
    380 plot_data = self.get_data_for_plot(plot_name, phase=phase)
--> 381 return plot_func(*plot_data, ax=ax, **kwargs)

/usr/local/lib/python3.8/dist-packages/causallib/evaluation/plots/plots.py in plot_mean_features_imbalance_love_folds(table1_folds, cv, aggregate_folds, thresh, plot_semi_grid, ax)
    813 aggregated_table1 = aggregated_table1.groupby(aggregated_table1.index)
    814
--> 815 order = aggregated_table1.mean().sort_values(by="unweighted", ascending=True).index
    816
    817 if aggregate_folds:

/usr/local/lib/python3.8/dist-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
    309 stacklevel=stacklevel,
    310 )
--> 311 return func(*args, **kwargs)
    312
    313 return wrapper

/usr/local/lib/python3.8/dist-packages/pandas/core/frame.py in sort_values(self, by, axis, ascending, inplace, kind, na_position, ignore_index, key)
   6257
   6258 by = by[0]
-> 6259 k = self._get_label_or_level_values(by, axis=axis)
   6260
   6261 # need to rewrap column in Series to apply key function

/usr/local/lib/python3.8/dist-packages/pandas/core/generic.py in _get_label_or_level_values(self, key, axis)
   1777 values = self.axes[axis].get_level_values(key)._values
   1778 else:
-> 1779 raise KeyError(key)
   1780
   1781 # Check for duplicates

KeyError: 'unweighted'

Can you help me out please?

opened by jgdpsingh 3
Error in Example "causal_simulator.ipynb"
Hello and thank you for your amazing work! I stumbled upon a tiny mistake in one of the examples in causallib/examples/causal_simulator.ipynb

You'll find the following first code cell:

import pandas as pd
from causallib.datasets import load_nhefs
from causallib.simulation import CausalSimulator
from causallib.simulation import generate_random_topology

This does not work, since the simulator is now part of the datasets module.

To make it work again, change the imports accordingly:

import pandas as pd
from causallib.datasets import load_nhefs
from causallib.datasets import CausalSimulator
from causallib.datasets import generate_random_topology

Expected behavior: import library and run cell

Observed behavior: ImportError: cannot import name 'CausalSimulator' from 'causallib.simulation' (/opt/conda/lib/python3.10/site-packages/causallib/simulation/__init__.py)

Thanks for your time and efforts.
opened by GMGassner 2
Matching more neighbors than the number of examples in the treatment group

Hello,

If I have 10 examples in my "treated" group and 1000 in the "control" group, Is it possible to do one-side matching (match control to treatment) of more than 10 neighbors (e.g. 50)?

I tried using the "matching_mode" argument in causallib.estimation.Matching for one-directional matching, but still got the error "Expected n_neighbors <= n_samples" when using matcher.match.

Thank you!

opened by michalshap 2
ImportError: cannot import name 'PropensityEvaluator' from 'causallib.estimation'
On Google Colab, after installing causallib with pip, I try to import PropensityEvaluator, but it fails.

!pip install causallib
from causallib.estimation import IPW, PropensityEvaluator

Collecting causallib
Downloading causallib-0.7.1-py3-none-any.whl (2.1 MB)
|████████████████████████████████| 2.1 MB 5.1 MB/s
Requirement already satisfied: numpy<2,>=1.13 in /usr/local/lib/python3.7/dist-packages (from causallib) (1.19.5)
Requirement already satisfied: matplotlib<4,>=2.2 in /usr/local/lib/python3.7/dist-packages (from causallib) (3.2.2)
Requirement already satisfied: scipy<2,>=0.19 in /usr/local/lib/python3.7/dist-packages (from causallib) (1.4.1)
Requirement already satisfied: networkx<3,>=1.1 in /usr/local/lib/python3.7/dist-packages (from causallib) (2.6.3)
Requirement already satisfied: statsmodels<1,>=0.8 in /usr/local/lib/python3.7/dist-packages (from causallib) (0.10.2)
Requirement already satisfied: pandas<2,>=0.25.2 in /usr/local/lib/python3.7/dist-packages (from causallib) (1.1.5)
Requirement already satisfied: scikit-learn<2,>=0.20 in /usr/local/lib/python3.7/dist-packages (from causallib) (1.0.1)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib<4,>=2.2->causallib) (3.0.6)
Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib<4,>=2.2->causallib) (2.8.2)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib<4,>=2.2->causallib) (1.3.2)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.7/dist-packages (from matplotlib<4,>=2.2->causallib) (0.11.0)
Requirement already satisfied: pytz>=2017.2 in /usr/local/lib/python3.7/dist-packages (from pandas<2,>=0.25.2->causallib) (2018.9)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.1->matplotlib<4,>=2.2->causallib) (1.15.0)
Requirement already satisfied: joblib>=0.11 in /usr/local/lib/python3.7/dist-packages (from scikit-learn<2,>=0.20->causallib) (1.1.0)
Requirement already satisfied: threadpoolctl>=2.0.0 in /usr/local/lib/python3.7/dist-packages (from scikit-learn<2,>=0.20->causallib) (3.0.0)
Requirement already satisfied: patsy>=0.4.0 in /usr/local/lib/python3.7/dist-packages (from statsmodels<1,>=0.8->causallib) (0.5.2)
Installing collected packages: causallib
Successfully installed causallib-0.7.1

ImportError: cannot import name 'PropensityEvaluator' from 'causallib.estimation' (/usr/local/lib/python3.7/dist-packages/causallib/estimation/__init__.py)

I can't understand why this happens. In my local machine, I can run this snippet without problems.

Google Colab uses Python 3.7.12
opened by GiacomoPinardi 2
ModuleNotFoundError: No module named 'causallib.contrib.hemm'

Having this error while running this line in hemm_demo

ModuleNotFoundError Traceback (most recent call last)
in
----> 1 from causallib.contrib.hemm.gen_synthetic_data import gen_montecarlo
      2
      3 syn_data = gen_montecarlo(5000, 2, 100)

ModuleNotFoundError: No module named 'causallib.contrib.hemm'

Any help is appreciated.

opened by Mawul4j 2
Misstatement in Standardization example

In the text after cell 7, the example describes the model as a Logistic Regression model, whereas I believe the model used is the sklearn LinearRegression() model.

Source: https://github.com/IBM/causallib/blob/master/examples/standardization.ipynb

opened by bantucaravan 2
Covariate imbalance scatterplot

Adds a new covariate imbalance plot. The scatter plot is more suitable than the classical Love plot when there are many covariates.

Implemented by @edenjenzohar

opened by ehudkr 1
Est pop patch
Confirm the logic w/ @ehudkr.

i.e. individual outcome averaged to get the pop outcome as a Default Case for now.

Backlog: Support for multiple treatment strategies. Like a list of "always_treated" and "never_treated".
opened by JulinaM 1
limit n_neighbors to n_samples before matching

Closes https://github.com/IBM/causallib/issues/37 .

Because of how matching is executed based on sklearn, both directions are matched even if only one is requested. The filtering by direction happens when the results are reported, not at match time. Therefore, as reported in the linked issue, one direction can have enough neighbors to match while the other does not, and then neither direction works.

The proposed solution is to enable one direction of matching even when the other has too few samples. In reality it still executes the matching in both directions, but in the problematic direction n_neighbors is limited to n_samples and a warning is printed to the screen.

A failing unit test that reproduces the problem and is fixed by this code is added in this PR as well.
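
The capping logic described in this PR can be sketched as follows. This is an illustration of the idea only, not the code merged into causallib; `limit_n_neighbors` is a hypothetical helper, and the numbers mirror the 10 treated / 1000 controls scenario from the linked issue.

```python
import warnings


def limit_n_neighbors(n_neighbors, n_samples):
    """Cap the requested neighbor count at the number of available samples."""
    if n_neighbors > n_samples:
        warnings.warn(
            f"n_neighbors={n_neighbors} exceeds n_samples={n_samples}; using {n_samples}"
        )
        return n_samples
    return n_neighbors


# 10 treated and 1000 controls, asking for 50 neighbors in each direction.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    from_treated_pool = limit_n_neighbors(50, n_samples=10)    # capped, with a warning
    from_control_pool = limit_n_neighbors(50, n_samples=1000)  # unchanged
print(from_treated_pool, from_control_pool, len(caught))
```

Only the direction that draws neighbors from the small group is capped, so the direction the user asked for can still return the full 50 neighbors.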

opened by mmdanziger 1
Fix support for scikit-learn>=1.2.0 and Numpy=1.24.0
Scikit-learn version 1.2.0 enforces two API changes that currently break tests.

LinearRegression no longer supports the normalize keyword argument, which some of the tests use. The fix should be fairly simple in principle: replace LinearRegression with a Pipeline object that has a StandardScaler preprocessing step.

Scikit-learn now enforces strict column name restrictions. First, all column names must be of the same type; second, column names should match between fit and predict. This might require a broader solution. The first part will require a "safe join" that is aware of column-name types and replaces every place where we join the covariates X with the treatment assignment a. The second part requires validating that column names are consistent and preserved when new data is input, which is mostly relevant to the time-pooled survival models, where a time range is artificially created and placed as a predictor.
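
A rough sketch of what the two checks could look like, operating on column names only. Both helper names are hypothetical and this is not the fix adopted in causallib; converting every name to `str` is one possible way to obtain a single name type.

```python
def safe_join_names(covariate_names, treatment_name):
    """Return column names for [X, a] with one consistent type (str)."""
    names = [str(name) for name in covariate_names] + [str(treatment_name)]
    if len(set(names)) != len(names):
        raise ValueError(f"duplicate column names after conversion: {names}")
    return names


def check_same_columns(fit_names, predict_names):
    """Raise if the columns seen at predict time differ from those seen at fit time."""
    if list(fit_names) != list(predict_names):
        raise ValueError(f"expected columns {list(fit_names)}, got {list(predict_names)}")


# Mixed int/str names, as can happen when a treatment column is joined to X.
fit_names = safe_join_names([0, 1, "age"], "a")
check_same_columns(fit_names, safe_join_names([0, 1, "age"], "a"))
print(fit_names)
```

The duplicate check matters because an integer name such as 0 and a string name "0" collide once both are converted.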

A more minor exception was also raised with Numpy v1.24.0: generating calibration plots calls matplotlib's fill_between, which fails with TypeError: ufunc 'isfinite' not supported for the input types. We need to dig deeper into whether that is a causallib problem (providing bad fill values) or some external matplotlib-numpy mismatch.

In the meantime, PR https://github.com/BiomedSciAI/causallib/pull/50 limited the allowed dependency versions.
opened by ehudkr 0
The parameter 'std' keeps decreasing

Thanks for your amazing work! I tested some data with the HEMM method and the result of the subgroup prediction is abnormal. On further evaluation, I found that the parameter 'std' of the Gaussian distribution keeps decreasing and falls below zero, while it is supposed to converge to a positive value. What's the cause of this phenomenon and how can I fix it? Is this about parameter initialization?
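
One general technique for keeping a learned standard deviation positive is to optimize an unconstrained parameter and pass it through a positive mapping such as softplus. The sketch below shows only that mapping; it is not taken from the HEMM code, and whether HEMM already constrains `std` in some way is not stated here.

```python
import math


def softplus(raw):
    """Map any real number to a strictly positive one."""
    # log(1 + exp(raw)), written to stay stable for large |raw|.
    return max(raw, 0.0) + math.log1p(math.exp(-abs(raw)))


# The optimizer updates `raw_std` freely; the model only ever sees softplus(raw_std).
raw_values = [-5.0, -1.0, 0.0, 2.0]
stds = [softplus(raw) for raw in raw_values]
print(stds)
```

With this parameterization the optimizer can push the raw value arbitrarily low without the standard deviation ever crossing zero.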

opened by R2Bb1T 4
weight matrix in IPW calculation can have inf weights due to division by zero

weight_matrix = probabilities.rdiv(1.0) statement in this file can return inf weights if some entries in "probabilities" series are zero. Maybe there could be some way to ignore the corresponding inf weights while applying weight_matrix later on.
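
One way to avoid infinite weights is to clip the probabilities away from 0 and 1 before inverting them. The v0.8.0 release notes mention `clip_min` and `clip_max` parameters for IPW truncation; the sketch below borrows those names but is a plain-Python illustration with assumed default values, not the library's implementation.

```python
def inverse_probability_weights(probabilities, clip_min=0.01, clip_max=0.99):
    """Invert probabilities after clipping them away from 0 and 1."""
    clipped = [min(max(p, clip_min), clip_max) for p in probabilities]
    return [1.0 / p for p in clipped]


# Made-up propensities; the first would give an infinite weight if inverted as-is.
probabilities = [0.0, 0.2, 0.5, 1.0]
weights = inverse_probability_weights(probabilities)
print(weights)
```

Clipping trades a small bias for bounded weights; dropping the affected rows instead would change the population on which the effect is estimated.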

opened by glotglutton 2

Releases(v0.9.1)

v0.9.1(Nov 24, 2022)
Release v0.9.1 https://pypi.org/project/causallib/0.9.1/

What's Changed

Model selection within weight-based survival models by @ehudkr in https://github.com/IBM/causallib/pull/47

Full Changelog: https://github.com/IBM/causallib/compare/v0.9.0...v0.9.1
v0.9.0(Sep 29, 2022)
Release v0.9.0 https://pypi.org/project/causallib/0.9.0/

Main changes

Two main additions on the model evaluations front.

We refactored the whole evaluation module, changing the API to be a lot more user friendly, with options to customize the generated plots.

We added a whole suite of causal-oriented metrics and scorers that integrate with scikit-learn's model selection machinery (like GridSearchCV, or any other scikit-learn compatible hyperparameter search model) and allow performing model selection in cross validation.

What's Changed

limit n_neighbors to n_samples before matching by @mmdanziger in https://github.com/IBM/causallib/pull/38

Evaluation refactoring and interface change by @mmdanziger in https://github.com/IBM/causallib/pull/40

Covariate imbalance scatterplot by @edenjenzohar in https://github.com/IBM/causallib/pull/43

Causal model selection by @ehudkr in https://github.com/IBM/causallib/pull/45

New Contributors

@edenjenzohar made their first contribution in https://github.com/IBM/causallib/pull/43

Full Changelog: https://github.com/IBM/causallib/compare/v0.8.2...v0.9.0
v0.8.2(May 24, 2022)
Release v0.8.2 https://pypi.org/project/causallib/0.8.2/

What's Changed

PropensityFeatureStandardization deepcopy fix by @mmdanziger in https://github.com/IBM/causallib/pull/35

Full Changelog: https://github.com/IBM/causallib/compare/v0.8.1...v0.8.2
v0.8.1(Apr 6, 2022)
Release v0.8.1 https://pypi.org/project/causallib/0.8.1/

What's Changed

Fix argument misalignment when passing custom metric to OutcomeEvaluator by @yoavkt in https://github.com/IBM/causallib/pull/33

Full Changelog: https://github.com/IBM/causallib/compare/v0.8.0...v0.8.1
v0.8.0(Feb 8, 2022)
Release v0.8.0 https://pypi.org/project/causallib/0.8.0/

What's Added:

Causal survival models by @liorness in https://github.com/IBM/causallib/pull/25

Confounder selection module by @ehudkr and @onkarbhardwaj in https://github.com/IBM/causallib/pull/22

Targeted Maximum Likelihood Estimator (TMLE) by @ehudkr in https://github.com/IBM/causallib/pull/26

Augmented Inverse Probability Weighting (AIPW) by @ehudkr in https://github.com/IBM/causallib/pull/30

Multiple types of propensity-based features in doubly robust models by @ehudkr in https://github.com/IBM/causallib/pull/28 and https://github.com/IBM/causallib/pull/30

R-learner by @Itaymanes in https://github.com/IBM/causallib/pull/24

X-learner by @yoavkt in https://github.com/IBM/causallib/pull/31

Verbosity control in IPW truncation by @liranszlak in https://github.com/IBM/causallib/pull/27

Backward compatibility-breaking changes

Doubly robust models have been renamed by @ehudkr in https://github.com/IBM/causallib/pull/28 and https://github.com/IBM/causallib/pull/30

DoublyRobustIpFeature to PropensityFeatureStandardization

DoublyRobustJoffe to WeightedStandardization

DoublyRobustVanilla to AIPW

Asymmetric propensity truncation in IPW by @liranszlak in https://github.com/IBM/causallib/pull/27

Moving from a single symmetric truncation (truncate_eps) to a two-parameter asymmetric truncation (clip_min, clip_max)

New Contributors

@onkarbhardwaj made their first contribution in https://github.com/IBM/causallib/pull/22

@Itaymanes made their first contribution in https://github.com/IBM/causallib/pull/24

@liorness made their first contribution in https://github.com/IBM/causallib/pull/25

@liranszlak made their first contribution in https://github.com/IBM/causallib/pull/27

@yoavkt made their first contribution in https://github.com/IBM/causallib/pull/31

Full Changelog: https://github.com/IBM/causallib/compare/v0.7.1...v0.8.0
v0.7.1(Oct 5, 2021)
Release v0.7.1 https://pypi.org/project/causallib/0.7.1/

Changes:

Basic unit testing for plots

Bug fixes for plotting propensity distribution with non-integer treatment encoding

v0.7.0(Aug 26, 2021)
Release v0.7.0 https://pypi.org/project/causallib/0.7.0/

Changes:

New models:

Matching (estimator and preprocessing transformer)

Overlap Weights

HEMM

Weight models now have same fit() API as outcome models

Updated dependency

Dropped seaborn

pandas at 0.25

scikit-learn at 0.25

Additional fixes and maintenance

v0.6.0(Feb 13, 2020)
Release v0.6.0 https://pypi.org/project/causallib/0.6.0/

Changes:

datasets module with toy datasets for causal analysis

NHEFS data from Hernan & Robins' book

Simulation benchmark data from the ACIC 2016 data challenge

contrib module for new state-of-the-art outside contributions

Adversarial Balancing model

New implementation for MarginalOutcomeEstimator (formerly UncorrectedEstimator) using WeightEstimator API

Additional Jupyter Notebook examples

NHEFS (Healthcare data)

Lalonde (Economic data)

Additional bug fix and documentation

v0.5.0-beta(Jul 12, 2019)

Release v0.5.0-beta https://pypi.org/project/causallib/0.5.0b0/

Owner

International Business Machines

GitHub Repository

Display the behaviour of a realtime program with a scope or logic analyser.

1. A monitor for realtime MicroPython code This library provides a means of examining the behaviour of a running system. It was initially designed to

17 Dec 05, 2022

Template for a Dataflow Flex Template in Python

Dataflow Flex Template in Python This repository contains a template for a Dataflow Flex Template written in Python that can easily be used to build D

5 Apr 28, 2022

a tool that compiles a csv of all h1 program stats

h1stats - h1 Program Stats Scraper This python3 script will call out to HackerOne's graphql API and scrape all currently active programs for informati

40 Oct 27, 2022

Udacity-api-reporting-pipeline - Udacity api reporting pipeline

udacity-api-reporting-pipeline In this exercise, you'll use portions of each of

1 Feb 15, 2022

Binance Kline Data With Python

Binance Kline Data by seunghan(gingerthorp) reference https://github.com/binance/binance-public-data/ All intervals are supported: 1m, 3m, 5m, 15m, 30

5 Jul 13, 2022

COVID-19 deaths statistics around the world

COVID-19-Deaths-Dataset COVID-19 deaths statistics around the world This is a daily updated dataset of COVID-19 deaths around the world. The dataset c

4 Jul 10, 2022

First and foremost, we want dbt documentation to retain a DRY principle. Every time we repeat ourselves, we waste our time. Second, we want to understand column level lineage and automate impact analysis.

dbt-osmosis First and foremost, we want dbt documentation to retain a DRY principle. Every time we repeat ourselves, we waste our time. Second, we wan

150 Jan 06, 2023

Statistical Rethinking course winter 2022

Statistical Rethinking (2022 Edition) Instructor: Richard McElreath Lectures: Uploaded Playlist and pre-recorded, two per week Discussion: Online, F

3.9k Dec 31, 2022

BAyesian Model-Building Interface (Bambi) in Python.

Bambi BAyesian Model-Building Interface in Python Overview Bambi is a high-level Bayesian model-building interface written in Python. It's built on to

861 Dec 29, 2022

Karate Club: An API Oriented Open-source Python Framework for Unsupervised Learning on Graphs (CIKM 2020)

Karate Club is an unsupervised machine learning extension library for NetworkX. Please look at the Documentation, relevant Paper, Promo Video, and Ext

1.8k Jan 09, 2023

DataPrep — The easiest way to prepare data in Python

1.5k Dec 27, 2022

Powerful, efficient particle trajectory analysis in scientific Python.

freud Overview The freud Python library provides a simple, flexible, powerful set of tools for analyzing trajectories obtained from molecular dynamics

195 Dec 20, 2022

PLStream: A Framework for Fast Polarity Labelling of Massive Data Streams

PLStream: A Framework for Fast Polarity Labelling of Massive Data Streams Motivation When dataset freshness is critical, the annotating of high speed

4 Aug 02, 2022

Conduits - A Declarative Pipelining Tool For Pandas

Conduits - A Declarative Pipelining Tool For Pandas Traditional tools for declaring pipelines in Python suck. They are mostly imperative, and can some

7 Nov 21, 2021

Kennedy Institute of Rheumatology University of Oxford Project November 2019

TradingBot6M Kennedy Institute of Rheumatology University of Oxford Project November 2019 Run Change api.txt to binance api key: https://www.binance.c

2 Nov 16, 2021

Hatchet is a Python-based library that allows Pandas dataframes to be indexed by structured tree and graph data.

Hatchet Hatchet is a Python-based library that allows Pandas dataframes to be indexed by structured tree and graph data. It is intended for analyzing

14 Aug 19, 2022

Implementation in Python of the reliability measures such as Omega.

reliabiliPy Summary Simple implementation in Python of the [reliability](https://en.wikipedia.org/wiki/Reliability_(statistics)) measures for surveys:

2 Apr 27, 2022

This repo contains a simple but effective tool made using python which can be used for quality control in statistical approach.

📈 Statistical Quality Control 📉 This repo contains a simple but effective tool made using python which can be used for quality control in statistica

8 Oct 18, 2022

📊 Python Flask game that consolidates data from Nasdaq, allowing the user to practice buying and selling stocks.

Web Trader Web Trader is a trading website that consolidates data from Nasdaq, allowing the user to search up the ticker symbol and price of any stock

21 Aug 30, 2022

MidTerm Project for the Data Analysis FT Bootcamp, Adam Tycner and Florent ZAHOUI

MidTerm Project for the Data Analysis FT Bootcamp, Adam Tycner and Florent ZAHOUI Hallo

1 Feb 07, 2022

Releases (v0.9.1)

v0.9.1 (Nov 24, 2022)

v0.9.0 (Sep 29, 2022)

v0.8.2 (May 24, 2022)

v0.8.1 (Apr 6, 2022)

v0.8.0 (Feb 8, 2022)

v0.7.1 (Oct 5, 2021)

v0.7.0 (Aug 26, 2021)

v0.6.0 (Feb 13, 2020)

v0.5.0-beta (Jul 12, 2019)

Owner

International Business Machines