Merge pull request #11 from CyberAgentAILab/feat/get-started-doc
Update get-started guide
TomeHirata authored Jul 12, 2024
2 parents c820ee4 + f16e23c commit 3f3dcbc
Showing 15 changed files with 284 additions and 253 deletions.
56 changes: 56 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,56 @@

# Contribution
Thank you for considering contributing to this project! Here are some guidelines to help you get started.

## How can I contribute to this project?
### Reporting Bugs
If you find a bug, please report it by opening an issue in the issue tracker. Provide as much detail as possible to help us understand and reproduce the issue:
- A clear and descriptive title.
- A detailed description of the problem.
- Steps to reproduce the issue.
- Any error messages or screenshots.

### Suggesting Enhancements
We welcome suggestions for improvements! To suggest an enhancement:
- Check the issue tracker to see if someone else has already suggested it.
- If not, open a new issue and describe your idea clearly.
- Explain why you believe the enhancement would be beneficial.

### Pull Requests
Pull requests are welcome! If you plan to make significant changes, please open an issue first to discuss your idea. This helps us ensure that your contribution fits with the project's direction. Follow these steps for a smooth pull request process:

- Fork the repository.
- Clone your fork to your local machine.
- Create a new branch: `git checkout -b my-feature-branch`.
- Make your changes.
- Commit your changes: `git commit -m 'Add some feature'`.
- Push to the branch: `git push origin my-feature-branch`.
- Open a pull request in the original repository.

## Development
Here are the basic commands you can use to develop this package.

### Install Pipenv
If you don't have `pipenv` installed, you can install it using `pip`:

```sh
pip install pipenv
```

### Linting
We use `ruff` for linting the code. To run the linter, use the following command:
```sh
pipenv run lint
```

### Auto format
We use `ruff` for formatting the code. To run the formatter, use the following command:
```sh
pipenv run format
```

### Unit test
We use `unittest` for testing the code. To run the unit tests, use the following command:
```sh
pipenv run unittest
```
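
For illustration, a test module that `unittest` can discover might look like the sketch below. The helper `empirical_cdf` and the class `TestEmpiricalCdf` are hypothetical names chosen for this example and are not part of this package.

```python
import unittest
from bisect import bisect_right


def empirical_cdf(samples, location):
    """Hypothetical helper: fraction of samples less than or equal to location."""
    ordered = sorted(samples)
    # bisect_right returns how many ordered values are <= location
    return bisect_right(ordered, location) / len(ordered)


class TestEmpiricalCdf(unittest.TestCase):
    def test_known_values(self):
        samples = [3.0, 1.0, 2.0, 4.0]
        self.assertEqual(empirical_cdf(samples, 0.0), 0.0)
        self.assertEqual(empirical_cdf(samples, 2.0), 0.5)
        self.assertEqual(empirical_cdf(samples, 4.0), 1.0)
```

A module of this shape can also be run directly with `python -m unittest`.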
36 changes: 11 additions & 25 deletions README.md
@@ -1,6 +1,6 @@
## Overview

This an a Python package for building the regression adjusted distribution function estimator proposed in "Estimating Distributional Treatment Effects in Randomized Experiments: Machine Learning for Variance Reduction".
This is a Python package for building the regression adjusted distribution function estimator proposed in "Estimating Distributional Treatment Effects in Randomized Experiments: Machine Learning for Variance Reduction". For details of this package, see [the documentation](https://cyberagentailab.github.io/python-dte-adjustment/).

## Installation

@@ -17,29 +17,15 @@ This an a Python package for building the regression adjusted distribution funct
pip install -e .
```

## Basic Usage
Examples of how to use this package are available in [this Get-started Guide](https://cyberagentailab.github.io/python-dte-adjustment/get_started.html).

## Development
We welcome contributions to the project! Please review our [Contribution Guide](CONTRIBUTING.md) for details on how to get started.

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

### Install Pipenv
If you don't have `pipenv` installed, you can install it using `pip`:
```sh
pip install pipenv
```
### Linting
We use `ruff` for linting the code. To run the linter, use the following command:
```sh
pipenv run lint
```
### Auto format
We use `ruff` for formatting the code. To run the formatter, use the following command:
```sh
pipenv run format
```
### Unit test
We use `unittest` for testing the code. To run the unit tests, use the following command:
```sh
pipenv run unittest
```
## Maintainers
- [Tomu Hirata](https://github.com/TomeHirata)
Binary file modified docs/source/_static/dte_empirical.png
Binary file modified docs/source/_static/dte_moment.png
Binary file modified docs/source/_static/dte_simple.png
Binary file modified docs/source/_static/dte_uniform.png
Binary file modified docs/source/_static/pte_simple.png
Binary file modified docs/source/_static/qte.png
4 changes: 4 additions & 0 deletions docs/source/contributing.rst
@@ -0,0 +1,4 @@
Contribution Guide
==================

For details on how to contribute to this package, please refer to https://github.com/CyberAgentAILab/python-dte-adjustment/CONTRIBUTING.md.
22 changes: 11 additions & 11 deletions docs/source/get_started.rst
@@ -43,11 +43,11 @@ Generate data for training cumulative distribution function:
quadratic_term = np.dot(X**2, gamma)
# Outcome equation
Y = D + linear_term + quadratic_term + U
Y = 5 * D + linear_term + quadratic_term + U
return X, D, Y
n = 100 # Sample size
n = 1000 # Sample size
X, D, Y = generate_data(n)
Then, let's build an empirical cumulative distribution function (CDF).
@@ -63,13 +63,14 @@ Distributional treatment effect (DTE) can be computed easily in the following co

.. code-block:: python
dte, lower_bound, upper_bound = estimator.predict_dte(target_treatment_arm=1, control_treatment_arm=0, locations=np.sort(Y), variance_type="simple")
locations = np.linspace(Y.min(), Y.max(), 20)
dte, lower_bound, upper_bound = estimator.predict_dte(target_treatment_arm=1, control_treatment_arm=0, locations=locations, variance_type="simple")
A convenience function is available to visualize distribution effects. This method can be used for other distribution parameters including Probability Treatment Effect (PTE) and Quantile Treatment Effect (QTE).

.. code-block:: python
plot(np.sort(Y), dte, lower_bound, upper_bound, title="DTE of simple estimator")
plot(locations, dte, lower_bound, upper_bound, title="DTE of simple estimator")
.. image:: _static/dte_empirical.png
:alt: DTE of empirical estimator
@@ -92,8 +93,8 @@ DTE can be computed and visualized in the following code.

.. code-block:: python
dte, lower_bound, upper_bound = estimator.predict_dte(target_treatment_arm=1, control_treatment_arm=0, locations=np.sort(Y), variance_type="simple")
plot(np.sort(Y), dte, lower_bound, upper_bound, title="DTE of adjusted estimator with simple confidence band")
dte, lower_bound, upper_bound = estimator.predict_dte(target_treatment_arm=1, control_treatment_arm=0, locations=locations, variance_type="simple")
plot(locations, dte, lower_bound, upper_bound, title="DTE of adjusted estimator with simple confidence band")
.. image:: _static/dte_simple.png
:alt: DTE of adjusted estimator with simple confidence band
@@ -105,8 +106,8 @@ Confidence bands can be computed in different ways. In the following code, we us

.. code-block:: python
dte, lower_bound, upper_bound = estimator.predict_dte(target_treatment_arm=1, control_treatment_arm=0, locations=np.sort(Y), variance_type="moment")
plot(np.sort(Y), dte, lower_bound, upper_bound, title="DTE of adjusted estimator with moment confidence band")
dte, lower_bound, upper_bound = estimator.predict_dte(target_treatment_arm=1, control_treatment_arm=0, locations=locations, variance_type="moment")
plot(locations, dte, lower_bound, upper_bound, title="DTE of adjusted estimator with moment confidence band")
.. image:: _static/dte_moment.png
:alt: DTE of adjusted estimator with moment confidence band
@@ -118,8 +119,8 @@ Also, an uniform confidence band is used when "uniform" is specified for the "va

.. code-block:: python
dte, lower_bound, upper_bound = estimator.predict_dte(target_treatment_arm=1, control_treatment_arm=0, locations=np.sort(Y), variance_type="uniform")
plot(np.sort(Y), dte, lower_bound, upper_bound, title="DTE of adjusted estimator with uniform confidence band")
dte, lower_bound, upper_bound = estimator.predict_dte(target_treatment_arm=1, control_treatment_arm=0, locations=locations, variance_type="uniform")
plot(locations, dte, lower_bound, upper_bound, title="DTE of adjusted estimator with uniform confidence band")
.. image:: _static/dte_uniform.png
:alt: DTE of adjusted estimator with uniform confidence band
@@ -131,7 +132,6 @@ To compute PTE, we can use "predict_pte" method.

.. code-block:: python
locations = np.linspace(Y.min(), Y.max(), 20)
pte, lower_bound, upper_bound = estimator.predict_pte(target_treatment_arm=1, control_treatment_arm=0, width=1, locations=locations, variance_type="simple")
plot(locations, pte, lower_bound, upper_bound, chart_type="bar", title="PTE of adjusted estimator with simple confidence band")
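
The switch from `np.sort(Y)` to a shared `locations` grid can be illustrated with the minimal sketch below. It uses randomly generated stand-in outcomes rather than the guide's `generate_data`, and the variable names `per_sample_locations` and `grid_locations` are chosen only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
Y = rng.normal(size=1000)  # stand-in outcomes, not the guide's generate_data

# Before: one evaluation location per observation
per_sample_locations = np.sort(Y)

# After: a fixed grid of 20 evenly spaced locations spanning the outcome range
grid_locations = np.linspace(Y.min(), Y.max(), 20)

print(per_sample_locations.shape)  # (1000,)
print(grid_locations.shape)        # (20,)
```

With a fixed grid, the number of evaluation points no longer grows with the sample size, and the same `locations` array can be reused for every `predict_dte`, `predict_pte` and `plot` call.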
15 changes: 11 additions & 4 deletions docs/source/index.rst
@@ -3,20 +3,27 @@
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
dte_adj Documentation
dte_adj
===================================

This is a Python package for building the regression adjusted distribution function estimator proposed in "Estimating Distributional Treatment Effects in Randomized Experiments: Machine Learning for Variance Reduction".

.. toctree::
:maxdepth: 2
:maxdepth: 1
:caption: Contents:

installation
get_started
modules
contributing

Indices and tables
==================
~~~~~~~~~~~~~~~~~~

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
* :ref:`search`

License
~~~~~~~
MIT License
12 changes: 12 additions & 0 deletions docs/source/installation.rst
@@ -3,13 +3,25 @@ Installation Guide

This package can be installed either through PyPI or source code.

Requirement
~~~~~~~~~~~

This package requires Python 3.6 or higher.


Install from PyPI
~~~~~~~~~~~~~~~~~

To install the package from PyPI, use the following command.

.. code-block:: bash
pip install dte_adj
Install from source code
~~~~~~~~~~~~~~~~~~~~~~~~

To install the package from the source code, use the following commands.

.. code-block:: bash
53 changes: 30 additions & 23 deletions dte_adj/__init__.py
@@ -282,25 +282,32 @@ def _compute_qtes(
outcomes: np.array,
) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
"""Compute expected QTEs."""
treatment_cumulative, _ = self._compute_cumulative_distribution(
np.full(outcomes.shape, target_treatment_arm),
outcomes,
confoundings,
treatment_arms,
outcomes,
)
control_cumulative, _ = self._compute_cumulative_distribution(
np.full(outcomes.shape, control_treatment_arm),
outcomes,
confoundings,
treatment_arms,
outcomes,
)
locations = np.sort(outcomes)

def find_quantile(quantile, arm):
low, high = 0, locations.shape[0] - 1
result = -1
while low <= high:
mid = (low + high) // 2
val, _ = self._compute_cumulative_distribution(
np.full((1), arm),
np.full((1), locations[mid]),
confoundings,
treatment_arms,
outcomes,
)
if val[0] <= quantile:
result = locations[mid]
low = mid + 1
else:
high = mid - 1
return result

result = np.zeros(quantiles.shape)
for i, q in enumerate(quantiles):
treatment_idx = find_le(treatment_cumulative, q)
control_idx = find_le(control_cumulative, q)
result[i] = outcomes[treatment_idx] - outcomes[control_idx]
result[i] = find_quantile(q, target_treatment_arm) - find_quantile(
q, control_treatment_arm
)

return result
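
The binary search in `find_quantile` can be tried in isolation with the sketch below. It is a simplified stand-in, not the package's implementation: `empirical_cdf_at` is a hypothetical replacement for `self._compute_cumulative_distribution`, and it ignores treatment arms and confoundings.

```python
import numpy as np


def empirical_cdf_at(samples: np.ndarray, location: float) -> float:
    """Stand-in for the estimator's CDF: fraction of samples <= location."""
    return float(np.mean(samples <= location))


def find_quantile(samples: np.ndarray, quantile: float) -> float:
    """Largest observed value whose CDF does not exceed `quantile` (-1.0 if none)."""
    locations = np.sort(samples)
    low, high = 0, locations.shape[0] - 1
    result = -1.0
    while low <= high:
        mid = (low + high) // 2
        if empirical_cdf_at(samples, locations[mid]) <= quantile:
            # The CDF is still at or below the target, so move right
            result = float(locations[mid])
            low = mid + 1
        else:
            high = mid - 1
    return result


samples = np.array([5.0, 1.0, 4.0, 2.0, 3.0])
print(find_quantile(samples, 0.5))  # 2.0
```

Because the CDF is non-decreasing in the location, each quantile needs only about log2(n) single-point CDF evaluations. In this sketch, as in the diff, -1 is returned when no location qualifies, which cannot be told apart from a genuine outcome of -1.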

@@ -415,15 +422,15 @@ def _compute_cumulative_distribution(
d_confounding = {}
d_outcome = {}
n_obs = outcomes.shape[0]
n_loc = outcomes.shape[0]
n_loc = locations.shape[0]
for arm in unique_treatment_arm:
selected_confounding = confoundings[treatment_arms == arm]
selected_outcome = outcomes[treatment_arms == arm]
sorted_indices = np.argsort(selected_outcome)
d_confounding[arm] = selected_confounding[sorted_indices]
d_outcome[arm] = selected_outcome[sorted_indices]
cumulative_distribution = np.zeros(outcomes.shape)
for i, (outcome, arm) in enumerate(zip(outcomes, target_treatment_arms)):
cumulative_distribution = np.zeros(locations.shape)
for i, (outcome, arm) in enumerate(zip(locations, target_treatment_arms)):
cumulative_distribution[i] = (
find_le(d_outcome[arm], outcome) + 1
) / d_outcome[arm].shape[0]
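
The effect of sizing the result by `locations` rather than `outcomes` can be seen in the sketch below. It is an illustration under assumptions, not the package's code: it handles a single target arm, and it uses `np.searchsorted` in place of `find_le`, on the assumption that `find_le` returns the index of the last element less than or equal to the given value (or -1 if there is none).

```python
import numpy as np


def empirical_cdf(outcomes, treatment_arms, target_arm, locations):
    """Fraction of outcomes in `target_arm` that are <= each location."""
    sorted_outcome = np.sort(outcomes[treatment_arms == target_arm])
    # searchsorted(side="right") counts the values <= location,
    # standing in for the diff's `find_le(...) + 1` (assumed semantics).
    counts = np.searchsorted(sorted_outcome, locations, side="right")
    return counts / sorted_outcome.shape[0]


outcomes = np.array([1.0, 2.0, 3.0, 4.0, 10.0, 20.0])
treatment_arms = np.array([0, 0, 0, 0, 1, 1])
locations = np.array([0.0, 2.5, 4.0])

cdf = empirical_cdf(outcomes, treatment_arms, 0, locations)
print(cdf.shape)  # (3,)
```

The output has one entry per location, so it stays well-defined when the number of locations differs from the number of observations.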
@@ -518,10 +525,10 @@ def _compute_cumulative_distribution(
np.ndarray: Estimated cumulative distribution values.
"""
n_obs = outcomes.shape[0]
n_loc = outcomes.shape[0]
cumulative_distribution = np.zeros(outcomes.shape)
n_loc = locations.shape[0]
cumulative_distribution = np.zeros(locations.shape)
superset_prediction = np.zeros((n_obs, n_loc))
for i, (location, arm) in enumerate(zip(outcomes, target_treatment_arms)):
for i, (location, arm) in enumerate(zip(locations, target_treatment_arms)):
confounding_in_arm = confoundings[treatment_arms == arm]
outcome_in_arm = outcomes[treatment_arms == arm]
subset_prediction = np.zeros(outcome_in_arm.shape[0])