Metadata-Version: 2.1
Name: hestia-earth-distribution
Version: 0.0.11
Summary: Hestia's Distribution library
Home-page: https://gitlab.com/hestia-earth/hestia-distribution
Author: Hestia Team
Author-email: guillaume@hestia.earth
License: MIT
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Programming Language :: Python :: 3.6
Requires-Python: >=3
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: hestia-earth.schema (>=14.0.0)
Requires-Dist: hestia-earth.utils (>=0.10.0)
Requires-Dist: numpy
Requires-Dist: pandas (>=1.2.0)
Provides-Extra: stats
Requires-Dist: pymc (==4.4.0) ; extra == 'stats'

# Hestia Data Utils

Utility library for manipulating distributions on the Hestia platform.

## Install

1. `pip install hestia_earth.distribution`
2. Optional: to generate distribution files, install [pymc 4](https://pypi.org/project/pymc/), for example via the `stats` extra: `pip install "hestia_earth.distribution[stats]"`.

## Usage

By default, all output files are stored in the _./data_ folder.
Set the `DISTRIBUTION_DATA_FOLDER` environment variable to store them in a different folder.
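
As a minimal sketch, the variable can also be set from Python. This assumes the library reads `DISTRIBUTION_DATA_FOLDER` from the process environment no earlier than import time, so the variable is set before any `hestia_earth.distribution` import. The folder name below is an arbitrary example.

```python
import os
import tempfile
from pathlib import Path

# Arbitrary example location; any writable folder works.
data_folder = Path(tempfile.gettempdir()) / "hestia-distribution-data"
data_folder.mkdir(parents=True, exist_ok=True)

# Assumption: set this before importing hestia_earth.distribution,
# in case the library reads the variable at import time.
os.environ["DISTRIBUTION_DATA_FOLDER"] = str(data_folder)
```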

To get the posterior distribution:
```python
from hestia_earth.distribution.posterior_yield import get_post_ensemble, get_post

# To get the ensemble of mu and sd values for a single posterior distribution:
mu_ensemble, sd_ensemble = get_post_ensemble('GADM-GBR', 'wheatGrain')

# Or, if only interested in the mean of the mu and sd values:
mu, sd = get_post('GADM-GBR', 'wheatGrain')
```
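
The returned `mu` and `sd` can then be used like any mean and standard deviation. The sketch below is an illustration only: it assumes both are plain numbers that parametrise a normal distribution, which this README does not state, and `sample_from_posterior` is a hypothetical helper, not part of the library.

```python
import numpy as np


def sample_from_posterior(mu, sd, size=1000, seed=0):
    """Draw samples from Normal(mu, sd). Hypothetical helper, not a library function."""
    rng = np.random.default_rng(seed)
    return rng.normal(loc=mu, scale=sd, size=size)


# Usage with the values returned by get_post (requires the library and its data):
# mu, sd = get_post('GADM-GBR', 'wheatGrain')
# samples = sample_from_posterior(mu, sd)
# low, high = np.percentile(samples, [2.5, 97.5])
```

Fixing the seed keeps the draws reproducible between runs; pass a different `seed` to get an independent sample.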

## Advanced Usage

You can clone this repository to use the commands below.

### Generate prior distribution

To generate the yield prior file for all products:
```
python generate_prior_yield.py --overwrite
```

For more information, run `python generate_prior_yield.py --help`.

### Generate likelihood data

In order to generate likelihood data (a spreadsheet of crop yield and fertiliser data) for a specific product and a specific country, run:
```
python generate_likelihood.py --product-id="wheatGrain" --country-id="GADM-GBR" --limit=1000
```

For more information, run `python generate_likelihood.py --help`.

### Generate posterior distribution

To generate the posterior distribution (for Bayesian statistics) for a specific country, run:
```
python generate_posterior_yield.py --country-id="GADM-GBR"
```

Or, to generate the posterior distribution of fertiliser usage:
```
python generate_posterior_fert.py --country-id="GADM-GBR"
```

Or, to generate the posterior distribution of pesticide usage:
```
python generate_posterior_pest.py --country-id="GADM-GBR"
```

Note: all the commands above update the same CSV file, so they must not be run **at the same time**.
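
One way to respect this constraint is to run the scripts strictly one after another from a single driver script. The sketch below assumes it is run from the repository root, where the three scripts live. `build_commands` and `run_sequentially` are hypothetical helper names, not part of the repository.

```python
import subprocess
import sys

# The three posterior scripts listed above.
SCRIPTS = [
    "generate_posterior_yield.py",
    "generate_posterior_fert.py",
    "generate_posterior_pest.py",
]


def build_commands(country_id):
    """Return one command per script, using the current Python interpreter."""
    return [[sys.executable, script, f"--country-id={country_id}"] for script in SCRIPTS]


def run_sequentially(commands):
    """Run each command to completion before starting the next one."""
    for command in commands:
        # check=True raises CalledProcessError and stops at the first failing script.
        subprocess.run(command, check=True)


# Usage, from the repository root:
# run_sequentially(build_commands("GADM-GBR"))
```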

### Plotting

#### Prior Yield

To plot the prior distribution for a given product and country:

```
python plot_prior_yield.py --country-id='GADM-GBR' --product-id='wheatGrain' --output-file='prior.png'
```

To plot FAO annual yield data, set the `--type` parameter to one of four options: `fao_per_country`, `fao_per_product`, `fao_per_country_per_product`, `world_mu_signma`. Example:
```
python plot_prior_yield.py --country-id='GADM-GBR' --output-file='fao-yield-gbr-allProducts.png' --type='fao_per_country'
```

For more information, run `python plot_prior_yield.py --help`.

#### Cycle Yield

To plot the bivariate distribution of yield data for [Wheat, grain](https://hestia.earth/term/wheatGrain) in [United Kingdom](https://hestia.earth/term/GADM-GBR):

```
python plot_cycle_yield.py --product-id="wheatGrain" --country-id="GADM-GBR" --limit=100
```

This will take a sample size of `100` and create a `result.png` file with the distribution.

For more information, run `python plot_cycle_yield.py --help`.

#### Posterior Yield

In order to plot the posterior distribution for a specific product and a specific country, run:
```
python plot_posterior_yield.py --country-id="GADM-GBR" --product-id="wheatGrain" --output-file="post.png"
```

For more information, run `python plot_posterior_yield.py --help`.
