Metadata-Version: 2.1
Name: autora-experimentalist-sampler-novelty
Version: 1.0.1
Summary: AutoRA Novelty Experimentalist
Author-email: Sebastian Musslick <sebastian@musslick.de>
License: MIT License
Project-URL: homepage, http://www.empiricalresearch.ai
Project-URL: repository, https://github.com/AutoResearch/autora-experimentalist-sampler-novelty
Project-URL: documentation, https://autoresearch.github.io/autora/
Requires-Python: <4,>=3.8.10
Description-Content-Type: text/markdown
Provides-Extra: dev

# AutoRA Novelty Sampler

The novelty sampler identifies experimental conditions $\vec{x}' \in X'$ with respect to
a pairwise distance metric applied to existing experimental conditions $\vec{x} \in X$:

$$
\underset{\vec{x}'}{\arg\max}~f(d(\vec{x}, \vec{x}'))
$$

where $f$ is an integration function applied to all pairwise  distances.

## Example

For instance,
the integration function $f(x)=\min(x)$ and distance function $d(x, x')=|x-x'|$ identifies
condition $\vec{x}'$ with the greatest minimal Euclidean distance to all
existing conditions in $\vec{x} \in X$.

$$
\underset{\vec{x}}{\arg\max}~\min_i(\sum_{j=1}^n(x_{i,j} - x_{i,j}')^2)
$$

To illustrate this sampling strategy, consider the following four experimental conditions that
were already probed:


| $x_{i,0}$ | $x_{i,1}$ | $x_{i,2}$ |
|-----------|-----------|-----------|
| 0         | 0         | 0         |
| 1         | 0         | 0         |
| 0         | 1         | 0         |
| 0         | 0         | 1         |

Fruthermore, let's consider the following three candidate conditions $X'$:

| $x_{i,0}'$ | $x_{i,1}'$ | $x_{i,2}'$ |
|------------|------------|------------|
| 1          | 1          | 1          |
| 2          | 2          | 2          |
| 3          | 3          | 3          |


If the novelty sampler is tasked to identify two novel conditions, it will select
the last two candidate conditions $x'_{1,j}$ and $x'_{2,j}$ because they have the greatest
minimal distance to all existing conditions $x_{i,j}$:

### Example Code
```python
import numpy as np
from autora.experimentalist.sampler.novelty import novelty_sampler, novelty_score_sampler

# Specify X and X'
X = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
X_prime = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]])

# Here, we choose to identify two novel conditions
n = 2
X_sampled = novelty_sampler(condition_pool=X_prime, reference_conditions=X, num_samples=n)

# We may also obtain samples along with their z-scored novelty scores
(X_sampled, scores) = novelty_score_sampler(condition_pool=X_prime, reference_conditions=X, num_samples=n)
```




