Metadata-Version: 2.4
Name: composeml
Version: 0.10.1
Summary: a framework for automated prediction engineering
Author-email: "Alteryx, Inc." <open_source_support@alteryx.com>
Maintainer-email: "Alteryx, Inc." <open_source_support@alteryx.com>
License: BSD 3-clause
Project-URL: Documentation, https://compose.alteryx.com
Project-URL: Source Code, https://github.com/alteryx/compose/
Project-URL: Changes, https://compose.alteryx.com/en/latest/release_notes.html
Project-URL: Issue Tracker, https://github.com/alteryx/compose/issues
Project-URL: Twitter, https://twitter.com/alteryxoss
Project-URL: Chat, https://join.slack.com/t/alteryx-oss/shared_invite/zt-182tyvuxv-NzIn6eiCEf8TBziuKp0bNA
Keywords: prediction engineering,data science,machine learning
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development
Classifier: Topic :: Scientific/Engineering
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Operating System :: OS Independent
Requires-Python: <4,>=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pandas>=1.3.0
Requires-Dist: tqdm>=4.32.0
Requires-Dist: matplotlib>=3.3.3
Requires-Dist: seaborn>=0.11.0
Provides-Extra: test
Requires-Dist: pip>=21.3.1; extra == "test"
Requires-Dist: pytest-cov==3.0.0; extra == "test"
Requires-Dist: pytest-xdist>=2.5.0; extra == "test"
Requires-Dist: wheel>=0.33.1; extra == "test"
Requires-Dist: featuretools>=1.4.0; extra == "test"
Requires-Dist: woodwork>=0.11.0; extra == "test"
Requires-Dist: pyarrow>=3.0.0; extra == "test"
Provides-Extra: updater
Requires-Dist: alteryx-open-src-update-checker>=2.1.0; extra == "updater"
Provides-Extra: docs
Requires-Dist: evalml>=0.45.0; extra == "docs"
Provides-Extra: dev
Requires-Dist: codecov==2.1.12; extra == "dev"
Requires-Dist: flake8==4.0.1; extra == "dev"
Requires-Dist: isort==5.9.3; extra == "dev"
Requires-Dist: black==22.10.0; extra == "dev"
Requires-Dist: nbsphinx==0.8.7; extra == "dev"
Requires-Dist: pydata-sphinx-theme==0.7.1; extra == "dev"
Requires-Dist: Sphinx==4.2.0; extra == "dev"
Requires-Dist: sphinx-inline-tabs==2022.1.2b11; extra == "dev"
Requires-Dist: sphinx-copybutton==0.4.0; extra == "dev"
Requires-Dist: myst-parser==0.16.1; extra == "dev"
Requires-Dist: nbconvert==6.4.5; extra == "dev"
Requires-Dist: ipython==7.31.1; extra == "dev"
Requires-Dist: pygments==2.10.0; extra == "dev"
Requires-Dist: jupyter==1.0.0; extra == "dev"
Requires-Dist: pandoc==1.1.0; extra == "dev"
Requires-Dist: ipykernel==6.4.2; extra == "dev"
Requires-Dist: scikit-learn!=0.22,<1.2.0,>=0.20.0; extra == "dev"
Provides-Extra: complete
Requires-Dist: composeml[updater]; extra == "complete"
Dynamic: license-file

<p align="center"><img width=50% src="https://raw.githubusercontent.com/alteryx/compose/main/docs/source/images/compose.png" alt="Compose" /></p>
<p align="center"><i>"Build better training examples in a fraction of the time."</i></p>
<p align="center">
    <a href="https://github.com/alteryx/compose/actions?query=workflow%3ATests" target="_blank">
        <img src="https://github.com/alteryx/compose/workflows/Tests/badge.svg" alt="Tests" />
    </a>
    <a href="https://codecov.io/gh/alteryx/compose">
        <img src="https://codecov.io/gh/alteryx/compose/branch/main/graph/badge.svg?token=mDz4ueTUEO"/>
    </a>
    <a href="https://compose.alteryx.com/en/stable/?badge=stable" target="_blank">
        <img src="https://readthedocs.com/projects/feature-labs-inc-compose/badge/?version=stable&token=5c3ace685cdb6e10eb67828a4dc74d09b20bb842980c8ee9eb4e9ed168d05b00"
            alt="ReadTheDocs" />
    </a>
    <a href="https://badge.fury.io/py/composeml" target="_blank">
        <img src="https://badge.fury.io/py/composeml.svg?maxAge=2592000" alt="PyPI Version" />
    </a>
    <a href="https://stackoverflow.com/questions/tagged/compose-ml" target="_blank">
        <img src="https://img.shields.io/badge/questions-on_stackoverflow-blue.svg?" alt="StackOverflow" />
    </a>
    <a href="https://pepy.tech/project/composeml" target="_blank">
        <img src="https://pepy.tech/badge/composeml/month" alt="PyPI Downloads" />
    </a>
</p>
<hr>

[Compose](https://compose.alteryx.com) is a machine learning tool for automated prediction engineering. It allows you to structure prediction problems and generate labels for supervised learning. An end user defines an outcome of interest by writing a *labeling function*, then runs a search to automatically extract training examples from historical data. The resulting labels are then passed to [Featuretools](https://docs.featuretools.com/) for automated feature engineering and subsequently to [EvalML](https://evalml.alteryx.com/) for automated machine learning. The workflow of an applied machine learning engineer then becomes:

<br><p align="center"><img width=90% src="https://raw.githubusercontent.com/alteryx/compose/main/docs/source/images/workflow.png" alt="Compose" /></p><br>

By automating the early stage of the machine learning pipeline, our end user can easily define a task and solve it. See the [documentation](https://compose.alteryx.com) for more information.

## Installation
Install with pip:

```
python -m pip install composeml
```

or from the conda-forge channel with [conda](https://anaconda.org/conda-forge/composeml):

```
conda install -c conda-forge composeml
```

### Add-ons

**Update checker** - Receive automatic notifications of new Compose releases

```
python -m pip install "composeml[updater]"
```

## Example
> Will a customer spend more than 300 in the next hour of transactions?

In this example, we automatically generate new training examples from a historical dataset of transactions.

```python
import composeml as cp
df = cp.demos.load_transactions()
df = df[df.columns[:7]]
df.head()
```

<table border="0" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th>transaction_id</th>
      <th>session_id</th>
      <th>transaction_time</th>
      <th>product_id</th>
      <th>amount</th>
      <th>customer_id</th>
      <th>device</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>298</td>
      <td>1</td>
      <td>2014-01-01 00:00:00</td>
      <td>5</td>
      <td>127.64</td>
      <td>2</td>
      <td>desktop</td>
    </tr>
    <tr>
      <td>10</td>
      <td>1</td>
      <td>2014-01-01 00:09:45</td>
      <td>5</td>
      <td>57.39</td>
      <td>2</td>
      <td>desktop</td>
    </tr>
    <tr>
      <td>495</td>
      <td>1</td>
      <td>2014-01-01 00:14:05</td>
      <td>5</td>
      <td>69.45</td>
      <td>2</td>
      <td>desktop</td>
    </tr>
    <tr>
      <td>460</td>
      <td>10</td>
      <td>2014-01-01 02:33:50</td>
      <td>5</td>
      <td>123.19</td>
      <td>2</td>
      <td>tablet</td>
    </tr>
    <tr>
      <td>302</td>
      <td>10</td>
      <td>2014-01-01 02:37:05</td>
      <td>5</td>
      <td>64.47</td>
      <td>2</td>
      <td>tablet</td>
    </tr>
  </tbody>
</table>

First, we represent the prediction problem with a labeling function and a label maker.

```python
def total_spent(ds):
    return ds['amount'].sum()

label_maker = cp.LabelMaker(
    target_dataframe_index="customer_id",
    time_index="transaction_time",
    labeling_function=total_spent,
    window_size="1h",
)
```
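To make the labeling function concrete, the sketch below applies `total_spent` by hand to a single one-hour window of one customer's transactions, using the first three rows of the table above. It relies only on pandas. The `start`, `end`, and `window` names and the manual slicing are illustrative assumptions, not a description of how Compose builds its data slices internally.

```python
import pandas as pd


def total_spent(ds):
    # Same labeling function as above: sum of the amounts in the slice.
    return ds["amount"].sum()


# First three rows of the demo table above (customer 2, session 1).
df = pd.DataFrame(
    {
        "transaction_time": pd.to_datetime(
            ["2014-01-01 00:00:00", "2014-01-01 00:09:45", "2014-01-01 00:14:05"]
        ),
        "amount": [127.64, 57.39, 69.45],
        "customer_id": [2, 2, 2],
    }
)

# Hand-built one-hour window starting at the first transaction (illustrative only).
start = pd.Timestamp("2014-01-01 00:00:00")
end = start + pd.Timedelta("1h")
window = df[(df["transaction_time"] >= start) & (df["transaction_time"] < end)]

label = total_spent(window)
print(round(label, 2))
```

The labeling function only ever sees the rows inside the window, so any aggregation that works on a DataFrame slice can serve as a label definition.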

Then, we run a search to automatically generate the training examples.

```python
label_times = label_maker.search(
    df.sort_values('transaction_time'),
    num_examples_per_instance=2,
    minimum_data='2014-01-01',
    drop_empty=False,
    verbose=False,
)

label_times = label_times.threshold(300)
label_times.head()
```

<table border="0" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th>customer_id</th>
      <th>time</th>
      <th>total_spent</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>1</td>
      <td>2014-01-01 00:00:00</td>
      <td>True</td>
    </tr>
    <tr>
      <td>1</td>
      <td>2014-01-01 01:00:00</td>
      <td>True</td>
    </tr>
    <tr>
      <td>2</td>
      <td>2014-01-01 00:00:00</td>
      <td>False</td>
    </tr>
    <tr>
      <td>2</td>
      <td>2014-01-01 01:00:00</td>
      <td>False</td>
    </tr>
    <tr>
      <td>3</td>
      <td>2014-01-01 00:00:00</td>
      <td>False</td>
    </tr>
  </tbody>
</table>
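The search and thresholding steps can be approximated in plain pandas to show the idea: cut each customer's history into one-hour windows, apply the labeling function to each window, then turn the totals into boolean labels. This is a rough sketch on made-up data, not Compose's implementation. It assumes that `threshold(300)` means strictly greater than 300, as the question above suggests, it uses clock-aligned windows for simplicity, and it ignores options such as `num_examples_per_instance` and `drop_empty`.

```python
import pandas as pd

# Made-up transactions for two hypothetical customers.
df = pd.DataFrame(
    {
        "customer_id": [1, 1, 1, 2, 2],
        "transaction_time": pd.to_datetime(
            [
                "2014-01-01 00:10:00",
                "2014-01-01 00:40:00",
                "2014-01-01 01:20:00",
                "2014-01-01 00:05:00",
                "2014-01-01 01:30:00",
            ]
        ),
        "amount": [200.0, 150.0, 80.0, 120.0, 310.0],
    }
)

# Sum the amount per customer within clock-aligned one-hour windows.
totals = (
    df.groupby(["customer_id", pd.Grouper(key="transaction_time", freq="1h")])["amount"]
    .sum()
    .rename("total_spent")
    .reset_index()
    .rename(columns={"transaction_time": "time"})
)

# Binary label: did the customer spend more than 300 in the window?
labels = totals.assign(total_spent=totals["total_spent"].gt(300))
print(labels)
```

Each row of `labels` pairs a customer and a window start time with a boolean outcome, which is the same shape as the label times shown in the table above.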

We now have labels that are ready to use in [Featuretools](https://docs.featuretools.com/) to generate features.

## Support

The Innovation Labs open source community is happy to provide support to users of Compose. Project support can be found in four places depending on the type of question:

1. For usage questions, use [Stack Overflow](https://stackoverflow.com/questions/tagged/compose-ml) with the `compose-ml` tag.
2. For bugs, issues, or feature requests, start a GitHub [issue](https://github.com/alteryx/compose/issues/new).
3. For discussion regarding development on the core library, use [Slack](https://join.slack.com/t/alteryx-oss/shared_invite/zt-182tyvuxv-NzIn6eiCEf8TBziuKp0bNA).
4. For everything else, the core developers can be reached by email at open_source_support@alteryx.com.

## Citing Compose
Compose is built upon a newly defined part of the machine learning process: prediction engineering. If you use Compose, please consider citing this paper:

James Max Kanter, Owen Gillespie, Kalyan Veeramachaneni. [Label, Segment, Featurize: a cross domain framework for prediction engineering.](https://dai.lids.mit.edu/wp-content/uploads/2017/10/Pred_eng1.pdf) IEEE DSAA 2016.

BibTeX entry:

```bibtex
@inproceedings{kanter2016label,
  title={Label, segment, featurize: a cross domain framework for prediction engineering},
  author={Kanter, James Max and Gillespie, Owen and Veeramachaneni, Kalyan},
  booktitle={2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA)},
  pages={430--439},
  year={2016},
  organization={IEEE}
}
```

## Acknowledgements

The open source development has been supported in part by DARPA's Data-Driven Discovery of Models (D3M) program.

## Alteryx

**Compose** is an open source project maintained by [Alteryx](https://www.alteryx.com). We developed Compose to enable flexible definition of the machine learning task. To see the other open source projects we’re working on, visit [Alteryx Open Source](https://www.alteryx.com/open-source). If building impactful data science pipelines is important to you or your business, please get in touch.

<p align="center">
  <a href="https://www.alteryx.com/open-source">
    <img src="https://alteryx-oss-web-images.s3.amazonaws.com/OpenSource_Logo-01.png" alt="Alteryx Open Source" width="800"/>
  </a>
</p>
