Metadata-Version: 2.1
Name: quickplotter
Version: 1.0
Summary: Instantly generate common EDA plots without cleaning your DataFrame
Home-page: https://github.com/jlehrer1/InstantEDA
Author: Julian Lehrer
Author-email: julianmlehrer@gmail.com
License: MIT
Keywords: VISUALIZATION,PYTHON,DATA SCIENCE
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Build Tools
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Description-Content-Type: text/markdown
Requires-Dist: numpy
Requires-Dist: pandas
Requires-Dist: plotly
Requires-Dist: scikit-learn
Requires-Dist: sklearn-pandas

# Instant EDA
Instantly generate common exploratory data plots without having to worry about cleaning your data.

Note: For the most up-to-date documentation, visit the GitHub [repo](https://github.com/jlehrer1/InstantEDA).

Description: The `quickplotter` module generates common exploratory data plots without requiring you to clean your DataFrame or pre-analyze your data first. The plots can also be exported as `.png` or `.jpeg` files for use in reports and papers.

## 1. Basic Usage:
```python3
import pandas as pd
import quickplotter

df = pd.read_csv('data.csv')  # placeholder path; substitute your own data

plotter = quickplotter.QuickPlotter(df)  # creates a QuickPlotter object with the given DataFrame

plotter.common(subset=['correlation', 'percent_nan'])  # plots correlation between features and the percent of NaN values in each column

plotter.distribution(column_subset=df.columns[0:4])  # plots distributions for the first four columns in the DataFrame

plotter.common(column_subset=['body_mass_index', 'blood_type'])  # plots the common plots for the given columns
```

## 2. Fundamentals

If `NaN` values make up `<= 5%` of all values in the DataFrame, the rows containing them are dropped and the plots are generated without them. **Remember, this is meant to be a quick and dirty tool for exploration, not one that treats each data entry delicately.**
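
The 5% rule can be sketched in plain pandas. This is only an illustration of the idea, not the library's actual code: `drop_rows_if_few_nans` is a hypothetical helper, and leaving the DataFrame untouched above the threshold is an assumption, since that case is not described here.

```python3
import numpy as np
import pandas as pd


def drop_rows_if_few_nans(df, threshold=0.05):
    # Hypothetical helper for illustration; not part of quickplotter's API.
    if df.size == 0:
        return df
    # Fraction of all cells (rows x columns) that are NaN.
    nan_fraction = df.isna().to_numpy().sum() / df.size
    if nan_fraction <= threshold:
        # Few enough NaN values: drop every row that contains one.
        return df.dropna()
    # Assumption: above the threshold the data is returned unchanged.
    return df


# 1 NaN out of 40 values (2.5%), so the affected row is dropped.
mostly_clean = pd.DataFrame({"x": np.arange(20.0), "y": np.arange(20.0)})
mostly_clean.loc[3, "y"] = np.nan

# 2 NaN out of 4 values (50%), so nothing is dropped.
mostly_missing = pd.DataFrame({"x": [1.0, np.nan, np.nan, 4.0]})

cleaned = drop_rows_if_few_nans(mostly_clean)
untouched = drop_rows_if_few_nans(mostly_missing)
print(len(cleaned), len(untouched))  # 19 4
```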

### subset & diff lists
The `quickplotter` module works mainly with two kinds of specification, `subset` and `diff`.

For any `subset`-like list, the items in the list will be used. For any `diff`-like list, all items *except* those in the list will be used. 

The options are as follows:
- `subset`: Use only the plots specified in the list
- `diff`: Use all plots *except* those specified in the list
- `subset_columns`: Use only the columns specified in the list. Columns can be given either as a `df.columns` slice or by name.
- `diff_columns`: Use all columns *except* those specified in the list. Columns can be given either as a `df.columns` slice or by name.
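
The `subset`/`diff` semantics above amount to an include list versus an exclude list. The sketch below shows one way such a selection could be resolved. `resolve_selection` is a hypothetical helper rather than a `quickplotter` function, the `age` column is made up for the example, and rejecting both lists at once is an assumption.

```python3
def resolve_selection(available, subset=None, diff=None):
    # Hypothetical helper for illustration; not part of quickplotter's API.
    if subset is not None and diff is not None:
        # Assumption: the two lists are treated as mutually exclusive.
        raise ValueError("pass either subset or diff, not both")
    if subset is not None:
        wanted = set(subset)
        # subset-like: keep only the listed items, in their original order.
        return [item for item in available if item in wanted]
    if diff is not None:
        unwanted = set(diff)
        # diff-like: keep everything except the listed items.
        return [item for item in available if item not in unwanted]
    # Neither list given: use everything.
    return list(available)


columns = ["age", "body_mass_index", "blood_type"]  # "age" is a made-up column

print(resolve_selection(columns, subset=["blood_type", "age"]))  # ['age', 'blood_type']
print(resolve_selection(columns, diff=["age"]))  # ['body_mass_index', 'blood_type']
```

The same logic applies whether the items are plot names or column names.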

## 3. Contributing

If you have read this far, I hope you've found this tool useful. I am always looking to learn more and develop as a collaborative programmer, so if you have any ideas or contributions, feel free to open a feature request or pull request.







