Metadata-Version: 2.1
Name: records-mover
Version: 0.4.0
Summary: Library and CLI to move relational data from one place to another - DBs/CSV/gsheets/dataframes/...
Home-page: UNKNOWN
Author: Vince Broz
Author-email: opensource@bluelabs.com
License: Apache Software License
Download-URL: https://github.com/bluelabsio/records-mover/tarball/0.4.0
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Topic :: Database :: Front-Ends
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Description-Content-Type: text/markdown
Requires-Dist: timeout-decorator
Requires-Dist: PyYAML (<5.3)
Requires-Dist: db-facts (<4,>=3)
Requires-Dist: chardet
Requires-Dist: tenacity (<7,>=6)
Provides-Extra: airflow
Requires-Dist: apache-airflow (<2,>=1.10) ; extra == 'airflow'
Provides-Extra: aws
Requires-Dist: awscli (<2,>=1) ; extra == 'aws'
Requires-Dist: boto (<3,>=2) ; extra == 'aws'
Requires-Dist: boto3 ; extra == 'aws'
Requires-Dist: smart-open (<1.9.0,>=1.8.4) ; extra == 'aws'
Requires-Dist: s3-concat (<0.2,>=0.1.7) ; extra == 'aws'
Provides-Extra: bigquery
Requires-Dist: pybigquery ; extra == 'bigquery'
Requires-Dist: sqlalchemy (!=1.3.16) ; extra == 'bigquery'
Provides-Extra: cli
Requires-Dist: odictliteral ; extra == 'cli'
Requires-Dist: jsonschema ; extra == 'cli'
Requires-Dist: typing-inspect ; extra == 'cli'
Requires-Dist: docstring-parser ; extra == 'cli'
Provides-Extra: db
Requires-Dist: sqlalchemy (!=1.3.16) ; extra == 'db'
Provides-Extra: gsheets
Requires-Dist: google ; extra == 'gsheets'
Requires-Dist: google-auth-httplib2 ; extra == 'gsheets'
Requires-Dist: oauth2client (<2.1.0,>=2.0.2) ; extra == 'gsheets'
Requires-Dist: PyOpenSSL ; extra == 'gsheets'
Requires-Dist: google-api-python-client (<1.6.0,>=1.5.0) ; extra == 'gsheets'
Provides-Extra: itest
Requires-Dist: jsonschema ; extra == 'itest'
Requires-Dist: google-api-python-client (<1.6.0,>=1.5.0) ; extra == 'itest'
Provides-Extra: literally_every_single_database_binary
Requires-Dist: sqlalchemy-vertica-python (<0.6,>=0.5.5) ; extra == 'literally_every_single_database_binary'
Requires-Dist: sqlalchemy (!=1.3.16) ; extra == 'literally_every_single_database_binary'
Requires-Dist: psycopg2-binary ; extra == 'literally_every_single_database_binary'
Requires-Dist: sqlalchemy-redshift (>=0.7.7) ; extra == 'literally_every_single_database_binary'
Requires-Dist: awscli (<2,>=1) ; extra == 'literally_every_single_database_binary'
Requires-Dist: boto (<3,>=2) ; extra == 'literally_every_single_database_binary'
Requires-Dist: boto3 ; extra == 'literally_every_single_database_binary'
Requires-Dist: smart-open (<1.9.0,>=1.8.4) ; extra == 'literally_every_single_database_binary'
Requires-Dist: s3-concat (<0.2,>=0.1.7) ; extra == 'literally_every_single_database_binary'
Requires-Dist: pybigquery ; extra == 'literally_every_single_database_binary'
Requires-Dist: mysqlclient ; extra == 'literally_every_single_database_binary'
Provides-Extra: mysql
Requires-Dist: mysqlclient ; extra == 'mysql'
Requires-Dist: sqlalchemy (!=1.3.16) ; extra == 'mysql'
Provides-Extra: pandas
Requires-Dist: pandas (<2) ; extra == 'pandas'
Provides-Extra: postgres-binary
Requires-Dist: psycopg2-binary ; extra == 'postgres-binary'
Requires-Dist: sqlalchemy (!=1.3.16) ; extra == 'postgres-binary'
Provides-Extra: postgres-source
Requires-Dist: psycopg2 ; extra == 'postgres-source'
Requires-Dist: sqlalchemy (!=1.3.16) ; extra == 'postgres-source'
Provides-Extra: redshift-binary
Requires-Dist: psycopg2-binary ; extra == 'redshift-binary'
Requires-Dist: sqlalchemy-redshift (>=0.7.7) ; extra == 'redshift-binary'
Requires-Dist: awscli (<2,>=1) ; extra == 'redshift-binary'
Requires-Dist: boto (<3,>=2) ; extra == 'redshift-binary'
Requires-Dist: boto3 ; extra == 'redshift-binary'
Requires-Dist: smart-open (<1.9.0,>=1.8.4) ; extra == 'redshift-binary'
Requires-Dist: s3-concat (<0.2,>=0.1.7) ; extra == 'redshift-binary'
Requires-Dist: sqlalchemy (!=1.3.16) ; extra == 'redshift-binary'
Provides-Extra: redshift-source
Requires-Dist: psycopg2 ; extra == 'redshift-source'
Requires-Dist: sqlalchemy-redshift (>=0.7.7) ; extra == 'redshift-source'
Requires-Dist: awscli (<2,>=1) ; extra == 'redshift-source'
Requires-Dist: boto (<3,>=2) ; extra == 'redshift-source'
Requires-Dist: boto3 ; extra == 'redshift-source'
Requires-Dist: smart-open (<1.9.0,>=1.8.4) ; extra == 'redshift-source'
Requires-Dist: s3-concat (<0.2,>=0.1.7) ; extra == 'redshift-source'
Requires-Dist: sqlalchemy (!=1.3.16) ; extra == 'redshift-source'
Provides-Extra: unittest
Requires-Dist: odictliteral ; extra == 'unittest'
Requires-Dist: jsonschema ; extra == 'unittest'
Requires-Dist: typing-inspect ; extra == 'unittest'
Requires-Dist: docstring-parser ; extra == 'unittest'
Requires-Dist: apache-airflow (<2,>=1.10) ; extra == 'unittest'
Requires-Dist: google ; extra == 'unittest'
Requires-Dist: google-auth-httplib2 ; extra == 'unittest'
Requires-Dist: oauth2client (<2.1.0,>=2.0.2) ; extra == 'unittest'
Requires-Dist: PyOpenSSL ; extra == 'unittest'
Requires-Dist: google-api-python-client (<1.6.0,>=1.5.0) ; extra == 'unittest'
Requires-Dist: sqlalchemy-vertica-python (<0.6,>=0.5.5) ; extra == 'unittest'
Requires-Dist: sqlalchemy (!=1.3.16) ; extra == 'unittest'
Requires-Dist: psycopg2-binary ; extra == 'unittest'
Requires-Dist: sqlalchemy-redshift (>=0.7.7) ; extra == 'unittest'
Requires-Dist: awscli (<2,>=1) ; extra == 'unittest'
Requires-Dist: boto (<3,>=2) ; extra == 'unittest'
Requires-Dist: boto3 ; extra == 'unittest'
Requires-Dist: smart-open (<1.9.0,>=1.8.4) ; extra == 'unittest'
Requires-Dist: s3-concat (<0.2,>=0.1.7) ; extra == 'unittest'
Requires-Dist: pybigquery ; extra == 'unittest'
Requires-Dist: mysqlclient ; extra == 'unittest'
Requires-Dist: pandas (<2) ; extra == 'unittest'
Provides-Extra: vertica
Requires-Dist: sqlalchemy-vertica-python (<0.6,>=0.5.5) ; extra == 'vertica'
Requires-Dist: sqlalchemy (!=1.3.16) ; extra == 'vertica'

# Records Mover - mvrec

Records Mover is a command-line tool and Python library you can
use to move relational data from one place to another.

Relational data here means anything roughly "rectangular" - with
columns and rows.  For example, it supports reading and writing
data in:

* Databases, including using native high-speed methods of
  import/export of bulk data.  Redshift and Vertica are
  well-supported, with some support for BigQuery and PostgreSQL.
* Google Sheets
* Pandas DataFrames
* CSV files, either alone or in a records directory - a structured
  directory of CSV/Parquet/etc files containing some JSON metadata
  about their format and origins.  Records directories are especially
  helpful for the ever-ambiguous CSV format, where they solve the
  problem of 'hey, this may be a CSV - but what's the schema?  What's
  the format of the CSV itself?  How is it escaped?'
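
To make the CSV ambiguity above concrete, here is a minimal sketch that
uses only Python's standard `csv` module, not Records Mover itself.  The
`parse` helper and the `sidecar` metadata keys are hypothetical
illustrations and are not the actual records directory schema; they only
show why writing the format down next to the data helps.

```python
import csv
import io
import json

# The data line below reads: 1,"a\",b"
raw = 'id,note\n1,"a\\",b"\n'


def parse(text, **format_params):
    """Parse CSV text with the given csv.reader formatting parameters."""
    return list(csv.reader(io.StringIO(text), **format_params))


# Guess 1: quotes inside a field are escaped by doubling them (the csv default).
rows_doubled = parse(raw)

# Guess 2: quotes inside a field are escaped with a backslash.
rows_backslash = parse(raw, escapechar="\\", doublequote=False)

# Hypothetical sidecar metadata (NOT the actual records directory schema):
# recording the format once removes the guesswork for every later reader.
sidecar = json.dumps({"escapechar": "\\", "doublequote": False})
rows_from_sidecar = parse(raw, **json.loads(sidecar))

print(rows_doubled[1])     # three fields under guess 1
print(rows_backslash[1])   # two fields under guess 2
print(rows_from_sidecar == rows_backslash)
```

The same bytes yield a different number of columns depending on the
escaping assumption, which is the kind of question a records directory's
metadata is meant to answer up front.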

Records Mover can be extended to handle additional database
and data file types by building on top of their
[SQLAlchemy](https://www.sqlalchemy.org/) drivers, and it can
auto-negotiate the most efficient way of moving data from one to the
other.

Example CLI use:

```sh
pip3 install 'records_mover[cli]'
mvrec --help
mvrec table2table mydb1 myschema1 mytable1 mydb2 myschema2 mytable2
```

For more installation notes, see [INSTALL.md](./INSTALL.md)

Note that the connection details for the database names here must be
configured using
[db-facts](https://github.com/bluelabsio/db-facts/blob/master/CONFIGURATION.md).

Example Python library use:

First, install records_mover.  We'll also use Pandas, so we'll install
that, too:

```sh
pip3 install records_mover pandas
```

Now we can run this code:

```python
#!/usr/bin/env python3

# Pull in the records-mover library - be sure to run the pip install above first!
from records_mover import Session
from pandas import DataFrame

session = Session()
records = session.records

# This is a SQLAlchemy database engine.
#
# You can instead call session.get_db_engine('cred name').
#
# On your laptop, 'cred name' is the same thing passed to dbcli (mapping to
# something in your db-facts config).
#
# In Airflow, 'cred name' maps to the connection ID in the admin Connections UI.
#
# Or you can build your own and pass it in!
db_engine = session.get_default_db_engine()

df = DataFrame.from_dict([{'a': 1}]) # or make your own!

source = records.sources.dataframe(df=df)
target = records.targets.table(schema_name='myschema',
                               table_name='mytable',
                               db_engine=db_engine)
results = records.move(source, target)
```

When moving data, the sources supported can be found
[here](./records_mover/records/sources/factory.py), and the
targets supported can be found [here](./records_mover/records/targets/factory.py).


