Metadata-Version: 2.1
Name: dbt-spark-livy
Version: 1.3.1
Summary: The dbt-spark-livy adapter plugin for Spark in Cloudera DataHub with Livy interface
Home-page: https://github.com/cloudera/dbt-spark-livy
Author: Cloudera
Author-email: innovation-feedback@cloudera.com
Classifier: Development Status :: 5 - Production/Stable
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Requires-Python: >=3.7
Description-Content-Type: text/markdown
Requires-Dist: dbt-core (~=1.3.0)
Requires-Dist: sqlparams (>=3.0.0)
Requires-Dist: requests-kerberos (==0.14)
Requires-Dist: requests-toolbelt (>=0.9.1)
Requires-Dist: python-decouple (>=3.6)
Provides-Extra: odbc
Requires-Dist: pyodbc (>=4.0.30) ; extra == 'odbc'
Provides-Extra: pyhive
Requires-Dist: PyHive[hive] (<0.7.0,>=0.6.0) ; extra == 'pyhive'
Requires-Dist: thrift (<0.16.0,>=0.11.0) ; extra == 'pyhive'
Provides-Extra: all
Requires-Dist: pyodbc (>=4.0.30) ; extra == 'all'
Requires-Dist: PyHive[hive] (<0.7.0,>=0.6.0) ; extra == 'all'
Requires-Dist: thrift (<0.16.0,>=0.11.0) ; extra == 'all'
Requires-Dist: pyspark (<4.0.0,>=3.0.0) ; extra == 'all'
Provides-Extra: session
Requires-Dist: pyspark (<4.0.0,>=3.0.0) ; extra == 'session'

# dbt-spark-livy

The `dbt-spark-livy` adapter allows you to use [dbt](https://www.getdbt.com/) with [Apache Spark](https://spark.apache.org/) on [Cloudera Data Platform](https://cloudera.com) through a Livy server. The code base builds on the [dbt-spark](https://github.com/dbt-labs/dbt-spark) project and adds Livy connectivity on top of it.

## Getting started

- [Install dbt](https://docs.getdbt.com/docs/installation)
- Read the [introduction](https://docs.getdbt.com/docs/introduction/) and [viewpoint](https://docs.getdbt.com/docs/about/viewpoint/)

## Running locally
A `docker-compose` environment starts a Spark Thrift server and a Postgres database as a Hive Metastore backend.
Note: dbt-spark now supports Spark 3.1.1 (formerly on Spark 2.x).

Requirements:

- Python >= 3.8
- dbt-core ~= 1.3.0
- pyspark
- sqlparams
- requests_kerberos
- requests-toolbelt
- python-decouple


### Installing dbt-spark-livy

`pip install dbt-spark-livy`

### Profile Setup

```
demo_project:
  target: dev
  outputs:
    dev:
      type: spark_livy
      method: livy
      schema: my_db
      host: https://spark-livy-gateway.my.org.com/dbt-spark/cdp-proxy-api/livy_for_spark3/
      user: my_user
      password: my_pass
```
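A quick sanity check on the profile values can catch typos before dbt tries to open a Livy session. The sketch below is illustrative only: the helper `check_output` and the `EXPECTED_KEYS` tuple are hypothetical names, and treating every key from the sample profile as required is an assumption, not documented adapter behavior.

```python
from urllib.parse import urlparse

# Keys copied from the sample profile above. Treating all of them as
# required is an assumption of this sketch, not documented adapter behavior.
EXPECTED_KEYS = ("type", "method", "schema", "host", "user", "password")


def check_output(output: dict) -> list:
    """Return a list of human-readable problems found in one profile output."""
    # Flag keys that are absent or empty.
    problems = [f"missing key: {key}" for key in EXPECTED_KEYS if not output.get(key)]
    host = output.get("host", "")
    if host:
        # The sample host is a full URL, so require a scheme and a network location.
        parsed = urlparse(host)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"host does not look like a URL: {host}")
    return problems


# The `dev` output from the sample profile, written as a plain dict.
dev = {
    "type": "spark_livy",
    "method": "livy",
    "schema": "my_db",
    "host": "https://spark-livy-gateway.my.org.com/dbt-spark/cdp-proxy-api/livy_for_spark3/",
    "user": "my_user",
    "password": "my_pass",
}
print(check_output(dev))
```

An empty list means the output has every key the sample uses and a host that parses as a URL. It says nothing about whether the credentials or the gateway path are actually valid.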

### Caveats
- If sessions in the Livy UI go from `starting` to `dead` instead of `idle`, make sure the user has a proper mapping under Actions > Manage Access > IDBroker Mappings. [Reference](https://docs.cloudera.com/cdf-datahub/7.2.15/flink-analyzing-data/topics/cdf-datahub-sa-create-idbroker-mapping.html)
- Also make sure the workload password is set, either through the UI or the CLI. [Reference](https://docs.cloudera.com/management-console/cloud/user-management/topics/mc-setting-the-ipa-password.html)
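The session state described in these caveats can also be read from the Livy REST API, which exposes `GET /sessions/{sessionId}/state`. The following is a minimal sketch, not part of the adapter: `classify_state` and `fetch_state` are hypothetical helpers, and authentication is left out, so a real CDP gateway would need credentials added to the request.

```python
import json
import urllib.request

# State names as listed in the Apache Livy REST API documentation.
FAILED_STATES = {"dead", "error", "killed"}
READY_STATES = {"idle"}


def classify_state(state: str) -> str:
    """Map a Livy session state to 'ready', 'failed', or 'pending'."""
    if state in READY_STATES:
        return "ready"
    if state in FAILED_STATES:
        return "failed"
    # Everything else (e.g. not_started, starting, busy) is treated as pending.
    return "pending"


def fetch_state(livy_url: str, session_id: int, opener=urllib.request.urlopen) -> str:
    """GET <livy_url>/sessions/<session_id>/state and return its 'state' field.

    No authentication is sent; that is a simplification of this sketch.
    `opener` is injectable so the function can be exercised without a network.
    """
    url = f"{livy_url.rstrip('/')}/sessions/{session_id}/state"
    with opener(url) as response:
        return json.load(response)["state"]
```

A result of `failed` shortly after the session starts matches the symptom above and points to the IDBroker mapping or the workload password as the first things to check.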

## Supported features
Please see the original adapter documentation: https://github.com/dbt-labs/dbt-spark and https://docs.getdbt.com/reference/warehouse-profiles/spark-profile
