Metadata-Version: 2.1
Name: swh.storage
Version: 0.0.158
Summary: Software Heritage storage manager
Home-page: https://forge.softwareheritage.org/diffusion/DSTO/
Author: Software Heritage developers
Author-email: swh-devel@inria.fr
License: UNKNOWN
Project-URL: Bug Reports, https://forge.softwareheritage.org/maniphest
Project-URL: Funding, https://www.softwareheritage.org/donate
Project-URL: Source, https://forge.softwareheritage.org/source/swh-storage
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Operating System :: OS Independent
Classifier: Development Status :: 5 - Production/Stable
Description-Content-Type: text/markdown
Requires-Dist: click
Requires-Dist: flask
Requires-Dist: psycopg2
Requires-Dist: python-dateutil
Requires-Dist: vcversioner
Requires-Dist: aiohttp
Requires-Dist: swh.core[db,http] (>=0.0.65)
Requires-Dist: swh.model (>=0.0.51)
Requires-Dist: swh.objstorage (>=0.0.17)
Provides-Extra: journal
Requires-Dist: swh.journal (>=0.0.17) ; extra == 'journal'
Provides-Extra: schemata
Requires-Dist: SQLAlchemy ; extra == 'schemata'
Provides-Extra: testing
Requires-Dist: hypothesis (>=3.11.0) ; extra == 'testing'
Requires-Dist: pytest ; extra == 'testing'
Requires-Dist: pytest-postgresql (>=2.1.0) ; extra == 'testing'
Requires-Dist: sqlalchemy-stubs ; extra == 'testing'
Requires-Dist: swh.model[testing] (>=0.0.50) ; extra == 'testing'
Requires-Dist: pytz ; extra == 'testing'
Requires-Dist: swh.journal (>=0.0.17) ; extra == 'testing'

swh-storage
===========

Abstraction layer over the archive, giving access to all stored source code
artifacts as well as their metadata.

See the
[documentation](https://docs.softwareheritage.org/devel/swh-storage/index.html)
for more details.

## Quick start

### Dependencies

Some of the Python tests for this module cannot be run without a local
PostgreSQL database, so you need the PostgreSQL server executable on your
machine (there is no need to have a running PostgreSQL server). On a
Debian-like host:

```
$ sudo apt install libpq-dev postgresql
```

### Installation

It is strongly recommended to use a virtualenv. In the following, we
assume you work in a virtualenv named `swh`. See the
[developer setup guide](https://docs.softwareheritage.org/devel/developer-setup.html#developer-setup)
for more details on how to set up a working environment.


You can install the package directly from
[PyPI](https://pypi.org/p/swh.storage):

```
(swh) :~$ pip install swh.storage
[...]
```

Or from sources:

```
(swh) :~$ git clone https://forge.softwareheritage.org/source/swh-storage.git
[...]
(swh) :~$ cd swh-storage
(swh) :~/swh-storage$ pip install .
[...]
```

Then you can check that it is properly installed:

```
(swh) :~$ swh storage --help
Usage: swh storage [OPTIONS] COMMAND [ARGS]...

  Software Heritage Storage tools.

Options:
  -h, --help  Show this message and exit.

Commands:
  rpc-serve  Software Heritage Storage RPC server.
```


## Tests

The best way to run the Python tests for this module is to use
[tox](https://tox.readthedocs.io/).

```
(swh) :~$ pip install tox
```

### tox

From the sources directory, simply use tox:

```
(swh) :~/swh-storage$ tox
[...]
========= 315 passed, 6 skipped, 15 warnings in 40.86 seconds ==========
_______________________________ summary ________________________________
  flake8: commands succeeded
  py3: commands succeeded
  congratulations :)
```

## Development

The storage server can be started locally. It requires a configuration file and
a running PostgreSQL database.

### Sample configuration

A typical `storage.yml` configuration file is:

```
storage:
  cls: local
  args:
    db: "dbname=softwareheritage-dev user=<user> password=<pwd>"
    objstorage:
      cls: pathslicing
      args:
        root: /tmp/swh-storage/
        slicing: 0:2/2:4/4:6
```

This configuration uses:

- a local storage instance whose database connection points to the local
  `softwareheritage-dev` database,

- a local objstorage instance whose:

  - `root` path is `/tmp/swh-storage/`,

  - slicing scheme is `0:2/2:4/4:6`. This means that a content object is
    stored on disk under three levels of directories derived from its
    identifier (sha1): the first level is named after the first 2 hex
    characters, the second level after the next 2, and the third level
    after the next 2. The file holding the raw content sits in the third
    level and is named after the complete hash. For example,
    `00062f8bd330715c4f819373653d97b3cd34394c` is stored at
    `00/06/2f/00062f8bd330715c4f819373653d97b3cd34394c`.

Note that the `root` path should exist on disk before starting the server.
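
The slicing scheme described above can be sketched in a few lines of Python.
This is only an illustration, not the swh.objstorage implementation: the
`sliced_path` helper is a hypothetical name introduced here, and it assumes
that each `start:end` pair selects a range of characters of the hex
identifier.

```python
import posixpath


def sliced_path(root: str, hex_id: str, slicing: str = "0:2/2:4/4:6") -> str:
    """Hypothetical helper: build the on-disk path of a content object.

    Each "start:end" pair in `slicing` names one directory level, taken
    from the characters hex_id[start:end]. The file itself is named after
    the complete identifier.
    """
    levels = []
    for bounds in slicing.split("/"):
        start, end = (int(bound) for bound in bounds.split(":"))
        levels.append(hex_id[start:end])
    return posixpath.join(root, *levels, hex_id)


print(sliced_path("/tmp/swh-storage/", "00062f8bd330715c4f819373653d97b3cd34394c"))
# /tmp/swh-storage/00/06/2f/00062f8bd330715c4f819373653d97b3cd34394c
```

With the sample configuration, the example identifier therefore ends up three
directories below `root`.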


### Starting the storage server

If the Python package has been properly installed (e.g. in a virtualenv), you
should be able to use the command:

```
(swh) :~/swh-storage$ swh storage rpc-serve storage.yml
```

This runs a local swh-storage API on port 5002.

```
(swh) :~/swh-storage$ curl http://127.0.0.1:5002
<html>
<head><title>Software Heritage storage server</title></head>
<body>
<p>You have reached the
<a href="https://www.softwareheritage.org/">Software Heritage</a>
storage server.<br />
See its
<a href="https://docs.softwareheritage.org/devel/swh-storage/">documentation
and API</a> for more information</p>
```
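
The same check can be scripted, for instance before starting a loader. The
following is a minimal sketch using only the Python standard library; the
`storage_is_up` function is a hypothetical helper, not part of swh.storage,
and it assumes the server listens on the address and port shown above.

```python
import urllib.request


def storage_is_up(url: str = "http://127.0.0.1:5002/", timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET on `url` answers with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Connection refused, timeouts and HTTP error statuses are all
        # reported through OSError subclasses (URLError, HTTPError).
        return False


if __name__ == "__main__":
    print("storage server reachable:", storage_is_up())
```

This only tells you that something answers HTTP requests at that URL; it does
not verify that the database or the objstorage behind it are usable.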

### And then what?

In an upper layer (for example
[loader-git](https://forge.softwareheritage.org/source/swh-loader-git/),
[loader-svn](https://forge.softwareheritage.org/source/swh-loader-svn/),
etc.), you can define a remote storage with this snippet of YAML
configuration:

```
storage:
  cls: remote
  args:
    url: http://localhost:5002/
```

Alternatively, you can define a local storage directly with the following
snippet:

```
storage:
  cls: local
  args:
    db: service=swh-dev
    objstorage:
      cls: pathslicing
      args:
        root: /home/storage/swh-storage/
        slicing: 0:2/2:4/4:6
```


