Metadata-Version: 2.1
Name: vecsim
Version: 0.0.61
Summary: Vector Similarity Search Engine
Home-page: https://github.com/argmaxml/vecsim
Author: ArgmaxML
Author-email: ugoren@argmax.ml
Keywords: vector-similarity,faiss,hnsw,redis,matching,ranking,elasticsearch,search,embedding
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy (>=1.21.2)
Requires-Dist: pandas (>=1.3.0)
Requires-Dist: scikit-learn (>=0.19.0)
Provides-Extra: elasticsearch
Requires-Dist: elasticsearch (>=8.5.0) ; extra == 'elasticsearch'
Provides-Extra: faiss
Requires-Dist: faiss-cpu (>=1.7.1) ; extra == 'faiss'
Provides-Extra: pinecone
Requires-Dist: pinecone-client (>=2.2.0) ; extra == 'pinecone'
Provides-Extra: postgres
Requires-Dist: psycopg2-binary (~=2.9.3) ; extra == 'postgres'
Requires-Dist: SQLAlchemy (~=1.3.22) ; extra == 'postgres'
Provides-Extra: redis
Requires-Dist: redis (>=4.3.0) ; extra == 'redis'

# VecSim - A unified interface for similarity servers
A standard, lightweight interface to popular vector similarity servers.

## The problems we are trying to solve:
1. **Standard API** - Each vector similarity server exposes a different API, so switching between them is not trivial.
1. **Identifiers** - Some vector similarity servers support string IDs and some do not; we keep track of the mapping.
1. **Partitions** - Most use cases need pre-filtering before querying; we abstract this concept away.
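
To make the identifier and partition points concrete, here is a minimal sketch of the bookkeeping involved, assuming a brute-force cosine search over NumPy arrays. The class `TinyPartitionedIndex` and its internals are hypothetical and written only for illustration; they are not VecSim's implementation.

```python
import numpy as np


class TinyPartitionedIndex:
    """Hypothetical illustration only; not part of vecsim."""

    def __init__(self):
        self.vectors = []     # one vector per item, in insertion order
        self.ids = []         # integer row -> caller's string ID
        self.partitions = []  # integer row -> partition name

    def add_items(self, data, ids, partition):
        for vec, item_id in zip(np.asarray(data, dtype=float), ids):
            self.vectors.append(vec)
            self.ids.append(item_id)
            self.partitions.append(partition)

    def search(self, query, k, partition=None):
        # Pre-filter: keep only the rows in the requested partition.
        rows = [i for i, p in enumerate(self.partitions)
                if partition is None or p == partition]
        if not rows:
            return [], []
        matrix = np.stack([self.vectors[i] for i in rows])
        query = np.asarray(query, dtype=float)
        # Cosine distance = 1 - cosine similarity.
        sims = matrix @ query / (
            np.linalg.norm(matrix, axis=1) * np.linalg.norm(query)
        )
        dists = 1.0 - sims
        order = np.argsort(dists)[:k]
        # Translate integer rows back to the caller's string IDs.
        return dists[order].tolist(), [self.ids[rows[i]] for i in order]


index = TinyPartitionedIndex()
index.add_items([[1.0, 0.0], [0.0, 1.0]], ["user_1", "user_2"], partition="users")
index.add_items([[1.0, 0.1]], ["item_101"], partition="items")

dists, found = index.search([1.0, 0.0], k=2)
users_only = index.search([1.0, 0.0], k=2, partition="users")[1]
```

The caller only ever sees string IDs and partition names; the integer row numbers that a backend would need stay internal to the wrapper.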

## Supported engines:
1. Scikit-learn, via [NearestNeighbors](https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.NearestNeighbors.html)
1. [RediSearch](https://redis.io/docs/stack/search/reference/vectors/)
1. [Faiss](https://github.com/facebookresearch/faiss)
1. [ElasticSearch](https://www.elastic.co)
1. [Pinecone](https://www.pinecone.io)


## QuickStart example
```python
import numpy as np
# Import a similarity server of your choice.
# scikit-learn (best for small datasets or testing):
from vecsim import SciKitIndex
sim = SciKitIndex(metric='cosine', dim=32)

user_ids = ["user_" + str(1 + i) for i in range(100)]
user_data = np.random.random((100, 32))
item_ids = ["item_" + str(101 + i) for i in range(100)]
item_data = np.random.random((100, 32))
sim.add_items(user_data, user_ids, partition="users")
sim.add_items(item_data, item_ids, partition="items")
# Index the data
sim.init()
# Run nearest neighbor vector search
query = np.random.random(32)
dists, items = sim.search(query, k=10)  # nearest neighbors among users and items
dists, items = sim.search(query, k=10, partition="users")  # nearest neighbors among users only
```

For more examples, please read our [documentation](https://vecsim.readthedocs.io/).
