Metadata-Version: 2.1
Name: b10-kernel
Version: 0.1.1
Summary: Baseten Kernel Library
Keywords: machine learning,gpu,cuda,kernels,pytorch
Author-Email: Ke Bao <ke.bao@baseten.co>
Maintainer-Email: Ke Bao <ke.bao@baseten.co>, Pankaj Gupta <pankaj@baseten.co>, Yikai Zhu <yikai.zhu@baseten.co>, Shounak Ray <shounak.ray@baseten.co>
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: C++
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.12
Requires-Dist: nvidia-cutlass-dsl==4.1.0
Requires-Dist: torch>=2.8.0
Provides-Extra: test
Requires-Dist: pytest>=8.0.0; extra == "test"
Provides-Extra: dev
Requires-Dist: pytest>=8.0.0; extra == "dev"
Requires-Dist: ruff; extra == "dev"
Requires-Dist: scikit-build-core>=0.10; extra == "dev"
Description-Content-Type: text/markdown

# b10-kernel

Baseten Kernel Library - High-performance GPU kernels for AI inference workloads.

## Installation

### From PyPI
```bash
pip install b10-kernel
```

**Requirements:**
- Python >= 3.12
- CUDA-compatible GPU and drivers
- PyTorch >= 2.8.0 with CUDA support
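
A quick way to confirm these requirements before installing is a small preflight script. The sketch below is illustrative only: `check_environment` is a hypothetical helper, not part of `b10-kernel`, and it only uses the standard library plus `torch.__version__` and `torch.cuda.is_available()`.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 12), min_torch=(2, 8)):
    """Return a list of human-readable problems; an empty list means all checks passed.

    Hypothetical helper for illustration; it is not shipped with b10-kernel.
    """
    problems = []

    if sys.version_info < min_python:
        found = ".".join(map(str, sys.version_info[:3]))
        wanted = ".".join(map(str, min_python))
        problems.append(f"Python >= {wanted} required, found {found}")

    if importlib.util.find_spec("torch") is None:
        problems.append("PyTorch is not installed")
        return problems

    import torch

    # Versions look like "2.8.0+cu128"; compare only major.minor.
    release = str(torch.__version__).split("+")[0]
    try:
        major_minor = tuple(int(part) for part in release.split(".")[:2])
    except ValueError:
        problems.append(f"Could not parse PyTorch version {release!r}")
    else:
        if major_minor < min_torch:
            wanted = ".".join(map(str, min_torch))
            problems.append(f"PyTorch >= {wanted} required, found {release}")

    if not torch.cuda.is_available():
        problems.append("PyTorch cannot see a CUDA device")

    return problems


if __name__ == "__main__":
    issues = check_environment()
    for issue in issues:
        print("problem:", issue)
    if not issues:
        print("environment looks OK")
```

An empty result only means the listed checks passed; it does not guarantee that the installed CUDA toolkit matches the one the package was built against.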

### From Source
```bash
git clone <repository>
cd mp/kernels/b10-kernel
pip install -e .
```

### For Development
```bash
# Install with test dependencies (quotes keep shells such as zsh from expanding the brackets)
pip install -e ".[test]"

# Install with all development dependencies
pip install -e ".[dev]"
```

## Development Guide
- Build the library from source
```bash
make build
make rebuild
```
- Run unit tests
```bash
make test
```
- Format code
```bash
make format
```

## Kernel Development Workflow
Steps to add a new kernel:
- Implement the kernel in `csrc`
- Expose the interface in `include/b10_kernel_ops.h`
- Create the torch extension in `csrc/common_extension.cc`
- Update `CMakeLists.txt` to include the new CUDA source
- Expose the Python interface in `python/b10_kernel/xxx.py` and `python/b10_kernel/__init__.py`
- Add a unit test for the kernel in `test/test_xxx.py`
- Add a benchmark script for the kernel in `benchmark/bench_xxx.py`
- Format the code with `make format`
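
For the benchmark step, a timing harness along the following lines may be a useful starting point. This is a minimal sketch, not the project's actual benchmark code: the `bench` function and its parameters are hypothetical, and the workload shown is a CPU stand-in. CUDA kernel launches are asynchronous, so when timing a real kernel, pass `torch.cuda.synchronize` as `sync` to make sure the queued work has finished before the clock is read.

```python
import statistics
import time


def bench(fn, *, warmup=3, iters=20, sync=None):
    """Time fn() and return the median latency in milliseconds.

    Hypothetical helper for illustration. `sync` is an optional zero-argument
    callable invoked before each stop of the clock; for GPU work, pass
    torch.cuda.synchronize so asynchronous kernel launches are fully counted.
    """
    def wait():
        if sync is not None:
            sync()

    # Warm-up runs absorb one-time costs such as lazy initialization.
    for _ in range(warmup):
        fn()
    wait()

    samples_ms = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        wait()
        samples_ms.append((time.perf_counter() - start) * 1e3)

    # The median is less sensitive to occasional outliers than the mean.
    return statistics.median(samples_ms)


if __name__ == "__main__":
    # CPU stand-in workload; replace with a call to the kernel under test.
    latency = bench(lambda: sum(range(100_000)))
    print(f"median latency: {latency:.3f} ms")
```

Wall-clock timing with explicit synchronization is the simplest approach; CUDA events would give finer-grained device-side timings if that level of detail is needed.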

