Metadata-Version: 2.4
Name: funasr-python
Version: 0.1.2
Summary: A high-performance Python client for FunASR WebSocket speech recognition service
Project-URL: Documentation, https://github.com/alibaba-damo-academy/FunASR
Project-URL: Repository, https://github.com/alibaba-damo-academy/FunASR
Project-URL: Changelog, https://github.com/alibaba-damo-academy/FunASR/blob/main/CHANGELOG.md
Project-URL: Bug Reports, https://github.com/alibaba-damo-academy/FunASR/issues
Author: FunASR Team
Maintainer: FunASR Team
License: MIT License
        Copyright (c) 2025 FunASR
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in
        all copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
        THE SOFTWARE.
License-File: LICENSE
Keywords: asr,funasr,real-time,speech-recognition,streaming,websocket
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Multimedia :: Sound/Audio :: Speech
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.8
Requires-Dist: aiofiles>=23.0.0
Requires-Dist: build>=1.2.2.post1
Requires-Dist: librosa>=0.9.0
Requires-Dist: numpy>=1.20.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: python-dotenv>=0.19.0
Requires-Dist: soundfile>=0.12.0
Requires-Dist: twine>=6.1.0
Requires-Dist: typing-extensions>=4.0.0; python_version < '3.11'
Requires-Dist: websockets>=11.0.0
Provides-Extra: all
Requires-Dist: coverage[toml]>=7.0.0; extra == 'all'
Requires-Dist: hatch>=1.7.0; extra == 'all'
Requires-Dist: mypy>=1.5.0; extra == 'all'
Requires-Dist: orjson>=3.8.0; extra == 'all'
Requires-Dist: pre-commit>=3.0.0; extra == 'all'
Requires-Dist: pyaudio>=0.2.11; extra == 'all'
Requires-Dist: pytest-asyncio>=0.21.0; extra == 'all'
Requires-Dist: pytest-cov>=4.0.0; extra == 'all'
Requires-Dist: pytest-mock>=3.10.0; extra == 'all'
Requires-Dist: pytest-timeout>=2.1.0; extra == 'all'
Requires-Dist: pytest>=7.0.0; extra == 'all'
Requires-Dist: ruff>=0.1.0; extra == 'all'
Requires-Dist: types-setuptools; extra == 'all'
Requires-Dist: uvloop>=0.17.0; (sys_platform != 'win32') and extra == 'all'
Requires-Dist: webrtcvad>=2.0.10; extra == 'all'
Provides-Extra: audio
Requires-Dist: pyaudio>=0.2.11; extra == 'audio'
Requires-Dist: webrtcvad>=2.0.10; extra == 'audio'
Provides-Extra: dev
Requires-Dist: coverage[toml]>=7.0.0; extra == 'dev'
Requires-Dist: hatch>=1.7.0; extra == 'dev'
Requires-Dist: mypy>=1.5.0; extra == 'dev'
Requires-Dist: orjson>=3.8.0; extra == 'dev'
Requires-Dist: pre-commit>=3.0.0; extra == 'dev'
Requires-Dist: pyaudio>=0.2.11; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.21.0; extra == 'dev'
Requires-Dist: pytest-cov>=4.0.0; extra == 'dev'
Requires-Dist: pytest-mock>=3.10.0; extra == 'dev'
Requires-Dist: pytest-timeout>=2.1.0; extra == 'dev'
Requires-Dist: pytest>=7.0.0; extra == 'dev'
Requires-Dist: ruff>=0.1.0; extra == 'dev'
Requires-Dist: types-setuptools; extra == 'dev'
Requires-Dist: uvloop>=0.17.0; (sys_platform != 'win32') and extra == 'dev'
Requires-Dist: webrtcvad>=2.0.10; extra == 'dev'
Provides-Extra: lint
Requires-Dist: mypy>=1.5.0; extra == 'lint'
Requires-Dist: ruff>=0.1.0; extra == 'lint'
Requires-Dist: types-setuptools; extra == 'lint'
Provides-Extra: performance
Requires-Dist: orjson>=3.8.0; extra == 'performance'
Requires-Dist: uvloop>=0.17.0; (sys_platform != 'win32') and extra == 'performance'
Provides-Extra: test
Requires-Dist: coverage[toml]>=7.0.0; extra == 'test'
Requires-Dist: pytest-asyncio>=0.21.0; extra == 'test'
Requires-Dist: pytest-cov>=4.0.0; extra == 'test'
Requires-Dist: pytest-mock>=3.10.0; extra == 'test'
Requires-Dist: pytest-timeout>=2.1.0; extra == 'test'
Requires-Dist: pytest>=7.0.0; extra == 'test'
Description-Content-Type: text/markdown

# FunASR Python Client

[![PyPI version](https://badge.fury.io/py/funasr-python.svg)](https://badge.fury.io/py/funasr-python)
[![Python versions](https://img.shields.io/pypi/pyversions/funasr-python.svg)](https://pypi.org/project/funasr-python)
[![License](https://img.shields.io/pypi/l/funasr-python.svg)](https://pypi.org/project/funasr-python)
[![Tests](https://github.com/your-org/funasr-python/workflows/Tests/badge.svg)](https://github.com/your-org/funasr-python/actions)
[![Code style: ruff](https://img.shields.io/badge/code%20style-ruff-000000.svg)](https://github.com/astral-sh/ruff)

A high-performance, enterprise-grade Python client for FunASR WebSocket speech recognition service. Built for production use with comprehensive error handling, automatic reconnection, and extensive customization options.

## Features

### 🚀 **High Performance**
- **Asynchronous I/O**: Built on asyncio for maximum concurrency
- **Connection Pooling**: Efficient WebSocket connection management
- **Streaming Recognition**: Real-time speech recognition with minimal latency
- **Memory Efficient**: Optimized audio processing with configurable buffering

### 🔧 **Production Ready**
- **Robust Error Handling**: Comprehensive exception handling and recovery
- **Automatic Reconnection**: Smart reconnection with exponential backoff
- **Health Monitoring**: Built-in connection health checks
- **Resource Management**: Automatic cleanup and resource deallocation
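
The reconnection bullet above maps onto a simple exponential backoff schedule. A minimal sketch of that idea (not the client's internal implementation; `base` and `cap` are illustrative defaults):

```python
import random

def backoff_delays(max_retries=3, base=0.5, cap=30.0, jitter=False):
    """Yield reconnect delays: base * 2**attempt, clamped to cap."""
    for attempt in range(max_retries):
        delay = min(base * (2 ** attempt), cap)
        if jitter:
            delay *= random.uniform(0.5, 1.0)  # spread out reconnect storms
        yield delay

print(list(backoff_delays()))  # → [0.5, 1.0, 2.0]
```

Jitter is worth enabling when many clients share one server, so they do not all reconnect at the same instant.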

### 📊 **Recognition Modes for Different Scenarios**
- **Offline Mode**: Best for complete audio files, highest accuracy
- **Online Mode**: Ultra-low latency streaming, suitable for interactive applications
- **Two-Pass Mode** ⭐: **Recommended for real-time scenarios** - combines streaming speed with offline accuracy

### 🎯 **Enterprise Features**
- **Configuration Management**: Flexible configuration with .env support
- **Comprehensive Logging**: Structured logging with configurable levels
- **Metrics & Monitoring**: Built-in performance metrics
- **Type Safety**: Full type hints for better IDE support

### 🎵 **Audio Processing**
- **Multiple Formats**: Support for WAV, FLAC, MP3, and more
- **Automatic Resampling**: Smart audio format conversion
- **Voice Activity Detection**: Optional VAD for improved efficiency
- **Microphone Integration**: Real-time microphone recording support

## Installation

### Basic Installation

```bash
pip install funasr-python
```

### With Optional Dependencies

```bash
# Audio processing capabilities
pip install funasr-python[audio]

# Performance optimizations
pip install funasr-python[performance]

# Development tools
pip install funasr-python[dev]

# Everything
pip install funasr-python[all]
```

### From Source

```bash
git clone https://github.com/alibaba-damo-academy/FunASR.git
cd FunASR/clients/funasr-python
pip install -e .
```

## Quick Start

### Basic Usage

```python
import asyncio
from funasr_client import AsyncFunASRClient

async def main():
    client = AsyncFunASRClient()

    # Recognize an audio file
    result = await client.recognize_file("examples/audio/asr_example.wav")
    print(f"Recognition result: {result.text}")

    await client.close()

if __name__ == "__main__":
    asyncio.run(main())
```

### Stream Recognition (Async Iterator)

Use `recognize_stream` to process custom audio streams from any source:

```python
import asyncio
from funasr_client import AsyncFunASRClient
from funasr_client.callbacks import SimpleCallback

async def stream_recognition_demo():
    """Recognize audio from custom async stream."""
    client = AsyncFunASRClient(
        server_url="ws://localhost:10095",
        mode="2pass"  # Two-Pass mode for best results
    )

    def on_partial_result(result):
        print(f"Partial: {result.text}")

    def on_final_result(result):
        print(f"Final: {result.text} (confidence: {result.confidence:.2f})")

    callback = SimpleCallback(
        on_partial=on_partial_result,
        on_final=on_final_result
    )

    await client.start()

    # Example 1: Stream from file in chunks
    async def audio_stream_from_file(file_path, chunk_size=3200):
        """Read audio file and yield chunks."""
        with open(file_path, 'rb') as f:
            # Skip the WAV header (44 bytes for canonical PCM; larger headers need parsing)
            f.seek(44)
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                yield chunk
                await asyncio.sleep(0.1)  # Pace at real time (3200 bytes = 100 ms of 16 kHz mono PCM)

    # Start streaming recognition
    await client.recognize_stream(
        audio_stream_from_file("examples/audio/asr_example.wav"),
        callback
    )

    await client.close()

# Example 2: Stream from network source
async def stream_from_network():
    """Stream audio from network source (e.g., RTP, RTSP)."""
    import aiohttp

    client = AsyncFunASRClient(server_url="ws://localhost:10095")

    async def network_audio_stream(url):
        """Stream audio from HTTP/network source."""
        async with aiohttp.ClientSession() as session:
            async with session.get(url) as response:
                async for chunk in response.content.iter_chunked(3200):
                    yield chunk

    def on_result(result):
        if result.is_final:
            print(f"Transcription: {result.text}")

    from funasr_client.callbacks import SimpleCallback
    callback = SimpleCallback(on_final=on_result)

    await client.start()
    await client.recognize_stream(
        network_audio_stream("http://example.com/audio.pcm"),
        callback
    )
    await client.close()

# Example 3: Stream from microphone (PyAudio)
async def stream_from_microphone():
    """Real-time recognition from microphone using PyAudio."""
    import pyaudio

    client = AsyncFunASRClient(server_url="ws://localhost:10095")

    async def microphone_stream():
        """Capture audio from microphone and yield chunks."""
        CHUNK = 1600  # 100ms at 16kHz
        FORMAT = pyaudio.paInt16
        CHANNELS = 1
        RATE = 16000

        p = pyaudio.PyAudio()
        stream = p.open(
            format=FORMAT,
            channels=CHANNELS,
            rate=RATE,
            input=True,
            frames_per_buffer=CHUNK
        )

        print("🎤 Recording... Press Ctrl+C to stop")
        try:
            while True:
                data = await asyncio.get_running_loop().run_in_executor(
                    None, stream.read, CHUNK
                )
                yield data
        except KeyboardInterrupt:
            pass
        finally:
            stream.stop_stream()
            stream.close()
            p.terminate()

    def on_result(result):
        if result.is_final:
            print(f"You said: {result.text}")
        else:
            print(f"Hearing: {result.text}")

    from funasr_client.callbacks import SimpleCallback
    callback = SimpleCallback(
        on_partial=on_result,
        on_final=on_result
    )

    await client.start()
    await client.recognize_stream(microphone_stream(), callback)
    await client.close()

if __name__ == "__main__":
    # Run different examples
    asyncio.run(stream_recognition_demo())
    # asyncio.run(stream_from_network())
    # asyncio.run(stream_from_microphone())
```
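
The file example above skips a fixed 44-byte header, which only fits canonical PCM WAV files; files with extra metadata chunks have longer headers. A safer sketch uses the standard-library `wave` module to locate the PCM data regardless of header layout:

```python
import wave

def pcm_chunks(file_path, chunk_frames=1600):
    """Yield raw 16-bit PCM chunks from a WAV file, header-agnostic."""
    with wave.open(file_path, "rb") as wf:
        assert wf.getsampwidth() == 2, "expected 16-bit PCM"
        while True:
            frames = wf.readframes(chunk_frames)  # 1600 frames = 100 ms at 16 kHz
            if not frames:
                break
            yield frames
```

Each yielded chunk can be fed to `recognize_stream` exactly like the manually sliced chunks above.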

### Real-time Recognition Sessions

For real-time applications, we recommend **Two-Pass Mode**, which provides the best balance of speed and accuracy:

```python
import asyncio
from funasr_client import AsyncFunASRClient
from funasr_client.models import RecognitionMode, ClientConfig

async def realtime_recognition():
    # Two-Pass Mode: Optimal for real-time scenarios
    config = ClientConfig(
        server_url="ws://localhost:10095",
        mode=RecognitionMode.TWO_PASS,  # Recommended for real-time
        enable_vad=True,  # Voice activity detection
        chunk_interval=10  # Balanced latency/accuracy
    )

    client = AsyncFunASRClient(config=config)

    def on_partial_result(result):
        print(f"Partial: {result.text}")

    def on_final_result(result):
        print(f"Final: {result.text} (confidence: {result.confidence:.2f})")

    from funasr_client.callbacks import SimpleCallback
    callback = SimpleCallback(
        on_partial=on_partial_result,
        on_final=on_final_result
    )

    await client.start()

    # Start real-time session
    # Start real-time session
    session = await client.start_realtime(callback)

    # Your audio streaming logic here
    # In practice, you would stream from a microphone or another audio source

    # Flush the final (offline) pass before closing
    await client.end_realtime_session()
    await client.close()

if __name__ == "__main__":
    asyncio.run(realtime_recognition())
```

### Ultra-Low Latency (Interactive Applications)

For scenarios requiring minimal latency (e.g., voice assistants):

```python
async def ultra_low_latency():
    config = ClientConfig(
        mode=RecognitionMode.ONLINE,  # Ultra-low latency
        chunk_interval=5,  # Faster processing
        enable_vad=True
    )

    client = AsyncFunASRClient(config=config)
    # Implementation similar to above
```

### Configuration with Environment Variables

Create a `.env` file:

```env
FUNASR_WS_URL=ws://localhost:10095
FUNASR_MODE=2pass  # Recommended: Two-Pass Mode for optimal real-time performance
FUNASR_SAMPLE_RATE=16000
FUNASR_ENABLE_ITN=true
FUNASR_ENABLE_VAD=true  # Recommended for real-time scenarios
```

```python
from funasr_client import create_async_client

# Configuration loaded automatically from .env
client = await create_async_client()
result = await client.recognize_file("examples/audio/asr_example.wav")
print(result.text)
```

## Advanced Usage

### Custom Configuration

```python
from funasr_client import AsyncFunASRClient, ClientConfig, AudioConfig
from funasr_client.models import RecognitionMode, AudioFormat

config = ClientConfig(
    server_url="ws://your-server:10095",
    mode=RecognitionMode.TWO_PASS,
    timeout=30.0,
    max_retries=3,
    audio=AudioConfig(
        sample_rate=16000,
        format=AudioFormat.PCM,
        channels=1
    )
)

client = AsyncFunASRClient(config=config)
```

### Callback Handlers

```python
from funasr_client.callbacks import SimpleCallback

def on_result(result):
    print(f"Received: {result.text}")

def on_error(error):
    print(f"Error: {error}")

callback = SimpleCallback(
    on_result=on_result,
    on_error=on_error
)

client = AsyncFunASRClient(callback=callback)
```

### Multiple Recognition Sessions

```python
async def recognize_multiple():
    # Use Two-Pass Mode for optimal performance
    client = AsyncFunASRClient(
        mode=RecognitionMode.TWO_PASS  # ⭐ Recommended
    )

    # Process multiple files concurrently
    tasks = [
        client.recognize_file("examples/audio/asr_example.wav"),
        client.recognize_file("examples/audio/61-70970-0001.wav"),
        client.recognize_file("examples/audio/61-70970-0016.wav")
    ]

    results = await asyncio.gather(*tasks)
    for i, result in enumerate(results, 1):
        print(f"File {i}: {result.text}")
```

### Real-time Application Examples

#### Live Streaming Transcription

```python
async def live_transcription():
    """Real-time transcription for live streams."""
    config = ClientConfig(
        mode=RecognitionMode.TWO_PASS,  # ⭐ Optimal for live streaming
        enable_vad=True,                # Filter silence
        chunk_interval=8,               # Balanced performance
        auto_reconnect=True             # Handle network issues
    )

    client = AsyncFunASRClient(config=config)

    def on_result(result):
        if result.is_final:
            # Send to subtitle system
            send_subtitle(result.text, result.confidence)
        else:
            # Show live preview
            show_live_text(result.text)

    from funasr_client.callbacks import SimpleCallback
    callback = SimpleCallback(on_final=on_result, on_partial=on_result)

    await client.start()
    session = await client.start_realtime(callback)

    # Your audio streaming implementation here
    await stream_audio_to_session(session)
```

#### Voice Assistant Integration

```python
async def voice_assistant():
    """Voice assistant with Two-Pass optimization."""
    config = ClientConfig(
        mode=RecognitionMode.TWO_PASS,  # ⭐ Best for voice assistants
        enable_vad=True,                # Automatic speech detection
        chunk_interval=10               # Good responsiveness
    )

    client = AsyncFunASRClient(config=config)

    async def process_command(result):
        if result.is_final and result.confidence > 0.8:
            # Process voice command
            response = await process_voice_command(result.text)
            await speak_response(response)

    from funasr_client.callbacks import AsyncSimpleCallback
    callback = AsyncSimpleCallback(on_final=process_command)

    await client.start()
    session = await client.start_realtime(callback)

    print("🎤 Voice assistant ready. Speak now...")
    # Your microphone streaming logic here
```

## Command Line Interface

The package includes a full-featured CLI:

```bash
# Basic recognition
funasr-client recognize examples/audio/asr_example.wav

# Real-time recognition from microphone
funasr-client stream --source microphone

# Batch processing
funasr-client batch examples/audio/*.wav --output results.jsonl

# Server configuration
funasr-client configure --server-url ws://localhost:10095

# Test connection
funasr-client test-connection
```
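
Assuming the batch output follows the usual `.jsonl` convention (one JSON object per line, with at least a `text` field; a hypothetical schema, so check your actual output), results can be post-processed with the standard library:

```python
import json

def load_results(path):
    """Parse one JSON object per line, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

# e.g.:
# for r in load_results("results.jsonl"):
#     print(r.get("text", ""))
```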

## Recognition Mode Selection Guide

Choose the optimal recognition mode for your use case:

| Mode | Latency | Accuracy | Best For | Use Cases |
|------|---------|----------|----------|-----------|
| **Two-Pass** ⭐ | Medium | **High** | **Real-time applications** | Live streaming, real-time subtitles, voice assistants |
| **Online** | **Low** | Medium | Interactive apps | Voice commands, quick responses |
| **Offline** | High | **Highest** | File processing | Transcription services, post-processing |

### Two-Pass Mode Advantages ⭐

**Recommended for real-time scenarios** because it delivers:

- ✅ **Fast partial results** for immediate user feedback (Phase 1: Online)
- ✅ **High-accuracy final results** from second-pass refinement (Phase 2: Offline)
- ✅ **Balanced resource usage** with smart buffering
- ✅ **Production-ready reliability** with robust error handling

```python
# Recommended configuration for real-time applications
config = ClientConfig(
    mode=RecognitionMode.TWO_PASS,  # Best balance
    enable_vad=True,                # Improves efficiency
    chunk_interval=10,              # Optimal for most cases
    auto_reconnect=True             # Production reliability
)
```

> ⚠️ **Important**: To ensure you receive **both** partial (online) and final (offline) results in Two-Pass mode:
> - ✅ Use `recognize_file()` for complete audio files (handles end-of-speech automatically)
> - ✅ Call `end_realtime_session()` after each utterance in streaming scenarios
> - ✅ Enable VAD (`enable_vad=True`) for better speech boundary detection
> - ✅ Include sufficient silence (0.5-1s) at the end of speech segments
> 
> 📖 **See detailed guide**: [Two-Pass Best Practices](docs/TWO_PASS_BEST_PRACTICES_zh.md)

## Configuration

### Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `FUNASR_WS_URL` | WebSocket server URL | `ws://localhost:10095` |
| `FUNASR_MODE` | Recognition mode (`offline`, `online`, `2pass`) | `2pass` ⭐ |
| `FUNASR_TIMEOUT` | Connection timeout | `30.0` |
| `FUNASR_MAX_RETRIES` | Max retry attempts | `3` |
| `FUNASR_SAMPLE_RATE` | Audio sample rate | `16000` |
| `FUNASR_ENABLE_ITN` | Enable inverse text normalization | `true` |
| `FUNASR_ENABLE_VAD` | Enable voice activity detection | `true` |
| `FUNASR_DEBUG` | Enable debug logging | `false` |

> 💡 **Tip**: Two-Pass Mode (`2pass`) is recommended for most real-time applications as it provides the best balance between latency and accuracy.
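
Outside the client, the same variables can be read with stdlib code that mirrors the defaults in the table; a minimal sketch (the client's own loader may differ in detail):

```python
import os

def env_config(env=None):
    """Read FunASR settings with the documented defaults."""
    env = os.environ if env is None else env
    truthy = {"1", "true", "yes", "on"}
    return {
        "server_url": env.get("FUNASR_WS_URL", "ws://localhost:10095"),
        "mode": env.get("FUNASR_MODE", "2pass"),
        "timeout": float(env.get("FUNASR_TIMEOUT", "30.0")),
        "sample_rate": int(env.get("FUNASR_SAMPLE_RATE", "16000")),
        "enable_vad": env.get("FUNASR_ENABLE_VAD", "true").lower() in truthy,
    }
```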

### Configuration File

```python
from funasr_client import ConfigManager

# Load from custom config file
config = ConfigManager.from_file("my_config.json")
client = AsyncFunASRClient(config=config.client_config)
```

## Error Handling

```python
from funasr_client.errors import (
    FunASRError,
    ConnectionError,
    AudioError,
    TimeoutError
)

try:
    result = await client.recognize_file("examples/audio/asr_example.wav")
except ConnectionError:
    print("Failed to connect to server")
except AudioError:
    print("Audio processing failed")
except TimeoutError:
    print("Request timed out")
except FunASRError as e:
    print(f"Recognition error: {e}")
```
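
Since the client already retries connections internally (`max_retries`), an application-level wrapper is only needed around whole operations. A generic, library-agnostic sketch:

```python
import asyncio

async def with_retries(coro_factory, retries=3, base_delay=0.5,
                       retry_on=(ConnectionError, TimeoutError)):
    """Retry an awaitable built by coro_factory, with exponential backoff."""
    for attempt in range(retries + 1):
        try:
            return await coro_factory()
        except retry_on:
            if attempt == retries:
                raise
            await asyncio.sleep(base_delay * (2 ** attempt))

# usage sketch:
# result = await with_retries(lambda: client.recognize_file("a.wav"))
```

A factory (rather than a coroutine object) is required because a coroutine can only be awaited once; each retry needs a fresh one.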

## Performance Optimization

### Real-time Performance Best Practices

For optimal real-time performance, follow these recommendations:

```python
from funasr_client import AsyncFunASRClient, ClientConfig
from funasr_client.models import RecognitionMode, AudioConfig

# Optimized configuration for real-time scenarios
config = ClientConfig(
    # Core settings
    mode=RecognitionMode.TWO_PASS,  # ⭐ Best balance for real-time
    enable_vad=True,                # Reduces processing load
    chunk_interval=10,              # Optimal latency/accuracy trade-off

    # Performance settings
    auto_reconnect=True,            # Production reliability
    connection_pool_size=5,         # Connection reuse
    buffer_size=8192,               # Optimal buffer size

    # Audio optimization
    audio=AudioConfig(
        sample_rate=16000,          # Standard ASR rate
        channels=1,                 # Mono for efficiency
        sample_width=2              # 16-bit PCM
    )
)

client = AsyncFunASRClient(config=config)
```

### Performance Tuning Guidelines

| Parameter | Recommended Value | Impact |
|-----------|------------------|---------|
| `mode` | `TWO_PASS` ⭐ | Best accuracy/latency balance |
| `chunk_interval` | `10` | Standard real-time performance |
| `chunk_interval` | `5` | Lower latency, higher CPU usage |
| `chunk_interval` | `20` | Higher latency, lower CPU usage |
| `enable_vad` | `True` | Reduces unnecessary processing |
| `sample_rate` | `16000` | Optimal for most ASR models |
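
Buffer and chunk sizes follow directly from the audio format: bytes = sample rate x duration x sample width x channels. A small helper for the 16-bit mono PCM used throughout this guide:

```python
def chunk_bytes(duration_ms, sample_rate=16000, sample_width=2, channels=1):
    """Bytes needed to hold duration_ms of audio in the given format."""
    frames = sample_rate * duration_ms // 1000
    return frames * sample_width * channels

print(chunk_bytes(100))  # → 3200, the 100 ms chunk size used in the examples
```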

### Connection Pooling

```python
from funasr_client import ConnectionManager

# Use connection manager for multiple clients
manager = ConnectionManager(max_connections=10)
client1 = AsyncFunASRClient(connection_manager=manager)
client2 = AsyncFunASRClient(connection_manager=manager)
```

### Audio Processing

```python
from funasr_client import AudioProcessor

# Pre-process audio for better performance
processor = AudioProcessor(
    target_sample_rate=16000,
    enable_vad=True,
    chunk_size=1024
)

processed_audio = processor.process_file("examples/audio/asr_example.wav")
result = await client.recognize_audio(processed_audio)
```

## Testing

Run the test suite:

```bash
# Install test dependencies
pip install funasr-python[test]

# Run all tests
pytest

# Run with coverage
pytest --cov=funasr_client

# Run specific test categories
pytest -m unit
pytest -m integration
```
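
Unit tests for your own code need no running server: the client can be stubbed with `unittest.mock.AsyncMock`. A minimal sketch (assumes the `recognize_file` API shown above; `transcribe` is a hypothetical function under test):

```python
import asyncio
from unittest.mock import AsyncMock

async def transcribe(client, path):
    """Hypothetical app code under test: return the recognized text."""
    result = await client.recognize_file(path)
    return result.text

def test_transcribe():
    client = AsyncMock()
    client.recognize_file.return_value.text = "hello world"
    assert asyncio.run(transcribe(client, "a.wav")) == "hello world"
    client.recognize_file.assert_awaited_once_with("a.wav")

test_transcribe()
```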

## Development

### Setup Development Environment

```bash
git clone https://github.com/alibaba-damo-academy/FunASR.git
cd FunASR/clients/funasr-python

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install in development mode
pip install -e .[dev]

# Install pre-commit hooks
pre-commit install
```

### Code Quality

```bash
# Format code
ruff format src/ tests/

# Lint code
ruff check src/ tests/

# Type check
mypy src/

# Run all quality checks
pre-commit run --all-files
```

## API Reference

### Core Classes

- **`AsyncFunASRClient`**: Main asynchronous client
- **`FunASRClient`**: Synchronous client wrapper
- **`ClientConfig`**: Client configuration
- **`AudioConfig`**: Audio processing configuration
- **`RecognitionResult`**: Recognition result container

### Callback System

- **`RecognitionCallback`**: Abstract callback interface
- **`SimpleCallback`**: Basic callback implementation
- **`LoggingCallback`**: Logging-based callback
- **`MultiCallback`**: Combines multiple callbacks

### Audio Processing

- **`AudioProcessor`**: Audio processing utilities
- **`AudioRecorder`**: Microphone recording
- **`AudioFileStreamer`**: File-based audio streaming

### Utilities

- **`ConfigManager`**: Configuration management
- **`ConnectionManager`**: Connection pooling
- **`Timer`**: Performance timing utilities

## Documentation & Guides

### Quick References ⚡
- [Two-Pass Quick Reference](docs/TWO_PASS_QUICK_REFERENCE.md) - Fast solutions for common Two-Pass mode issues
- [Examples Directory](examples/) - Comprehensive usage examples

### Detailed Guides 📖
- [Two-Pass Best Practices (中文)](docs/TWO_PASS_BEST_PRACTICES_zh.md) - Complete guide to avoid empty Phase 2 results
- API Reference (Coming soon)
- Configuration Guide (Coming soon)
- Performance Optimization (Coming soon)

### Architecture Documentation
- [FunASR WebSocket Protocol](../../runtime/docs/websocket_protocol.md)
- [Two-Pass Architecture](../../runtime/docs/funasr-wss-server-2pass-architecture.puml)

## Contributing

We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for details.

### Development Process

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests for new functionality
5. Run the test suite
6. Submit a pull request

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Changelog

See [CHANGELOG.md](CHANGELOG.md) for version history.

## Support

- **Documentation**: [FunASR Documentation](https://github.com/alibaba-damo-academy/FunASR)
- **Issues**: [GitHub Issues](https://github.com/alibaba-damo-academy/FunASR/issues)
- **Discussions**: [GitHub Discussions](https://github.com/alibaba-damo-academy/FunASR/discussions)

## Acknowledgments

- Built on the excellent [FunASR](https://github.com/alibaba-damo-academy/FunASR) speech recognition toolkit
- Inspired by best practices from the Python asyncio ecosystem
- Thanks to all contributors and users for feedback and improvements