Validation Handler Hook

The validation-handler hook monitors Bash command output for pytest and compilation results, automatically parsing test counts and updating the session checkpoint status.

Overview

Attribute   Value
---------   -----------------------------------
Hook Name   validation-handler
Script      hooks/scripts/validation-handler.py
Event       PostToolUse
Matcher     Bash
Blocking    No (post-processing only)
Timeout     5000ms

Purpose

The validation handler provides:

  1. Automatic Test Parsing: Extracts pass/fail/skip counts from pytest output
  2. Compilation Monitoring: Detects SAT/UNSAT and syntax errors from clingo
  3. State Updates: Writes results to session checkpoint data
  4. Status Messages: Reports test results in human-readable format

Configuration

In hooks/hooks.json:

{
  "PostToolUse": [
    {
      "matcher": "Bash",
      "hooks": [
        {
          "type": "command",
          "command": "python ${CLAUDE_PLUGIN_ROOT}/hooks/scripts/validation-handler.py \"$TOOL_OUTPUT\"",
          "timeout": 5000
        }
      ]
    }
  ]
}

Behavior

Pytest Results Detected

When pytest output is detected and tests pass:

{
  "continue": true,
  "message": "Tests Passed\n   Passed: 25\n   Skipped: 2"
}

When tests fail:

{
  "continue": true,
  "message": "Tests Failed\n   Passed: 23\n   Failed: 2\n   Errors: 0"
}

Compilation Results Detected

When clingo compilation succeeds:

{
  "continue": true,
  "message": "Compilation successful (SAT)"
}

When compilation fails:

{
  "continue": true,
  "message": "Compilation failed: syntax"
}

Non-Validation Output

When output is not from validation commands:

{
  "continue": true,
  "message": ""
}

The hook always allows the operation to proceed since it runs post-execution.

Detection Patterns

Validation Command Detection

def is_validation_command(output: str) -> bool:
    """Check if output is from a validation-related command."""
    indicators = [
        "pytest", "passed", "failed", "PASSED", "FAILED",
        "clingo", "grounding", "UNSAT", "SAT",
        "compile", "syntax error"
    ]
    return any(ind in output for ind in indicators)

The hook only processes output containing these keywords.
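
For illustration, a few probes against the detector (a quick sketch using the function above; outputs checked by hand):

print(is_validation_command("========= 10 passed, 2 skipped ========="))  # True ("passed")
print(is_validation_command("Solving...\nSATISFIABLE"))                   # True ("SAT" is a substring)
print(is_validation_command("total 48\ndrwxr-xr-x 4 user staff src"))     # False

Matching is substring-based, so unrelated output that happens to mention "passed" or "failed" also qualifies; the parsers below then decide whether it actually contains usable counts.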

Pytest Output Parsing

import re

def parse_pytest_output(output: str) -> dict | None:
    """Parse pytest output for pass/fail counts."""
    passed_match = re.search(r'(\d+)\s+passed', output)
    failed_match = re.search(r'(\d+)\s+failed', output)
    skipped_match = re.search(r'(\d+)\s+skipped', output)
    error_match = re.search(r'(\d+)\s+error', output)

    # Only process if this looks like pytest output
    if not any([passed_match, failed_match]):
        return None

    return {
        "passed": int(passed_match.group(1)) if passed_match else 0,
        "failed": int(failed_match.group(1)) if failed_match else 0,
        "skipped": int(skipped_match.group(1)) if skipped_match else 0,
        "errors": int(error_match.group(1)) if error_match else 0,
    }

Matched Patterns:

Pattern           Example Output
---------------   --------------
(\d+)\s+passed    "25 passed"
(\d+)\s+failed    "3 failed"
(\d+)\s+skipped   "2 skipped"
(\d+)\s+error     "1 error"
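
For example, a typical pytest summary line parses as follows (a sketch using parse_pytest_output above):

result = parse_pytest_output("========= 10 passed, 2 skipped in 1.23s =========")
print(result)
# {'passed': 10, 'failed': 0, 'skipped': 2, 'errors': 0}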

Compilation Output Parsing

import re

def parse_compile_output(output: str) -> dict | None:
    """Parse compile output for success/failure."""
    # Failure indicators
    if "error" in output.lower() and "syntax" in output.lower():
        return {"success": False, "error_type": "syntax"}

    if "UNSAT" in output or "UNSATISFIABLE" in output:
        return {"success": False, "error_type": "unsat"}

    if "grounding" in output.lower() and "error" in output.lower():
        return {"success": False, "error_type": "grounding"}

    # Success indicators (checked after the failure branches above, since
    # "UNSAT" contains "SAT" and would otherwise match here)
    if "SAT" in output or "SATISFIABLE" in output:
        return {"success": True, "status": "sat"}

    if "Models" in output and re.search(r'Models\s*:\s*[1-9]', output):
        return {"success": True, "status": "sat"}

    return None

Compilation Status Detection:

Indicator                   Result    Error Type
-------------------------   -------   ----------
"syntax" + "error"          Failure   syntax
"UNSAT" / "UNSATISFIABLE"   Failure   unsat
"grounding" + "error"       Failure   grounding
"SAT" / "SATISFIABLE"       Success   -
"Models: N" (N > 0)         Success   -

Session State Updates

Test Results Storage

When pytest results are detected:

# .mistaber-session.yaml
checkpoints:
  validate:
    status: tests_passed  # or tests_failed
    last_test_run: "2026-01-25T14:30:00.123456"
    test_results:
      passed: 25
      failed: 0
      skipped: 2
      errors: 0
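
A minimal sketch of the test-result write, assuming PyYAML and a .mistaber-session.yaml in the working directory (the helper name update_validate_checkpoint and the exact logic are assumptions, not the script's actual code):

import datetime
from pathlib import Path

import yaml

SESSION_FILE = Path(".mistaber-session.yaml")

def update_validate_checkpoint(test_results: dict) -> bool:
    """Write pytest counts into the validate checkpoint; False if no session."""
    if not SESSION_FILE.exists():
        return False  # no active session: caller displays results only
    data = yaml.safe_load(SESSION_FILE.read_text()) or {}
    checkpoint = data.setdefault("checkpoints", {}).setdefault("validate", {})
    failed = test_results["failed"] > 0 or test_results["errors"] > 0
    checkpoint["status"] = "tests_failed" if failed else "tests_passed"
    checkpoint["last_test_run"] = datetime.datetime.now().isoformat()
    checkpoint["test_results"] = test_results
    SESSION_FILE.write_text(yaml.safe_dump(data))
    return True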

Compilation Results Storage

When clingo compilation is detected:

# .mistaber-session.yaml
checkpoints:
  validate:
    status: compile_failed  # Only set on failure
    last_compile: "2026-01-25T14:35:00.456789"
    compile_results:
      success: true  # or false
      status: sat    # or error_type: syntax/unsat/grounding

Status Values

Test Status

Status         Condition
------------   ---------------------------
tests_passed   failed == 0 AND errors == 0
tests_failed   failed > 0 OR errors > 0

Compile Status

Status           Condition
--------------   ----------------------
compile_failed   Any error detected
(unchanged)      Compilation successful

Implementation Flow

graph TD
    A[Bash Command Completes] --> B{Is Validation Output?}
    B -->|No| C[Return - No Message]
    B -->|Yes| D{Pytest Patterns?}
    D -->|Yes| E[Parse Test Counts]
    D -->|No| F{Compile Patterns?}
    E --> G[Load Session State]
    F -->|Yes| H[Parse Compile Status]
    F -->|No| C
    H --> G
    G --> I{Session Exists?}
    I -->|Yes| J[Update Checkpoint]
    I -->|No| K[Display Results Only]
    J --> L[Save Session]
    L --> M[Return with Message]
    K --> M
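
In code, the flow above might reduce to a dispatcher like this (a minimal sketch, not the actual script; update_validate_checkpoint is the hypothetical helper from the session sketch above, and format_test_message is sketched under Message Formats below):

import json
import sys

def main() -> None:
    # Output arrives as "$TOOL_OUTPUT" in argv per hooks.json; falling back
    # to stdin for manual testing is an assumption
    output = sys.argv[1] if len(sys.argv) > 1 else sys.stdin.read()
    message = ""
    if is_validation_command(output):
        tests = parse_pytest_output(output)
        if tests is not None:
            update_validate_checkpoint(tests)
            message = format_test_message(tests)
        else:
            compiled = parse_compile_output(output)
            if compiled is not None:
                message = ("Compilation successful (SAT)" if compiled["success"]
                           else "Compilation failed: " + compiled["error_type"])
    # Always allow the operation to proceed: the hook is post-processing only
    print(json.dumps({"continue": True, "message": message}))

if __name__ == "__main__":
    main()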

Message Formats

Test Pass Message

Tests Passed
   Passed: 25
   Skipped: 2

Test Fail Message

Tests Failed
   Passed: 23
   Failed: 2
   Errors: 0

Compilation Success Message

Compilation successful (SAT)

Compilation Failure Messages

Compilation failed: syntax
Compilation failed: unsat
Compilation failed: grounding

No Session Message

When no session state exists but results are detected:

Test Results: 25 passed, 0 failed
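
A formatter matching these templates might look like the following (format_test_message is a hypothetical name; omitting a zero skip count is an assumption based on the examples above):

def format_test_message(results: dict) -> str:
    """Render pytest counts using the message templates above."""
    if results["failed"] > 0 or results["errors"] > 0:
        return (f"Tests Failed\n   Passed: {results['passed']}\n"
                f"   Failed: {results['failed']}\n   Errors: {results['errors']}")
    lines = ["Tests Passed", f"   Passed: {results['passed']}"]
    if results["skipped"]:
        lines.append(f"   Skipped: {results['skipped']}")
    return "\n".join(lines)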

Integration Points

With Validate Skill

The validation-handler automatically updates checkpoint status after running:

# User runs via validate skill
pytest tests/corpus/test_yd_87.py -v

# Hook parses output and updates session:
# checkpoints.validate.status = tests_passed
# checkpoints.validate.test_results = {passed: 10, failed: 0, ...}

With Review Skill

The review skill reads validation results:

# Session state after validation-handler
checkpoints:
  validate:
    status: tests_passed
    test_results:
      passed: 10
      failed: 0

The review skill can reference these results when preparing the review package.

With Clingo Compilation

When clingo is invoked for compilation testing:

clingo mistaber/ontology/corpus/yd_87/base.lp -n0

# Output contains "SAT" or "UNSAT"
# Hook updates compile_results

Example Sessions

Successful Test Run

Command:

pytest tests/corpus/test_yd_87.py -v

Output (truncated):

tests/corpus/test_yd_87.py::test_mechaber_rules PASSED
tests/corpus/test_yd_87.py::test_rema_override PASSED
...
========= 10 passed, 2 skipped in 1.23s =========

Hook Response:

{
  "continue": true,
  "message": "Tests Passed\n   Passed: 10\n   Skipped: 2"
}

Session Update:

checkpoints:
  validate:
    status: tests_passed
    last_test_run: "2026-01-25T14:30:00.123456"
    test_results:
      passed: 10
      failed: 0
      skipped: 2
      errors: 0

Failed Test Run

Command:

pytest tests/corpus/test_yd_87.py -v

Output (truncated):

tests/corpus/test_yd_87.py::test_mechaber_rules PASSED
tests/corpus/test_yd_87.py::test_rema_override FAILED
...
========= 8 passed, 2 failed in 1.45s =========

Hook Response:

{
  "continue": true,
  "message": "Tests Failed\n   Passed: 8\n   Failed: 2\n   Errors: 0"
}

Session Update:

checkpoints:
  validate:
    status: tests_failed
    last_test_run: "2026-01-25T14:32:00.789012"
    test_results:
      passed: 8
      failed: 2
      skipped: 0
      errors: 0

Compilation Test

Command:

clingo mistaber/ontology/corpus/yd_87/base.lp \
       mistaber/ontology/worlds/mechaber.lp -n0

Output:

clingo version 5.6.2
Reading from base.lp ...
Solving...
Answer: 1
issur(mixture_1,achiila,mechaber)
SATISFIABLE

Models       : 1

Hook Response:

{
  "continue": true,
  "message": "Compilation successful (SAT)"
}

Debugging

Manual Testing

# Test with pytest output
echo "10 passed, 2 failed in 1.23s" | \
  python mistaber-skills/hooks/scripts/validation-handler.py

# Test with compile output
echo "SATISFIABLE\nModels: 1" | \
  python mistaber-skills/hooks/scripts/validation-handler.py

# Test with non-validation output
echo "ls -la completed" | \
  python mistaber-skills/hooks/scripts/validation-handler.py

Debug Mode

export MISTABER_DEBUG=1
python mistaber-skills/hooks/scripts/validation-handler.py "25 passed, 0 failed"

Verify Session Updates

# Run a test
pytest tests/corpus/test_yd_87.py -v

# Check session state
grep -A10 "validate:" .mistaber-session.yaml

Common Issues

Results Not Captured

Symptom: Test runs but session not updated.

Causes:

  • Output doesn't contain expected patterns
  • No active session file
  • PyYAML not installed

Solutions:

# Check for patterns in output
pytest tests/ -v 2>&1 | grep -E "(passed|failed|skipped)"

# Verify session exists
ls -la .mistaber-session.yaml

# Check PyYAML
python -c "import yaml; print(yaml.__version__)"

Wrong Status Set

Symptom: Status shows "tests_failed" when tests passed.

Causes:

  • Previous run's errors still in output
  • Regex matching unintended text

Solutions:

# Run with clean output
pytest tests/ -v 2>&1 | tail -5

# Check parsed values
python -c "
import re
output = '25 passed, 0 failed'
print('passed:', re.search(r'(\d+)\s+passed', output))
print('failed:', re.search(r'(\d+)\s+failed', output))
"

Session Not Saved

Symptom: Results displayed but not persisted.

Causes:

  • File permission issues
  • YAML write failure
  • Session file locked

Solutions:

# Check permissions
ls -la .mistaber-session.yaml

# Test YAML write
python -c "
import yaml
with open('.mistaber-session.yaml', 'r') as f:
    data = yaml.safe_load(f)
print('Session loaded:', data is not None)
"

Performance Notes

  • Execution Time: < 100ms for typical output
  • Regex Patterns: Compiled once per invocation
  • File I/O: One read, one write per validation detection
  • Non-Validation Output: Returns immediately (no processing)