Skip to content

Validation Skill

The validate skill compiles encoded HLL rules, runs semantic validation checks, generates and executes test scenarios, and verifies machloket encoding before final human review.

Overview

Attribute Value
Skill Name validate
Phase 3 of 5
Checkpoint 3: Test Review
Prerequisites Checkpoint 2 (hll-encode) approved
Outputs validation-report.md, validation-results.yaml, test-scenarios.yaml
Trigger Phrases "validate encoding", "run tests", "compile rules", "check rules", "verify encoding"

Prerequisites

Before this skill can execute:

  1. HLL-encode checkpoint MUST be approved
  2. Encoded .lp files must exist:
  3. Shared definitions in mistaber/ontology/corpus/yd_{siman}/base.lp
  4. World-specific rules in mistaber/ontology/worlds/*.lp
  5. Encoding report and mapping artifacts must exist
  6. Session state shows hll-encode.status: approved

Phase A: Compilation Validation

Step 1: HLL Compilation

The skill compiles HLL to ASP using the Mistaber compiler:

from mistaber.dsl.compiler import compile_hll_file, CompileError

try:
    # Compile world files that include the new rules
    for world_file in Path("mistaber/ontology/worlds").glob("*.lp"):
        result = compile_hll_file(world_file)
    compilation_status = "passed"
except CompileError as e:
    compilation_status = "failed"
    error_details = str(e)

Compilation Checks:

Check Description
Syntax Valid ASP syntax
Directives Valid HLL directive format
Includes All #include files exist
Predicates All predicates defined

Step 2: ASP Grounding Test

The skill tests that Clingo can ground the program:

import clingo

ctl = clingo.Control(["--warn=none"])

# Load all ontology
for lp_file in Path("mistaber/ontology").rglob("*.lp"):
    ctl.load(str(lp_file))

# Ground
try:
    ctl.ground([("base", [])])
    grounding_status = "passed"
except RuntimeError as e:
    grounding_status = "failed"
    grounding_error = str(e)

Grounding Statistics:

Metric Description
Rules grounded Number of ground rules generated
Atoms created Number of ground atoms
Grounding time Time to complete grounding (ms)

Step 3: Satisfiability Check

The skill verifies the program is satisfiable (has at least one answer set):

with ctl.solve(yield_=True) as handle:
    model_count = 0
    for model in handle:
        model_count += 1
        if model_count >= 1:
            break

if model_count == 0:
    satisfiability = "UNSAT"  # CRITICAL - rules are contradictory
else:
    satisfiability = "SAT"

UNSAT is Critical

If the program is UNSAT, the rules contain logical contradictions. This MUST be resolved before proceeding. The skill will not pass checkpoint with UNSAT result.

UNSAT Debugging:

# Get minimal unsatisfiable core
clingo ontology.lp --opt-mode=optN -n 0

# Trace to find conflict
clingo --trace=asp ontology.lp

Phase B: Semantic Validation

Step 1: Disjointness Verification

The skill verifies no entity violates disjointness constraints:

% Constraints checked:
% No food can be both basar and chalav
:- food(F), food_type(F, basar), food_type(F, chalav).

% No mixture can have contradictory status in same world
:- asserts(W, issur(A, M, _)), asserts(W, heter(A, M)), not override(W, _, _).

Validation Query:

violations = engine.query("disjoint_violation(X, Y)")
if violations:
    semantic_errors.append(f"Disjointness violation: {violations}")

Step 2: Madrega Consistency

The skill verifies d'oraita rules have Torah sources:

for rule in rules_with_madrega_doraita:
    makor_chain = get_makor_chain(rule)
    if not has_torah_source(makor_chain):
        warnings.append(
            f"Rule {rule} marked d_oraita but chain doesn't reach Torah"
        )

Madrega Validation Rules:

Madrega Required Source
d_oraita Torah verse
d_rabanan Mishnah, Gemara, or Takana
minhag Community practice reference
chumra Any halachic source

Step 3: World Inheritance Verification

The skill verifies world hierarchy is respected:

# Check that asserts flow correctly through accessible/2
for world in ["mechaber", "rema", "sefardi_yo", "ashk_mb"]:
    base_assertions = engine.query("asserts(base, X)")
    world_holds = engine.query(f"holds(X, {world})")

    # Base assertions should propagate unless overridden
    for assertion in base_assertions:
        if assertion not in world_holds:
            if not is_overridden(world, assertion):
                errors.append(f"Inheritance broken: {assertion} not in {world}")

World Inheritance Map:

graph TD
    BASE["base"] --> MECH["mechaber"]
    BASE --> REMA["rema"]
    BASE --> GRA["gra"]
    MECH --> SEF["sefardi_yo"]
    REMA --> ASHK_MB["ashk_mb"]
    REMA --> ASHK_AH["ashk_ah"]
    GRA --> ASHK_AH

Step 4: Makor Chain Integrity

The skill verifies every normative rule has makor:

normative_rules = engine.query("rule(R), scope(R, _)")
for rule in normative_rules:
    makor = engine.query(f"makor({rule['R']}, _)")
    if not makor:
        errors.append(f"Rule {rule['R']} missing makor")

Phase C: Behavioral Testing

Step 1: Generate Test Scenarios

Based on the corpus statements, the skill generates test cases:

# test-scenarios-YD-87-3.yaml
scenarios:
  - id: test_dag_chalav_mechaber
    description: "Fish + dairy mixture should trigger sakana in mechaber world"
    setup: |
      food(salmon). food_type(salmon, dag).
      food(cream). food_type(cream, chalav).
      mixture(m1). contains(m1, salmon). contains(m1, cream).
    expected:
      mechaber:
        - holds(sakana(m1), mechaber)
        - holds(issur(achiila, m1, sakana), mechaber)
      rema:
        - holds(heter(achiila, m1), rema)
        - holds(no_sakana(m1), rema)

Step 2: Positive Tests

Tests that expected conclusions hold:

def run_positive_test(scenario, world):
    # Setup scenario
    result = engine.analyze(scenario['setup'], world=world)

    # Check expected atoms
    for expected in scenario['expected'][world]:
        assert expected in result['atoms'], \
            f"Expected {expected} not found in {world}"

Positive Test Categories:

Category Tests
Base rules Universal prohibitions hold
World-specific Mechaber/Rema differences work
Inheritance Child worlds inherit correctly
Override Overrides take effect

Step 3: Negative Tests

Tests that incorrect conclusions do NOT hold:

def run_negative_test(scenario, world):
    result = engine.analyze(scenario['setup'], world=world)

    for unexpected in scenario.get('unexpected', {}).get(world, []):
        assert unexpected not in result['atoms'], \
            f"Unexpected {unexpected} found in {world}"

Example Negative Tests:

negative_tests:
  - id: rema_no_sakana
    description: "Rema world should NOT have sakana for fish+dairy"
    setup: |
      food(salmon). food_type(salmon, dag).
      food(butter). food_type(butter, chalav).
      mixture(m1). contains(m1, salmon). contains(m1, butter).
    unexpected:
      rema:
        - holds(sakana(m1), rema)
        - holds(issur(achiila, m1, sakana), rema)

Step 4: Edge Cases

Tests boundary conditions:

edge_cases:
  - id: empty_mixture
    description: "Empty mixture should not trigger rules"
    setup: "mixture(m_empty)."
    expected:
      base:
        - not holds(issur(achiila, m_empty, _), base)

  - id: single_component
    description: "Single component mixture - no prohibition"
    setup: |
      food(beef). food_type(beef, beheima).
      mixture(m_single). contains(m_single, beef).
    expected:
      base:
        - not holds(issur(achiila, m_single, _), base)

  - id: two_basar_no_chalav
    description: "Two basar types without chalav - no BB"
    setup: |
      food(beef). food_type(beef, beheima).
      food(chicken). food_type(chicken, of).
      mixture(m_basar). contains(m_basar, beef). contains(m_basar, chicken).
    expected:
      base:
        - not holds(issur(achiila, m_basar, d_oraita), base)

Step 5: Machloket Verification

The skill verifies both dispute positions are derivable:

def verify_machloket(machloket_entry):
    topic = machloket_entry['topic']
    auth_a = machloket_entry['position_a']['authority']
    auth_b = machloket_entry['position_b']['authority']

    # Check position A holds in world A
    result_a = engine.query(f"holds({topic}_position_a, {auth_a})")
    assert result_a, f"Position A not derivable in {auth_a}"

    # Check position B holds in world B
    result_b = engine.query(f"holds({topic}_position_b, {auth_b})")
    assert result_b, f"Position B not derivable in {auth_b}"

    # Check machloket marker exists
    marker = engine.query(f"machloket({topic}, {auth_a}, {auth_b}, _)")
    assert marker, f"Machloket marker missing for {topic}"

Step 6: Regression Testing

The skill runs all existing tests to ensure no regressions:

pytest tests/ontology/ -v

Result Parsing:

def parse_pytest_results(output):
    passed = int(re.search(r'(\d+) passed', output).group(1))
    failed = int(re.search(r'(\d+) failed', output).group(1) or 0)
    skipped = int(re.search(r'(\d+) skipped', output).group(1) or 0)

    return {
        'passed': passed,
        'failed': failed,
        'skipped': skipped,
        'total': passed + failed + skipped,
        'success': failed == 0
    }

Step 7: Context Sensitivity Testing

Tests lechatchila/bedieved distinctions:

context_tests:
  - id: test_lechatchila
    context: lechatchila
    setup: |
      food(salmon). food_type(salmon, dag).
      food(cream). food_type(cream, chalav).
      mixture(m1). contains(m1, salmon). contains(m1, cream).
      context(lechatchila).
    expected:
      mechaber:
        - holds(issur(achiila, m1, sakana), mechaber)

  - id: test_bedieved
    context: bedieved
    setup: |
      food(salmon). food_type(salmon, dag).
      food(cream). food_type(cream, chalav).
      mixture(m1). contains(m1, salmon). contains(m1, cream).
      context(bedieved).
      hefsed_merubeh.
    expected:
      mechaber:
        - "varies by posek"  # May need human interpretation

Phase D: Query Verification

Step 1: Engine API Testing

The skill tests all engine methods work with new rules:

# Test ask()
assert engine.ask("holds(issur(achiila, m1, d_oraita), base)")

# Test query()
results = engine.query("holds(issur(A, M, L), W)")
assert len(results) > 0

# Test compare()
comparison = engine.compare(
    "holds(issur(achiila, M, L), W)",
    worlds=["mechaber", "rema"]
)
assert "mechaber" in comparison
assert "rema" in comparison

Step 2: Derivation Chain Testing

The skill tests explain() produces valid derivation:

explanation = engine.explain(
    "holds(issur(achiila, m1, d_oraita), mechaber)",
    world="mechaber",
    additional_facts="food(beef). food_type(beef, beheima). ..."
)

assert explanation['derivation'] is not None
assert len(explanation['tree']) > 0

Step 3: Performance Testing

The skill ensures queries complete in reasonable time:

import time

start = time.time()
for _ in range(100):
    engine.query("holds(issur(A, M, L), W)")
elapsed = time.time() - start

avg_query_time = elapsed / 100
assert avg_query_time < 0.1, f"Query too slow: {avg_query_time}s"

Performance Targets:

Metric Target Warning Threshold
Grounding time <100ms >200ms
First query <50ms >100ms
Subsequent queries <10ms >50ms
Memory usage <100MB >200MB

Phase E: Validation Report

Output 1: Validation Report

validation-report-YD-{siman}-{seif}.md

Uses template from templates/validation-report.md:

# Validation Report: YD {siman}:{seif}

## Summary
- **Reference:** YD {siman}:{seif}
- **Validated:** {timestamp}
- **Status:** {PASSED/FAILED}

## Compilation
- HLL Compilation: {passed/failed}
- ASP Grounding: {passed/failed}
- Satisfiability: {SAT/UNSAT}

## Semantic Validation
- Disjointness: {passed/failed}
- Madrega Consistency: {passed/failed}
- World Inheritance: {passed/failed}
- Makor Integrity: {passed/failed}

## Behavioral Tests
- Positive Tests: {N} passed
- Negative Tests: {N} passed
- Edge Cases: {N} passed
- Machloket Verification: {passed/failed}
- Regression: {N}/{N} passed

## Performance
- Average Query Time: {N}ms
- Grounding Time: {N}ms

## Issues Found
{list of issues if any}

Output 2: Validation Results YAML

validation-results-YD-{siman}-{seif}.yaml

reference: "YD:{siman}:{seif}"
validated_at: "{timestamp}"
status: passed  # or failed

compilation:
  hll_compilation: passed
  asp_grounding: passed
  satisfiability: SAT

semantic:
  disjointness: passed
  madrega_consistency: passed
  world_inheritance: passed
  makor_integrity: passed

behavioral:
  positive_tests:
    total: 10
    passed: 10
    failed: 0
  negative_tests:
    total: 5
    passed: 5
    failed: 0
  edge_cases:
    total: 3
    passed: 3
    failed: 0
  machloket:
    verified: true
  regression:
    total: 495
    passed: 495
    failed: 0
    skipped: 0

performance:
  avg_query_ms: 45
  grounding_ms: 120

Output 3: Test Scenarios

test-scenarios-YD-{siman}-{seif}.yaml

Full test scenarios for future regression testing.

Checkpoint Criteria

Before requesting human approval, verify:

  • [ ] Compilation successful (no syntax errors)
  • [ ] Program is SAT (no contradictions)
  • [ ] All semantic checks pass
  • [ ] All generated tests pass
  • [ ] Machloket properly verified
  • [ ] No regression failures
  • [ ] Performance acceptable (<100ms avg query)

Session State Update

After completing validation:

current_phase: validate
checkpoints:
  validate:
    status: pending_review
    artifacts:
      - validation-report-YD-{siman}-{seif}.md
      - validation-results-YD-{siman}-{seif}.yaml
      - test-scenarios-YD-{siman}-{seif}.yaml
    test_summary:
      total: 25
      passed: 25
      failed: 0
    performance:
      avg_query_ms: 45

Common Issues

UNSAT Result

Symptom: Clingo returns UNSATISFIABLE.

Diagnosis:

clingo ontology.lp --opt-mode=optN -n 0

Common Causes:

  1. Direct contradiction (issur + heter for same thing)
  2. Circular negation without choice rule
  3. Conflicting world assertions

Grounding Timeout

Symptom: Grounding takes >10 seconds.

Solutions:

  1. Add domain bounds
  2. Reduce Cartesian products
  3. Use projection directives

Test Failure

Symptom: Expected atom not found.

Diagnosis:

# Debug specific query
result = engine.analyze(setup_facts, world="mechaber")
print("All atoms:", result['atoms'])

Common Causes:

  1. Missing prerequisite facts
  2. World inheritance not working
  3. Override incorrectly structured