# Validation Skill

The validate skill compiles encoded HLL rules, runs semantic validation checks, generates and executes test scenarios, and verifies machloket encoding before final human review.
## Overview
| Attribute | Value |
|---|---|
| Skill Name | validate |
| Phase | 3 of 5 |
| Checkpoint | 3: Test Review |
| Prerequisites | Checkpoint 2 (hll-encode) approved |
| Outputs | validation-report.md, validation-results.yaml, test-scenarios.yaml |
| Trigger Phrases | "validate encoding", "run tests", "compile rules", "check rules", "verify encoding" |
## Prerequisites

Before this skill can execute:

- HLL-encode checkpoint MUST be approved
- Encoded `.lp` files must exist:
    - Shared definitions in `mistaber/ontology/corpus/yd_{siman}/base.lp`
    - World-specific rules in `mistaber/ontology/worlds/*.lp`
- Encoding report and mapping artifacts must exist
- Session state shows `hll-encode.status: approved`
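The checklist above can be expressed as a small guard before the skill runs. The helper name `check_prerequisites` and the session-state dict shape are illustrative assumptions, not the skill's actual API:

```python
from pathlib import Path

def check_prerequisites(siman: str, session_state: dict) -> list[str]:
    # Hypothetical guard mirroring the checklist above; returns a list
    # of human-readable problems (an empty list means ready to run)
    problems = []
    base = Path(f"mistaber/ontology/corpus/yd_{siman}/base.lp")
    if not base.exists():
        problems.append(f"missing {base}")
    if not any(Path("mistaber/ontology/worlds").glob("*.lp")):
        problems.append("no world-specific .lp files found")
    status = (
        session_state.get("checkpoints", {}).get("hll-encode", {}).get("status")
    )
    if status != "approved":
        problems.append(f"hll-encode status is {status!r}, expected 'approved'")
    return problems
```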
## Phase A: Compilation Validation

### Step 1: HLL Compilation

The skill compiles HLL to ASP using the Mistaber compiler:

```python
from pathlib import Path

from mistaber.dsl.compiler import compile_hll_file, CompileError

try:
    # Compile world files that include the new rules
    for world_file in Path("mistaber/ontology/worlds").glob("*.lp"):
        result = compile_hll_file(world_file)
    compilation_status = "passed"
except CompileError as e:
    compilation_status = "failed"
    error_details = str(e)
```
Compilation Checks:
| Check | Description |
|---|---|
| Syntax | Valid ASP syntax |
| Directives | Valid HLL directive format |
| Includes | All #include files exist |
| Predicates | All predicates defined |
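As a rough illustration of the predicate check, undefined predicates can be approximated textually. This is a sketch, not the compiler's actual implementation; a real check would work on the parsed AST rather than raw text:

```python
import re

def undefined_predicates(program_text: str) -> set:
    # Rough textual sketch: a predicate counts as "defined" if it appears
    # in some rule head, "used" if it appears in a body.
    defined, used = set(), set()
    for line in program_text.splitlines():
        line = line.split("%")[0].strip()  # drop ASP comments
        if not line:
            continue
        head, _, body = line.partition(":-")
        defined.update(re.findall(r"([a-z]\w*)\(", head))
        used.update(re.findall(r"([a-z]\w*)\(", body))
    return used - defined
```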
### Step 2: ASP Grounding Test

The skill tests that Clingo can ground the program:

```python
from pathlib import Path

import clingo

ctl = clingo.Control(["--warn=none"])

# Load the full ontology
for lp_file in Path("mistaber/ontology").rglob("*.lp"):
    ctl.load(str(lp_file))

# Ground the program
try:
    ctl.ground([("base", [])])
    grounding_status = "passed"
except RuntimeError as e:
    grounding_status = "failed"
    grounding_error = str(e)
```
Grounding Statistics:
| Metric | Description |
|---|---|
| Rules grounded | Number of ground rules generated |
| Atoms created | Number of ground atoms |
| Grounding time | Time to complete grounding (ms) |
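The grounding-time metric can be captured by wrapping the grounding call with a timer. `time_grounding` is an illustrative helper, not part of the clingo API:

```python
import time

def time_grounding(ground_fn) -> float:
    # Run any zero-argument grounding callable and return elapsed milliseconds
    start = time.perf_counter()
    ground_fn()
    return (time.perf_counter() - start) * 1000.0

# Usage sketch (assuming a loaded clingo.Control named ctl):
# grounding_ms = time_grounding(lambda: ctl.ground([("base", [])]))
```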
### Step 3: Satisfiability Check

The skill verifies the program is satisfiable (has at least one answer set):

```python
with ctl.solve(yield_=True) as handle:
    model_count = 0
    for model in handle:
        model_count += 1
        if model_count >= 1:
            break

if model_count == 0:
    satisfiability = "UNSAT"  # CRITICAL - rules are contradictory
else:
    satisfiability = "SAT"
```

**UNSAT is Critical:** if the program is UNSAT, the rules contain logical contradictions. This MUST be resolved before proceeding; the skill will not pass the checkpoint with an UNSAT result.

UNSAT Debugging:

```shell
# Enumerate answer sets to help narrow down the contradiction
clingo ontology.lp --opt-mode=optN -n 0

# Trace to find the conflict
clingo --trace=asp ontology.lp
```
## Phase B: Semantic Validation

### Step 1: Disjointness Verification

The skill verifies no entity violates disjointness constraints:

```prolog
% Constraints checked:

% No food can be both basar and chalav
:- food(F), food_type(F, basar), food_type(F, chalav).

% No mixture can have contradictory status in the same world
:- asserts(W, issur(A, M, _)), asserts(W, heter(A, M)), not override(W, _, _).
```

Validation Query:

```python
violations = engine.query("disjoint_violation(X, Y)")
if violations:
    semantic_errors.append(f"Disjointness violation: {violations}")
```
### Step 2: Madrega Consistency

The skill verifies d'oraita rules have Torah sources:

```python
for rule in rules_with_madrega_doraita:
    makor_chain = get_makor_chain(rule)
    if not has_torah_source(makor_chain):
        warnings.append(
            f"Rule {rule} marked d_oraita but chain doesn't reach Torah"
        )
```
Madrega Validation Rules:
| Madrega | Required Source |
|---|---|
| d_oraita | Torah verse |
| d_rabanan | Mishnah, Gemara, or Takana |
| minhag | Community practice reference |
| chumra | Any halachic source |
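The table above can be encoded directly as a mapping for the consistency check. This is a sketch; the source-kind labels (`torah`, `gemara`, ...) are illustrative, not the project's actual identifiers:

```python
# Acceptable source kinds per madrega, from the table above
REQUIRED_SOURCES = {
    "d_oraita": {"torah"},
    "d_rabanan": {"mishnah", "gemara", "takana"},
    "minhag": {"community_practice"},
    "chumra": {"torah", "mishnah", "gemara", "takana", "community_practice"},
}

def madrega_satisfied(madrega: str, source_kinds: set) -> bool:
    # The makor chain satisfies the madrega if any source in the chain
    # is of an acceptable kind
    allowed = REQUIRED_SOURCES.get(madrega, set())
    return bool(allowed & source_kinds)
```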
### Step 3: World Inheritance Verification

The skill verifies world hierarchy is respected:

```python
# Check that asserts flow correctly through accessible/2
for world in ["mechaber", "rema", "sefardi_yo", "ashk_mb"]:
    base_assertions = engine.query("asserts(base, X)")
    world_holds = engine.query(f"holds(X, {world})")
    # Base assertions should propagate unless overridden
    for assertion in base_assertions:
        if assertion not in world_holds:
            if not is_overridden(world, assertion):
                errors.append(f"Inheritance broken: {assertion} not in {world}")
```
World Inheritance Map:

```mermaid
graph TD
    BASE["base"] --> MECH["mechaber"]
    BASE --> REMA["rema"]
    BASE --> GRA["gra"]
    MECH --> SEF["sefardi_yo"]
    REMA --> ASHK_MB["ashk_mb"]
    REMA --> ASHK_AH["ashk_ah"]
    GRA --> ASHK_AH
```
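The same hierarchy, expressed as an adjacency map, makes the inheritance check easy to drive programmatically. This helper is a sketch, not part of the engine:

```python
# World hierarchy from the map above, parent -> children
CHILDREN = {
    "base": ["mechaber", "rema", "gra"],
    "mechaber": ["sefardi_yo"],
    "rema": ["ashk_mb", "ashk_ah"],
    "gra": ["ashk_ah"],
}

def descendants(world: str) -> set:
    # Worlds that inherit (directly or transitively) from `world`
    out = set()
    stack = list(CHILDREN.get(world, []))
    while stack:
        w = stack.pop()
        if w not in out:
            out.add(w)
            stack.extend(CHILDREN.get(w, []))
    return out
```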
### Step 4: Makor Chain Integrity

The skill verifies every normative rule has a makor:

```python
normative_rules = engine.query("rule(R), scope(R, _)")
for rule in normative_rules:
    makor = engine.query(f"makor({rule['R']}, _)")
    if not makor:
        errors.append(f"Rule {rule['R']} missing makor")
```
## Phase C: Behavioral Testing

### Step 1: Generate Test Scenarios

Based on the corpus statements, the skill generates test cases:

```yaml
# test-scenarios-YD-87-3.yaml
scenarios:
  - id: test_dag_chalav_mechaber
    description: "Fish + dairy mixture should trigger sakana in mechaber world"
    setup: |
      food(salmon). food_type(salmon, dag).
      food(cream). food_type(cream, chalav).
      mixture(m1). contains(m1, salmon). contains(m1, cream).
    expected:
      mechaber:
        - holds(sakana(m1), mechaber)
        - holds(issur(achiila, m1, sakana), mechaber)
      rema:
        - holds(heter(achiila, m1), rema)
        - holds(no_sakana(m1), rema)
```
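Once loaded (e.g. with PyYAML) into a dict, a scenario like the one above can be checked across worlds. This runner is a sketch written against the `engine.analyze` API used elsewhere in this document:

```python
def run_scenario(engine, scenario: dict) -> dict:
    # Returns, per world, the expected atoms that were NOT derived;
    # an empty dict means the scenario passed in every world
    failures = {}
    for world, expected_atoms in scenario["expected"].items():
        result = engine.analyze(scenario["setup"], world=world)
        missing = [a for a in expected_atoms if a not in result["atoms"]]
        if missing:
            failures[world] = missing
    return failures
```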
### Step 2: Positive Tests

Tests that expected conclusions hold:

```python
def run_positive_test(scenario, world):
    # Set up and analyze the scenario
    result = engine.analyze(scenario['setup'], world=world)
    # Check expected atoms
    for expected in scenario['expected'][world]:
        assert expected in result['atoms'], \
            f"Expected {expected} not found in {world}"
```
Positive Test Categories:
| Category | Tests |
|---|---|
| Base rules | Universal prohibitions hold |
| World-specific | Mechaber/Rema differences work |
| Inheritance | Child worlds inherit correctly |
| Override | Overrides take effect |
### Step 3: Negative Tests

Tests that incorrect conclusions do NOT hold:

```python
def run_negative_test(scenario, world):
    result = engine.analyze(scenario['setup'], world=world)
    for unexpected in scenario.get('unexpected', {}).get(world, []):
        assert unexpected not in result['atoms'], \
            f"Unexpected {unexpected} found in {world}"
```

Example Negative Tests:

```yaml
negative_tests:
  - id: rema_no_sakana
    description: "Rema world should NOT have sakana for fish+dairy"
    setup: |
      food(salmon). food_type(salmon, dag).
      food(butter). food_type(butter, chalav).
      mixture(m1). contains(m1, salmon). contains(m1, butter).
    unexpected:
      rema:
        - holds(sakana(m1), rema)
        - holds(issur(achiila, m1, sakana), rema)
```
### Step 4: Edge Cases

Tests boundary conditions:

```yaml
edge_cases:
  - id: empty_mixture
    description: "Empty mixture should not trigger rules"
    setup: "mixture(m_empty)."
    expected:
      base:
        - not holds(issur(achiila, m_empty, _), base)
  - id: single_component
    description: "Single component mixture - no prohibition"
    setup: |
      food(beef). food_type(beef, beheima).
      mixture(m_single). contains(m_single, beef).
    expected:
      base:
        - not holds(issur(achiila, m_single, _), base)
  - id: two_basar_no_chalav
    description: "Two basar types without chalav - no BB"
    setup: |
      food(beef). food_type(beef, beheima).
      food(chicken). food_type(chicken, of).
      mixture(m_basar). contains(m_basar, beef). contains(m_basar, chicken).
    expected:
      base:
        - not holds(issur(achiila, m_basar, d_oraita), base)
```
### Step 5: Machloket Verification

The skill verifies both dispute positions are derivable:

```python
def verify_machloket(machloket_entry):
    topic = machloket_entry['topic']
    auth_a = machloket_entry['position_a']['authority']
    auth_b = machloket_entry['position_b']['authority']

    # Check position A holds in world A
    result_a = engine.query(f"holds({topic}_position_a, {auth_a})")
    assert result_a, f"Position A not derivable in {auth_a}"

    # Check position B holds in world B
    result_b = engine.query(f"holds({topic}_position_b, {auth_b})")
    assert result_b, f"Position B not derivable in {auth_b}"

    # Check the machloket marker exists
    marker = engine.query(f"machloket({topic}, {auth_a}, {auth_b}, _)")
    assert marker, f"Machloket marker missing for {topic}"
```
### Step 6: Regression Testing

The skill runs all existing tests to ensure no regressions.

Result Parsing:

```python
import re

def parse_pytest_results(output):
    def count(pattern):
        # A section (e.g. "3 failed") is simply absent from pytest's
        # summary line when its count is zero, so default to 0
        m = re.search(pattern, output)
        return int(m.group(1)) if m else 0

    passed = count(r'(\d+) passed')
    failed = count(r'(\d+) failed')
    skipped = count(r'(\d+) skipped')
    return {
        'passed': passed,
        'failed': failed,
        'skipped': skipped,
        'total': passed + failed + skipped,
        'success': failed == 0
    }
```
### Step 7: Context Sensitivity Testing

Tests lechatchila/bedieved distinctions:

```yaml
context_tests:
  - id: test_lechatchila
    context: lechatchila
    setup: |
      food(salmon). food_type(salmon, dag).
      food(cream). food_type(cream, chalav).
      mixture(m1). contains(m1, salmon). contains(m1, cream).
      context(lechatchila).
    expected:
      mechaber:
        - holds(issur(achiila, m1, sakana), mechaber)
  - id: test_bedieved
    context: bedieved
    setup: |
      food(salmon). food_type(salmon, dag).
      food(cream). food_type(cream, chalav).
      mixture(m1). contains(m1, salmon). contains(m1, cream).
      context(bedieved).
      hefsed_merubeh.
    expected:
      mechaber:
        - "varies by posek"  # May need human interpretation
```
## Phase D: Query Verification

### Step 1: Engine API Testing

The skill tests all engine methods work with the new rules:

```python
# Test ask()
assert engine.ask("holds(issur(achiila, m1, d_oraita), base)")

# Test query()
results = engine.query("holds(issur(A, M, L), W)")
assert len(results) > 0

# Test compare()
comparison = engine.compare(
    "holds(issur(achiila, M, L), W)",
    worlds=["mechaber", "rema"]
)
assert "mechaber" in comparison
assert "rema" in comparison
```
### Step 2: Derivation Chain Testing

The skill tests that explain() produces a valid derivation:

```python
explanation = engine.explain(
    "holds(issur(achiila, m1, d_oraita), mechaber)",
    world="mechaber",
    additional_facts="food(beef). food_type(beef, beheima). ..."
)
assert explanation['derivation'] is not None
assert len(explanation['tree']) > 0
```
### Step 3: Performance Testing

The skill ensures queries complete in reasonable time:

```python
import time

start = time.perf_counter()
for _ in range(100):
    engine.query("holds(issur(A, M, L), W)")
elapsed = time.perf_counter() - start

avg_query_time = elapsed / 100
assert avg_query_time < 0.1, f"Query too slow: {avg_query_time}s"
```
Performance Targets:
| Metric | Target | Warning Threshold |
|---|---|---|
| Grounding time | <100ms | >200ms |
| First query | <50ms | >100ms |
| Subsequent queries | <10ms | >50ms |
| Memory usage | <100MB | >200MB |
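The targets in the table can be applied mechanically. The classification below assumes values between the target and the warning threshold count as warnings, which is one reasonable reading of the table:

```python
# (target_ms, warning_threshold_ms) per metric, from the table above
TARGETS_MS = {
    "grounding": (100, 200),
    "first_query": (50, 100),
    "subsequent_query": (10, 50),
}

def classify(metric: str, value_ms: float) -> str:
    # ok at or under target, warning up to the threshold, failed beyond it
    target, warn = TARGETS_MS[metric]
    if value_ms <= target:
        return "ok"
    if value_ms <= warn:
        return "warning"
    return "failed"
```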
## Phase E: Validation Report

### Output 1: Validation Report

`validation-report-YD-{siman}-{seif}.md`

Uses the template from `templates/validation-report.md`:

```markdown
# Validation Report: YD {siman}:{seif}

## Summary
- **Reference:** YD {siman}:{seif}
- **Validated:** {timestamp}
- **Status:** {PASSED/FAILED}

## Compilation
- HLL Compilation: {passed/failed}
- ASP Grounding: {passed/failed}
- Satisfiability: {SAT/UNSAT}

## Semantic Validation
- Disjointness: {passed/failed}
- Madrega Consistency: {passed/failed}
- World Inheritance: {passed/failed}
- Makor Integrity: {passed/failed}

## Behavioral Tests
- Positive Tests: {N} passed
- Negative Tests: {N} passed
- Edge Cases: {N} passed
- Machloket Verification: {passed/failed}
- Regression: {N}/{N} passed

## Performance
- Average Query Time: {N}ms
- Grounding Time: {N}ms

## Issues Found
{list of issues if any}
```
### Output 2: Validation Results YAML

`validation-results-YD-{siman}-{seif}.yaml`

```yaml
reference: "YD:{siman}:{seif}"
validated_at: "{timestamp}"
status: passed  # or failed
compilation:
  hll_compilation: passed
  asp_grounding: passed
  satisfiability: SAT
semantic:
  disjointness: passed
  madrega_consistency: passed
  world_inheritance: passed
  makor_integrity: passed
behavioral:
  positive_tests:
    total: 10
    passed: 10
    failed: 0
  negative_tests:
    total: 5
    passed: 5
    failed: 0
  edge_cases:
    total: 3
    passed: 3
    failed: 0
  machloket:
    verified: true
  regression:
    total: 495
    passed: 495
    failed: 0
    skipped: 0
performance:
  avg_query_ms: 45
  grounding_ms: 120
```
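A hedged sketch of how the top-level `status` field could be derived from the rest of the results: the key names follow the YAML above, but the aggregation rule itself is an assumption, not the skill's documented behavior:

```python
def overall_status(results: dict) -> str:
    # Pass only if compilation succeeded, the program is SAT, every semantic
    # check passed, no behavioral test failed, and the machloket is verified
    comp_ok = all(v in ("passed", "SAT") for v in results["compilation"].values())
    sem_ok = all(v == "passed" for v in results["semantic"].values())
    beh = results["behavioral"]
    tests_ok = all(
        beh[k]["failed"] == 0
        for k in ("positive_tests", "negative_tests", "edge_cases", "regression")
    )
    ok = comp_ok and sem_ok and tests_ok and beh["machloket"]["verified"]
    return "passed" if ok else "failed"
```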
### Output 3: Test Scenarios

`test-scenarios-YD-{siman}-{seif}.yaml`

Full test scenarios for future regression testing.
## Checkpoint Criteria

Before requesting human approval, verify:
- [ ] Compilation successful (no syntax errors)
- [ ] Program is SAT (no contradictions)
- [ ] All semantic checks pass
- [ ] All generated tests pass
- [ ] Machloket properly verified
- [ ] No regression failures
- [ ] Performance acceptable (<100ms avg query)
## Session State Update

After completing validation:

```yaml
current_phase: validate
checkpoints:
  validate:
    status: pending_review
    artifacts:
      - validation-report-YD-{siman}-{seif}.md
      - validation-results-YD-{siman}-{seif}.yaml
      - test-scenarios-YD-{siman}-{seif}.yaml
    test_summary:
      total: 25
      passed: 25
      failed: 0
    performance:
      avg_query_ms: 45
```
## Common Issues

### UNSAT Result

**Symptom:** Clingo returns UNSATISFIABLE.

Common Causes:

- Direct contradiction (issur + heter for the same thing)
- Circular negation without a choice rule
- Conflicting world assertions
### Grounding Timeout

**Symptom:** Grounding takes >10 seconds.

Solutions:

- Add domain bounds
- Reduce Cartesian products
- Use projection directives
### Test Failure

**Symptom:** Expected atom not found.

Diagnosis:

```python
# Debug a specific query by dumping all derived atoms
result = engine.analyze(setup_facts, world="mechaber")
print("All atoms:", result['atoms'])
```

Common Causes:

- Missing prerequisite facts
- World inheritance not working
- Override incorrectly structured
## Related Documentation

- HLL Encoding Skill - Previous phase
- Review Skill - Next phase
- Troubleshooting - Detailed error resolution