Sefaria Logger Hook

The sefaria-logger hook captures Sefaria MCP tool calls and appends them to a source chain file, giving a traceable record of the source material fetched during encoding.

Overview

Attribute	Value
Hook Name	sefaria-logger
Script	`hooks/scripts/sefaria-logger.py`
Event	PreToolUse
Matcher	`mcp__sefaria`
Blocking	No (logging only)
Timeout	3000ms

Purpose

The Sefaria logger provides:

Source Traceability: Complete record of all texts fetched
Audit Trail: Timestamped log of research process
Reference Validation: Captures what references were queried
Session Documentation: Preserves research history

Configuration

In hooks/hooks.json:

{
  "PreToolUse": [
    {
      "matcher": "mcp__sefaria",
      "hooks": [
        {
          "type": "command",
          "command": "python ${CLAUDE_PLUGIN_ROOT}/hooks/scripts/sefaria-logger.py \"$TOOL_NAME\" \"$TOOL_INPUT\"",
          "timeout": 3000
        }
      ]
    }
  ]
}

Behavior

Successful Logging

When a Sefaria tool is called and a reference can be extracted from its input, the hook logs the call and returns:

{
  "continue": true,
  "message": "Logged source fetch: Shulchan Arukh, Yoreh De'ah 87:3"
}

No Reference Found

When no reference can be extracted from the tool input, nothing is logged and the message is empty:

{
  "continue": true,
  "message": ""
}

The hook always allows the operation to proceed.
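Both responses share one shape: a `continue` flag and a `message`. The following is a minimal sketch of how a caller might read that output. `parse_hook_output` is a hypothetical helper, and treating unparseable output as non-blocking is an assumption made here to mirror the hook's logging-only role, not documented behavior.

```python
import json


def parse_hook_output(stdout: str) -> tuple[bool, str]:
    """Return (should_continue, message) from the hook's JSON output.

    Hypothetical helper: unparseable output is treated as "continue",
    mirroring the hook's logging-only, never-blocking role.
    """
    try:
        data = json.loads(stdout)
    except json.JSONDecodeError:
        return True, ""
    if not isinstance(data, dict):
        return True, ""
    return bool(data.get("continue", True)), str(data.get("message", ""))


logged = parse_hook_output(
    '{"continue": true, "message": "Logged source fetch: Genesis 1:1"}'
)
skipped = parse_hook_output('{"continue": true, "message": ""}')
print(logged)
print(skipped)
```

An empty message is the only signal that nothing was written to the log, so a caller that cares about coverage would check the second element of the tuple.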

Matched Tools

The hook matches all Sefaria MCP tools:

Tool	Category	Example Reference
`mcp__sefaria_texts__get_text`	text_fetch	"Genesis 1:1"
`mcp__sefaria_texts__get_english_translations`	translations	"Berakhot 2a"
`mcp__sefaria_texts__get_links_between_texts`	links_fetch	"YD 87:3"
`mcp__sefaria_texts__clarify_name_argument`	validation	"Shulchan Aruch"
`mcp__sefaria_texts__get_text_or_category_shape`	structure	"Yoreh Deah"
`mcp__sefaria_texts__text_search`	search	"search: basar bechalav"
`mcp__sefaria_texts__english_semantic_search`	search	"search: meat milk prohibition"
`mcp__sefaria_texts__search_in_dictionaries`	dictionary	"noten taam"
`mcp__sefaria_texts__get_topic_details`	topic_lookup	"topic: meat-and-milk"
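As a rough illustration of why one matcher covers every tool in the table, the sketch below assumes the matcher string is applied as a regular-expression search against the tool name. That matching rule is an assumption, since this page does not specify it, and `is_matched` and the non-Sefaria tool name are hypothetical.

```python
import re

MATCHER = "mcp__sefaria"  # matcher value from hooks/hooks.json


def is_matched(tool_name: str, matcher: str = MATCHER) -> bool:
    """Assumed semantics: the matcher is a regex searched within the tool name."""
    return re.search(matcher, tool_name) is not None


tool_names = [
    "mcp__sefaria_texts__get_text",
    "mcp__sefaria_texts__text_search",
    "mcp__sefaria_texts__get_topic_details",
    "mcp__other_server__get_item",  # hypothetical non-Sefaria tool, for contrast
]
for name in tool_names:
    print(name, is_matched(name))
```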

Log File Format

Location

.mistaber-artifacts/source-chain-log.yaml

Structure

source_chain:
  - tool: get_text
    reference: "Shulchan Arukh, Yoreh De'ah 87:3"
    category: text_fetch
    timestamp: "2026-01-25T10:05:00.123456"

  - tool: get_links_between_texts
    reference: "Shulchan Arukh, Yoreh De'ah 87:3"
    category: links_fetch
    timestamp: "2026-01-25T10:05:30.456789"

  - tool: get_text
    reference: "Shakh on Shulchan Arukh, Yoreh De'ah 87:3"
    category: text_fetch
    timestamp: "2026-01-25T10:06:00.789012"

  - tool: text_search
    reference: "search: dag bechalav"
    category: search
    timestamp: "2026-01-25T10:07:00.123456"

  - tool: get_topic_details
    reference: "topic: meat-and-milk"
    category: topic_lookup
    timestamp: "2026-01-25T10:08:00.456789"

last_updated: "2026-01-25T10:08:00.456789"
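Because each entry carries an ISO 8601 timestamp, the log can be used to reconstruct how long a research session took. The sketch below assumes the file has already been parsed into a dictionary (for example with `yaml.safe_load`); `session_span` is a hypothetical helper, not part of the hook.

```python
from datetime import datetime, timedelta


def session_span(log: dict) -> timedelta:
    """Time between the earliest and latest entry in a parsed source chain log."""
    entries = log.get("source_chain") or []
    if not entries:
        return timedelta(0)
    stamps = [datetime.fromisoformat(e["timestamp"]) for e in entries]
    return max(stamps) - min(stamps)


# Two entries taken from the structure shown above
log = {
    "source_chain": [
        {
            "tool": "get_text",
            "reference": "Shulchan Arukh, Yoreh De'ah 87:3",
            "category": "text_fetch",
            "timestamp": "2026-01-25T10:05:00.123456",
        },
        {
            "tool": "get_topic_details",
            "reference": "topic: meat-and-milk",
            "category": "topic_lookup",
            "timestamp": "2026-01-25T10:08:00.456789",
        },
    ],
    "last_updated": "2026-01-25T10:08:00.456789",
}
print(session_span(log))
```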

Reference Extraction

From Different Tools

import json


def extract_reference(tool_name: str, tool_input: str) -> str | None:
    """Extract a loggable reference from the tool's JSON input."""
    try:
        data = json.loads(tool_input)

        # Valid JSON that is not an object (e.g. a list) has no named parameters
        if not isinstance(data, dict):
            return None

        # Different tools use different parameter names
        ref = data.get("reference")
        if ref:
            return ref

        ref = data.get("name")
        if ref:
            return ref

        ref = data.get("book_name")
        if ref:
            return ref

        # Search queries and topic slugs are prefixed so they stand out in the log
        ref = data.get("query")
        if ref:
            return f"search: {ref}"

        ref = data.get("topic_slug")
        if ref:
            return f"topic: {ref}"

        return None
    except json.JSONDecodeError:
        return None
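The lookup order above can also be expressed as a small table, which makes the mapping from parameter name to logged reference easy to scan. This is an illustrative sketch rather than the script's code: `FIELDS` and `extract_reference_from_table` are hypothetical names, and the tool name argument is dropped because the lookup does not depend on it.

```python
import json
from typing import Optional

# (parameter name, prefix added to the logged reference), in lookup order
FIELDS = [
    ("reference", ""),
    ("name", ""),
    ("book_name", ""),
    ("query", "search: "),
    ("topic_slug", "topic: "),
]


def extract_reference_from_table(tool_input: str) -> Optional[str]:
    """Return the first non-empty known parameter, with its prefix applied."""
    try:
        data = json.loads(tool_input)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    for field, prefix in FIELDS:
        value = data.get(field)
        if value:
            return f"{prefix}{value}"
    return None


for raw in (
    '{"reference": "Genesis 1:1"}',
    '{"query": "basar bechalav"}',
    '{"topic_slug": "meat-and-milk"}',
    '{}',
):
    print(raw, "->", extract_reference_from_table(raw))
```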

Tool Categories

def get_tool_category(tool_name: str) -> str:
    """Categorize the Sefaria tool by keywords in its name."""
    # Order matters: "get_text_or_category_shape" contains "get_text" and
    # "search_in_dictionaries" contains "search", so the more specific
    # keywords are checked first.
    if "shape" in tool_name:
        return "structure"
    elif "dictionar" in tool_name:
        return "dictionary"
    elif "get_text" in tool_name:
        return "text_fetch"
    elif "links" in tool_name:
        return "links_fetch"
    elif "search" in tool_name:
        return "search"
    elif "topic" in tool_name:
        return "topic_lookup"
    elif "clarify" in tool_name:
        return "validation"
    elif "translations" in tool_name:
        return "translations"
    else:
        return "other"
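Several tool names contain more than one keyword (for example, `get_text_or_category_shape` contains both `get_text` and `shape`), so the result of keyword matching depends on the order of the checks. The sketch below is a table-driven variant with a self-check against the categories listed under Matched Tools. `RULES`, `categorize`, and `EXPECTED` are hypothetical names introduced for illustration.

```python
# (keyword, category) pairs, checked in order; more specific keywords come first
RULES = [
    ("shape", "structure"),
    ("dictionar", "dictionary"),
    ("translations", "translations"),
    ("links", "links_fetch"),
    ("topic", "topic_lookup"),
    ("clarify", "validation"),
    ("search", "search"),
    ("get_text", "text_fetch"),
]


def categorize(tool_name: str) -> str:
    """Return the category of the first keyword found in the tool name."""
    for keyword, category in RULES:
        if keyword in tool_name:
            return category
    return "other"


# Expected categories, copied from the Matched Tools table
EXPECTED = {
    "mcp__sefaria_texts__get_text": "text_fetch",
    "mcp__sefaria_texts__get_english_translations": "translations",
    "mcp__sefaria_texts__get_links_between_texts": "links_fetch",
    "mcp__sefaria_texts__clarify_name_argument": "validation",
    "mcp__sefaria_texts__get_text_or_category_shape": "structure",
    "mcp__sefaria_texts__text_search": "search",
    "mcp__sefaria_texts__english_semantic_search": "search",
    "mcp__sefaria_texts__search_in_dictionaries": "dictionary",
    "mcp__sefaria_texts__get_topic_details": "topic_lookup",
}

# Any tool whose computed category disagrees with the table ends up here
mismatches = {
    name: categorize(name)
    for name, expected in EXPECTED.items()
    if categorize(name) != expected
}
print(mismatches)
```

Keeping the expected categories next to the rules means a newly added tool that lands in the wrong category shows up as a non-empty `mismatches` dictionary.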

Log Entry Structure

Each log entry contains:

Field	Type	Description
`tool`	string	Tool name with the `mcp__sefaria_texts__` prefix removed
`reference`	string	Reference or query extracted
`category`	string	Tool category
`timestamp`	string	ISO 8601 timestamp in local time, as produced by `datetime.now().isoformat()`
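The four fields above can be checked mechanically when reading a log back. The following is a minimal sketch under the assumption that every field is a non-empty string and that the timestamp is in a form `datetime.fromisoformat` accepts. `entry_problems` is a hypothetical helper, not part of the hook.

```python
from datetime import datetime

REQUIRED_FIELDS = ("tool", "reference", "category", "timestamp")


def entry_problems(entry: dict) -> list[str]:
    """Return a list of human-readable problems found in one log entry."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = entry.get(field)
        if not isinstance(value, str) or not value:
            problems.append(f"missing or non-string field: {field}")

    stamp = entry.get("timestamp")
    if isinstance(stamp, str) and stamp:
        try:
            # Accepts the format written by datetime.now().isoformat()
            datetime.fromisoformat(stamp)
        except ValueError:
            problems.append(f"timestamp is not ISO 8601: {stamp}")
    return problems


good = {
    "tool": "get_text",
    "reference": "Genesis 1:1",
    "category": "text_fetch",
    "timestamp": "2026-01-25T10:05:00.123456",
}
bad = {"tool": "get_text", "reference": "Genesis 1:1", "timestamp": "yesterday"}
print(entry_problems(good))
print(entry_problems(bad))
```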

Implementation Details

Log Loading

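from pathlib import Path

import yaml  # PyYAML


def load_source_log() -> list:
    """Load the existing source chain log, or an empty list if none is usable."""
    log_path = Path(".mistaber-artifacts/source-chain-log.yaml")

    if not log_path.exists():
        return []

    try:
        with open(log_path, "r", encoding="utf-8") as f:
            data = yaml.safe_load(f)
    except Exception:
        # Unreadable or malformed log: start from an empty chain
        return []

    # An empty file parses to None; anything but a mapping is treated as empty
    if not isinstance(data, dict):
        return []
    return data.get("source_chain") or []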

Log Saving

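from datetime import datetime
from pathlib import Path

import yaml  # PyYAML


def save_source_log(log: list) -> None:
    """Save the source chain log, creating the artifacts directory if needed."""
    log_dir = Path(".mistaber-artifacts")
    log_dir.mkdir(exist_ok=True)

    log_path = log_dir / "source-chain-log.yaml"

    data = {
        "source_chain": log,
        "last_updated": datetime.now().isoformat(),
    }

    # UTF-8 so that allow_unicode=True output can be written on any platform;
    # sort_keys=False keeps the field order shown under Structure
    with open(log_path, "w", encoding="utf-8") as f:
        yaml.dump(data, f, default_flow_style=False, allow_unicode=True, sort_keys=False)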
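Since the whole file is rewritten on every call, an interruption in the middle of a write (for instance, the hook hitting its timeout) could leave a truncated log. One common mitigation, shown here as a sketch and not as something the script does, is to write to a temporary file in the same directory and then rename it over the target. `atomic_write_text` is a hypothetical helper.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(path: Path, text: str) -> None:
    """Write text to a temp file beside `path`, then rename it into place."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(
        dir=path.parent, prefix=path.name, suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(text)
        # os.replace overwrites the target in a single rename operation
        os.replace(tmp_name, path)
    except BaseException:
        # Do not leave a stray temp file behind if anything went wrong
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


# Demonstration in a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / ".mistaber-artifacts" / "source-chain-log.yaml"
    atomic_write_text(target, "source_chain: []\n")
    written = target.read_text(encoding="utf-8")
    leftovers = [p.name for p in target.parent.iterdir()]
print(repr(written), leftovers)
```

With this helper, the YAML text would be produced first (for example with `yaml.dump(data, ...)` returning a string) and then passed to `atomic_write_text`.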

Main Execution

import json
import sys
from datetime import datetime


def main():
    """Main hook execution."""
    tool_name = sys.argv[1] if len(sys.argv) > 1 else "unknown"
    tool_input = sys.argv[2] if len(sys.argv) > 2 else "{}"

    output = {
        "continue": True,  # Never block - logging only
        "message": ""
    }

    reference = extract_reference(tool_name, tool_input)

    if reference:
        try:
            log = load_source_log()

            entry = {
                "tool": tool_name.replace("mcp__sefaria_texts__", ""),
                "reference": reference,
                "category": get_tool_category(tool_name),
                "timestamp": datetime.now().isoformat(),
            }

            log.append(entry)
            save_source_log(log)

            output["message"] = f"Logged source fetch: {reference}"
        except Exception as exc:
            # A failed write must not break the tool call; report it instead
            output["message"] = f"Source log not updated: {exc}"

    print(json.dumps(output))
    return 0


if __name__ == "__main__":
    sys.exit(main())
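The entry-building step in `main` can be isolated into a pure function, which makes it testable without touching the filesystem or the clock. This is a sketch only: `build_entry` and its `now` parameter are hypothetical and do not appear in the script.

```python
from datetime import datetime
from typing import Optional

TOOL_PREFIX = "mcp__sefaria_texts__"


def build_entry(
    tool_name: str,
    reference: str,
    category: str,
    now: Optional[datetime] = None,
) -> dict:
    """Assemble one log entry; `now` can be injected to get a fixed timestamp."""
    moment = now if now is not None else datetime.now()
    return {
        "tool": tool_name.replace(TOOL_PREFIX, ""),
        "reference": reference,
        "category": category,
        "timestamp": moment.isoformat(),
    }


entry = build_entry(
    "mcp__sefaria_texts__get_text",
    "Shulchan Arukh, Yoreh De'ah 87:3",
    "text_fetch",
    now=datetime(2026, 1, 25, 10, 5, 0, 123456),
)
print(entry)
```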

Use Cases

Corpus Preparation

During corpus-prep, the log captures all sources fetched:

source_chain:
  # Primary text fetch
  - tool: get_text
    reference: "Shulchan Arukh, Yoreh De'ah 87:3"
    category: text_fetch
    timestamp: "2026-01-25T10:00:00.123456"

  # Get translations
  - tool: get_english_translations
    reference: "Shulchan Arukh, Yoreh De'ah 87:3"
    category: translations
    timestamp: "2026-01-25T10:00:30.123456"

  # Get commentaries via links
  - tool: get_links_between_texts
    reference: "Shulchan Arukh, Yoreh De'ah 87:3"
    category: links_fetch
    timestamp: "2026-01-25T10:01:00.123456"

  # Fetch Shach
  - tool: get_text
    reference: "Shakh on Shulchan Arukh, Yoreh De'ah 87:3"
    category: text_fetch
    timestamp: "2026-01-25T10:02:00.123456"

  # Fetch Taz
  - tool: get_text
    reference: "Turei Zahav on Shulchan Arukh, Yoreh De'ah 87:3"
    category: text_fetch
    timestamp: "2026-01-25T10:03:00.123456"
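To see at a glance which lookups were made for each source during corpus preparation, the entries can be grouped by reference. The sketch below works on an already-parsed `source_chain` list; `tools_by_reference` is a hypothetical helper, and the sample entries are a subset of the log above with timestamps omitted for brevity.

```python
from collections import defaultdict


def tools_by_reference(entries: list) -> dict:
    """Map each reference to the tools called for it, in call order."""
    grouped = defaultdict(list)
    for entry in entries:
        grouped[entry["reference"]].append(entry["tool"])
    return dict(grouped)


entries = [
    {"tool": "get_text", "reference": "Shulchan Arukh, Yoreh De'ah 87:3",
     "category": "text_fetch"},
    {"tool": "get_english_translations",
     "reference": "Shulchan Arukh, Yoreh De'ah 87:3",
     "category": "translations"},
    {"tool": "get_links_between_texts",
     "reference": "Shulchan Arukh, Yoreh De'ah 87:3",
     "category": "links_fetch"},
    {"tool": "get_text",
     "reference": "Shakh on Shulchan Arukh, Yoreh De'ah 87:3",
     "category": "text_fetch"},
]
for reference, tools in tools_by_reference(entries).items():
    print(reference, tools)
```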

Derivation Chain Building

When tracing sources:

source_chain:
  # Trace to Tur
  - tool: get_links_between_texts
    reference: "Tur, Yoreh Deah 87"
    category: links_fetch

  # Trace to Rambam
  - tool: get_links_between_texts
    reference: "Mishneh Torah, Forbidden Foods 9"
    category: links_fetch

  # Trace to Gemara
  - tool: get_text
    reference: "Chullin 104b"
    category: text_fetch

Research and Validation

When validating references:

source_chain:
  - tool: clarify_name_argument
    reference: "Shulchan Aruch Yoreh Deah"
    category: validation

  - tool: get_text_or_category_shape
    reference: "Yoreh Deah 87"
    category: structure

Log Analysis

Count by Category

# Using Python (requires PyYAML)
cat .mistaber-artifacts/source-chain-log.yaml | \
  python3 -c "import sys, yaml; from collections import Counter; d = yaml.safe_load(sys.stdin); print(dict(Counter(e['category'] for e in d['source_chain'])))"

List Unique References

cat .mistaber-artifacts/source-chain-log.yaml | \
  python3 -c "import sys, yaml; d = yaml.safe_load(sys.stdin); print(*sorted({e['reference'] for e in d['source_chain']}), sep='\n')"

Debugging

Manual Testing

# Test with get_text
python mistaber-skills/hooks/scripts/sefaria-logger.py \
  "mcp__sefaria_texts__get_text" \
  '{"reference": "Genesis 1:1"}' | jq .

# Test with search
python mistaber-skills/hooks/scripts/sefaria-logger.py \
  "mcp__sefaria_texts__text_search" \
  '{"query": "basar bechalav"}' | jq .

# Check log file
cat .mistaber-artifacts/source-chain-log.yaml

Debug Mode

export MISTABER_DEBUG=1
python mistaber-skills/hooks/scripts/sefaria-logger.py "mcp__sefaria_texts__get_text" '{"reference": "test"}'

Common Issues

Log File Not Created

Symptom: No source-chain-log.yaml file.

Causes:

Artifacts directory doesn't exist
Permission issues
PyYAML not installed

Solutions:

mkdir -p .mistaber-artifacts
pip install pyyaml

Reference Not Logged

Symptom: A Sefaria call was made but does not appear in the log.

Causes:

Tool name doesn't match pattern
Reference not in expected field
Hook timeout

Solutions: Check that the tool name matches mcp__sefaria:

echo "Tool: mcp__sefaria_texts__get_text" | grep "mcp__sefaria"

Log Corrupted

Symptom: YAML parse error on load.