Sefaria Logger Hook¶
The sefaria-logger hook intercepts Sefaria MCP tool calls and appends each one from which a reference can be extracted to a source chain log file, so that the source material requested during encoding can be traced afterwards.
Overview¶
| Attribute | Value |
|---|---|
| Hook Name | sefaria-logger |
| Script | hooks/scripts/sefaria-logger.py |
| Event | PreToolUse |
| Matcher | mcp__sefaria |
| Blocking | No (logging only) |
| Timeout | 3000ms |
Purpose¶
The Sefaria logger provides:
- Source Traceability: Complete record of all texts fetched
- Audit Trail: Timestamped log of research process
- Reference Validation: Captures what references were queried
- Session Documentation: Preserves research history
Configuration¶
In hooks/hooks.json:
{
  "PreToolUse": [
    {
      "matcher": "mcp__sefaria",
      "hooks": [
        {
          "type": "command",
          "command": "python ${CLAUDE_PLUGIN_ROOT}/hooks/scripts/sefaria-logger.py \"$TOOL_NAME\" \"$TOOL_INPUT\"",
          "timeout": 3000
        }
      ]
    }
  ]
}
Behavior¶
Successful Logging¶
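When a Sefaria tool is called and a reference can be extracted from its input, the hook appends an entry to the log and prints a JSON result with continue set to true and a message of the form "Logged source fetch: <reference>".
No Reference Found¶
When a reference cannot be extracted, nothing is written to the log and the hook prints the same JSON result with an empty message.
In both cases the hook allows the operation to proceed.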
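Both outcomes can be reproduced with a small sketch that mirrors the result object built in the Main Execution listing on this page. The helper name hook_result is hypothetical and exists only for illustration; it is not part of the script.

```python
import json


def hook_result(reference):
    """Build the JSON string the hook prints (hypothetical helper)."""
    # The message is only filled in when a reference was extracted.
    message = f"Logged source fetch: {reference}" if reference else ""
    # "continue" is always True: the hook logs but never blocks.
    return json.dumps({"continue": True, "message": message})


print(hook_result("Genesis 1:1"))  # reference found
print(hook_result(None))           # no reference found
```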
Matched Tools¶
The hook matches all Sefaria MCP tools:
| Tool | Category | Example Reference |
|---|---|---|
| mcp__sefaria_texts__get_text | text_fetch | "Genesis 1:1" |
| mcp__sefaria_texts__get_english_translations | translations | "Berakhot 2a" |
| mcp__sefaria_texts__get_links_between_texts | links_fetch | "YD 87:3" |
| mcp__sefaria_texts__clarify_name_argument | validation | "Shulchan Aruch" |
| mcp__sefaria_texts__get_text_or_category_shape | structure | "Yoreh Deah" |
| mcp__sefaria_texts__text_search | search | "search: basar bechalav" |
| mcp__sefaria_texts__english_semantic_search | search | "search: meat milk prohibition" |
| mcp__sefaria_texts__search_in_dictionaries | dictionary | "noten taam" |
| mcp__sefaria_texts__get_topic_details | topic_lookup | "topic: meat-and-milk" |
Log File Format¶
Location¶
.mistaber-artifacts/source-chain-log.yaml
Structure¶
source_chain:
  - tool: get_text
    reference: "Shulchan Arukh, Yoreh De'ah 87:3"
    category: text_fetch
    timestamp: "2026-01-25T10:05:00.123456"
  - tool: get_links_between_texts
    reference: "Shulchan Arukh, Yoreh De'ah 87:3"
    category: links_fetch
    timestamp: "2026-01-25T10:05:30.456789"
  - tool: get_text
    reference: "Shakh on Shulchan Arukh, Yoreh De'ah 87:3"
    category: text_fetch
    timestamp: "2026-01-25T10:06:00.789012"
  - tool: text_search
    reference: "search: dag bechalav"
    category: search
    timestamp: "2026-01-25T10:07:00.123456"
  - tool: get_topic_details
    reference: "topic: meat-and-milk"
    category: topic_lookup
    timestamp: "2026-01-25T10:08:00.456789"
last_updated: "2026-01-25T10:08:00.456789"
Reference Extraction¶
From Different Tools¶
import json


def extract_reference(tool_name: str, tool_input: str) -> str | None:
    """Extract a loggable reference from the tool's JSON input."""
    try:
        data = json.loads(tool_input)
    except (json.JSONDecodeError, TypeError):
        return None
    if not isinstance(data, dict):
        # A bare list, string or number has no named fields to inspect
        return None

    # Different tools use different parameter names
    ref = data.get("reference")
    if ref:
        return ref
    ref = data.get("name")
    if ref:
        return ref
    ref = data.get("book_name")
    if ref:
        return ref
    ref = data.get("query")
    if ref:
        return f"search: {ref}"
    ref = data.get("topic_slug")
    if ref:
        return f"topic: {ref}"
    return None
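The lookup order above can also be written as a table of field names and prefixes, which makes the precedence explicit. This is a minimal sketch, not the script's code: the FIELD_PREFIXES name is hypothetical, the unused tool_name parameter is dropped, and the result is always formatted as a string.

```python
import json
from typing import Optional

# Hypothetical table: (input field, prefix added to the logged reference),
# listed in the same order the fields are checked in the listing above.
FIELD_PREFIXES = [
    ("reference", ""),
    ("name", ""),
    ("book_name", ""),
    ("query", "search: "),
    ("topic_slug", "topic: "),
]


def extract_reference(tool_input: str) -> Optional[str]:
    """Return the first non-empty known field, or None if nothing is usable."""
    try:
        data = json.loads(tool_input)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    for field, prefix in FIELD_PREFIXES:
        value = data.get(field)
        if value:
            return f"{prefix}{value}"
    return None


print(extract_reference('{"reference": "Genesis 1:1"}'))
print(extract_reference('{"query": "basar bechalav"}'))
print(extract_reference('{"topic_slug": "meat-and-milk"}'))
print(extract_reference("not json"))
```

When several fields are present, the earliest entry in the table wins, so an input carrying both reference and query is logged under its reference.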
Tool Categories¶
def get_tool_category(tool_name: str) -> str:
    """Categorize the Sefaria tool."""
    # Check the more specific substrings first: "get_text" also occurs in
    # get_text_or_category_shape, and "search" in search_in_dictionaries.
    if "shape" in tool_name:
        return "structure"
    elif "dictionar" in tool_name:
        return "dictionary"
    elif "translations" in tool_name:
        return "translations"
    elif "links" in tool_name:
        return "links_fetch"
    elif "clarify" in tool_name:
        return "validation"
    elif "topic" in tool_name:
        return "topic_lookup"
    elif "search" in tool_name:
        return "search"
    elif "get_text" in tool_name:
        return "text_fetch"
    else:
        return "other"
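Because the categorizer relies on substring checks, the order of the checks matters: get_text is a substring of get_text_or_category_shape, and search is a substring of search_in_dictionaries. The sketch below expresses the checks as an ordered rule list with the more specific substrings first and verifies it against the Matched Tools table. The names RULES, categorize and EXPECTED are hypothetical and used only for illustration.

```python
# Hypothetical ordered rules: more specific substrings come first so that
# "shape" wins over "get_text" and "dictionar" wins over "search".
RULES = [
    ("shape", "structure"),
    ("dictionar", "dictionary"),
    ("translations", "translations"),
    ("links", "links_fetch"),
    ("clarify", "validation"),
    ("topic", "topic_lookup"),
    ("search", "search"),
    ("get_text", "text_fetch"),
]


def categorize(tool_name: str) -> str:
    """Return the category of the first rule whose substring occurs in the name."""
    for needle, category in RULES:
        if needle in tool_name:
            return category
    return "other"


# Expected categories, copied from the Matched Tools table.
PREFIX = "mcp__sefaria_texts__"
EXPECTED = {
    PREFIX + "get_text": "text_fetch",
    PREFIX + "get_english_translations": "translations",
    PREFIX + "get_links_between_texts": "links_fetch",
    PREFIX + "clarify_name_argument": "validation",
    PREFIX + "get_text_or_category_shape": "structure",
    PREFIX + "text_search": "search",
    PREFIX + "english_semantic_search": "search",
    PREFIX + "search_in_dictionaries": "dictionary",
    PREFIX + "get_topic_details": "topic_lookup",
}

mismatches = {t: categorize(t) for t, c in EXPECTED.items() if categorize(t) != c}
print("mismatches:", mismatches)
```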
Log Entry Structure¶
Each log entry contains:
| Field | Type | Description |
|---|---|---|
| tool | string | Tool name (without mcp prefix) |
| reference | string | Reference or query extracted |
| category | string | Tool category |
| timestamp | string | ISO 8601 timestamp |
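The field table can be turned into a simple check for entries that have already been loaded from the log. This is a minimal sketch under the assumption that all four fields are required, non-empty strings; the names REQUIRED_FIELDS and entry_problems are hypothetical and not part of the hook.

```python
from datetime import datetime

# Field names taken from the table above; treating all of them as required
# is an assumption made for this sketch.
REQUIRED_FIELDS = ("tool", "reference", "category", "timestamp")


def entry_problems(entry: dict) -> list:
    """Return a list of human-readable problems found in one log entry."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = entry.get(field)
        if not isinstance(value, str) or not value:
            problems.append(f"missing or non-string field: {field}")
    timestamp = entry.get("timestamp")
    if isinstance(timestamp, str) and timestamp:
        try:
            # Accepts the format produced by datetime.now().isoformat()
            datetime.fromisoformat(timestamp)
        except ValueError:
            problems.append(f"timestamp is not ISO 8601: {timestamp}")
    return problems


good = {
    "tool": "get_text",
    "reference": "Shulchan Arukh, Yoreh De'ah 87:3",
    "category": "text_fetch",
    "timestamp": "2026-01-25T10:05:00.123456",
}
bad = {"tool": "get_text", "timestamp": "yesterday"}

print(entry_problems(good))
print(entry_problems(bad))
```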
Implementation Details¶
Log Loading¶
from pathlib import Path

import yaml


def load_source_log() -> list:
    """Load existing source chain log."""
    # The path is relative to the hook's working directory
    log_path = Path(".mistaber-artifacts/source-chain-log.yaml")
    if not log_path.exists():
        return []
    try:
        with open(log_path, "r") as f:
            data = yaml.safe_load(f)
        return data.get("source_chain", []) if data else []
    except Exception:
        # Any read or parse failure is treated as an empty log
        return []
Log Saving¶
from datetime import datetime
from pathlib import Path

import yaml


def save_source_log(log: list) -> None:
    """Save source chain log."""
    log_dir = Path(".mistaber-artifacts")
    log_dir.mkdir(exist_ok=True)  # Create the artifacts directory on first use
    log_path = log_dir / "source-chain-log.yaml"
    data = {
        "source_chain": log,
        "last_updated": datetime.now().isoformat()
    }
    # The whole file is rewritten on every call
    with open(log_path, "w") as f:
        yaml.dump(data, f, default_flow_style=False, allow_unicode=True)
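Since the log is rewritten in full on each call, an interrupted write could leave a truncated file behind. One possible hardening, which this page does not say the script uses, is to write the serialized text to a temporary file in the same directory and then rename it over the log. The sketch below shows that pattern with a hypothetical atomic_write_text helper; it takes already-serialized text so that it needs only the standard library.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(path: Path, text: str) -> None:
    """Write text to path via a temporary file and an atomic rename."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temporary file lives in the target directory so that the final
    # os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(text)
        os.replace(tmp_name, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong, then re-raise.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


# Demonstration in a throwaway directory rather than .mistaber-artifacts
with tempfile.TemporaryDirectory() as demo_dir:
    demo_path = Path(demo_dir) / "artifacts" / "source-chain-log.yaml"
    atomic_write_text(demo_path, "source_chain: []\n")
    demo_content = demo_path.read_text(encoding="utf-8")
    demo_files = sorted(p.name for p in demo_path.parent.iterdir())

print(demo_content, end="")
print(demo_files)
```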
Main Execution¶
import json
import sys
from datetime import datetime


def main():
    """Main hook execution."""
    tool_name = sys.argv[1] if len(sys.argv) > 1 else "unknown"
    tool_input = sys.argv[2] if len(sys.argv) > 2 else "{}"

    output = {
        "continue": True,  # Never block - logging only
        "message": ""
    }

    reference = extract_reference(tool_name, tool_input)
    if reference:
        log = load_source_log()
        entry = {
            "tool": tool_name.replace("mcp__sefaria_texts__", ""),
            "reference": reference,
            "category": get_tool_category(tool_name),
            "timestamp": datetime.now().isoformat(),
        }
        log.append(entry)
        save_source_log(log)
        output["message"] = f"Logged source fetch: {reference}"

    print(json.dumps(output))
    return 0


if __name__ == "__main__":
    sys.exit(main())
Use Cases¶
Corpus Preparation¶
During corpus-prep, the log captures all sources fetched:
source_chain:
  # Primary text fetch
  - tool: get_text
    reference: "Shulchan Arukh, Yoreh De'ah 87:3"
    category: text_fetch
    timestamp: "2026-01-25T10:00:00Z"
  # Get translations
  - tool: get_english_translations
    reference: "Shulchan Arukh, Yoreh De'ah 87:3"
    category: translations
    timestamp: "2026-01-25T10:00:30Z"
  # Get commentaries via links
  - tool: get_links_between_texts
    reference: "Shulchan Arukh, Yoreh De'ah 87:3"
    category: links_fetch
    timestamp: "2026-01-25T10:01:00Z"
  # Fetch Shach
  - tool: get_text
    reference: "Shakh on Shulchan Arukh, Yoreh De'ah 87:3"
    category: text_fetch
    timestamp: "2026-01-25T10:02:00Z"
  # Fetch Taz
  - tool: get_text
    reference: "Turei Zahav on Shulchan Arukh, Yoreh De'ah 87:3"
    category: text_fetch
    timestamp: "2026-01-25T10:03:00Z"
Derivation Chain Building¶
When tracing sources:
source_chain:
  # Trace to Tur
  - tool: get_links_between_texts
    reference: "Tur, Yoreh Deah 87"
    category: links_fetch
  # Trace to Rambam
  - tool: get_links_between_texts
    reference: "Mishneh Torah, Forbidden Foods 9"
    category: links_fetch
  # Trace to Gemara
  - tool: get_text
    reference: "Chullin 104b"
    category: text_fetch
Research and Validation¶
When validating references:
source_chain:
  - tool: clarify_name_argument
    reference: "Shulchan Aruch Yoreh Deah"
    category: validation
  - tool: get_text_or_category_shape
    reference: "Yoreh Deah 87"
    category: structure
Log Analysis¶
Count by Category¶
# Using Python (requires PyYAML)
python3 -c "import sys, yaml; from collections import Counter; d = yaml.safe_load(sys.stdin); print(dict(Counter(e['category'] for e in d['source_chain'])))" \
  < .mistaber-artifacts/source-chain-log.yaml
List Unique References¶
cat .mistaber-artifacts/source-chain-log.yaml | \
python3 -c "import yaml, sys; d=yaml.safe_load(sys.stdin); [print(r) for r in sorted(set(e['reference'] for e in d['source_chain']))]"
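For anything beyond a one-liner, the same two summaries can be computed in a small function. The sketch below works on an already-loaded source_chain list (for example the result of yaml.safe_load on the log file) so that it depends only on the standard library; the summarize name and the sample entries are illustrative.

```python
from collections import Counter


def summarize(source_chain):
    """Return (counts per category, sorted unique references)."""
    counts = Counter(entry["category"] for entry in source_chain)
    references = sorted({entry["reference"] for entry in source_chain})
    return dict(counts), references


# Sample entries shaped like the Structure example on this page
sample = [
    {"tool": "get_text", "reference": "Shulchan Arukh, Yoreh De'ah 87:3", "category": "text_fetch"},
    {"tool": "get_links_between_texts", "reference": "Shulchan Arukh, Yoreh De'ah 87:3", "category": "links_fetch"},
    {"tool": "get_text", "reference": "Shakh on Shulchan Arukh, Yoreh De'ah 87:3", "category": "text_fetch"},
    {"tool": "text_search", "reference": "search: dag bechalav", "category": "search"},
]

counts, references = summarize(sample)
print(counts)
for reference in references:
    print(reference)
```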
Debugging¶
Manual Testing¶
# Test with get_text
python mistaber-skills/hooks/scripts/sefaria-logger.py \
"mcp__sefaria_texts__get_text" \
'{"reference": "Genesis 1:1"}' | jq .
# Test with search
python mistaber-skills/hooks/scripts/sefaria-logger.py \
"mcp__sefaria_texts__text_search" \
'{"query": "basar bechalav"}' | jq .
# Check log file
cat .mistaber-artifacts/source-chain-log.yaml
Debug Mode¶
export MISTABER_DEBUG=1
python mistaber-skills/hooks/scripts/sefaria-logger.py "mcp__sefaria_texts__get_text" '{"reference": "test"}'
Common Issues¶
Log File Not Created¶
Symptom: No source-chain-log.yaml file.
Causes:
- Artifacts directory doesn't exist
- Permission issues
- PyYAML not installed
Solutions:
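Each of the causes listed can be checked from Python before re-running the hook. The following is a rough diagnostic sketch, not part of the plugin; the diagnose function is hypothetical. Note that the script resolves .mistaber-artifacts relative to its working directory, so the check should be run from the directory in which the hook runs.

```python
import importlib.util
import os
import tempfile
from pathlib import Path


def diagnose(artifacts_dir=".mistaber-artifacts"):
    """Return a list of findings that could explain a missing log file."""
    findings = []
    path = Path(artifacts_dir)
    # PyYAML is imported as "yaml"; find_spec returns None if it is absent.
    if importlib.util.find_spec("yaml") is None:
        findings.append("PyYAML is not importable (pip install pyyaml)")
    if not path.exists():
        findings.append(f"{path} does not exist yet")
        parent = path.resolve().parent
        if not os.access(parent, os.W_OK):
            findings.append(f"{parent} is not writable, so the directory cannot be created")
    elif not path.is_dir():
        findings.append(f"{path} exists but is not a directory")
    elif not os.access(path, os.W_OK):
        findings.append(f"{path} is not writable")
    return findings


# Demonstration against a throwaway location
with tempfile.TemporaryDirectory() as demo_dir:
    missing_findings = diagnose(os.path.join(demo_dir, "missing"))
    existing_findings = diagnose(demo_dir)

print(missing_findings)
print(existing_findings)
```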
Reference Not Logged¶
Symptom: Sefaria call made but not in log.
Causes:
- Tool name doesn't match pattern
- Reference not in expected field
- Hook timeout
Solutions:
Check tool name matches mcp__sefaria:
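A quick way to check the first two causes is to replay the tool name and input through the same tests the hook applies. The sketch below is hypothetical: explain_miss is not part of the script, and treating the matcher as a prefix test on the tool name is an assumption about how the hook runner applies it. The field list comes from the Reference Extraction section.

```python
import json

# Input fields the hook looks at, in lookup order
KNOWN_FIELDS = ("reference", "name", "book_name", "query", "topic_slug")


def explain_miss(tool_name: str, tool_input: str) -> str:
    """Describe why a call would or would not produce a log entry."""
    # Assumption: the "mcp__sefaria" matcher behaves like a prefix test.
    if not tool_name.startswith("mcp__sefaria"):
        return "tool name does not start with mcp__sefaria"
    try:
        data = json.loads(tool_input)
    except json.JSONDecodeError:
        return "tool input is not valid JSON"
    if not isinstance(data, dict):
        return "tool input is not a JSON object"
    present = [field for field in KNOWN_FIELDS if data.get(field)]
    if not present:
        return f"no known reference field; input keys are {sorted(data)}"
    return f"reference would be taken from '{present[0]}'"


print(explain_miss("mcp__sefaria_texts__get_text", '{"reference": "Genesis 1:1"}'))
print(explain_miss("mcp__sefaria_texts__get_text", '{"ref": "Genesis 1:1"}'))
print(explain_miss("mcp__other__get_text", '{"reference": "Genesis 1:1"}'))
```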
Log Corrupted¶
Symptom: YAML parse error on load.
Solutions:
# Validate YAML
python -c "import yaml; yaml.safe_load(open('.mistaber-artifacts/source-chain-log.yaml'))"
# Reset if needed
rm .mistaber-artifacts/source-chain-log.yaml
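Deleting the file discards the fetch history. In addition, load_source_log as listed on this page returns an empty list on any load error, so the next logged call would overwrite an unreadable file anyway. A gentler alternative is to move the file aside before resetting. This is a minimal sketch; the quarantine_log helper and the .corrupt-<timestamp> suffix are hypothetical conventions, not something the hook does.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def quarantine_log(log_path):
    """Rename an unreadable log so a fresh one can be created; return the new path."""
    path = Path(log_path)
    if not path.exists():
        return None
    stamp = datetime.now().strftime("%Y%m%dT%H%M%S")
    backup = path.with_name(f"{path.name}.corrupt-{stamp}")
    shutil.move(str(path), str(backup))
    return backup


# Demonstration in a throwaway directory
with tempfile.TemporaryDirectory() as demo_dir:
    demo_log = Path(demo_dir) / "source-chain-log.yaml"
    demo_log.write_text("source_chain: [unterminated\n", encoding="utf-8")
    moved_to = quarantine_log(demo_log)
    original_gone = not demo_log.exists()
    backup_kept = moved_to.read_text(encoding="utf-8")
    nothing_to_do = quarantine_log(demo_log)

print(moved_to.name)
```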
Integration¶
With Corpus Preparation¶
The log is used by corpus-prep to document all sources:
# In corpus-sources-YD-87-3.yaml
sources_fetched:
  primary: ["Shulchan Arukh, Yoreh De'ah 87:3"]
  commentaries: ["Shakh...", "Taz..."]
  chain: ["Tur...", "Rambam...", "Chullin 104b"]
# Derived from source-chain-log.yaml
With Session Archive¶
The log is archived with the session:
docs/encoding-sessions/yd_87_3_2026-01-25/
├── source-chain-log.yaml # Complete fetch history
└── ...
Related Documentation¶
- Hooks Overview - All hooks
- Corpus Preparation Skill - Uses logged sources
- Troubleshooting - MCP issues