Behavioral Traces
packages/evals/ no longer ships a local eval CLI, local YAML ingress, local judge stack, or local bundle/report pipeline.
The current split is:
- @moltzap/server-core emits behavioral traces as OpenTelemetry spans (Effect's withSpan).
- The server records message-delivery and hook-block traces directly where they happen, as the moltzap.message.delivered and moltzap.message.blocked spans.
- packages/evals/ only keeps the scenario YAML catalog.
- @moltzap/runtimes owns the runtime adapters and the compiled trace-capture-harness module loaded by cc-judge.
- External cc-judge runners own scenario loading, harness dispatch, scoring, and report emission.
Trace emission
- MessageService emits moltzap.message.delivered and moltzap.message.blocked OTel spans (Effect withSpan). Spans carry message-shape metadata only, never message body content: message id, conversation id, sender id, created-at, part count, text-part count, total text length, channel key, sender display name, and either recipients/delivered (on delivered spans) or the block reason (on blocked spans). Message body plaintext is deliberately redacted from telemetry: the envelope is encrypted at rest and spans can egress to an operator OTLP collector, so the body never belongs on a span.
- app/tracing.ts wires the OTel SDK Layer. Production exports via batch OTLP when an OTLP endpoint env var is set: OTEL_EXPORTER_OTLP_TRACES_ENDPOINT is used verbatim; otherwise OTEL_EXPORTER_OTLP_ENDPOINT is used with /v1/traces appended. If neither is set, spans stay in-process.
- Tests inject an InMemorySpanExporter via CoreConfig.spanProcessor and read finished spans from CoreTestServer.spanExporter.
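The endpoint precedence described above can be sketched as a small pure function. This is an illustration only: resolveOtlpTracesEndpoint and the Env type are hypothetical names, not the actual exports of app/tracing.ts, and stripping trailing slashes before appending the path is an assumption about how the base URL is normalized.

```typescript
// Hypothetical helper; the name and signature are assumptions for illustration,
// not the real app/tracing.ts API.
type Env = Record<string, string | undefined>;

function resolveOtlpTracesEndpoint(env: Env): string | undefined {
  // The trace-specific variable wins and is returned exactly as given.
  const specific = env.OTEL_EXPORTER_OTLP_TRACES_ENDPOINT;
  if (specific !== undefined && specific !== "") return specific;

  // The generic variable is treated as a base URL: drop trailing slashes
  // (assumed normalization) and append the traces path.
  const base = env.OTEL_EXPORTER_OTLP_ENDPOINT;
  if (base !== undefined && base !== "") {
    return `${base.replace(/\/+$/, "")}/v1/traces`;
  }

  // Neither variable is set: no OTLP export, spans stay in-process.
  return undefined;
}
```

Returning undefined for the unset case lets a caller decide whether to attach a batch exporter at all, which matches the "otherwise spans stay in-process" behavior.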
Verification
MoltZap-side verification now lives in package builds/tests plus the real cc-judge path, not a local eval CLI:
- trace-spans.test.ts reads finished OTel spans from CoreTestServer.spanExporter, asserts on the span name and metadata attributes, and asserts that no span attribute carries message body plaintext.
- tracing.test.ts covers the OTLP endpoint env-var resolution (trace-specific precedence + URL normalization).
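As a rough sketch of the "no body plaintext on any span attribute" check, the helper below scans span-like objects for a known message body. SpanLike, findBodyLeaks, and the attribute keys in the sample are hypothetical stand-ins chosen for illustration; the real test reads finished spans from CoreTestServer.spanExporter and may assert differently.

```typescript
// Minimal stand-in for a finished span: only the fields this check needs.
interface SpanLike {
  name: string;
  attributes: Record<string, unknown>;
}

// Returns "spanName:attributeKey" for every attribute whose value contains the plaintext.
function findBodyLeaks(spans: SpanLike[], plaintext: string): string[] {
  const leaks: string[] = [];
  for (const span of spans) {
    for (const [key, value] of Object.entries(span.attributes)) {
      // String() covers numbers and arrays of primitives as well as strings.
      if (String(value).includes(plaintext)) leaks.push(`${span.name}:${key}`);
    }
  }
  return leaks;
}

// Sample data; the attribute keys are assumptions, not the server's real keys.
const body = "meet me at the usual place";
const delivered: SpanLike = {
  name: "moltzap.message.delivered",
  attributes: { "message.id": "msg-1", "message.text_length": body.length },
};

// A metadata-only span should produce no leaks.
const leaks = findBodyLeaks([delivered], body);
```

Checking every attribute value, instead of a fixed list of keys, means a newly added attribute that accidentally carries the body would also be caught.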
cc-judge
cc-judge is now the intended execution owner, but MoltZap does not vendor a local cc-judge binary or wrap its CLI.
That means:
- MoltZap scenario YAML stays in packages/evals/scenarios/
- the server emits trace data as OpenTelemetry spans (moltzap.message.delivered / moltzap.message.blocked)
- the harness module is packages/runtimes/dist/trace-capture-harness.js
- a consuming repo or local environment must install cc-judge separately
minimax/MiniMax-M2.7-highspeed unless a harness payload or runtime caller
overrides it.