[{"data":1,"prerenderedAt":1466},["ShallowReactive",2],{"blog-\u002Fblog\u002Fstopping-a-streaming-llm-agent":3,"blog-posts":756},{"id":4,"title":5,"author":6,"authorAvatar":7,"authorBio":8,"authorRole":9,"body":10,"cover":742,"coverDark":743,"date":744,"description":745,"draft":746,"extension":747,"meta":748,"navigation":264,"path":749,"seo":750,"stem":751,"tags":752,"video":754,"__hash__":755},"blog\u002Fblog\u002F2.stopping-a-streaming-llm-agent.md","Token accounting and cancellation in durable agent workflows","Meir Zana","\u002Fblog\u002Fimg\u002Fmeirz.webp","We are researchers and engineers building tools that help people reason over large bodies of literature without losing the thread back to the source.","Founder",{"type":11,"value":12,"toc":733},"minimark",[13,17,20,23,26,31,39,42,45,48,51,54,57,60,77,80,84,87,94,97,104,111,114,118,125,140,151,161,165,184,187,190,193,290,293,408,415,424,428,435,441,444,451,562,565,568,572,575,578,584,708,716,719,723,726,729],[14,15,16],"p",{},"Stopping a local coding agent and stopping a remote agent look like the same product gesture. They are not the same engineering problem.",[14,18,19],{},"In a local harness, like Codex or Claude Code, Stop can mean killing the agent loop that is currently driving the model. In a hosted agent setting, the browser stream is only a client connection receiving updates. Killing that stream does not necessarily stop the workflow, the model call, the database writes, or the credit meter behind it.",[14,21,22],{},"Once a remote workflow can both modify application state and consume billable tokens, Stop becomes a backend protocol. The system has to stop future work, preserve committed writes, update the UI to match persisted state, and finalize token usage into credits. 
Getting it right is a task that spans the frontend, the API server, the durable workflow engine (Temporal), and the agent worker running the workflow and its activities.",[14,24,25],{},"This post reviews how our implementation moved from treating the SSE connection as the main control surface to an orderly cancellation process for the workflow behind it.",[27,28,30],"h2",{"id":29},"disconnect-is-not-cancel","Disconnect is not cancel",[14,32,33,34,38],{},"If the agent loop runs inside the API server request context, cancellation is easier to handle. When the browser closes the connection, the server can observe the request abort, propagate an ",[35,36,37],"code",{},"AbortSignal"," into the model call, and run whatever cleanup the handler owns.",[14,40,41],{},"That pattern stops being enough when the work is meant to outlive the request, or when it is carried out in a separate worker.",[14,43,44],{},"Long-running agent workflows usually run in a durable workflow engine, such as Temporal, because the client connection is not a reliable lifetime boundary. The workflow can retry steps after transient failures, resume after a worker restart, and continue after a user refreshes the browser. That is what we needed for long research operations.",[14,46,47],{},"It also means a browser disconnect is not a cancellation signal.",[14,49,50],{},"In Agent Bayes, the research agent runs as a Temporal workflow on a separate worker pool. The browser receives progress over Server-Sent Events, but that stream is only an observer of the workflow. Refreshing the page or navigating away aborts the local SSE request, but the workflow keeps running. The client can later load persisted messages and reconnect to the running conversation stream.",[14,52,53],{},"This also lets users run agents across several mindmaps without keeping several browser streams alive until each one finishes. The workflow is the durable unit of work. 
The SSE connection is only the live display for whichever operation the user is watching right now.",[14,55,56],{},"That separation matters in ordinary failure cases too. A flaky Wi-Fi connection should not kill three minutes of paid work, because losing Wi-Fi should only cost patience, not credits.",[14,58,59],{},"So we have two actions that look similar in the browser and mean different things on the backend:",[61,62,63,71],"ul",{},[64,65,66,70],"li",{},[67,68,69],"strong",{},"Disconnect"," aborts the local SSE request and leaves the Temporal workflow alone.",[64,72,73,76],{},[67,74,75],{},"Cancel"," calls the cancel endpoint for the current session, asks Temporal to cancel that workflow, and reflects any state changes that occurred before cancellation completes.",[14,78,79],{},"The rest of the implementation follows from that split: request cancellation through the API, let Temporal deliver it to the workflow execution, let a running activity observe it through heartbeats when needed, then finalize state and accounting from the workflow.",[27,81,83],{"id":82},"cancellation-has-to-settle-state-and-accounting","Cancellation has to settle state and accounting",[14,85,86],{},"Once an agent can change application state, cancellation has to be graceful for two reasons.",[14,88,89,90,93],{},"First, the frontend state has to reflect what actually happened. If the user clicks Stop while the agent is writing nodes into a mindmap, any edits committed before cancellation completes still need to appear in the UI. Otherwise, the visible map diverges from the persisted one. The cancellation path also has to converge on a durable answer status, in our case ",[35,91,92],{},"CANCELLED",", so replay, refresh, and audit history all agree.",[14,95,96],{},"Second, and perhaps more importantly, usage accounting has to finalize. If a workflow stops before accounting closes, provider usage and the product credit ledger can fall out of sync. 
The provider may still charge for tokens already generated, while the product never records the corresponding credit spend. Over time, that becomes a serious leak.",[14,98,99,100,103],{},"Most products that sell credits have an exchange rate between provider usage tokens and product credits. In our case, we want the user's credit history to show one meaningful row for the workflow, such as ",[35,101,102],{},"mindmap",", not ten rows that expose internal multi-agent steps, tool calls, and retries. Hopefully nobody opens a credit ledger hoping to reverse engineer your orchestration graph.",[14,105,106,107,110],{},"During a workflow, each model response records actual token usage under a shared ",[35,108,109],{},"operation_id",". The credit ledger reserves capacity at the start, then finalizes once at the end by aggregating the operation totals. The user sees one spend entry for the workflow, while the system still keeps per-step usage for internal accounting.",[14,112,113],{},"Cancellation is the case where those two ledgers are most likely to diverge. The workflow has to stop future work, preserve committed state, aggregate whatever usage has landed, charge only the actual credits consumed, and release the unused reserve.",[27,115,117],{"id":116},"temporal-cancellation-is-a-request-not-a-terminal-state","Temporal cancellation is a request, not a terminal state",[14,119,120,121,124],{},"In our API, the cancel endpoint finds the running workflow for the mindmap session and calls ",[35,122,123],{},"handle.cancel()"," on its workflow ID. That call requests cancellation from Temporal, but cancellation is cooperative. The workflow has to observe the request and stop itself. Temporal does not interrupt arbitrary code running inside a workflow or activity.",[14,126,127,128,131,132,135,136,139],{},"The workflow code handles cancellation in two places. 
If cancellation is observed while the workflow is not inside an activity, the workflow catches ",[35,129,130],{},"asyncio.CancelledError",", marks the run as cancelled, and re-raises. If cancellation is observed while the workflow is waiting on an activity, the workflow catches an ",[35,133,134],{},"ActivityError"," whose cause is Temporal ",[35,137,138],{},"CancelledError",", then marks the run as cancelled.",[14,141,142,143,146,147,150],{},"Both cases rely on the same ",[35,144,145],{},"finally"," block. The workflow always calls ",[35,148,149],{},"finalize_workflow",". That finalization activity records the token summary, writes the cancelled answer state, and finalizes credits in an idempotent way.",[14,152,153,154,156,157,160],{},"The SSE connection is not part of the cancellation mechanism. It keeps reading persisted agent messages and the answer row. When the answer status becomes ",[35,155,92],{},", it emits an ",[35,158,159],{},"answer_completed"," SSE event and closes.",[27,162,164],{"id":163},"temporal-cancellation-depends-on-activity-heartbeats","Temporal cancellation depends on activity heartbeats",[14,166,167,168,171,172,175,176,183],{},"Temporal cannot interrupt arbitrary code running inside a remote activity. The activity has to cooperate. For non-local activities, the Python SDK requires a ",[35,169,170],{},"heartbeat_timeout"," and calls to ",[35,173,174],{},"activity.heartbeat()"," so the worker can receive a ",[177,178,182],"a",{"href":179,"rel":180},"https:\u002F\u002Fdocs.temporal.io\u002Fdevelop\u002Fpython\u002Fworkflows\u002Fcancellation",[181],"nofollow","cancellation request",".",[14,185,186],{},"The heartbeat point needs to be inside the work that can run for a long time. In our case, that is the innermost loop over LangGraph stream events, not the workflow loop around whole agent roles.",[14,188,189],{},"A Researcher role may stream tokens for a while, call a retrieval tool, and then resume streaming. 
If the activity heartbeats only after the role completes, cancellation cannot be observed until that role has already finished.",[14,191,192],{},"We pass a callback into the stream loop and invoke it after each event has been processed:",[194,195,200],"pre",{"className":196,"code":197,"language":198,"meta":199,"style":199},"language-python shiki shiki-themes github-light github-dark","async for stream_mode, data in agent.astream(\n    {\"messages\": messages},\n    **kwargs,\n):\n    # Convert and persist token, tool-call, and tool-result events here.\n\n    if on_event:\n        await on_event({\"stream_mode\": stream_mode})\n","python","",[35,201,202,224,237,246,252,259,266,275],{"__ignoreMap":199},[203,204,207,211,214,218,221],"span",{"class":205,"line":206},"line",1,[203,208,210],{"class":209},"szBVR","async",[203,212,213],{"class":209}," for",[203,215,217],{"class":216},"sVt8B"," stream_mode, data ",[203,219,220],{"class":209},"in",[203,222,223],{"class":216}," agent.astream(\n",[203,225,227,230,234],{"class":205,"line":226},2,[203,228,229],{"class":216},"    {",[203,231,233],{"class":232},"sZZnC","\"messages\"",[203,235,236],{"class":216},": messages},\n",[203,238,240,243],{"class":205,"line":239},3,[203,241,242],{"class":209},"    **",[203,244,245],{"class":216},"kwargs,\n",[203,247,249],{"class":205,"line":248},4,[203,250,251],{"class":216},"):\n",[203,253,255],{"class":205,"line":254},5,[203,256,258],{"class":257},"sJ8bj","    # Convert and persist token, tool-call, and tool-result events here.\n",[203,260,262],{"class":205,"line":261},6,[203,263,265],{"emptyLinePlaceholder":264},true,"\n",[203,267,269,272],{"class":205,"line":268},7,[203,270,271],{"class":209},"    if",[203,273,274],{"class":216}," on_event:\n",[203,276,278,281,284,287],{"class":205,"line":277},8,[203,279,280],{"class":209},"        await",[203,282,283],{"class":216}," on_event({",[203,285,286],{"class":232},"\"stream_mode\"",[203,288,289],{"class":216},": 
stream_mode})\n",[14,291,292],{},"The activity supplies the callback:",[194,294,296],{"className":196,"code":295,"language":198,"meta":199,"style":199},"_last_heartbeat = time.monotonic()\n\nasync def _on_event(event: dict) -> None:\n    nonlocal _last_heartbeat\n    now = time.monotonic()\n    if activity.is_cancelled() or (now - _last_heartbeat) >= 5.0:\n        activity.heartbeat(f\"...\")\n        _last_heartbeat = now\n",[35,297,298,309,313,340,348,357,384,398],{"__ignoreMap":199},[203,299,300,303,306],{"class":205,"line":206},[203,301,302],{"class":216},"_last_heartbeat ",[203,304,305],{"class":209},"=",[203,307,308],{"class":216}," time.monotonic()\n",[203,310,311],{"class":205,"line":226},[203,312,265],{"emptyLinePlaceholder":264},[203,314,315,317,320,324,327,331,334,337],{"class":205,"line":239},[203,316,210],{"class":209},[203,318,319],{"class":209}," def",[203,321,323],{"class":322},"sScJk"," _on_event",[203,325,326],{"class":216},"(event: ",[203,328,330],{"class":329},"sj4cs","dict",[203,332,333],{"class":216},") -> ",[203,335,336],{"class":329},"None",[203,338,339],{"class":216},":\n",[203,341,342,345],{"class":205,"line":248},[203,343,344],{"class":209},"    nonlocal",[203,346,347],{"class":216}," _last_heartbeat\n",[203,349,350,353,355],{"class":205,"line":254},[203,351,352],{"class":216},"    now ",[203,354,305],{"class":209},[203,356,308],{"class":216},[203,358,359,361,364,367,370,373,376,379,382],{"class":205,"line":261},[203,360,271],{"class":209},[203,362,363],{"class":216}," activity.is_cancelled() ",[203,365,366],{"class":209},"or",[203,368,369],{"class":216}," (now ",[203,371,372],{"class":209},"-",[203,374,375],{"class":216}," _last_heartbeat) ",[203,377,378],{"class":209},">=",[203,380,381],{"class":329}," 5.0",[203,383,339],{"class":216},[203,385,386,389,392,395],{"class":205,"line":268},[203,387,388],{"class":216},"        
activity.heartbeat(",[203,390,391],{"class":209},"f",[203,393,394],{"class":232},"\"...\"",[203,396,397],{"class":216},")\n",[203,399,400,403,405],{"class":205,"line":277},[203,401,402],{"class":216},"        _last_heartbeat ",[203,404,305],{"class":209},[203,406,407],{"class":216}," now\n",[14,409,410,411,414],{},"The stream loop calls ",[35,412,413],{},"on_event"," after it processes each event. That ordering gives token usage events and tool completion events a chance to be yielded and persisted before the heartbeat delivers the cancellation request that unwinds the activity.",[14,416,417,418,423],{},"This is still cooperative cancellation. If the activity is inside one long awaited tool call, the callback cannot run until control returns. ",[177,419,422],{"href":420,"rel":421},"https:\u002F\u002Fgithub.com\u002Ftemporalio\u002Fsdk-python\u002Fissues\u002F700",[181],"A Temporal Python SDK issue about activity cancellation"," describes the same edge: cancellation is delivered through heartbeats, and Python async cancellation still needs an await point.",[27,425,427],{"id":426},"the-workflow-waits-before-finalizing-accounting","The workflow waits before finalizing accounting",[14,429,430,431,434],{},"Temporal's ",[35,432,433],{},"ActivityCancellationType"," decides what the workflow should wait for after requesting activity cancellation.",[14,436,437,440],{},[35,438,439],{},"TRY_CANCEL"," sends cancellation to the activity and lets the workflow move on promptly. That shortens the workflow's cancellation path, but it does not guarantee that the activity has finished unwinding.",[14,442,443],{},"If the workflow moves to finalization while the activity is still unwinding, finalization can read an incomplete operation total. The last model response may have generated tokens, but the usage event may not have reached the database yet. 
Now the credit ledger finalizes too early.",[14,445,446,447,450],{},"For an agent role activity, we use ",[35,448,449],{},"WAIT_CANCELLATION_COMPLETED",":",[194,452,454],{"className":196,"code":453,"language":198,"meta":199,"style":199},"result = await workflow.execute_activity(\n    MindmapAgentActivities.execute_role,\n    args=[...],\n    start_to_close_timeout=timedelta(...),\n    heartbeat_timeout=timedelta(...),\n    retry_policy=RetryPolicy(...),\n    cancellation_type=(\n        ActivityCancellationType.WAIT_CANCELLATION_COMPLETED\n    ),\n)\n",[35,455,456,469,474,491,506,519,533,543,551,557],{"__ignoreMap":199},[203,457,458,461,463,466],{"class":205,"line":206},[203,459,460],{"class":216},"result ",[203,462,305],{"class":209},[203,464,465],{"class":209}," await",[203,467,468],{"class":216}," workflow.execute_activity(\n",[203,470,471],{"class":205,"line":226},[203,472,473],{"class":216},"    MindmapAgentActivities.execute_role,\n",[203,475,476,480,482,485,488],{"class":205,"line":239},[203,477,479],{"class":478},"s4XuR","    args",[203,481,305],{"class":209},[203,483,484],{"class":216},"[",[203,486,487],{"class":329},"...",[203,489,490],{"class":216},"],\n",[203,492,493,496,498,501,503],{"class":205,"line":248},[203,494,495],{"class":478},"    start_to_close_timeout",[203,497,305],{"class":209},[203,499,500],{"class":216},"timedelta(",[203,502,487],{"class":329},[203,504,505],{"class":216},"),\n",[203,507,508,511,513,515,517],{"class":205,"line":254},[203,509,510],{"class":478},"    heartbeat_timeout",[203,512,305],{"class":209},[203,514,500],{"class":216},[203,516,487],{"class":329},[203,518,505],{"class":216},[203,520,521,524,526,529,531],{"class":205,"line":261},[203,522,523],{"class":478},"    retry_policy",[203,525,305],{"class":209},[203,527,528],{"class":216},"RetryPolicy(",[203,530,487],{"class":329},[203,532,505],{"class":216},[203,534,535,538,540],{"class":205,"line":268},[203,536,537],{"class":478},"    
cancellation_type",[203,539,305],{"class":209},[203,541,542],{"class":216},[203,544,545,548],{"class":205,"line":277},[203,546,547],{"class":216},"        ActivityCancellationType.",[203,549,550],{"class":329},"WAIT_CANCELLATION_COMPLETED\n",[203,552,554],{"class":205,"line":553},9,[203,555,556],{"class":216},"    ),\n",[203,558,560],{"class":205,"line":559},10,[203,561,397],{"class":216},[14,563,564],{},"This ensures finalization does not run until the activity has actually closed.",[14,566,567],{},"The implementation separates two concerns: cancellation latency, which depends on how often the activity heartbeats, is handled inside the activity, and accounting order is handled by the workflow.",[27,569,571],{"id":570},"finalization-commits-usage-and-terminal-state","Finalization commits usage and terminal state",[14,573,574],{},"At the start of a run we reserve credits. Each model response records actual token usage under the operation ID. At the end, finalization charges the recorded usage and releases the unused reserve.",[14,576,577],{},"Finalization is not cleanup after the main work. It is the step that commits the terminal state of the operation. 
It must run after success, cancellation, resource deletion (think of the case where the agent is working on a task and the user deletes the mindmap, because PhDs can be dramatic sometimes), and failure, and it must be safe to retry.",[14,579,580,581,583],{},"The workflow carries terminal state into a single ",[35,582,145],{}," block:",[194,585,587],{"className":196,"code":586,"language":198,"meta":199,"style":199},"finally:\n    await workflow.execute_activity(\n        MindmapAgentActivities.finalize_workflow,\n        args=[MindmapAgentFinalizeInput(\n            ...\n        )],\n        start_to_close_timeout=timedelta(...),\n        retry_policy=RetryPolicy(\n            maximum_attempts=...,\n            backoff_coefficient=...,\n        ),\n        cancellation_type=(\n            ActivityCancellationType.WAIT_CANCELLATION_COMPLETED\n        ),\n    )\n",[35,588,589,595,602,607,617,622,627,640,650,662,673,679,689,697,702],{"__ignoreMap":199},[203,590,591,593],{"class":205,"line":206},[203,592,145],{"class":209},[203,594,339],{"class":216},[203,596,597,600],{"class":205,"line":226},[203,598,599],{"class":209},"    await",[203,601,468],{"class":216},[203,603,604],{"class":205,"line":239},[203,605,606],{"class":216},"        MindmapAgentActivities.finalize_workflow,\n",[203,608,609,612,614],{"class":205,"line":248},[203,610,611],{"class":478},"        args",[203,613,305],{"class":209},[203,615,616],{"class":216},"[MindmapAgentFinalizeInput(\n",[203,618,619],{"class":205,"line":254},[203,620,621],{"class":329},"            ...\n",[203,623,624],{"class":205,"line":261},[203,625,626],{"class":216},"        )],\n",[203,628,629,632,634,636,638],{"class":205,"line":268},[203,630,631],{"class":478},"        start_to_close_timeout",[203,633,305],{"class":209},[203,635,500],{"class":216},[203,637,487],{"class":329},[203,639,505],{"class":216},[203,641,642,645,647],{"class":205,"line":277},[203,643,644],{"class":478},"        
retry_policy",[203,646,305],{"class":209},[203,648,649],{"class":216},"RetryPolicy(\n",[203,651,652,655,657,659],{"class":205,"line":553},[203,653,654],{"class":478},"            maximum_attempts",[203,656,305],{"class":209},[203,658,487],{"class":329},[203,660,661],{"class":216},",\n",[203,663,664,667,669,671],{"class":205,"line":559},[203,665,666],{"class":478},"            backoff_coefficient",[203,668,305],{"class":209},[203,670,487],{"class":329},[203,672,661],{"class":216},[203,674,676],{"class":205,"line":675},11,[203,677,678],{"class":216},"        ),\n",[203,680,682,685,687],{"class":205,"line":681},12,[203,683,684],{"class":478},"        cancellation_type",[203,686,305],{"class":209},[203,688,542],{"class":216},[203,690,692,695],{"class":205,"line":691},13,[203,693,694],{"class":216},"            ActivityCancellationType.",[203,696,550],{"class":329},[203,698,700],{"class":205,"line":699},14,[203,701,678],{"class":216},[203,703,705],{"class":205,"line":704},15,[203,706,707],{"class":216},"    )\n",[14,709,710,712,713,183],{},[35,711,149],{}," records a token summary, writes the terminal answer and conversation statuses, and calls ",[35,714,715],{},"credits_manager.finalize(operation_id)",[14,717,718],{},"Credit finalization is idempotent. It takes an advisory lock on the operation ID, checks whether usage has already been recorded, aggregates all usage events for that operation, inserts one or more credit usage rows, and expires the reserve. If credit finalization fails, the activity raises and Temporal retries it.",[27,720,722],{"id":721},"cancellation-is-complete-after-finalization","Cancellation is complete after finalization",[14,724,725],{},"The terminal state should be driven by the durable record written after finalization. 
By then, committed application state has been preserved, the answer has a final status, token usage has been aggregated, the recorded usage has been charged, and the unused credit reserve has been released.",[14,727,728],{},"If you made it this far, congratulations. Your reading session can now be finalized, with zero credits charged.",[730,731,732],"style",{},"html pre.shiki code .szBVR, html code.shiki .szBVR{--shiki-default:#D73A49;--shiki-dark:#F97583}html pre.shiki code .sVt8B, html code.shiki .sVt8B{--shiki-default:#24292E;--shiki-dark:#E1E4E8}html pre.shiki code .sZZnC, html code.shiki .sZZnC{--shiki-default:#032F62;--shiki-dark:#9ECBFF}html pre.shiki code .sJ8bj, html code.shiki .sJ8bj{--shiki-default:#6A737D;--shiki-dark:#6A737D}html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html pre.shiki code .sScJk, html code.shiki .sScJk{--shiki-default:#6F42C1;--shiki-dark:#B392F0}html pre.shiki code .sj4cs, html code.shiki .sj4cs{--shiki-default:#005CC5;--shiki-dark:#79B8FF}html pre.shiki code .s4XuR, html code.shiki 
.s4XuR{--shiki-default:#E36209;--shiki-dark:#FFAB70}",{"title":199,"searchDepth":248,"depth":248,"links":734},[735,736,737,738,739,740,741],{"id":29,"depth":226,"text":30},{"id":82,"depth":226,"text":83},{"id":116,"depth":226,"text":117},{"id":163,"depth":226,"text":164},{"id":426,"depth":226,"text":427},{"id":570,"depth":226,"text":571},{"id":721,"depth":226,"text":722},"\u002Fblog\u002Fimg\u002Ftoken-accounting-cancellation-cover-light.webp","\u002Fblog\u002Fimg\u002Ftoken-accounting-cancellation-cover-dark.webp","2026-07-02","Closing an LLM stream is not the same as stopping the work behind it. Durable agents need cancellation paths that settle application state and credit accounting before the UI declares victory.",false,"md",{},"\u002Fblog\u002Fstopping-a-streaming-llm-agent",{"title":5,"description":745},"blog\u002F2.stopping-a-streaming-llm-agent",[753],"Engineering","\u002Fblog\u002Fvideo\u002Fcancelling.mp4","4fxFpS5IHqvF6qyeljJltvA_FAv9ZzygoT4DX5yGJEo",[757,1266],{"id":4,"title":5,"author":6,"authorAvatar":7,"authorBio":8,"authorRole":9,"body":758,"cover":742,"coverDark":743,"date":744,"description":745,"draft":746,"extension":747,"meta":1263,"navigation":264,"path":749,"seo":1264,"stem":751,"tags":1265,"video":754,"__hash__":755},{"type":11,"value":759,"toc":1254},[760,762,764,766,768,770,774,776,778,780,782,784,786,788,798,800,802,804,808,810,814,818,820,822,826,834,840,846,848,857,859,861,863,921,923,1009,1013,1018,1020,1024,1028,1030,1034,1122,1124,1126,1128,1130,1132,1136,1238,1244,1246,1248,1250,1252],[14,761,16],{},[14,763,19],{},[14,765,22],{},[14,767,25],{},[27,769,30],{"id":29},[14,771,33,772,38],{},[35,773,37],{},[14,775,41],{},[14,777,44],{},[14,779,47],{},[14,781,50],{},[14,783,53],{},[14,785,56],{},[14,787,59],{},[61,789,790,794],{},[64,791,792,70],{},[67,793,69],{},[64,795,796,76],{},[67,797,75],{},[14,799,79],{},[27,801,83],{"id":82},[14,803,86],{},[14,805,89,806,93],{},[35,807,92],{},[14,809,96],{},[14,811,99,812,103],{},[35,813,102],{},[1
4,815,106,816,110],{},[35,817,109],{},[14,819,113],{},[27,821,117],{"id":116},[14,823,120,824,124],{},[35,825,123],{},[14,827,127,828,131,830,135,832,139],{},[35,829,130],{},[35,831,134],{},[35,833,138],{},[14,835,142,836,146,838,150],{},[35,837,145],{},[35,839,149],{},[14,841,153,842,156,844,160],{},[35,843,92],{},[35,845,159],{},[27,847,164],{"id":163},[14,849,167,850,171,852,175,854,183],{},[35,851,170],{},[35,853,174],{},[177,855,182],{"href":179,"rel":856},[181],[14,858,186],{},[14,860,189],{},[14,862,192],{},[194,864,865],{"className":196,"code":197,"language":198,"meta":199,"style":199},[35,866,867,879,887,893,897,901,905,911],{"__ignoreMap":199},[203,868,869,871,873,875,877],{"class":205,"line":206},[203,870,210],{"class":209},[203,872,213],{"class":209},[203,874,217],{"class":216},[203,876,220],{"class":209},[203,878,223],{"class":216},[203,880,881,883,885],{"class":205,"line":226},[203,882,229],{"class":216},[203,884,233],{"class":232},[203,886,236],{"class":216},[203,888,889,891],{"class":205,"line":239},[203,890,242],{"class":209},[203,892,245],{"class":216},[203,894,895],{"class":205,"line":248},[203,896,251],{"class":216},[203,898,899],{"class":205,"line":254},[203,900,258],{"class":257},[203,902,903],{"class":205,"line":261},[203,904,265],{"emptyLinePlaceholder":264},[203,906,907,909],{"class":205,"line":268},[203,908,271],{"class":209},[203,910,274],{"class":216},[203,912,913,915,917,919],{"class":205,"line":277},[203,914,280],{"class":209},[203,916,283],{"class":216},[203,918,286],{"class":232},[203,920,289],{"class":216},[14,922,292],{},[194,924,925],{"className":196,"code":295,"language":198,"meta":199,"style":199},[35,926,927,935,939,957,963,971,991,1001],{"__ignoreMap":199},[203,928,929,931,933],{"class":205,"line":206},[203,930,302],{"class":216},[203,932,305],{"class":209},[203,934,308],{"class":216},[203,936,937],{"class":205,"line":226},[203,938,265],{"emptyLinePlaceholder":264},[203,940,941,943,945,947,949,951,953,955],{"class":205,"line":2
39},[203,942,210],{"class":209},[203,944,319],{"class":209},[203,946,323],{"class":322},[203,948,326],{"class":216},[203,950,330],{"class":329},[203,952,333],{"class":216},[203,954,336],{"class":329},[203,956,339],{"class":216},[203,958,959,961],{"class":205,"line":248},[203,960,344],{"class":209},[203,962,347],{"class":216},[203,964,965,967,969],{"class":205,"line":254},[203,966,352],{"class":216},[203,968,305],{"class":209},[203,970,308],{"class":216},[203,972,973,975,977,979,981,983,985,987,989],{"class":205,"line":261},[203,974,271],{"class":209},[203,976,363],{"class":216},[203,978,366],{"class":209},[203,980,369],{"class":216},[203,982,372],{"class":209},[203,984,375],{"class":216},[203,986,378],{"class":209},[203,988,381],{"class":329},[203,990,339],{"class":216},[203,992,993,995,997,999],{"class":205,"line":268},[203,994,388],{"class":216},[203,996,391],{"class":209},[203,998,394],{"class":232},[203,1000,397],{"class":216},[203,1002,1003,1005,1007],{"class":205,"line":277},[203,1004,402],{"class":216},[203,1006,305],{"class":209},[203,1008,407],{"class":216},[14,1010,410,1011,414],{},[35,1012,413],{},[14,1014,417,1015,423],{},[177,1016,422],{"href":420,"rel":1017},[181],[27,1019,427],{"id":426},[14,1021,430,1022,434],{},[35,1023,433],{},[14,1025,1026,440],{},[35,1027,439],{},[14,1029,443],{},[14,1031,446,1032,450],{},[35,1033,449],{},[194,1035,1036],{"className":196,"code":453,"language":198,"meta":199,"style":199},[35,1037,1038,1048,1052,1064,1076,1088,1100,1108,1114,1118],{"__ignoreMap":199},[203,1039,1040,1042,1044,1046],{"class":205,"line":206},[203,1041,460],{"class":216},[203,1043,305],{"class":209},[203,1045,465],{"class":209},[203,1047,468],{"class":216},[203,1049,1050],{"class":205,"line":226},[203,1051,473],{"class":216},[203,1053,1054,1056,1058,1060,1062],{"class":205,"line":239},[203,1055,479],{"class":478},[203,1057,305],{"class":209},[203,1059,484],{"class":216},[203,1061,487],{"class":329},[203,1063,490],{"class":216},[203,1065,1066,1068,1070,
1072,1074],{"class":205,"line":248},[203,1067,495],{"class":478},[203,1069,305],{"class":209},[203,1071,500],{"class":216},[203,1073,487],{"class":329},[203,1075,505],{"class":216},[203,1077,1078,1080,1082,1084,1086],{"class":205,"line":254},[203,1079,510],{"class":478},[203,1081,305],{"class":209},[203,1083,500],{"class":216},[203,1085,487],{"class":329},[203,1087,505],{"class":216},[203,1089,1090,1092,1094,1096,1098],{"class":205,"line":261},[203,1091,523],{"class":478},[203,1093,305],{"class":209},[203,1095,528],{"class":216},[203,1097,487],{"class":329},[203,1099,505],{"class":216},[203,1101,1102,1104,1106],{"class":205,"line":268},[203,1103,537],{"class":478},[203,1105,305],{"class":209},[203,1107,542],{"class":216},[203,1109,1110,1112],{"class":205,"line":277},[203,1111,547],{"class":216},[203,1113,550],{"class":329},[203,1115,1116],{"class":205,"line":553},[203,1117,556],{"class":216},[203,1119,1120],{"class":205,"line":559},[203,1121,397],{"class":216},[14,1123,564],{},[14,1125,567],{},[27,1127,571],{"id":570},[14,1129,574],{},[14,1131,577],{},[14,1133,580,1134,583],{},[35,1135,145],{},[194,1137,1138],{"className":196,"code":586,"language":198,"meta":199,"style":199},[35,1139,1140,1146,1152,1156,1164,1168,1172,1184,1192,1202,1212,1216,1224,1230,1234],{"__ignoreMap":199},[203,1141,1142,1144],{"class":205,"line":206},[203,1143,145],{"class":209},[203,1145,339],{"class":216},[203,1147,1148,1150],{"class":205,"line":226},[203,1149,599],{"class":209},[203,1151,468],{"class":216},[203,1153,1154],{"class":205,"line":239},[203,1155,606],{"class":216},[203,1157,1158,1160,1162],{"class":205,"line":248},[203,1159,611],{"class":478},[203,1161,305],{"class":209},[203,1163,616],{"class":216},[203,1165,1166],{"class":205,"line":254},[203,1167,621],{"class":329},[203,1169,1170],{"class":205,"line":261},[203,1171,626],{"class":216},[203,1173,1174,1176,1178,1180,1182],{"class":205,"line":268},[203,1175,631],{"class":478},[203,1177,305],{"class":209},[203,1179,500],{"class":21
6},[203,1181,487],{"class":329},[203,1183,505],{"class":216},[203,1185,1186,1188,1190],{"class":205,"line":277},[203,1187,644],{"class":478},[203,1189,305],{"class":209},[203,1191,649],{"class":216},[203,1193,1194,1196,1198,1200],{"class":205,"line":553},[203,1195,654],{"class":478},[203,1197,305],{"class":209},[203,1199,487],{"class":329},[203,1201,661],{"class":216},[203,1203,1204,1206,1208,1210],{"class":205,"line":559},[203,1205,666],{"class":478},[203,1207,305],{"class":209},[203,1209,487],{"class":329},[203,1211,661],{"class":216},[203,1213,1214],{"class":205,"line":675},[203,1215,678],{"class":216},[203,1217,1218,1220,1222],{"class":205,"line":681},[203,1219,684],{"class":478},[203,1221,305],{"class":209},[203,1223,542],{"class":216},[203,1225,1226,1228],{"class":205,"line":691},[203,1227,694],{"class":216},[203,1229,550],{"class":329},[203,1231,1232],{"class":205,"line":699},[203,1233,678],{"class":216},[203,1235,1236],{"class":205,"line":704},[203,1237,707],{"class":216},[14,1239,1240,712,1242,183],{},[35,1241,149],{},[35,1243,715],{},[14,1245,718],{},[27,1247,722],{"id":721},[14,1249,725],{},[14,1251,728],{},[730,1253,732],{},{"title":199,"searchDepth":248,"depth":248,"links":1255},[1256,1257,1258,1259,1260,1261,1262],{"id":29,"depth":226,"text":30},{"id":82,"depth":226,"text":83},{"id":116,"depth":226,"text":117},{"id":163,"depth":226,"text":164},{"id":426,"depth":226,"text":427},{"id":570,"depth":226,"text":571},{"id":721,"depth":226,"text":722},{},{"title":5,"description":745},[753],{"id":1267,"title":1268,"author":6,"authorAvatar":7,"authorBio":8,"authorRole":9,"body":1269,"cover":1453,"coverDark":1454,"date":1455,"description":1456,"draft":746,"extension":747,"meta":1457,"navigation":264,"path":1458,"seo":1459,"stem":1460,"tags":1461,"video":1464,"__hash__":1465},"blog\u002Fblog\u002F1.introducing-agent-bayes.md","Introducing Agent Bayes: research you can actually 
trust",{"type":11,"value":1270,"toc":1443},[1271,1274,1277,1280,1284,1287,1293,1299,1305,1309,1312,1315,1318,1322,1325,1328,1331,1334,1337,1341,1344,1347,1350,1354,1357,1360,1363,1367,1411,1415,1418,1422,1435],[14,1272,1273],{},"Reading is the easy part. The hard part is holding fifty papers in your head at once: which ones agree, which ones contradict each other, where the evidence is thin, and exactly which sentence in which PDF backs the claim you are about to make. That work does not scale with effort. It scales with the number of sources, and at some point a literature review stops being a reading task and becomes a memory problem.",[14,1275,1276],{},"The tools meant to help have mostly made this worse. A general chatbot will happily summarize a paper it has never read, invent a citation that looks plausible, and forget the whole conversation the moment you close the tab. You get fluent text with no thread back to the evidence. For casual questions that is fine. For research, an unverifiable claim is worse than no claim at all.",[14,1278,1279],{},"We built Agent Bayes to close that gap. It is a multi-agent AI research assistant built around a shared, interactive mindmap, where every substantive claim is backed by a citation from your own library that you can open and check.",[27,1281,1283],{"id":1282},"the-problem-with-chatting-your-way-through-literature","The problem with chatting your way through literature",[14,1285,1286],{},"Most AI research tools are a chat window bolted onto a language model. That design carries three failures that matter enormously when you are doing serious work.",[14,1288,1289,1292],{},[67,1290,1291],{},"They hallucinate sources."," A model optimized to provide answers quickly will produce inaccurate citations that are not easily verifiable. It reads sources, summarizes them, and loses the nuances. 
From the text alone, you cannot tell an accurate citation from an inaccurate one, so you end up re-verifying everything by hand, which takes far more time and is frustrating when you discover an error after the fact.",[14,1294,1295,1298],{},[67,1296,1297],{},"They forget."," A chat thread is a transcript, not a workspace. There is no durable structure that accumulates as you work. There is no way to manage numerous follow-ups or trim what is no longer needed. You end up pasting in context you kept in external tools and hoping the model picks up where you left off.",[14,1300,1301,1304],{},[67,1302,1303],{},"They flatten disagreement."," When summarizing a contested topic, existing tools tend to average the views into a single confident paragraph. But the disagreement is the most important part of the literature. Smoothing it away hides exactly what a researcher needs to see.",[27,1306,1308],{"id":1307},"research-needs-a-structure-you-can-see","Research needs a structure you can see",[14,1310,1311],{},"Think about how a coding agent works on a real project. It does not operate on one long chat. It works against a codebase: a tree of files and modules, organized so that any part can be found, changed, and reasoned about on its own. That structure is what keeps the work tractable. Take it away, and even a capable agent loses the plot.",[14,1313,1314],{},"Research is no different. A literature review is not a flat list of summaries. It is a hierarchy: themes split into sub-themes, claims supported by evidence, positions answered by counter-positions. That structure is the actual product of synthesis. A wall of prose hides it, and a chat log destroys it.",[14,1316,1317],{},"A mindmap makes the hierarchy visible, and seeing it is what lets you act on it. You take in the shape of the argument at a glance: which branches run deep and which are thin, where two lines of evidence converge, which question still has no answer. You spot the gap because you can see the empty branch. 
You cut the tangent because you can watch it sprawling. You reorganize because the structure is in front of you, not buried in paragraphs you did not ask for or in notes you have to re-read and edit constantly just to navigate and make progress.",[27,1319,1321],{"id":1320},"where-deep-research-gets-you-halfway","Where \"deep research\" gets you halfway",[14,1323,1324],{},"\"Deep Research\" tools get closer to this than a plain chatbot. They work in two passes: a broad sweep to map out the aspects of a question, then a deeper dive into each one. That first pass is, in effect, a tree, with the aspects as branches. A mindmap is simply that tree made explicit and kept around.",[14,1326,1327],{},"The trouble is what happens next. Say the broad pass surfaces eight aspects and writes a long report covering all of them. Maybe three are genuinely interesting. You have paid for the other five in tokens and reading time, and now you want to go deeper on the three that matter. What are your options?",[14,1329,1330],{},"You can ask follow-ups in the same conversation, but as the research grows, the thread becomes impossible to track. Each new answer pushes the earlier structure further out of view, and nothing accumulates into a workspace you can navigate.",[14,1332,1333],{},"Or you can run a fresh deep research pass on each of the three aspects. Now you have three more reports, each with its own structure, none of them connected to the others. You are back to stitching documents together by hand. What you end up with is a pile of notes, citations you never verified pointing at sources you never opened, and claims flattened down to \"this is roughly the spirit of this paper\" rather than what the paper actually says. 
That is how you get academic slop, and it is a large part of why many researchers are wary of AI in serious work.",[14,1335,1336],{},"Agent Bayes is built so that the tree is the workspace itself, not a byproduct you throw away once the report is written. It stays around so you can expand or trim any branch, reorganize, rephrase, and edit in place as your understanding changes. And it treats provenance as the point, not an afterthought, which we cover below.",[27,1338,1340],{"id":1339},"a-mindmap-not-a-chat-log","A mindmap, not a chat log",[14,1342,1343],{},"Agent Bayes replaces the chat transcript with a persistent mindmap. The mindmap is the single source of truth for your research, and it grows and reorganizes as you work rather than scrolling away.",[14,1345,1346],{},"You bring a library of papers, uploaded as PDFs or synced from Zotero, and your instructions initiate a multi-agent pipeline that retrieves the relevant passages from your sources, synthesizes citation-backed claims, and writes them into the map as structured nodes. You direct every step, and the work stays yours.",[14,1348,1349],{},"Because the result is a structure rather than a wall of text, you can expand a branch, dive deeper, restructure an argument, reorganize nodes into chapters, or ask for a prose synthesis when you are ready to write. The mindmap is something you build on, not a message you scroll past.",[27,1351,1353],{"id":1352},"every-claim-traces-back-to-a-source","Every claim traces back to a source",[14,1355,1356],{},"This is the part we care about most. When the agent writes a claim, it does not just name a paper. It pins the claim to the exact passage it drew from, down to the specific page, and gives you a link straight to that spot in the source so you can read it in context. We have put a great deal of effort into reaching this level of provenance, because a citation you cannot check is not really a citation. 
The system does not fabricate sources, and it will tell you when the evidence is not there rather than filling the gap with confident prose.",[14,1358,1359],{},"When sources disagree, Agent Bayes preserves the disagreement. Contradicting viewpoints become sibling nodes in the map instead of being averaged into a false consensus. You see the shape of the debate, not a flattened summary of it.",[14,1361,1362],{},"You can also work without the agent at all. Search your library semantically, in English even when the papers are in other languages, and attach citations to nodes you wrote yourself. When you do, the agent can score how well your wording matches the cited passage and suggest a rewrite where the phrasing overstates the evidence.",[27,1364,1366],{"id":1365},"what-you-get-out-of-it","What you get out of it",[61,1368,1369,1375,1381,1387,1393,1399,1405],{},[64,1370,1371,1374],{},[67,1372,1373],{},"Provenance you can check."," Every claim is pinned to the exact passage and page it came from, with a direct link to the source. No invented citations, no unverifiable summaries.",[64,1376,1377,1380],{},[67,1378,1379],{},"A structure you can see."," Your research lives as a visible hierarchy you can navigate, expand, prune, and reorganize, instead of a transcript you scroll.",[64,1382,1383,1386],{},[67,1384,1385],{},"Work that compounds."," The mindmap persists and accumulates across sessions, so your research builds on itself instead of resetting every time you start a new chat.",[64,1388,1389,1392],{},[67,1390,1391],{},"Disagreement preserved."," Competing views are kept as distinct nodes, so the structure of a debate stays visible instead of being smoothed away.",[64,1394,1395,1398],{},[67,1396,1397],{},"You stay in control."," You direct the research, approve the structure, and write your own claims. 
The agent retrieves and drafts; you decide.",[64,1400,1401,1404],{},[67,1402,1403],{},"A library that answers back."," Semantic search across your whole corpus, across languages, with citations you can attach yourself.",[64,1406,1407,1410],{},[67,1408,1409],{},"Fast enough to stay in flow."," Typical workflows complete in 90 to 180 seconds, streaming updates into the map as they happen.",[27,1412,1414],{"id":1413},"who-this-is-for","Who this is for",[14,1416,1417],{},"Agent Bayes is built for people who work with bodies of literature and need to stand behind what they write: master's and PhD students, postdocs, researchers, analysts, and writers.",[27,1419,1421],{"id":1420},"getting-started","Getting started",[14,1423,1424,1425,1429,1430,1434],{},"The fastest way to understand Agent Bayes is to point it at a few papers and watch a mindmap take shape. Start with the ",[177,1426,1428],{"href":1427},"\u002Fdocs\u002Fgetting-started\u002Fquickstart","Quickstart"," to get your first map going in a few minutes, or read the ",[177,1431,1433],{"href":1432},"\u002Fdocs\u002Fgetting-started\u002Fintroduction","Introduction"," for the full tour.",[14,1436,1437,1438,1442],{},"We are just getting started, and we would love your feedback. 
If you want access, ",[177,1439,1441],{"href":1440},"\u002F#waitlist","join the waiting list"," and tell us what you are researching.",{"title":199,"searchDepth":248,"depth":248,"links":1444},[1445,1446,1447,1448,1449,1450,1451,1452],{"id":1282,"depth":226,"text":1283},{"id":1307,"depth":226,"text":1308},{"id":1320,"depth":226,"text":1321},{"id":1339,"depth":226,"text":1340},{"id":1352,"depth":226,"text":1353},{"id":1365,"depth":226,"text":1366},{"id":1413,"depth":226,"text":1414},{"id":1420,"depth":226,"text":1421},"\u002Fhero-screenshot-light.webp","\u002Fhero-screenshot-dark.webp","2026-07-01","Why we built a multi-agent research assistant around a shared, citation-backed mindmap, and how it fixes what generic chatbots get wrong about working with literature.",{},"\u002Fblog\u002Fintroducing-agent-bayes",{"title":1268,"description":1456},"blog\u002F1.introducing-agent-bayes",[1462,1463],"Announcements","Product",null,"6oGyUE5smgrcHWdSYCmOhW11rZdMv6VAzbZ_9yapwTs",1783002964088]