Telegram Attachments
Audience: operators who have already paired their phone with `--telegram` and want to send files to the fleet or have the fleet send files back. For pair / use / recover guidance see `docs/telegram-remote-steering.md`; for the security posture see `docs/telegram-threat-model.md`; for the developer architecture see `docs/telegram-architecture.md`.
agents-fleet's Telegram channel supports inbound file uploads (you send a file from your phone, the coordinator can read it) and outbound file delivery (the coordinator chooses to push a file back to every paired chat). Both directions reuse the same AttachmentStore and the same allowlist / auth gate as text steering — pairing requirements do not change.
At a glance
| You want to… | Do this |
|---|---|
| Send a screenshot to the fleet | Attach a photo in Telegram; add a caption explaining what to do with it |
| Send a log file or PDF | Attach as a document |
| Send a voice memo | Hold the mic button and record |
| Have the coordinator deliver a generated file back to you | Ask in your prompt; the coordinator decides whether to call send_attachment |
Inbound — sending a file to the bot
From any paired Telegram chat, attach a photo, document, audio, video, voice note, or video note and (optionally) add a caption.
1. The bot downloads the file (`BotTransport.downloadFile`) and stores it under `~/.fleet/attachments/<sessionId>/<unique-name>` via the `AttachmentStore`.
2. Your caption (if any) is echoed into the local CLI transcript as `[via Telegram] <caption>`, exactly like a plain-text message, so the operator at the host sees what was sent.
3. A second prompt is synthesized and dispatched to the coordinator; it combines the caption with one bullet per saved file:

       take a look at this screenshot
       [Operator shared attachment:
       - /home/me/.fleet/attachments/sess-…/screenshot.png (84213 bytes, image/png)]

4. The coordinator can `view` the file directly, hand it to a worker, or process it with any tool that accepts a path.
Filename handling
- Telegram supplies `fileName` for documents (and often for audio/video). When present, that name is sanitized and used.
- Photos, voice notes, and some stripped media arrive without a name. In that case the stored name is `<fileUniqueId><ext>`, where `<ext>` is derived from the reported MIME type (`image/jpeg` → `.jpg`, `audio/ogg` → `.ogg`, `application/pdf` → `.pdf`, etc.). Unknown MIME types fall back to `.bin`.
- Sanitization (`sanitizeAttachmentName` in `src/bot/attachmentStore.ts`): directory separators are stripped, leading dots are removed, characters outside `[a-zA-Z0-9._-]` are replaced with `_`, and trailing dots/spaces are trimmed. Empty names fall back to `attachment.bin`.
- Name collisions resolve by appending `-1`, `-2`, …, up to `-999` to the stem (extension preserved).
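The naming rules above can be sketched in a few pure functions. This is an illustrative re-implementation, not the code in `src/bot/attachmentStore.ts`: the names `sanitizeNameSketch`, `fallbackName`, and `resolveCollision` are hypothetical, the order in which the rules are applied is an assumption, and so is throwing once the `-999` suffix is exhausted.

```typescript
// Hypothetical sketch of the documented rules; the real sanitizeAttachmentName may differ.
function sanitizeNameSketch(raw: string): string {
  const cleaned = raw
    .replace(/[\/\\]/g, "")              // strip directory separators
    .replace(/^\.+/, "")                 // remove leading dots
    .replace(/[. ]+$/, "")               // trim trailing dots and spaces
    .replace(/[^a-zA-Z0-9._-]/g, "_");   // replace any other character with "_"
  return cleaned.length > 0 ? cleaned : "attachment.bin";
}

// Only the three MIME mappings named in the text; the real table is larger.
const EXT_BY_MIME = new Map<string, string>([
  ["image/jpeg", ".jpg"],
  ["audio/ogg", ".ogg"],
  ["application/pdf", ".pdf"],
]);

// Name used when Telegram supplies no fileName (photos, voice notes).
function fallbackName(fileUniqueId: string, mimeType?: string): string {
  return fileUniqueId + (EXT_BY_MIME.get(mimeType ?? "") ?? ".bin");
}

// Append -1 … -999 to the stem, keeping the extension.
function resolveCollision(name: string, taken: ReadonlySet<string>): string {
  if (!taken.has(name)) return name;
  const dot = name.lastIndexOf(".");
  const stem = dot > 0 ? name.slice(0, dot) : name;
  const ext = dot > 0 ? name.slice(dot) : "";
  for (let i = 1; i <= 999; i++) {
    const candidate = `${stem}-${i}${ext}`;
    if (!taken.has(candidate)) return candidate;
  }
  // Assumption: the source does not say what happens past -999.
  throw new Error(`no free name for ${name}`);
}
```

With these definitions, `sanitizeNameSketch("my report (final).pdf")` yields `my_report__final_.pdf`, and `resolveCollision("log.txt", new Set(["log.txt"]))` yields `log-1.txt`, matching the storage layout shown later in this document.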
Failure posture
- Per-attachment failures (download error, store error) are logged and skipped; other attachments on the same message still proceed.
- If every attachment fails, the caption is still preserved in the local transcript via the synchronous echo above, but no coordinator prompt is synthesized (so the coordinator never sees an empty `[Operator shared attachment: …]` block).
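The skip-and-continue behaviour can be sketched as follows. This is a minimal illustration under stated assumptions, not the body of `handleInboundAttachments`: `SavedAttachment`, `saveEach`, and `buildCoordinatorPrompt` are hypothetical names, and the exact line breaks inside the synthesized prompt are a guess based on the example earlier in this document.

```typescript
// Hypothetical shape of one stored file; the real record may carry more fields.
interface SavedAttachment {
  path: string;
  bytes: number;
  mimeType: string;
}

// Save each attachment independently: a failure is logged and skipped,
// so the remaining attachments on the same message still proceed.
async function saveEach<T>(
  incoming: readonly T[],
  save: (item: T) => Promise<SavedAttachment>,
  log: (message: string) => void,
): Promise<SavedAttachment[]> {
  const saved: SavedAttachment[] = [];
  for (const item of incoming) {
    try {
      saved.push(await save(item));
    } catch (err) {
      log(`attachment skipped: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  return saved;
}

// Returns null when nothing was saved, so no empty block reaches the coordinator.
function buildCoordinatorPrompt(
  caption: string | undefined,
  saved: readonly SavedAttachment[],
): string | null {
  if (saved.length === 0) return null;
  const bullets = saved.map((f) => `- ${f.path} (${f.bytes} bytes, ${f.mimeType})`);
  const block = `[Operator shared attachment:\n${bullets.join("\n")}]`;
  return caption ? `${caption}\n${block}` : block;
}
```

Returning `null` instead of an empty string keeps the "nothing to dispatch" case explicit for the caller, which can then skip the coordinator prompt entirely while the caption echo has already happened.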
Outbound — delivering files from the fleet
The coordinator has a send_attachment tool (src/coordinator/tools/sendAttachment.ts) that pushes a file to every paired chat. Workers can produce files (logs, screenshots, generated artifacts) but do not auto-deliver: the coordinator decides whether to forward each one.
Tool surface
{
"name": "send_attachment",
"parameters": {
"path": "string (absolute or cwd-relative)",
"caption": "string (optional)",
"kind": "auto | document | photo | audio | video | voice (optional, default 'auto')"
}
}

- `path` must point at an existing readable regular file.
- `caption` is rendered alongside the file in Telegram (subject to Telegram's caption length cap).
- `kind=auto` picks the Telegram method by extension:
  - `.jpg`, `.jpeg`, `.png`, `.gif`, `.webp` → photo
  - `.mp3`, `.wav`, `.ogg`, `.m4a`, `.flac` → audio
  - `.mp4`, `.mov`, `.mkv`, `.webm`, `.avi` → video
  - anything else → document

Pass an explicit `kind` to override (e.g. `voice` for an OGG/Opus voice note rendered with a waveform).
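The extension mapping for `kind=auto` is small enough to sketch as a lookup table. The names `AttachmentKind` and `resolveKind` are hypothetical, and treating extensions case-insensitively is an assumption; the source does not say how `.PNG` is handled.

```typescript
type AttachmentKind = "document" | "photo" | "audio" | "video" | "voice";

// Extension table copied from the list above.
const KIND_BY_EXT = new Map<string, AttachmentKind>([
  [".jpg", "photo"], [".jpeg", "photo"], [".png", "photo"], [".gif", "photo"], [".webp", "photo"],
  [".mp3", "audio"], [".wav", "audio"], [".ogg", "audio"], [".m4a", "audio"], [".flac", "audio"],
  [".mp4", "video"], [".mov", "video"], [".mkv", "video"], [".webm", "video"], [".avi", "video"],
]);

// An explicit kind always wins; "auto" falls back to the extension table,
// and unknown or missing extensions become "document".
function resolveKind(path: string, kind: AttachmentKind | "auto" = "auto"): AttachmentKind {
  if (kind !== "auto") return kind;
  const base = path.slice(path.lastIndexOf("/") + 1);
  const dot = base.lastIndexOf(".");
  const ext = dot >= 0 ? base.slice(dot).toLowerCase() : ""; // case-insensitivity is assumed
  return KIND_BY_EXT.get(ext) ?? "document";
}
```

Note that `voice` never appears in the table: an `.ogg` file resolves to `audio` under `auto`, so a voice note with a waveform has to be requested explicitly, as in `resolveKind("memo.ogg", "voice")`.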
Result shape
The tool returns a structured result rather than throwing:
{ "ok": true, "kind": "photo", "path": "/abs/path", "sentCount": 2 }
{ "ok": false, "error": "Cannot read file at /abs/path: ENOENT" }
{ "ok": false, "error": "No attachment channel is attached. …" }

So an outage on one paired chat does not stall the tool call, and a mistyped path surfaces to the model as a recoverable error.
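On the consuming side, the three result examples fit a discriminated union on `ok`. The type below is inferred from those examples only, so treat it as an assumption rather than the exported type in `sendAttachment.ts`; `SendAttachmentResult` and `describeResult` are hypothetical names.

```typescript
// Inferred from the example results; the real type may carry additional fields.
type SendAttachmentResult =
  | { ok: true; kind: string; path: string; sentCount: number }
  | { ok: false; error: string };

// Checking `ok` narrows the union, so each branch only sees its own fields.
function describeResult(result: SendAttachmentResult): string {
  return result.ok
    ? `sent ${result.path} as ${result.kind} to ${result.sentCount} chat(s)`
    : `send_attachment failed: ${result.error}`;
}
```

Because failures arrive as data, a caller can branch on `ok` and retry with a corrected path without wrapping the call in `try`/`catch`.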
Current wiring status
send_attachment is registered unconditionally in toolHandlers.ts, but the underlying AttachmentSender implementation on TelegramChannel lands in a follow-up PR. Until then the tool reports { ok: false, error: "No attachment channel is attached. …" } in production. The interface contract is stable; once the channel wiring ships no further coordinator-side changes are required.
Worked example
Sending a screenshot for review, getting back an annotated copy:
You (Telegram, attaches screenshot.png with caption):
this dialog renders wrong on the small viewport — what's broken?
Fleet (local CLI transcript echoes):
[via Telegram] this dialog renders wrong on the small viewport — what's broken?
[via Telegram] 📎 attachment: screenshot.png → /home/me/.fleet/attachments/sess-abc/screenshot.png
Coordinator (Telegram reply, after viewing the file):
The flex container is using `align-items: flex-end` which collapses the
modal under 480px. Patching `src/ui/Dialog.tsx`…
Coordinator (Telegram, later, via send_attachment):
↳ patched-dialog.png «here's the fix rendered at 375×667»

Storage layout
~/.fleet/attachments/
<sessionId>/ ← directory mode 0o700
screenshot.png ← file mode 0o600
voice-AgADBQADxxx.ogg
log.txt
log-1.txt ← collision suffix

- `~/.fleet/attachments/` is the root passed to `AttachmentStore` (`src/cli/telegramSetup.ts:368`).
- The sub-directory name comes from the active `sessionId` (or `no-session` when one hasn't been minted yet).
- Files are written via `writeStream → rename` of a `.partial` sidecar so partial downloads never appear as completed files.
- The cloud Telegram Bot API limits bot downloads (`getFile`) to 20 MB and bot uploads to 50 MB; agents-fleet does not impose an additional size cap on inbound files.
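The write-then-rename pattern and the permission modes above can be sketched with Node's standard library. This is an illustration only: `saveAtomically` and `demo` are hypothetical names, the `.partial` suffix placement and the cleanup on failure are assumptions, and the real `AttachmentStore` may be structured differently.

```typescript
import { createWriteStream } from "node:fs";
import { mkdir, mkdtemp, readFile, rename, rm } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { Readable } from "node:stream";
import { pipeline } from "node:stream/promises";

// Stream into "<name>.partial", then rename, so a reader never sees a half-written file.
async function saveAtomically(
  root: string,
  sessionId: string | undefined,
  name: string,
  data: Readable,
): Promise<string> {
  const dir = join(root, sessionId ?? "no-session");
  await mkdir(dir, { recursive: true, mode: 0o700 }); // owner-only directory
  const finalPath = join(dir, name);
  const partialPath = `${finalPath}.partial`;
  try {
    await pipeline(data, createWriteStream(partialPath, { mode: 0o600 })); // owner-only file
    await rename(partialPath, finalPath); // rename stays on one filesystem, so it is atomic on POSIX
  } catch (err) {
    await rm(partialPath, { force: true }); // assumption: failed downloads are cleaned up
    throw err;
  }
  return finalPath;
}

// Usage against a throwaway directory instead of ~/.fleet/attachments/.
async function demo(): Promise<string> {
  const root = await mkdtemp(join(tmpdir(), "attachments-"));
  const saved = await saveAtomically(root, undefined, "log.txt", Readable.from(["hello\n"]));
  return readFile(saved, "utf8");
}
```

Keeping the sidecar in the same directory as the final file matters: a rename across filesystems degrades to copy-and-delete and loses the atomicity the pattern relies on.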
Troubleshooting
| Symptom | Likely cause | Fix |
|---|---|---|
| Caption echoed locally but no `📎 attachment:` line | Download failed (network blip, Telegram quota) | Re-send the file; check `AGENTS_FLEET_TG_DEBUG=1` stderr for `downloadFile failed` |
| Coordinator never reacts to the attachment | Per-attachment store error after download; caption alone reaches the transcript | Check stderr for `saveFromStream failed`; verify `~/.fleet/attachments/` is writable |
| `send_attachment` returns "No attachment channel is attached" | The `AttachmentSender` wiring on `TelegramChannel` hasn't shipped in your version, or `--telegram` was never started | Update agents-fleet; confirm at least one chat is paired before calling the tool |
| `send_attachment` returns "Cannot read file at …" | Path doesn't exist or is not readable by the fleet process | Pass an absolute path that the coordinator process can stat; cwd-relative paths resolve against the fleet's launch directory |
| Stored filename looks like `AgADBQADxxx.jpg` instead of the original | Telegram did not include `fileName` (always the case for photos / voice notes) | Cosmetic; the file is intact. Rename it if you need a friendlier name |
See also
- `docs/telegram-remote-steering.md`: operator quick start, command matrix, recovery.
- `docs/telegram-threat-model.md`: file handling considerations (sanitization, no AV scanning, no size cap).
- `docs/telegram-architecture.md`: internal flow (`AttachmentStore`, inbound dispatch pipeline, outbound `AttachmentSender` interface).
- `src/bot/attachmentStore.ts`: storage and sanitization implementation.
- `src/cli/telegramInboundDispatch.ts`: `handleInboundAttachments` (download → store → coordinator prompt).
- `src/coordinator/tools/sendAttachment.ts`: `send_attachment` tool and `AttachmentSender` contract.