IM Agent Interaction Model
Summary
AgentInbox should model IM integrations as three separate capabilities:
- durable reminder subscriptions
- on-demand context fetching
- feedback and reply operations
For IM providers such as Feishu/Lark, the provider usually exposes an app-level
message event stream rather than a separate provider-side subscription for each
chat or mention target. AgentInbox should keep that provider stream shared,
then represent chat-level and mention-level interest as agent-specific
subscriptions.
When an agent receives an IM reminder, the inbox item should include enough
stable anchors to fetch more context later, but it should not eagerly store the
whole chat window. Context fetching should be an explicit source-specific read
operation routed through AgentInbox and backed by UXC provider credentials.
The first implementation target is Feishu/Lark.
Problem
The desired user experience is:
- an agent follows a group mention
- someone mentions the agent in a group
- the agent receives a concise reminder
- the agent asks for message context before responding
- the agent replies in the right place
This does not fit cleanly into a single "message received" event.
If the inbox item stores only the mention event, the agent lacks enough context to answer well. If the inbox item eagerly stores a large chat window, the source will do unnecessary provider reads, create large inbox entries, and make the subscription path depend on runtime-specific reasoning needs.
There is also a product-boundary question. Agents can call provider CLIs such as
lark-cli, but making that the normal path leaks provider tooling, credentials,
and output shape into every agent runtime.
Goals
- make IM reminders first-class without turning each chat into a private source
- keep provider credentials and API access behind UXC-backed source modules
- let agents fetch context only when they need it
- expose stable, structured context data instead of provider CLI output
- preserve delivery and feedback as DeliveryHandle plus source-specific operations
- make Feishu/Lark the first case without hardcoding Feishu/Lark semantics into the core
- keep lark-cli useful for development, debugging, and smoke testing, not as the required runtime contract
Non-Goals
- do not make AgentInbox an IM client
- do not build a general chat UI or conversation memory layer
- do not copy all provider messages into AgentInbox by default
- do not require every IM provider to expose identical event or context APIs
- do not let agents directly manage provider credentials
- do not make lark-cli the product contract for agents
Terms
IM Host
A provider account or app identity that can receive and send IM events.
For Feishu/Lark this is the app/bot identity configured in UXC.
IM Message Stream
A shared source stream of provider message events.
For Feishu/Lark this maps to app-level im.message.receive_v1 event
consumption. It is not scoped to one chat.
Reminder Subscription
An agent-specific interest over a shared IM message stream.
Examples:
- messages in chat oc_xxx
- messages in chat oc_xxx that mention open id ou_xxx
- direct messages to the bot
Context Anchor
Stable metadata on an inbox item that lets a source module fetch context later.
Examples:
- chatId
- messageId
- messageType
- senderOpenId
- occurredAt
- threadId
- parentId
- rootId
Context Operation
A source-specific read operation that uses a context anchor to fetch nearby messages, thread messages, or message details.
Feedback Operation
A source-specific write operation that sends a reply, reaction, acknowledgement, or follow-up message through the provider.
Conceptual Model
IM integrations should split the interaction into three planes.
1. Reminder Plane
The reminder plane decides which provider events become agent-visible inbox items.
This is normal AgentInbox source and subscription behavior:
- a shared source receives provider events
- source mapping extracts normalized metadata
- subscriptions match on metadata and payload
- matching items are delivered to the agent inbox
- activations notify the runtime
The reminder plane should stay cheap and deterministic. It should not fetch a large context window on every event.
2. Context Plane
The context plane answers: "What does the agent need to know before responding?"
It should be explicit and on demand. A runtime or operator should be able to ask for context using any of:
- an inbox item id
- a delivery/context handle
- explicit provider anchors such as chatId and messageId
The source module should decide the provider-specific retrieval strategy.
For an IM provider, the module may:
- fetch the anchor message
- fetch a thread if the message is in a thread
- fetch a bounded chat window around the anchor message
- fetch sender or participant display data when useful
The return shape should be stable AgentInbox JSON, not raw CLI text.
3. Feedback Plane
The feedback plane sends provider-visible output.
This should reuse the existing delivery direction:
- inbox items expose or preserve a DeliveryHandle
- source modules expose operation descriptors for available actions
- callers execute handle + operation + input
For IM providers, common operations include:
- reply to the anchor message
- reply in thread
- send a chat message
- add a reaction
- mark or acknowledge if the provider supports it
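The handle + operation + input call shape can be sketched as follows. This is a minimal illustration, not the existing delivery implementation: the operation names (reply_message, send_chat_message), the surfaces field, and the stubbed run functions are all assumptions introduced here.

```typescript
// Hypothetical shapes; only the role of DeliveryHandle comes from this document.
interface DeliveryHandle {
  provider: string;
  surface: string;
  targetRef: string;
}

interface DeliveryOperation {
  name: string;
  // Surfaces this operation can act on (illustrative field).
  surfaces: string[];
  // Stub: returns a description instead of calling a provider.
  run: (handle: DeliveryHandle, input: { text: string }) => string;
}

const deliveryOperations: DeliveryOperation[] = [
  {
    name: "reply_message",
    surfaces: ["message_reply"],
    run: (handle, input) => `reply to ${handle.targetRef}: ${input.text}`,
  },
  {
    name: "send_chat_message",
    surfaces: ["chat_message"],
    run: (handle, input) => `send to ${handle.targetRef}: ${input.text}`,
  },
];

function executeOperation(
  handle: DeliveryHandle,
  operation: string,
  input: { text: string },
): string {
  const op = deliveryOperations.find((candidate) => candidate.name === operation);
  if (op === undefined) {
    throw new Error(`unknown operation: ${operation}`);
  }
  // Reject operations that the descriptor does not advertise for this surface.
  if (!op.surfaces.includes(handle.surface)) {
    throw new Error(`${operation} is not available for surface ${handle.surface}`);
  }
  return op.run(handle, input);
}

const replyResult = executeOperation(
  { provider: "feishu", surface: "message_reply", targetRef: "om_xxx" },
  "reply_message",
  { text: "On it." },
);
```

Checking the descriptor before dispatch keeps callers from guessing which actions a given handle supports.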
Reminder Subscriptions
Source modules should expose follow templates for common IM intents.
Illustrative examples:
agentinbox follow im chat --arg chatId=oc_xxx
agentinbox follow im mention --arg chatId=oc_xxx --arg openId=ou_agent
agentinbox follow feishu mention --arg chatId=oc_xxx --arg openId=ou_agent
agentinbox follow feishu mentions
The compiled subscription should use normal filter semantics.
Example mention filter:
{
"metadata": {
"chatId": "oc_xxx"
},
"expr": "contains(metadata.mentionOpenIds, \"ou_agent\")"
}
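To make the filter semantics concrete, here is a small sketch of how such a filter could be evaluated against normalized metadata. It does not reproduce the AgentInbox filter engine: the expr string is replaced by a hypothetical mentionOpenId field, and the ImMetadata and MentionFilter shapes are assumptions.

```typescript
// Hypothetical shapes. Field names follow the example filter, but the
// evaluator is an illustration and not the AgentInbox filter engine.
interface ImMetadata {
  chatId: string;
  mentionOpenIds: string[];
}

interface MentionFilter {
  metadata?: { chatId?: string };
  // Stand-in for the compiled contains(metadata.mentionOpenIds, ...) expression.
  mentionOpenId?: string;
}

function matchesFilter(meta: ImMetadata, filter: MentionFilter): boolean {
  // Exact-match metadata constraints come first: cheap and deterministic.
  const wantedChatId = filter.metadata?.chatId;
  if (wantedChatId !== undefined && wantedChatId !== meta.chatId) {
    return false;
  }
  // Then the contains() predicate over the mention list.
  if (
    filter.mentionOpenId !== undefined &&
    !meta.mentionOpenIds.includes(filter.mentionOpenId)
  ) {
    return false;
  }
  return true;
}

const mentionFilter: MentionFilter = {
  metadata: { chatId: "oc_xxx" },
  mentionOpenId: "ou_agent",
};
const sameChatHit = matchesFilter(
  { chatId: "oc_xxx", mentionOpenIds: ["ou_agent"] },
  mentionFilter,
);
const otherChatMiss = matchesFilter(
  { chatId: "oc_other", mentionOpenIds: ["ou_agent"] },
  mentionFilter,
);
```

Both checks read only metadata already present on the event, which is what keeps the reminder plane free of provider reads.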
For IM-native onboarding, a provider can also expose a broader mention intent
that does not require the user or agent to know a group chatId up front.
For Feishu/Lark, feishu.mentions follows all chats visible to the app event
stream and filters only by mentionOpenIds. If the caller does not pass an
explicit openId, the Feishu/Lark module should resolve the configured app
bot's open_id through the provider credential before compiling the filter:
{
"expr": "contains(metadata.mentionOpenIds, \"ou_agent\")"
}
This does not mean every chat in the tenant. It means every chat whose message events are delivered to the configured Feishu/Lark app and UXC-managed source.
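A sketch of how the two mention templates could compile to filters. It assumes the module has already resolved the app bot's open_id through the provider credential and passes it in as botOpenId; the function name and argument shape are hypothetical.

```typescript
interface CompiledFilter {
  metadata?: { chatId: string };
  expr: string;
}

// Hypothetical compiler for the mention templates. botOpenId is assumed to be
// the already-resolved open_id of the configured app bot.
function compileMentionFilter(
  args: { chatId?: string; openId?: string },
  botOpenId: string,
): CompiledFilter {
  // An explicit openId wins; otherwise fall back to the bot identity.
  const openId = args.openId ?? botOpenId;
  // JSON.stringify quotes and escapes the id for use inside the expression.
  const expr = `contains(metadata.mentionOpenIds, ${JSON.stringify(openId)})`;
  // Only the chat-scoped template adds a metadata constraint.
  return args.chatId !== undefined
    ? { metadata: { chatId: args.chatId }, expr }
    : { expr };
}

const broadFilter = compileMentionFilter({}, "ou_agent");
const scopedFilter = compileMentionFilter(
  { chatId: "oc_xxx", openId: "ou_other" },
  "ou_agent",
);
```

Resolving the bot identity before compilation means the stored filter stays a plain expression with no provider lookups at match time.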
The exact metadata fields are source-specific, but IM source modules should prefer a common vocabulary where provider data allows it:
- provider
- chatId
- chatType
- messageId
- messageType
- senderId
- senderOpenId
- mentions
- mentionOpenIds
- content
- threadId
- parentId
- rootId
Context Handles
An IM inbox item should preserve enough information to fetch context later.
The minimal handle shape should be source-specific but predictable:
{
"provider": "feishu",
"surface": "message_context",
"targetRef": "om_xxx",
"threadRef": "omt_xxx",
"chatId": "oc_xxx",
"occurredAt": "2026-05-18T06:00:00.000Z"
}
The existing DeliveryHandle is write-oriented. AgentInbox may either add a
separate contextHandle field or derive a context handle from existing item
metadata and raw payload. The important contract is that the source module owns
provider-specific context retrieval.
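If the handle is derived rather than stored, the derivation can stay a small pure function over item metadata. The sketch below assumes a hypothetical InboxItemLike shape whose metadata uses the common vocabulary; it is one possible reading, not a settled design.

```typescript
interface ContextHandle {
  provider: string;
  surface: "message_context";
  targetRef: string;
  threadRef?: string;
  chatId: string;
  occurredAt: string;
}

// Hypothetical item shape: only the metadata fields needed for the handle.
interface InboxItemLike {
  metadata: {
    provider: string;
    messageId: string;
    chatId: string;
    threadId?: string;
    occurredAt: string;
  };
}

function deriveContextHandle(item: InboxItemLike): ContextHandle {
  const { provider, messageId, chatId, threadId, occurredAt } = item.metadata;
  return {
    provider,
    surface: "message_context",
    targetRef: messageId,
    // threadRef is emitted only when the message belongs to a thread.
    ...(threadId !== undefined ? { threadRef: threadId } : {}),
    chatId,
    occurredAt,
  };
}

const derivedHandle = deriveContextHandle({
  metadata: {
    provider: "feishu",
    messageId: "om_xxx",
    chatId: "oc_xxx",
    occurredAt: "2026-05-18T06:00:00.000Z",
  },
});
```

Deriving at read time avoids a schema change on inbox items, at the cost of depending on metadata staying complete.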
Context Operations
Source modules should expose read operations separately from delivery operations.
A minimal IM context operation can be:
{
"name": "get_message_context",
"inputSchema": {
"type": "object",
"required": ["messageId"],
"properties": {
"messageId": { "type": "string" },
"chatId": { "type": "string" },
"threadId": { "type": "string" },
"windowBefore": { "type": "number" },
"windowAfter": { "type": "number" }
}
}
}
The output should separate the anchor from retrieved context:
{
"anchorMessage": {
"messageId": "om_xxx",
"chatId": "oc_xxx",
"senderId": "ou_xxx",
"content": "Can you check this?",
"createdAt": "2026-05-18T06:00:00.000Z"
},
"threadMessages": [],
"chatWindowMessages": [],
"deliveryHandle": {
"provider": "feishu",
"surface": "message_reply",
"targetRef": "om_xxx"
}
}
Provider-specific raw responses may be included under a debug or raw field, but the normal agent path should consume the normalized shape.
Proposed UX
Low-level operation:
agentinbox source invoke src_xxx --operation get_message_context \
--input-json '{"messageId":"om_xxx","chatId":"oc_xxx","windowBefore":20,"windowAfter":5}'
Inbox-item oriented convenience:
agentinbox inbox context itm_xxx --before 20 --after 5
The inbox-oriented command should:
- read the item
- resolve the source and context anchor
- call the source context operation
- return normalized context JSON
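One way to read these steps is that the convenience command desugars into the low-level invoke call. The sketch below builds that request from an inbox item; the InboxItem and SourceInvokeRequest shapes are hypothetical, and the actual provider call is left out.

```typescript
// Hypothetical item and request shapes for illustration.
interface InboxItem {
  id: string;
  sourceId: string;
  metadata: { chatId: string; messageId: string; threadId?: string };
}

interface SourceInvokeRequest {
  sourceId: string;
  operation: "get_message_context";
  input: {
    messageId: string;
    chatId: string;
    threadId?: string;
    windowBefore: number;
    windowAfter: number;
  };
}

function buildContextRequest(
  item: InboxItem,
  before: number,
  after: number,
): SourceInvokeRequest {
  // Resolve the context anchor from item metadata.
  const { chatId, messageId, threadId } = item.metadata;
  return {
    sourceId: item.sourceId,
    operation: "get_message_context",
    input: {
      messageId,
      chatId,
      ...(threadId !== undefined ? { threadId } : {}),
      windowBefore: before,
      windowAfter: after,
    },
  };
}

const sampleItem: InboxItem = {
  id: "itm_xxx",
  sourceId: "src_xxx",
  metadata: { chatId: "oc_xxx", messageId: "om_xxx" },
};
const contextRequest = buildContextRequest(sampleItem, 20, 5);
```

For the sample item, the resulting input carries the same fields as the --input-json payload in the low-level example.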
This keeps agents from needing provider-specific CLI knowledge for the common path.
Feishu/Lark Case
Feishu/Lark should be the first implementation.
Runtime Boundary
Credentials should live in UXC auth profiles. AgentInbox should store the
UXC auth reference, source routing config, and subscription filters.
lark-cli is useful for:
- API exploration
- manual smoke testing
- comparing response shapes
- temporary operational fallback
It should not be the normal agent-facing runtime contract.
Event Source
The shared source should consume the app-level message event stream.
For Feishu/Lark, lark-cli event schema im.message.receive_v1 shows that the
event exposes useful anchors such as:
- chat_id
- chat_type
- message_id
- message_type
- sender_id
- content
- create_time
The provider event does not expose a chat-scoped event subscription parameter,
so "follow this chat" and "follow mentions in this chat" belong in
AgentInbox subscription filters.
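A sketch of the source mapping step that turns a provider event into normalized metadata for those filters. The FeishuReceiveEvent interface is an assumed subset of the event payload: the nesting, the optional fields, and the millisecond-string encoding of create_time should be checked against the provider schema before relying on them.

```typescript
// Assumed subset of the im.message.receive_v1 payload. The nesting and the
// optional fields are assumptions, not a verified schema.
interface FeishuReceiveEvent {
  sender: { sender_id: { open_id?: string } };
  message: {
    message_id: string;
    chat_id: string;
    chat_type: string;
    message_type: string;
    content: string;
    create_time: string; // assumed: epoch milliseconds encoded as a string
    mentions?: { id: { open_id?: string } }[];
  };
}

interface FeishuImMetadata {
  provider: "feishu";
  chatId: string;
  chatType: string;
  messageId: string;
  messageType: string;
  senderOpenId: string | null;
  mentionOpenIds: string[];
  occurredAt: string;
}

function toImMetadata(ev: FeishuReceiveEvent): FeishuImMetadata {
  const msg = ev.message;
  // Keep only mentions that carry an open_id, so filters can match on them.
  const mentionOpenIds = (msg.mentions ?? [])
    .map((mention) => mention.id.open_id)
    .filter((id): id is string => typeof id === "string");
  return {
    provider: "feishu",
    chatId: msg.chat_id,
    chatType: msg.chat_type,
    messageId: msg.message_id,
    messageType: msg.message_type,
    senderOpenId: ev.sender.sender_id.open_id ?? null,
    mentionOpenIds,
    occurredAt: new Date(Number(msg.create_time)).toISOString(),
  };
}
```

With metadata in this shape, a chat-level or mention-level subscription needs no provider-side scoping.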
Context Retrieval
The Feishu/Lark source module should implement get_message_context.
The retrieval strategy should be:
- fetch or hydrate the anchor message by messageId
- if the message has a thread or topic id, fetch thread messages
- otherwise fetch a bounded chat window around the message creation time
- normalize messages into a stable message list
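The strategy above can be sketched against an injected provider client. ImContextClient and its three methods are hypothetical stand-ins for UXC-backed calls, and the default window sizes (20 before, 5 after) are illustrative values borrowed from the CLI example rather than specified defaults.

```typescript
interface ImMessage {
  messageId: string;
  chatId: string;
  senderId: string;
  content: string;
  createdAt: string;
  threadId?: string;
}

// Hypothetical provider client; a real module would back these with UXC calls.
interface ImContextClient {
  getMessage(messageId: string): Promise<ImMessage>;
  listThreadMessages(threadId: string): Promise<ImMessage[]>;
  listChatMessagesAround(
    chatId: string,
    createdAt: string,
    before: number,
    after: number,
  ): Promise<ImMessage[]>;
}

interface MessageContext {
  anchorMessage: ImMessage;
  threadMessages: ImMessage[];
  chatWindowMessages: ImMessage[];
  deliveryHandle: { provider: string; surface: string; targetRef: string };
}

async function getMessageContext(
  client: ImContextClient,
  input: { messageId: string; windowBefore?: number; windowAfter?: number },
): Promise<MessageContext> {
  // Step 1: fetch or hydrate the anchor message.
  const anchor = await client.getMessage(input.messageId);
  let threadMessages: ImMessage[] = [];
  let chatWindowMessages: ImMessage[] = [];
  if (anchor.threadId !== undefined) {
    // Step 2: threaded messages get their thread as context.
    threadMessages = await client.listThreadMessages(anchor.threadId);
  } else {
    // Step 3: otherwise fall back to a bounded window around the anchor.
    chatWindowMessages = await client.listChatMessagesAround(
      anchor.chatId,
      anchor.createdAt,
      input.windowBefore ?? 20, // illustrative default
      input.windowAfter ?? 5, // illustrative default
    );
  }
  // Step 4: return the normalized shape, including a reply handle.
  return {
    anchorMessage: anchor,
    threadMessages,
    chatWindowMessages,
    deliveryHandle: {
      provider: "feishu",
      surface: "message_reply",
      targetRef: anchor.messageId,
    },
  };
}
```

Injecting the client keeps the strategy testable with a stub and keeps credentials out of the function.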
Useful provider capabilities observed in lark-cli:
lark-cli im +messages-mget --message-ids om_xxx --as bot
lark-cli im +threads-messages-list --thread om_xxx --as bot
lark-cli im +chat-messages-list --chat-id oc_xxx --start ... --end ... --sort asc --as bot
The production implementation should call the same Lark APIs through UXC, using the source's UXC auth profile.
Feedback
The existing Feishu delivery path already supports:
- reply to message
- send chat message
The Feishu module should continue to expose those as delivery operations.
Future feedback operations may include:
- reply with rich text
- reply in thread
- send interactive card
- add reaction
Core Changes
The core does not need to understand IM semantics.
It should only need generic extension points:
- source modules can describe read/context operations
- source modules can invoke read/context operations
- inbox items can preserve or derive context anchors
- CLI and HTTP can route context operation calls
This parallels the existing delivery operation model without treating reads as deliveries.
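A minimal sketch of what these extension points might look like. SourceModule, ReadOperationDescriptor, and SourceRegistry are hypothetical names introduced here to show that the core can route reads by source id and operation name without inspecting IM fields.

```typescript
// Hypothetical extension-point shapes; names are illustrative.
interface ReadOperationDescriptor {
  name: string;
  inputSchema: Record<string, unknown>;
}

interface SourceModule {
  listReadOperations(): ReadOperationDescriptor[];
  invokeReadOperation(name: string, input: unknown): Promise<unknown>;
}

class SourceRegistry {
  private readonly modules = new Map<string, SourceModule>();

  register(sourceId: string, mod: SourceModule): void {
    this.modules.set(sourceId, mod);
  }

  // The core routes by source id and operation name only.
  async invokeRead(
    sourceId: string,
    operation: string,
    input: unknown,
  ): Promise<unknown> {
    const mod = this.modules.get(sourceId);
    if (mod === undefined) {
      throw new Error(`unknown source: ${sourceId}`);
    }
    // Only operations the module has described can be invoked.
    const known = mod.listReadOperations().some((op) => op.name === operation);
    if (!known) {
      throw new Error(`source ${sourceId} has no read operation ${operation}`);
    }
    return mod.invokeReadOperation(operation, input);
  }
}
```

CLI and HTTP entry points could both call invokeRead, so the routing rule lives in one place.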
Implementation Plan
- Define source-module hooks for read/context operation discovery and invocation.
- Add a Feishu/Lark get_message_context operation.
- Add an inbox-item oriented context command.
- Add Feishu/Lark follow templates for chat and mention.
- Add tests for filter expansion, context operation invocation, and normalized output shape.
- Add a Feishu/Lark user guide after the developer boundary is settled.
Open Questions
- Should contextHandle be stored on inbox items, or derived from metadata/raw payload at read time?
- Should source read operations share the same descriptor schema as delivery operations with a different operation category?
- How should context reads be logged for audit and rate-limit visibility?
- Should context fetch results be cached, and if so, should the cache live in AgentInbox or in the source runtime layer?
- What is the minimum normalized IM message schema that works across Feishu, Slack, Discord, and Telegram without overfitting?
Decision
Adopt the following product boundary:
- IM event streams are shared sources.
- Chat and mention tracking are agent-specific reminder subscriptions.
- Message context is fetched on demand through source-specific read operations.
- Replies and other provider-visible actions use delivery operations.
- Provider CLIs such as lark-cli are development and fallback tools, not the normal agent product contract.