Slate is the state and runtime context management layer that sits between your AI agents and storage. It provides a memory architecture inspired by human cognition.
Slate provides four memory types:
| Memory Type | Internal Module | Purpose |
|---|---|---|
| Working Memory | Flux | Context buffer with time-decay attention |
| Episodic Memory | Echoes | Long-term interaction history with vector search |
| Semantic Memory | Nexus | Knowledge graph and fact storage |
| Procedural Memory | Reflex | WASM runtime for executing agent skills |
```bash
# Node.js
npm install git+https://github.com/rice-ai-hq/slate.git#subdirectory=clients/node

# Python
pip install git+https://github.com/rice-ai-hq/slate.git#subdirectory=clients/python
```
Both clients communicate with Slate through a `CortexClient`. Construct one with your server address and, optionally, an auth token.
```python
from slate_client import CortexClient

client = CortexClient(
    address="grpc.your-instance-id.slate.tryrice.com:80",
    token="your-auth-token",
    run_id="my-session"  # Optional, defaults to "default"
)
```
```typescript
import { CortexClient } from "slate-client";

const client = new CortexClient(
  "grpc.your-instance-id.slate.tryrice.com:80",
  "your-auth-token",
  "my-session" // Optional, defaults to "default"
);
```
Working Memory manages the agent's immediate context with dynamic decay and attention.
API:
| Method | Description |
|---|---|
| `focus(content)` | Push information into the agent's attention stream |
| `drift()` | Retrieve current context sorted by relevance |
```python
# Add to working memory
response = client.focus("User is planning a trip to Japan")
print(f"Stored with ID: {response.id}")

client.focus("Budget is around $3000")

# Retrieve context (sorted by relevance)
response = client.drift()
for item in response.items:
    print(f"[{item.relevance:.2f}] {item.content}")
```
```typescript
await client.focus("User is planning a trip to Japan");
await client.focus("Budget is around $3000");

const items = await client.drift();
items.forEach((item) => {
  console.log(`[${item.relevance}] ${item.content}`);
});
```
Episodic Memory stores the history of interactions as traces. Every action an agent takes can be committed for future learning.
Trace Structure:
| Component | Description |
|---|---|
| Input | What triggered the action |
| Outcome | The result of the action |
| Action | What the agent decided to do |
| Reasoning | Why the agent made that decision |
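As a mental model, a trace can be pictured as a plain record with those four components. This is an illustrative sketch only, not Slate's internal representation:

```python
from dataclasses import dataclass

# Illustrative sketch of a trace; field names mirror the table above,
# but this is NOT Slate's internal data model.
@dataclass
class Trace:
    input: str      # what triggered the action
    outcome: str    # the result of the action
    action: str     # what the agent decided to do
    reasoning: str  # why the agent made that decision

trace = Trace(
    input="What should I pack for Japan?",
    outcome="Suggested layers, umbrella, walking shoes",
    action="travel_advice",
    reasoning="Spring weather is variable",
)
```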
API:
| Method | Description |
|---|---|
| `commit(input, outcome, ...)` | Record an experience as a trace |
| `reminisce(query)` | Recall past experiences similar to the query |
```python
# Record an interaction
client.commit(
    "What should I pack for Japan?",              # input
    "Suggested layers, umbrella, walking shoes",  # outcome
    action="travel_advice",
    reasoning="Spring weather is variable"
)

# Recall similar past interactions (limit defaults to 5)
response = client.reminisce("packing for Asia trip", limit=3)
for trace in response.traces:
    print(f"Past: {trace.input} -> {trace.outcome}")
```
```typescript
// Record an interaction
await client.commit(
  "What should I pack for Japan?",
  "Suggested layers, umbrella, walking shoes",
  {
    action: "travel_advice",
    reasoning: "Spring weather is variable"
  }
);

// Recall similar past interactions (limit defaults to 5)
const traces = await client.reminisce("packing for Asia trip", 3);
traces.forEach((trace) => {
  console.log(`Past: ${trace.input} -> ${trace.outcome}`);
});
```
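The vector search behind `reminisce` can be sketched independently of Slate. This is a toy illustration with made-up three-dimensional embeddings, not the Echoes implementation (real systems embed text with a learned model and search an index):

```python
import math

# Toy embeddings; in Slate these would come from the server-side embedding model.
traces = [
    ("What should I pack for Japan?", [0.9, 0.1, 0.0]),
    ("How do I file my taxes?",       [0.0, 0.2, 0.9]),
]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Pretend embedding of the query "packing for Asia trip"
query = [0.8, 0.2, 0.1]

# Rank stored traces by similarity to the query, like reminisce(query, limit=...)
ranked = sorted(traces, key=lambda t: cosine(query, t[1]), reverse=True)
print(ranked[0][0])  # the packing trace ranks first
```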
Procedural Memory executes compiled WebAssembly skills server-side with deterministic, sandboxed execution.
API:
| Method | Description |
|---|---|
| `trigger(skill_name)` | Execute a server-side WASM skill |
```python
result = client.trigger("calculate_tax")
```

```typescript
const result = await client.trigger("calculate_tax");
```
Semantic Memory stores invariant facts and knowledge. It maintains a knowledge graph with relationships between concepts.
Slate supports multi-tenancy through the `run_id` mechanism. Each run is isolated: all operations are scoped to the client's current `run_id`.
```python
# Each agent has its own memory space
agent1 = CortexClient(address="grpc.your-instance-id.slate.tryrice.com:80", run_id="agent-1")
agent2 = CortexClient(address="grpc.your-instance-id.slate.tryrice.com:80", run_id="agent-2")

# Data from agent1 won't appear in agent2's memory
agent1.focus("Secret info for agent 1")
agent2.drift()  # Returns empty
```
```python
# Multiple agents sharing the same memory
researcher = CortexClient(address="grpc.your-instance-id.slate.tryrice.com:80", run_id="research-team")
writer = CortexClient(address="grpc.your-instance-id.slate.tryrice.com:80", run_id="research-team")

# Both can read and write to the same memory space
researcher.focus("Found relevant paper on topic X")
writer.drift()  # Sees the researcher's context
```
```python
# Delete all data for the current run
client.delete_run()
```

```typescript
await client.deleteRun();
```
Every cognitive agent follows this pattern:
1. `reminisce(input)` - Find similar past experiences
2. `drift()` - Get current context
3. `commit(input, outcome, ...)` - Save the result
4. `focus(new_state)` - Update working memory

```python
def agent_loop(user_input):
    # Recall past experiences
    past = client.reminisce(user_input, limit=3).traces

    # Get current context
    context = client.drift().items

    # Generate response with LLM
    response = llm.generate(
        input=user_input,
        context=context,
        examples=past
    )

    # Learn from this interaction
    client.commit(
        user_input,  # input
        response,    # outcome
        action="respond",
        reasoning="Generated based on context and past examples"
    )

    # Update context
    client.focus(f"User asked: {user_input}")
    client.focus(f"Agent responded: {response}")

    return response
```
Slate supports token-based authentication. Get your auth token from the Slate Console.
```python
client = CortexClient(address="grpc.your-instance-id.slate.tryrice.com:80", token="your-token")
```

```typescript
const client = new CortexClient("grpc.your-instance-id.slate.tryrice.com:80", "your-token");
```
Slate supports multiple vector database backends for Episodic and Semantic memory:
| Provider | Configuration |
|---|---|
| Mock (In-Memory) | Default, no config needed |
| Qdrant | Set `QDRANT_URL` |
| Pinecone | Set `PINECONE_API_KEY` and `PINECONE_INDEX_HOST` |
| ChromaDB | Set `CHROMA_URL` |
| RiceDB | Set `RICEDB_URL` |
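For example, a deployment backed by Qdrant would set the corresponding variable in the Slate server's environment. This is a sketch based on the table above; the URL is a placeholder for your own instance:

```shell
# Server-side configuration (placeholder URL for a local Qdrant instance)
export QDRANT_URL="http://localhost:6333"
```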