
Architecture

OpenAssistant v1.0.0 is built on a modular, framework-agnostic architecture, so you can create AI-powered spatial data applications with the AI framework of your choice.

Design Principles

1. Framework Agnostic

Unlike v0.x, OpenAssistant v1.0.0 doesn't depend on any specific AI framework. Each tool can be converted to work with different frameworks through adapter methods:

typescript
// Works with Vercel AI SDK
const vercelTool = tool.toVercelAiTool();

// Works with LangChain
const langchainTool = tool.toLangChainTool();

2. Browser-First

All core functionality runs in the browser using:

  • WebAssembly: DuckDB, GeoDA, and other compute-intensive operations
  • Web Workers: Background processing for non-blocking operations
  • IndexedDB: Client-side data persistence
  • Modern Web APIs: Geolocation, Canvas, WebGL for visualizations

This approach provides:

  • Privacy: Data never leaves the user's device
  • Performance: Reduced network latency
  • Cost Efficiency: No server infrastructure required for compute

3. Modular Packages

Each package is independent and can be installed separately:

@openassistant/
├── utils/              # Core utilities
├── tools/
│   ├── duckdb/        # SQL query tools
│   ├── geoda/         # Spatial statistics
│   ├── map/           # Map manipulation
│   ├── osm/           # OpenStreetMap
│   ├── places/        # Location services
│   ├── plots/         # Visualization tools
│   └── h3/            # H3 spatial indexing
└── components/
    ├── assistant/     # Assistant component
    ├── echarts/       # ECharts components
    ├── keplergl/      # Kepler.gl components
    ├── leaflet/       # Leaflet components
    ├── vegalite/      # Vega-Lite components
    ├── tables/        # Table components
    ├── hooks/         # React hooks
    └── common/        # Shared UI utilities

Core Concepts

OpenAssistantTool Type

All tools conform to the OpenAssistantTool type (defined in @openassistant/utils):

typescript
export type OpenAssistantTool<
  TArgs extends ZodType = ZodType<unknown>,
  TLlmResult = unknown,
  TAdditionalData = unknown,
  TContext = never,
> = {
  name: string;
  description: string;
  parameters: TArgs; // Zod schema for parameter validation
  context?: TContext;
  component?: unknown;
  onToolCompleted?: OpenAssistantOnToolCompleted;
  execute: OpenAssistantExecuteFunction<
    z.infer<TArgs>, 
    TLlmResult, 
    TAdditionalData, 
    TContext
  >;
};

Type Parameters:

  • TArgs: Zod schema type for tool parameters
  • TLlmResult: Type of result returned to the LLM
  • TAdditionalData: Type of additional data for UI components
  • TContext: Type of context object for accessing application data
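A minimal sketch of how the four type parameters line up in practice. The tool name, context shape, and logic here are illustrative, and the zod schema is replaced by a plain argument type for brevity:

```typescript
type Args = { dataset: string; variable: string };                         // z.infer<TArgs>
type LlmResult = { mean: number };                                         // TLlmResult
type AdditionalData = { values: number[] };                                // TAdditionalData
type Context = { getValues: (d: string, v: string) => Promise<number[]> }; // TContext

const meanTool = {
  name: 'mean',
  description: 'Compute the mean of a variable in a dataset',
  execute: async (
    args: Args,
    options: { context: Context },
  ): Promise<{ llmResult: LlmResult; additionalData: AdditionalData }> => {
    const values = await options.context.getValues(args.dataset, args.variable);
    const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
    // Compact summary goes to the LLM; the raw values go to UI components.
    return { llmResult: { mean }, additionalData: { values } };
  },
};
```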

Tool Context

Tools often need access to application-specific data or services. This is provided through the context property:

typescript
import { localQuery } from '@openassistant/duckdb';

// Create a tool instance with custom context
const localQueryTool = {
  ...localQuery,
  context: {
    getValues: async (dataset: string, variable: string) => {
      // Application-specific data fetching
      return await fetchData(dataset, variable);
    },
  },
};

This pattern allows tools to be:

  • Reusable: Same tool definition, different contexts
  • Testable: Easy to mock context in tests
  • Flexible: Adapt to your application's architecture
  • Type-safe: Context is typed through the tool's type parameters
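The reusability and testability points can be sketched as follows; `withContext`, the tool definition, and the endpoint URL are illustrative, not part of the OpenAssistant API:

```typescript
type GetValues = (dataset: string, variable: string) => Promise<number[]>;

// Attach a context to a tool definition without modifying the definition itself.
function withContext<T extends object>(toolDef: T, getValues: GetValues) {
  return { ...toolDef, context: { getValues } };
}

const statsTool = { name: 'stats', description: 'Summary statistics for a variable' };

// Production: fetch from a real service (URL shape is an assumption).
const prodTool = withContext(statsTool, (dataset, variable) =>
  fetch(`/api/data/${dataset}/${variable}`).then((r) => r.json()),
);

// Tests: return a fixture, so no network or database is needed.
const testTool = withContext(statsTool, async () => [1, 2, 3]);
```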

Tool Output Cache (Client-Side)

The ToolCache class (from @openassistant/utils) provides client-side caching for tool results. It's a singleton that stores datasets generated by tools, making them available for subsequent operations without re-computation.

Supported Data Types:

typescript
type ToolCacheDataset =
  | { type: 'geojson'; content: GeoJSON.FeatureCollection }
  | { type: 'columnData'; content: Record<string, unknown>[] }
  | { type: 'string'; content: string }
  | { type: 'rowObjects'; content: unknown[][] }
  | { type: 'json'; content: Record<string, unknown> }
  | { type: 'weights'; content: { weights: number[][]; weightsMeta: Record<string, unknown> } }
  | { type: 'arrow'; content: unknown };

Basic Usage:

typescript
import { ToolCache } from '@openassistant/utils';

// Get the singleton instance
const cache = ToolCache.getInstance();

// Add a dataset to cache
cache.addDataset('tool-call-123', {
  datasetName: 'myDataset',
  myDataset: {
    type: 'geojson',
    content: {
      type: 'FeatureCollection',
      features: [/* ... */]
    }
  }
});

// Check if dataset exists
if (cache.hasDataset('myDataset')) {
  const dataset = cache.getDataset('myDataset');
  console.log('Cached dataset:', dataset);
}

// Remove a specific dataset
cache.removeDataset('myDataset');

// Clear all cached data
cache.clearCache();

Client-Side Integration Example:

typescript
'use client';

import { useChat } from 'ai/react';
import { ToolCache } from '@openassistant/utils';
import { localQuery } from '@openassistant/duckdb';
import { convertToVercelAiTool } from '@openassistant/utils';
import { tool } from 'ai';

export default function ChatComponent() {
  const cache = ToolCache.getInstance();
  
  // Create tool with cache integration
  const queryTool = {
    ...localQuery,
    context: {
      getValues: async (dataset, variable) => {
        // Check cache first
        if (cache.hasDataset(dataset)) {
          const cached = cache.getDataset(dataset);
          if (cached?.type === 'columnData') {
            return cached.content.map(row => row[variable]);
          }
        }
        
        // Fetch if not cached
        const data = await fetchData(dataset, variable);
        return data;
      },
    },
    onToolCompleted: (toolCallId, additionalData) => {
      // Cache the result from queryTool, see LocalQueryAdditionalData for the type of additionalData
      cache.addDataset(toolCallId, additionalData);
    },
  };
  
  const { messages, input, handleInputChange, handleSubmit } = useChat({
    api: '/api/chat',
  });
  
  return (
    <div>
      {/* Chat UI */}
    </div>
  );
}

Use Cases:

  • Avoid Re-computation: Cache expensive operations like spatial joins or large queries
  • Data Persistence: Keep data available across multiple tool calls in the same session
  • Client-Side Storage: Store intermediate results without server roundtrips
  • Performance Optimization: Reduce redundant API calls and computations
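The "avoid re-computation" use case amounts to a cache-aside pattern. A hypothetical helper, written against a minimal stand-in whose `hasDataset`/`getDataset`/`addDataset` methods mirror the calls shown above:

```typescript
type ColumnDataset = { type: 'columnData'; content: Record<string, unknown>[] };

// Minimal in-memory stand-in for the ToolCache API used in this sketch.
class MiniCache {
  private store = new Map<string, ColumnDataset>();
  hasDataset(name: string) { return this.store.has(name); }
  getDataset(name: string) { return this.store.get(name); }
  addDataset(name: string, ds: ColumnDataset) { this.store.set(name, ds); }
}

async function getOrCompute(
  cache: MiniCache,
  name: string,
  compute: () => Promise<ColumnDataset>,
): Promise<ColumnDataset> {
  // Return the cached dataset when present; otherwise compute once and store it.
  if (cache.hasDataset(name)) return cache.getDataset(name)!;
  const ds = await compute();
  cache.addDataset(name, ds);
  return ds;
}
```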

Tool Output Management (Server-Side)

Tools return structured results with two components:

typescript
export type OpenAssistantExecuteFunctionResult<
  TLlmResult = unknown,
  TAdditionalData = unknown,
> = {
  llmResult: TLlmResult;        // Data returned to the LLM
  additionalData?: TAdditionalData;  // Data for UI components/callbacks
};
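A hedged sketch of an execute function that splits its output this way: a compact llmResult for the model, and bulkier additionalData for UI components. The buffer operation and field names are illustrative:

```typescript
type ExecuteResult<L, A> = { llmResult: L; additionalData?: A };

async function executeBuffer(args: {
  radiusKm: number;
}): Promise<ExecuteResult<{ success: boolean; summary: string }, { geojson: object }>> {
  // Placeholder geometry standing in for a real spatial computation.
  const geojson = { type: 'FeatureCollection', features: [] };
  return {
    // Keep the LLM payload small to save tokens and avoid flooding the context.
    llmResult: { success: true, summary: `Buffered features by ${args.radiusKm} km` },
    // Full geometry is routed to the UI (e.g., a map component), not the LLM.
    additionalData: { geojson },
  };
}
```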

The ToolOutputManager (from @openassistant/utils) handles server-side tool output management:

  • Output Storage: Caching tool results on the server
  • Output Retrieval: Accessing results from different contexts
  • Client-Server Exchange: Transferring data between client and server
  • Conversation Scoping: Maintaining separate caches per conversation

Server-Side Example:

typescript
// app/api/chat/route.ts
import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { ToolOutputManager, ConversationCache } from '@openassistant/utils';
import { localQuery } from '@openassistant/duckdb';
import { convertToVercelAiTool } from '@openassistant/utils';
import { tool } from 'ai';

// Get conversation-scoped output manager
const conversationCache = ConversationCache.getInstance();

export async function POST(req: Request) {
  const { messages } = await req.json();
  
  // Get or create manager for this conversation
  const outputManager = conversationCache.getOrCreateManager(messages);
  
  // Register tools with the manager
  const queryTool = {
    ...localQuery,
    context: {
      getValues: async (dataset, variable) => {
        // Check if we have cached output from previous tool calls
        const cachedOutput = outputManager.getToolOutput('query', dataset);
        if (cachedOutput) {
          return cachedOutput.data[variable];
        }
        
        // Fetch from database
        return await db.query(dataset, variable);
      },
    },
    onToolCompleted: (toolCallId, additionalData) => {
      // Cache the output on server side
      outputManager.setToolOutput(toolCallId, additionalData);
    },
  };
  
  const result = streamText({
    model: openai('gpt-4'),
    messages,
    tools: {
      query: tool(convertToVercelAiTool(queryTool)),
    },
  });
  
  return result.toDataStreamResponse();
}

Key Differences from Client-Side Cache:

| Feature     | Client-Side (ToolCache)   | Server-Side (ToolOutputManager)           |
| ----------- | ------------------------- | ----------------------------------------- |
| Scope       | Browser session           | Conversation-scoped                       |
| Storage     | In-memory (singleton)     | Per-conversation instance                 |
| Persistence | Lost on page reload       | Maintained across requests                |
| Use Case    | UI state, client data     | Server data, cross-request sharing        |
| Access      | ToolCache.getInstance()   | ConversationCache.getOrCreateManager()    |

Data Flow

Client-Side Flow

User Input
    ↓
AI Model (via SDK)
    ↓
Tool Call
    ↓
OpenAssistant Tool
    ↓
Execute in Browser (WASM/Web Worker)
    ↓
Result to AI Model
    ↓
Display to User

Server-Side Flow

User Input (from client)
    ↓
API Route
    ↓
AI Model (via SDK)
    ↓
Tool Call
    ↓
OpenAssistant Tool
    ↓
Execute on Server
    ↓
Stream Result to Client
    ↓
Display to User

Hybrid Flow

User Input
    ↓
AI Model (client)
    ↓
Tool Call
    ↓
Send to Server (if needed)
    ↓
Execute Tool (client or server)
    ↓
Result Exchange via ToolOutputManager
    ↓
Display to User

Package Dependencies

Dependency Graph

Components (chat, echarts, etc.)
    ↓ depends on
Utils
    ↑ used by
Tools (duckdb, geoda, etc.)

  • Tools and Components depend on Utils
  • Tools are independent of each other
  • Components are independent of Tools

External Dependencies

OpenAssistant minimizes external dependencies:

  • Tools: Mostly self-contained with WebAssembly modules
  • Components: React, and visualization libraries (ECharts, Leaflet, etc.)
  • Utils: Zero external dependencies (except peer dependencies)

Performance Considerations

Bundle Size

  • Each package is tree-shakeable
  • Use dynamic imports for large dependencies
  • WebAssembly modules are loaded on demand

typescript
// Dynamic import for large tools
const { localQuery } = await import('@openassistant/duckdb');

Memory Management

  • Tools implement cleanup methods
  • WebAssembly instances are properly disposed
  • Large datasets use streaming when possible
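The cleanup bullet can be illustrated with a disposal pattern; the class and method names below are assumptions, not the actual OpenAssistant API:

```typescript
class WasmBackedTool {
  private instance: { terminate(): void } | null;

  constructor() {
    // Stand-in for an instantiated WASM module or Web Worker.
    this.instance = { terminate() {} };
  }

  dispose(): void {
    // Terminate the instance and drop the reference so memory can be reclaimed.
    this.instance?.terminate();
    this.instance = null;
  }

  get disposed(): boolean {
    return this.instance === null;
  }
}
```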

Optimization Strategies

  1. Code Splitting: Load tools only when needed
  2. Web Workers: Run intensive computations in background threads
  3. Caching: Use ToolCache for repeated computations
  4. Streaming: Stream large results instead of loading all at once
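The streaming strategy can be sketched with an async generator that yields fixed-size chunks of rows instead of materializing the whole result at once. The row shape and chunk size are illustrative:

```typescript
async function* streamRows(totalRows: number, chunkSize: number): AsyncGenerator<number[]> {
  for (let offset = 0; offset < totalRows; offset += chunkSize) {
    const size = Math.min(chunkSize, totalRows - offset);
    // Each chunk stands in for a batch of rows fetched from DuckDB or a server.
    yield Array.from({ length: size }, (_, i) => offset + i);
  }
}

// Consumers process each chunk as it arrives, keeping peak memory bounded.
async function countStreamed(totalRows: number, chunkSize: number): Promise<number> {
  let count = 0;
  for await (const chunk of streamRows(totalRows, chunkSize)) {
    count += chunk.length;
  }
  return count;
}
```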

Migration from v0.x

Key changes from v0.x to v1.0.0:

Removed Packages

  • @openassistant/ui - Replaced by @sqlrooms/ai for better UI components

New Patterns

typescript
// v0.x
import { useAssistant } from '@openassistant/core';

// v1.0.0 - Use your framework's hooks directly
import { useChat } from 'ai/react'; // Vercel AI SDK
// or
import { useLangChain } from '@langchain/react'; // LangChain

Tool Usage

typescript
// v1.0.0 - Object-based with type safety
import { localQuery } from '@openassistant/duckdb';
import { convertToVercelAiTool } from '@openassistant/utils';
import { tool } from 'ai';

const localQueryTool = {
  ...localQuery,
  context: {
    getValues: async (dataset, variable) => {
      return await fetchData(dataset, variable);
    },
  },
};

// Convert to AI framework format
const aiTool = tool(convertToVercelAiTool(localQueryTool));

Released under the MIT License.