
Architecture

OpenAssistant v1.0.0 is built on a modular, framework-agnostic architecture, so you can create AI-powered spatial data applications with the AI framework of your choice.

Design Principles

1. Framework Agnostic

Unlike v0.x, OpenAssistant v1.0.0 doesn't depend on any specific AI framework. Each tool can be converted to work with different frameworks through adapter methods:

typescript
// Works with Vercel AI SDK
const vercelTool = tool.toVercelAiTool();

// Works with LangChain
const langchainTool = tool.toLangChainTool();

2. Browser-First

All core functionality runs in the browser using:

  • WebAssembly: DuckDB, GeoDA, and other compute-intensive operations
  • Web Workers: Background processing for non-blocking operations
  • IndexedDB: Client-side data persistence
  • Modern Web APIs: Geolocation, Canvas, WebGL for visualizations

This approach provides:

  • Privacy: Data never leaves the user's device
  • Performance: Reduced network latency
  • Cost Efficiency: No server infrastructure required for compute

3. Modular Packages

Each package is independent and can be installed separately:

@openassistant/
├── utils/              # Core utilities
├── tools/
│   ├── duckdb/        # SQL query tools
│   ├── geoda/         # Spatial statistics
│   ├── map/           # Map manipulation
│   ├── osm/           # OpenStreetMap
│   ├── places/        # Location services
│   ├── plots/         # Visualization tools
│   └── h3/            # H3 spatial indexing
└── components/
    ├── assistant/     # Assistant component
    ├── echarts/       # ECharts components
    ├── keplergl/      # Kepler.gl components
    ├── leaflet/       # Leaflet components
    ├── vegalite/      # Vega-Lite components
    ├── tables/        # Table components
    ├── hooks/         # React hooks
    └── common/        # Shared UI utilities

Core Concepts

OpenAssistantTool Type

All tools conform to the OpenAssistantTool type (defined in @openassistant/utils):

typescript
export type OpenAssistantTool<
  TArgs extends ZodType = ZodType<unknown>,
  TLlmResult = unknown,
  TAdditionalData = unknown,
  TContext = never,
> = {
  name: string;
  description: string;
  parameters: TArgs; // Zod schema for parameter validation
  context?: TContext;
  component?: unknown;
  onToolCompleted?: OpenAssistantOnToolCompleted;
  execute: OpenAssistantExecuteFunction<
    z.infer<TArgs>, 
    TLlmResult, 
    TAdditionalData, 
    TContext
  >;
};

Type Parameters:

  • TArgs: Zod schema type for tool parameters
  • TLlmResult: Type of result returned to the LLM
  • TAdditionalData: Type of additional data for UI components
  • TContext: Type of context object for accessing application data
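A minimal sketch of how the four type parameters line up in practice. The tool name, context shape, and logic here are illustrative, and the zod schema is replaced by a plain argument type for brevity:

```typescript
type Args = { dataset: string; variable: string };                         // z.infer<TArgs>
type LlmResult = { mean: number };                                         // TLlmResult
type AdditionalData = { values: number[] };                                // TAdditionalData
type Context = { getValues: (d: string, v: string) => Promise<number[]> }; // TContext

const meanTool = {
  name: 'mean',
  description: 'Compute the mean of a variable in a dataset',
  execute: async (
    args: Args,
    options: { context: Context },
  ): Promise<{ llmResult: LlmResult; additionalData: AdditionalData }> => {
    const values = await options.context.getValues(args.dataset, args.variable);
    const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
    // Compact summary goes to the LLM; the raw values go to UI components.
    return { llmResult: { mean }, additionalData: { values } };
  },
};
```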

Tool Context

Tools often need access to application-specific data or services. This is provided through the context property:

typescript
import { localQuery } from '@openassistant/duckdb';

// Create a tool instance with custom context
const localQueryTool = {
  ...localQuery,
  context: {
    getValues: async (dataset: string, variable: string) => {
      // Application-specific data fetching
      return await fetchData(dataset, variable);
    },
  },
};

This pattern allows tools to be:

  • Reusable: Same tool definition, different contexts
  • Testable: Easy to mock context in tests
  • Flexible: Adapt to your application's architecture
  • Type-safe: Context is typed through the tool's type parameters
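The reusability and testability points can be sketched as follows; `withContext`, the tool definition, and the endpoint URL are illustrative, not part of the OpenAssistant API:

```typescript
type GetValues = (dataset: string, variable: string) => Promise<number[]>;

// Attach a context to a tool definition without modifying the definition itself.
function withContext<T extends object>(toolDef: T, getValues: GetValues) {
  return { ...toolDef, context: { getValues } };
}

const statsTool = { name: 'stats', description: 'Summary statistics for a variable' };

// Production: fetch from a real service (URL shape is an assumption).
const prodTool = withContext(statsTool, (dataset, variable) =>
  fetch(`/api/data/${dataset}/${variable}`).then((r) => r.json()),
);

// Tests: return a fixture, so no network or database is needed.
const testTool = withContext(statsTool, async () => [1, 2, 3]);
```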

Tool Output Cache (Client-Side)

The ToolCache class (from @openassistant/utils) provides client-side caching for tool results. It's a singleton that stores datasets generated by tools, making them available for subsequent operations without re-computation.

Supported Data Types:

typescript
type ToolCacheDataset =
  | { type: 'geojson'; content: GeoJSON.FeatureCollection }
  | { type: 'columnData'; content: Record<string, unknown>[] }
  | { type: 'string'; content: string }
  | { type: 'rowObjects'; content: unknown[][] }
  | { type: 'json'; content: Record<string, unknown> }
  | { type: 'weights'; content: { weights: number[][]; weightsMeta: Record<string, unknown> } }
  | { type: 'arrow'; content: unknown };

Basic Usage:

typescript
import { ToolCache } from '@openassistant/utils';

// Get the singleton instance
const cache = ToolCache.getInstance();

// Add a dataset to cache
cache.addDataset('tool-call-123', {
  datasetName: 'myDataset',
  myDataset: {
    type: 'geojson',
    content: {
      type: 'FeatureCollection',
      features: [/* ... */]
    }
  }
});

// Check if dataset exists
if (cache.hasDataset('myDataset')) {
  const dataset = cache.getDataset('myDataset');
  console.log('Cached dataset:', dataset);
}

// Remove a specific dataset
cache.removeDataset('myDataset');

// Clear all cached data
cache.clearCache();

Client-Side Integration Example:

typescript
'use client';

import { useChat } from 'ai/react';
import { ToolCache } from '@openassistant/utils';
import { localQuery } from '@openassistant/duckdb';
import { convertToVercelAiTool } from '@openassistant/utils';
import { tool } from 'ai';

export default function ChatComponent() {
  const cache = ToolCache.getInstance();
  
  // Create tool with cache integration
  const queryTool = {
    ...localQuery,
    context: {
      getValues: async (dataset, variable) => {
        // Check cache first
        if (cache.hasDataset(dataset)) {
          const cached = cache.getDataset(dataset);
          if (cached?.type === 'columnData') {
            return cached.content.map(row => row[variable]);
          }
        }
        
        // Fetch if not cached
        const data = await fetchData(dataset, variable);
        return data;
      },
    },
    onToolCompleted: (toolCallId, additionalData) => {
      // Cache the result from queryTool, see LocalQueryAdditionalData for the type of additionalData
      cache.addDataset(toolCallId, additionalData);
    },
  };
  
  const { messages, input, handleInputChange, handleSubmit } = useChat({
    api: '/api/chat',
  });
  
  return (
    <div>
      {/* Chat UI */}
    </div>
  );
}

Use Cases:

  • Avoid Re-computation: Cache expensive operations like spatial joins or large queries
  • Data Persistence: Keep data available across multiple tool calls in the same session
  • Client-Side Storage: Store intermediate results without server roundtrips
  • Performance Optimization: Reduce redundant API calls and computations
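The "avoid re-computation" use case amounts to a cache-aside pattern. A hypothetical helper, written against a minimal stand-in whose `hasDataset`/`getDataset`/`addDataset` methods mirror the calls shown above:

```typescript
type ColumnDataset = { type: 'columnData'; content: Record<string, unknown>[] };

// Minimal in-memory stand-in for the ToolCache API used in this sketch.
class MiniCache {
  private store = new Map<string, ColumnDataset>();
  hasDataset(name: string) { return this.store.has(name); }
  getDataset(name: string) { return this.store.get(name); }
  addDataset(name: string, ds: ColumnDataset) { this.store.set(name, ds); }
}

async function getOrCompute(
  cache: MiniCache,
  name: string,
  compute: () => Promise<ColumnDataset>,
): Promise<ColumnDataset> {
  // Return the cached dataset when present; otherwise compute once and store it.
  if (cache.hasDataset(name)) return cache.getDataset(name)!;
  const ds = await compute();
  cache.addDataset(name, ds);
  return ds;
}
```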

Tool Output Management (Server-Side)

Tools return structured results with two components:

typescript
export type OpenAssistantExecuteFunctionResult<
  TLlmResult = unknown,
  TAdditionalData = unknown,
> = {
  llmResult: TLlmResult;        // Data returned to the LLM
  additionalData?: TAdditionalData;  // Data for UI components/callbacks
};
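A hedged sketch of an execute function that splits its output this way: a compact llmResult for the model, and bulkier additionalData for UI components. The buffer operation and field names are illustrative:

```typescript
type ExecuteResult<L, A> = { llmResult: L; additionalData?: A };

async function executeBuffer(args: {
  radiusKm: number;
}): Promise<ExecuteResult<{ success: boolean; summary: string }, { geojson: object }>> {
  // Placeholder geometry standing in for a real spatial computation.
  const geojson = { type: 'FeatureCollection', features: [] };
  return {
    // Keep the LLM payload small to save tokens and avoid flooding the context.
    llmResult: { success: true, summary: `Buffered features by ${args.radiusKm} km` },
    // Full geometry is routed to the UI (e.g., a map component), not the LLM.
    additionalData: { geojson },
  };
}
```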

The ToolOutputManager (from @openassistant/utils) handles server-side tool output management:

  • Output Storage: Caching tool results on the server
  • Output Retrieval: Accessing results from different contexts
  • Client-Server Exchange: Transferring data between client and server
  • Conversation Scoping: Maintaining separate caches per conversation

Server-Side Example:

typescript
// app/api/chat/route.ts
import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { ToolOutputManager, ConversationCache } from '@openassistant/utils';
import { localQuery } from '@openassistant/duckdb';
import { convertToVercelAiTool } from '@openassistant/utils';
import { tool } from 'ai';

// Get conversation-scoped output manager
const conversationCache = ConversationCache.getInstance();

export async function POST(req: Request) {
  const { messages } = await req.json();
  
  // Get or create manager for this conversation
  const outputManager = conversationCache.getOrCreateManager(messages);
  
  // Register tools with the manager
  const queryTool = {
    ...localQuery,
    context: {
      getValues: async (dataset, variable) => {
        // Check if we have cached output from previous tool calls
        const cachedOutput = outputManager.getToolOutput('query', dataset);
        if (cachedOutput) {
          return cachedOutput.data[variable];
        }
        
        // Fetch from database
        return await db.query(dataset, variable);
      },
    },
    onToolCompleted: (toolCallId, additionalData) => {
      // Cache the output on server side
      outputManager.setToolOutput(toolCallId, additionalData);
    },
  };
  
  const result = streamText({
    model: openai('gpt-4'),
    messages,
    tools: {
      query: tool(convertToVercelAiTool(queryTool)),
    },
  });
  
  return result.toDataStreamResponse();
}

Key Differences from Client-Side Cache:

| Feature     | Client-Side (ToolCache)   | Server-Side (ToolOutputManager)           |
| ----------- | ------------------------- | ----------------------------------------- |
| Scope       | Browser session           | Conversation-scoped                       |
| Storage     | In-memory (singleton)     | Per-conversation instance                 |
| Persistence | Lost on page reload       | Maintained across requests                |
| Use Case    | UI state, client data     | Server data, cross-request sharing        |
| Access      | ToolCache.getInstance()   | ConversationCache.getOrCreateManager()    |

Data Flow

Client-Side Flow

User Input
    ↓
AI Model (via SDK)
    ↓
Tool Call
    ↓
OpenAssistant Tool
    ↓
Execute in Browser (WASM/Web Worker)
    ↓
Result to AI Model
    ↓
Display to User

Server-Side Flow

User Input (from client)
    ↓
API Route
    ↓
AI Model (via SDK)
    ↓
Tool Call
    ↓
OpenAssistant Tool
    ↓
Execute on Server
    ↓
Stream Result to Client
    ↓
Display to User

Hybrid Flow

User Input
    ↓
AI Model (client)
    ↓
Tool Call
    ↓
Send to Server (if needed)
    ↓
Execute Tool (client or server)
    ↓
Result Exchange via ToolOutputManager
    ↓
Display to User

Package Dependencies

Dependency Graph

Components (chat, echarts, etc.)
    ↓ depends on
Utils
    ↑ used by
Tools (duckdb, geoda, etc.)

  • Tools and Components depend on Utils
  • Tools are independent of each other
  • Components are independent of Tools

External Dependencies

OpenAssistant minimizes external dependencies:

  • Tools: Mostly self-contained with WebAssembly modules
  • Components: React, and visualization libraries (ECharts, Leaflet, etc.)
  • Utils: Zero external dependencies (except peer dependencies)

Performance Considerations

Bundle Size

  • Each package is tree-shakeable
  • Use dynamic imports for large dependencies
  • WebAssembly modules are loaded on demand

typescript
// Dynamic import for large tools
const { localQuery } = await import('@openassistant/duckdb');

Memory Management

  • Tools implement cleanup methods
  • WebAssembly instances are properly disposed
  • Large datasets use streaming when possible
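The cleanup bullet can be illustrated with a disposal pattern; the class and method names below are assumptions, not the actual OpenAssistant API:

```typescript
class WasmBackedTool {
  private instance: { terminate(): void } | null;

  constructor() {
    // Stand-in for an instantiated WASM module or Web Worker.
    this.instance = { terminate() {} };
  }

  dispose(): void {
    // Terminate the instance and drop the reference so memory can be reclaimed.
    this.instance?.terminate();
    this.instance = null;
  }

  get disposed(): boolean {
    return this.instance === null;
  }
}
```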

Optimization Strategies

  1. Code Splitting: Load tools only when needed
  2. Web Workers: Run intensive computations in background threads
  3. Caching: Use ToolCache for repeated computations
  4. Streaming: Stream large results instead of loading all at once
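The streaming strategy can be sketched with an async generator that yields fixed-size chunks of rows instead of materializing the whole result at once. The row shape and chunk size are illustrative:

```typescript
async function* streamRows(totalRows: number, chunkSize: number): AsyncGenerator<number[]> {
  for (let offset = 0; offset < totalRows; offset += chunkSize) {
    const size = Math.min(chunkSize, totalRows - offset);
    // Each chunk stands in for a batch of rows fetched from DuckDB or a server.
    yield Array.from({ length: size }, (_, i) => offset + i);
  }
}

// Consumers process each chunk as it arrives, keeping peak memory bounded.
async function countStreamed(totalRows: number, chunkSize: number): Promise<number> {
  let count = 0;
  for await (const chunk of streamRows(totalRows, chunkSize)) {
    count += chunk.length;
  }
  return count;
}
```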

Migration from v0.x

Key changes from v0.x to v1.0.0:

Removed Packages

  • @openassistant/ui - Replaced by @sqlrooms/ai for better UI components

New Patterns

typescript
// v0.x
import { useAssistant } from '@openassistant/core';

// v1.0.0 - Use your framework's hooks directly
import { useChat } from 'ai/react'; // Vercel AI SDK
// or
import { useLangChain } from '@langchain/react'; // LangChain

Tool Usage

typescript
// v1.0.0 - Object-based with type safety
import { localQuery } from '@openassistant/duckdb';
import { convertToVercelAiTool } from '@openassistant/utils';
import { tool } from 'ai';

const localQueryTool = {
  ...localQuery,
  context: {
    getValues: async (dataset, variable) => {
      return await fetchData(dataset, variable);
    },
  },
};

// Convert to AI framework format
const aiTool = tool(convertToVercelAiTool(localQueryTool));

Released under the MIT License.