Variable: dataClassify
const dataClassify: OpenAssistantTool<DataClassifyFunctionArgs, DataClassifyLlmResult, DataClassifyAdditionalData, DataClassifyFunctionContext>
Defined in: src/data-classify/tool.ts:107
dataClassify Tool
This tool classifies numerical data into k bins (classes) using one of several statistical methods. It returns the break points, which can then be used to divide continuous data into discrete intervals.
Classification Methods
The classification method can be one of the following types:
- quantile: Divides data into equal-sized groups based on quantiles
- natural breaks: Uses Jenks' algorithm to minimize within-group variance
- equal interval: Creates intervals of equal width across the data range
- percentile: Uses percentile-based breaks (25th, 50th, 75th percentiles)
- box: Uses box plot statistics (hinge = 1.5 or 3.0)
- standard deviation: Creates breaks based on standard deviation intervals
- unique values: Returns all unique values in the dataset
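To make the idea of "break points" concrete, here is a minimal sketch of two of the simpler methods, equal interval and quantile. It is for illustration only and is not the implementation used by @openassistant/geoda. The function names are hypothetical, the quantile version assumes linear interpolation between sorted values, and `classOf` assumes that a value equal to a break falls into the upper class. The tool's own conventions may differ.

```typescript
// Illustrative sketch only; not the library's implementation.

// Equal interval: k classes of equal width need k - 1 interior breaks.
function equalIntervalBreaks(values: number[], k: number): number[] {
  const min = Math.min(...values);
  const max = Math.max(...values);
  const width = (max - min) / k;
  const breaks: number[] = [];
  for (let i = 1; i < k; i++) {
    breaks.push(min + i * width);
  }
  return breaks;
}

// Quantile: breaks at the i/k quantiles of the sorted data
// (assumption: linear interpolation between neighbouring values).
function quantileBreaks(values: number[], k: number): number[] {
  const sorted = [...values].sort((a, b) => a - b);
  const breaks: number[] = [];
  for (let i = 1; i < k; i++) {
    const pos = (sorted.length - 1) * (i / k);
    const lo = Math.floor(pos);
    const hi = Math.ceil(pos);
    breaks.push(sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo));
  }
  return breaks;
}

// Map a value to a class index in 0..breaks.length
// (assumption: a value equal to a break belongs to the upper class).
function classOf(value: number, breaks: number[]): number {
  let idx = 0;
  while (idx < breaks.length && value >= breaks[idx]) {
    idx++;
  }
  return idx;
}
```

On skewed data the two methods diverge: for `[1, 2, 3, 4, 100]` with `k = 2`, the equal interval break sits in the middle of the range, while the quantile break sits at the median.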
Parameters
- datasetName: Name of the dataset containing the variable
- variableName: Name of the numerical variable to classify
- method: Classification method (see above)
- k: Number of bins/classes (required for quantile, natural breaks, equal interval)
- hinge: Hinge value for box method (default: 1.5)
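The parameter rules above (k required for three of the methods, hinge defaulting to 1.5 for the box method) can be sketched as a small validation helper. Everything here is hypothetical: `ClassifyArgsSketch`, `MethodSketch`, and `normalizeArgs` are names made up for illustration, and the method strings are simply copied from the list above. The real `DataClassifyFunctionArgs` type and the exact strings the tool accepts may differ.

```typescript
// Hypothetical shape mirroring the documented parameters; the real
// DataClassifyFunctionArgs type may differ.
type MethodSketch =
  | "quantile"
  | "natural breaks"
  | "equal interval"
  | "percentile"
  | "box"
  | "standard deviation"
  | "unique values";

interface ClassifyArgsSketch {
  datasetName: string;
  variableName: string;
  method: MethodSketch;
  k?: number;
  hinge?: number;
}

// Methods for which the documentation says k is required.
const METHODS_REQUIRING_K: MethodSketch[] = [
  "quantile",
  "natural breaks",
  "equal interval",
];

// Check the documented constraints and fill in the documented default.
function normalizeArgs(args: ClassifyArgsSketch): ClassifyArgsSketch {
  if (METHODS_REQUIRING_K.indexOf(args.method) !== -1) {
    const k = args.k;
    if (k === undefined || Math.floor(k) !== k || k < 1) {
      throw new Error(`Method "${args.method}" requires a positive integer k`);
    }
  }
  if (args.method === "box") {
    // Documented default hinge for the box method is 1.5.
    return { ...args, hinge: args.hinge ?? 1.5 };
  }
  return args;
}
```

Under these assumptions, a prompt such as "Use box plot method to classify the housing prices" would correspond to arguments with `method: "box"` and no `k`, with `hinge` falling back to 1.5.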
Example user prompts:
- "Can you classify the population data into 5 classes using natural breaks?"
- "Classify the income variable using quantile method with 4 bins"
- "Use box plot method to classify the housing prices"
Example
typescript
import { generateText } from "ai";
import { dataClassify } from "@openassistant/geoda";
import { convertToVercelAiTool } from "@openassistant/utils";

// Extend the tool with a context that tells it how to fetch the raw values
const classifyTool = {
  ...dataClassify,
  context: {
    getValues: async (datasetName: string, variableName: string) => {
      // Implementation to retrieve values from your data source
      return [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20];
    },
  },
};

// Usage with AI model (yourModel is a placeholder for your configured model)
const result = await generateText({
  model: yourModel,
  prompt: 'Can you classify the population data into 5 classes using natural breaks?',
  tools: { dataClassify: convertToVercelAiTool(classifyTool) },
});
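The `getValues` callback in the example returns a hard-coded array. As a rough sketch of what a real implementation might look like, the following reads a numeric column from an in-memory table. The `datasets` object, its sample rows, and the `readColumn` helper are all assumptions made up for illustration; substitute whatever data source your application actually uses.

```typescript
// Hypothetical in-memory data source; replace with your own storage.
type Row = Record<string, unknown>;

const datasets: Record<string, Row[]> = {
  cities: [
    { name: "A", population: 1200 },
    { name: "B", population: 5400 },
    { name: "C", population: 980 },
  ],
};

// Extract one numeric column, failing loudly on unknown datasets
// or on values that are not finite numbers.
function readColumn(datasetName: string, variableName: string): number[] {
  const rows = datasets[datasetName];
  if (!rows) {
    throw new Error(`Unknown dataset: ${datasetName}`);
  }
  return rows.map((row, i) => {
    const v = row[variableName];
    if (typeof v !== "number" || !isFinite(v)) {
      throw new Error(`Row ${i}: "${variableName}" is not a finite number`);
    }
    return v;
  });
}

// Async wrapper matching the signature used in the example context.
async function getValues(
  datasetName: string,
  variableName: string
): Promise<number[]> {
  return readColumn(datasetName, variableName);
}
```

Rejecting non-numeric values early keeps classification errors close to their cause, rather than surfacing later as unexpected break points.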