ai-chunk-overlap Edge Function — Ai
Splits text into overlapping chunks with configurable size, overlap, and boundary snapping (char, word, sentence) for RAG and embedding pipelines. Deployed on Cloudflare Workers: zero cold starts, globally distributed. Mount it via your Aerostack workspace to call it from any AI agent.
npx aerostack add navin/ai-chunk-overlap
Use with AI Assistants
MCP: Connect Claude, Cursor, or any MCP-compatible client, then call this function by slug.
① Add MCP Server
Add this once — access all Aerostack functions from your AI tool.
{
"mcpServers": {
"aerostack": {
"url": "https://mcp.aerostack.dev",
"type": "http"
}
}
}
② Call this function
Ask your AI to use the call_function tool with this slug:
call_function({
slug: "ai-chunk-overlap",
args: {
"text": "example_text",
"chunkSize": 1000,
"overlap": 200,
"boundary": "word"
}
})
ai-chunk-overlap: Split text into overlapping chunks for RAG pipelines
Splits a document into overlapping text chunks with configurable size, overlap, and boundary snapping — no external API calls required.
API
POST /api/ai-chunk-overlap
Request body
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| text | string | ✅ | — | Text to split into chunks |
| chunkSize | number | ❌ | 1000 | Target chunk size in characters |
| overlap | number | ❌ | 200 | Characters to repeat between adjacent chunks |
| boundary | "char" \| "word" \| "sentence" | ❌ | "word" | Boundary snapping mode |
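The request fields can be modelled as a small TypeScript type. The sketch below is inferred only from the table; the names `ChunkRequest` and `withDefaults` are hypothetical and not part of the published function. It simply shows which values the function would use when optional fields are omitted.

```typescript
// Hypothetical types inferred from the request-body table; not an official SDK.
type Boundary = "char" | "word" | "sentence";

interface ChunkRequest {
  text: string;        // required
  chunkSize?: number;  // documented default: 1000
  overlap?: number;    // documented default: 200
  boundary?: Boundary; // documented default: "word"
}

// Fill in the documented defaults for any omitted optional field.
function withDefaults(req: ChunkRequest): Required<ChunkRequest> {
  return {
    text: req.text,
    chunkSize: req.chunkSize ?? 1000,
    overlap: req.overlap ?? 200,
    boundary: req.boundary ?? "word",
  };
}

// Only `text` and `boundary` are supplied; the rest fall back to defaults.
const body = withDefaults({ text: "example_text", boundary: "sentence" });
console.log(JSON.stringify(body));
```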
Success response (200)
{
"success": true,
"data": {
"chunks": [
{ "text": "...", "start": 0, "end": 1024, "index": 0 },
{ "text": "...", "start": 824, "end": 1848, "index": 1 }
],
"count": 2,
"totalChars": 5000
}
}
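To make the offsets concrete, here is a minimal sketch that types the response and measures how many characters adjacent chunks share. The interface names and the `adjacentOverlaps` helper are assumptions for illustration, and the sample data copies only the offsets from the example response above.

```typescript
// Hypothetical types mirroring the documented success response.
interface Chunk {
  text: string;
  start: number;
  end: number;
  index: number;
}

interface ChunkResponse {
  success: boolean;
  data: { chunks: Chunk[]; count: number; totalChars: number };
}

// For each adjacent pair, count the characters shared between the two chunks.
function adjacentOverlaps(chunks: Chunk[]): number[] {
  const overlaps: number[] = [];
  for (let i = 0; i + 1 < chunks.length; i++) {
    overlaps.push(chunks[i].end - chunks[i + 1].start);
  }
  return overlaps;
}

// Offsets taken from the example response; the chunk text is elided there too.
const sample: ChunkResponse = {
  success: true,
  data: {
    chunks: [
      { text: "...", start: 0, end: 1024, index: 0 },
      { text: "...", start: 824, end: 1848, index: 1 },
    ],
    count: 2,
    totalChars: 5000,
  },
};

// 1024 - 824 = 200 shared characters between chunk 0 and chunk 1.
console.log(adjacentOverlaps(sample.data.chunks));
```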
Error responses
| Code | HTTP | When |
|---|---|---|
| INVALID_INPUT | 400 | Missing text or invalid parameters |
| INTERNAL_ERROR | 500 | overlap >= chunkSize |
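Both documented errors stem from the request itself, so a caller may prefer to check inputs before sending. The sketch below predicts which documented code a request would likely trigger. `predictError` is a hypothetical helper, and treating non-numeric `chunkSize` or `overlap` as "invalid parameters" is an assumption, since the table does not say exactly what the server validates.

```typescript
// The two error codes listed in the table above.
type DocumentedError = "INVALID_INPUT" | "INTERNAL_ERROR";

// Loosely typed on purpose: this models unvalidated caller input.
interface DraftRequest {
  text?: unknown;
  chunkSize?: unknown;
  overlap?: unknown;
}

// Returns the documented code a request would probably hit, or null if it looks fine.
function predictError(req: DraftRequest): DocumentedError | null {
  // Documented: missing text yields INVALID_INPUT.
  if (typeof req.text !== "string") return "INVALID_INPUT";

  const chunkSize = req.chunkSize ?? 1000; // documented default
  const overlap = req.overlap ?? 200;      // documented default

  // Assumption: non-numeric sizes count as "invalid parameters".
  if (typeof chunkSize !== "number" || typeof overlap !== "number") return "INVALID_INPUT";
  if (!Number.isFinite(chunkSize) || !Number.isFinite(overlap)) return "INVALID_INPUT";

  // Documented: overlap >= chunkSize surfaces as INTERNAL_ERROR (HTTP 500).
  if (overlap >= chunkSize) return "INTERNAL_ERROR";
  return null;
}

console.log(predictError({ text: "hello", chunkSize: 100, overlap: 100 }));
```

Because the 500 here reflects a parameter problem rather than a transient fault, retrying the same request is unlikely to help.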
Usage
cURL
curl -X POST "$FUNCTION_URL" \
-H "Content-Type: application/json" \
-d '{"text": "...", "chunkSize": 500, "overlap": 100, "boundary": "sentence"}'
TypeScript / JavaScript (HTTP)
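// FUNCTION_URL, documentText, and embedAndStore are placeholders you supply.
const response = await fetch(FUNCTION_URL, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ text: documentText, chunkSize: 1000, overlap: 200 }),
});
// Stop early on the documented 400 / 500 error responses.
if (!response.ok) {
throw new Error(`ai-chunk-overlap request failed with HTTP ${response.status}`);
}
const { data } = await response.json();
for (const chunk of data.chunks) {
await embedAndStore(chunk.text, chunk.start, chunk.end);
}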
Direct import (Node / Bun / Deno)
import { aiChunkOverlap } from '@aerostack/functions/ai-chunk-overlap';
const { chunks } = aiChunkOverlap({ text: longDocument, chunkSize: 800, overlap: 150 });
Use Cases
- Preparing documents for embedding by splitting them into overlapping chunks before storing in a vector database.
- Chunking transcripts or long-form articles for RAG retrieval, ensuring context spanning chunk boundaries is preserved.
- Splitting code or documentation files into manageable pieces for LLM-based analysis.
Notes
- `start` and `end` are character offsets into the original text: `text.slice(start, end)` reproduces each chunk.
- Overlap ensures that phrases or sentences crossing a chunk boundary appear in both adjacent chunks.
- `boundary: 'sentence'` snaps to the nearest `.`, `!`, or `?` character.
- `overlap` must be less than `chunkSize`; equal or greater values throw an error.
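The behaviour described in these notes can be sketched as a sliding window. This is not the function's actual source: `chunkSketch` and `snapEnd` are hypothetical names, and the snapping rule is an assumption (only the chunk end is snapped, and only backward to the closest whitespace or `.`/`!`/`?`). The published function may snap differently.

```typescript
type Boundary = "char" | "word" | "sentence";

interface Chunk {
  text: string;
  start: number;
  end: number;
  index: number;
}

// Assumption: snap a tentative end backward to just after the closest boundary
// character, never moving at or before `min` so the window always advances.
function snapEnd(text: string, min: number, end: number, boundary: Boundary): number {
  if (boundary === "char" || end >= text.length) return end;
  for (let i = end; i > min; i--) {
    const prev = text[i - 1];
    const isBreak = boundary === "word" ? /\s/.test(prev) : /[.!?]/.test(prev);
    if (isBreak) return i;
  }
  return end; // no boundary in range: fall back to a hard cut
}

function chunkSketch(
  text: string,
  chunkSize = 1000,
  overlap = 200,
  boundary: Boundary = "word",
): Chunk[] {
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must be non-negative and less than chunkSize");
  }
  const chunks: Chunk[] = [];
  let start = 0;
  while (start < text.length) {
    const hardEnd = Math.min(start + chunkSize, text.length);
    const end = snapEnd(text, start + overlap, hardEnd, boundary);
    // Offsets index into the original text, so slice(start, end) is the chunk.
    chunks.push({ text: text.slice(start, end), start, end, index: chunks.length });
    if (end >= text.length) break;
    start = end - overlap; // the next chunk repeats the last `overlap` characters
  }
  return chunks;
}

// Character mode with tiny sizes makes the overlap easy to see.
console.log(chunkSketch("abcdefghij", 4, 1, "char").map((c) => c.text));
// → ["abcd", "defg", "ghij"]
```

In this sketch the chunk start is not snapped, so in `word` or `sentence` mode a chunk can begin mid-word even though it ends on a boundary.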
Metadata
Publisher
@navin verified
More AI Functions
ai-context-window-fit
by @navin
Trims a conversation message array to fit within a model's context window using configurable strategies, without making any API calls.
ai-cost-estimate
by @navin
Calculates the API cost for an LLM request given a model name, prompt token count, and completion token count, supporting multiple currencies.
ai-extract-keywords
by @navin
Extracts the top N keywords from text using TF-IDF inspired scoring with built-in English stopword filtering, no external API calls required.
ai-guardrail-injection-detect
by @navin
Scores text for common prompt injection attack patterns including role overrides, instruction leaking, and jailbreak attempts.
ai-language-detect
by @navin
Detects the natural language of a text string using character trigram frequency analysis, supporting 13 languages with no external API calls.
ai-messages-to-prompt
by @navin
Serialises a structured message array into a formatted prompt string for open-source LLMs, supporting ChatML, Llama 2, Alpaca, and plain text formats.
Frequently asked questions
What does the ai-chunk-overlap function do?
ai-chunk-overlap is a serverless edge function in Aerostack's AI category. It splits text into overlapping chunks for RAG and embedding pipelines, and you deploy it to Cloudflare Workers via your Aerostack workspace.
How do I deploy the ai-chunk-overlap function?
Install the Aerostack CLI and run:
```bash
aerostack deploy function @navin/ai-chunk-overlap
```
It will be live on Cloudflare Workers in seconds.
What runtime does ai-chunk-overlap use?
ai-chunk-overlap runs on the Cloudflare Workers edge runtime via Aerostack, with zero cold starts and global distribution.
Can I customise the ai-chunk-overlap function?
Yes. Fork the function from your Aerostack dashboard, modify the source, and redeploy. All changes are version-controlled.