feat(knowledge): add Ollama embedding provider support#3714

Open
teedonk wants to merge 47 commits into simstudioai:staging from teedonk:feat/ollama-embedding-support

Conversation


@teedonk teedonk commented Mar 22, 2026

Summary

  • Adds Ollama as an embedding provider for knowledge bases, enabling fully offline/private RAG workflows
  • Implements per-KB dynamic pgvector tables to handle variable embedding dimensions across Ollama models
  • Auto-detects embedding dimension and context length from Ollama's /api/show endpoint
  • Adds provider selection UI with preset models (nomic-embed-text, mxbai-embed-large, all-minilm, etc.)

Fixes #3037

  • Per-KB embedding tables solve the variable dimension problem
  • Context-aware chunking and smart batching prevent context length overflow
  • Cross-provider search normalizes distance scores across OpenAI and Ollama KBs
  • All existing OpenAI/Azure paths remain untouched
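The dimension auto-detection from `/api/show` can be sketched as follows. This is a hedged, illustrative parser, not the PR's actual code: Ollama's `/api/show` returns a `model_info` map whose keys are architecture-prefixed (e.g. `nomic-bert.embedding_length`), so the detector matches on key suffixes rather than hardcoding an architecture name.

```typescript
// Hedged sketch (not the PR's actual code) of auto-detecting embedding
// dimension and context length from Ollama's /api/show response. model_info
// keys are architecture-prefixed (e.g. "nomic-bert.embedding_length"), so we
// match on key suffixes instead of hardcoding the architecture name.
interface OllamaShowResponse {
  model_info: Record<string, unknown>
}

function detectEmbeddingInfo(show: OllamaShowResponse): {
  dimension: number | null
  contextLength: number | null
} {
  let dimension: number | null = null
  let contextLength: number | null = null
  for (const [key, value] of Object.entries(show.model_info)) {
    if (typeof value !== 'number') continue
    if (key.endsWith('.embedding_length')) dimension = value
    else if (key.endsWith('.context_length')) contextLength = value
  }
  return { dimension, contextLength }
}
```

In the PR, the detected dimension then sizes the per-KB pgvector column, e.g. `vector(768)` for nomic-embed-text.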

Type of Change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation
  • Other: ___________

Testing

Tested with Ollama (nomic-embed-text) on local setup:

  • Created KB with Ollama provider — dimension auto-detected
  • Uploaded small and large documents (40+ pages) — all chunks processed successfully
  • Search returns relevant results across both OpenAI and Ollama KBs
  • All 4267 existing tests pass, zero TypeScript errors, Biome lint clean

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the [Contributor License Agreement (CLA)](./CONTRIBUTING.md#contributor-license-agreement-cla)

Video.Project.1.mp4

waleedlatif1 and others added 30 commits February 16, 2026 00:36
…stash, algolia tools; isolated-vm robustness improvements, tables backend (simstudioai#3271)

* feat(tools): advanced fields for youtube, vercel; added cloudflare and dataverse tools (simstudioai#3257)

* refactor(vercel): mark optional fields as advanced mode

Move optional/power-user fields behind the advanced toggle:
- List Deployments: project filter, target, state
- Create Deployment: project ID override, redeploy from, target
- List Projects: search
- Create/Update Project: framework, build/output/install commands
- Env Vars: variable type
- Webhooks: project IDs filter
- Checks: path, details URL
- Team Members: role filter
- All operations: team ID scope

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style(youtube): mark optional params as advanced mode

Hide pagination, sort order, and filter fields behind the advanced
toggle for a cleaner default UX across all YouTube operations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* added advanced fields for vercel and youtube, added cloudflare and dataverse block

* added desc for dataverse

* add more tools

* ack comment

* more

* ops

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* feat(tables): added tables (simstudioai#2867)

* updates

* required

* trashy table viewer

* updates

* updates

* filtering ui

* updates

* updates

* updates

* one input mode

* format

* fix lints

* improved errors

* updates

* updates

* changes

* doc strings

* breaking down file

* update comments with ai

* updates

* comments

* changes

* revert

* updates

* dedupe

* updates

* updates

* updates

* refactoring

* renames & refactors

* refactoring

* updates

* undo

* update db

* wand

* updates

* fix comments

* fixes

* simplify comments

* updates

* renames

* better comments

* validation

* updates

* updates

* updates

* fix sorting

* fix appearance

* updating prompt to make it user sort

* rm

* updates

* rename

* comments

* clean comments

* simplification

* updates

* updates

* refactor

* reduced type confusion

* undo

* rename

* undo changes

* undo

* simplify

* updates

* updates

* revert

* updates

* db updates

* type fix

* fix

* fix error handling

* updates

* docs

* docs

* updates

* rename

* dedupe

* revert

* uncook

* updates

* fix

* fix

* fix

* fix

* prepare merge

* readd migrations

* add back missed code

* migrate enrichment logic to general abstraction

* address bugbot concerns

* adhere to size limits for tables

* remove conflicting migration

* add back migrations

* fix tables auth

* fix permissive auth

* fix lint

* reran migrations

* migrate to use tanstack query for all server state

* update table-selector

* update names

* added tables to permission groups, updated subblock types

---------

Co-authored-by: Vikhyath Mondreti <vikhyath@simstudio.ai>
Co-authored-by: waleed <walif6@gmail.com>

* fix(snapshot): changed insert to upsert when concurrent identical child workflows are running (simstudioai#3259)

* fix(snapshot): changed insert to upsert when concurrent identical child workflows are running

* fixed ci tests failing

* fix(workflows): disallow duplicate workflow names at the same folder level (simstudioai#3260)

* feat(tools): added redis, upstash, algolia, and revenuecat (simstudioai#3261)

* feat(tools): added redis, upstash, algolia, and revenuecat

* ack comment

* feat(models): add gemini-3.1-pro-preview and update gemini-3-pro thinking levels (simstudioai#3263)

* fix(audit-log): lazily resolve actor name/email when missing (simstudioai#3262)

* fix(blocks): move type coercions from tools.config.tool to tools.config.params (simstudioai#3264)

* fix(blocks): move type coercions from tools.config.tool to tools.config.params

Number() coercions in tools.config.tool ran at serialization time before
variable resolution, destroying dynamic references like <block.result.count>
by converting them to NaN/null. Moved all coercions to tools.config.params
which runs at execution time after variables are resolved.

Fixed in 15 blocks: exa, arxiv, sentry, incidentio, wikipedia, ahrefs,
posthog, elasticsearch, dropbox, hunter, lemlist, spotify, youtube, grafana,
parallel. Also added mode: 'advanced' to optional exa fields.

Closes simstudioai#3258

* fix(blocks): address PR review — move remaining param mutations from tool() to params()

- Moved field mappings from tool() to params() in grafana, posthog,
  lemlist, spotify, dropbox (same dynamic reference bug)
- Fixed parallel.ts excerpts/full_content boolean logic
- Fixed parallel.ts search_queries empty case (must set undefined)
- Fixed elasticsearch.ts timeout not included when already ends with 's'
- Restored dropbox.ts tool() switch for proper default fallback

* fix(blocks): restore field renames to tool() for serialization-time validation

Field renames (e.g. personalApiKey→apiKey) must be in tool() because
validateRequiredFieldsBeforeExecution calls selectToolId()→tool() then
checks renamed field names on params. Only type coercions (Number(),
boolean) stay in params() to avoid destroying dynamic variable references.

* improvement(resolver): resolved empty sentinel to not pass through unexecuted valid refs to text inputs (simstudioai#3266)

* fix(blocks): add required constraint for serviceDeskId in JSM block (simstudioai#3268)

* fix(blocks): add required constraint for serviceDeskId in JSM block

* fix(blocks): rename custom field values to request field values in JSM create request

* fix(trigger): add isolated-vm support to trigger.dev container builds (simstudioai#3269)

Scheduled workflow executions running in trigger.dev containers were
failing to spawn isolated-vm workers because the native module wasn't
available in the container. This caused loop condition evaluation to
silently fail and exit after one iteration.

- Add isolated-vm to build.external and additionalPackages in trigger config
- Include isolated-vm-worker.cjs via additionalFiles for child process spawning
- Add fallback path resolution for worker file in trigger.dev environment

* fix(tables): hide tables from sidebar and block registry (simstudioai#3270)

* fix(tables): hide tables from sidebar and block registry

* fix(trigger): add isolated-vm support to trigger.dev container builds (simstudioai#3269)

Scheduled workflow executions running in trigger.dev containers were
failing to spawn isolated-vm workers because the native module wasn't
available in the container. This caused loop condition evaluation to
silently fail and exit after one iteration.

- Add isolated-vm to build.external and additionalPackages in trigger config
- Include isolated-vm-worker.cjs via additionalFiles for child process spawning
- Add fallback path resolution for worker file in trigger.dev environment

* lint

* fix(trigger): update node version to align with main app (simstudioai#3272)

* fix(build): fix corrupted sticky disk cache on blacksmith (simstudioai#3273)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Lakee Sivaraya <71339072+lakeesiv@users.noreply.github.com>
Co-authored-by: Vikhyath Mondreti <vikhyath@simstudio.ai>
Co-authored-by: Vikhyath Mondreti <vikhyathvikku@gmail.com>
… fixes, removed retired models, hex integration
…ogle tasks and bigquery integrations, workflow lock

vercel bot commented Mar 22, 2026

@teedonk is attempting to deploy a commit to the Sim Team on Vercel.

A member of the Team first needs to authorize it.


cursor bot commented Mar 22, 2026

PR Summary

High Risk
High risk because it adds a new embedding provider, introduces dynamic per-knowledge-base pgvector tables with raw SQL, and changes document ingestion + search paths that affect core data storage and retrieval behavior.

Overview
Adds Ollama as an embedding provider for knowledge bases, including creation-time validation against Ollama’s /api/show, optional ollamaBaseUrl handling, and UI controls to pick preset/custom models and dimensions.

Introduces lib/knowledge/dynamic-tables.ts to manage per-KB pgvector tables (create/drop, insert/delete, and vector/tag search), and wires this into the system: KB deletion drops the per-KB table, document processing routes embeddings into either the shared embedding table (OpenAI) or the KB-specific table (Ollama), and search now splits requested KBs by provider, generates provider-specific query embeddings, queries both backends, then normalizes/merges scores before returning results.

Updates embedding/chunking to be Ollama-aware: generateEmbeddings/generateSearchEmbedding can call Ollama’s /api/embed with smart batching/truncation, and document chunk sizing/overlap are capped based on Ollama model context length; token estimation for chunking becomes model-dependent.
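As a rough sketch of the context-aware chunk sizing described above — the 0.3 × context-length cap appears in the PR's sequence diagram, and the ~3 chars/token ratio for Ollama models is cited in review comments; the function name and return convention are illustrative, not the PR's actual code:

```typescript
// Illustrative sketch of the context-aware chunk cap. The 0.3 factor comes
// from the PR's sequence diagram; the ~3 chars/token Ollama ratio is an
// assumption cited in review comments. Names here are hypothetical.
function effectiveChunkSizeChars(
  requestedTokens: number,
  modelContextLength: number,
  charsPerToken = 3
): number {
  // Cap each chunk at 30% of the model's context window so batched
  // /api/embed inputs stay well under the limit.
  const capTokens = Math.floor(modelContextLength * 0.3)
  return Math.min(requestedTokens, capTokens) * charsPerToken
}
```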

Written by Cursor Bugbot for commit 075b005. This will update automatically on new commits.


greptile-apps bot commented Mar 22, 2026

Greptile Summary

This PR adds Ollama as a fully local/private embedding provider for knowledge bases, implementing per-KB dynamic pgvector tables to handle variable embedding dimensions across different Ollama models. The overall architecture is well-reasoned — auto-detecting dimensions from /api/show, context-aware chunking, smart batching, and cross-provider search normalization — and the OpenAI/Azure paths are left untouched.

Key issues found:

  • Data loss on Ollama document reprocessing (service.ts:611): The Ollama path deletes existing embeddings then inserts new ones without a transaction. A failure mid-insert leaves the document with no embeddings and no recovery path. The OpenAI path correctly uses db.transaction; Ollama should too.
  • Orphaned KB row on table creation failure (route.ts:136): If createKBEmbeddingTable throws after createKnowledgeBase has already committed, the KB row is left in the database with no corresponding embedding table. The catch block should delete the orphaned KB row before re-throwing.
  • Document status update regression (service.ts:628): The refactor extracted processingStatus: 'completed' from the db.transaction to a standalone call. For the OpenAI path this is a regression — if the embedding transaction succeeds but the status update fails, the document is permanently stuck in 'processing' state.
  • sql.raw vector interpolation (dynamic-tables.ts:238): The query vector is interpolated into a sql.raw template string. While the value is always JSON.stringify(number[]) today, this pattern bypasses Drizzle's parameterization and is fragile for future changes. A sql tagged template with a placeholder is safer.
  • Native <select> in the UI (create-base-modal.tsx): The model-picker uses a plain HTML <select> with hand-written Tailwind classes instead of the project's Select component, which won't inherit dark-mode tokens or future design-system updates.
  • Module-level cache in serverless (embeddings.ts:39): ollamaModelInfoCache won't persist across serverless invocations, making the TTL cache a no-op in typical deployment targets.

Confidence Score: 3/5

  • Not yet safe to merge — two P1 data-integrity bugs (non-atomic Ollama delete+insert and orphaned KB row) need to be fixed before this lands.
  • The feature design is solid and well-scoped, but service.ts introduces a data-loss scenario for Ollama document processing (no transaction around delete+insert) and a regression for OpenAI (status update pulled out of its transaction). The KB creation flow also leaves orphaned DB rows if pgvector table creation fails. These three issues touch the core data path and would cause observable bugs in production. The fixes are small and targeted, but they need to land before merge.
  • apps/sim/lib/knowledge/documents/service.ts (data-loss and regression), apps/sim/app/api/knowledge/route.ts (orphaned KB row), apps/sim/lib/knowledge/dynamic-tables.ts (sql.raw interpolation)

Important Files Changed

Filename Overview
apps/sim/lib/knowledge/dynamic-tables.ts New file implementing per-KB dynamic pgvector tables for Ollama embeddings. Core logic is sound (kbId → table name, create/drop/insert/search), but the vector literal is interpolated via sql.raw rather than parameterized, and the search functions don't wrap delete+insert atomically.
apps/sim/lib/knowledge/documents/service.ts Two bugs introduced: (1) Ollama delete + insert embeddings are not wrapped in a transaction — partial insert failure leaves document with no embeddings (data-loss); (2) the document status update was extracted from the OpenAI transaction, meaning status can get stuck in 'processing' if the update call fails after embeddings are successfully inserted (regression).
apps/sim/app/api/knowledge/route.ts Adds Ollama KB creation flow with dimension auto-detection. Bug: if createKBEmbeddingTable throws after createKnowledgeBase succeeds, the KB row is left orphaned in the DB with no corresponding embedding table and no cleanup.
apps/sim/lib/knowledge/embeddings.ts Adds getOllamaModelInfo (with TTL cache), getOllamaModelContextLength, callOllamaEmbeddingAPI, and Ollama branches in generateEmbeddings/generateSearchEmbedding. Logic is clean; module-level cache is a minor concern in serverless environments.
apps/sim/app/api/knowledge/search/route.ts Refactored to split OpenAI and Ollama KB IDs, generate separate query embeddings per provider, and merge results. The per-provider routing logic is clean and correctly handles mixed KB searches.
apps/sim/app/workspace/[workspaceId]/knowledge/components/create-base-modal/create-base-modal.tsx Adds provider toggle (OpenAI/Ollama) with preset model selector and custom model/dimension inputs. Functional, but uses a raw HTML <select> element instead of the project's component-library Select, which won't inherit dark-mode or design-system updates automatically.

Sequence Diagram

sequenceDiagram
    participant UI as CreateBaseModal
    participant API as POST /api/knowledge
    participant OllamaAPI as Ollama /api/show
    participant DB as Database
    participant DynTable as Per-KB Table

    UI->>API: POST {name, embeddingModel: "ollama/nomic-embed-text", ollamaBaseUrl}
    API->>OllamaAPI: POST /api/show {name: "nomic-embed-text"}
    OllamaAPI-->>API: {embedding_length: 768, context_length: 2048}
    API->>DB: createKnowledgeBase(data)
    DB-->>API: newKnowledgeBase {id}
    API->>DynTable: CREATE TABLE kb_embeddings_{id} (embedding vector(768))
    DynTable-->>API: OK

    Note over UI,DynTable: Document Upload Flow
    UI->>API: POST /api/knowledge/{id}/documents
    API->>OllamaAPI: GET context length (cached)
    OllamaAPI-->>API: contextLength=2048
    API->>API: chunk document (effectiveChunkSize = 0.3 × contextLength)
    API->>OllamaAPI: POST /api/embed {model, input: chunks[]}
    OllamaAPI-->>API: embeddings[]
    API->>DynTable: DELETE WHERE document_id=X
    API->>DynTable: INSERT embeddings (batched, no transaction)

    Note over UI,DynTable: Search Flow
    UI->>API: POST /api/knowledge/search
    API->>DB: SELECT embeddingModel FROM knowledgeBase WHERE id IN (...)
    DB-->>API: [{id, embeddingModel: "ollama/..."}]
    API->>OllamaAPI: POST /api/embed {model, input: [query]}
    OllamaAPI-->>API: queryVector[]
    API->>DynTable: SELECT ... ORDER BY embedding <=> queryVector LIMIT topK
    DynTable-->>API: SearchResult[]
    API-->>UI: merged results

Reviews (1): Last reviewed commit: "test(knowledge): update search tests for..."

Comment on lines +611 to +614
if (embeddingProvider === 'ollama') {
  // Per-KB table: delete old chunks then bulk-insert new ones
  await deleteKBDocumentEmbeddings(knowledgeBaseId, documentId)
  await insertKBEmbeddings(knowledgeBaseId, embeddingRecords, kb[0].embeddingDimension)

P1 Non-atomic delete + insert risks data loss

The Ollama path deletes all existing embeddings for the document and then inserts new ones without a wrapping transaction. If insertKBEmbeddings fails mid-way (e.g., after a partial batch insert), the old embeddings have already been deleted but the new ones are only partially written — leaving the document with fewer embeddings or none at all, with no way to recover automatically.

The OpenAI path below correctly wraps both operations in a db.transaction. Consider wrapping the Ollama path similarly. Since deleteKBDocumentEmbeddings and insertKBEmbeddings both use db.execute / db.insert, they should participate in a transaction too.
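The atomicity argument can be demonstrated with a minimal, dependency-free sketch (illustrative names, not the project's Drizzle helpers): a transaction stages work on a copy and commits only if the whole callback succeeds, so a failure mid-insert leaves the old embeddings intact.

```typescript
// Dependency-free sketch of why the wrapping transaction matters. Names are
// illustrative, not the project's actual helpers.
type Store = string[]

async function transaction(store: Store, fn: (staged: Store) => Promise<void>) {
  const staged = [...store] // work on a copy
  await fn(staged) // any throw propagates before the commit below
  store.length = 0
  store.push(...staged) // commit only on success
}

async function reprocessDocument(store: Store, newChunks: string[], failAt?: number) {
  await transaction(store, async (staged) => {
    staged.length = 0 // delete old embeddings for the document
    newChunks.forEach((chunk, i) => {
      if (i === failAt) throw new Error('insert failed mid-batch')
      staged.push(chunk)
    })
  })
}
```

Without the wrapper, the delete commits immediately and a mid-batch failure leaves the document with partial or zero embeddings — the scenario described above.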

Comment on lines 136 to +147
const newKnowledgeBase = await createKnowledgeBase(createData, requestId)

if (provider === 'ollama') {
  try {
    await createKBEmbeddingTable(newKnowledgeBase.id, effectiveDimension)
  } catch (tableError) {
    logger.error(
      `[${requestId}] Failed to create embedding table for KB ${newKnowledgeBase.id}`,
      tableError
    )
    throw tableError
  }

P1 Orphaned KB row when embedding table creation fails

createKnowledgeBase persists the KB row to the database at line 136 before createKBEmbeddingTable is called. If createKBEmbeddingTable throws (e.g. the pgvector extension isn't enabled or a naming collision occurs), the error is re-thrown and the request returns 500 — but the KB record is left behind in the database without a corresponding embedding table. Any subsequent document upload to this KB will fail with a table-not-found error and the user has no way to fix it from the UI.

A safe fix is to delete the orphaned KB row in the catch block before re-throwing:

} catch (tableError) {
  logger.error(...)
  // Clean up orphaned KB row
  await deleteKnowledgeBase(newKnowledgeBase.id, requestId).catch(() => {})
  throw tableError
}

Comment on lines +238 to +246
  // Use drizzle's insert API with dynamic table schema
  await db.insert(dynamicTable).values(batch)
} catch (err: unknown) {
  const pg = err as { code?: string; detail?: string; message?: string; cause?: unknown }
  logger.error(`insertKBEmbeddings failed for table ${table}`, {
    code: pg.code,
    detail: pg.detail,
    message: pg.message,
    cause: pg.cause,

P1 Vector literal injected via sql.raw — fragile and risky pattern

queryVector is constructed by the caller as JSON.stringify(emb) where emb is a number[], so in practice it contains no single quotes. However, interpolating it directly into a sql.raw string template is fragile: a future refactor that passes a different value could silently introduce a SQL injection or at minimum a syntax error.

Drizzle's parameterized approach is safer and just as performant:

// Instead of:
const vecLiteral = sql.raw(`'${queryVector}'::vector`)

// Prefer:
const vecLiteral = sql`${queryVector}::vector`

Using a Drizzle placeholder ensures the value is always parameterized correctly, regardless of its content.

Comment on lines +628 to +638
await db
  .update(document)
  .set({
    chunkCount: processed.metadata.chunkCount,
    tokenCount: processed.metadata.tokenCount,
    characterCount: processed.metadata.characterCount,
    processingStatus: 'completed',
    processingCompletedAt: now,
    processingError: null,
  })
  .where(eq(document.id, documentId))

P1 Document status update moved outside the transaction (regression)

In the original code, both the embedding inserts and the processingStatus: 'completed' update were inside a single db.transaction, so they were atomic. This PR extracts the status update to a separate await db.update(document)... call that runs after the transaction.

For the OpenAI path, if the transaction (embedding inserts) succeeds but the subsequent status update fails, the document stays permanently in 'processing' state — even though its embeddings are fully in place. There's no retry that would recover this (the catch block at line 656 sets status to 'failed', which is also incorrect since the embeddings are already there).

This is a regression for the OpenAI path that was introduced by the refactor to support the Ollama code path. The processingStatus update should remain inside the db.transaction for the OpenAI path.

Comment on lines +477 to +495
onChange={(e) => {
  const preset = OLLAMA_PRESET_MODELS.find(
    (m) => m.value === e.target.value
  )
  setOllamaPreset(e.target.value)
  if (preset && preset.dimension > 0) {
    setOllamaDimension(preset.dimension)
  }
}}
className='rounded-[4px] border border-[var(--border-1)] bg-[var(--surface-1)] px-[10px] py-[6px] text-[12px] text-[var(--text-primary)]'
>
  {OLLAMA_PRESET_MODELS.map((m) => (
    <option key={m.value} value={m.value}>
      {m.label}
    </option>
  ))}
</select>
</div>


P2 Native <select> instead of the component library's Select

The rest of the form uses components from the existing @/components/ui library (e.g. Input, Label, Button). This <select> is a raw HTML element with hand-written Tailwind classes for the border/background/text, which won't automatically respect dark mode tokens or any future design-system updates.

Consider replacing it with the project's Select / SelectContent / SelectItem / SelectTrigger pattern used elsewhere in the codebase.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

  baseUrl = 'http://localhost:11434'
): Promise<OllamaModelInfo> {
  const cacheKey = `${modelName}@${baseUrl}`
  const cached = ollamaModelInfoCache.get(cacheKey)

P2 Module-level cache ineffective in serverless/edge deployments

ollamaModelInfoCache is a module-level Map. In serverless environments (Next.js API routes deployed to Vercel, AWS Lambda, etc.) each invocation may spawn a fresh process where the module is re-initialized, so the cache is effectively a no-op. Additionally, multiple concurrent invocations will bypass the cache and issue redundant /api/show calls.

If the project already uses a shared cache layer (e.g. Redis via lib/core/storage), consider using it here. Otherwise, this is worth documenting as a known limitation so future maintainers aren't surprised.
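For reference, a minimal TTL cache of the kind described (a module-level Map; names illustrative, not the project's implementation). Hits only occur within one warm Node process; each serverless cold start re-initializes the Map, which is exactly the limitation noted above.

```typescript
// Minimal module-level TTL cache (illustrative). Works in a long-lived Node
// process; re-initialized on every serverless cold start.
const cache = new Map<string, { value: unknown; expiresAt: number }>()

function cacheGet<T>(key: string, now = Date.now()): T | undefined {
  const entry = cache.get(key)
  if (!entry || entry.expiresAt <= now) {
    cache.delete(key) // drop expired entries lazily
    return undefined
  }
  return entry.value as T
}

function cacheSet(key: string, value: unknown, ttlMs: number, now = Date.now()): void {
  cache.set(key, { value, expiresAt: now + ttlMs })
}
```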

)
results = [...openaiResults, ...ollamaResults]
.sort((a, b) => a.distance - b.distance)
.slice(0, validatedData.topK)

Cross-provider normalization skips single-result sets causing unfair ranking

Low Severity

normalizeScores returns single-item arrays unchanged (if (items.length <= 1) return items). In cross-provider search, this means if one provider returns only 1 result, its raw distance (e.g., 0.5) is sorted alongside the other provider's normalized 0–1 distances. The raw score from one embedding space isn't comparable to normalized scores from another, producing incorrect ranking when merged.
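One possible shape of the fix, sketched below (illustrative, not the PR's actual normalizeScores): always min-max normalize, mapping a degenerate set — a single item, or all-identical distances — to 0 so every provider's results land on a common 0–1 scale before merging. Treating a lone result as best-in-set (0) is itself a design choice worth validating.

```typescript
// Illustrative min-max normalization that also handles degenerate sets.
// Not the PR's actual implementation.
interface Scored {
  id: string
  distance: number
}

function normalizeScores(items: Scored[]): Scored[] {
  if (items.length === 0) return items
  const distances = items.map((i) => i.distance)
  const min = Math.min(...distances)
  const range = Math.max(...distances) - min
  return items.map((i) => ({
    ...i,
    // With one item (or identical distances) range is 0: map to 0 so the
    // merged cross-provider sort stays on a single 0–1 scale.
    distance: range === 0 ? 0 : (i.distance - min) / range,
  }))
}
```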


// Store ollamaBaseUrl inside chunkingConfig JSONB to avoid a schema migration
chunkingConfig: data.ollamaBaseUrl
? { ...data.chunkingConfig, ollamaBaseUrl: data.ollamaBaseUrl }
: data.chunkingConfig,

Returned chunkingConfig omits stored ollamaBaseUrl field

Low Severity

createKnowledgeBase stores ollamaBaseUrl inside chunkingConfig JSONB in the database (line 104–106), but the returned object uses the original data.chunkingConfig without ollamaBaseUrl (line 123). This means the API response from KB creation reflects a different chunkingConfig than what's persisted, which could cause inconsistencies if consumers cache or rely on the returned value.


@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 2 potential issues.

There are 4 total unresolved issues (including 2 from previous reviews).



let openaiQueryVector: string | null = null
if (hasQuery && openaiKbIds.length > 0) {
  const emb = await generateSearchEmbedding(validatedData.query!, undefined, workspaceId)
  openaiQueryVector = JSON.stringify(emb)

OpenAI search embedding ignores KB's configured model

High Severity

The search route generates a single OpenAI query embedding using the hardcoded default text-embedding-3-small, ignoring each KB's actual embeddingModel. This diff newly allows text-embedding-3-large in the creation schema, but search always calls generateSearchEmbedding(query, undefined, workspaceId) where undefined defaults to text-embedding-3-small. A KB created with text-embedding-3-large (3072 dimensions) would fail at search time because the query vector (1536 dimensions) can't be compared against stored vectors via pgvector's <=> operator. The Ollama path correctly generates per-model embeddings, but the OpenAI path does not.
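A sketch of the suggested direction (hypothetical helper, not the PR's code): group KB ids by their configured embedding model so the route can generate one query embedding per model, mirroring what the Ollama path already does.

```typescript
// Hypothetical helper: group knowledge bases by embedding model so the
// search route generates one query embedding per model instead of a single
// hardcoded default.
interface KBInfo {
  id: string
  embeddingModel: string
}

function groupKbsByModel(kbs: KBInfo[]): Map<string, string[]> {
  const groups = new Map<string, string[]>()
  for (const kb of kbs) {
    const ids = groups.get(kb.embeddingModel) ?? []
    ids.push(kb.id)
    groups.set(kb.embeddingModel, ids)
  }
  return groups
}
```

The route would then call generateSearchEmbedding once per map key, so a text-embedding-3-large KB gets a 3072-dimension query vector instead of the 1536-dimension default.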


tokenCount: Math.ceil(chunk.text.length / 4),
embedding: embeddings[chunkIndex] || null,
embeddingModel: 'text-embedding-3-small',
embeddingModel: embeddingModelName,

Stored token count uses wrong ratio for Ollama

Low Severity

The tokenCount in embedding records is always calculated as Math.ceil(chunk.text.length / 4), using the OpenAI token estimation ratio. For Ollama models, the TextChunker uses a ratio of 3 characters per token, so the stored tokenCount is inconsistent with how the chunker actually estimated tokens. This affects any downstream analytics or display that relies on per-chunk token counts.
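A minimal sketch of a provider-consistent estimate (illustrative function, not the project's code), using the ratios cited above — ~4 chars/token for OpenAI models, ~3 for Ollama:

```typescript
// Illustrative: keep the stored tokenCount on the same chars-per-token ratio
// the chunker used for the given provider.
function estimateTokenCount(text: string, provider: 'openai' | 'ollama'): number {
  const charsPerToken = provider === 'ollama' ? 3 : 4
  return Math.ceil(text.length / charsPerToken)
}
```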


3 participants