58 commits
b7e377e
v0.5.91: docs i18n, turborepo upgrade
waleedlatif1 Feb 16, 2026
da46a38
v0.5.92: shortlinks, copilot scrolling stickiness, pagination
waleedlatif1 Feb 17, 2026
fdca736
v0.5.93: NextJS config changes, MCP and Blocks whitelisting, copilot …
waleedlatif1 Feb 18, 2026
15ace5e
v0.5.94: vercel integration, folder insertion, migrated tracking redi…
waleedlatif1 Feb 19, 2026
67aa4bb
v0.5.95: gemini 3.1 pro, cloudflare, dataverse, revenuecat, redis, up…
waleedlatif1 Feb 20, 2026
34d92fa
v0.5.96: sim oauth provider, slack ephemeral message tool and blockki…
waleedlatif1 Feb 21, 2026
115f04e
v0.5.97: oidc discovery for copilot mcp
waleedlatif1 Feb 21, 2026
0d86ea0
v0.5.98: change detection improvements, rate limit and code execution…
waleedlatif1 Feb 22, 2026
af59234
v0.5.99: local dev improvements, live workflow logs in terminal
waleedlatif1 Feb 23, 2026
67f8a68
v0.5.100: multiple credentials, 40% speedup, gong, attio, audit log i…
waleedlatif1 Feb 25, 2026
4fd0989
v0.5.101: circular dependency mitigation, confluence enhancements, go…
waleedlatif1 Feb 26, 2026
0d2e6ff
v0.5.102: new integrations, new tools, ci speedups, memory leak instr…
waleedlatif1 Feb 28, 2026
e07e3c3
v0.5.103: memory util instrumentation, API docs, amplitude, google pa…
waleedlatif1 Mar 2, 2026
f1ec5fe
v0.5.104: memory improvements, nested subflows, careers page redirect…
waleedlatif1 Mar 4, 2026
70c36cb
v0.5.105: slack remove reaction, nested subflow locks fix, servicenow…
waleedlatif1 Mar 5, 2026
3ce9475
v0.5.106: condition block and legacy kbs fixes, GPT 5.4
icecrasher321 Mar 6, 2026
6586c5c
v0.5.107: new reddit, slack tools
waleedlatif1 Mar 6, 2026
8c0a2e0
v0.5.108: workflow input params in agent tools, bun upgrade, dropdown…
icecrasher321 Mar 7, 2026
ecd3536
v0.5.109: obsidian and evernote integrations, slack fixes, remove mem…
waleedlatif1 Mar 9, 2026
1c2c2c6
v0.5.110: webhook execution speedups, SSRF patches
waleedlatif1 Mar 11, 2026
36612ae
v0.5.111: non-polling webhook execs off trigger.dev, gmail subject he…
icecrasher321 Mar 12, 2026
e9bdc57
v0.5.112: trace spans improvements, fathom integration, jira fixes, c…
waleedlatif1 Mar 12, 2026
4c12914
v0.5.113: jira, ashby, google ads, grain updates
icecrasher321 Mar 13, 2026
255640f
feat(knowledge): add Ollama embedding types
teedonk Mar 22, 2026
b043bc2
feat(knowledge): add per-KB dynamic pgvector tables
teedonk Mar 22, 2026
61f05a7
feat(knowledge): add Ollama embedding generation with retry and smart…
teedonk Mar 22, 2026
546dd7c
feat(knowledge): store ollamaBaseUrl in KB config
teedonk Mar 22, 2026
616761d
feat(chunkers): add embeddingModel to ChunkerOptions
teedonk Mar 22, 2026
133f326
feat(chunkers): add model-aware token estimation ratio
teedonk Mar 22, 2026
2693251
feat(knowledge): pass embeddingModel to all chunkers
teedonk Mar 22, 2026
18e7ac2
feat(knowledge): add Ollama chunk size and overlap capping
teedonk Mar 22, 2026
983efc3
feat(knowledge): add Ollama model validation and auto-detect dimension
teedonk Mar 22, 2026
53a1423
feat(knowledge): update KB detail API for Ollama support
teedonk Mar 22, 2026
0b5d218
feat(knowledge): add provider routing and cross-provider score normal…
teedonk Mar 22, 2026
606b70b
feat(knowledge): add Ollama provider selection UI
teedonk Mar 22, 2026
da36fcd
feat(knowledge): add Ollama params to create KB hook
teedonk Mar 22, 2026
b9e6ab7
test(knowledge): update KB detail tests for Ollama support
teedonk Mar 22, 2026
b1e92b8
test(knowledge): update search tests for provider routing
teedonk Mar 22, 2026
3698a04
fix(knowledge): separate validation from runtime model info to preven…
teedonk Mar 22, 2026
988158e
fix(knowledge): parameterize query vector and accept transaction handle
teedonk Mar 22, 2026
166a7f3
fix(knowledge): wrap Ollama delete+insert in transaction with status …
teedonk Mar 22, 2026
2f30934
fix(knowledge): clean up orphaned KB on table creation failure
teedonk Mar 22, 2026
f88e9f9
fix(knowledge): replace native select with project Select component
teedonk Mar 23, 2026
863e497
fix(knowledge): sort and trim Ollama results to topK
teedonk Mar 23, 2026
546061e
fix(knowledge): restrict Ollama base URL to localhost and private net…
teedonk Mar 23, 2026
00b3c7d
fix(knowledge): filter deleted documents from Ollama search and dedup…
teedonk Mar 23, 2026
075b005
fix(knowledge): use OLLAMA_URL env var and allow Docker hostnames in …
teedonk Mar 23, 2026
ea59193
fix(knowledge): align dynamic table SQL types with shared schema
teedonk Mar 23, 2026
ee3cc30
fix(knowledge): remove hardcoded OpenAI defaults from updateKnowledge…
teedonk Mar 23, 2026
e6d0a60
fix(knowledge): add enabled field and fix token ratio for Ollama embe…
teedonk Mar 23, 2026
0812f3b
fix(knowledge): remove immutable fields from update schema
teedonk Mar 23, 2026
fd8d2b3
fix(knowledge): strengthen SSRF validation for Ollama base URL
teedonk Mar 23, 2026
5c872c4
fix(knowledge): remove dead code and fix Record type in search route
teedonk Mar 23, 2026
4571299
fix(knowledge): add missing dynamic-tables mock in test
teedonk Mar 23, 2026
322dc4e
fix(knowledge): block IPv6-mapped IPv4 SSRF bypass and fix ::1 hostna…
teedonk Mar 23, 2026
ef84871
fix(knowledge): use KB embedding model for search and fix single-resu…
teedonk Mar 23, 2026
d308fe0
fix(knowledge): preserve ollamaBaseUrl when updating chunkingConfig
teedonk Mar 23, 2026
aa452f4
fix(knowledge): validate Ollama auto-detected dimension against bounds
teedonk Mar 23, 2026
5 changes: 5 additions & 0 deletions apps/sim/app/api/knowledge/[id]/route.test.ts
@@ -15,6 +15,7 @@ const { mockGetSession, mockDbChain } = vi.hoisted(() => {
limit: vi.fn().mockReturnThis(),
update: vi.fn().mockReturnThis(),
set: vi.fn().mockReturnThis(),
execute: vi.fn().mockResolvedValue(undefined),
}
return { mockGetSession, mockDbChain }
})
@@ -98,6 +99,10 @@ vi.mock('@sim/db/schema', () => ({

vi.mock('@/lib/audit/log', () => auditMock)

vi.mock('@/lib/knowledge/dynamic-tables', () => ({
dropKBEmbeddingTable: vi.fn().mockResolvedValue(undefined),
}))

vi.mock('@/lib/knowledge/service', () => ({
getKnowledgeBaseById: vi.fn(),
updateKnowledgeBase: vi.fn(),
5 changes: 3 additions & 2 deletions apps/sim/app/api/knowledge/[id]/route.ts
@@ -5,6 +5,7 @@ import { AuditAction, AuditResourceType, recordAudit } from '@/lib/audit/log'
import { checkSessionOrInternalAuth } from '@/lib/auth/hybrid'
import { PlatformEvents } from '@/lib/core/telemetry'
import { generateRequestId } from '@/lib/core/utils/request'
import { dropKBEmbeddingTable } from '@/lib/knowledge/dynamic-tables'
import {
deleteKnowledgeBase,
getKnowledgeBaseById,
@@ -25,8 +26,6 @@ const logger = createLogger('KnowledgeBaseByIdAPI')
const UpdateKnowledgeBaseSchema = z.object({
name: z.string().min(1, 'Name is required').optional(),
description: z.string().optional(),
embeddingModel: z.literal('text-embedding-3-small').optional(),
embeddingDimension: z.literal(1536).optional(),
workspaceId: z.string().nullable().optional(),
chunkingConfig: z
.object({
@@ -200,6 +199,8 @@ export async function DELETE(
}

await deleteKnowledgeBase(id, requestId)
// Drop the per-KB embedding table if this was an Ollama KB (no-op for OpenAI KBs)
await dropKBEmbeddingTable(id)

try {
PlatformEvents.knowledgeBaseDeleted({
140 changes: 137 additions & 3 deletions apps/sim/app/api/knowledge/route.ts
@@ -5,7 +5,17 @@ import { AuditAction, AuditResourceType, recordAudit } from '@/lib/audit/log'
import { getSession } from '@/lib/auth'
import { PlatformEvents } from '@/lib/core/telemetry'
import { generateRequestId } from '@/lib/core/utils/request'
import { createKnowledgeBase, getKnowledgeBases } from '@/lib/knowledge/service'
import {
createKBEmbeddingTable,
dropKBEmbeddingTable,
parseEmbeddingModel,
} from '@/lib/knowledge/dynamic-tables'
import { getOllamaBaseUrl, validateOllamaModel } from '@/lib/knowledge/embeddings'
import {
createKnowledgeBase,
deleteKnowledgeBase,
getKnowledgeBases,
} from '@/lib/knowledge/service'

const logger = createLogger('KnowledgeBaseAPI')

@@ -21,8 +31,66 @@ const CreateKnowledgeBaseSchema = z.object({
name: z.string().min(1, 'Name is required'),
description: z.string().optional(),
workspaceId: z.string().min(1, 'Workspace ID is required'),
embeddingModel: z.literal('text-embedding-3-small').default('text-embedding-3-small'),
embeddingDimension: z.literal(1536).default(1536),
embeddingModel: z
.union([
z.literal('text-embedding-3-small'),
z.literal('text-embedding-3-large'),
z.string().regex(/^ollama\/.+/, 'Ollama models must be prefixed with "ollama/"'),
])
.default('text-embedding-3-small'),
embeddingDimension: z.number().int().min(64).max(8192).default(1536),
ollamaBaseUrl: z
.string()
.url('Ollama base URL must be a valid URL')
.refine(
(url) => {
try {
const parsed = new URL(url)
// Only allow http/https schemes
if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
return false
}
const hostname = parsed.hostname.toLowerCase()
// Block known cloud metadata endpoints
if (hostname === '169.254.169.254' || hostname === 'metadata.google.internal') {
return false
}
// Block IPv6 addresses (except loopback) — prevents IPv6-mapped IPv4 bypass
// URL.hostname keeps brackets for IPv6, e.g. "[::ffff:169.254.169.254]"
if (hostname.startsWith('[') && hostname !== '[::1]') {
return false
}
// Allow localhost, loopback, and private network ranges
if (
hostname === 'localhost' ||
hostname === '[::1]' ||
hostname.startsWith('127.') ||
hostname.startsWith('10.') ||
hostname.startsWith('192.168.')
) {
return true
}
// Allow 172.16.0.0 – 172.31.255.255
if (hostname.startsWith('172.')) {
const second = Number.parseInt(hostname.split('.')[1], 10)
if (second >= 16 && second <= 31) return true
}

SSRF bypass via hostname prefix string matching

Medium Severity

The ollamaBaseUrl SSRF validation checks hostnames with startsWith('10.'), startsWith('127.'), startsWith('192.168.'), and startsWith('172.') to allow private IPs. However, these checks also match public domain names like 10.evil.com, 127.evil.com, 192.168.evil.com, or 172.16.evil.com. Such domains pass validation but can resolve to any IP, including cloud metadata endpoints (e.g., 169.254.169.254), enabling server-side request forgery against internal services.
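
One possible way to close the prefix-matching gap, sketched under assumptions: parse the hostname as a strict dotted-quad IPv4 literal first and only then compare octets against the private ranges. The helper names `parseIPv4Literal` and `isPrivateIPv4Host` are hypothetical and do not exist in this PR; they would replace the `startsWith(...)` checks inside the `refine` callback. This only addresses the string-matching issue. Hostnames that are still allowed by name (for example Docker service names) would need DNS-level checks that are out of scope here.

```typescript
// Hypothetical helpers, not part of the PR: a sketch of literal-based matching.
type IPv4Octets = [number, number, number, number]

// Returns the four octets when `hostname` is a strict dotted-quad IPv4
// literal (e.g. "10.0.0.5"), otherwise null (e.g. "10.evil.com").
function parseIPv4Literal(hostname: string): IPv4Octets | null {
  const match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(hostname)
  if (!match) return null
  const octets = match.slice(1, 5).map((part) => Number.parseInt(part, 10))
  if (octets.some((octet) => octet > 255)) return null
  return octets as IPv4Octets
}

// True only for IPv4 literals in the loopback or RFC 1918 private ranges.
function isPrivateIPv4Host(hostname: string): boolean {
  const octets = parseIPv4Literal(hostname)
  if (!octets) return false
  const [first, second] = octets
  return (
    first === 127 || // 127.0.0.0/8 loopback
    first === 10 || // 10.0.0.0/8
    (first === 192 && second === 168) || // 192.168.0.0/16
    (first === 172 && second >= 16 && second <= 31) // 172.16.0.0/12
  )
}
```

With this shape, `10.evil.com` and `172.16.evil.com` are rejected because they are not IPv4 literals, while `10.0.0.5` and `172.31.255.255` are still accepted.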


// Allow Docker service hostnames (no dots = not a public domain)
// e.g. "ollama", "host.docker.internal"
if (!hostname.includes('.') || hostname.endsWith('.internal')) {
return true
}
return false
} catch {
return false
}
},
{
message:
'Ollama base URL must point to localhost, a private network address, or a Docker service hostname',
}
)
.optional(),
chunkingConfig: z
.object({
/** Maximum chunk size in tokens (1 token ≈ 4 characters) */
@@ -89,13 +157,79 @@ export async function POST(req: NextRequest) {
try {
const validatedData = CreateKnowledgeBaseSchema.parse(body)

const { provider, modelName } = parseEmbeddingModel(validatedData.embeddingModel)

// For Ollama models, validate the model is available and auto-detect dimension
let effectiveDimension = validatedData.embeddingDimension
if (provider === 'ollama') {
const ollamaBaseUrl = getOllamaBaseUrl(validatedData.ollamaBaseUrl)
try {
const modelInfo = await validateOllamaModel(modelName, ollamaBaseUrl)

// Auto-correct dimension if the model reports a different one
if (modelInfo.embeddingLength && modelInfo.embeddingLength !== effectiveDimension) {
if (modelInfo.embeddingLength < 64 || modelInfo.embeddingLength > 8192) {
return NextResponse.json(
{
error: `Ollama model "${modelName}" reported an unsupported embedding dimension (${modelInfo.embeddingLength}). Supported range: 64–8192.`,
},
{ status: 400 }
)
}
logger.info(
`[${requestId}] Auto-correcting embedding dimension from ${effectiveDimension} ` +
`to ${modelInfo.embeddingLength} (reported by Ollama model ${modelName})`
)
effectiveDimension = modelInfo.embeddingLength
}
} catch {
return NextResponse.json(
{
error:
`Cannot reach Ollama at ${ollamaBaseUrl} or model "${modelName}" is not available. ` +
`Make sure Ollama is running and the model is pulled (ollama pull ${modelName}).`,
},
{ status: 400 }
)
}
}

const createData = {
...validatedData,
embeddingDimension: effectiveDimension,
userId: session.user.id,
}

const newKnowledgeBase = await createKnowledgeBase(createData, requestId)

if (provider === 'ollama') {
try {
await createKBEmbeddingTable(newKnowledgeBase.id, effectiveDimension)
} catch (tableError) {
logger.error(
`[${requestId}] Failed to create embedding table for KB ${newKnowledgeBase.id}`,
tableError
)
// Clean up the orphaned KB row and any partially-created table
try {
await dropKBEmbeddingTable(newKnowledgeBase.id)
await deleteKnowledgeBase(newKnowledgeBase.id, requestId)
logger.info(
`[${requestId}] Cleaned up orphaned KB ${newKnowledgeBase.id} after table creation failure`
)
} catch (cleanupError) {
logger.error(
`[${requestId}] Failed to clean up orphaned KB ${newKnowledgeBase.id}`,
cleanupError
)
}
return NextResponse.json(
{ error: 'Failed to create embedding storage. Please try again.' },
{ status: 500 }
)
}
Comment on lines 203 to +230

P1 Orphaned KB row when embedding table creation fails

createKnowledgeBase persists the KB row to the database at line 136 before createKBEmbeddingTable is called. If createKBEmbeddingTable throws (e.g. the pgvector extension isn't enabled or a naming collision occurs), the error is re-thrown and the request returns 500 — but the KB record is left behind in the database without a corresponding embedding table. Any subsequent document upload to this KB will fail with a table-not-found error and the user has no way to fix it from the UI.

A safe fix is to delete the orphaned KB row in the catch block before re-throwing:

} catch (tableError) {
  logger.error(...)
  // Clean up orphaned KB row
  await deleteKnowledgeBase(newKnowledgeBase.id, requestId).catch(() => {})
  throw tableError
}

}

try {
PlatformEvents.knowledgeBaseCreated({
knowledgeBaseId: newKnowledgeBase.id,
63 changes: 38 additions & 25 deletions apps/sim/app/api/knowledge/search/route.test.ts
@@ -25,6 +25,9 @@ const {
mockGetQueryStrategy,
mockGenerateSearchEmbedding,
mockGetDocumentNamesByIds,
mockParseEmbeddingModel,
mockSearchKBTable,
mockSearchKBTableTagOnly,
} = vi.hoisted(() => ({
mockDbChain: {
select: vi.fn().mockReturnThis(),
@@ -47,6 +50,9 @@ const {
mockGetQueryStrategy: vi.fn(),
mockGenerateSearchEmbedding: vi.fn(),
mockGetDocumentNamesByIds: vi.fn(),
mockParseEmbeddingModel: vi.fn(),
mockSearchKBTable: vi.fn(),
mockSearchKBTableTagOnly: vi.fn(),
}))

vi.mock('drizzle-orm', () => ({
@@ -126,6 +132,16 @@ vi.mock('./utils', () => ({
},
}))

vi.mock('@/lib/knowledge/dynamic-tables', () => ({
parseEmbeddingModel: mockParseEmbeddingModel,
searchKBTable: mockSearchKBTable,
searchKBTableTagOnly: mockSearchKBTableTagOnly,
}))

vi.mock('@/lib/knowledge/embeddings', () => ({
generateSearchEmbedding: mockGenerateSearchEmbedding,
}))

import { estimateTokenCount } from '@/lib/tokenization/estimators'
import { POST } from '@/app/api/knowledge/search/route'
import { calculateCost } from '@/providers/utils'
@@ -163,6 +179,18 @@ describe('Knowledge Search API Route', () => {
}
})

// KB config fetch: db.select().from().where() resolves to default single-KB config
mockDbChain.where.mockResolvedValue([
{ id: 'kb-123', embeddingModel: 'text-embedding-3-small', chunkingConfig: {} },
])

mockParseEmbeddingModel.mockReturnValue({
provider: 'openai',
modelName: 'text-embedding-3-small',
})
mockSearchKBTable.mockResolvedValue([])
mockSearchKBTableTagOnly.mockResolvedValue([])

mockHandleTagOnlySearch.mockClear()
mockHandleVectorOnlySearch.mockClear()
mockHandleTagAndVectorSearch.mockClear()
@@ -275,6 +303,11 @@ describe('Knowledge Search API Route', () => {
.mockResolvedValueOnce({ hasAccess: true, knowledgeBase: multiKbs[0] })
.mockResolvedValueOnce({ hasAccess: true, knowledgeBase: multiKbs[1] })

mockDbChain.where.mockResolvedValue([
{ id: 'kb-123', embeddingModel: 'text-embedding-3-small', chunkingConfig: {} },
{ id: 'kb-456', embeddingModel: 'text-embedding-3-small', chunkingConfig: {} },
])

mockDbChain.limit.mockResolvedValue([])

mockHandleVectorOnlySearch.mockResolvedValue(mockSearchResults)
@@ -946,6 +979,11 @@ describe('Knowledge Search API Route', () => {

mockHandleTagOnlySearch.mockResolvedValue(mockTaggedResults)

mockDbChain.where.mockResolvedValue([
{ id: 'kb-123', embeddingModel: 'text-embedding-3-small', chunkingConfig: {} },
{ id: 'kb-456', embeddingModel: 'text-embedding-3-small', chunkingConfig: {} },
])

mockDbChain.limit.mockResolvedValueOnce(mockTagDefinitions)

const req = createMockRequest('POST', multiKbTagData)
@@ -1003,13 +1041,6 @@ describe('Knowledge Search API Route', () => {
'doc-active': 'Active Document.pdf',
})

const mockTagDefs = {
select: vi.fn().mockReturnThis(),
from: vi.fn().mockReturnThis(),
where: vi.fn().mockResolvedValue([]),
}
mockDbChain.select.mockReturnValueOnce(mockTagDefs)

const req = createMockRequest('POST', {
knowledgeBaseIds: ['kb-123'],
query: 'test query',
@@ -1072,15 +1103,6 @@ describe('Knowledge Search API Route', () => {
'doc-active-tagged': 'Active Tagged Document.pdf',
})

const mockTagDefs = {
select: vi.fn().mockReturnThis(),
from: vi.fn().mockReturnThis(),
where: vi
.fn()
.mockResolvedValue([{ tagSlot: 'tag1', displayName: 'tag1', fieldType: 'text' }]),
}
mockDbChain.select.mockReturnValueOnce(mockTagDefs)

const req = createMockRequest('POST', {
knowledgeBaseIds: ['kb-123'],
tagFilters: [{ tagName: 'tag1', value: 'api', fieldType: 'text', operator: 'eq' }],
@@ -1145,15 +1167,6 @@ describe('Knowledge Search API Route', () => {
'doc-active-combined': 'Active Combined Search.pdf',
})

const mockTagDefs = {
select: vi.fn().mockReturnThis(),
from: vi.fn().mockReturnThis(),
where: vi
.fn()
.mockResolvedValue([{ tagSlot: 'tag1', displayName: 'tag1', fieldType: 'text' }]),
}
mockDbChain.select.mockReturnValueOnce(mockTagDefs)

const req = createMockRequest('POST', {
knowledgeBaseIds: ['kb-123'],
query: 'relevant content',