mirror of
https://github.com/alkimake/paperclip.git
synced 2026-06-15 18:30:39 +09:00
Add full company search page (#5293)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Operators need to find work, documents, agents, projects, comments, and activity across a company without jumping through separate surfaces.
> - The existing Command-K flow was useful for fast navigation but not enough for deeper company-wide discovery.
> - Search also needs company-scoped backend contracts, query cost controls, and indexed document matching so it stays safe as company data grows.
> - This pull request adds a full company search API and a dedicated board search page that Command-K can hand off to.
> - The benefit is a single searchable control-plane surface with richer result context, recents, highlights, and test coverage across server and UI behavior.

## What Changed

- Added a company-scoped search endpoint/service with query validation, rate limiting, text matching, fuzzy title matching, and result typing shared through `@paperclipai/shared`.
- Added idempotent search migrations for document search indexes and fuzzy matching support.
- Added the full `/companies/:companyKey/search` UI, search result row components, highlighted snippets, recent searches, and sidebar/Command-K handoff.
- Added Storybook coverage for search surfaces and Vitest coverage for server search behavior, rate limiting, route generation, Command-K behavior, and the search page.
- Addressed Greptile findings by renaming the no-match SQL helper, applying search pagination after cross-type merge sorting, and lazy-initializing the default search service so unrelated route-test mocks do not need to know about it.
- Merged current `public-gh/master` and renumbered the search migrations behind upstream `0078_white_darwin`: search indexes are now `0079_company_search_document_indexes` and fuzzy matching is `0080_company_search_fuzzystrmatch`.
## Verification

- `git fetch public-gh master`
- `git diff --check public-gh/master...HEAD`
- `git diff --name-only public-gh/master...HEAD | rg '^pnpm-lock\.yaml$' || true` produced no output before opening the PR.
- `pnpm run preflight:workspace-links && pnpm exec vitest run server/src/__tests__/company-search-service.test.ts server/src/__tests__/company-search-rate-limit-routes.test.ts ui/src/pages/Search.test.tsx ui/src/components/CommandPalette.test.tsx ui/src/lib/company-routes.test.ts` passed: 5 files, 25 tests.
- `pnpm --filter @paperclipai/shared typecheck && pnpm --filter @paperclipai/db typecheck && pnpm --filter @paperclipai/server typecheck && pnpm --filter @paperclipai/ui typecheck` passed.
- `pnpm exec vitest run server/src/__tests__/company-search-service.test.ts server/src/__tests__/company-search-rate-limit-routes.test.ts && pnpm --filter @paperclipai/server typecheck` passed after Greptile pagination fixes.
- `pnpm exec vitest run server/src/__tests__/issue-agent-mutation-ownership-routes.test.ts server/src/__tests__/company-search-rate-limit-routes.test.ts server/src/__tests__/company-search-service.test.ts && pnpm --filter @paperclipai/server typecheck` passed after the CI mock fix.
- After resolving the migration conflict with current `public-gh/master`: `pnpm --filter @paperclipai/db typecheck && pnpm exec vitest run server/src/__tests__/company-search-service.test.ts server/src/__tests__/company-search-rate-limit-routes.test.ts && pnpm --filter @paperclipai/server typecheck` passed.
- DB migration numbering check passed as part of `@paperclipai/db` typecheck.
- UI states are covered by the added Storybook stories in `ui/storybook/stories/search.stories.tsx`.
- GitHub reports the PR merge state as `CLEAN` on head `18e54fa8`.
- GitHub PR checks are green on head `18e54fa8`: policy, verify, serialized server shards 1/4 through 4/4, e2e, canary dry run, Snyk, and Greptile Review.
## Risks

- Search ranking and snippets are new user-facing behavior, so reviewers should check whether result ordering feels right on real company data.
- Search touches broad company data, so company scoping and query cost/rate-limit behavior should be reviewed carefully.
- The migrations add search indexes/extensions; they are idempotent with `IF NOT EXISTS` for users who may have applied an earlier branch migration number.

> ROADMAP.md checked. This PR adds a focused board search surface and does not duplicate an open roadmap item.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled shell/git/GitHub CLI session with medium reasoning effort. Existing branch commits were produced across prior agent sessions; this packaging pass verified, opened the PR, addressed Greptile findings, resolved migration conflicts after upstream PRs landed, and got PR checks green.

## Checklist

- [x] I have included a thinking path that traces from project context to this change
- [x] I have specified the model used (with version and capability details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
parent 424e81d087
commit 320fd5d23b
31 changed files with 3672 additions and 4 deletions
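The commit message mentions applying pagination after cross-type merge sorting, and the tests in this commit assert a per-branch fetch window consistent with `limit + offset + 1`, capped at a maximum. The following is a minimal TypeScript sketch of that idea, not the service's actual code: `Hit`, `branchFetchLimit`, and `paginateMerged` are hypothetical names, and the sort key (score, then recency) is an assumption made for illustration.

```typescript
// Hypothetical result shape for illustration; the real types live in @paperclipai/shared.
type Hit = { id: string; type: "issue" | "agent" | "project"; score: number; updatedAt: number };

const MAX_LIMIT = 50; // clamp value asserted by the validation test
const MAX_OFFSET = 200; // clamp value asserted by the validation test
const BRANCH_FETCH_CAP = MAX_OFFSET + MAX_LIMIT + 1;

// Assumption: each per-type branch fetches enough rows to fill the requested
// page plus one look-ahead row, never more than the cap.
function branchFetchLimit(limit: number, offset: number): number {
  return Math.min(offset + limit + 1, BRANCH_FETCH_CAP);
}

// Merge all branches first, sort globally, and only then apply offset/limit.
// Slicing each branch before merging would skip rows that belong on the page.
function paginateMerged(branches: Hit[][], limit: number, offset: number) {
  const merged = branches
    .flat()
    .sort((a, b) => b.score - a.score || b.updatedAt - a.updatedAt);
  return {
    results: merged.slice(offset, offset + limit),
    hasMore: merged.length > offset + limit,
  };
}
```

With three agents and three projects of equal score and strictly decreasing `updatedAt`, a request for `limit: 2, offset: 2` under this sketch returns the third agent and the first project, which is the ordering the offset test in this commit expects.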
@@ -0,0 +1,53 @@
import express from "express";
import request from "supertest";
import { describe, expect, it, vi } from "vitest";
import { issueRoutes } from "../routes/issues.js";
import { createCompanySearchRateLimiter } from "../services/company-search-rate-limit.js";
import type { CompanySearchQuery, CompanySearchResponse } from "@paperclipai/shared";

function createSearchResponse(query: CompanySearchQuery): CompanySearchResponse {
  return {
    query: query.q,
    normalizedQuery: query.q.trim().toLowerCase(),
    scope: query.scope,
    limit: query.limit,
    offset: query.offset,
    results: [],
    countsByType: { issue: 0, agent: 0, project: 0 },
    hasMore: false,
  };
}

describe("company search route rate limiting", () => {
  it("rejects repeated same-actor search calls before invoking search", async () => {
    const search = vi.fn(async (_companyId: string, query: CompanySearchQuery) => createSearchResponse(query));
    const app = express();
    app.use((req, _res, next) => {
      req.actor = {
        type: "agent",
        agentId: "agent-1",
        companyId: "company-1",
        source: "agent_key",
      };
      next();
    });
    app.use("/api", issueRoutes({} as never, {} as never, {
      searchService: { search },
      searchRateLimiter: createCompanySearchRateLimiter({
        maxRequests: 1,
        windowMs: 60_000,
        now: () => 1_000,
      }),
    }));

    await request(app).get("/api/companies/company-1/search?q=wizard").expect(200);
    const limited = await request(app).get("/api/companies/company-1/search?q=wizard").expect(429);

    expect(search).toHaveBeenCalledTimes(1);
    expect(limited.body).toMatchObject({
      error: "Search rate limit exceeded",
      retryAfterSeconds: 60,
    });
    expect(limited.headers["retry-after"]).toBe("60");
  });
});
454
server/src/__tests__/company-search-service.test.ts
Normal file
@@ -0,0 +1,454 @@
import { randomUUID } from "node:crypto";
import { sql } from "drizzle-orm";
import { afterAll, afterEach, beforeAll, describe, expect, it } from "vitest";
import {
  agents,
  companies,
  createDb,
  documents,
  issueComments,
  issueDocuments,
  issues,
  projects,
} from "@paperclipai/db";
import { companySearchQuerySchema, COMPANY_SEARCH_MAX_QUERY_LENGTH } from "@paperclipai/shared";
import {
  getEmbeddedPostgresTestSupport,
  startEmbeddedPostgresTestDatabase,
} from "./helpers/embedded-postgres.js";
import {
  COMPANY_SEARCH_BRANCH_FETCH_LIMIT,
  companySearchBranchFetchLimit,
  companySearchService,
} from "../services/company-search.js";

const embeddedPostgresSupport = await getEmbeddedPostgresTestSupport();
const describeEmbeddedPostgres = embeddedPostgresSupport.supported ? describe : describe.skip;

if (!embeddedPostgresSupport.supported) {
  console.warn(
    `Skipping embedded Postgres company search tests on this host: ${embeddedPostgresSupport.reason ?? "unsupported environment"}`,
  );
}

describe("company search query validation", () => {
  it("clamps query length, limit, and offset without rejecting the request", () => {
    const parsed = companySearchQuerySchema.parse({
      q: "x".repeat(COMPANY_SEARCH_MAX_QUERY_LENGTH + 50),
      limit: "500",
      offset: "9000",
      scope: "not-a-scope",
    });

    expect(parsed.q).toHaveLength(COMPANY_SEARCH_MAX_QUERY_LENGTH);
    expect(parsed.limit).toBe(50);
    expect(parsed.offset).toBe(200);
    expect(parsed.scope).toBe("all");
  });

  it("includes offset in the internal per-branch fetch window", () => {
    const lowOffset = companySearchQuerySchema.parse({ q: "needle", limit: "50", offset: "0" });
    const highOffset = companySearchQuerySchema.parse({ q: "needle", limit: "50", offset: "9000" });

    expect(companySearchBranchFetchLimit(lowOffset.limit, lowOffset.offset)).toBe(51);
    expect(companySearchBranchFetchLimit(highOffset.limit, highOffset.offset)).toBe(COMPANY_SEARCH_BRANCH_FETCH_LIMIT);
  });
});

describeEmbeddedPostgres("companySearchService", () => {
  let db!: ReturnType<typeof createDb>;
  let svc!: ReturnType<typeof companySearchService>;
  let tempDb: Awaited<ReturnType<typeof startEmbeddedPostgresTestDatabase>> | null = null;

  beforeAll(async () => {
    tempDb = await startEmbeddedPostgresTestDatabase("paperclip-company-search-");
    db = createDb(tempDb.connectionString);
    svc = companySearchService(db);
    await db.execute(sql.raw("CREATE EXTENSION IF NOT EXISTS pg_trgm"));
  }, 20_000);

  afterEach(async () => {
    await db.delete(issueDocuments);
    await db.delete(documents);
    await db.delete(issueComments);
    await db.delete(issues);
    await db.delete(projects);
    await db.delete(agents);
    await db.delete(companies);
  });

  afterAll(async () => {
    await tempDb?.cleanup();
  });

  async function createCompany(name = "Paperclip") {
    const companyId = randomUUID();
    await db.insert(companies).values({
      id: companyId,
      name,
      issuePrefix: `T${companyId.replace(/-/g, "").slice(0, 6).toUpperCase()}`,
      requireBoardApprovalForNewAgents: false,
    });
    return companyId;
  }

  async function createIssue(companyId: string, values: Partial<typeof issues.$inferInsert> = {}) {
    const id = values.id ?? randomUUID();
    await db.insert(issues).values({
      id,
      companyId,
      title: values.title ?? "Search target",
      description: values.description ?? null,
      status: values.status ?? "todo",
      priority: values.priority ?? "medium",
      identifier: values.identifier ?? null,
      hiddenAt: values.hiddenAt ?? null,
      ...values,
    });
    return id;
  }

  async function createAgent(companyId: string, values: Partial<typeof agents.$inferInsert> = {}) {
    const id = values.id ?? randomUUID();
    await db.insert(agents).values({
      id,
      companyId,
      name: values.name ?? "Search agent",
      role: values.role ?? "engineer",
      title: values.title ?? null,
      capabilities: values.capabilities ?? null,
      ...values,
    });
    return id;
  }

  async function createProject(companyId: string, values: Partial<typeof projects.$inferInsert> = {}) {
    const id = values.id ?? randomUUID();
    await db.insert(projects).values({
      id,
      companyId,
      name: values.name ?? "Search project",
      description: values.description ?? null,
      ...values,
    });
    return id;
  }

  it("ranks exact issue identifiers before weaker title matches", async () => {
    const companyId = await createCompany();
    const exactId = await createIssue(companyId, {
      identifier: "TST-42",
      title: "Backend endpoint",
    });
    await createIssue(companyId, {
      identifier: "TST-43",
      title: "TST-42 mentioned in title only",
    });

    const result = await svc.search(companyId, companySearchQuerySchema.parse({ q: "TST-42" }));

    expect(result.results[0]?.id).toBe(exactId);
    expect(result.results[0]?.matchedFields).toContain("identifier");
  });

  it("matches multiple tokens across the same issue thread and returns comment snippets", async () => {
    const companyId = await createCompany();
    const issueId = await createIssue(companyId, {
      identifier: "TST-7",
      title: "Checkout semantics",
      description: "Atomic ownership is enforced here.",
    });
    await db.insert(issueComments).values({
      companyId,
      issueId,
      body: "The ranking snippet should explain why this thread matched.",
    });

    const result = await svc.search(companyId, companySearchQuerySchema.parse({ q: "checkout snippet" }));
    const match = result.results.find((item) => item.id === issueId);

    expect(match).toBeTruthy();
    expect(match?.matchedFields).toEqual(expect.arrayContaining(["title", "comment"]));
    expect(match?.snippets.some((snippet) => /snippet/i.test(snippet.text))).toBe(true);
  });

  it("searches issue documents and returns document metadata for snippets", async () => {
    const companyId = await createCompany();
    const issueId = await createIssue(companyId, {
      identifier: "TST-8",
      title: "Adapter manager",
    });
    const documentId = randomUUID();
    await db.insert(documents).values({
      id: documentId,
      companyId,
      title: "Hermes Parser Plan",
      latestBody: "The external adapter parser should be discovered from the plugin package.",
      format: "markdown",
    });
    await db.insert(issueDocuments).values({
      companyId,
      issueId,
      documentId,
      key: "plan",
    });

    const result = await svc.search(companyId, companySearchQuerySchema.parse({ q: "Hermes parser", scope: "documents" }));

    expect(result.results).toHaveLength(1);
    expect(result.results[0]?.id).toBe(issueId);
    expect(result.results[0]?.matchedFields).toContain("document");
    expect(result.results[0]?.href).toContain("#document-plan");
    expect(result.results[0]?.snippet).toMatch(/parser/i);
  });

  it("excludes hidden issues and other companies' data", async () => {
    const companyId = await createCompany("Visible Co");
    const otherCompanyId = await createCompany("Other Co");
    const visibleId = await createIssue(companyId, {
      identifier: "VIS-1",
      title: "Visible needle",
    });
    await createIssue(companyId, {
      identifier: "HID-1",
      title: "Hidden needle",
      hiddenAt: new Date(),
    });
    await createIssue(otherCompanyId, {
      identifier: "OTH-1",
      title: "Other company needle",
    });

    const result = await svc.search(companyId, companySearchQuerySchema.parse({ q: "needle" }));

    expect(result.results.map((item) => item.id)).toEqual([visibleId]);
  });

  it("treats bare SQL wildcard characters as literals instead of match-all queries", async () => {
    const companyId = await createCompany();
    const issueId = await createIssue(companyId, {
      identifier: "TST-20",
      title: "Plain issue target",
      description: "Plain issue description",
    });
    await db.insert(issueComments).values({
      companyId,
      issueId,
      body: "Plain comment body",
    });
    const documentId = randomUUID();
    await db.insert(documents).values({
      id: documentId,
      companyId,
      title: "Plain document",
      latestBody: "Plain document body",
      format: "markdown",
    });
    await db.insert(issueDocuments).values({
      companyId,
      issueId,
      documentId,
      key: "plain",
    });
    await createAgent(companyId, {
      name: "Plain Agent",
      role: "engineer",
      capabilities: "Plain agent capabilities",
    });
    await createProject(companyId, {
      name: "Plain Project",
      description: "Plain project description",
    });

    for (const q of ["%", "_", "\\"]) {
      const result = await svc.search(companyId, companySearchQuerySchema.parse({ q }));
      expect(result.results, `q=${q}`).toEqual([]);
    }
  });

  it("matches percent characters literally across issue, comment, document, agent, and project results", async () => {
    const companyId = await createCompany();
    const issueMatchId = await createIssue(companyId, {
      identifier: "TST-21",
      title: "Release 100% checklist",
    });
    const issueDecoyId = await createIssue(companyId, {
      identifier: "TST-22",
      title: "Release 1000 checklist",
    });
    const commentMatchId = await createIssue(companyId, {
      identifier: "TST-23",
      title: "Comment literal holder",
    });
    const commentDecoyId = await createIssue(companyId, {
      identifier: "TST-24",
      title: "Comment decoy holder",
    });
    await db.insert(issueComments).values([
      {
        companyId,
        issueId: commentMatchId,
        body: "QA is 100% confident in this result.",
      },
      {
        companyId,
        issueId: commentDecoyId,
        body: "QA is 1000 confident in this result.",
      },
    ]);
    const documentMatchIssueId = await createIssue(companyId, {
      identifier: "TST-25",
      title: "Document literal holder",
    });
    const documentDecoyIssueId = await createIssue(companyId, {
      identifier: "TST-26",
      title: "Document decoy holder",
    });
    const documentMatchId = randomUUID();
    const documentDecoyId = randomUUID();
    await db.insert(documents).values([
      {
        id: documentMatchId,
        companyId,
        title: "Literal rollout",
        latestBody: "Ship 100% complete adapter support.",
        format: "markdown",
      },
      {
        id: documentDecoyId,
        companyId,
        title: "Decoy rollout",
        latestBody: "Ship 1000 complete adapter support.",
        format: "markdown",
      },
    ]);
    await db.insert(issueDocuments).values([
      {
        companyId,
        issueId: documentMatchIssueId,
        documentId: documentMatchId,
        key: "literal",
      },
      {
        companyId,
        issueId: documentDecoyIssueId,
        documentId: documentDecoyId,
        key: "decoy",
      },
    ]);
    const agentMatchId = await createAgent(companyId, {
      name: "100% Specialist",
      role: "engineer",
    });
    const agentDecoyId = await createAgent(companyId, {
      name: "1000 Specialist",
      role: "engineer",
    });
    const projectMatchId = await createProject(companyId, {
      name: "100% Launch Plan",
    });
    const projectDecoyId = await createProject(companyId, {
      name: "1000 Launch Plan",
    });

    const result = await svc.search(companyId, companySearchQuerySchema.parse({ q: "100%" }));
    const ids = result.results.map((row) => row.id);

    expect(ids).toEqual(expect.arrayContaining([
      issueMatchId,
      commentMatchId,
      documentMatchIssueId,
      agentMatchId,
      projectMatchId,
    ]));
    expect(ids).not.toEqual(expect.arrayContaining([
      issueDecoyId,
      commentDecoyId,
      documentDecoyIssueId,
      agentDecoyId,
      projectDecoyId,
    ]));
  });

  it("applies offset after merging cross-type result ranking", async () => {
    const companyId = await createCompany();
    const base = new Date("2026-01-01T00:00:00.000Z").getTime();
    const agentIds = await Promise.all([
      createAgent(companyId, { name: "Needle agent 1", updatedAt: new Date(base + 6_000) }),
      createAgent(companyId, { name: "Needle agent 2", updatedAt: new Date(base + 5_000) }),
      createAgent(companyId, { name: "Needle agent 3", updatedAt: new Date(base + 4_000) }),
    ]);
    const projectIds = await Promise.all([
      createProject(companyId, { name: "Needle project 1", updatedAt: new Date(base + 3_000) }),
      createProject(companyId, { name: "Needle project 2", updatedAt: new Date(base + 2_000) }),
      createProject(companyId, { name: "Needle project 3", updatedAt: new Date(base + 1_000) }),
    ]);

    const result = await svc.search(companyId, companySearchQuerySchema.parse({ q: "needle", limit: "2", offset: "2" }));

    expect(result.results.map((row) => row.id)).toEqual([agentIds[2], projectIds[0]]);
    expect(result.countsByType).toEqual({ issue: 0, agent: 3, project: 3 });
    expect(result.hasMore).toBe(true);
  });

  it("escapes underscore and backslash characters in issue phrase and token patterns", async () => {
    const companyId = await createCompany();
    const literalId = await createIssue(companyId, {
      identifier: "TST-27",
      title: "Literal foo_bar path c:\\tmp",
    });
    const decoyId = await createIssue(companyId, {
      identifier: "TST-28",
      title: "Decoy fooXbar path c:tmp",
    });

    for (const q of ["foo_bar", "c:\\tmp"]) {
      const result = await svc.search(companyId, companySearchQuerySchema.parse({ q, scope: "issues" }));
      const ids = result.results.map((row) => row.id);
      expect(ids, `q=${q}`).toContain(literalId);
      expect(ids, `q=${q}`).not.toContain(decoyId);
    }
  });

  it("uses pg_trgm for conservative fuzzy title matches", async () => {
    const companyId = await createCompany();
    const issueId = await createIssue(companyId, {
      identifier: "TST-9",
      title: "Onboarding wizard polish",
    });

    const result = await svc.search(companyId, companySearchQuerySchema.parse({ q: "onbordng wizard" }));

    expect(result.results[0]?.id).toBe(issueId);
    expect(result.results[0]?.matchedFields).toContain("title");
  });

  it("matches transposition typos against multi-word titles", async () => {
    const companyId = await createCompany();
    const searchIssueId = await createIssue(companyId, {
      identifier: "TST-10",
      title: "Improve search performance",
    });
    const mobileIssueId = await createIssue(companyId, {
      identifier: "TST-11",
      title: "Polish mobile navigation",
    });
    const otherIssueId = await createIssue(companyId, {
      identifier: "TST-12",
      title: "Refactor billing reports",
    });

    const transpositionCases: Array<{ query: string; expectedId: string; rejected: string }> = [
      { query: "serach", expectedId: searchIssueId, rejected: otherIssueId },
      { query: "mibile", expectedId: mobileIssueId, rejected: otherIssueId },
      { query: "mobail", expectedId: mobileIssueId, rejected: otherIssueId },
    ];

    for (const { query, expectedId, rejected } of transpositionCases) {
      const result = await svc.search(companyId, companySearchQuerySchema.parse({ q: query }));
      const ids = result.results.map((row) => row.id);
      expect(ids, `query=${query}`).toContain(expectedId);
      expect(ids, `query=${query} should not match unrelated issue`).not.toContain(rejected);
    }
  });
});
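The wildcard tests above require `%`, `_`, and `\` in user input to match literally rather than act as SQL pattern metacharacters. The helper that `company-search.ts` uses for this is not visible in this excerpt, so the following is only a minimal sketch of one common approach: `escapeLikePattern` and `containsPattern` are hypothetical names, and the sketch assumes the resulting pattern is used with PostgreSQL `LIKE`/`ILIKE`, whose default escape character is the backslash.

```typescript
// Hypothetical helper: prefix each LIKE metacharacter (\, %, _) with a
// backslash so the database treats it as a literal character.
function escapeLikePattern(input: string): string {
  return input.replace(/[\\%_]/g, (ch) => `\\${ch}`);
}

// Hypothetical helper: build a "contains" pattern around the escaped input.
// Only the outer % signs remain active wildcards.
function containsPattern(input: string): string {
  return `%${escapeLikePattern(input)}%`;
}
```

Under this sketch a query of `100%` becomes the pattern `%100\%%`, which can match "Release 100% checklist" but not "Release 1000 checklist". The pattern should still be passed as a bound parameter rather than concatenated into the SQL text.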
@@ -9,6 +9,7 @@ import {
  addIssueCommentSchema,
  acceptIssueThreadInteractionSchema,
  cancelIssueThreadInteractionSchema,
  companySearchQuerySchema,
  createIssueAttachmentMetadataSchema,
  createIssueThreadInteractionSchema,
  createIssueWorkProductSchema,
@@ -32,6 +33,8 @@ import {
  getClosedIsolatedExecutionWorkspaceMessage,
  isClosedIsolatedExecutionWorkspace,
  normalizeIssueIdentifier as normalizeIssueReferenceIdentifier,
  type CompanySearchQuery,
  type CompanySearchResponse,
  type ExecutionWorkspace,
  type SuccessfulRunHandoffState,
} from "@paperclipai/shared";
@@ -44,6 +47,7 @@ import {
  accessService,
  agentService,
  companyService,
  companySearchService,
  executionWorkspaceService,
  goalService,
  heartbeatService,
@@ -81,6 +85,10 @@ import { feedbackService } from "../services/feedback.js";
import { instanceSettingsService } from "../services/instance-settings.js";
import { environmentService } from "../services/environments.js";
import { redactSensitiveText } from "../redaction.js";
import {
  createCompanySearchRateLimiter,
  type CompanySearchRateLimiter,
} from "../services/company-search-rate-limit.js";
import {
  applyIssueExecutionPolicyTransition,
  normalizeIssueExecutionPolicy,
@@ -97,6 +105,9 @@ const updateIssueRouteSchema = updateIssueSchema.extend({

type ParsedExecutionState = NonNullable<ReturnType<typeof parseIssueExecutionState>>;
type NormalizedExecutionPolicy = NonNullable<ReturnType<typeof normalizeIssueExecutionPolicy>>;
type CompanySearchService = {
  search(companyId: string, query: CompanySearchQuery): Promise<CompanySearchResponse>;
};
type ActivityIssueRelationSummary = {
  id: string;
  identifier: string | null;
@@ -253,6 +264,23 @@ function summarizeIssueRelationForActivity(relation: {
  };
}

const defaultCompanySearchRateLimiter = createCompanySearchRateLimiter();

function companySearchRateLimitActor(req: Request, companyId: string) {
  if (req.actor.type === "agent") {
    return {
      companyId,
      actorType: "agent" as const,
      actorId: req.actor.agentId ?? req.actor.keyId ?? "unknown-agent",
    };
  }
  return {
    companyId,
    actorType: "board" as const,
    actorId: req.actor.userId ?? req.actor.source ?? "board",
  };
}

function summarizeIssueReferenceActivityDetails(input:
  | {
      addedReferencedIssues: ActivityIssueRelationSummary[];
@@ -548,6 +576,8 @@ export function issueRoutes(
        now?: Date;
      }): Promise<unknown>;
    };
    searchService?: CompanySearchService;
    searchRateLimiter?: CompanySearchRateLimiter;
    pluginWorkerManager?: PluginWorkerManager;
  } = {},
) {
@@ -559,6 +589,12 @@ export function issueRoutes(
  });
  const feedback = feedbackService(db);
  const companiesSvc = companyService(db);
  let searchSvc = opts.searchService ?? null;
  const getSearchService = () => {
    searchSvc ??= companySearchService(db);
    return searchSvc;
  };
  const searchRateLimiter = opts.searchRateLimiter ?? defaultCompanySearchRateLimiter;
  const instanceSettings = instanceSettingsService(db);
  const agentsSvc = agentService(db);
  const projectsSvc = projectService(db);
@@ -1048,6 +1084,25 @@ export function issueRoutes(
    });
  });

  router.get("/companies/:companyId/search", async (req, res) => {
    const companyId = req.params.companyId as string;
    assertCompanyAccess(req, companyId);
    const query = companySearchQuerySchema.parse(req.query);
    const rateLimit = searchRateLimiter.consume(companySearchRateLimitActor(req, companyId));
    res.setHeader("X-RateLimit-Limit", String(rateLimit.limit));
    res.setHeader("X-RateLimit-Remaining", String(rateLimit.remaining));
    if (!rateLimit.allowed) {
      res.setHeader("Retry-After", String(rateLimit.retryAfterSeconds));
      res.status(429).json({
        error: "Search rate limit exceeded",
        retryAfterSeconds: rateLimit.retryAfterSeconds,
      });
      return;
    }
    const result = await getSearchService().search(companyId, query);
    res.json(result);
  });

  router.get("/companies/:companyId/issues", async (req, res) => {
    const companyId = req.params.companyId as string;
    assertCompanyAccess(req, companyId);
63
server/src/services/company-search-rate-limit.ts
Normal file
@@ -0,0 +1,63 @@
export const COMPANY_SEARCH_RATE_LIMIT_WINDOW_MS = 60_000;
export const COMPANY_SEARCH_RATE_LIMIT_MAX_REQUESTS = 60;

export type CompanySearchRateLimitActor = {
  companyId: string;
  actorType: "agent" | "board";
  actorId: string;
};

export type CompanySearchRateLimitResult = {
  allowed: boolean;
  limit: number;
  remaining: number;
  retryAfterSeconds: number;
};

export type CompanySearchRateLimiter = {
  consume(actor: CompanySearchRateLimitActor): CompanySearchRateLimitResult;
};

export function createCompanySearchRateLimiter(options: {
  windowMs?: number;
  maxRequests?: number;
  now?: () => number;
} = {}): CompanySearchRateLimiter {
  const windowMs = options.windowMs ?? COMPANY_SEARCH_RATE_LIMIT_WINDOW_MS;
  const maxRequests = options.maxRequests ?? COMPANY_SEARCH_RATE_LIMIT_MAX_REQUESTS;
  const now = options.now ?? Date.now;
  const hitsByKey = new Map<string, number[]>();

  function key(actor: CompanySearchRateLimitActor) {
    return `${actor.companyId}:${actor.actorType}:${actor.actorId}`;
  }

  return {
    consume(actor) {
      const currentTime = now();
      const cutoff = currentTime - windowMs;
      const actorKey = key(actor);
      const recentHits = (hitsByKey.get(actorKey) ?? []).filter((hit) => hit > cutoff);

      if (recentHits.length >= maxRequests) {
        const oldestHit = recentHits[0] ?? currentTime;
        hitsByKey.set(actorKey, recentHits);
        return {
          allowed: false,
          limit: maxRequests,
          remaining: 0,
          retryAfterSeconds: Math.max(1, Math.ceil((oldestHit + windowMs - currentTime) / 1000)),
        };
      }

      recentHits.push(currentTime);
      hitsByKey.set(actorKey, recentHits);
      return {
        allowed: true,
        limit: maxRequests,
        remaining: Math.max(0, maxRequests - recentHits.length),
        retryAfterSeconds: 0,
      };
    },
  };
}
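The limiter above keeps a sliding window of hit timestamps per actor, and when the window is full it reports how long until the oldest hit expires. To make that arithmetic easier to check, here is a small standalone TypeScript sketch that mirrors the `retryAfterSeconds` expression from `consume`. The function name and parameter names are illustrative only and are not part of the commit.

```typescript
// Illustrative restatement of the retry-after expression in consume():
// time until the oldest recorded hit leaves the window, rounded up to whole
// seconds and never reported as less than one second.
function retryAfterSeconds(oldestHitMs: number, windowMs: number, nowMs: number): number {
  const msUntilOldestExpires = oldestHitMs + windowMs - nowMs;
  return Math.max(1, Math.ceil(msUntilOldestExpires / 1000));
}

// Values from the route test: clock fixed at 1_000 ms with a 60_000 ms window,
// so the second request sees the first hit as the oldest one.
const fromRouteTest = retryAfterSeconds(1_000, 60_000, 1_000); // 60

// Half a second before that hit expires, the value rounds up to one second.
const nearExpiry = retryAfterSeconds(1_000, 60_000, 60_500); // 1
```

This is why the route test, which pins `now` to `1_000` and allows a single request per 60-second window, expects both `retryAfterSeconds: 60` in the body and a `Retry-After` header of `"60"`.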
696
server/src/services/company-search.ts
Normal file
@@ -0,0 +1,696 @@
import { and, desc, eq, isNull, sql } from "drizzle-orm";
import type { SQL } from "drizzle-orm";
import type { Db } from "@paperclipai/db";
import { agents, companies, issues, projects } from "@paperclipai/db";
import {
  COMPANY_SEARCH_MAX_LIMIT,
  COMPANY_SEARCH_MAX_OFFSET,
  COMPANY_SEARCH_MAX_TOKENS,
  type CompanySearchIssueSummary,
  type CompanySearchQuery,
  type CompanySearchResponse,
  type CompanySearchResult,
  type CompanySearchResultType,
  type CompanySearchScope,
  type CompanySearchSnippet,
} from "@paperclipai/shared";

const MIN_TOKEN_LENGTH = 2;
const MIN_FUZZY_QUERY_LENGTH = 4;
const MIN_FUZZY_TOKEN_LENGTH = 4;
// Cap fuzzy edits using the shorter of (query token, title word) so common
// 4–5 letter English words don't sweep in noise (e.g. "serach" vs "each").
const FUZZY_PAIR_LONG_LENGTH = 6;
const FUZZY_PAIR_LONG_MAX_EDITS = 2;
const FUZZY_PAIR_MEDIUM_LENGTH = 5;
const FUZZY_PAIR_MEDIUM_MAX_EDITS = 1;
const FUZZY_PAIR_SHORT_MAX_EDITS = 0;
const FUZZY_IDENTIFIER_SIMILARITY_THRESHOLD = 0.45;
const SNIPPET_MAX_CHARS = 240;
export const COMPANY_SEARCH_BRANCH_FETCH_LIMIT = COMPANY_SEARCH_MAX_OFFSET + COMPANY_SEARCH_MAX_LIMIT + 1;

type IssueSearchRow = {
  id: string;
  identifier: string | null;
  title: string;
  description: string | null;
  status: string;
  priority: string;
  assigneeAgentId: string | null;
  assigneeUserId: string | null;
  projectId: string | null;
  updatedAt: Date;
  score: number | string;
  matchedFields: string[] | null;
  commentSnippet: string | null;
  commentId: string | null;
  documentSnippet: string | null;
  documentTitle: string | null;
  documentKey: string | null;
};

type SimpleSearchRow = {
  id: string;
  title: string;
  description: string | null;
  role?: string | null;
  updatedAt: Date;
|
||||
};
|
||||
|
||||
function normalizeQuery(query: string) {
|
||||
return query.trim().replace(/\s+/g, " ").toLowerCase();
|
||||
}
|
||||
|
||||
function escapeLikePattern(value: string): string {
|
||||
return value.replace(/[\\%_]/g, "\\$&");
|
||||
}
|
||||
|
||||
function tokenizeQuery(normalizedQuery: string) {
|
||||
const matches = normalizedQuery.match(/"[^"]+"|[^\s]+/g) ?? [];
|
||||
const tokens: string[] = [];
|
||||
for (const match of matches) {
|
||||
const token = match.replace(/^"|"$/g, "").replace(/^[^\p{L}\p{N}%_\\-]+|[^\p{L}\p{N}%_\\-]+$/gu, "");
|
||||
if (token.length < MIN_TOKEN_LENGTH) continue;
|
||||
if (!tokens.includes(token)) tokens.push(token);
|
||||
if (tokens.length >= COMPANY_SEARCH_MAX_TOKENS) break;
|
||||
}
|
||||
return tokens;
|
||||
}
|
||||
|
||||
function fuzzyEligibleTokens(tokens: string[]): string[] {
|
||||
return tokens.filter((token) => token.length >= MIN_FUZZY_TOKEN_LENGTH);
|
||||
}
|
||||
|
||||
function sqlTextArray(values: string[]) {
|
||||
if (values.length === 0) return sql`ARRAY[]::text[]`;
|
||||
return sql`ARRAY[${sql.join(values.map((value) => sql`${value}`), sql`, `)}]::text[]`;
|
||||
}
|
||||
|
||||
function tokenMatchExpression(textExpression: SQL, tokenArray: SQL) {
|
||||
return sql<boolean>`
|
||||
EXISTS (
|
||||
SELECT 1
|
||||
FROM unnest(${tokenArray}) AS search_token(value)
|
||||
WHERE lower(coalesce(${textExpression}, '')) LIKE '%' || search_token.value || '%' ESCAPE '\\'
|
||||
)
|
||||
`;
|
||||
}
|
||||
|
||||
function noMatchSql() {
|
||||
return sql<boolean>`false`;
|
||||
}
|
||||
|
||||
function plainText(value: string | null | undefined) {
|
||||
return (value ?? "")
|
||||
.replace(/```[\s\S]*?```/g, " ")
|
||||
.replace(/`([^`]+)`/g, "$1")
|
||||
.replace(/\[([^\]]+)\]\([^)]+\)/g, "$1")
|
||||
.replace(/[#>*_~|]+/g, " ")
|
||||
.replace(/\s+/g, " ")
|
||||
.trim();
|
||||
}
|
||||
|
||||
const MARKDOWN_IMAGE_PATTERN = /!\[[^\]]*\]\(\s*([^)\s]+)(?:\s+"[^"]*")?\s*\)/;
|
||||
|
||||
function extractFirstImageUrl(value: string | null | undefined): string | null {
|
||||
if (!value) return null;
|
||||
const match = MARKDOWN_IMAGE_PATTERN.exec(value);
|
||||
return match ? match[1] : null;
|
||||
}
|
||||
|
||||
function findFirstMatchIndex(value: string, terms: string[]) {
|
||||
const lower = value.toLowerCase();
|
||||
let best = -1;
|
||||
for (const term of terms) {
|
||||
if (term.length === 0) continue;
|
||||
const index = lower.indexOf(term.toLowerCase());
|
||||
if (index >= 0 && (best < 0 || index < best)) best = index;
|
||||
}
|
||||
return best;
|
||||
}
|
||||
|
||||
function highlightRanges(value: string, terms: string[]) {
|
||||
const lower = value.toLowerCase();
|
||||
const ranges: Array<{ start: number; end: number }> = [];
|
||||
for (const term of terms) {
|
||||
const normalized = term.toLowerCase();
|
||||
if (normalized.length === 0) continue;
|
||||
let index = lower.indexOf(normalized);
|
||||
while (index >= 0) {
|
||||
const next = { start: index, end: index + normalized.length };
|
||||
const overlaps = ranges.some((range) => next.start < range.end && next.end > range.start);
|
||||
if (!overlaps) ranges.push(next);
|
||||
index = lower.indexOf(normalized, index + normalized.length);
|
||||
}
|
||||
}
|
||||
return ranges.sort((left, right) => left.start - right.start);
|
||||
}
|
||||
|
||||
function createSnippet(field: string, label: string, source: string | null | undefined, terms: string[]): CompanySearchSnippet | null {
|
||||
const text = plainText(source);
|
||||
if (!text) return null;
|
||||
const firstMatch = findFirstMatchIndex(text, terms);
|
||||
const windowStart = firstMatch < 0 ? 0 : Math.max(0, firstMatch - 80);
|
||||
const windowEnd = Math.min(text.length, windowStart + SNIPPET_MAX_CHARS);
|
||||
const prefix = windowStart > 0 ? "..." : "";
|
||||
const suffix = windowEnd < text.length ? "..." : "";
|
||||
const slice = text.slice(windowStart, windowEnd).trim();
|
||||
const snippetText = `${prefix}${slice}${suffix}`;
|
||||
const offset = prefix.length - windowStart;
|
||||
return {
|
||||
field,
|
||||
label,
|
||||
text: snippetText,
|
||||
highlights: highlightRanges(text, terms)
|
||||
.filter((range) => range.end > windowStart && range.start < windowEnd)
|
||||
.map((range) => ({
|
||||
start: Math.max(0, range.start + offset),
|
||||
end: Math.min(snippetText.length, range.end + offset),
|
||||
})),
|
||||
};
|
||||
}
|
||||
|
||||
function iso(value: Date | string | null | undefined) {
|
||||
if (!value) return null;
|
||||
return value instanceof Date ? value.toISOString() : new Date(value).toISOString();
|
||||
}
|
||||
|
||||
function routePrefix(issuePrefix: string | null | undefined) {
|
||||
return issuePrefix?.trim() || "company";
|
||||
}
|
||||
|
||||
function issueHref(prefix: string, issue: { id: string; identifier: string | null }, suffix = "") {
|
||||
return `/${prefix}/issues/${encodeURIComponent(issue.identifier ?? issue.id)}${suffix}`;
|
||||
}
|
||||
|
||||
function matchTerms(normalizedQuery: string, tokens: string[]) {
|
||||
return [normalizedQuery, ...tokens].filter((term, index, terms) => term.length > 0 && terms.indexOf(term) === index);
|
||||
}
|
||||
|
||||
function makeCounts(results: CompanySearchResult[]) {
|
||||
const counts: Record<CompanySearchResultType, number> = { issue: 0, agent: 0, project: 0 };
|
||||
for (const result of results) counts[result.type] += 1;
|
||||
return counts;
|
||||
}
|
||||
|
||||
function scopeIncludesIssues(scope: CompanySearchScope) {
|
||||
return scope === "all" || scope === "issues" || scope === "comments" || scope === "documents";
|
||||
}
|
||||
|
||||
function scopeIncludesAgents(scope: CompanySearchScope) {
|
||||
return scope === "all" || scope === "agents";
|
||||
}
|
||||
|
||||
function scopeIncludesProjects(scope: CompanySearchScope) {
|
||||
return scope === "all" || scope === "projects";
|
||||
}
|
||||
|
||||
function issueSearchCondition(scope: CompanySearchScope, input: {
|
||||
issueTextMatch: SQL<boolean>;
|
||||
commentMatch: SQL<boolean>;
|
||||
documentMatch: SQL<boolean>;
|
||||
fuzzyMatch: SQL<boolean>;
|
||||
}) {
|
||||
if (scope === "comments") return input.commentMatch;
|
||||
if (scope === "documents") return input.documentMatch;
|
||||
if (scope === "issues") return sql<boolean>`(${input.issueTextMatch} OR ${input.fuzzyMatch})`;
|
||||
return sql<boolean>`(${input.issueTextMatch} OR ${input.commentMatch} OR ${input.documentMatch} OR ${input.fuzzyMatch})`;
|
||||
}
|
||||
|
||||
function selectPrimarySnippets(row: IssueSearchRow, normalizedQuery: string, tokens: string[]) {
|
||||
const terms = matchTerms(normalizedQuery, tokens);
|
||||
const matchedFields = new Set(row.matchedFields ?? []);
|
||||
const candidates: Array<CompanySearchSnippet | null> = [];
|
||||
if (matchedFields.has("identifier")) {
|
||||
candidates.push(createSnippet("identifier", "Identifier", row.identifier, terms));
|
||||
}
|
||||
if (matchedFields.has("title")) {
|
||||
candidates.push(createSnippet("title", "Title", row.title, terms));
|
||||
}
|
||||
if (matchedFields.has("comment")) {
|
||||
candidates.push(createSnippet("comment", "Comment", row.commentSnippet, terms));
|
||||
}
|
||||
if (matchedFields.has("document")) {
|
||||
candidates.push(createSnippet("document", row.documentTitle || "Document", row.documentSnippet, terms));
|
||||
}
|
||||
if (matchedFields.has("description")) {
|
||||
candidates.push(createSnippet("description", "Description", row.description, terms));
|
||||
}
|
||||
return candidates.filter((snippet): snippet is CompanySearchSnippet => Boolean(snippet)).slice(0, 2);
|
||||
}
|
||||
|
||||
function issueResult(row: IssueSearchRow, prefix: string, normalizedQuery: string, tokens: string[]): CompanySearchResult {
|
||||
const snippets = selectPrimarySnippets(row, normalizedQuery, tokens);
|
||||
const sourceLabel = snippets[0]?.label ?? null;
|
||||
const documentSuffix = row.documentKey ? `#document-${encodeURIComponent(row.documentKey)}` : "";
|
||||
const commentSuffix = row.commentId ? `#comment-${encodeURIComponent(row.commentId)}` : "";
|
||||
const suffix = row.commentId ? commentSuffix : documentSuffix;
|
||||
const issue: CompanySearchIssueSummary = {
|
||||
id: row.id,
|
||||
identifier: row.identifier,
|
||||
title: row.title,
|
||||
status: row.status as CompanySearchIssueSummary["status"],
|
||||
priority: row.priority as CompanySearchIssueSummary["priority"],
|
||||
assigneeAgentId: row.assigneeAgentId,
|
||||
assigneeUserId: row.assigneeUserId,
|
||||
projectId: row.projectId,
|
||||
updatedAt: iso(row.updatedAt)!,
|
||||
};
|
||||
const previewImageUrl =
|
||||
extractFirstImageUrl(row.description) ??
|
||||
extractFirstImageUrl(row.commentSnippet) ??
|
||||
extractFirstImageUrl(row.documentSnippet);
|
||||
return {
|
||||
id: row.id,
|
||||
type: "issue",
|
||||
score: Number(row.score),
|
||||
title: row.identifier ? `${row.identifier} ${row.title}` : row.title,
|
||||
href: issueHref(prefix, row, suffix),
|
||||
matchedFields: row.matchedFields ?? [],
|
||||
sourceLabel,
|
||||
snippet: snippets[0]?.text ?? null,
|
||||
snippets,
|
||||
issue,
|
||||
updatedAt: issue.updatedAt,
|
||||
previewImageUrl,
|
||||
};
|
||||
}
|
||||
|
||||
function scoreSimpleRow(row: SimpleSearchRow, normalizedQuery: string, tokens: string[]) {
|
||||
const haystack = [row.title, row.description, row.role].filter(Boolean).join(" ").toLowerCase();
|
||||
let score = haystack.includes(normalizedQuery) ? 90 : 0;
|
||||
for (const token of tokens) {
|
||||
if (haystack.includes(token)) score += 20;
|
||||
}
|
||||
if (row.title.toLowerCase().startsWith(normalizedQuery)) score += 80;
|
||||
return score;
|
||||
}
|
||||
|
||||
function simpleTextCondition(fields: SQL[], containsPattern: string, tokenArray: SQL) {
|
||||
const phraseConditions = fields.map((field) => sql<boolean>`lower(coalesce(${field}, '')) LIKE ${containsPattern} ESCAPE '\\'`);
|
||||
const tokenConditions = fields.map((field) => tokenMatchExpression(field, tokenArray));
|
||||
return sql<boolean>`(${sql.join([...phraseConditions, ...tokenConditions], sql` OR `)})`;
|
||||
}
|
||||
|
||||
export function companySearchBranchFetchLimit(limit: number, offset = 0) {
|
||||
const normalizedLimit = Number.isFinite(limit) ? Math.max(1, Math.floor(limit)) : COMPANY_SEARCH_MAX_LIMIT;
|
||||
const normalizedOffset = Number.isFinite(offset) ? Math.max(0, Math.floor(offset)) : 0;
|
||||
return Math.min(COMPANY_SEARCH_BRANCH_FETCH_LIMIT, normalizedOffset + normalizedLimit + 1);
|
||||
}
|
||||
|
||||
export function companySearchService(db: Db) {
|
||||
return {
|
||||
search: async (companyId: string, query: CompanySearchQuery): Promise<CompanySearchResponse> => {
|
||||
const normalizedQuery = normalizeQuery(query.q);
|
||||
const tokens = tokenizeQuery(normalizedQuery);
|
||||
const scope = query.scope;
|
||||
const limit = query.limit;
|
||||
const offset = query.offset;
|
||||
const emptyCounts: Record<CompanySearchResultType, number> = { issue: 0, agent: 0, project: 0 };
|
||||
if (normalizedQuery.length === 0) {
|
||||
return {
|
||||
query: query.q,
|
||||
normalizedQuery,
|
||||
scope,
|
||||
limit,
|
||||
offset,
|
||||
results: [],
|
||||
countsByType: emptyCounts,
|
||||
hasMore: false,
|
||||
};
|
||||
}
|
||||
|
||||
const company = await db
|
||||
.select({ issuePrefix: companies.issuePrefix })
|
||||
.from(companies)
|
||||
.where(eq(companies.id, companyId))
|
||||
.then((rows) => rows[0] ?? null);
|
||||
const prefix = routePrefix(company?.issuePrefix);
|
||||
const fetchLimit = companySearchBranchFetchLimit(limit, offset);
|
||||
const escapedTokens = tokens.map(escapeLikePattern);
|
||||
const tokenArray = sqlTextArray(escapedTokens);
|
||||
const fuzzyTokens = fuzzyEligibleTokens(tokens);
|
||||
const fuzzyTokenArray = sqlTextArray(fuzzyTokens);
|
||||
const escapedQuery = escapeLikePattern(normalizedQuery);
|
||||
const containsPattern = `%${escapedQuery}%`;
|
||||
const startsWithPattern = `${escapedQuery}%`;
|
||||
const fuzzyEnabled = normalizedQuery.length >= MIN_FUZZY_QUERY_LENGTH && !/[\\%_]/.test(normalizedQuery);
|
||||
const fuzzyTokensEnabled = fuzzyEnabled && fuzzyTokens.length > 0;
|
||||
|
||||
const titlePhraseMatch = sql<boolean>`lower(${issues.title}) LIKE ${containsPattern} ESCAPE '\\'`;
|
||||
const titleStartsWith = sql<boolean>`lower(${issues.title}) LIKE ${startsWithPattern} ESCAPE '\\'`;
|
||||
const identifierPhraseMatch = sql<boolean>`lower(coalesce(${issues.identifier}, '')) LIKE ${containsPattern} ESCAPE '\\'`;
|
||||
const identifierStartsWith = sql<boolean>`lower(coalesce(${issues.identifier}, '')) LIKE ${startsWithPattern} ESCAPE '\\'`;
|
||||
const descriptionPhraseMatch = sql<boolean>`lower(coalesce(${issues.description}, '')) LIKE ${containsPattern} ESCAPE '\\'`;
|
||||
const titleTokenMatch = tokenMatchExpression(sql`${issues.title}`, tokenArray);
|
||||
const identifierTokenMatch = tokenMatchExpression(sql`${issues.identifier}`, tokenArray);
|
||||
const descriptionTokenMatch = tokenMatchExpression(sql`${issues.description}`, tokenArray);
|
||||
const issueTextMatch = sql<boolean>`
|
||||
${titlePhraseMatch}
|
||||
OR ${identifierPhraseMatch}
|
||||
OR ${descriptionPhraseMatch}
|
||||
OR ${titleTokenMatch}
|
||||
OR ${identifierTokenMatch}
|
||||
OR ${descriptionTokenMatch}
|
||||
`;
|
||||
const commentMatch = sql<boolean>`
|
||||
EXISTS (
|
||||
SELECT 1
|
||||
FROM issue_comments search_comments
|
||||
WHERE search_comments.company_id = ${companyId}
|
||||
AND search_comments.issue_id = issues.id
|
||||
AND (
|
||||
lower(search_comments.body) LIKE ${containsPattern} ESCAPE '\\'
|
||||
OR ${tokenMatchExpression(sql`search_comments.body`, tokenArray)}
|
||||
)
|
||||
)
|
||||
`;
|
||||
const documentMatch = sql<boolean>`
|
||||
EXISTS (
|
||||
SELECT 1
|
||||
FROM issue_documents search_issue_documents
|
||||
INNER JOIN documents search_documents
|
||||
ON search_documents.id = search_issue_documents.document_id
|
||||
WHERE search_issue_documents.company_id = ${companyId}
|
||||
AND search_documents.company_id = ${companyId}
|
||||
AND search_issue_documents.issue_id = issues.id
|
||||
AND (
|
||||
lower(coalesce(search_documents.title, '')) LIKE ${containsPattern} ESCAPE '\\'
|
||||
OR lower(search_documents.latest_body) LIKE ${containsPattern} ESCAPE '\\'
|
||||
OR ${tokenMatchExpression(sql`search_documents.title`, tokenArray)}
|
||||
OR ${tokenMatchExpression(sql`search_documents.latest_body`, tokenArray)}
|
||||
)
|
||||
)
|
||||
`;
|
||||
// Each query token (length >= MIN_FUZZY_TOKEN_LENGTH) must have at least
|
||||
// one title word within Levenshtein edit distance. This handles typos
|
||||
// like "serach" -> "search" (transposition) and "mibile" -> "mobile"
|
||||
// (substitution) without the trigram noise that drop-character variants
|
||||
// produced (e.g. "serac" matching "service"). Edit budget is gated on
|
||||
// the SHORTER of the two strings so 4–5 letter English words don't get
|
||||
// swept in by lev=2 collisions.
|
||||
const fuzzyMaxEditsExpr = sql.raw(
|
||||
`CASE
|
||||
WHEN least(length(qt.value), length(title_word.value)) >= ${FUZZY_PAIR_LONG_LENGTH} THEN ${FUZZY_PAIR_LONG_MAX_EDITS}
|
||||
WHEN least(length(qt.value), length(title_word.value)) >= ${FUZZY_PAIR_MEDIUM_LENGTH} THEN ${FUZZY_PAIR_MEDIUM_MAX_EDITS}
|
||||
ELSE ${FUZZY_PAIR_SHORT_MAX_EDITS}
|
||||
END`,
|
||||
);
|
||||
const fuzzyMinTitleWordLengthExpr = sql.raw(`${MIN_FUZZY_TOKEN_LENGTH}`);
|
||||
const fuzzyTokenTitleMatch = fuzzyTokensEnabled
|
||||
? sql<boolean>`
|
||||
coalesce((
|
||||
SELECT bool_and(
|
||||
EXISTS (
|
||||
SELECT 1
|
||||
FROM regexp_split_to_table(lower(${issues.title}), '[^a-z0-9]+') AS title_word(value)
|
||||
WHERE length(title_word.value) >= ${fuzzyMinTitleWordLengthExpr}
|
||||
AND levenshtein_less_equal(qt.value, title_word.value, ${fuzzyMaxEditsExpr}) <= ${fuzzyMaxEditsExpr}
|
||||
)
|
||||
)
|
||||
FROM unnest(${fuzzyTokenArray}) AS qt(value)
|
||||
), false)
|
||||
`
|
||||
: noMatchSql();
|
||||
const fuzzyIdentifierMatch = fuzzyEnabled
|
||||
? sql<boolean>`similarity(lower(coalesce(${issues.identifier}, '')), ${normalizedQuery}) >= ${FUZZY_IDENTIFIER_SIMILARITY_THRESHOLD}`
|
||||
: noMatchSql();
|
||||
const fuzzyMatch = sql<boolean>`(${fuzzyTokenTitleMatch} OR ${fuzzyIdentifierMatch})`;
|
||||
const tokenCoverage = sql<number>`
|
||||
(
|
||||
SELECT count(*)::int
|
||||
FROM unnest(${tokenArray}) AS search_token(value)
|
||||
WHERE lower(${issues.title}) LIKE '%' || search_token.value || '%' ESCAPE '\\'
|
||||
OR lower(coalesce(${issues.identifier}, '')) LIKE '%' || search_token.value || '%' ESCAPE '\\'
|
||||
OR lower(coalesce(${issues.description}, '')) LIKE '%' || search_token.value || '%' ESCAPE '\\'
|
||||
OR EXISTS (
|
||||
SELECT 1
|
||||
FROM issue_comments coverage_comments
|
||||
WHERE coverage_comments.company_id = ${companyId}
|
||||
AND coverage_comments.issue_id = issues.id
|
||||
AND lower(coverage_comments.body) LIKE '%' || search_token.value || '%' ESCAPE '\\'
|
||||
)
|
||||
OR EXISTS (
|
||||
SELECT 1
|
||||
FROM issue_documents coverage_issue_documents
|
||||
INNER JOIN documents coverage_documents
|
||||
ON coverage_documents.id = coverage_issue_documents.document_id
|
||||
WHERE coverage_issue_documents.company_id = ${companyId}
|
||||
AND coverage_documents.company_id = ${companyId}
|
||||
AND coverage_issue_documents.issue_id = issues.id
|
||||
AND (
|
||||
lower(coalesce(coverage_documents.title, '')) LIKE '%' || search_token.value || '%' ESCAPE '\\'
|
||||
OR lower(coverage_documents.latest_body) LIKE '%' || search_token.value || '%' ESCAPE '\\'
|
||||
)
|
||||
)
|
||||
)
|
||||
`;
|
||||
const tokenCount = tokens.length;
|
||||
const allTokensMatch = tokenCount > 0
|
||||
? sql<boolean>`${tokenCoverage} = ${tokenCount}`
|
||||
: noMatchSql();
|
||||
const score = sql<number>`
|
||||
(
|
||||
CASE WHEN lower(coalesce(${issues.identifier}, '')) = ${normalizedQuery} THEN 1200 ELSE 0 END
|
||||
+ CASE WHEN ${identifierStartsWith} THEN 700 ELSE 0 END
|
||||
+ CASE WHEN lower(${issues.title}) = ${normalizedQuery} THEN 900 ELSE 0 END
|
||||
+ CASE WHEN ${titleStartsWith} THEN 550 ELSE 0 END
|
||||
+ CASE WHEN ${titlePhraseMatch} THEN 350 ELSE 0 END
|
||||
+ CASE WHEN ${identifierPhraseMatch} THEN 320 ELSE 0 END
|
||||
+ CASE WHEN ${commentMatch} THEN 180 ELSE 0 END
|
||||
+ CASE WHEN ${documentMatch} THEN 170 ELSE 0 END
|
||||
+ CASE WHEN ${descriptionPhraseMatch} THEN 120 ELSE 0 END
|
||||
+ CASE WHEN ${allTokensMatch} THEN 260 ELSE 0 END
|
||||
+ (${tokenCoverage} * 70)
|
||||
+ CASE WHEN ${fuzzyMatch} THEN 110 ELSE 0 END
|
||||
+ CASE ${issues.status} WHEN 'done' THEN 0 WHEN 'cancelled' THEN -30 ELSE 20 END
|
||||
)::double precision
|
||||
`;
|
||||
const matchedFields = sql<string[]>`
|
||||
array_remove(ARRAY[
|
||||
CASE WHEN ${identifierPhraseMatch} OR ${identifierTokenMatch} OR ${fuzzyIdentifierMatch} THEN 'identifier' END,
|
||||
CASE WHEN ${titlePhraseMatch} OR ${titleTokenMatch} OR ${fuzzyTokenTitleMatch} THEN 'title' END,
|
||||
CASE WHEN ${descriptionPhraseMatch} OR ${descriptionTokenMatch} THEN 'description' END,
|
||||
CASE WHEN ${commentMatch} THEN 'comment' END,
|
||||
CASE WHEN ${documentMatch} THEN 'document' END
|
||||
], NULL)::text[]
|
||||
`;
|
||||
|
||||
const issueRows = scopeIncludesIssues(scope)
|
||||
? await db
|
||||
.select({
|
||||
id: issues.id,
|
||||
identifier: issues.identifier,
|
||||
title: issues.title,
|
||||
description: issues.description,
|
||||
status: issues.status,
|
||||
priority: issues.priority,
|
||||
assigneeAgentId: issues.assigneeAgentId,
|
||||
assigneeUserId: issues.assigneeUserId,
|
||||
projectId: issues.projectId,
|
||||
updatedAt: issues.updatedAt,
|
||||
score,
|
||||
matchedFields,
|
||||
commentSnippet: sql<string | null>`
|
||||
(
|
||||
SELECT search_comments.body
|
||||
FROM issue_comments search_comments
|
||||
WHERE search_comments.company_id = ${companyId}
|
||||
AND search_comments.issue_id = issues.id
|
||||
AND (
|
||||
lower(search_comments.body) LIKE ${containsPattern} ESCAPE '\\'
|
||||
OR ${tokenMatchExpression(sql`search_comments.body`, tokenArray)}
|
||||
)
|
||||
ORDER BY
|
||||
CASE WHEN lower(search_comments.body) LIKE ${containsPattern} ESCAPE '\\' THEN 0 ELSE 1 END,
|
||||
search_comments.updated_at DESC,
|
||||
search_comments.id DESC
|
||||
LIMIT 1
|
||||
)
|
||||
`,
|
||||
commentId: sql<string | null>`
|
||||
(
|
||||
SELECT search_comments.id
|
||||
FROM issue_comments search_comments
|
||||
WHERE search_comments.company_id = ${companyId}
|
||||
AND search_comments.issue_id = issues.id
|
||||
AND (
|
||||
lower(search_comments.body) LIKE ${containsPattern} ESCAPE '\\'
|
||||
OR ${tokenMatchExpression(sql`search_comments.body`, tokenArray)}
|
||||
)
|
||||
ORDER BY
|
||||
CASE WHEN lower(search_comments.body) LIKE ${containsPattern} ESCAPE '\\' THEN 0 ELSE 1 END,
|
||||
search_comments.updated_at DESC,
|
||||
search_comments.id DESC
|
||||
LIMIT 1
|
||||
)
|
||||
`,
|
||||
documentSnippet: sql<string | null>`
|
||||
(
|
||||
SELECT search_documents.latest_body
|
||||
FROM issue_documents search_issue_documents
|
||||
INNER JOIN documents search_documents
|
||||
ON search_documents.id = search_issue_documents.document_id
|
||||
WHERE search_issue_documents.company_id = ${companyId}
|
||||
AND search_documents.company_id = ${companyId}
|
||||
AND search_issue_documents.issue_id = issues.id
|
||||
AND (
|
||||
lower(coalesce(search_documents.title, '')) LIKE ${containsPattern} ESCAPE '\\'
|
||||
OR lower(search_documents.latest_body) LIKE ${containsPattern} ESCAPE '\\'
|
||||
OR ${tokenMatchExpression(sql`search_documents.title`, tokenArray)}
|
||||
OR ${tokenMatchExpression(sql`search_documents.latest_body`, tokenArray)}
|
||||
)
|
||||
ORDER BY
|
||||
CASE
|
||||
WHEN lower(coalesce(search_documents.title, '')) LIKE ${containsPattern} ESCAPE '\\' THEN 0
|
||||
WHEN lower(search_documents.latest_body) LIKE ${containsPattern} ESCAPE '\\' THEN 1
|
||||
ELSE 2
|
||||
END,
|
||||
search_documents.updated_at DESC,
|
||||
search_documents.id DESC
|
||||
LIMIT 1
|
||||
)
|
||||
`,
|
||||
documentTitle: sql<string | null>`
|
||||
(
|
||||
SELECT search_documents.title
|
||||
FROM issue_documents search_issue_documents
|
||||
INNER JOIN documents search_documents
|
||||
ON search_documents.id = search_issue_documents.document_id
|
||||
WHERE search_issue_documents.company_id = ${companyId}
|
||||
AND search_documents.company_id = ${companyId}
|
||||
AND search_issue_documents.issue_id = issues.id
|
||||
AND (
|
||||
lower(coalesce(search_documents.title, '')) LIKE ${containsPattern} ESCAPE '\\'
|
||||
OR lower(search_documents.latest_body) LIKE ${containsPattern} ESCAPE '\\'
|
||||
OR ${tokenMatchExpression(sql`search_documents.title`, tokenArray)}
|
||||
OR ${tokenMatchExpression(sql`search_documents.latest_body`, tokenArray)}
|
||||
)
|
||||
ORDER BY search_documents.updated_at DESC, search_documents.id DESC
|
||||
LIMIT 1
|
||||
)
|
||||
`,
|
||||
documentKey: sql<string | null>`
|
||||
(
|
||||
SELECT search_issue_documents.key
|
||||
FROM issue_documents search_issue_documents
|
||||
INNER JOIN documents search_documents
|
||||
ON search_documents.id = search_issue_documents.document_id
|
||||
WHERE search_issue_documents.company_id = ${companyId}
|
||||
AND search_documents.company_id = ${companyId}
|
||||
AND search_issue_documents.issue_id = issues.id
|
||||
AND (
|
||||
lower(coalesce(search_documents.title, '')) LIKE ${containsPattern} ESCAPE '\\'
|
||||
OR lower(search_documents.latest_body) LIKE ${containsPattern} ESCAPE '\\'
|
||||
OR ${tokenMatchExpression(sql`search_documents.title`, tokenArray)}
|
||||
OR ${tokenMatchExpression(sql`search_documents.latest_body`, tokenArray)}
|
||||
)
|
||||
ORDER BY search_documents.updated_at DESC, search_documents.id DESC
|
||||
LIMIT 1
|
||||
)
|
||||
`,
|
||||
})
|
||||
.from(issues)
|
||||
.where(and(
|
||||
eq(issues.companyId, companyId),
|
||||
isNull(issues.hiddenAt),
|
||||
issueSearchCondition(scope, { issueTextMatch, commentMatch, documentMatch, fuzzyMatch }),
|
||||
))
|
||||
.orderBy(desc(score), desc(issues.updatedAt), desc(issues.id))
|
||||
.limit(fetchLimit)
|
||||
: [];
|
||||
|
||||
const simpleCondition = simpleTextCondition([
|
||||
sql`${agents.name}`,
|
||||
sql`${agents.role}`,
|
||||
sql`${agents.title}`,
|
||||
sql`${agents.capabilities}`,
|
||||
], containsPattern, tokenArray);
|
||||
const agentRows = scopeIncludesAgents(scope)
|
||||
? await db
|
||||
.select({
|
||||
id: agents.id,
|
||||
title: agents.name,
|
||||
description: agents.capabilities,
|
||||
role: agents.role,
|
||||
updatedAt: agents.updatedAt,
|
||||
})
|
||||
.from(agents)
|
||||
.where(and(eq(agents.companyId, companyId), simpleCondition))
|
||||
.orderBy(desc(agents.updatedAt), desc(agents.id))
|
||||
.limit(fetchLimit)
|
||||
: [];
|
||||
|
||||
const projectCondition = simpleTextCondition([
|
||||
sql`${projects.name}`,
|
||||
sql`${projects.description}`,
|
||||
], containsPattern, tokenArray);
|
||||
const projectRows = scopeIncludesProjects(scope)
|
||||
? await db
|
||||
.select({
|
||||
id: projects.id,
|
||||
title: projects.name,
|
||||
description: projects.description,
|
||||
updatedAt: projects.updatedAt,
|
||||
})
|
||||
.from(projects)
|
||||
.where(and(eq(projects.companyId, companyId), isNull(projects.archivedAt), projectCondition))
|
||||
.orderBy(desc(projects.updatedAt), desc(projects.id))
|
||||
.limit(fetchLimit)
|
||||
: [];
|
||||
|
||||
const results: CompanySearchResult[] = [
|
||||
...(issueRows as IssueSearchRow[]).map((row) => issueResult(row, prefix, normalizedQuery, tokens)),
|
||||
...(agentRows as SimpleSearchRow[]).map((row) => {
|
||||
const terms = matchTerms(normalizedQuery, tokens);
|
||||
const snippet = createSnippet("capabilities", "Agent", row.description ?? row.role ?? row.title, terms);
|
||||
return {
|
||||
id: row.id,
|
||||
type: "agent" as const,
|
||||
score: scoreSimpleRow(row, normalizedQuery, tokens),
|
||||
title: row.title,
|
||||
href: `/${prefix}/agents/${encodeURIComponent(row.id)}`,
|
||||
matchedFields: ["agent"],
|
||||
sourceLabel: snippet?.label ?? null,
|
||||
snippet: snippet?.text ?? null,
|
||||
snippets: snippet ? [snippet] : [],
|
||||
updatedAt: iso(row.updatedAt),
|
||||
previewImageUrl: null,
|
||||
};
|
||||
}),
|
||||
...(projectRows as SimpleSearchRow[]).map((row) => {
|
||||
const terms = matchTerms(normalizedQuery, tokens);
|
||||
const snippet = createSnippet("description", "Project", row.description ?? row.title, terms);
|
||||
return {
|
||||
id: row.id,
|
||||
type: "project" as const,
|
||||
score: scoreSimpleRow(row, normalizedQuery, tokens),
|
||||
title: row.title,
|
||||
href: `/${prefix}/projects/${encodeURIComponent(row.id)}`,
|
||||
matchedFields: ["project"],
|
||||
sourceLabel: snippet?.label ?? null,
|
||||
snippet: snippet?.text ?? null,
|
||||
snippets: snippet ? [snippet] : [],
|
||||
updatedAt: iso(row.updatedAt),
|
||||
previewImageUrl: null,
|
||||
};
|
||||
}),
|
||||
].sort((left, right) => {
|
||||
if (right.score !== left.score) return right.score - left.score;
|
||||
return (right.updatedAt ?? "").localeCompare(left.updatedAt ?? "");
|
||||
});
|
||||
|
||||
const paged = results.slice(offset, offset + limit);
|
||||
return {
|
||||
query: query.q,
|
||||
normalizedQuery,
|
||||
scope,
|
||||
limit,
|
||||
offset,
|
||||
results: paged,
|
||||
countsByType: makeCounts(results),
|
||||
hasMore: results.length > offset + limit,
|
||||
};
|
||||
},
|
||||
};
|
||||
}
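Review note: the fuzzy title rule in `fuzzyMaxEditsExpr` and `fuzzyTokenTitleMatch` can be hard to read as SQL. The sketch below restates it in plain TypeScript under stated assumptions: `levenshtein`, `maxEditsForPair`, and `fuzzyTitleMatch` are hypothetical helper names, and the plain dynamic-programming edit distance stands in for PostgreSQL's `levenshtein_less_equal`. The constants are copied from the diff.

```typescript
// Constants copied from the diff above.
const MIN_FUZZY_TOKEN_LENGTH = 4;
const FUZZY_PAIR_LONG_LENGTH = 6;
const FUZZY_PAIR_LONG_MAX_EDITS = 2;
const FUZZY_PAIR_MEDIUM_LENGTH = 5;
const FUZZY_PAIR_MEDIUM_MAX_EDITS = 1;
const FUZZY_PAIR_SHORT_MAX_EDITS = 0;

// Classic two-row Levenshtein distance (insert, delete, substitute each cost 1).
function levenshtein(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j]! + 1, curr[j - 1]! + 1, prev[j - 1]! + cost));
    }
    prev = curr;
  }
  return prev[b.length]!;
}

// Edit budget is chosen from the SHORTER string of the pair, mirroring the SQL CASE.
function maxEditsForPair(queryToken: string, titleWord: string): number {
  const shorter = Math.min(queryToken.length, titleWord.length);
  if (shorter >= FUZZY_PAIR_LONG_LENGTH) return FUZZY_PAIR_LONG_MAX_EDITS;
  if (shorter >= FUZZY_PAIR_MEDIUM_LENGTH) return FUZZY_PAIR_MEDIUM_MAX_EDITS;
  return FUZZY_PAIR_SHORT_MAX_EDITS;
}

// Every eligible query token needs at least one title word within its pair budget.
function fuzzyTitleMatch(queryTokens: string[], title: string): boolean {
  const eligible = queryTokens.filter((token) => token.length >= MIN_FUZZY_TOKEN_LENGTH);
  if (eligible.length === 0) return false; // mirrors coalesce(..., false) over an empty set
  const words = title
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((word) => word.length >= MIN_FUZZY_TOKEN_LENGTH);
  return eligible.every((token) =>
    words.some((word) => levenshtein(token, word) <= maxEditsForPair(token, word)),
  );
}

const typoMatchesSearch = fuzzyTitleMatch(["serach"], "Search page redesign");
const typoMatchesEach = fuzzyTitleMatch(["serach"], "Review each item");
const substitutionMatches = fuzzyTitleMatch(["mibile"], "Mobile login");
const droppedCharMatches = fuzzyTitleMatch(["serac"], "Service health");
```

Under these assumptions, "serach" reaches "search" (two edits on a 6-letter pair), while "each" is excluded because a 4-letter pair gets a budget of zero edits.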
|
||||
|
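Review note: the service pages only after merging issue, agent, and project rows, and `companySearchBranchFetchLimit` sizes each branch query at `offset + limit + 1`. The sketch below shows that shape in isolation. `MAX_LIMIT` and `MAX_OFFSET` are placeholder values standing in for `COMPANY_SEARCH_MAX_LIMIT` and `COMPANY_SEARCH_MAX_OFFSET`, whose real values are not shown here, and `Hit` and `mergeAndPage` are hypothetical names.

```typescript
type Hit = { id: string; score: number; updatedAt: string | null };

const MAX_LIMIT = 50; // placeholder, not the project's actual constant
const MAX_OFFSET = 500; // placeholder, not the project's actual constant
const BRANCH_FETCH_LIMIT = MAX_OFFSET + MAX_LIMIT + 1;

// Same normalization as companySearchBranchFetchLimit in the diff.
function branchFetchLimit(limit: number, offset = 0): number {
  const normalizedLimit = Number.isFinite(limit) ? Math.max(1, Math.floor(limit)) : MAX_LIMIT;
  const normalizedOffset = Number.isFinite(offset) ? Math.max(0, Math.floor(offset)) : 0;
  return Math.min(BRANCH_FETCH_LIMIT, normalizedOffset + normalizedLimit + 1);
}

// Merge all branches, sort by score then recency, and only then slice the page.
function mergeAndPage(branches: Hit[][], limit: number, offset: number) {
  const merged = ([] as Hit[]).concat(...branches).sort((left, right) => {
    if (right.score !== left.score) return right.score - left.score;
    return (right.updatedAt ?? "").localeCompare(left.updatedAt ?? "");
  });
  return {
    results: merged.slice(offset, offset + limit),
    hasMore: merged.length > offset + limit,
  };
}

const issueHits: Hit[] = [
  { id: "i1", score: 900, updatedAt: "2026-01-03T00:00:00.000Z" },
  { id: "i2", score: 350, updatedAt: "2026-01-02T00:00:00.000Z" },
];
const agentHits: Hit[] = [{ id: "a1", score: 170, updatedAt: "2026-01-01T00:00:00.000Z" }];
const projectHits: Hit[] = [
  { id: "p1", score: 110, updatedAt: null },
  { id: "p2", score: 90, updatedAt: null },
];

const perBranch = branchFetchLimit(2, 1); // rows to request from each branch
const page = mergeAndPage([issueHits, agentHits, projectHits], 2, 1);
const pageIds = page.results.map((hit) => hit.id);
```

The extra row per branch is what lets `hasMore` be computed without a separate count query. This reasoning assumes each branch returns its rows in the same order the merged sort would give them.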
|
@ -1,4 +1,5 @@
|
|||
export { companyService } from "./companies.js";
|
||||
export { companySearchService } from "./company-search.js";
|
||||
export { feedbackService } from "./feedback.js";
|
||||
export { companySkillService } from "./company-skills.js";
|
||||
export { agentService, deduplicateAgentName } from "./agents.js";
|
||||
|
|
|
|||