Sadaf Labs
Taking 2 projects · Q2 2026

Trust infrastructure
for autonomous agents.

We design and ship the boring-but-critical pieces every agent needs before production: policy gates, audit trails, and replayable evals. Lab + studio.

SOC2-ready patterns · Replayable evals · Least-privilege tooling · Human-in-the-loop

How trust is enforced

Agent → Policy → Audit → Eval

Built on the stack you already trust

AWS Lambda · Bedrock · OpenAI · Anthropic · Next.js · Postgres · DynamoDB · OpenSearch · Vercel · TypeScript

/workflows

Patterns we ship over and over

Four reference architectures we've productionized. Click through to watch each draw itself.

How agents earn trust

Click through the patterns we ship. Each diagram animates the data flow as it draws.

Every agent step gated by policy, logged, and replayable.

Agent Trust Pipeline
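
As a rough illustration of the Agent → Policy → Audit → Eval flow, here is a minimal TypeScript sketch. Every name in it (`ToolCall`, `checkPolicy`, `runStep`, `replayDecisions`) is hypothetical and chosen for this example; it is not the API of any Sadaf Labs tool. It assumes the simplest possible policy, a least-privilege allowlist of tool names, and an in-memory audit log.

```typescript
// Hypothetical types for illustration only.
type ToolCall = { tool: string; args: Record<string, unknown> };
type Decision = { allowed: boolean; reason: string };
type AuditEntry = { step: number; call: ToolCall; decision: Decision; result?: string };

// Policy: a least-privilege allowlist of tools the agent may invoke.
function checkPolicy(call: ToolCall, allowedTools: ReadonlySet<string>): Decision {
  return allowedTools.has(call.tool)
    ? { allowed: true, reason: "tool is on the allowlist" }
    : { allowed: false, reason: `tool "${call.tool}" is not on the allowlist` };
}

// Gate one agent step: decide, execute only if allowed, and always log.
function runStep(
  call: ToolCall,
  allowedTools: ReadonlySet<string>,
  execute: (call: ToolCall) => string,
  auditLog: AuditEntry[],
): AuditEntry {
  const decision = checkPolicy(call, allowedTools);
  const entry: AuditEntry = { step: auditLog.length + 1, call, decision };
  if (decision.allowed) {
    entry.result = execute(call);
  }
  auditLog.push(entry);
  return entry;
}

// Eval: replay the logged calls against a (possibly changed) policy and
// return the step numbers whose allow/deny decision would now differ.
function replayDecisions(
  auditLog: readonly AuditEntry[],
  allowedTools: ReadonlySet<string>,
): number[] {
  return auditLog
    .filter((e) => checkPolicy(e.call, allowedTools).allowed !== e.decision.allowed)
    .map((e) => e.step);
}
```

Denied calls are logged too, so the audit trail shows what the agent attempted as well as what it did. A production version would likely persist the log and evaluate richer policies than a name allowlist.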

/consulting

What we build

Fixed-scope packages so you know what you're getting. Custom scopes are welcome: every project starts with a free scoping call.

All packages →

Agent Trust Audit

Map your live agent, find policy gaps and prompt-injection surfaces, ship a hardened system prompt + audit-log scaffold.

from $5k · 1–2 weeks

RAG Systems

Search and answer over your docs, tickets, and code with citations — not hallucinations.

from $12k · 3–4 weeks

Multi-Agent Workflows

Supervisor + specialist agents that automate real workflows with humans in the loop.

from $25k · 4–8 weeks

AI MVP in 30 days

Idea → working product → first paying user. One sprint, fixed scope, real code.

from $20k · 4 weeks

/principles

How we work

We're engineers first. The lab and the studio share a bias: ship the boring infrastructure right, then move fast on top of it.

Real shipped work

Role, period, stack — verifiable. No vanity metrics, no fabricated logos.

Boring infra

Lambda, Postgres, S3. The hot framework du jour can wait.

Trust by default

Every agent gets policy + audit + replay before it sees real users.

Honest on scope

If LLMs are wrong for the job, we say so. Often they are.

3

OSS lab tools

PolicyLint · Trace Replay · Eval Harness

99

Lambdas in production

across recent client systems

2

Projects this quarter

selective by design

10y

Production miles

before we touched LLMs

/lab

Currently in the lab

Working drafts. Some are open source, some are in design-partner pilots. Real code, real status — no inflation.

Alpha · OSS

PolicyLint

Static analyzer for agent system prompts. Flags jailbreak surfaces, missing refusals, unbounded tool scope.
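
To make the idea concrete, here is a toy sketch of what linting a system prompt could look like. This is not PolicyLint's implementation or rule set: the function `lintSystemPrompt`, the rule names, and the regex heuristics are all assumptions made up for this example.

```typescript
// Hypothetical finding shape and rules, for illustration only.
type Finding = { rule: string; message: string };

function lintSystemPrompt(prompt: string): Finding[] {
  const findings: Finding[] = [];
  const text = prompt.toLowerCase();

  // Missing refusal: the prompt never tells the agent when to say no.
  if (!/\b(refuse|decline|do not|don't|never)\b/.test(text)) {
    findings.push({
      rule: "missing-refusal",
      message: "No refusal or prohibition language found.",
    });
  }

  // Unbounded tool scope: the agent is granted "any" or "all" tools.
  if (/\b(any|all)( available)? tools?\b/.test(text)) {
    findings.push({
      rule: "unbounded-tool-scope",
      message: "Tools are granted without an explicit allowlist.",
    });
  }

  // Jailbreak surface: the prompt tells the agent to obey whatever it is given.
  if (/\b(follow|obey) (any|all) (user )?instructions\b/.test(text)) {
    findings.push({
      rule: "jailbreak-surface",
      message: "Prompt defers unconditionally to user instructions.",
    });
  }

  return findings;
}
```

Keyword heuristics like these are crude and will miss plenty; they only show the shape of a static check that can run before an agent ever sees traffic.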

Try the demo

Design partner

Trace Replay

Record agent traces in prod, replay them in CI. Catch regressions before users do.
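
A minimal sketch of the record-then-replay idea, under heavy simplifying assumptions: the agent is modeled as a synchronous, deterministic function from string to string, and a regression means an exact-match difference. The names `record`, `replay`, `TraceStep`, and `Regression` are hypothetical and are not Trace Replay's API.

```typescript
// Hypothetical types for illustration only.
type TraceStep = { input: string; output: string };
type Agent = (input: string) => string;
type Regression = { index: number; input: string; expected: string; actual: string };

// Wrap an agent so every call is appended to a trace (the "prod" side).
function record(agent: Agent, trace: TraceStep[]): Agent {
  return (input) => {
    const output = agent(input);
    trace.push({ input, output });
    return output;
  };
}

// Feed recorded inputs to a candidate agent and collect every step whose
// output no longer matches the recording (the "CI" side).
function replay(trace: readonly TraceStep[], candidate: Agent): Regression[] {
  const regressions: Regression[] = [];
  trace.forEach((step, index) => {
    const actual = candidate(step.input);
    if (actual !== step.output) {
      regressions.push({ index, input: step.input, expected: step.output, actual });
    }
  });
  return regressions;
}
```

Real model output is rarely deterministic, so a practical comparison would probably be looser than strict equality, for example checking tool calls or key fields instead of full text.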

Join the pilot

OSS

Eval Harness

Lightweight golden-prompt eval runner that fits in a single CI step.
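
For a sense of scale, a golden-prompt runner can be as small as the sketch below. It is an illustration, not the Eval Harness code: `GoldenCase`, `EvalReport`, and `runGoldenEvals` are hypothetical names, the model is stubbed as a synchronous function, and a case passes when the answer contains every expected substring (case-insensitive).

```typescript
// Hypothetical types for illustration only.
type GoldenCase = { name: string; prompt: string; mustInclude: string[] };
type EvalReport = { passed: number; failed: string[] };

// Run every golden case against the model and report which ones failed.
function runGoldenEvals(
  cases: readonly GoldenCase[],
  model: (prompt: string) => string,
): EvalReport {
  const failed: string[] = [];
  for (const c of cases) {
    const answer = model(c.prompt).toLowerCase();
    const ok = c.mustInclude.every((s) => answer.includes(s.toLowerCase()));
    if (!ok) failed.push(c.name);
  }
  return { passed: cases.length - failed.length, failed };
}
```

In a CI step, the caller would typically exit non-zero whenever `failed` is non-empty, which is what lets the whole run fit in a single job.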

View on GitHub

/track-record

Receipts.

What we've actually shipped — with role, period, and stack. No invented numbers. References available on the intro call.

Full track record
  • Multi-service serverless platform

    2024 — present

    Senior engineer

    5 services, 99 Lambdas in production. ARM64 + memory tuning materially cut p95 cold-start latency. Owned authn/authz.

  • RAG support copilot

    2025

    Architect + builder

    End-to-end RAG over a B2B SaaS knowledge base with cited answers and human handoff.

  • Document intelligence pipeline

    2025

    Architect + builder

    Structured extraction over long-form legal contracts with reviewable audit trail.

/faq

Honest answers, up front

What does Sadaf Labs do?

Hybrid lab + studio. The studio is selective AI consulting (RAG, multi-agent, AI MVPs). The lab builds open-source trust infrastructure for agents — policy, audit, evals.

How long does an engagement take?

Trust audits 1–2 weeks. RAG systems 3–4 weeks. Multi-agent builds 4–8 weeks. MVPs scoped at 30 days.

What does it cost?

Fixed-scope, $5k–$40k for most engagements. Trust audits start at $5k, RAG $12k–$25k, multi-agent $25k+. Retainer available for a fractional AI lead.

Do you take new projects?

2 per quarter. Free 20-min scoping call open to anyone with an agent in or going to production.

What stack?

Next.js + TypeScript, AWS serverless (Lambda, DynamoDB, Bedrock), LLMs via Bedrock/OpenAI/Anthropic, RAG via OpenSearch or pgvector.

Are you raising?

Pre-seed planning. Bootstrapped today via consulting. Investor brief is gated — email hello@sadaf-labs.com for the passcode.

/investors

Building for the agent-trust market

Pre-seed, bootstrapped via consulting today. The lab tools you see above are the product wedge — managed agent trust as a service in 2026. The investor brief is gated.

Got an agent in production? Let's harden it.

Free 20-minute call. We'll map your agent on the whiteboard, find the trust gaps, and decide if we're a fit.

2 slots open this quarter · hello@sadaf-labs.com