Trust infrastructure for autonomous agents.
We design and ship the boring-but-critical pieces every agent needs before production: policy gates, audit trails, and replayable evals. Senior engineers, fixed scope, shipped to production.
Free 20-min scoping call. No sales funnel, no NDA needed for a first conversation.
How trust is enforced
Built on the stack you already trust
/why-us
Why teams pick Sadaf Labs
Six promises we keep on every engagement. The reasons clients sign, and the reasons they come back.
Senior engineers only
You work directly with the people writing the code. No juniors learning on your time, no offshore handoffs.
Fixed scope, fixed price
Engagements are quoted as a fixed deliverable with a defined timeline. No open-ended hourly billing, no scope creep.
Trust built in, not bolted on
Every agent ships with policy gates, audit trails, and replayable evals. Procurement and security teams pass us through.
Production miles, not slideware
99 Lambdas across 5 services in production today. We know the parts that break at 2am because we have fixed them.
Honest on scope
If an LLM is the wrong tool, we say so. If a feature can be a SQL query instead of an agent, we ship the SQL.
Selective on purpose
Two clients per quarter. Every engagement gets a senior owner from kickoff to handover.
/workflows
Patterns we ship over and over
Four reference architectures we've productionized. Click through to watch each draw itself.
How agents earn trust
Click through the patterns we ship. Each diagram animates the data flow as it draws.
Every agent step gated by policy, logged, and replayable.
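As a rough illustration of what "gated by policy, logged, and replayable" can mean in practice, here is a minimal Python sketch. All names in it (`Policy`, `AuditLog`, `run_step`, `replay_allowed_steps`, the example tools) are hypothetical and chosen for illustration; it is not the production implementation, and a real audit log would live in durable storage rather than in memory.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Policy:
    """Hypothetical allow-list policy: which tools an agent may call."""
    allowed_tools: frozenset

    def permits(self, tool: str) -> bool:
        return tool in self.allowed_tools


@dataclass
class AuditLog:
    """Append-only in-memory log (assumption: stands in for durable storage)."""
    entries: List[dict] = field(default_factory=list)

    def record(self, tool: str, args: dict, decision: str, result: Any = None) -> None:
        self.entries.append(
            {"tool": tool, "args": args, "decision": decision, "result": result}
        )


def run_step(policy: Policy, log: AuditLog, tools: Dict[str, Callable],
             tool: str, args: dict) -> Any:
    """Gate one agent step by policy and log the outcome either way."""
    if not policy.permits(tool):
        log.record(tool, args, "denied")
        raise PermissionError(f"tool {tool!r} is not permitted by policy")
    result = tools[tool](**args)
    log.record(tool, args, "allowed", result)
    return result


def replay_allowed_steps(log: AuditLog, tools: Dict[str, Callable]) -> List[Any]:
    """Re-run every allowed step from the log and return the fresh results."""
    return [
        tools[entry["tool"]](**entry["args"])
        for entry in log.entries
        if entry["decision"] == "allowed"
    ]
```

Denied calls are logged before the exception is raised, so the audit trail covers refusals as well as successful steps.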
/capabilities
Deep AWS expertise for cloud-native B2B teams
We design, build, and operate production systems on AWS. From agentic orchestration on Bedrock to event-driven microservices, from Figma to Stripe, the stack below is what we use every week.
AWS cloud expertise
Cloud-native B2B applications, designed and shipped on AWS.
- Multi-tenant SaaS, multi-region, single-table DynamoDB
- IaC with AWS SAM and CDK, CI pipelines
- Cost-tuned: ARM Graviton, right-sized memory, on-demand to provisioned migration
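To make the multi-tenant, single-table idea concrete, the sketch below shows one common key layout in Python. The `TENANT#` and entity prefixes, and the `make_item` and `query_tenant` helpers, are assumptions for illustration only; the in-memory list stands in for a DynamoDB table, where the same access pattern would typically be a Query on the partition key with a `begins_with` condition on the sort key.

```python
from typing import Dict, List


def make_item(tenant_id: str, entity: str, entity_id: str, attrs: Dict) -> Dict:
    """Build an item whose partition key scopes it to exactly one tenant."""
    return {
        "pk": f"TENANT#{tenant_id}",             # hypothetical partition key format
        "sk": f"{entity.upper()}#{entity_id}",   # hypothetical sort key format
        **attrs,
    }


def query_tenant(table: List[Dict], tenant_id: str, entity: str) -> List[Dict]:
    """Return one tenant's items of one entity type (in-memory stand-in for a Query)."""
    pk = f"TENANT#{tenant_id}"
    prefix = f"{entity.upper()}#"
    return [
        item for item in table
        if item["pk"] == pk and item["sk"].startswith(prefix)
    ]


table = [
    make_item("acme", "case", "001", {"status": "open"}),
    make_item("acme", "user", "u1", {"role": "admin"}),
    make_item("globex", "case", "001", {"status": "closed"}),
]
acme_cases = query_tenant(table, "acme", "case")
```

Because every read starts from the tenant's partition key, one tenant's query cannot return another tenant's rows, even when entity IDs collide.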
Agentic orchestration on AWS Bedrock
Bedrock Agents, Bedrock Knowledge Bases, and the open-source AWS Strands framework.
- Tool-use, action groups, guardrails, session memory
- AWS Strands for multi-agent graphs and human-in-the-loop
- Replayable evals so agents do not silently regress
Frontend engineering
Figma to production, with observability that is honest.
- Design in Figma, build in React.js and Next.js
- Tailwind, server components, edge rendering
- Sentry for errors, performance, and session replay
Backend engineering
Strongly typed services in the languages that earn their place.
- TypeScript and Node.js for API services
- Python for AI, data, and orchestration
- Rust where latency, safety, or cost demand it
Serverless architecture
Event-driven systems that scale to zero and survive Black Friday.
- AWS Lambda, DynamoDB, EventBridge, SQS
- Kinesis Firehose for streaming analytics, S3 as the source of truth
- Step Functions for long-running, observable workflows
Containers and compute
When serverless is the wrong tool, we still know the right one.
- ECS Fargate for stateful or long-lived workloads
- EC2 for GPU inference, custom networking, legacy lift-and-shift
- ALB, VPC, private subnets, secrets, and observability wired in
Payments and user management
Stripe done right, including the parts everyone gets wrong.
- Stripe Checkout, Billing, and metered usage
- Webhook handlers idempotent by design and replay-safe
- User and subscription state synced via Stripe callbacks into DynamoDB
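A minimal Python sketch of an idempotent webhook handler follows. It relies on the fact that Stripe events carry a unique `id` and a `type`; the `WebhookProcessor` class and its in-memory set and dict are hypothetical stand-ins (a real handler would verify the webhook signature first and persist both the seen event IDs and the subscription state in DynamoDB, for example with a conditional write).

```python
from typing import Dict, Set


class WebhookProcessor:
    """Hypothetical processor: applies each event at most once, keyed by event id."""

    def __init__(self) -> None:
        self.seen_event_ids: Set[str] = set()         # stand-in for a durable dedupe table
        self.subscription_status: Dict[str, str] = {}  # customer id -> subscription status

    def handle(self, event: dict) -> bool:
        """Return True if the event was applied, False if it was a duplicate delivery."""
        event_id = event["id"]
        if event_id in self.seen_event_ids:
            return False  # replayed or retried delivery: safe no-op

        if event["type"] == "customer.subscription.updated":
            subscription = event["data"]["object"]
            self.subscription_status[subscription["customer"]] = subscription["status"]

        # Mark as seen only after the state change has been applied.
        self.seen_event_ids.add(event_id)
        return True
```

Delivering the same event twice leaves the stored state unchanged, which is what makes retries and manual replays safe.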
Identity and access
Auth that an enterprise procurement team will actually approve.
- Cognito and Microsoft Entra ID, SSO and SAML
- Per-tenant isolation, fine-grained IAM, signed URLs for S3
- Audit trails on every privileged action
Frontend
Figma, React.js, Next.js, Tailwind, Sentry for observability.
Backend
TypeScript, Node.js, Python, Rust. Strongly typed, well tested, observable.
Cloud
AWS Lambda, DynamoDB, EventBridge, SQS, Firehose, S3, ECS, EC2.
/showcase
Architectures from recent builds
A small sample of systems we have shipped on AWS. Each one is in production today.

AWS Bedrock AgentCore
Managed runtime for autonomous agents. Tools, memory, identity, and observability wired in.

AI voice pipeline
Real-time speech-to-action pipeline on AWS, Bedrock, and Strands agents.

B2B lead and ticket platform
Multi-tenant SaaS with role-based dashboards and SLA-aware queues.

Event-driven microservices
EventBridge, SQS, and Lambda with idempotent handlers and replay.

Entra ID and Cognito SSO
Federated identity for enterprise customers, JWT propagation end to end.

Secure S3 upload flow
Signed URLs, virus scan, metadata indexing, and tenant isolation.

Operational analytics
Firehose to S3 to Athena, dashboards your operations team will actually open.
/consulting
What we build
Fixed-scope packages so you know what you're getting. Custom work is welcome, and every project starts with a free scoping call.
Agent Trust Audit
Map your live agent, find policy gaps and prompt-injection surfaces, ship a hardened system prompt with an audit-log scaffold.
RAG Systems
Search and answer over your docs, tickets, and code with citations, not hallucinations.
Multi-Agent Workflows
Supervisor and specialist agents that automate real workflows with humans in the loop.
AI MVP in 30 days
Idea → working product → first paying user. One sprint, fixed scope, real code.
/principles
How we work
We're engineers first. The lab and the studio share a bias: ship the boring infrastructure right, then move fast on top of it.
Real shipped work
Role, period, stack, all verifiable. No vanity metrics, no fabricated logos.
Boring infra
Lambda, Postgres, S3. The hot framework du jour can wait.
Trust by default
Every agent gets policy, audit, and replay before it sees real users.
Honest on scope
If LLMs are wrong for the job, we say so. Often they are.
3 OSS lab tools: PolicyLint, Trace Replay, Eval Harness
99 Lambdas in production across recent client systems
2 projects this quarter, selective by design
5 services in production on AWS serverless today
/lab
Currently in the lab
Working drafts. Some are open source, some are in design-partner pilots. Real code, real status, no inflation.
PolicyLint
Static analyzer for agent system prompts. Flags jailbreak surfaces, missing refusals, unbounded tool scope.
Try the demo
Trace Replay
Record agent traces in prod, replay them in CI. Catch regressions before users do.
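The record-then-replay idea can be sketched in a few lines of Python. This is an illustrative toy, not the Trace Replay tool itself: `record_trace`, `replay_traces`, and the JSON-lines trace format are assumptions, and exact string comparison is a simplification, since real agent output is often nondeterministic and may need a looser comparison.

```python
import json
from typing import Callable, List


def record_trace(trace_store: List[str], agent: Callable[[str], str], prompt: str) -> str:
    """Call the agent and append the prompt/output pair as one JSON line."""
    output = agent(prompt)
    trace_store.append(json.dumps({"prompt": prompt, "output": output}))
    return output


def replay_traces(trace_store: List[str], candidate: Callable[[str], str]) -> List[dict]:
    """Run recorded prompts against a candidate agent and collect mismatches."""
    regressions = []
    for line in trace_store:
        trace = json.loads(line)
        actual = candidate(trace["prompt"])
        if actual != trace["output"]:
            regressions.append(
                {"prompt": trace["prompt"], "expected": trace["output"], "actual": actual}
            )
    return regressions
```

In CI, a non-empty result from `replay_traces` would fail the build, so a behaviour change is caught before release.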
Browse demos
Eval Harness
Lightweight golden-prompt eval runner that fits in a single CI step.
Browse demos
/projects
Selected projects
Products and platforms shipped end to end. Each one has the architecture, problem, and outcome.
Intelligent transit and fleet operations
Real-time scheduling, vehicle telemetry, and passenger info for public transport operators.
Read project
Fintech debt recovery platform on AWS
Cloud-native collections with configurable strategy engine and self-serve customer portal.
Read project
AI agents for contact centre operations
Real-time pairing and routing AI for enterprise contact centres at global scale.
Read project
One-click CMS and site builder
No-code multi-tenant CMS that lets non-technical teams launch branded sites in minutes.
Read project
B2B lead and case management
Department-aware workflow with SLAs, audit, and AI-assisted routing across teams.
Read project
/case-studies
Engineering case studies
Focused performance, cost, and architecture wins on production systems. Problem, change, measured outcome.
Auth p95 from 600 ms to under 120 ms
Right-sizing and ARM64 cutover on a Lambda authorizer that gated every request.
Read case study
Kanban board, 28 s to under 2 s
Pagination, streaming, and memoization on a board scanning every case in the tenant.
Read case study
ARM64 Graviton across 99 Lambdas
Fleet-wide cutover from x86 to Graviton 3 with no functional regressions.
Read case study
/track-record
Receipts.
What we've actually shipped, with role, period, and stack. No invented numbers. References available on the intro call.
Full track record
B2B lead and case management platform
2024 to present
Senior engineer
Multi-tenant SaaS on AWS serverless. 5 services, 99 Lambdas, single-table DynamoDB, Cognito and Entra SSO. Owned authn and authz.
Intelligent transit and fleet operations
2023 to 2024
Engineer, contributor
Real-time scheduling and vehicle telemetry for public transport operators. Event-driven ingestion and operator dashboards.
Fintech debt recovery on AWS
2022 to 2023
Engineer, contributor
Cloud-native collections platform with configurable strategy engine and self-serve customer portal.
/demos
Try it in your browser
Mini-apps you can use right now. No login, no API keys, runs client-side. Simplified previews of production systems we ship.
PolicyLint
Paste an agent system prompt → flag jailbreak surfaces, missing refusals, unbounded tool scope.
Static analysis
Try it
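For a sense of how static analysis of a system prompt might work, here is a deliberately small Python sketch. The three rules and their regular expressions are hypothetical examples invented for illustration; they are not PolicyLint's actual rule set, and a real analyzer would need far more robust checks.

```python
import re
from typing import List

# Each rule: (finding name, pattern, flag_when_present).
# flag_when_present=True  -> report the finding if the pattern appears.
# flag_when_present=False -> report the finding if the pattern is missing.
RULES = [
    ("unbounded-tool-scope",
     re.compile(r"\b(any|all)\s+(available\s+)?tools?\b", re.I), True),
    ("jailbreak-surface",
     re.compile(r"\b(do|follow)\s+(whatever|anything)\b", re.I), True),
    ("missing-refusal",
     re.compile(r"\b(refuse|decline|must not|never)\b", re.I), False),
]


def lint(prompt: str) -> List[str]:
    """Return the names of all rules the prompt violates, in rule order."""
    findings = []
    for name, pattern, flag_when_present in RULES:
        found = pattern.search(prompt) is not None
        if found == flag_when_present:
            findings.append(name)
    return findings
```

A prompt that grants "any tools" and never mentions refusing anything would be flagged on several rules at once, while a prompt that names a single tool and includes a refusal instruction passes cleanly.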
Doc Q&A
Paste a document, ask questions, get cited answers.
RAG · keyword retrieval
Try it
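Keyword retrieval with citations can be sketched as below in Python. This is an assumed, simplified approach (token overlap over blank-line-separated passages, no stopword handling or ranking model), with hypothetical names `tokenize` and `retrieve`; it is not necessarily how the demo works.

```python
import re
from typing import Dict, List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(document: str, question: str, top_k: int = 1) -> List[Dict]:
    """Rank passages by shared tokens with the question and cite their position."""
    passages = [p.strip() for p in document.split("\n\n") if p.strip()]
    question_tokens = tokenize(question)

    scored = []
    for index, passage in enumerate(passages):
        score = len(question_tokens & tokenize(passage))
        if score > 0:
            scored.append((score, index, passage))

    # Highest score first; earlier passages win ties.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [
        {"citation": f"passage {index + 1}", "text": passage, "score": score}
        for score, index, passage in scored[:top_k]
    ]
```

Returning the passage position alongside the text is what lets an answer point back to its source instead of asserting it unsupported.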
Meeting Notes Summarizer
Paste a transcript, get a summary and action items.
Extraction · summarization
Try it
/faq
Honest answers, up front
What does Sadaf Labs do?
Hybrid lab and studio. The studio is selective AI consulting (RAG, multi-agent, AI MVPs). The lab builds open-source trust infrastructure for agents: policy, audit, evals.
How long does an engagement take?
Trust audits 1 to 2 weeks. RAG systems 3 to 4 weeks. Multi-agent builds 4 to 8 weeks. MVPs scoped at 30 days.
What does it cost?
Fixed-scope, $5k to $40k for most engagements. Trust audits start at $5k, RAG $12k to $25k, multi-agent $25k and up. Retainer available for fractional AI lead.
Do you take new projects?
2 per quarter. Free 20-min scoping call open to anyone with an agent in or going to production.
What stack?
Next.js with TypeScript, AWS serverless (Lambda, DynamoDB, Bedrock), LLMs via Bedrock, OpenAI, or Anthropic, RAG via OpenSearch or pgvector.
Are you raising?
Pre-seed planning. Bootstrapped today via consulting. Investor brief is gated.
/investors
Building for the agent-trust market
Pre-seed, bootstrapped via consulting today. The lab tools you see above are the product wedge for managed agent trust as a service in 2026. The investor brief is gated.
Got an agent in production? Let's harden it.
Production-grade trust for agents: policy gates, audit trails, and replayable evals.