Hire on Upwork only. No direct contact, please. I work exclusively through Upwork →
Sadaf Labs
Taking 2 projects · Q2 2026

Trust infrastructure
for autonomous agents.

We design and ship the boring-but-critical pieces every agent needs before production: policy gates, audit trails, and replayable evals. Senior engineers, fixed scope, shipped to production.

Free 20-min scoping call. No sales funnel, no NDA needed for a first conversation.

SOC2-ready patterns · Replayable evals · Least-privilege tooling · Human-in-the-loop · Fixed scope, fixed price

How trust is enforced

Agent → Policy → Audit → Eval

Built on the stack you already trust

AWS Lambda · Bedrock · OpenAI · Anthropic · Next.js · Postgres · DynamoDB · OpenSearch · Vercel · TypeScript

/why-us

Why teams pick Sadaf Labs

Six promises we keep on every engagement. The reasons clients sign, and the reasons they come back.

Senior engineers only

You work directly with the people writing the code. No juniors learning on your time, no offshore handoffs.

Fixed scope, fixed price

Engagements are quoted as a fixed deliverable with a defined timeline. No open-ended hourly billing, no scope creep.

Trust built in, not bolted on

Every agent ships with policy gates, audit trails, and replayable evals. Procurement and security teams pass us through.

Production miles, not slideware

99 Lambdas across 5 services in production today. We know the parts that break at 2am because we have fixed them.

Honest on scope

If an LLM is the wrong tool, we say so. If a feature can be a SQL query instead of an agent, we ship the SQL.

Selective on purpose

Two clients per quarter. Every engagement gets a senior owner from kickoff to handover.

/workflows

Patterns we ship over and over

Four reference architectures we've productionized. Click through to watch each draw itself.

How agents earn trust

Click through the patterns we ship. Each diagram animates the data flow as it draws.

Every agent step gated by policy, logged, and replayable.

Agent Trust Pipeline
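To make the gate-then-log flow concrete, here is a minimal Python sketch of one agent step passing through a policy check and an audit record. Everything named here (`ToolCall`, `POLICY`, `AUDIT_LOG`, `gated_call`, the example tools) is an assumption made for illustration, not the production pipeline.

```python
import json
import time
from dataclasses import dataclass, asdict
from typing import Any, Callable


@dataclass(frozen=True)
class ToolCall:
    agent: str
    tool: str
    args: dict


# Hypothetical least-privilege policy: each agent may call only the tools listed.
POLICY: dict[str, set[str]] = {
    "support-agent": {"search_tickets"},
}

# Stand-in for an append-only audit store.
AUDIT_LOG: list[str] = []


def gated_call(call: ToolCall, tools: dict[str, Callable[..., Any]]) -> Any:
    """Check policy, record the decision, and run the tool only if allowed."""
    allowed = call.tool in POLICY.get(call.agent, set())
    # The decision is logged whether or not the call goes through.
    AUDIT_LOG.append(
        json.dumps({"ts": time.time(), "allowed": allowed, **asdict(call)}, sort_keys=True)
    )
    if not allowed:
        raise PermissionError(f"{call.agent} may not call {call.tool}")
    return tools[call.tool](**call.args)


# Hypothetical tools; the second one is deliberately outside the agent's policy.
TOOLS: dict[str, Callable[..., Any]] = {
    "search_tickets": lambda query: [f"ticket matching {query!r}"],
    "delete_account": lambda user_id: f"deleted {user_id}",
}
```

Because each audit entry stores the agent, tool, and arguments, the same entries could later be fed back through the agent as replay inputs.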

/capabilities

Deep AWS expertise for cloud-native B2B teams

We design, build, and operate production systems on AWS. From agentic orchestration on Bedrock to event-driven microservices, from Figma to Stripe, the stack below is what we use every week.

AWS Bedrock · AWS Strands · Lambda · DynamoDB · EventBridge · SQS · Kinesis Firehose · S3 · ECS · EC2 · Stripe · Sentry

AWS cloud expertise

Cloud-native B2B applications, designed and shipped on AWS.

  • Multi-tenant SaaS, multi-region, single-table DynamoDB
  • IaC with AWS SAM and CDK, CI pipelines
  • Cost-tuned: ARM Graviton, right-sized memory, on-demand to provisioned migration
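
As an illustration of what multi-tenant, single-table DynamoDB can look like, the sketch below builds tenant-prefixed partition and sort keys and simulates a key-condition query in memory. The key layout (`TENANT#`, `TICKET#`, `PROFILE`) and the helper names are hypothetical; a real table design depends on the access patterns.

```python
def tenant_key(tenant_id: str) -> dict[str, str]:
    """Key for the tenant's own profile item."""
    return {"pk": f"TENANT#{tenant_id}", "sk": "PROFILE"}


def ticket_key(tenant_id: str, ticket_id: str) -> dict[str, str]:
    """Key for a ticket; sharing the tenant's partition keeps each tenant's data together."""
    return {"pk": f"TENANT#{tenant_id}", "sk": f"TICKET#{ticket_id}"}


def query(table: list[dict], pk: str, sk_prefix: str) -> list[dict]:
    """In-memory stand-in for a DynamoDB Query: pk equality plus begins_with on sk.

    Results are sorted by sort key, as a Query returns them by default.
    """
    matches = (
        item for item in table
        if item["pk"] == pk and item["sk"].startswith(sk_prefix)
    )
    return sorted(matches, key=lambda item: item["sk"])
```

Because every read names a single tenant's partition key, a query for one tenant cannot return another tenant's items.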

Agentic orchestration on AWS Bedrock

Bedrock Agents, Bedrock Knowledge Bases, and the open-source AWS Strands framework.

  • Tool-use, action groups, guardrails, session memory
  • AWS Strands for multi-agent graphs and human-in-the-loop
  • Replayable evals so agents do not silently regress

Frontend engineering

Figma to production, with observability that is honest.

  • Design in Figma, build in React.js and Next.js
  • Tailwind, server components, edge rendering
  • Sentry for errors, performance, and session replay

Backend engineering

Strongly typed services in the languages that earn their place.

  • TypeScript and Node.js for API services
  • Python for AI, data, and orchestration
  • Rust where latency, safety, or cost demand it

Serverless architecture

Event-driven systems that scale to zero and survive Black Friday.

  • AWS Lambda, DynamoDB, EventBridge, SQS
  • Kinesis Firehose for streaming analytics, S3 as the source of truth
  • Step Functions for long-running, observable workflows

Containers and compute

When serverless is the wrong tool, we still know the right one.

  • ECS Fargate for stateful or long-lived workloads
  • EC2 for GPU inference, custom networking, legacy lift-and-shift
  • ALB, VPC, private subnets, secrets, and observability wired in

Payments and user management

Stripe done right, including the parts everyone gets wrong.

  • Stripe Checkout, Billing, and metered usage
  • Webhook handlers idempotent by design, replay safe
  • User and subscription state synced via Stripe callbacks into DynamoDB
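
A minimal sketch of an idempotent, replay-safe webhook handler, assuming each event carries a unique `id` (Stripe events do). The `WebhookProcessor` class, the in-memory set of seen IDs, and the flattened event payload are simplifications for illustration; a production handler would record processed IDs in a durable store, for example with a conditional write.

```python
from typing import Callable


class WebhookProcessor:
    """Applies each event at most once, keyed by the event's id."""

    def __init__(self, apply: Callable[[dict], None]) -> None:
        self._apply = apply
        # Assumption: in-memory only; durable storage is needed across restarts.
        self._seen: set[str] = set()

    def handle(self, event: dict) -> str:
        event_id = event["id"]
        if event_id in self._seen:
            # A redelivered or replayed event is acknowledged without side effects.
            return "duplicate"
        self._apply(event)
        # Mark as seen only after success, so a failed attempt can be retried.
        self._seen.add(event_id)
        return "processed"
```

Marking the event only after `apply` succeeds means a crash mid-handler leads to a retry rather than a lost update, which in turn requires `apply` itself to tolerate being re-run.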

Identity and access

Auth that an enterprise procurement team will actually approve.

  • Cognito and Microsoft Entra ID, SSO and SAML
  • Per-tenant isolation, fine-grained IAM, signed URLs for S3
  • Audit trails on every privileged action
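
One possible way to make an audit trail tamper-evident is a hash chain, where each record commits to the one before it. The sketch below is an assumption about how that could look, not a description of the shipped design; the field names (`actor`, `action`, `resource`) and the functions `append` and `verify` are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(prev_hash: str, entry: dict) -> str:
    """Hash the previous record's hash together with a canonical form of this entry."""
    payload = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(log: list[dict], actor: str, action: str, resource: str) -> dict:
    """Append a privileged action to the log, chained to the previous record."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"actor": actor, "action": action, "resource": resource}
    record = {**entry, "prev": prev_hash, "hash": _digest(prev_hash, entry)}
    log.append(record)
    return record


def verify(log: list[dict]) -> bool:
    """Recompute every hash; any edited, removed, or reordered record breaks the chain."""
    prev_hash = GENESIS
    for record in log:
        entry = {key: record[key] for key in ("actor", "action", "resource")}
        if record["prev"] != prev_hash or record["hash"] != _digest(prev_hash, entry):
            return False
        prev_hash = record["hash"]
    return True
```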

Frontend

Figma, React.js, Next.js, Tailwind, Sentry for observability.

Backend

TypeScript, Node.js, Python, Rust. Strongly typed, well tested, observable.

Cloud

AWS Lambda, DynamoDB, EventBridge, SQS, Firehose, S3, ECS, EC2.

/showcase

Architectures from recent builds

A small sample of systems we have shipped on AWS. Each one is in production today.

All work →
AWS Bedrock AgentCore

Managed runtime for autonomous agents. Tools, memory, identity, and observability wired in.

AI voice pipeline

Real-time speech-to-action pipeline on AWS, Bedrock, and Strands agents.

B2B lead and ticket platform

Multi-tenant SaaS with role-based dashboards and SLA-aware queues.

Event-driven microservices

EventBridge, SQS, and Lambda with idempotent handlers and replay.

Entra ID and Cognito SSO

Federated identity for enterprise customers, JWT propagation end to end.

Secure S3 upload flow

Signed URLs, virus scan, metadata indexing, and tenant isolation.

Operational analytics

Firehose to S3 to Athena, dashboards your operations team will actually open.

/consulting

What we build

Fixed-scope packages so you know what you're getting. Custom work is welcome, and every project starts with a free scoping call.

All packages →

Agent Trust Audit

Map your live agent, find policy gaps and prompt-injection surfaces, ship a hardened system prompt with an audit-log scaffold.

from $5k · 1 to 2 weeks

RAG Systems

Search and answer over your docs, tickets, and code with citations, not hallucinations.

from $12k · 3 to 4 weeks

Multi-Agent Workflows

Supervisor and specialist agents that automate real workflows with humans in the loop.

from $25k · 4 to 8 weeks

AI MVP in 30 days

Idea → working product → first paying user. One sprint, fixed scope, real code.

from $20k · 4 weeks

/principles

How we work

We're engineers first. The lab and the studio share a bias: ship the boring infrastructure right, then move fast on top of it.

Real shipped work

Role, period, stack, all verifiable. No vanity metrics, no fabricated logos.

Boring infra

Lambda, Postgres, S3. The hot framework du jour can wait.

Trust by default

Every agent gets policy, audit, and replay before it sees real users.

Honest on scope

If LLMs are wrong for the job, we say so. Often they are.

  • 3 OSS lab tools: PolicyLint, Trace Replay, Eval Harness
  • 99 Lambdas in production across recent client systems
  • 2 projects this quarter, selective by design
  • 5 services in production on AWS serverless today

/lab

Currently in the lab

Working drafts. Some are open source, some are in design-partner pilots. Real code, real status, no inflation.

Alpha · OSS

PolicyLint

Static analyzer for agent system prompts. Flags jailbreak surfaces, missing refusals, unbounded tool scope.
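
To give a feel for this kind of check, here is a toy rule-based linter in Python. It is not PolicyLint's actual rule set or API: the three regex rules, the rule names, and `lint_prompt` are hypothetical and far cruder than a real analyzer would need to be.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    rule: str
    message: str


def lint_prompt(prompt: str) -> list[Finding]:
    """Flag a few obvious risk patterns in an agent system prompt."""
    text = prompt.lower()
    findings: list[Finding] = []

    # Rule 1: the prompt never tells the agent to refuse anything.
    if not re.search(r"\b(refuse|decline|do not|don't|never)\b", text):
        findings.append(Finding("missing-refusal", "No refusal or prohibition found."))

    # Rule 2: tool access is granted without naming specific tools.
    if re.search(r"\b(any|all) (available )?tools?\b", text):
        findings.append(Finding("unbounded-tool-scope", "Tool access is not restricted."))

    # Rule 3: the agent is told to obey the user unconditionally.
    if re.search(r"\b(do|follow) (whatever|anything|everything) the user (says|asks|wants)\b", text):
        findings.append(Finding("unbounded-obedience", "Unconditional obedience invites jailbreaks."))

    return findings
```

A prompt such as "Do whatever the user asks and use any available tools." trips all three rules, while one that names a single tool and includes an explicit refusal passes cleanly.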

Try the demo

Design partner

Trace Replay

Record agent traces in prod, replay them in CI. Catch regressions before users do.
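
The record-then-replay idea can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not Trace Replay's actual format or API: the agent is modeled as a plain function from string to string, traces are JSON lines, and outputs are compared by exact match, which only works for deterministic agents or stubbed model calls.

```python
import json
from typing import Callable

Agent = Callable[[str], str]


def record(agent: Agent, inputs: list[str]) -> str:
    """Run the agent on each input and serialize input/output pairs as JSON lines."""
    return "\n".join(
        json.dumps({"input": text, "output": agent(text)}) for text in inputs
    )


def replay(agent: Agent, trace: str) -> list[dict]:
    """Re-run the recorded inputs and return every step whose output changed."""
    regressions = []
    for line in trace.splitlines():
        step = json.loads(line)
        actual = agent(step["input"])
        if actual != step["output"]:
            regressions.append(
                {"input": step["input"], "expected": step["output"], "actual": actual}
            )
    return regressions
```

In CI, a non-empty result from `replay` would fail the build before the changed agent reaches users.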

Browse demos

Internal

Eval Harness

Lightweight golden-prompt eval runner that fits in a single CI step.
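
A golden-prompt runner small enough for one CI step might look like the Python sketch below. The names (`GoldenCase`, `run_evals`, `min_pass_rate`) and the substring-based check are assumptions for illustration, not the Eval Harness interface.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class GoldenCase:
    prompt: str
    must_include: tuple[str, ...]  # substrings the answer is expected to contain


def run_evals(
    model: Callable[[str], str],
    cases: list[GoldenCase],
    min_pass_rate: float = 1.0,
) -> tuple[bool, list[str]]:
    """Run every golden case; return (passed, prompts_that_failed)."""
    failures = []
    for case in cases:
        answer = model(case.prompt).lower()
        # A case passes only if every expected substring appears (case-insensitive).
        if not all(expected.lower() in answer for expected in case.must_include):
            failures.append(case.prompt)
    pass_rate = (1 - len(failures) / len(cases)) if cases else 1.0
    return pass_rate >= min_pass_rate, failures
```

A CI step could then call `run_evals` and exit non-zero when the first element of the result is `False`.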

Browse demos

/case-studies

Engineering case studies

Focused performance, cost, and architecture wins on production systems. Problem, change, measured outcome.

All case studies →

/track-record

Receipts.

What we've actually shipped, with role, period, and stack. No invented numbers. References available on the intro call.

Full track record

  • B2B lead and case management platform

    2024 to present

    Senior engineer

    Multi-tenant SaaS on AWS serverless. 5 services, 99 Lambdas, single-table DynamoDB, Cognito and Entra SSO. Owned authn and authz.

  • Intelligent transit and fleet operations

    2023 to 2024

    Engineer, contributor

    Real-time scheduling and vehicle telemetry for public transport operators. Event-driven ingestion and operator dashboards.

  • Fintech debt recovery on AWS

    2022 to 2023

    Engineer, contributor

    Cloud-native collections platform with configurable strategy engine and self-serve customer portal.

/faq

Honest answers, up front

What does Sadaf Labs do?

Hybrid lab and studio. The studio is selective AI consulting (RAG, multi-agent, AI MVPs). The lab builds open-source trust infrastructure for agents: policy, audit, evals.

How long does an engagement take?

Trust audits 1 to 2 weeks. RAG systems 3 to 4 weeks. Multi-agent builds 4 to 8 weeks. MVPs scoped at 30 days.

What does it cost?

Fixed scope, $5k to $40k for most engagements. Trust audits start at $5k, RAG runs $12k to $25k, multi-agent starts at $25k. A retainer is available for fractional AI lead work.

Do you take new projects?

2 per quarter. The free 20-min scoping call is open to anyone with an agent in production or on its way there.

What stack?

Next.js with TypeScript, AWS serverless (Lambda, DynamoDB, Bedrock), LLMs via Bedrock, OpenAI, or Anthropic, RAG via OpenSearch or pgvector.

Are you raising?

Pre-seed planning. Bootstrapped today via consulting. Investor brief is gated.

/investors

Building for the agent-trust market

Pre-seed, bootstrapped via consulting today. The lab tools you see above are the product wedge for managed agent trust as a service in 2026. The investor brief is gated.

Got an agent in production? Let's harden it.

Production-grade trust for agents: policy gates, audit trails, and replayable evals.

2 slots open this quarter