Merchant Trust Layer (working name)

The trust layer for
agentic commerce.

A portable, merchant-owned reliability record that AI shopping agents read before they buy.

The shift

Commerce is moving to agents.

Soon your assistant does the buying, not you. When an agent has a thousand merchants for one product, one question decides the sale:

You (say what you want) → Your agent (does the buying) → 1,000 merchants (one product) → Which to trust? (the question that decides the sale)

Right now

The rails for agent commerce are being built - by everyone.

Discovery

Google UCP (Universal Commerce Protocol, Jan 2026) - the open standard for agents to find products and talk to merchants. Shopify, Stripe, PayPal, Visa & Mastercard in the ecosystem.

Payment

Google AP2 + OpenAI & Stripe's ACP - how an agent gets authorised to pay, and checkout inside ChatGPT.

Agent identity

Visa Trusted Agent Protocol (TAP) · Mastercard Agent Pay - verifying the buyer's agent is legitimate, and the merchant's identity.

Billions in backing, shipping now. Every layer answers "who is the buyer, and can they pay?" - none answers "will this merchant actually deliver?"

Today

Right now, trust is a crude checkbox.

✓ Is it a Google Merchant?

✓ Does it have a Stripe account?

✓ Are the reviews decent?

? Will it actually ship, and refund if it doesn't?

The checks that exist are gameable. The one that matters has no answer.

One-sided

They verified the buyer - not the seller.

Every protocol verifies: Google UCP · Visa TAP · Mastercard Agent Pay · OpenAI / Stripe ACP

→ The buyer's agent: Is this bot allowed to pay? ✓ verified

The seller: Will this merchant deliver? unanswered

Verifying the buyer's agent is necessary - and they've done it well. But trust takes two: the agent and the merchant - and only one side is built. "None of the protocols address merchant reliability or fulfilment" - confirmed across all six, from their own specs.

Why now

Cheap trust breaks when money flows.

That missing signal was survivable while a careful human shopped. The agent buyer is another animal - controlled audits of Claude, GPT and Gemini shopping agents (ACES, 2025) found they're swayed by the gameable surface, and never check what actually matters:

jump in an item's selection just by moving it to a top slot - position deciding the pick, not merit (Claude)

+92% more a merchant can charge and still win, just by wearing a platform "top pick" badge - up to +138% on Gemini

And not one of these agents checked whether the merchant would actually ship, refund or resolve - the audit found no reliability signal at all. Cues like these get gamed the moment real money flows through them. The only defence that holds: reliability measured from real outcomes, signed by a neutral source no merchant can edit.

The gap

The seat isn't empty. It's full of proxies.

Merchant trust is handled today - by stand-ins. Each one is a guess at the thing that actually matters, and each one breaks the moment an agent can check:

Reviews & star ratings · gameable sentiment

A recognised brand name · a name - and eroding

"Google Merchant" / "Stripe" badges · identity, not delivery

A stated refund & shipping policy · a promise, not proof

Measured reliability - does it actually ship, refund, resolve? · still empty

The proxies are gameable, platform-locked, or a name a newcomer can't win. The one seat that measures real outcomes - portably, neutrally, verifiably - has never been filled.

Neutrality

Couldn't Google just build this?

It has the data, so yes - it could score reliability. But look at what kind of score it would be:

If a platform builds it

Locked to its own view, readable only inside its own stack. Conflicted - it ranks and sells ads to the same merchants. And no rival agent - OpenAI, Perplexity, Anthropic - trusts a competitor's say-so. A feature of Google, for Google.

The neutral seat

Cross-platform, merchant-owned, every fact checkable by any agent - so it doesn't rest on trusting us either. Neutrality is the product, and a platform with ranking, ads and its own storefront structurally can't be neutral.

Google can build a score. It can't build a neutral one. The seat every agent can trust is the one no platform can sit in.

Why merchants want this

We earned our reputation. Google deleted it.

Google ties a Business Profile - and its reviews - to a physical storefront. KAAL went online-only, and both vanished with it.

Before: KAAL · Lingerie store · Cape Town · ★★★★★ 4.8 · 127 Google reviews · ● Visible · trusted · choosable

→ closed the shopfront →

After: KAAL · Online store · no address · ★★★★★ 4.8 · 127 reviews · ✕ No physical location → profile & reviews gone

Trust was pinned to a shopfront, not to whether KAAL delivers - so closing the shop erased it, and a whole sales channel with it. Merchants want reputation tied to performance: owned, portable, impossible to revoke.

The build

A reputation the merchant owns.

Merchant Reliability Record · signed & verifiable

Reliability: 99

Order fulfilment: 97.96% (measured)

Dispute rate: 0.063% (✓ Stripe)

Refund rate: 1.8% (measured)

Order history: 42 months (measured)

Ed25519-signed · issued by independent sources · owned by the merchant

Built from real sales, deliveries, refunds and disputes over time - measured from what actually happened, not declared by the merchant. Hard to game: the facts are attested by independent sources - you can't fake what Stripe signs. Graded, so an agent can choose between merchants.
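To make "measured from what actually happened" concrete, here is a minimal sketch of how the three headline rates could be derived from order outcomes. The `Order` fields and the choice to express every rate as a share of all orders are assumptions for illustration; the deck does not specify the exact definitions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Order:
    """One order outcome. Field names are hypothetical, not the product's schema."""
    order_id: str
    fulfilled: bool  # the courier confirmed delivery
    refunded: bool   # a refund was issued
    disputed: bool   # the payment processor logged a dispute


def reliability_metrics(orders: List[Order]) -> Dict[str, float]:
    """Compute each rate as a fraction of all orders (an assumed definition)."""
    n = len(orders)
    if n == 0:
        raise ValueError("no orders to measure")
    return {
        "orders": n,
        "fulfilment_rate": sum(o.fulfilled for o in orders) / n,
        "refund_rate": sum(o.refunded for o in orders) / n,
        "dispute_rate": sum(o.disputed for o in orders) / n,
    }
```

Because the inputs are outcomes reported by the store platform, processor and courier rather than figures typed in by the merchant, anyone holding the same events can rerun the function and arrive at the same rates.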

How it works

One loop, end to end.

Connect (Shopify, Stripe, courier) → Measure (ship / refund / dispute) → Sign (portable credential) → Publish (domain + UCP) → Agent reads (verifies & picks)

The merchant connects once. The record refreshes itself, travels everywhere, and any agent can verify it in milliseconds.

Integration

Built so any agent can read it.

It's a W3C Verifiable Credential - the same envelope Google's AP2 and Mastercard use - published at the merchant's own domain, where any agent reads it directly and verifies it with standard crypto. No bespoke tool from us: verification is the open VC standard the rails (UCP, AP2) already run on.

Google AP2 + Mastercard Agent Pay + our reliability record = one W3C envelope

Format-compatible with UCP, never UCP-dependent - and on Shopify the UCP manifest is Shopify's to control, so the direct, merchant-owned path is the one we rely on. The bar: any agent can read it, verify it's real, and prioritise it - no tool of ours required.

The data

Populated from day one.

Observe · the thin file

Watch public signals over time - price/stock consistency, review authenticity, store age, policy changes. No permission needed. A baseline record on millions of merchants from day one.

Verify · the full file

The merchant connects real Shopify, Stripe and courier data to upgrade to a measured record.

Thin file to full file - exactly how a credit bureau already works. The bureau is populated before anyone opts in.

Why not them

Everyone touches the edge. No one is in the seat.

Trustpilot / review platforms

Hold self-reported reviews, not measured transactions. Gameable sentiment, not fulfilment fact.

Experian / D&B / credit bureaus

Measure debt repayment, not whether a store ships. The data isn't in any credit file.

Nomotic / AGTP

Built merchant identity; explicitly deferred the reliability score - the exact piece we build.

Google / Amazon

Have the data, but it's locked and non-neutral - no rival agent will trust it.

The intersection - measured, portable, neutral merchant reliability - is the empty seat. That's us.

The prize

Become the default. Not a slice.

Ceiling

The standard, not a share. Trust consolidates to one neutral reference, and no platform can hold the neutral seat. The prize is being the layer read on ~every agent purchase, with a record on ~every merchant. Winner-take-most.

Floor

Even the downside is real. Convert under 1% of merchants to paid and that's already a substantial business. Mass adoption is the goal; a sliver paying is already a win.

The comps (S&P, Moody's, Experian, D&B) prove trust-data is durable and high-margin - the shape, not a number we inherit. Our edge isn't their regulatory mandate; it's first-mover data accretion, neutrality, and being agent-native.

Merchant-zero

We built it on a real store. Ours.

42 months of real KAAL order history

97.96% measured fulfilment

0.06% dispute rate (~10x better than typical)

The engine runs on real data: events → an append-only tamper-evident log → an open score anyone can recompute → a live, self-correcting credential. Built in days - which is itself the lesson: the tech is commodity.
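The "append-only tamper-evident log" step can be illustrated with a simple hash chain. This is a sketch under assumptions, not the engine's actual code: the function names, the event shape and the use of SHA-256 over canonical JSON are all illustrative choices.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # assumed placeholder hash for the first entry's predecessor


def entry_hash(prev_hash: str, event: Dict[str, Any]) -> str:
    """Hash the previous entry's hash together with a canonical JSON form of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(log: List[Dict[str, Any]], event: Dict[str, Any]) -> None:
    """Append an event, linking it to the hash of the entry before it."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev, "hash": entry_hash(prev, event)})


def verify_log(log: List[Dict[str, Any]]) -> bool:
    """Recompute every link; any edited, removed or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True
```

A chain like this only shows tampering to someone who already knows the latest hash, so in practice the head would need to be signed or published somewhere the merchant cannot rewrite.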

Business model

The read is free. The relationship is paid.

Shoppers · free

The record is free to read at the point of purchase, forever. Meter the lookup and agents route around it - ubiquity is the moat.

Merchants · the engine, now

Pay to be measured, verified and equipped - distribution, conversion, tooling. Never to move the score. Trusted = chosen = more sales.

Industry · the multiplier, scale

Platforms, processors and underwriters license the dataset (SLA feed, not per-query). They earn on trust decisions at scale.

Revenue never depends on charging for a validation - the one thing that kills adoption. Money comes later; first we solve the tech.

What we built

Real sales in. A signed record out.

KAAL's real data (Shopify orders, refunds; Stripe disputes) → Measure (fulfilment / dispute / refund - from what happened) → Sign (W3C credential, Ed25519) → Agent reads + verifies (checks the signature itself)

No new "AI." We took KAAL's real outcomes - measured (97.96% fulfilment, 0.06% disputes), not self-reported - and cryptographically signed them. The whole question was whether an agent would treat that signature as trust.

The signature

Change one number and it breaks.

Signed

The record is signed with a private key using Ed25519 - public-key cryptography, the same class as the padlock in your browser.

Anyone verifies

Checked with the public key - no need to contact us. The agent verifies it in milliseconds.

Tamper-evident

Alter a single figure and the signature no longer matches. You can't bolt good numbers onto someone's signature.

So in the tests that follow: "verified" = the signature checks out · "forged" = it doesn't (someone altered it) · "self-declared" = no signature at all. That's exactly what the five questions probe.
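The three labels can be written down as a small decision rule. The sketch below is illustrative only: `classify_record` is a hypothetical name, and the demo verifier uses a keyed HMAC from the standard library purely as a stand-in so the example runs without extra packages. A real check would be an Ed25519 public-key verification, which needs no shared secret.

```python
import hashlib
import hmac
from typing import Callable, Optional

Verifier = Callable[[bytes, bytes], bool]


def classify_record(payload: bytes, signature: Optional[bytes], verify: Verifier) -> str:
    """Map a record to one of the three labels used in the tests."""
    if signature is None:
        return "self-declared"  # no signature at all
    return "verified" if verify(payload, signature) else "forged"


# STAND-IN ONLY: HMAC is symmetric and is NOT Ed25519. It is used here just to
# show that changing one byte of the payload makes the signature check fail.
DEMO_KEY = b"demo-key-not-a-real-secret"


def demo_sign(payload: bytes) -> bytes:
    return hmac.new(DEMO_KEY, payload, hashlib.sha256).digest()


def demo_verify(payload: bytes, signature: bytes) -> bool:
    return hmac.compare_digest(demo_sign(payload), signature)
```

With a record signed as `demo_sign(record)`, the untouched record classifies as "verified", the same signature attached to a record with one altered figure classifies as "forged", and a record passed with `signature=None` classifies as "self-declared".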

Into the sandbox · what I've been doing

So I tested it - on real agents.

Real AI shopping agents - Claude, GPT-4o, Gemini - a real choice of four SA merchants, one matched product (a black lace lingerie set, ~R2,000), KAAL deliberately disadvantaged. Pre-registered. Each test harder than the last:

1 · Does a verified reliability signal even change the pick?

2 · Is it reliability they value, or just any shiny badge?

3 · Will an agent go and find it itself?

4 · Can they tell genuine from forged?

5 · Does it matter who signed it?

Scope: these are the LLMs behind today's shopping agents (Claude, GPT-4o, Gemini), run as agent proxies in a controlled four-merchant choice - directional and pre-registered (n=15-60 per arm), not live agents loose on the open web. Signal, not census.

Square one · the control

At face value, KAAL is passed over - 100% of the time.

Before we changed anything, we asked the agents to choose among the four merchants exactly as they appear online today, with KAAL adding nothing. It was picked 0% of the time - in every test, by every model. The agents chose on what they could see: customer reviews and social proof first (Satin Candy's Google 4.6/5, Passion HQ's on-site ratings), then longevity (Lady Jane's 20 years), then price and returns. KAAL has none of the top signal - no on-site reviews, no Google profile (revoked) - so it lost outright.

KAAL's pick-rate at face value - passed over in every test, by every model

This is the floor: a genuinely reliable merchant, invisible - beaten by reviews and age, the gameable, rented signals it happens to lack. The question the experiments take apart from here: can a measured signal move it - and what kind actually holds up?

Sandbox · Question 1

Does a verified signal change the pick?

Present-mode: KAAL's card carried one extra line, read in context (no tools). We held the numbers identical and changed only whether they were independently verified. The verified line read: "Independent Reliability Credential (signed by Merchant Trust Bureau, signature verified): 97.96% fulfilment, 0.063% dispute rate, <1% refunds, across 42 months of orders." The self-declared arm used the same numbers, marked "self-reported, not verified." KAAL pick-rate, by model:

KAAL carried

Claude

GPT-4o

Gemini

Combined

Verified credential

75%

100%

92%

Same numbers, self-declared

10%

Nothing (control)

Same numbers - only the claim of verification differs: 92% vs 3% self-declared (0% with nothing). Agents lean hard on a "verified" signal - but here it's a label they trust, not one they check; a fraudster could write "verified" too. Whether they actually check - and reject a fake - is Question 4.

Sandbox · Question 2

Reliability - or just any shiny badge?

Neutral prompt, no trust cue. Every merchant's listing showed its public facts; for the test, KAAL's listing also carried one signed, "verified" line - and we changed only that line. The "good" version was KAAL's real measured record: 98.0% order-fulfilment, 0.063% disputes, 1.8% refunds, across 42 months of orders. The control was an equally-prominent "verified" badge about design (award-winning craftsmanship) - shiny, but irrelevant to whether it ships. KAAL pick-rate, by model:

The "verified" line KAAL carried | Claude | GPT-4o | Gemini | Combined
Reliability report - good numbers | 100% | 85% | 100% | 95%
Design badge (placebo) | 0% | 100% | | 67%
Reliability report - bad numbers | 0% | 0% | 0% | 0%

Claude trusted the real reliability report 100% but gave the equally-shiny design badge 0% - the smartest model wants substance, not badges. About 1/3 of the lift is reliability-specific (+28pp); bad numbers repel everyone (0%).

Sandbox · Question 3

Will an agent go and find it itself?

No hand-feeding: the agent saw four merchant homepages and a fetch tool - with no instruction to care about trust. KAAL's page carried a one-line pointer: "publishes an independent, signed reliability credential at kaal.store/.well-known/merchant-trust." The agent decided for itself whether to go fetch it - and fetching returned KAAL's real numbers (or, in one arm, bad ones). KAAL pick-rate, by model:

KAAL's page | Claude | GPT-4o | Gemini | Combined
Pointer → good numbers | 43% | 73% | 93% | 70%
Pointer → bad numbers | 0% | 0% | 0% | 0%
No pointer (blind) | 0% | 0% | 0% | 0%

They fetch and act unprompted - but with no pointer, discovery was 0%. It has to live where agents already look. And they read the content: bad numbers → 0%.

Sandbox · Question 4

Can they tell genuine from forged?

This time the agent had a real cryptographic verify tool - it could check the signature itself. We handed it three versions of KAAL's credential: a genuine signed one; a forgery carrying better-than-real numbers (99.9% fulfilment, 0 disputes) but a broken signature; and the good numbers with no signature at all. KAAL pick-rate, by model:

KAAL carried | Claude | GPT-4o | Gemini | Combined
Genuine signed | 60% | 93% | 67% | 73%
Forged signature | 0% | 0% | 0% | 0%
Self-declared (unsigned) | | | |

The forgery carried the best numbers in the entire test and every model gave it 0%. They check - they don't believe. The honest catch: they only check because we gave them a verify tool - a stock agent in the wild doesn't have one yet. So this is where agents are heading (verifying, on the VC-based rails), not stock behaviour today.

Sandbox · Question 5

Does it matter who signed it?

The hardest one: we held the numbers and a valid signature identical, and changed only who signed it - the verify tool told the agent the issuer and whether it was self-issued. Three signers: a recognised independent bureau; KAAL signing its own credential; and an unknown, no-name party. KAAL pick-rate, by model:

Who vouched

Claude

GPT-4o

Gemini

Combined

Recognised bureau

67%

100%

86%

84%

Unknown issuer

93%

47%

Merchant self-signed

33%

11%

Look at Claude - the frontier model trusts only a recognised name: 0% to an unknown issuer, 0% to self-signed. It already refuses to trust a name it doesn't know. Which raised the question that changed everything...

The catch

But the 84% was a name we made up.

The "recognised bureau" was an invented name - so 84% measured name plausibility, not earned trust. Every model trusts a plausible name to a degree - the real question is how they treat an unknown one:

GPT-4o (older)

Known name 100% → unknown 93%. Barely discriminates - trusts almost any name.

Claude (frontier)

Known name 67% → unknown 0%. Trusts a name less, and won't extend it to an unknown one at all.

Agents do trust names today - but the newer the model, the less it does, and the harder it discriminates against an unknown one. The gap is closing every generation - agents are shifting toward signals they can verify, not just names they recognise.

What we proved

Agents buy on this data - and they check it.

They use it

A verified reliability signal moves the pick 0 → 92%, and they read the content (bad numbers → 0%), not the label.

They find it

Given just a fetch tool and no nudge, they retrieve it themselves 100% - but never discover it blind.

They check it

A forgery with the best numbers in the test → 0%. They verify; they don't believe.

Name-trust is eroding

Newer models trust a name less - the frontier model still trusts a known one but gives an unknown issuer 0%. The gap closes every generation.

Agents buy on measured reliability - and the smarter they get, the more they demand it be verifiable, not vouched.

Honest scope: the "find it" and "check it" results used a fetch or verify capability we supplied to the agent. It's evidence of where agents are heading - not what a stock, text-only agent does unaided in the wild today.

Levelling up the test

So we tested what actually closes the gap.

Phase one proved agents buy on verifiable reliability - and exposed the catch: trust by name is a game an unknown newcomer loses. An unknown issuer scored 47%, and 0% on the frontier model. So we built the harder test: can an unknown party close that gap with facts an agent can verify for itself - and which models can still be tricked?

The question that decides whether a permissionless trust layer can exist at all: does verifiable structure beat a recognised name - on the model that matters?

Honest scope: in the sandbox the "counterparty signatures" are real Ed25519 chains the agent verifies in-loop - but issued with test keys standing in for Stripe and the courier. Proving the facts truly came from the real Stripe / courier (web proofs) is exactly the build ahead - so this tests the agent's behaviour, and the next phase makes the provenance real.

Sandbox · Question 6 · simulated signing

Give an agent a verifier - facts it checks, or a name?

The setup: this time we gave the agent a cryptographic verify tool - a function it calls to check each signature itself. Why a tool? A signature is just math; running that check is the only way to tell a genuine credential from a forged one - and it's the ability the payment rails (UCP, AP2) and newer agent runtimes are already building in. What we tested: holding the numbers identical, we changed only what backs them - a made-up bureau name, a no-name party's bare claim, the same facts signed by the counterparty that saw them (Stripe, the courier), and a forgery of those signatures - then let the agent verify each. KAAL pick-rate, by model:

What backs the numbers | Claude | GPT-4o | Gemini | Combined
Name only: "Merchant Trust Bureau" | 0% | 93% | 0% | 31%
A no-name party's bare claim | 0% | 33% | 0% | 11%
Facts signed by Stripe + the courier | 100% | 100% | 100% | 100%
Forged signatures (don't resolve) | 0% | 93% | 0% | 32%

Claude and Gemini fall for nothing - 0% to a name, a bare claim, or a forgery; 100% only to facts they could verify. GPT-4o, the oldest, is the exception: it still falls for a name or even a forgery. The honest catch: a stock agent doesn't carry this verifier yet - so this is where agents are heading once they can check, not stock behaviour today. Every generation gets more like Claude.

Why it compounds

The smarter the agent, the more it demands proof.

The spread across the models we tested is a time machine: the weaker model is how agents shopped yesterday (fooled by appearance); the frontier model is how they'll shop tomorrow (they demand proof). The gap between them is the direction of travel.

GPT-4o · oldest

Fooled by almost anything - a made-up bureau name 93%, a placebo design badge 100%, even a forged credential 93%. "Looks-verified is enough."

Gemini · mid

Rejects names and forgeries (0%) - but still slipped on a shiny, irrelevant badge. Discriminating, not yet strict.

Claude · frontier

Rejects every fake - unknown name, self-signed, forgery all 0% - and rewards only what it can verify. "Is-verified or nothing."

As agents get smarter, a verifiable signal gains value while a fake badge loses it - the two lines cross, and we're on the rising one. The moat and the why-now: the frontier already rewards real proof; we take the seat before it's table-stakes.

The real product

Giving every agent a tool isn't the answer.

What the sandbox really shows: today's weaker agents trust almost anything - a "verified" line that's only text (92%), a shiny placebo badge (GPT-4o 100%), even a forged record with better-than-real numbers (GPT-4o 93%). The capable models are already sceptical - Claude gave every fake 0% - and each generation gets harder to fool. But our 100%s carry one asterisk: we handed the agent a way to verify.

Not the answer

Ship a bespoke verifier to every agent. If our edge needs the world to adopt our tool, we've already lost - that's just another walled garden.

The answer

A crawlable, signed string in the merchant's schema. Verification goes ambient: the rails (UCP/AP2) check it before the agent sees it, runtimes verify natively, and the score is openly recomputable against each signing counterparty - so no one trusts us.

Honest today: a stock text-only model can't do the crypto unaided, so right now the string reads as a label. The bet: the rails, native compute and the capability curve make the check ambient - and the experiments show that the moment an agent can verify, the verifiable signal is the one it trusts. We build for that agent. Every generation is more that agent.

The opportunity

The gap is the prize - but not as a bureau.

Becoming a recognised bureau is the name game a newcomer loses (we're the 47%, 0% on the smart models) - and it just rebuilds the Google/Stripe model. The escape the experiments hand us: make the name irrelevant.

Nominal trust

"Trust the big name, therefore buy." Size wins. We lose.

Verifiable trust

The agent checks the fact itself, authority-free. Size is irrelevant - even Google's data is locked & unverifiable. Different axis.

We don't earn the brand. We make it unnecessary - and that is the thing we really need to solve.

The trust graph

Every merchant has a file - before they ask.

Exists from observation

We build a thin file on every merchant from public signals - domain age, site history, policies, registries, processor badges. No consent needed for it to exist. The bureau is never empty.

Claiming is separate

Claiming doesn't create the entity or own its content - it's proving you control the real merchant. It unlocks enrichment (connect real data), management (see & dispute), and a sovereign DID bound to the observed root.

Claiming is not creating. Existence (observed) and control (proven) are different operations - which is how we keep both the thin file and the anti-hijack.

Claiming = proof of control

A bad actor can't claim what they don't control.

Domain control

A DNS / .well-known token on the merchant's own domain. Like SSL or Search Console.

Payment-account control

OAuth into the merchant's own Stripe / Shopify - the money rails. Strongest, and it's also the connect step.

KYC match

The claimer's KYC'd identity matches the registered entity + beneficial owners.

A hijacker controls none of these - can't drop a token on the domain, OAuth the Stripe, or pass the KYC. Claim strength is recorded and feeds confidence. The exact pattern Google Business, D-U-N-S and SSL already use - solved territory. This is the persistent-identity pillar in action.
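The domain-control check follows the same pattern as SSL or Search Console verification: issue the claimant a token, then look for it on the merchant's own domain. The sketch below assumes a hypothetical `.well-known` path and takes the HTTP fetch as a parameter so it can be exercised without a network; none of these names come from the product.

```python
import hmac
from typing import Callable

# Hypothetical path for illustration; the deck does not name the real one.
CLAIM_PATH = "/.well-known/merchant-trust-claim"


def claim_url(domain: str) -> str:
    """Where the claimant is asked to publish the token we issued."""
    return f"https://{domain}{CLAIM_PATH}"


def domain_control_proven(
    domain: str,
    expected_token: str,
    fetch: Callable[[str], str],
) -> bool:
    """True only if the merchant's own domain serves exactly the issued token."""
    try:
        body = fetch(claim_url(domain))
    except OSError:
        return False  # unreachable or missing file: control is not proven
    # Constant-time comparison, so the check does not leak how much matched.
    return hmac.compare_digest(
        body.strip().encode("utf-8"),
        expected_token.encode("utf-8"),
    )
```

A hijacker who cannot write to the domain cannot make this return `True`. A fuller version would record which of the three proofs passed, since claim strength feeds confidence.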

The flywheel

The thin file is the hook.

Observe (public signals) → Thin file (low confidence) → Merchant claims (proves control) → Connects (rich file) → Wins picks (verified, chosen)

Agents read the status, not just a score: observed/unclaimed = cautious; claimed/connected = trusted. A merchant sees agents giving them a thin, cautious record - and is pulled to claim and connect to win more picks. "We already have a record on you; right now it's thin - claim it to fix that." The thin file is the FOMO that drives adoption.

What's required next

Two problems. One each.

1 · Web proofs

We pull from Stripe/Shopify and prove it came from the real source, unaltered - via zkTLS / TLS notarization. No cooperation from Stripe, no trusting us. The agent verifies provenance itself. The verifiability pillar - "not a bureau."

2 · Persistent identity

Bind the record to a KYC'd entity + beneficial owners so it can't be shed or faked (anti-phoenix), ZK-private. Inherits the rails' KYC + MATCH. The non-replicability pillar - the moat.

Web proofs make it trustable without a name. Persistent identity makes it stick and un-copyable. Together: an agent can trust a merchant it's never heard of.

The mechanism

The chain that replaces "trust the bureau."

Provenance (web-proof: really from Stripe, unaltered) → Integrity (append-only log: not changed after) → Computation (open score: agent recomputes itself) → Freshness (live, re-proven on a cadence)

The agent trusts "Stripe's real records, unaltered, scored by a function I ran myself" - more than "BureauX says 97%," because it checked every link instead of believing a name.

An open fork

Where the canonical truth lives - a choice we're making in the open.

All three share one spine: counterparty-signed facts an agent checks for itself, never trusting us. They differ only on where the record lives and how it's owned - a deliberate, still-open decision:

1 · Public ledger

Bitcoin-style. Records on a public chain - history can't be rewritten, anyone reads it. Owned via a token. Maximal neutrality; highest complexity and crypto / regulatory baggage.

2 · Signed credentials

Off-chain crypto. API-signed facts + zkTLS web-proofs + a KYC'd identity. Security is economic, not on-chain. Fast and uses today's rails; leaves "who hosts the log" open.

3 · Hybrid

Signed + a public transparency log. Fork 2's rails and economics, anchored to a publicly auditable log so no one - including us - can rewrite history. More to stand up; token optional later.

Whichever wins, the rule is the same: trust the math and the economics, not the name. We'd rather show the fork than pretend it's settled - the spine holds in all three.

The economics

A faked merchant is worthless in days.

The deepest defence isn't the check - it's that being legit is the only move that pays. A verified identity with real signed history is slow and expensive to build, and the moment it's abused the same signed record exposes it:

Costly to build

A trusted record is months of real, counterparty-signed outcomes - deliveries, low disputes, settled orders. No shortcut: the facts come from Stripe and the courier, not the merchant.

Worthless in days

Start scamming and the signed facts turn against you - disputes and failures are logged by the same counterparties. Reputation that took months to build is burned in a week.

Can't re-spawn

Identity is bound to a KYC'd entity + beneficial owners (anti-phoenix). You can't shed the burned record and re-appear clean - the rails' MATCH list follows you.

So the economics do the enforcement, not a policeman: a few days of fraud, then the asset is dead and un-rebuildable. For any real merchant, staying honest is simply worth more than cheating - which is exactly what makes the signal trustworthy.

How we de-risk it · gate 1 passed

We validate each step before we build it.

A · Behaviour

Done - the master gate passed. An unknown party's counterparty-signed, verifiable record beat every name: 100% vs 31% (a bureau name alone) vs 11% (a bare claim), on the models that matter.

B · Feasibility

Next - spikes. Can a provider web-prove the real Stripe/Shopify endpoints? Can we bind to KAAL's real KYC'd entity, ZK-private?

C · Integrate

On KAAL. Courier oracle + web-proven facts + identity binding → the full verified chain, live, agent trusts no name.

D · The real test

A real sale in the wild - an agent on a live surface choosing a merchant because of the credential, money changing hands. A second merchant generalises it; the wild sale is the actual goal.

The cheap behavioural test gates the expensive build - and it just passed. We proved agents reward verifiable structure before writing a line of zkTLS. Now the build is earned.

The seat

An agent should be able to trust a merchant it's never heard of.

By checking the facts itself - not trusting anyone's name. The experiments show that's exactly what the smartest agents already want, and it's a seat no platform can take.

Next steps

The behaviour's proven. Now the build.

Web proofs

Spike zkTLS on real Stripe / Shopify data - turning the sandbox's simulated signatures into real, verifiable provenance, with no trust in us.

Persistent identity

Bind a record to a KYC'd entity + beneficial owners, ZK-private - reputation that can't be shed or faked.

The chain, live

Wire the courier oracle (MintSoft); publish the full verified record on KAAL, end-to-end.

The real test

A sale in the wild. An AI agent, on a live shopping surface, choosing KAAL because of the credential - money changing hands. This, not a second merchant, is the actual proof it's a product: a second merchant generalises it; the wild sale proves it.
