AI Vendor Due Diligence That Stands Up to Legal

Organizations rushing to adopt AI solutions often overlook critical security and compliance requirements that can expose them to significant legal liability. This article draws on insights from legal and technology experts to identify the most important questions companies must ask AI vendors before signing contracts. Two specific areas demand particular attention: understanding how vendors store and protect source data, and verifying that proper tenant isolation safeguards are in place.

Ask Where They Keep Source Data

The checkpoint that killed deals wasn't a 50-page questionnaire. One question: "Where does your training data live?"

Simple ask. Devastating answers. About 95% of generative AI vendors couldn't answer. They'd point to SOC 2 or ISO 27001—missing that those don't cover AI risks like training data residency or model weight storage. One vendor claimed EU residency. Trained on US data. Disqualified.

What flipped legal and security teams? A model card mapped to SOC 2. Plus a prompt injection red-team report. Not marketing noise. Actual testing. The model card showed which SOC 2 criteria hit which AI risks. The red-team report proved they tested for jailbreaking, injection, extraction. No docs? No deal. Vendors without them didn't fail a checklist. They failed basic security competence.

Procurement shifted from "can you sign our agreement?" to "can you prove you understand what you're building?" The filter was a bloodbath. The shortlist? Bulletproof.

RUTAO XUFounder & COO, TAOAPEX LTD

Require Strict Tenant Isolation Proof

One checkpoint that actually disqualified vendors was strict model isolation and data boundary enforcement for customer prompts and outputs. We required vendors to prove that customer data was not used for model training, fine tuning, or cross-tenant inference under any default or optional configuration. Several contenders failed here once we pushed past marketing claims and asked for architectural detail, not policy language.

The requirement was concrete. Vendors had to document where prompts were processed, how long they were retained, how they were logged, and how isolation was enforced at both the storage and inference layers. If they could not clearly demonstrate zero data reuse with technical controls rather than contractual promises, they were disqualified.

The evidence artifact that convinced legal and security was an SOC 2 report explicitly mapped to model behavior, combined with a documented prompt injection and data exfiltration red team assessment. The strongest vendors could show how their SOC 2 controls applied to model inputs, outputs, and operator access, and back that up with test results showing how the system behaved under malicious prompt scenarios.

What mattered was not having the fanciest model, but having verifiable proof that the model could be safely used in a production environment handling customer and brand-sensitive data. That level of evidence separated serious platforms from experimental ones very quickly.

Ahad ShamsFounder, Heyoz

Set Fast Incident SLAs Via Escalation Path

Clear timelines reduce harm when things go wrong. The contract should require notice of a breach within a set number of hours and should define what details must be shared. Service levels should cover uptime, response time, and fix time with real remedies if targets are missed.

Plans for backup, recovery, and incident tests should be documented and proven. There should be a single point of contact and a path to escalate fast. Set these rules and test them with a joint drill this quarter.

Define Strong Indemnity And Liability Terms

Fair risk terms keep disputes small and costs known. Indemnities should cover IP claims, data loss, and fines that stem from vendor acts. Liability caps should fit the risk and include higher caps for key harms like breach or IP claims.

The vendor should have a duty to defend and to pay for settlements and judgments that fall under the indemnity. Proof of insurance should match the caps and name the customer where allowed. Put these risk terms in place before any data or code is shared.

Confirm Ownership Plus Chain of Title

Legal due diligence should start with clear proof of who owns the code, models, and training data. Contracts should state that all data sources were used under valid licenses and that no rights were breached. Warranties should cover model outputs and say how those outputs can be used, reused, and shared.

The contract should also explain any third party components and open source terms that apply. Rights should survive a merger, sale, or end of the deal so business use can continue. Ask for a chain of title report and written warranties before signing.

Secure Robust Audit with Transparency Rights

Strong audit rights make claims real and testable. The agreement should allow review of security controls, data flows, and model training practices on a set schedule. Vendors should share third party reports, change logs, and subprocessor lists without delay.

There should be a duty to fix findings within a set time and to confirm fixes in writing. Costs and scope of audits should be fair and not block access. Write these audit and transparency duties into the contract now.

Enforce Export Controls Alongside Sanctions Compliance

Compliance with trade laws protects the deal and the team. Vendors should confirm that the product and any models are classified for export and are not sent to banned users or places. The contract should set rules for screening end users and for blocking high risk uses.

Data locations and support teams should be limited to approved regions and staff. Re export and transfer duties should be clear for all parties in the chain. Ask for written export and sanctions controls and test them before rollout.

AI Vendor Due Diligence That Stands Up to Legal