Data is the foundation of all AI models. Weakness here propagates through the entire system. Eight distinct risks:
| Risk | Description | Severity | Control recommendation | Regulatory ref |
|---|---|---|---|---|
| Data quality & lineage | Models trained on stale, incomplete, unrepresentative or undocumented data. Unclear lineage makes root-cause analysis impossible. | High | Data quality SLAs; automated lineage tracking; golden dataset versioning; DQ dashboards by data source | SR 11-7 model documentation |
| Sensitive data exposure | PII, PCI, material non-public information, trade secrets leaking into training sets or GenAI prompts. | Critical | Data classification framework; DLP rules pre-training; prompt scanning & redaction; data masking; PII detection in model inputs | GDPR, PCI-DSS, GLBA |
| Bias in training data | Historical bias (e.g., past lending discrimination) encoded into model behavior; disparate impact on protected groups. | High | Fairness testing (disparate impact analysis); demographic parity & equalized odds checks; representative sampling; stratified evaluation sets; bias documentation in model card | Fair Lending regs, EU AI Act Annex III |
| Data drift & staleness | Training data becomes unrepresentative of real-world distribution. Model accuracy degrades silently. | High | Automated drift detection; retraining triggers; holdout test set monitoring; concept drift detection; version control on retraining data | SR 11-7 monitoring requirements |
| Synthetic / poisoned data | Adversarial actors inject or poison training data to corrupt model behavior; synthetic data with incorrect labels. | Medium | Data provenance tracking; anomaly detection on input distributions; red-team synthetic data; version control & approval gates for training data changes | NIST AI RMF |
| Privacy leakage in training | Model may memorize & regurgitate training examples (especially GenAI); privacy attacks can extract personal data from models. | High | Differential privacy in training; membership inference testing; differential privacy audits for LLMs; data minimisation in prompts | GDPR, CCPA |
| Cross-border data transfer | Training data flows to non-compliant jurisdictions with no GDPR adequacy decision or other valid transfer mechanism in place. | Medium | Regional data residency enforced; Standard Contractual Clauses; data transfer impact assessments; contractual restrictions on sub-processors | GDPR Articles 44–50 |
| Unlicensed or copyrighted training data | Training data includes copyrighted content, news articles or proprietary datasets without license or consent. | Medium | Training data audit & licensing review; indemnification agreements with data providers; content filter exclusions; opt-out mechanisms for rightsholders | Copyright law, EU AI Act Annex III |
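
The "prompt scanning & redaction" control listed against sensitive data exposure can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the two regular expressions, the placeholder tokens and the `redact_prompt` helper are hypothetical, and a production control would normally sit on a vetted DLP or PII-detection service rather than hand-written patterns.

```python
from __future__ import annotations

import re

# Illustrative patterns only (assumption): real deployments need far broader coverage.
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")
# Candidate card numbers: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum used by payment cards."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact_prompt(text: str) -> tuple[str, dict[str, int]]:
    """Replace emails and Luhn-valid card numbers with placeholders.

    Returns the redacted text and a count of redactions per type, which
    could feed a DLP dashboard or audit log.
    """
    counts = {"EMAIL": 0, "CARD": 0}

    def _card(match: re.Match) -> str:
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            counts["CARD"] += 1
            return "[CARD]"
        return match.group()  # leave digit runs that are not card numbers untouched

    def _email(match: re.Match) -> str:
        counts["EMAIL"] += 1
        return "[EMAIL]"

    text = CARD_RE.sub(_card, text)
    text = EMAIL_RE.sub(_email, text)
    return text, counts
```

The Luhn check is a design choice to reduce false positives: long digit runs such as order references are left alone unless they also pass the card checksum. Returning counts alongside the text lets the gateway log that a redaction happened without logging the sensitive value itself.
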
The model itself — its logic, accuracy, fairness and robustness — is the second critical risk layer. Seven distinct risks:
| Risk | Description | Severity | Control recommendation | Regulatory ref |
|---|---|---|---|---|
| Model error & inaccuracy | Predictions fall outside acceptable error bounds in validation, test or production. Precision, recall or AUC degradation. | High | Rigorous validation with independent test set; challenger model approach; performance SLOs by business segment; monitoring & retraining triggers | SR 11-7 validation |
| Hallucination (GenAI) | LLMs confidently generate false, nonsensical or off-topic outputs. Especially risky for financial advice, documentation, or customer-facing use. | Critical | Grounding to trusted knowledge bases; RAG (Retrieval Augmented Generation) architecture; human-in-the-loop for high-stakes outputs; hallucination detection tooling; guardrails on output format & content | EU AI Act Annex III, FCA guidance |
| Explainability gap | Inability to explain a model's decision to a customer, regulator or the decision-maker. Black-box model in high-stakes use. | High | Explainability tooling (SHAP, LIME, attention visualization); challenger models that are inherently interpretable; model cards documenting global & local explanations; decision rationale logging | GDPR right to explanation, EU AI Act Annex III |
| Model bias & fairness | Disparate impact across protected groups (race, gender, age, etc.). Model systematically disadvantages a customer segment. | Critical | Fairness testing pre-launch & ongoing (disparate impact ratio, equalized odds, calibration); stratified performance evaluation; bias documentation; mitigation strategies (thresholding, fairness constraints in training) | Fair Lending regs, ECOA, FCA guidance, EU AI Act |
| Robustness & adversarial attacks | Model sensitive to small input perturbations or adversarial attacks; prompt injection can manipulate LLM outputs. | Medium | Adversarial testing & red-teaming; input filtering & validation; prompt injection detection; robustness metrics (perturbation tolerance); guardrails on model outputs | NIST AI RMF |
| Model staleness & concept drift | Model performance degrades because the real world has changed; concept drift is not detected or remediated. | High | Automated performance monitoring; drift detection algorithms; retraining triggers & schedules; periodic model challenger evaluations; version control on all model artifacts | SR 11-7 monitoring |
| Overfitting & poor generalization | Model fits training data too closely; poor performance on new, unseen data. Cross-validation gaps. | Medium | Proper train/validation/test splits; cross-validation; regularization tuning; holdout test set evaluation; performance parity across cohorts & time periods | SR 11-7 validation |
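
Two of the fairness checks named in the table, the disparate impact ratio and equalized odds, reduce to simple per-group rates. The sketch below is a plain-Python illustration under stated assumptions: labels and predictions are binary with 1 as the favourable outcome, every group contains both positive and negative true labels, and the function names are hypothetical rather than taken from any fairness library.

```python
from collections import defaultdict


def selection_rates(y_pred, groups):
    """Share of favourable (1) predictions per group."""
    pos, total = defaultdict(int), defaultdict(int)
    for p, g in zip(y_pred, groups):
        total[g] += 1
        pos[g] += int(p == 1)
    return {g: pos[g] / total[g] for g in total}


def disparate_impact_ratio(y_pred, groups, protected, reference):
    """Selection rate of the protected group divided by that of the reference group."""
    rates = selection_rates(y_pred, groups)
    if rates[reference] == 0:
        raise ValueError("reference group has no favourable predictions")
    return rates[protected] / rates[reference]


def equalized_odds_gaps(y_true, y_pred, groups):
    """Largest between-group gaps in true positive rate and false positive rate.

    (0.0, 0.0) means equalized odds hold exactly on this sample.
    Assumes every group has at least one positive and one negative true label.
    """
    stats = defaultdict(lambda: {"tp": 0, "pos": 0, "fp": 0, "neg": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        s = stats[g]
        if t == 1:
            s["pos"] += 1
            s["tp"] += int(p == 1)
        else:
            s["neg"] += 1
            s["fp"] += int(p == 1)
    tprs = [s["tp"] / s["pos"] for s in stats.values()]
    fprs = [s["fp"] / s["neg"] for s in stats.values()]
    return max(tprs) - min(tprs), max(fprs) - min(fprs)
```

A common screening convention, borrowed from the "four-fifths" rule of thumb in US employment guidance, flags a disparate impact ratio below 0.8 for review. Treat that threshold as an assumption to be set by the institution's own fair lending policy, not as a legal test.
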
Operational risk materializes after deployment. Models that looked good in validation can fail in production if not properly deployed and monitored. Eight distinct risks:
| Risk | Description | Severity | Control recommendation | Regulatory ref |
|---|---|---|---|---|
| Shadow AI & rogue deployments | Business teams using external AI tools (ChatGPT, Copilot, bespoke SaaS) outside IT/risk visibility; unvetted models in production. | High | AI inventory audit with deep packet inspection; enterprise GenAI gateway & allowlist; SaaS application controls; user education & amnesty; regular audits of what tools are actually in use | SR 11-7, SS1/23 |
| Deployment-validation gap | Model performs well in validation but fails in production due to different data, latency, or infrastructure issues. | High | Canary deployment (e.g., 10% of traffic); A/B testing before full rollout; shadow mode deployment (parallel to incumbent); production performance monitoring pre/post rollout; rollback procedures | SR 11-7 |
| Inadequate monitoring & alerting | Model degradation, drift, bias or failures not detected; risk materializes silently until a customer complaint or regulatory exam. | Critical | Real-time SLO monitoring (latency, accuracy, fairness); automated alerting with escalation; model performance dashboard; data quality monitoring; drift detection; alert response SLAs | SR 11-7, SS1/23 |
| Cost & compute concentration | Runaway inference cost from large-scale GenAI use; heavy reliance on single GPU vendor (e.g., NVIDIA) creating supply chain risk. | Medium | FinOps discipline (tagging, budgets, alerts); multi-model strategy reducing vendor lock-in; cost optimization (model quantization, caching, batch processing); compute capacity planning | Operational resilience |
| Model versioning & rollback failure | Inability to revert to prior model version; unclear which version is in production; audit trail lost. | Medium | Model registry with version control; immutable artifacts; automated version promotion gates; rollback procedures & testing; audit logs of all version changes | SR 11-7 model documentation |
| Latency & performance SLA breach | Model takes too long to score; inference latency causes downstream system timeouts or poor user experience. | Medium | Latency SLOs by use case; performance profiling pre-deployment; auto-scaling & load balancing; caching strategies; model compression for inference speed | Operational resilience |
| Data pipeline failures | Data quality issues, stale data or missing features in the production data pipeline feeding the model. | High | Data pipeline monitoring & alerting; schema validation; feature freshness checks; data quality dashboards; fallback / degraded mode if data unavailable | SR 11-7 monitoring |
| Undocumented or missing model lineage | Cannot trace a production model back to training data, validation report, or deployment approval; audit trail incomplete. | High | Model registry with full lineage tracking; automated model card generation; deployment approval workflow & audit logs; Git version control on model code & training specs | SR 11-7, Audit requirements |
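
Drift detection with automated alerting, which recurs across the monitoring controls above, is often implemented with the Population Stability Index (PSI). The sketch below is one possible approach, not a prescribed method. It assumes that a feature or score has already been binned into the same buckets for the training baseline and for recent production traffic. The 0.10 and 0.25 thresholds are widely quoted rules of thumb, used here as assumptions that each institution would calibrate for itself.

```python
import math


def psi(expected_counts, actual_counts, eps=1e-6):
    """Population Stability Index between two histograms over the same bins.

    expected_counts: bin counts from the baseline (e.g., training) data
    actual_counts:   bin counts from recent production data
    """
    if len(expected_counts) != len(actual_counts):
        raise ValueError("histograms must share the same bins")
    e_total, a_total = sum(expected_counts), sum(actual_counts)
    score = 0.0
    for e, a in zip(expected_counts, actual_counts):
        # Floor proportions at eps so empty bins do not cause log(0) or division by zero.
        e_pct = max(e / e_total, eps)
        a_pct = max(a / a_total, eps)
        score += (a_pct - e_pct) * math.log(a_pct / e_pct)
    return score


def drift_status(score, warn=0.10, alert=0.25):
    """Map a PSI score to a monitoring status using assumed rule-of-thumb thresholds."""
    if score >= alert:
        return "alert"
    if score >= warn:
        return "warn"
    return "stable"
```

In practice the status would be computed per feature and per model score on a schedule, with "alert" wired to the escalation path and retraining triggers described in the table.
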
Most financial institutions now depend on external AI providers, open-source models, and cloud infrastructure. Supplier risk is material. Seven distinct risks:
| Risk | Description | Severity | Control recommendation | Regulatory ref |
|---|---|---|---|---|
| Foundation model dependency | Heavy reliance on a single LLM provider (e.g., OpenAI) for GenAI use cases; lock-in risk; single point of failure. | Medium | Multi-model architecture with model abstraction layer; evaluate alternatives (Anthropic, Meta, open-source); contractual commitments from vendor; cost diversification strategy | SS1/23 third-party risk |
| Vendor opacity & auditing gaps | Unable to assess foundation model training data, fine-tuning approach, safety evaluations, or control testing. Vendor documentation is sparse. | High | Standardized vendor questionnaires (AI Act Annex III requirements); on-site vendor audits; contractual audit rights; third-party assessment reports; SLAs on model performance & safety | EU AI Act Annex III, SS1/23 |
| Data residency & jurisdictional risk | Sensitive customer data or training data flows to non-compliant jurisdictions; cloud provider processes data outside agreed regions. | Critical | Regional endpoint enforcement in contracts; data residency SLAs with penalties; technical controls (encryption in transit, VPN, regional gateways); GDPR Adequacy Assessment mapping | GDPR, GLBA, CCPA |
| Open-source license & compliance risk | OSS models with restrictive licenses (e.g., GPL); viral IP obligations; commercial use restrictions; unclear licensing terms. | Medium | License audit of all OSS components; SBOM (Software Bill of Materials) for models; legal review before deployment; usage guidelines & restrictions in contracts; indemnification from partners | Copyright law |
| Vendor service interruption | Provider outage, discontinuation of service, or capacity constraints disrupt business-critical AI systems. | High | Multi-vendor redundancy where feasible; contractual SLAs with financial penalties; backup models & fallback processes; disaster recovery testing; scenario analysis of vendor failure | SS1/23, Operational resilience |
| Vendor training data leakage | Third-party provider uses customer data in training their foundation model; confidential information or customer PII exposed. | Critical | Explicit opt-out clauses in contracts; no data retention beyond contract term; audit rights for training data use; indemnification provisions; DPIA for data sharing | GDPR, confidentiality obligations |
| Model poisoning via supply chain | Provider intentionally or negligently deploys a model with embedded malicious behavior, backdoors or undetected bias. | Medium | Vendor security assessments; red-team evaluation of provided models; behavioral testing before deployment; vendor incident response SLAs; contractual liability provisions | NIST AI RMF, Cyber risk frameworks |
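
The "model abstraction layer" and "backup models & fallback processes" controls can be combined in a thin routing layer that hides individual vendors behind one interface. The following is a minimal sketch under stated assumptions: `ModelRouter` and `AllProvidersFailed` are hypothetical names, and each provider is modelled as a plain callable from prompt to text rather than any real vendor SDK.

```python
from __future__ import annotations

from typing import Callable, Sequence


class AllProvidersFailed(RuntimeError):
    """Raised when every configured provider raised an error."""


class ModelRouter:
    """Try providers in priority order and fall back to the next one on failure."""

    def __init__(self, providers: Sequence[tuple[str, Callable[[str], str]]]):
        if not providers:
            raise ValueError("at least one provider is required")
        self._providers = list(providers)

    def complete(self, prompt: str) -> tuple[str, str]:
        """Return (provider_name, response) from the first provider that succeeds."""
        errors = []
        for name, call in self._providers:
            try:
                return name, call(prompt)
            except Exception as exc:  # broad on purpose: any vendor failure triggers fallback
                errors.append(f"{name}: {exc}")
        raise AllProvidersFailed("; ".join(errors))
```

Returning the provider name with each response keeps an audit trail of which vendor actually served a request. That matters because a fallback model may have different validation status, data residency or contractual terms from the primary.
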
How AI is used matters as much as how well it works. Ethical lapses damage brand, invite regulation, and harm customers. Eight distinct risks:
| Risk | Description | Severity | Control recommendation | Regulatory ref |
|---|---|---|---|---|
| Customer harm & mis-selling | AI recommends unsuitable products, pricing or services; customer not informed AI made the decision; suitability not assessed. | Critical | Suitability checks before AI recommendations; human-in-the-loop for high-stakes advice; explainability to customer; complaint handling & escalation; documented due diligence | FCA ICOBS, FINRA, SEC, GDPR |
| Discrimination & disparate treatment | AI systematically disadvantages protected groups (race, gender, age, disability, religion); disparate impact in lending, pricing or hiring. | Critical | Fairness testing across protected attributes; disparate impact monitoring; adjustment mechanisms (thresholding, fairness-aware training); diversity in training data; audit trails & documentation | Fair Lending regs, ECOA, FCA guidance, Employment law |
| Consent & transparency failures | Customers unaware of AI involvement in decisions affecting them; insufficient disclosure; no right to opt out. | High | Clear, upfront disclosure that AI is used; explain what AI assessed; offer human alternative; publish AI governance commitments; GDPR-compliant automated decision-making notices | GDPR Art. 22, FCA guidance, GLBA |
| Reputational damage | Public failure of AI system (bias scandal, hallucinated financial advice, security breach) damaging brand and customer trust. | High | Crisis playbook & incident response team; communication templates; media monitoring; executive training; customer remediation program; independent review of failures | Regulatory guidance |
| Workforce displacement & morale | Poorly managed AI automation displaces staff without reskilling; low staff morale; litigation risk; union pushback. | Medium | Staff communication & change management; reskilling programs; transition support; redeployment opportunities; staff representatives on AI governance committees | Labor law, Internal governance |
| Conflicts of interest in AI decisions | AI used to maximize bank profit at expense of customer interest (e.g., predatory pricing, inattention to suitability). | High | Suitability assessments required before AI recommendations; conflicts of interest disclosure; customer interest over bank profit policy; auditable decision logic; human override capability | ICOBS, MiFID, fiduciary duty |
| Unequal access to AI benefits | AI benefits go to affluent customers while AI-driven cost-cutting or denial affects vulnerable populations; widening digital divide. | Medium | Equitable access policy; responsible AI principles; impact assessments for vulnerable populations; affordability & inclusion in AI use case design | Regulatory guidance, Social responsibility |
| Environmental & social impact | Large language models consume enormous energy; training data sourcing involves labor exploitation or environmental harm. | Low–Medium | Carbon footprint assessment of AI models; vendor sustainability questionnaire; consideration of more efficient model alternatives; ESG reporting on AI impact | TCFD, ESG frameworks |
The regulatory landscape for AI is fragmenting. Overlapping regimes create compliance burden. Eight distinct risks:
| Risk | Description | Severity | Control recommendation | Regulatory ref |
|---|---|---|---|---|
| EU AI Act non-compliance | Failure to classify systems, document training data, implement human oversight, or complete conformity assessment for high-risk systems. | Critical | AI inventory with EU AI Act classification; training data governance & documentation; risk management system (Article 9) for Annex III high-risk systems; human oversight audit trails; conformity assessment & technical documentation; regulatory monitoring for changes | EU AI Act Articles 4–37 |
| SR 11-7 & SS1/23 gaps | AI and GenAI models not properly classified under model risk management; validation, monitoring, documentation requirements not met. | High | Extend existing MRM policy to cover all models including ML & GenAI; independent validation gate; ongoing monitoring & revalidation; model inventory & risk tiering; documentation & audit trails | SR 11-7, SS1/23 |
| Intellectual property infringement | Training data includes copyrighted content, code or proprietary databases without license; model outputs reproduce training data; copyright litigation. | Medium | Training data audit & licensing review; copyright clearance for all data sources; indemnification agreements with data vendors & model providers; output filtering; rightsholder opt-out mechanisms | Copyright law, DMCA |
| Privacy & GDPR violations | Lawful basis for processing personal data in training unclear; automated decision-making without human review; data subject rights (access, deletion, portability) not honored. | High | DPIAs for all high-risk AI uses; transparent lawful basis documentation; human review for decisions impacting rights; rights fulfillment processes; GDPR training for teams; data minimisation in model inputs | GDPR (notably Arts. 6, 15–22, 35) |
| Fair lending & discrimination law violations | Credit models with disparate impact; insurance pricing with age discrimination; hiring AI with gender bias; regulatory enforcement action. | Critical | Disparate impact testing & monitoring; fair lending audits; protected attribute exclusion (or justified use cases); monitoring & remediation of adverse action; customer notification if AI is critical to denial | ECOA, FCRA, GDPR |
| Consumer protection & disclosure gaps | Failure to disclose AI use in financial advice; inadequate explanation of AI decisions; no human escalation path available. | High | Clear disclosure of AI involvement; explainability in plain language; right to human review; complaint handling procedures; training on disclosure requirements | FCA ICOBS, FINRA, SEC |
| Cross-border supervisory divergence | Conflicting expectations across EU, UK, US, SAMA, other regimes; compliance with one regime creates non-compliance with another. | Medium | Legal & regulatory mapping across all jurisdictions where institution operates; apply most-restrictive interpretation; document compliance choices; engage regulators proactively; adjust governance by jurisdiction if needed | Multiple, overlapping frameworks |
| Contractual liability & indemnification gaps | Contracts with AI vendors lack adequate indemnification for IP infringement, data breaches, or model failures; liability allocation unclear. | Medium | Vendor contract review & amendment; ensure indemnification for IP, data protection, model performance; audit rights & SLAs; insurance coverage assessment; liability caps & dispute resolution processes | Contract law, Risk management |
A board-approved risk appetite statement for AI should define tolerance by tier and risk category. Here is a template:
This simplified heat map ranks the 40+ risks by typical severity and likelihood in a financial services institution. Position indicates relative priority for governance investment.
Top-right quadrant = highest priority. Bubble size = relative number of controls available. This is illustrative; actual risk positioning varies by institution and use case.
| Risk Category | Top 3 Controls | Secondary Controls |
|---|---|---|
| Data risk | Data quality SLAs, PII classification & DLP, fairness testing | Lineage tracking, drift monitoring, synthetic data testing |
| Model risk | Independent validation, fairness monitoring, hallucination detection | Explainability tooling, adversarial testing, version control |
| Operational risk | AI inventory & classification, real-time monitoring, production SLOs | Canary deployment, automated alerting, data pipeline monitoring |
| Third-party risk | Vendor assessment questionnaire, contractual controls, data residency enforcement | Multi-vendor redundancy, audit rights, SLAs & indemnification |
| Conduct risk | Suitability checks, fairness testing, customer disclosure & transparency | Human-in-the-loop escalation, conflict of interest monitoring, complaint handling |
| Regulatory risk | AI governance policy, regulatory mapping, compliance documentation | Legal contract review, DPIA templates, incident response playbooks |