Q: Does the DPDP Act 2023 require voice AI data to stay in India?

A: DPDP itself does not impose blanket localisation. Section 16 restricts cross-border transfer to countries the Central Government notifies, with sector regulators free to impose tighter rules. Until the notified list is gazetted, most senior privacy counsel treats cross-border transfer of personal data as a risk to avoid. For payment data, RBI's April 2018 circular is the binding constraint and requires storage in India. For policyholder data, IRDAI's 2023 Information and Cyber Security Guidelines require India residency.

Q: Where does my voice AI call audio physically go when I speak to an Indian bank's bot?

A: It depends entirely on the vendor's architecture. A well-built India-resident deployment routes audio from the telephony PoP to an Indian-region runtime (AWS Mumbai, Hyderabad, or an Indian cloud like Yotta or ESDS), runs STT and LLM in India, stores recordings and transcripts in India buckets with customer-managed keys, and never sends audio or transcripts cross-border. Many vendors do not run this configuration by default. The only way to know is to read the sub-processor list in the SOC 2 and ask for the architecture diagram.

Q: Can a US-headquartered voice AI vendor be DPDP-compliant for an Indian bank?

A: Yes, but only in a fully-India configuration — runtime, STT, LLM, TTS, recording store, transcript store, embeddings and analytics all in Indian cloud regions, with sub-processor disclosure that satisfies the RBI Outsourcing Direction. The default configuration of most US vendors is not this. It is usually available on request at a 30-60% premium with explicit contractual commitment. The standard SOC 2 will not suffice on its own; you will need a residency attestation specific to your deployment.

Q: What is the difference between data residency and data sovereignty for voice AI?

A: Residency is where the bytes physically sit. Sovereignty is which jurisdiction's laws and courts have authority over them and who can compel access. A vendor can host in Mumbai (residency) and still be subject to US legal process via the CLOUD Act if the vendor is a US entity (sovereignty problem). Indian sovereign cloud providers like Yotta, ESDS, Sify and Tata Communications offer both. For sensitive regulated workloads, sovereignty is the harder requirement and the one most vendors gloss over.

Q: How do I audit a voice AI vendor's data flow before signing?

A: Send the fifteen questions in this article as a written RFI. Demand answers within ten business days. Cross-check against the SOC 2 Type II sub-processor list, the public DPA, and the privacy policy. Require an architecture diagram with every sub-processor labelled by region. Insist on right-to-audit extending to sub-processors. Run the data-flow audit in week one of any pilot, not after go-live. If the vendor cannot answer in writing or pushes back on the diagram, that is a hard signal.

Q: Does the RBI 2018 payment data localisation circular apply to voice AI calls about EMIs?

A: Yes, if the call processes or generates payment system data. An EMI reminder confirming a debit, a UPI mandate nudge, a NEFT confirmation, a card-payment-due notification — all touch payment system data. The RBI circular requires complete data, including audio and transcript, stored only in India. Foreign-region STT, LLM or recording archives are non-compliant for these call types.

Q: What is a Significant Data Fiduciary and does it change my voice AI obligations?

A: SDF designation is made by the Central Government under DPDP Section 10 based on volume, sensitivity, and risk. Most large Indian banks, insurers, telcos and hospital chains are likely to be designated. Once designated, the SDF must conduct periodic DPIAs, undergo audit, appoint a designated DPO, and implement enhanced safeguards. For voice AI, that means documented data-flow mapping, DPIA on the bot deployment, contractual sub-processor controls, and incident readiness. Vendors who cannot support these should not be on your shortlist.

Voice AI Data Residency in India 2026: DPDP + RBI

It is 6:14 PM on a Thursday and Anjali Menon, CISO at a Mumbai-headquartered private bank, has the vendor's SOC 2 Type II report open on one monitor and an architecture diagram on the other. The deck looked clean at the steering committee at 11 AM. Voice AI for collections, twelve-week pilot, ₹2.4 crore on the line. The procurement head wants the sign-off back by 7 PM. On page 47 of the SOC 2 report, in the sub-processor table, there is a single line she has been staring at for nine minutes: "Speech-to-text inference: AWS us-east-1 (N. Virginia)." The customer's WAV file, the moment a borrower says "haan bhai, kal kar deta hoon paisa" ("yes brother, I'll pay the money tomorrow"), leaves Mumbai, lands in Northern Virginia, gets transcribed, and the transcript comes back. The vendor's deck had said "India-hosted infrastructure." The SOC 2 says something else. Her board has a DPDP-compliance attestation due to the audit committee next quarter, the bank expects to be designated a Significant Data Fiduciary under DPDP, and the FAQ she is about to forward back to procurement starts with one sentence: Where does the WAV file land first?

This piece is for Anjali and everyone who shares her seat. Voice AI data residency in India is not a one-line answer in 2026. It is a stack of overlapping regulations — DPDP 2023, the RBI Storage of Payment System Data circular from 2018, the RBI 2023 cloud guidelines, IRDAI's policyholder data rules, MeitY-empanelled cloud, TRAI's framework on telecom metadata — layered on a vendor architecture that almost nobody draws honestly in their first deck. We will walk through what the law actually requires, where audio physically goes in a typical voice AI stack, which vendor patterns survive a board-level audit, and the fifteen questions to put in front of any vendor before you sign. None of this is legal advice. All of it is the conversation you are about to have anyway.

Why this stopped being a checkbox in 2026

For years, data residency was a procurement footnote. You asked the vendor, you got a yes, you moved on. That stopped working in three steps.

DPDP Act 2023 received Presidential assent in August 2023 and the implementing Rules have been notified in stages through 2025-26. DPDP is not GDPR-with-Indian-characteristics — it is a different statute with purpose limitation, narrow deemed consent, a Consent Manager intermediary class, and a Data Protection Board with penalties up to ₹250 crore per instance. The cross-border transfer regime under Section 16 is restrictive in a particular way: the Central Government will notify countries to which transfers are permitted, and any sector regulator can impose a higher standard — RBI can keep payment data home, IRDAI can keep policyholder data home, regardless of the notified list.

The RBI Storage of Payment System Data circular has been in force since April 2018 and was reinforced in the RBI Master Direction on Outsourcing of IT Services in 2023. The 2018 circular is short and absolute: payment system data, end-to-end, must be stored only in India. Voice AI that touches an EMI reminder, a payment confirmation, or a UPI mandate nudge generates payment data. If your vendor's STT runs in Virginia, you have a problem the day RBI's inspector reads your sub-processor list.

The RBI Guidance Note on Operational Risk and Resilience (April 2024) and the cloud computing guidelines spell out exit, data location, audit rights and concentration risk for cloud arrangements. IRDAI's Information and Cyber Security Guidelines, 2023 require policyholder data to reside in India. MeitY maintains an empanelled cloud provider list for regulated workloads. TRAI's framework on telecom metadata binds anyone routing voice through Indian telco infrastructure.

"Where does your data live" is no longer a checkbox. It is a stack of attestations that must be true at the per-byte level — and a CISO who signs off on a voice AI vendor without mapping the data flow is signing a personal liability cheque.

What "voice AI data" actually means: the eight data classes you need to track

Most vendor conversations stop at "we don't send your data overseas." CISO conversations should start with a sharper question: which data? A voice AI workflow produces eight data objects, each with a different regulatory profile.

Data class	What it contains	Sensitivity	DPDP/RBI/IRDAI treatment
Raw audio (WAV/Opus)	Customer voice, ambient sound, PII spoken aloud	Highest — biometric-adjacent	Personal data under DPDP; payment data if call is transactional
STT transcripts	Verbatim text of conversation	High — full PII, account numbers spoken	Personal data; subject to purpose limitation
Intermediate audio chunks	20-200ms slices sent to STT model	High in aggregate	Same class as raw audio
LLM prompt context	System prompt + transcript + customer metadata	High — joined with CRM data	Personal data; sub-processor logs apply
TTS-generated audio	Bot's spoken response	Low for content, medium for the cloned voice itself	Voice clones require explicit consent under DPDP
Recording archives	Full call recordings stored for compliance	High — long retention amplifies risk	Subject to sector retention rules + DPDP storage limitation
Embeddings and vector indices	Numerical representations for RAG/analytics	Medium — "anonymous" until inverted	DPDP-grey; embeddings can be inverted to text, treat as PII
Analytics warehouse exports	Aggregated CSAT, intent labels, KPI rollups	Low if truly aggregated, high if row-level	Depends on aggregation level

The two classes most CISOs miss are intermediate audio chunks and embeddings. Streaming STT sends 20-200ms audio chunks over a WebSocket and gets partial transcripts back in real time. Every chunk is a network hop. If the STT endpoint is in us-east-1, every chunk leaves India over an undersea cable. Embeddings are subtler: vendors will tell you they are "anonymous numerical representations," but published research on embedding inversion shows that much of the original text can be reconstructed from embeddings by someone with access to the model. Treat them as PII; DPDP's definition of personal data is broad enough to capture them.
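To put a number on the chunk traffic, here is a minimal sketch. It assumes the 20-200 ms slice sizes from the table above and a 90-second call; the function name `chunk_count` is illustrative and not part of any vendor SDK.

```python
def chunk_count(call_seconds: float, chunk_ms: int) -> int:
    """Number of audio chunks streamed to STT for a call of this length."""
    total_ms = round(call_seconds * 1000)
    # Ceiling division so a final partial chunk is still counted.
    return -(-total_ms // chunk_ms)


if __name__ == "__main__":
    for chunk_ms in (20, 100, 200):
        sends = chunk_count(90, chunk_ms)
        print(f"{chunk_ms} ms slices over a 90 s call: {sends} sends")
```

At 20 ms slices, a single 90-second call is 4,500 separate sends to the STT endpoint, and each one crosses the border if that endpoint sits outside India.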

Mapping the data flow: where the WAV file actually lives at each stage

Here is the journey of a 90-second outbound voice AI call to a customer in Lucknow, end to end. Read it as a checklist of jurisdiction questions.

Stage 1 — Telephony origination. Call originates from your Indian telephony partner (Exotel, Knowlarity, Servetel, Ozonetel, Plivo India, Twilio India, or your own SIP trunk via Tata or Airtel). Number, SIP signalling and audio all start in India because the PSTN gateway is in India. Low risk if your provider is Indian.

Stage 2 — Media routing to the voice AI runtime. RTP or WebRTC media goes from telephony to the runtime. First jurisdictional fork. AWS Mumbai (ap-south-1), Hyderabad (ap-south-2), Yotta, ESDS, Sify or NxtGen keeps media in India. Singapore (ap-southeast-1) or anywhere west, and the audio just crossed a border. Question: what region runs the orchestrator and the WebRTC SFU?

Stage 3 — Speech-to-text inference. Audio streams to the STT model. Deepgram, AssemblyAI, hosted Whisper, Google STT all default to US or EU; most started offering India endpoints in 2025-26 but the vendor must opt in. Self-hosted Whisper-large or NVIDIA Riva on India GPUs keeps audio in India but costs more. Question: which STT, which endpoint URL, which region, customer-managed keys yes or no?

Stage 4 — LLM inference. Transcript plus system prompt plus CRM context go to the LLM. As of mid-2026, Indian-region inference is available for Claude on Bedrock ap-south-1, GPT-4-class on Azure OpenAI India (preview), and Gemini on GCP Mumbai. Most vendors default to whatever is cheapest, usually us-east-1 or eu-west-1. Question: which LLM, which region, what system prompt, what customer context per turn?

Stage 5 — Text-to-speech synthesis. ElevenLabs and Cartesia are US-default; Smallest.ai runs in India. Question: which TTS, which region, where is the voice clone stored?

Stage 6 — Recording archive. The RBI Outsourcing Direction requires sales-call recordings to be retained, typically 5 years for banks and NBFCs and 3 years for insurance. Question: bucket region, encryption, retention, who has list/read permission?

Stage 7 — Transcript store. Postgres, DynamoDB or vendor-specific store for replay, dispute, audit. Question: which DB, region, what PII tokenisation (name, account number, OTP)?

Stage 8 — Embeddings and vector store. Pinecone, Weaviate, pgvector or Qdrant. Pinecone defaults to AWS us-east-1 unless you pay extra for Mumbai. Question: which store, which region, are conversation transcripts being embedded into a "memory" store?

Stage 9 — Analytics warehouse. Snowflake, BigQuery and Databricks all have Mumbai regions. Question: which warehouse, region, what raw fields are exported versus aggregated?

If you have not had a vendor whiteboard this with regions labelled on every box, you have not had a residency conversation. You have had a marketing conversation.
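One way to turn the nine stages into something checkable is a small inventory script. The sketch below is illustrative only: the `Stage` record, the `INDIA_REGIONS` set, the `india-pstn` label and the example vendor declaration are all assumptions made for this example. Only the AWS region codes come from the stages above.

```python
from dataclasses import dataclass

# Assumption: labels this sketch treats as India-resident. "india-pstn" is a
# made-up label for an Indian telephony PoP, not a real cloud region code.
INDIA_REGIONS = frozenset({"ap-south-1", "ap-south-2", "india-pstn"})


@dataclass(frozen=True)
class Stage:
    number: int
    name: str
    region: str


def cross_border_stages(stages):
    """Return the stages whose declared region is outside the India set."""
    return [s for s in stages if s.region not in INDIA_REGIONS]


# Hypothetical vendor declaration, one entry per stage of the call journey.
declared = [
    Stage(1, "telephony origination", "india-pstn"),
    Stage(2, "media routing / runtime", "ap-south-1"),
    Stage(3, "speech-to-text", "us-east-1"),
    Stage(4, "LLM inference", "ap-south-1"),
    Stage(5, "text-to-speech", "ap-south-1"),
    Stage(6, "recording archive", "ap-south-1"),
    Stage(7, "transcript store", "ap-south-1"),
    Stage(8, "embeddings / vector store", "us-east-1"),
    Stage(9, "analytics warehouse", "ap-south-1"),
]

if __name__ == "__main__":
    for stage in cross_border_stages(declared):
        print(f"Stage {stage.number} ({stage.name}) runs in {stage.region}")
```

The script is only as good as the declaration fed into it, which is why the region on every box needs to come from the vendor in writing rather than from the deck.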

The regulatory stack, mapped to the data flow

DPDP, RBI, IRDAI and TRAI overlap, conflict in places, and the strictest rule wins.

DPDP Act 2023: the floor for everyone. Applies to anyone processing personal data of a person in India. Section 8 sets Data Fiduciary obligations (notice, purpose limitation, accuracy, storage limitation, security). Section 6 sets the consent standard: free, specific, informed, unconditional and unambiguous, with clear affirmative action. Voice consent captured during a call counts only if the purpose is specifically stated, not a blanket "by continuing this call you agree." Section 10 provides for the designation of Significant Data Fiduciaries (SDFs); SDFs face DPIA, audit, and designated DPO obligations. Most large Indian banks, insurers, telcos and hospital chains are likely SDFs once notifications complete. Section 16 restricts cross-border transfer to countries the Central Government notifies, with sector regulators free to impose tighter rules. As of May 2026 no notified list has been gazetted, which most privacy counsel read as a de facto requirement to keep data in India until clarity arrives. The Consent Manager framework, also under Section 6, creates a new intermediary class; voice AI consent flows must integrate with Consent Managers where a customer uses one.

RBI Storage of Payment System Data, April 2018. Short and blunt. Complete data relating to payment systems shall be stored only in India. Foreign leg of a cross-border transaction may be stored abroad. A voice AI call confirming a UPI mandate, an EMI debit or a NEFT transaction generates payment system data; audio plus transcript must be in India. Foreign-hosted STT is non-compliant.

RBI Master Direction on Outsourcing of IT Services, April 2023. Requires identified data location, exit clauses, sub-processor disclosure, right-to-audit including sub-processors, and concentration risk management. Voice AI is generally an outsourced IT service for a bank, so the full sub-processor chain — STT, LLM, TTS, cloud, vector store — is in scope. Vendor SOC 2 reports usually stop at the first level; RBI's expectation runs the chain.

IRDAI Information and Cyber Security Guidelines, 2023. Policyholder data in India. Sales calls recorded with disclosed recording, retained policy duration plus statutory cooling-off. A voice AI sales bot that lets the LLM call slip to an EU endpoint violates both disclosure and data-location requirements.

TRAI and telecom data. Unified License conditions require subscriber data, CDRs and metadata in India. The Telecommunications Act 2023 reinforces this. TRAI DLT consent rules continue to apply at the dialler regardless of where AI inference happens.

MeitY empanelment. Cleared list for government workloads; many CISOs treat it as a shortlist for regulated private workloads. AWS, Azure, GCP, Yotta, ESDS, Sify, NxtGen, CtrlS, NTT-Netmagic and Tata Communications are typically on it.

The decision rule: the strictest applicable regulation wins. For a bank running voice AI on EMI collections, RBI 2018 + RBI 2023 outsourcing + DPDP all apply. For an insurer running pre-issuance verification, IRDAI + DPDP + TRAI DLT. For a hospital chain running appointment reminders, DPDP plus health-data treatment under the Rules plus state Clinical Establishments Act provisions.
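The decision rule can be sketched as a lookup plus a `max`. Everything in the block below is an assumption made for illustration: the `Residency` levels, the regime keys, and the level assigned to each regime are one reading of the summaries above, not a legal classification.

```python
from enum import IntEnum


class Residency(IntEnum):
    """Illustrative strictness scale; a higher value is a stricter rule."""
    NO_LOCATION_RULE = 0     # e.g. a consent rule with no storage mandate
    DISCLOSED_LOCATION = 1   # location must be identified and auditable
    RESTRICTED_TRANSFER = 2  # cross-border transfer restricted
    INDIA_ONLY = 3           # data must be stored only in India


# Assumption: one possible mapping of each regime to a strictness level.
REGIME_REQUIREMENT = {
    "DPDP": Residency.RESTRICTED_TRANSFER,
    "RBI_2018_PAYMENT_DATA": Residency.INDIA_ONLY,
    "RBI_2023_OUTSOURCING": Residency.DISCLOSED_LOCATION,
    "IRDAI_2023": Residency.INDIA_ONLY,
    "TRAI_DLT": Residency.NO_LOCATION_RULE,
}

# Workloads and regimes taken from the three examples in the text above.
WORKLOAD_REGIMES = {
    "bank_emi_collections": ["RBI_2018_PAYMENT_DATA", "RBI_2023_OUTSOURCING", "DPDP"],
    "insurer_pre_issuance_verification": ["IRDAI_2023", "DPDP", "TRAI_DLT"],
    "hospital_appointment_reminders": ["DPDP"],
}


def strictest_requirement(workload: str) -> Residency:
    """The strictest applicable regulation wins."""
    return max(REGIME_REQUIREMENT[r] for r in WORKLOAD_REGIMES[workload])


if __name__ == "__main__":
    for workload in WORKLOAD_REGIMES:
        print(workload, "->", strictest_requirement(workload).name)
```

Keeping the mapping in data rather than in code means counsel can review and amend the table without touching the logic.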

Vendor architecture patterns — what survives an audit and what does not

Voice AI vendors have converged on four broad architecture patterns. Each behaves differently under audit.

Pattern	Where audio/STT/LLM run	Recording + transcript store	DPDP	RBI 2018	IRDAI	Audit story
Fully-India	AWS Mumbai/Hyderabad or Yotta/ESDS/Sify; self-hosted STT and LLM or India-region managed	India bucket, KMS with customer keys	Defensible	Compliant	Compliant	Strong
Hybrid declared	India runtime, STT in India, LLM cross-border with redaction	India bucket, India keys	Defensible with DPIA	Grey for payment data, depends on what is sent	Risky for policyholder calls	Workable with documentation
Hybrid undeclared	Marketing says India, sub-processors in US/EU	Mixed	Hard to defend	Non-compliant	Non-compliant	Fails inspection
Fully-foreign	US-default STT, US-default LLM, US bucket	US	Non-compliant for SDFs and post-notification	Non-compliant for payment data	Non-compliant	Fails immediately

Four observations from running these comparisons across real procurement cycles.

Fully-India is achievable but costs 30-60% more per minute than the cheapest US-default configuration — self-hosted Whisper or an Indian STT provider, Bedrock ap-south-1 or self-hosted Llama-class on India GPUs, India-region TTS. For a bank running 4 lakh calls a month, the delta is material but defensible. Yotta, ESDS, Sify and Tata Communications offer Indian sovereign cloud with formal MeitY status; the performance gap to AWS Mumbai narrowed in 2025-26.

Hybrid declared is the realistic middle path for non-payment workloads. Audio and transcript stay in India; the LLM call goes cross-border only after redaction of names, account numbers, OTPs and other PII. This needs a deterministic regex-plus-NER layer before the cross-border boundary, not "the LLM is told not to log PII." Defensible under DPDP for non-payment, non-policyholder workloads; if the redaction is sloppy, you leak PII to us-east-1 and find out in the audit.

Hybrid undeclared is the most common and most dangerous pattern. Vendor deck says India-hosted; SOC 2 sub-processor list reveals US endpoints. The standard response is "but we have a DPA in place." A DPA is paperwork, not a data flow change. If the WAV file lands in Virginia, the DPA does not move it back. Anjali's 6:14 PM problem is exactly this pattern.

Fully-foreign is what global voice AI startups ship by default. Often cheaper, almost never deployable inside a regulated Indian enterprise without major architectural change. If a US-headquartered vendor says they can deploy in your VPC in ap-south-1, ask for the per-minute pricing of that configuration before you celebrate — usually 2-3x the marketing price.
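The deterministic redaction layer described for the hybrid-declared pattern can be sketched with plain regular expressions. This is a minimal illustration under stated assumptions: the patterns are simplified guesses at common formats (10-character PAN, 12-digit Aadhaar, 10-digit Indian mobile numbers, a 4-8 digit code following the word "OTP"), the `RULES` list and `redact` function are hypothetical names, and the NER pass needed for names and addresses is not shown. STT output that spells digits out as words would also slip past these patterns.

```python
import re

# Order matters: more specific identifiers are masked before looser ones.
RULES = [
    # PAN: five letters, four digits, one letter.
    (re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b"), "[PAN]"),
    # Aadhaar: twelve digits, optionally grouped in fours.
    (re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}\b"), "[AADHAAR]"),
    # Indian mobile number with an optional +91 prefix.
    (re.compile(r"(?<!\d)(?:\+91[ -]?)?[6-9]\d{9}\b"), "[PHONE]"),
    # A 4-8 digit code shortly after the word "OTP"; the word itself is kept.
    (re.compile(r"\b(otp\D{0,15}?)\d{4,8}\b", re.IGNORECASE), r"\1[OTP]"),
]


def redact(text: str) -> str:
    """Mask known identifier formats before text leaves the India boundary."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    sample = (
        "My PAN is ABCDE1234F, Aadhaar 1234 5678 9012, "
        "call me on +91 9876543210, OTP is 482913."
    )
    print(redact(sample))
```

Because the rules are deterministic, the same patterns can be handed to an auditor and replayed against sample transcripts, which is not possible with an instruction to the LLM not to log PII.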

For more on how to score these architectures in an RFP, see our voice AI vendor RFP scoring rubric for India 2026.

Fifteen vendor questions to put on paper before you sign

Send these in writing before procurement closes. Do not accept verbal answers. Attach the responses to the contract as a binding schedule.

Where does the customer's WAV file physically land first after leaving our telephony provider? Specify the cloud, region, availability zone, and the service (e.g., AWS ap-south-1, S3, bucket name pattern).
Is the audio encrypted at rest with customer-managed KMS keys (CMK in our AWS account) or with vendor-managed keys? If vendor-managed, what is the key rotation schedule and who has unwrapping access?
Which STT provider performs inference, what is the exact endpoint URL, and in which region does the model run? If multiple providers can serve a call, what is the routing logic and can it spill cross-border under load?
For streaming STT, do intermediate audio chunks cross any geographic boundary between our telephony PoP and the STT endpoint?
Which LLM provider, model version, and region serves the conversation? If multiple, which one is used in fallback and where does that fallback live?
What customer data is included in the LLM prompt on every turn — system prompt, full transcript history, CRM context fields, account numbers, balances? Provide a sample fully-redacted prompt.
What PII redaction runs before any cross-border hop? Show us the regex and NER patterns. Account number, PAN, Aadhaar, OTP, name, address, phone — which are detected, which are masked, which are tokenised reversibly versus irreversibly?
Where are full call recordings stored, in what format, with what retention, and how is access logged? Who in the vendor's team can list and download recordings and how are those actions audited?
Where are transcripts stored separately from recordings, in what schema, and is PII tokenised before storage?
If embeddings or vector indices are created from transcripts or our knowledge base, where do they live and what is the embedding model? Have you tested for embedding inversion against this configuration?
List every sub-processor — STT, LLM, TTS, cloud, telephony, vector store, observability, analytics warehouse, error tracking — with the region in which each processes our data. This is what the RBI Outsourcing Direction requires.
What is the data exit plan? On termination, in what format and by what mechanism is our data returned and how do you certify deletion across all sub-processors?
Provide the full list of countries our data may traverse or rest in under any operational scenario, including disaster recovery and failover.
Confirm DPDP Section 16 alignment. Do you transfer personal data outside India under any circumstance for our account? If yes, to which countries and under what lawful basis?
For payment-related calls (EMI, UPI mandate, NEFT confirmation), can you operate in a configuration where all eight data classes (audio, intermediate chunks, transcripts, prompts, TTS audio, recordings, embeddings, analytics) stay within Indian data centres? What is the per-minute cost premium of that configuration?

If a vendor cannot answer any of these in writing within ten business days, that is the answer.
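When the written responses come back, the sub-processor question is the easiest one to check mechanically. The sketch below assumes the vendor's answer has been keyed into a simple dictionary of category to region; the category keys mirror the sub-processor types named in the questions above, while `REQUIRED_CATEGORIES`, `rfi_gaps` and the sample response are hypothetical.

```python
# Sub-processor categories named in the sub-processor question above.
REQUIRED_CATEGORIES = (
    "stt",
    "llm",
    "tts",
    "cloud",
    "telephony",
    "vector_store",
    "observability",
    "analytics_warehouse",
    "error_tracking",
)


def rfi_gaps(response: dict) -> list:
    """Return categories the vendor left blank or did not mention at all."""
    return [
        category
        for category in REQUIRED_CATEGORIES
        if not str(response.get(category) or "").strip()
    ]


if __name__ == "__main__":
    # Hypothetical vendor response with two categories unanswered.
    vendor_response = {
        "stt": "ap-south-1",
        "llm": "ap-south-1",
        "tts": "ap-south-1",
        "cloud": "ap-south-1",
        "telephony": "India PoP",
        "vector_store": "",
        "observability": "ap-south-1",
        "analytics_warehouse": "ap-south-1",
    }
    print("Unanswered:", rfi_gaps(vendor_response))
```

A blank or missing category is treated the same way as a refusal, which matches the rule that silence in writing is itself the answer.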

The CISO's data-flow audit — what we do in week one

When a regulated enterprise signs a pilot with us, week one is not the bot build. It is the data-flow audit. Five two-hour sessions.

Session 1 — scope the call types. Which use cases are in scope (EMI reminders, KYC re-verification, appointment reminders, sales, surveys, collections) and which regulator applies to each. Determines the residency stack.

Session 2 — map the data flow. Whiteboard every box from telephony to analytics with regions labelled. Vendor names every sub-processor and encryption posture. Output: a one-page architecture diagram with jurisdiction on every arrow.

Session 3 — map consent. What is captured at IVR opening, what purpose statement is read, how it is logged, how it integrates with the DPDP Consent Manager pattern. See our DPDP Act compliance checklist for voice AI and our TRAI DLT compliance piece.

Session 4 — retention and deletion. Recording retention by use case, transcript retention, embedding lifecycle, analytics aggregation, data subject rights workflow.

Session 5 — audit and incident. Right-to-audit, sub-processor notification windows, breach notification timeline (DPDP requires intimation to the Data Protection Board and affected persons), deletion certification on exit.

Output: a residency attestation document the CISO can hand to the audit committee. Most vendors will not do this work because it requires honesty about the architecture. The few who will are the ones worth piloting.

What goes wrong in real deployments

Six failure modes from Indian bank, NBFC, insurer, hospital and telco deployments in the last eighteen months.

One — silent failover. Primary STT in Mumbai, failover in Singapore. Under load, calls quietly fail over and data leaves India. Nobody notices until the audit. Fix: contractual prohibition on cross-border failover, configuration flag that fails the call instead of falling over.

Two — observability leak. Datadog, New Relic and Sentry default to US ingestion. Production stack traces sometimes contain transcript snippets, so PII leaves India through the logging pipeline. Fix: observability on an India region or self-hosted in your VPC, with scrubbing rules verified.

Three — model improvement clause. Standard SaaS contracts grant a perpetual licence to use customer data for model improvement. Under DPDP purpose limitation, this is out of scope of the consent the customer gave. Fix: explicit carve-out — no use of voice or transcripts for training, fine-tuning or evaluation without separate per-use-case consent.

Four — embedding back-door. Vendor stores past conversations as embeddings for personalisation, sitting in Pinecone us-east-1. Primary store is in India but this back-door is leaking. Fix: embeddings in Mumbai with the same encryption posture, verified.

Five — support engineer access. A vendor engineer in San Francisco gets temporary read access to a recording to debug. Data crossed the border via screen-share. Fix: break-glass access procedure with time-boxed approval, jurisdictional restriction where policy requires, full audit logging.

Six — disaster recovery. Vendor's DR plan fails over to a US region. Under RBI outsourcing direction, DR location must be disclosed and approved. Fix: DR to a second Indian region (Mumbai primary, Hyderabad DR), not cross-border.
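The fix for the silent-failover case, a flag that fails the call rather than failing over across a border, can be sketched as an endpoint selector that refuses any region outside an allow-list. The `Endpoint` record, `NoInCountryEndpoint` exception, `pick_stt_endpoint` function and endpoint names are all hypothetical; a real deployment would wire this into whatever routing layer the vendor actually uses.

```python
from dataclasses import dataclass

# Assumption: the two AWS India regions are the only permitted locations.
ALLOWED_REGIONS = frozenset({"ap-south-1", "ap-south-2"})


class NoInCountryEndpoint(RuntimeError):
    """Raised when no healthy endpoint exists inside the allowed regions."""


@dataclass(frozen=True)
class Endpoint:
    name: str
    region: str
    healthy: bool


def pick_stt_endpoint(endpoints, allowed_regions=ALLOWED_REGIONS):
    """Return the first healthy in-country endpoint, in priority order.

    Out-of-country endpoints are never selected, even if they are the only
    healthy ones; the caller is expected to fail the call instead.
    """
    for endpoint in endpoints:
        if endpoint.healthy and endpoint.region in allowed_regions:
            return endpoint
    raise NoInCountryEndpoint("no healthy STT endpoint in an allowed region")


if __name__ == "__main__":
    pool = [
        Endpoint("stt-mumbai", "ap-south-1", healthy=False),
        Endpoint("stt-singapore", "ap-southeast-1", healthy=True),
    ]
    try:
        print("Using", pick_stt_endpoint(pool).name)
    except NoInCountryEndpoint as err:
        print("Call failed closed:", err)
```

Failing closed costs a dropped call under load, which is the trade the contractual prohibition on cross-border failover implies.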

Sector overlays — where the rules tighten

DPDP is the baseline. The overlays are where life gets interesting.

Banking and NBFC. RBI 2018 plus 2023 outsourcing direction. Voice AI for BFSI workloads — EMI reminders, KYC re-verification, payment confirmations, collections — must be fully India-resident for payment data. Recording retention typically 5 years. Right-to-audit extends to sub-processors. See our RBI Fair Practices Code piece on AI collection calls.

Insurance. IRDAI Cyber Security Guidelines 2023. Policyholder data in India. Sales calls recorded and retained. Voice cloning of agents requires explicit consent under Protection of Policyholder Interests rules. See our insurance page.

Healthcare. DPDP treats health data as sensitive, plus state Clinical Establishments Acts and the upcoming Digital Health Records framework. Healthcare deployments typically run fully-India.

Telecom. TRAI plus Telecom Act 2023. Subscriber metadata in India. DLT consent at the dialler. Voice AI vendors are effectively VAS riders on the underlying licence.

Government and PSU. MeitY empanelled cloud only. Most vendors are not on the empanelment list.

What changes in the next 12 months

A few things will move between now and mid-2027. Factor them into contract clauses.

The Central Government is expected to gazette the first list of permitted DPDP Section 16 transfer destinations. It will not override RBI or IRDAI sector rules. Expect a narrow list, possibly Quad plus Singapore. Singapore on the list would make Singapore-region inference defensible for non-regulated workloads.

DPDP Significant Data Fiduciary notifications will roll out by sector. Once designated, DPIA and audit obligations bite and vendor architecture transparency requirements stiffen. Get the architecture right before designation, not after.

RBI is expected to issue clearer guidance on AI in financial services, building on FREE-AI framework discussions. Expect named model-risk obligations and a clearer regime around AI sub-processors. IRDAI is likely to update the 2023 guidelines with explicit treatment of generative AI in policyholder interactions, including logging every AI-generated assertion for policy duration.

Indian-region availability of major LLMs will continue improving. By end of 2026, expect GPT-4-class, Claude-3.5-class and Gemini Pro all in production in Mumbai. The US-default cost gap will narrow but not close fully.

Bottom line

Voice AI data residency in India is not a checkbox or a SOC 2 line item. It is eight data classes, nine processing stages, and four overlapping regulatory regimes, and the strictest applicable rule wins. Anjali's 6:14 PM problem is solvable: get the vendor to draw the diagram honestly, send the fifteen questions, demand the residency attestation in writing, and design for fully-India or hybrid-declared depending on whether the workload touches payment or policyholder data. Vendors who can sit through that conversation without flinching are the ones to pilot. The ones who cannot will fail your audit a year from now, at which point the procurement decision will look very different from how it looked at 6 PM on a Thursday.

Why this stopped being a checkbox in 2026

For years, data residency was a procurement footnote. You asked the vendor, you got a yes, you moved on. That stopped working in three steps.

What "voice AI data" actually means — the seven data classes you need to track

Data class	What it contains	Sensitivity	DPDP/RBI/IRDAI treatment
Raw audio (WAV/Opus)	Customer voice, ambient sound, PII spoken aloud	Highest — biometric-adjacent	Personal data under DPDP; payment data if call is transactional
STT transcripts	Verbatim text of conversation	High — full PII, account numbers spoken	Personal data; subject to purpose limitation
Intermediate audio chunks	20-200ms slices sent to STT model	High in aggregate	Same class as raw audio
LLM prompt context	System prompt + transcript + customer metadata	High — joined with CRM data	Personal data; sub-processor logs apply
TTS-generated audio	Bot's spoken response	Low for content, medium for the cloned voice itself	Voice clones require explicit consent under DPDP
Recording archives	Full call recordings stored for compliance	High — long retention amplifies risk	Subject to sector retention rules + DPDP storage limitation
Embeddings and vector indices	Numerical representations for RAG/analytics	Medium — "anonymous" until inverted	DPDP-grey; embeddings can be inverted to text, treat as PII
Analytics warehouse exports	Aggregated CSAT, intent labels, KPI rollups	Low if truly aggregated, high if row-level	Depends on aggregation level

Mapping the data flow: where the WAV file actually lives at each stage

Here is the journey of a 90-second outbound voice AI call to a customer in Lucknow, end to end. Read it as a checklist of jurisdiction questions.

Stage 5 — Text-to-speech synthesis. ElevenLabs and Cartesia are US-default; Smallest.ai runs in India. Question: which TTS, which region, where is the voice clone stored?

Stage 7 — Transcript store. Postgres, DynamoDB or vendor-specific store for replay, dispute, audit. Question: which DB, region, what PII tokenisation (name, account number, OTP)?

Stage 9 — Analytics warehouse. Snowflake, BigQuery and Databricks all have Mumbai regions. Question: which warehouse, region, what raw fields are exported versus aggregated?

If you have not had a vendor whiteboard this with regions labelled on every box, you have not had a residency conversation. You have had a marketing conversation.

The regulatory stack, mapped to the data flow

DPDP, RBI, IRDAI and TRAI overlap, conflict in places, and the strictest rule wins.

Vendor architecture patterns — what survives an audit and what does not

Voice AI vendors have converged on three broad architecture patterns. Each behaves differently under audit.

Pattern	Where audio/STT/LLM run	Recording + transcript store	DPDP	RBI 2018	IRDAI	Audit story
Fully-India	AWS Mumbai/Hyderabad or Yotta/ESDS/Sify; self-hosted STT and LLM or India-region managed	India bucket, KMS with customer keys	Defensible	Compliant	Compliant	Strong
Hybrid declared	India runtime, STT in India, LLM cross-border with redaction	India bucket, India keys	Defensible with DPIA	Grey for payment data, depends on what is sent	Risky for policyholder calls	Workable with documentation
Hybrid undeclared	Marketing says India, sub-processors in US/EU	Mixed	Hard to defend	Non-compliant	Non-compliant	Fails inspection
Fully-foreign	US-default STT, US-default LLM, US bucket	US	Non-compliant for SDFs and post-notification	Non-compliant for payment data	Non-compliant	Fails immediately
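
One way to read the table above is as a decision rule over a vendor's sub-processor list. This is a minimal sketch, assuming a hypothetical `SubProcessor` record whose `disclosed` flag means the vendor named that component and its country in writing. The rule is a simplification of the four rows for illustration, not a regulatory test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubProcessor:
    role: str        # e.g. "stt", "llm", "recording_store"
    country: str     # ISO country code where the data is processed
    disclosed: bool  # named, with its country, in the written sub-processor list


def classify_pattern(subs):
    """Map a sub-processor list onto the four patterns in the table."""
    if not subs:
        raise ValueError("no sub-processors listed; nothing to classify")
    foreign = [s for s in subs if s.country != "IN"]
    if not foreign:
        return "Fully-India"
    if len(foreign) == len(subs):
        return "Fully-foreign"
    if all(s.disclosed for s in foreign):
        return "Hybrid declared"
    return "Hybrid undeclared"


# Hypothetical vendor: India runtime, declared US LLM, undeclared US error tracker.
vendor = [
    SubProcessor("stt", "IN", True),
    SubProcessor("llm", "US", True),
    SubProcessor("recording_store", "IN", True),
    SubProcessor("error_tracking", "US", False),
]
print(classify_pattern(vendor))  # Hybrid undeclared
```

The property worth noticing is that a single undisclosed foreign component, even one as peripheral as error tracking, moves the vendor from the second row to the third.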

Four observations from running these comparisons across real procurement cycles.

For more on how to score these architectures in an RFP, see our voice AI vendor RFP scoring rubric for India 2026.

Fifteen vendor questions to put on paper before you sign

Send these in writing before procurement closes. Do not accept verbal answers. Attach the responses to the contract as a binding schedule.

Where does the customer's WAV file physically land first after leaving our telephony provider? Specify the cloud, region, availability zone, and the service (e.g., AWS ap-south-1, S3, bucket name pattern).
Is the audio encrypted at rest with customer-managed KMS keys (CMK in our AWS account) or with vendor-managed keys? If vendor-managed, what is the key rotation schedule and who has unwrapping access?
Which STT provider performs inference, what is the exact endpoint URL, and in which region does the model run? If multiple providers can serve a call, what is the routing logic and can it spill cross-border under load?
For streaming STT, do intermediate audio chunks cross any geographic boundary between our telephony PoP and the STT endpoint?
Which LLM provider, model version, and region serves the conversation? If multiple, which one is used in fallback and where does that fallback live?
What customer data is included in the LLM prompt on every turn — system prompt, full transcript history, CRM context fields, account numbers, balances? Provide a sample fully-redacted prompt.
What PII redaction runs before any cross-border hop? Show us the regex and NER patterns. Account number, PAN, Aadhaar, OTP, name, address, phone — which are detected, which are masked, which are tokenised reversibly versus irreversibly?
Where are full call recordings stored, in what format, with what retention, and how is access logged? Who in the vendor's team can list and download recordings and how are those actions audited?
Where are transcripts stored separately from recordings, in what schema, and is PII tokenised before storage?
If embeddings or vector indices are created from transcripts or our knowledge base, where do they live and what is the embedding model? Have you tested for embedding inversion against this configuration?
List every sub-processor — STT, LLM, TTS, cloud, telephony, vector store, observability, analytics warehouse, error tracking — with the region in which each processes our data. This is what the RBI Outsourcing Direction requires.
What is the data exit plan? On termination, in what format and by what mechanism is our data returned and how do you certify deletion across all sub-processors?
Provide the full list of countries our data may traverse or rest in under any operational scenario, including disaster recovery and failover.
Confirm DPDP Section 16 alignment. Do you transfer personal data outside India under any circumstance for our account? If yes, to which countries and under what lawful basis?
For payment-related calls (EMI, UPI mandate, NEFT confirmation), can you operate in a configuration where all seven data classes (audio, intermediate chunks, transcripts, prompts, embeddings, recordings, analytics) stay within Indian data centres? What is the per-minute cost premium of that configuration?

If a vendor cannot answer any of these in writing within ten business days, that is the answer.
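
For the redaction question, it helps to know what a reasonable answer looks like. The sketch below is an assumption-laden illustration, not any vendor's pipeline: the patterns cover only PAN, Aadhaar, Indian mobile numbers and OTPs; `SECRET_KEY` is a placeholder for a key that would live in a KMS; and the token format is invented. Names and addresses need NER and are out of scope here.

```python
import hashlib
import hmac
import re

# Placeholder only: a real deployment would fetch this from a KMS.
SECRET_KEY = b"demo-key-do-not-use"

# One alternation, scanned in a single pass so that tokens already emitted
# are never re-matched. Each alternative captures the value in a named group.
PII = re.compile(
    r"\b(?P<PAN>[A-Z]{5}[0-9]{4}[A-Z])\b"
    r"|\b(?P<AADHAAR>[2-9][0-9]{3}[ -]?[0-9]{4}[ -]?[0-9]{4})\b"
    r"|(?<![0-9])(?P<PHONE>(?:\+91[ -]?)?[6-9][0-9]{9})(?![0-9])"
    r"|\b(?i:otp)\b\D{0,20}?(?P<OTP>[0-9]{4,8})\b"
)


def _token(kind, value):
    """Irreversible token: keyed hash of the value with separators stripped."""
    normalised = re.sub(r"[ -]", "", value)
    digest = hmac.new(SECRET_KEY, normalised.encode(), hashlib.sha256).hexdigest()
    return f"<{kind}:{digest[:12]}>"


def _replace(match):
    kind = match.lastgroup  # name of the alternative that matched
    whole = match.group(0)
    start = match.start(kind) - match.start()
    end = match.end(kind) - match.start()
    # Swap only the captured value; keep context such as the word "OTP".
    return whole[:start] + _token(kind, match.group(kind)) + whole[end:]


def redact(text):
    """Replace detected identifiers with irreversible tokens."""
    return PII.sub(_replace, text)


sample = "My PAN is ABCDE1234F, Aadhaar 2345 6789 0123, OTP is 482913, call 9876543210."
print(redact(sample))
```

Because the token is a keyed hash, the same account holder produces the same token across calls, which keeps analytics joins possible without storing the raw value. Reversible tokenisation would instead need a lookup vault, and the residency question then applies to the vault as well.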

The CISO's data-flow audit — what we do in week one

When a regulated enterprise signs a pilot with us, week one is not the bot build. It is the data-flow audit. Five two-hour sessions.

Session 4 — retention and deletion. Recording retention by use case, transcript retention, embedding lifecycle, analytics aggregation, data subject rights workflow.
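
The output of a retention session can be written down as a schedule that a purge job reads. A minimal sketch follows; the data classes, use cases and day counts in `RETENTION_DAYS` are placeholders invented for the example and are not regulatory retention periods.

```python
from datetime import date, timedelta

# Placeholder retention periods in days. These numbers are made up; the real
# values are whatever legal and compliance agree in the session.
RETENTION_DAYS = {
    ("recording", "collections"): 365,
    ("recording", "support"): 90,
    ("transcript", "support"): 180,
    ("embedding", "support"): 180,
}


def purge_due(data_class, use_case, created):
    """Date on which an artefact must be deleted; raises if no rule is agreed."""
    try:
        days = RETENTION_DAYS[(data_class, use_case)]
    except KeyError:
        raise LookupError(f"no retention rule for {data_class}/{use_case}") from None
    return created + timedelta(days=days)


def overdue(artefacts, today):
    """Return the (data_class, use_case, created) tuples past their purge date."""
    return [a for a in artefacts if purge_due(*a) < today]


print(purge_due("recording", "support", date(2026, 1, 1)))  # 2026-04-01
```

Raising on a missing rule is deliberate: an artefact type with no agreed retention period should stop the job and surface the gap, not default to keeping the data indefinitely.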

What goes wrong in real deployments

Six failure modes from Indian bank, NBFC, insurer, hospital and telco deployments in the last eighteen months.

Sector overlays — where the rules tighten

DPDP is the baseline. The overlays are where life gets interesting.

Healthcare. DPDP has no separate sensitive-data tier, but health data is treated as sensitive in practice, and state Clinical Establishments Acts and the upcoming Digital Health Records framework add their own requirements. Healthcare deployments typically run fully-India.

Telecom. TRAI plus Telecom Act 2023. Subscriber metadata in India. DLT consent at the dialler. Voice AI vendors are effectively VAS riders on the underlying licence.

Government and PSU. MeitY empanelled cloud only. Most vendors are not on the empanelment list.

What changes in the next 12 months

A few things will move between now and mid-2027. Factor them into contract clauses.

Caller Digital
