What is the typical voice AI price per minute in India in 2026?

Production-grade voice AI in India is quoted between ₹3 and ₹9 per connected minute, depending on language coverage, voice quality, integration complexity and call volume. The cheapest end (around ₹3) is usually stripped-down Hindi TTS with no regional support and no CRM writeback. The middle (₹4–₹6) adds decent Hindi and one or two regional languages. The premium band (₹7–₹9) includes studio-grade multilingual TTS, sub-300ms latency, HIS/CRM integration, human handoff, and DPDP-aligned deployment. Per-minute price alone is a misleading metric: what you should compare is cost per successful outcome.

Why is cheap voice AI often more expensive in practice?

Because the per-minute rate hides three costs: higher hangup rates on poor-quality Hindi and regional TTS (patients and borrowers drop off in the first 10 seconds), lower first-call resolution that forces re-dials, and weaker CRM integration that leaks data and forces manual cleanup. A ₹3/minute bot with a 40% hangup rate in Patna Hindi has a real cost per completed conversation close to ₹8–₹10. A ₹9/minute bot with a 10% hangup rate and clean CRM writeback typically lands around ₹11–₹13 per completed conversation — and closes the outcome the cheap bot never reaches at all.

What metric should I actually use to compare voice AI vendors?

Cost per retained outcome. For collections, that means cost per right-party contact (RPC) with a promise-to-pay captured. For appointment booking, cost per confirmed slot. For customer care, cost per first-contact resolution. Run a 2-week paid pilot with every shortlisted vendor, ask for raw call logs, and compute cost-per-outcome yourself. Ignore vendor-published averages: they are almost always cherry-picked from English-only campaigns and do not reflect your language mix.
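As a rough illustration of computing cost per outcome yourself from raw call logs, the sketch below assumes each log row carries a billed duration in seconds and a yes/no outcome flag. The field names (`billed_seconds`, `outcome`) and the sample rows are hypothetical, not any vendor's export format.

```python
from dataclasses import dataclass


@dataclass
class CallLog:
    billed_seconds: int  # hypothetical field: seconds the vendor billed for this call
    outcome: bool        # hypothetical field: True if the call produced the outcome you track


def cost_per_outcome(logs: list[CallLog], rate_per_min: float) -> float:
    """Total billed spend divided by the number of successful outcomes."""
    total_minutes = sum(log.billed_seconds for log in logs) / 60
    outcomes = sum(1 for log in logs if log.outcome)
    if outcomes == 0:
        raise ValueError("no successful outcomes in the sample")
    return total_minutes * rate_per_min / outcomes


# Made-up sample: four calls, two of which produced an outcome.
sample = [CallLog(90, True), CallLog(30, False), CallLog(60, False), CallLog(120, True)]
print(cost_per_outcome(sample, rate_per_min=3.0))  # 7.5
```

Run the same function over each shortlisted vendor's pilot logs, with that vendor's rate, to get numbers you can compare directly.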

How much does voice quality really affect outcomes?

More than any other factor, including the LLM prompt. Internal benchmarks across Indian deployments consistently show that upgrading from commodity Hindi TTS to a studio-grade, prosody-tuned voice lifts completion rates by 18–32 percentage points in Tier-2 and Tier-3 markets. The math is brutal: at the same per-minute price, a cheap TTS with 50% completion costs 1.7× as much per completed call as a premium TTS with 85% completion. Even at half the per-minute price, the cheap voice comes out only about 15% cheaper per completed call, a margin that re-dials and manual cleanup can easily erase. Voice is not a comfort feature. It is the single biggest lever on unit economics.
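A minimal sketch of the completion-rate arithmetic, assuming each attempt is billed for the same number of minutes whether or not it completes. The ₹6 rate and the one-minute attempt length are placeholders, and the function name is ours.

```python
def cost_per_completed_call(rate_per_min: float,
                            minutes_per_attempt: float,
                            completion_rate: float) -> float:
    """Spend across all attempts divided by the share of attempts that complete."""
    if not 0 < completion_rate <= 1:
        raise ValueError("completion_rate must be in (0, 1]")
    return rate_per_min * minutes_per_attempt / completion_rate


# Same per-minute price, different completion rates (50% vs 85%).
cheap = cost_per_completed_call(6.0, 1.0, 0.50)
premium = cost_per_completed_call(6.0, 1.0, 0.85)
same_price_ratio = cheap / premium
print(round(same_price_ratio, 2))  # 1.7
```

Swap in each vendor's quoted rate and the completion rate you measured in your own pilot to see where the break-even sits for your traffic.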

Are Indian voice AI vendors cheaper than global ones?

On sticker price, yes — by a wide margin. Global platforms like Vapi, Retell or Bland run at roughly ₹12–₹25 per minute once Indian network egress, TTS and LLM costs are loaded in, versus ₹3–₹9 for Indian-built platforms. But the relevant comparison is not sticker price. It is whether the global vendor can actually deliver Hindi and regional Indian languages at the quality buyers need. Most cannot. For an Indian enterprise buyer running meaningful volume in Hindi, Tamil, Telugu or Marathi, a domestic vendor is almost always better value — provided you evaluate on outcome cost, not input cost.

Should I negotiate per-minute pricing or commit to a minimum?

For volumes under 100,000 minutes per month, stay on pay-as-you-go per-minute pricing — you keep the option to switch if quality slips. Above 100,000 minutes, a committed-use discount in the 15–25% range is reasonable, but only after a successful 4–6 week pilot. Never commit to an annual minimum without first running production volume in production conditions — Indian language performance varies enormously by vendor and by region, and the commitment locks you into a vendor whose weaknesses you have not yet discovered.

What hidden costs should I watch for in a voice AI quote?

Six items buyers commonly miss. One, setup or onboarding fees — some vendors charge ₹2–₹5 lakh upfront for voice and persona design. Two, integration fees — HIS, EMR or CRM connectors are often ₹1–₹3 lakh per integration. Three, telephony egress — the SIP or PSTN minutes are sometimes billed separately from the AI minutes. Four, TTS surcharges — premium voices and regional languages may carry a 20–40% premium. Five, storage fees for call recordings beyond a retention window. Six, change management fees for prompt or flow updates after go-live. Ask for an all-in quote with every line item, in writing, before you compare vendors.

What does a sensible pilot look like before committing?

A 2–4 week paid pilot on one use case, one or two languages, and 5,000–20,000 minutes of real production traffic. Set success criteria upfront: target completion rate, target outcome rate, maximum allowable hangup rate in each language, and a maximum cost per outcome. Insist on raw call logs, not aggregated dashboards. Listen to at least 50 random calls in each language with a native speaker. If the vendor refuses any of this, walk away — serious vendors welcome scrutiny because it shortens their sales cycle and raises their win rate.
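One way to keep the pilot honest is to write the success criteria down as data before the first call and check the results against them mechanically. The sketch below is an assumption about how that might look; the class names, field names and thresholds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PilotCriteria:
    min_completion_rate: float
    min_outcome_rate: float
    max_hangup_rate: float        # applied to every language separately
    max_cost_per_outcome: float   # rupees


@dataclass
class PilotResult:
    completion_rate: float
    outcome_rate: float
    hangup_rate_by_language: dict[str, float]
    cost_per_outcome: float


def failed_criteria(result: PilotResult, criteria: PilotCriteria) -> list[str]:
    """Return the names of the criteria the pilot missed (empty list means pass)."""
    failures = []
    if result.completion_rate < criteria.min_completion_rate:
        failures.append("completion rate")
    if result.outcome_rate < criteria.min_outcome_rate:
        failures.append("outcome rate")
    for language, rate in result.hangup_rate_by_language.items():
        if rate > criteria.max_hangup_rate:
            failures.append(f"hangup rate ({language})")
    if result.cost_per_outcome > criteria.max_cost_per_outcome:
        failures.append("cost per outcome")
    return failures


# Illustrative thresholds and results only.
criteria = PilotCriteria(0.70, 0.20, 0.25, 150.0)
result = PilotResult(0.78, 0.22, {"Hindi": 0.18, "Tamil": 0.31}, 140.0)
print(failed_criteria(result, criteria))  # ['hangup rate (Tamil)']
```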

Voice AI Pricing India 2026: The 7 Contract Clauses Vendors Hide | Caller Digital

Summary: The ₹3 vs ₹9 per-minute debate is a distraction. The real price of voice AI in India is decided by seven clauses buried in the contract: how a "connected minute" is defined, the inbound/outbound differential, retry and answer-machine handling, setup and integration fees, minimum commits and credit expiration, TTS voice and language surcharges, and CRM write-back and seat fees. This post walks through each clause, shows how a ₹3/min headline rate becomes ₹11.40/min in production, and gives you a 10-question RFP block you can paste straight into your next vendor evaluation.

Every NBFC, hospital chain, and ecommerce brand we have talked to in the last year has the same story. The procurement team runs a voice AI bake-off. Three vendors pitch. One quotes ₹3 per minute, one quotes ₹7, one quotes ₹12. The procurement lead picks the ₹3 vendor on cost, gets them through legal, deploys them — and six months later the CFO asks why the monthly invoice is twice what the ₹7 vendor would have been, for worse outcomes.

The instinct is to blame the salesperson. That is wrong. The salesperson told the truth: the rate is ₹3 per minute. What they did not tell you is that "minute" means something different in their contract than in yours, and that six other clauses in the same contract each add 10–40% to the real monthly spend in ways that do not show up until month three.

This post is not another argument that per-minute pricing is broken. That argument has been made, well, by others — and it is correct but not actionable. What buyers actually need is a contract-clause checklist: a specific list of the seven line items that decide what "₹3 per minute" actually means when the monthly invoice arrives. Use it as a procurement filter, an RFP question block, and a post-deployment audit tool. The numbers below are drawn from real deployments we have seen in Indian BFSI, healthcare, and ecommerce during 2024–2026.

The ₹3 vs ₹9 question is the wrong question

Before the checklist, a brief detour. The industry has spent two years arguing about whether per-minute, per-credit, or outcome-based pricing is the "fair" model. The honest answer is that all three models can be fair or abusive depending on who writes the contract. A ₹3/min vendor with transparent clauses can be cheaper than a ₹9/min vendor with predatory ones, and vice versa. Obsessing over the headline rate is how procurement teams lose the negotiation before it starts.

The more useful question is the one a CFO asks at month six: "what is our actual cost per recovered rupee, per booked appointment, per qualified lead?" That number is almost never the headline rate. It is the headline rate multiplied by a stack of clause-specific inflators, each one small, all of them compounding. In the worked example at the end of this post, a contract with a ₹3/min headline rate produces a ₹11.40/min effective rate, while a ₹9/min headline with clean clauses stays at ₹9.20/min. The ₹3 vendor ends up roughly 24% more expensive per month, and more expensive still per outcome.

The seven clauses below are where that compounding happens. If you read only one section of this post, read the connected-minute definition — it is the single biggest source of invoice shock in Indian voice AI deployments in 2026.

The 7 contract clauses that decide what you actually pay

Clause 1: The connected-minute definition — "connected" does not mean what you think

Every voice AI contract charges for "connected minutes." Most procurement teams assume this means the minutes during which the borrower or customer is actually speaking to the bot. It almost never does.

Read the contract. You will typically find one of four definitions, each more aggressive than the last:

Definition A (rare, transparent): billed time starts when the borrower says their first word and ends when the call is terminated. Silence is not billed beyond a defined threshold.
Definition B (common): billed time starts when the carrier signals answer supervision — meaning the moment the phone is picked up, before anyone has said anything. This typically adds 3–7 seconds per call, which is 5–12% of a 60-second call.
Definition C (common, aggressive): billed time starts when the call is placed, including ring time. This adds 8–15 seconds per call on Indian telecom — you are paying for ringing, even on calls that are never answered.
Definition D (predatory, and we have seen it in real Indian contracts): billed time includes carrier-side voicemail detection, answering-machine navigation, silence up to 30 seconds, and the entire hang-up tail. A 45-second human conversation becomes 75 billed seconds.

The practical test: ask the vendor to show you, in writing, exactly which Unix timestamp the billing clock starts on and which one it stops on. If the answer is vague, the answer is Definition D. A ₹3/min rate under Definition D is effectively ₹4.50–₹5 under Definition A — a 50–67% inflator before any other clause kicks in.
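To see how much a billing-clock definition moves the rate, the ratio of billed seconds to conversation seconds is a rough guide. The sketch below is illustrative only: `effective_rate` is a hypothetical helper, and the timings are the ones quoted in the definitions above.

```python
def effective_rate(headline_rate: float,
                   conversation_seconds: float,
                   billed_seconds: float) -> float:
    """Headline rate scaled by how much longer the billed time is than the conversation."""
    if conversation_seconds <= 0:
        raise ValueError("conversation_seconds must be positive")
    return headline_rate * billed_seconds / conversation_seconds


# Definition D example: a 45-second conversation billed as 75 seconds.
definition_d = effective_rate(3.0, 45, 75)
print(round(definition_d, 2))  # 5.0

# Definition B: 3 to 7 extra seconds on a 60-second call.
definition_b_low = effective_rate(3.0, 60, 63)
definition_b_high = effective_rate(3.0, 60, 67)
print(round(definition_b_low, 2), round(definition_b_high, 2))  # 3.15 3.35
```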

There is a specific sub-trap here: silence billing. Some contracts charge for the entire call duration even when the bot is silent because the borrower is reading a document, talking to someone else in the room, or thinking. Borrowers in tier-2 and tier-3 India often take longer to respond than US-trained latency assumptions allow for, which means silence billing hits Indian NBFC deployments harder than US ones. Ask: "is silence billed, and if so above what threshold and at what rate?"

Clause 2: The inbound/outbound differential

Most voice AI platforms have very different cost structures for inbound and outbound traffic, and the contract often obscures the difference by quoting a blended rate. But every enterprise deployment we have seen is at least 70% outbound, because that is where the collections, reminders, and outreach workflows live.

The gotcha: the blended rate is calculated on a typical mix the vendor has across all customers, which skews toward 50/50. Your real mix is 80/20 outbound, and outbound is where the per-minute cost is higher — because outbound calls carry the telephony termination fee, the DLT-registered template delivery fee, and often a dialler-per-minute premium that inbound does not.

The practical test: ask for the unblended inbound and outbound rates separately and compute your own weighted average based on your real mix. A ₹3 blended that is actually ₹2 inbound and ₹4 outbound becomes ₹3.60 per real-mix minute — a 20% inflator even before the connected-minute definition hits.
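The weighted-average check takes a few lines. The function name is ours and the rates are the illustrative ones from the paragraph above; substitute your vendor's unblended rates and your own traffic mix.

```python
def real_mix_rate(inbound_rate: float, outbound_rate: float, outbound_share: float) -> float:
    """Per-minute rate weighted by your actual inbound/outbound split."""
    if not 0 <= outbound_share <= 1:
        raise ValueError("outbound_share must be between 0 and 1")
    return outbound_share * outbound_rate + (1 - outbound_share) * inbound_rate


blended_quote = 3.0
rate = real_mix_rate(inbound_rate=2.0, outbound_rate=4.0, outbound_share=0.80)
inflator = rate / blended_quote - 1

print(round(rate, 2))      # 3.6
print(round(inflator, 2))  # 0.2, i.e. 20% above the blended quote
```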

Clause 3: Retries, redials, and the answer-machine penalty

In Indian collections and reminder workflows, first-call answer rates rarely exceed 35–45%. That means for every borrower you actually reach, the dialler placed 2–3 calls. The question is: who pays for the unanswered attempts?

Three common contract patterns:

Pattern 1 (rare, transparent): unanswered attempts are free. You pay only for connected calls.
Pattern 2 (common): the first two unanswered attempts per borrower per day are free, subsequent retries are billed at a reduced rate (typically 25–40% of the connected rate).
Pattern 3 (aggressive): every attempt — answered or not — is billed as a full connected minute once it reaches carrier-side ring supervision. This is almost always the case under the Definition C or D connected-minute rule.

Combined with a typical 35% answer rate, Pattern 3 roughly triples the real cost per connected conversation. A buyer on a ₹3/min contract under Pattern 3 is paying for about three dials to get to one conversation, bringing the effective rate per conversation to roughly ₹9/min without the bot ever saying a word differently.
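A rough model of Pattern 3, under simplifying assumptions that are ours rather than any contract's: every unanswered dial is billed as one full minute, and each dial is answered independently with the same probability, so reaching one borrower takes 1 / answer rate dials on average.

```python
def pattern3_effective_rate(headline_rate: float,
                            answer_rate: float,
                            conversation_minutes: float = 1.0) -> float:
    """Effective rate per conversation minute when unanswered dials are billed too."""
    if not 0 < answer_rate <= 1:
        raise ValueError("answer_rate must be in (0, 1]")
    unanswered_dials = 1 / answer_rate - 1        # expected misses per answered call
    billed_minutes = conversation_minutes + unanswered_dials * 1.0  # one full minute per miss
    return headline_rate * billed_minutes / conversation_minutes


at_35_percent = pattern3_effective_rate(3.0, 0.35)
at_40_percent = pattern3_effective_rate(3.0, 0.40)
print(round(at_35_percent, 2))  # 8.57
print(round(at_40_percent, 2))  # 7.5
```

Under these assumptions the multiplier shrinks as conversations get longer, because the unanswered dials are spread over more talk time. That is one more reason to ask for real call-length data during the pilot.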

And there is a subtler trap: answer-machine navigation. When a call hits a voicemail, some vendors' diallers attempt to navigate the prompts to leave a message. This can add 20–45 seconds per voicemail hit, all billed. Ask: "what does the dialler do when it detects an answering machine, and is that time billed?"

Clause 4: Setup, onboarding, and integration fees

Headline per-minute rates never include the one-time costs, and the one-time costs are where vendors concentrate margin they could not get into the monthly run rate. In Indian enterprise deployments we have seen:

Platform setup fees: ₹1–5 lakh, often framed as a "configuration charge" or "tenant provisioning fee."
Voice model fine-tuning: ₹2–8 lakh, framed as "language customisation" or "vertical training."
CRM integration: ₹3–10 lakh per CRM connector, even for standard connectors to Salesforce, Zoho, Kapture, or LeadSquared that the vendor has built once and resells.
Telephony provisioning and DLT template registration: ₹25,000–1 lakh per template, plus ongoing template-change fees.
Go-live support: 4–12 weeks of "launch engineering," billed at professional-services rates that are often ₹15,000–25,000 per day.

Total first-year one-time cost: typically ₹10–25 lakh for a mid-sized NBFC deployment. Amortised over 12 months of typical volume, that adds ₹0.80–₹2.10 per minute to the effective rate. A ₹3 headline becomes ₹3.80–₹5.10 purely from one-time amortisation.
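The amortisation itself is a single division. The sketch below assumes a monthly volume of 100,000 minutes, which is our assumption: it is the volume at which the ₹10–25 lakh range lands close to the per-minute figures quoted above.

```python
LAKH = 100_000  # 1 lakh = 100,000 rupees


def amortised_per_minute(one_time_fees: float,
                         minutes_per_month: float,
                         months: int = 12) -> float:
    """One-time fees spread evenly over every minute in the amortisation window."""
    return one_time_fees / (minutes_per_month * months)


low = amortised_per_minute(10 * LAKH, minutes_per_month=100_000)
high = amortised_per_minute(25 * LAKH, minutes_per_month=100_000)
print(round(low, 2), round(high, 2))  # 0.83 2.08
```

The per-minute impact falls in proportion to volume, so a buyer at three times this volume carries a third of the per-minute load for the same fees.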

The practical test: demand a total-cost-of-deployment quote, not just a per-minute quote. Ask for every one-time fee in a single line-itemised document, and ask whether any of them can be waived or amortised into the monthly rate.

Clause 5: Minimum commits, rollover, and credit expiration

The moment a contract includes a minimum monthly commit, the per-minute rate stops being the real rate. What you are actually paying is the higher of (a) your usage × rate and (b) the committed minutes × rate. In seasonal businesses like Indian ecommerce, where October–December volumes are 3–5× January–March, the commit almost always binds in the low months, meaning you are paying for minutes you never used.

Three flavours to watch:

Hard commit, no rollover: unused minutes vanish at month end. Common in smaller vendors. A ₹3/min rate on a 100,000-minute monthly commit, used at 60% in March, is effectively ₹5/min for that month's real traffic.
Credit expiration: you pre-pay for a pool of minutes (or "credits") that expire after 6–12 months. Any vendor quoting a "₹2/min credit bundle" almost always has this clause, and the real cost depends on how many credits you lose at the expiration date.
Committed growth rate: the contract specifies a volume growth curve you must hit, with penalties if you fall below. This is mostly seen in very large enterprise deals, but it has started appearing in mid-market Indian contracts as vendors try to hedge against flat-line customers.
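For the hard-commit case, the effective rate in any month is the larger of the usage bill and the commit bill, divided by the minutes actually used. A minimal sketch, assuming no rollover and a single flat rate (the function name is ours):

```python
def effective_rate_with_commit(rate_per_min: float,
                               committed_minutes: float,
                               used_minutes: float) -> float:
    """Rate actually paid per used minute under a hard monthly commit with no rollover."""
    if used_minutes <= 0:
        raise ValueError("used_minutes must be positive")
    invoice = max(used_minutes, committed_minutes) * rate_per_min
    return invoice / used_minutes


# The March example above: 100,000-minute commit, 60% utilisation.
march = effective_rate_with_commit(3.0, committed_minutes=100_000, used_minutes=60_000)
print(march)  # 5.0
```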

The practical test: demand quarterly true-ups rather than monthly, demand rollover of unused minutes for at least one quarter, and get the expiration clause deleted or extended to at least 18 months. These three changes alone can reduce the effective rate by 15–25% for a typical mid-market deployment.

Clause 6: TTS voice premium and language surcharges

Hindi voice output is not cheap — and good Hindi voice output is priced like it. Most voice AI platforms offer a baseline English TTS voice at the headline rate, then charge progressively more for regional languages, premium voices, and code-switching capability.

Typical surcharges we have seen in Indian contracts in 2026:

Hindi (baseline voice): ₹0.50–₹1.50 per minute over headline.
Hindi (premium voice with prosody): ₹1.50–₹3.00 over headline.
Tamil, Telugu, Marathi, Bengali, Kannada, Gujarati, Punjabi, Malayalam: ₹1.00–₹2.50 over headline each.
Code-switching (Hindi-English, Tamil-English): ₹1.00–₹2.00 additional over the regional premium.
Custom brand voice: ₹5–15 lakh one-time plus ₹1.00–₹2.50 ongoing per minute.

The aggregation effect is brutal. An NBFC running collection calls in Hindi, Tamil, and Telugu with code-switching — a normal pan-India deployment — can find itself paying a regional-language premium on 85% of its real traffic, which makes the headline English rate almost irrelevant. A ₹3/min headline with a ₹2 regional premium is a ₹5 effective rate on 85% of traffic, or ₹4.70 weighted.
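The weighting can be reproduced with a small helper. The language grouping and the ₹2 surcharge below mirror the example in the paragraph above; the function name and the dictionary layout are our own illustration.

```python
import math


def language_weighted_rate(headline_rate: float,
                           mix: dict[str, tuple[float, float]]) -> float:
    """mix maps a language label to (share of traffic, surcharge per minute)."""
    total_share = sum(share for share, _ in mix.values())
    if not math.isclose(total_share, 1.0):
        raise ValueError("traffic shares must add up to 1")
    return sum(share * (headline_rate + surcharge) for share, surcharge in mix.values())


mix = {
    "Hindi/Tamil/Telugu with code-switching": (0.85, 2.0),
    "English": (0.15, 0.0),
}
weighted = language_weighted_rate(3.0, mix)
print(round(weighted, 2))  # 4.7
```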

The practical test: ask the vendor what their language mix assumption is when they quote the blended rate, and recompute using your real mix. Any vendor who quotes ₹3/min without asking what languages you need is quoting you a rate they never expect to bill at.

Clause 7: CRM write-back, seat fees, API quotas, and dashboard access

The final category is the "platform fees around the rate" — the things that are not per-minute but show up on the invoice anyway. Watch for:

CRM write-back API calls: some vendors charge per API call for writing call outcomes, transcripts, and borrower-state updates back into your CRM. A 60-second call that produces 8 CRM writes at ₹0.50 per write adds ₹4 to every minute. This is the single most surprising line item on the first invoice because the sales conversation did not mention it.
Dashboard seats: per-user monthly fees for access to the reporting and campaign dashboard. Common quotes: ₹2,000–5,000 per seat per month. An NBFC collections team of 40 users adds ₹80,000–₹2,00,000 per month to the real cost.
API rate limits and burst pricing: some vendors cap your API throughput at a level that forces you to buy "burst capacity" during campaign launches.
Transcript storage and retention: long-term storage of call recordings and transcripts can be billed separately at ₹0.50–₹2.00 per minute per month of retention. For a regulatory-required 7-year (84-month) retention in BFSI, that is ₹42–₹168 per recorded minute over the retention period, or roughly 14–56× the cost of a ₹3/min call.
Sandbox and staging environments: often billed as additional tenants at a fraction of production pricing, but still meaningful.

The practical test: demand a flat platform fee that bundles dashboards, API calls, transcript storage for the full regulatory retention period, and sandbox access — or explicitly break all of these out line-by-line in the RFP so you can compare like-for-like across vendors.
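Each of these line items is simple to price on its own; the trouble is that nobody adds them up before signing. The sketch below reruns the figures quoted in the list above. The function names are hypothetical, and the 84-month window is simply 7 years expressed in months.

```python
def crm_write_cost_per_minute(writes_per_minute: float, fee_per_write: float) -> float:
    """Per-minute cost of CRM write-back when each API write is billed."""
    return writes_per_minute * fee_per_write


def seat_cost_per_month(seats: int, fee_per_seat: float) -> float:
    """Monthly dashboard seat bill."""
    return seats * fee_per_seat


def retention_cost_per_recorded_minute(fee_per_min_per_month: float,
                                       retention_months: int) -> float:
    """Storage cost of keeping one recorded minute for the whole retention period."""
    return fee_per_min_per_month * retention_months


crm = crm_write_cost_per_minute(8, 0.50)
seats_low = seat_cost_per_month(40, 2_000)
seats_high = seat_cost_per_month(40, 5_000)
storage_low = retention_cost_per_recorded_minute(0.50, 84)
storage_high = retention_cost_per_recorded_minute(2.00, 84)

print(crm)                        # 4.0
print(seats_low, seats_high)      # 80000 200000
print(storage_low, storage_high)  # 42.0 168.0
```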

Worked example: the ₹3 contract that becomes ₹11.40

Let us make this concrete. Vendor A quotes ₹3/min blended. Vendor B quotes ₹9/min blended. Your workload: 300,000 connected minutes per month, 80% outbound, 85% in Hindi with some Tamil and Telugu, 40% call answer rate, with a standard CRM write-back requirement and a 40-seat dashboard team.

Vendor A at a ₹3 headline, with every clause stacked against you:

Clause	Effect	Effective rate
Connected-minute Definition C	+30%	₹3.90
Outbound-heavy mix vs blended	+20%	₹4.68
Retry billing (Pattern 3, 40% answer)	+100% over recorded conversations	₹9.36
Setup + integration amortisation	+₹1.20/min	₹10.56
Hindi + regional language premium	+₹0.80/min weighted	₹11.36
CRM write-back at ₹0.50 × 8 calls/min	~₹0.04/min equivalent	~₹11.40

Effective rate: ₹11.40/minute. Monthly spend on 300,000 connected minutes: ~₹34.2 lakh.

Vendor B at a ₹9 headline with clean clauses:

Clause	Effect	Effective rate
Connected-minute Definition A (transparent)	0%	₹9.00
Unblended outbound rate at ₹9	0%	₹9.00
Retries free up to 3/day	~0%	₹9.00
Setup included in first-year ramp	0%	₹9.00
Hindi included at headline	0%	₹9.00
Bundled platform fee of ₹60k/month	+₹0.20/min on 300k minutes	₹9.20

Effective rate: ₹9.20/minute. Monthly spend: ~₹27.6 lakh.

The ₹3 vendor is ~24% more expensive in rupees per month and, because the retry-billing inflator is structural, far more expensive per actually recovered outcome. This is not a hypothetical: it is within 10% of three real deployments we have seen migrated off "cheap" vendors in the last 14 months.
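The two tables can be replayed as a stack of multiplicative and additive steps. The sketch below takes every increment exactly as the tables state it (including the CRM line as ₹0.04/min) rather than recomputing any of them; the variable names are ours.

```python
MONTHLY_MINUTES = 300_000

# Vendor A: each step is ("mul", factor) or ("add", rupees per minute), in table order.
vendor_a_steps = [
    ("mul", 1.30),  # connected-minute Definition C
    ("mul", 1.20),  # outbound-heavy mix vs blended
    ("mul", 2.00),  # retry billing, Pattern 3
    ("add", 1.20),  # setup + integration amortisation
    ("add", 0.80),  # Hindi + regional language premium, weighted
    ("add", 0.04),  # CRM write-back, as given in the table
]


def apply_steps(headline_rate: float, steps: list[tuple[str, float]]) -> float:
    """Apply the clause inflators in order and return the effective per-minute rate."""
    rate = headline_rate
    for kind, value in steps:
        rate = rate * value if kind == "mul" else rate + value
    return round(rate, 2)


rate_a = apply_steps(3.0, vendor_a_steps)
rate_b = apply_steps(9.0, [("add", 60_000 / MONTHLY_MINUTES)])  # bundled platform fee

monthly_a = round(rate_a * MONTHLY_MINUTES)
monthly_b = round(rate_b * MONTHLY_MINUTES)
premium_pct = round((monthly_a / monthly_b - 1) * 100)

print(rate_a, rate_b)        # 11.4 9.2
print(monthly_a, monthly_b)  # 3420000 2760000
print(premium_pct)           # 24
```

Keeping the steps as data makes it easy to swap in the answers a vendor gives to the RFP questions and see which clause moves the total most.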

The RFP question block: paste this into your next vendor evaluation

Send this verbatim to every voice AI vendor in your shortlist and demand written answers before the first demo:

1. Define "connected minute" with the exact Unix-timestamp trigger for start and stop. Is silence billed? Above what threshold?
2. Are inbound and outbound minutes billed at the same rate? If blended, what mix is the blend based on, and what are the unblended rates?
3. What is the billing treatment of unanswered dials, voicemail detections, and answer-machine navigation? Show the specific clause.
4. Itemise every one-time fee: platform setup, model fine-tuning, per-CRM integration, DLT registration, launch engineering days. Include total-cost-of-deployment, not just per-minute.
5. What is the minimum monthly commit, the rollover policy, and the credit expiration term? Can unused minutes roll over for at least one quarter?
6. What are the per-language surcharges for Hindi and each regional language, and what is the surcharge for code-switching? Quote against our language mix, not your blend.
7. Are CRM write-back API calls, dashboard seats, transcript storage, and sandbox access included in the headline rate, or billed separately? Quote each line item.
8. What is the regulatory retention period for call recordings and transcripts, and what is the storage cost over that period?
9. Will you accept a total-cost-of-ownership cap in the contract that includes every clause above?
10. Will you provide a worked example of a real customer's first 90 days of invoicing, with every line item, redacted to anonymise the customer?

Any vendor who refuses to answer any of questions 1, 4, 7, or 10 in writing is disqualified. Any vendor who accepts a TCO cap in question 9 should move to the top of your shortlist: that is the only clean signal of alignment between their pricing and your real cost.

The negotiation move that beats the headline-rate framing

The single most effective thing you can do in a voice AI RFP in India in 2026 is to refuse to negotiate on per-minute rate. Instead, demand a per-successful-outcome price cap — priced in rupees per booked appointment, per promise-to-pay captured, per qualified lead, depending on your workflow. This does three things at once:

It aligns the vendor's incentive with yours. They make more money only when you do.
It exposes the real cost structure, because any vendor who cannot commit to an outcome price knows their headline rate is hiding inflators.
It makes comparison across vendors trivial. You are no longer comparing ₹3 vs ₹9 — you are comparing ₹150 vs ₹180 per booked appointment, which is a number your CFO can actually use.

The objection you will hear: "we cannot commit to outcome pricing because outcomes depend on your data quality, your list, your brand." That objection is fair. The counter-move is a shared-risk contract — a floor per-minute rate plus an outcome bonus, with a cap on total spend. Most mid-market vendors in India will agree to this if pushed, and the ones who will not are the ones whose economics do not work at the outcomes they are pitching.
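A shared-risk invoice of this kind is easy to model before you negotiate it. The sketch below assumes the simplest form: a floor per-minute rate, a fixed bonus per outcome, and a hard cap on the monthly total. All names and figures are hypothetical placeholders, not terms from any real contract.

```python
def shared_risk_invoice(minutes: float,
                        outcomes: int,
                        floor_rate: float,
                        bonus_per_outcome: float,
                        spend_cap: float) -> float:
    """Floor charge plus outcome bonus, never exceeding the agreed spend cap."""
    uncapped = minutes * floor_rate + outcomes * bonus_per_outcome
    return min(uncapped, spend_cap)


# Illustrative month: 100,000 minutes at a ₹4 floor, ₹50 per outcome, ₹6 lakh cap.
normal_month = shared_risk_invoice(100_000, 3_000, 4.0, 50.0, 600_000)
strong_month = shared_risk_invoice(100_000, 6_000, 4.0, 50.0, 600_000)
print(normal_month)  # 550000.0
print(strong_month)  # 600000 (the cap binds)
```

The cap protects the buyer in strong months, and the floor protects the vendor when list quality is poor, which is the trade the objection above is really asking for.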

Where Caller Digital fits

We built Caller Digital's commercial model to survive this checklist. That means Definition A connected-minute billing with silence thresholds explicitly capped, inbound and outbound rates listed separately in every contract, retry billing at zero cost for unanswered dials up to a defined threshold, Hindi and regional language voices included at the headline rate for most verticals, CRM write-back bundled into a flat platform fee, transcript retention priced per real-world regulatory requirement rather than per-minute-per-month, and a willingness to commit to outcome-linked pricing for deployments where we can validate list quality.

We are not the cheapest headline rate in the Indian voice AI market, and we do not try to be. We are the vendor whose total-cost-of-ownership over 12 months is the lowest for buyers who actually run the clause-by-clause arithmetic. If the arithmetic favours someone else for your specific workload, we will tell you — because an NBFC that deploys us on the wrong use case at the wrong volume is an NBFC that will churn in month eight, which is worse for everyone.

If you are evaluating voice AI and want to pressure-test your shortlist against this checklist, the fastest path is to book a free custom demo. We will walk through each of the seven clauses live, share our standard contract template with every clause highlighted, and give you a TCO calculator you can use against every other vendor in your RFP.

For deeper reading, see our RBI and DPDP compliance checklist, our DPD-bucket playbook for NBFC collections, and our Voice AI vs IVR for Indian Banks: A ₹47 Lakh/Year Decision. For a quick numerical sanity-check on your own workload, plug real numbers into the EMI Collections ROI Calculator.

The bottom line

The ₹3 vs ₹9 argument is a distraction. Every voice AI contract in India has seven clauses that decide what the headline rate actually means in production, and in most cases the cheaper-looking vendor is more expensive by month three. Do not negotiate on per-minute rate alone. Demand Definition A connected-minute billing, unblended inbound/outbound rates, free retries, itemised one-time fees, quarterly true-ups, language-inclusive headlines, bundled platform fees, and — if you can get it — an outcome-linked price cap. The vendors who accept are the ones whose economics actually work. The vendors who refuse are showing you, for free, why their headline rate was too good to be true.

The ₹3 vs ₹9 question is the wrong question

The 7 contract clauses that decide what you actually pay

Clause 1: The connected-minute definition — "connected" does not mean what you think

Read the contract. You will typically find one of four definitions, each more aggressive than the last:

Definition A (rare, transparent): billed time starts when the borrower says their first word and ends when the call is terminated. Silence is not billed beyond a defined threshold.
Definition B (common): billed time starts when the carrier signals answer supervision — meaning the moment the phone is picked up, before anyone has said anything. This typically adds 3–7 seconds per call, which is 5–12% of a 60-second call.
Definition C (common, aggressive): billed time starts when the call is placed, including ring time. This adds 8–15 seconds per call on Indian telecom — you are paying for ringing, even on calls that are never answered.
Definition D (predatory, and we have seen it in real Indian contracts): billed time includes carrier-side voicemail detection, answering-machine navigation, silence up to 30 seconds, and the entire hang-up tail. A 45-second human conversation becomes a 75-second billed minute.

Clause 2: The inbound/outbound differential

Clause 3: Retries, redials, and the answer-machine penalty

Three common contract patterns:

Pattern 1 (rare, transparent): unanswered attempts are free. You pay only for connected calls.
Pattern 2 (common): the first two unanswered attempts per borrower per day are free, subsequent retries are billed at a reduced rate (typically 25–40% of the connected rate).
Pattern 3 (aggressive): every attempt — answered or not — is billed as a full connected minute once it reaches carrier-side ring supervision. This is almost always the case under the Definition C or D connected-minute rule.

Clause 4: Setup, onboarding, and integration fees

Platform setup fees: ₹1–5 lakh, often framed as a "configuration charge" or "tenant provisioning fee."
Voice model fine-tuning: ₹2–8 lakh, framed as "language customisation" or "vertical training."
CRM integration: ₹3–10 lakh per CRM connector, even for standard connectors to Salesforce, Zoho, Kapture, or LeadSquared that the vendor has built once and resells.
Telephony provisioning and DLT template registration: ₹25,000–1 lakh per template, plus ongoing template-change fees.
Go-live support: 4–12 weeks of "launch engineering," billed at professional-services rates that are often ₹15,000–25,000 per day.

Clause 5: Minimum commits, rollover, and credit expiration

Three flavours to watch:

Hard commit, no rollover: unused minutes vanish at month end. Common in smaller vendors. A ₹3/min rate on a 100,000-minute monthly commit, used at 60% in March, is effectively ₹5/min for that month's real traffic.
Credit expiration: you pre-pay for a pool of minutes (or "credits") that expire after 6–12 months. Any vendor quoting a "₹2/min credit bundle" almost always has this clause, and the real cost depends on how many credits you lose at the expiration date.
Committed growth rate: the contract specifies a volume growth curve you must hit, with penalties if you fall below. This is mostly seen in very large enterprise deals, but it has started appearing in mid-market Indian contracts as vendors try to hedge against flat-line customers.

Clause 6: TTS voice premium and language surcharges

Typical surcharges we have seen in Indian contracts in 2026:

Hindi (baseline voice): ₹0.50–₹1.50 per minute over headline.
Hindi (premium voice with prosody): ₹1.50–₹3.00 over headline.
Tamil, Telugu, Marathi, Bengali, Kannada, Gujarati, Punjabi, Malayalam: ₹1.00–₹2.50 over headline each.
Code-switching (Hindi-English, Tamil-English): ₹1.00–₹2.00 additional over the regional premium.
Custom brand voice: ₹5–15 lakh one-time plus ₹1.00–₹2.50 ongoing per minute.

Clause 7: CRM write-back, seat fees, API quotas, and dashboard access

The final category is the "platform fees around the rate" — the things that are not per-minute but show up on the invoice anyway. Watch for:

CRM write-back API calls: some vendors charge per API call for writing call outcomes, transcripts, and borrower-state updates back into your CRM. A 60-second call that produces 8 CRM writes at ₹0.50 per write adds ₹4 to every minute. This is the single most surprising line item on the first invoice because the sales conversation did not mention it.
Dashboard seats: per-user monthly fees for access to the reporting and campaign dashboard. Common quotes: ₹2,000–5,000 per seat per month. An NBFC collections team of 40 users adds ₹80,000–₹2,00,000 per month to the real cost.
API rate limits and burst pricing: some vendors cap your API throughput at a level that forces you to buy "burst capacity" during campaign launches.
Transcript storage and retention: long-term storage of call recordings and transcripts can be billed separately at ₹0.50–₹2.00 per minute per month of retention — which, for a regulatory-required 7-year retention in BFSI, can be 80× the original call cost over the retention period.
Sandbox and staging environments: often billed as additional tenants at a fraction of production pricing, but still meaningful.

Worked example: the ₹3 contract that becomes ₹11.40

Vendor A at a ₹3 headline, with every clause stacked against you:

Clause	Effect	Effective rate
Connected-minute Definition C	+30%	₹3.90
Outbound-heavy mix vs blended	+20%	₹4.68
Retry billing (Pattern 3, 40% answer)	+100% over recorded conversations	₹9.36
Setup + integration amortisation	+₹1.20/min	₹10.56
Hindi + regional language premium	+₹0.80/min weighted	₹11.36
CRM write-back at ₹0.50 × 8 calls/min	~₹0.04/min equivalent	~₹11.40

Effective rate: ₹11.40/minute. Monthly spend on 300,000 connected minutes: ~₹34.2 lakh.

Vendor B at a ₹9 headline with clean clauses:

Clause	Effect	Effective rate
Connected-minute Definition A (transparent)	0%	₹9.00
Unblended outbound rate at ₹9	0%	₹9.00
Retries free up to 3/day	~0%	₹9.00
Setup included in first-year ramp	0%	₹9.00
Hindi included at headline	0%	₹9.00
Bundled platform fee of ₹60k/month	+₹0.20/min on 300k minutes	₹9.20

Effective rate: ₹9.20/minute. Monthly spend: ~₹27.6 lakh.
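
Both tables follow the same pattern: percentage uplifts compound on the running rate, then flat per-minute add-ons are summed on top. The sketch below reproduces the two effective rates from the table rows exactly as stated. The function name and the ("pct", "add") step encoding are illustrative assumptions, not part of any vendor contract.

```python
def effective_rate(headline: float, steps: list[tuple[str, float]]) -> float:
    """Apply each clause in order: 'pct' compounds on the rate, 'add' is a flat Rs/min."""
    rate = headline
    for kind, value in steps:
        if kind == "pct":
            rate *= 1 + value / 100
        elif kind == "add":
            rate += value
        else:
            raise ValueError(f"unknown step kind: {kind}")
    return rate


MONTHLY_MINUTES = 300_000

# Vendor A: rows of the first table, in order.
vendor_a = [("pct", 30), ("pct", 20), ("pct", 100),
            ("add", 1.20), ("add", 0.80), ("add", 0.04)]
# Vendor B: only the bundled platform fee moves the rate.
vendor_b = [("add", 60_000 / MONTHLY_MINUTES)]

rate_a = effective_rate(3.0, vendor_a)
rate_b = effective_rate(9.0, vendor_b)

for name, rate in (("Vendor A", rate_a), ("Vendor B", rate_b)):
    lakh = rate * MONTHLY_MINUTES / 100_000  # 1 lakh = 100,000 rupees
    print(f"{name}: Rs {rate:.2f}/min, about Rs {lakh:.1f} lakh per month")
```

Because the percentage clauses compound, the three uplifts on Vendor A more than triple the headline before any flat fee is added. Swapping in a vendor's actual clause values shows quickly which single clause moves the effective rate most.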

The RFP question block: paste this into your next vendor evaluation

Send this verbatim to every voice AI vendor on your shortlist and demand written answers before the first demo:

Define "connected minute" with the exact Unix-timestamp trigger for start and stop. Is silence billed? Above what threshold?
Are inbound and outbound minutes billed at the same rate? If blended, what mix is the blend based on, and what are the unblended rates?
What is the billing treatment of unanswered dials, voicemail detections, and answer-machine navigation? Show the specific clause.
Itemise every one-time fee — platform setup, model fine-tuning, per-CRM integration, DLT registration, launch engineering days. Include total-cost-of-deployment, not just per-minute.
What is the minimum monthly commit, the rollover policy, and the credit expiration term? Can unused minutes roll over for at least one quarter?
What are the per-language surcharges for Hindi and each regional language, and what is the surcharge for code-switching? Quote against our language mix, not your blend.
Are CRM write-back API calls, dashboard seats, transcript storage, and sandbox access included in the headline rate, or billed separately? Quote each line item.
What is the regulatory retention period for call recordings and transcripts, and what is the storage cost over that period?
Will you accept a total-cost-of-ownership cap in the contract that includes every clause above?
Will you provide a worked example of a real customer's first 90 days of invoicing, with every line item, redacted to anonymise the customer?

The negotiation move that beats the headline-rate framing

Asking for a price per outcome instead of a price per minute aligns the vendor's incentive with yours. They make more money only when you do.
It exposes the real cost structure, because any vendor who cannot commit to an outcome price knows their headline rate is hiding inflators.
It makes comparison across vendors trivial. You are no longer comparing ₹3 vs ₹9 — you are comparing ₹150 vs ₹180 per booked appointment, which is a number your CFO can actually use.
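
As a rough sketch of that comparison, the effective per-minute rate can be converted into a cost per outcome once you know how many outcomes a month of calling produced. The effective rates and minute volume come from the worked example above. The booked-appointment counts are purely hypothetical, chosen so the results land on the ₹150 and ₹180 figures used as an illustration here. Real counts should come from your own pilot logs.

```python
def cost_per_outcome(rate_per_minute: float, connected_minutes: float,
                     outcomes: int) -> float:
    """Total spend divided by the number of outcomes actually achieved."""
    if outcomes <= 0:
        raise ValueError("outcomes must be positive")
    return rate_per_minute * connected_minutes / outcomes


MINUTES = 300_000

# Hypothetical pilot results: (effective Rs/min, booked appointments per month).
vendors = {
    "Vendor A": (11.40, 19_000),
    "Vendor B": (9.20, 18_400),
}

results = {
    name: cost_per_outcome(rate, MINUTES, booked)
    for name, (rate, booked) in vendors.items()
}

for name, cost in results.items():
    print(f"{name}: Rs {cost:.0f} per booked appointment")
```

The point of the exercise is that the ranking depends on outcomes as well as rate, so neither the headline nor the effective per-minute figure is enough on its own.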

Voice AI Pricing in India: The 7 Contract Clauses That Decide Whether ₹3/Minute Is Actually Cheaper Than ₹9/Minute

The ₹3 vs ₹9 question is the wrong question

The 7 contract clauses that decide what you actually pay

Clause 1: The connected-minute definition — "connected" does not mean what you think

Clause 2: The inbound/outbound differential

Clause 3: Retries, redials, and the answer-machine penalty

Clause 4: Setup, onboarding, and integration fees

Clause 5: Minimum commits, rollover, and credit expiration

Clause 6: TTS voice premium and language surcharges

Clause 7: CRM write-back, seat fees, API quotas, and dashboard access

Worked example: the ₹3 contract that becomes ₹11.40

The RFP question block: paste this into your next vendor evaluation

The negotiation move that beats the headline-rate framing

Where Caller Digital fits

The bottom line

Frequently Asked Questions

What is the typical voice AI price per minute in India in 2026?

Why is cheap voice AI often more expensive in practice?

What metric should I actually use to compare voice AI vendors?

How much does voice quality really affect outcomes?

Are Indian voice AI vendors cheaper than global ones?

Should I negotiate per-minute pricing or commit to a minimum?

What hidden costs should I watch for in a voice AI quote?

What does a sensible pilot look like before committing?

Caller Digital
