Explainer

AI transcription and data sovereignty: where does your voice go during inference?

EU hosting, encryption, contractual clauses: these guarantees are reassuring, but they say nothing about the moment a language model reads your data in clear text. This explainer breaks down a blind spot that recent news has brought back into focus.


The trigger: a public debate on AI in healthcare

In early June 2026, Le Canard enchaîné and then the specialised outlet Next looked into the consultation assistant of a major French e-health player, Doctolib, and the role American language models play in it.

According to Le Canard enchaîné (2 June 2026), American providers are reportedly among the subprocessors used to run this consultation assistant.
Asked by Next (4 June 2026), the company firmly denies that consultation notes are used to train these third parties’ models, and states that they act on its instructions only, within a strict contractual framework that forbids them from exploiting the data on their own account.
The company specifies that patients’ health data is hosted in France and Germany, and encrypted at rest and in transit.

We report these elements as published by the press; the company concerned disputes Le Canard enchaîné’s claims. This explainer targets no specific player: it focuses on the technical principle the episode brings to light, which applies to any AI transcription software.

Source: Next, “Doctolib réfute livrer les infos de ses utilisateurs aux grands acteurs de l’IA”, 4 June 2026.

The dividing line: contractual guarantees or technical guarantees

Whatever the vendor, the security of AI transcription comes down to a point commercial pages rarely address: what happens to your data the moment the model processes it.

01

Inference happens in clear text

A language model cannot summarise or transcribe encrypted data. At processing time, the audio or text is necessarily decrypted and presented to the model in clear. Encryption protects your data at rest and in transit, not during the computation itself.
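To make the point concrete, here is a minimal Python sketch. It assumes a one-time pad as a stand-in for whatever cipher actually protects the data at rest or in transit, and a hypothetical `toy_summarise` function standing in for a language model. Neither is taken from any vendor's implementation.

```python
import secrets


def encrypt(plaintext: bytes) -> tuple[bytes, bytes]:
    """One-time pad: a stand-in for the real cipher used at rest or in transit."""
    key = secrets.token_bytes(len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
    return ciphertext, key


def decrypt(ciphertext: bytes, key: bytes) -> bytes:
    """Reverse the one-time pad with the same key."""
    return bytes(c ^ k for c, k in zip(ciphertext, key))


def toy_summarise(text: str) -> str:
    """Hypothetical stand-in for a model: it only works on readable text."""
    return text.split(".")[0].strip() + "."


note = "Patient reports chest pain. Follow-up in two weeks."
stored, key = encrypt(note.encode("utf-8"))  # protected at rest

# Whoever runs the model must hold the key and decrypt before inference.
clear_text = decrypt(stored, key).decode("utf-8")
summary = toy_summarise(clear_text)
print(summary)  # Patient reports chest pain.
```

The protected bytes in `stored` are useless to the model. The machine that runs inference is therefore the one that sees the text in clear, which is why the location and operator of that machine matter.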

02

“Hosted in Europe” does not mean beyond extraterritorial reach

Locating servers in Europe meets a data-residency requirement. It does not, on its own, neutralise exposure to US law (the CLOUD Act) when a link in the chain is operated by a company subject to it. Residency and jurisdiction are two distinct questions.
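The distinction can be sketched as two separate checks over a processing chain. The example below is illustrative only: the `Link` type, the chain and the country codes are hypothetical, and reducing jurisdiction to a single code per operator is a deliberate simplification of a legal question.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    name: str
    server_location: str        # where the machines sit (residency)
    operator_jurisdiction: str  # law the operating company answers to


def residency_ok(chain, allowed_locations):
    """True if every link runs on servers in an allowed location."""
    return all(link.server_location in allowed_locations for link in chain)


def exposed_links(chain, jurisdiction):
    """Names of links whose operator is subject to the given jurisdiction."""
    return [link.name for link in chain
            if link.operator_jurisdiction == jurisdiction]


# Hypothetical chain: every server is in the EU, one operator is not.
chain = [
    Link("storage", "FR", "FR"),
    Link("transcription", "DE", "FR"),
    Link("summarisation-llm", "DE", "US"),
]

print(residency_ok(chain, {"FR", "DE"}))  # True
print(exposed_links(chain, "US"))         # ['summarisation-llm']
```

The residency check passes while the jurisdiction check still flags a link: the two questions have to be asked separately for every link in the chain.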

03

A contract governs the use; it does not remove the transit

A processing agreement defines what a provider is allowed to do with your data. It does not change the technical fact: if your data passes through a third party’s infrastructure for inference, that transit happens. Protection then rests on contract compliance, not on architecture.

04

The only architectural answer: do not let the data leave

The only way to remove the question entirely is for inference to run on infrastructure you control, with no call to a third-party model. There is then no moment at which the text leaves your perimeter.
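As an illustration, one way to enforce this at application level is to refuse any inference call whose destination is outside a list of hosts you control. This is a sketch under stated assumptions: the host names, the `CONTROLLED_HOSTS` set and the `check_inference_endpoint` helper are hypothetical, and such a check complements network-level controls rather than replacing them.

```python
from urllib.parse import urlsplit

# Hypothetical hosts inside the perimeter you control.
CONTROLLED_HOSTS = {"localhost", "127.0.0.1", "inference.internal.example"}


class EgressBlocked(Exception):
    """Raised when content would be sent outside the controlled perimeter."""


def check_inference_endpoint(url: str) -> str:
    """Return the URL unchanged if its host is controlled, else raise."""
    host = urlsplit(url).hostname
    if host not in CONTROLLED_HOSTS:
        raise EgressBlocked(f"refusing to send content to {host!r}")
    return url


# A local model server passes the check.
check_inference_endpoint("http://localhost:8000/v1/transcribe")

# A third-party API is rejected before any content is sent.
try:
    check_inference_endpoint("https://api.thirdparty.example/v1/chat")
except EgressBlocked as err:
    print(err)
```

The design choice is to fail closed: an unknown host is treated as outside the perimeter, so adding a new destination requires an explicit decision.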

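Frameworks such as the EU-US Data Privacy Framework govern data transfers to the United States. Their legal robustness is, however, repeatedly challenged: its predecessors, Safe Harbor and Privacy Shield, were both invalidated by the Court of Justice of the European Union. Relying on contractual compliance alone means depending on a legal balance that can shift.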

Why this concerns every regulated sector

The healthcare example is telling because the data there is among the most sensitive. But the same reasoning applies to any organisation whose meetings touch on confidential information.

Banking & Finance · Legal · Insurance · Public Sector

A credit committee, a lawyer’s consultation, an insurance claim assessment or a municipal deliberation raises the same question as a medical consultation: who, technically, has access to the content while it is being processed?

Recapro’s approach

Recapro is built so the question never arises.

100% local inference, in Cloud and On-Premise alike

AI inference (transcription and summarisation) is performed by open-weight models running on our own servers. No meeting excerpt is ever sent to a third-party language model (OpenAI, Anthropic, Google), whatever the deployment mode.

Sovereign French Cloud or On-Premise

The Cloud offering is hosted on sovereign French infrastructure (France Nuage), with no link in the processing chain subject to US law. The On-Premise offering deploys the entire processing on your own hardware.

Air-gap mode available

In On-Premise deployments, Recapro can run fully offline, with no outbound connection. Outside air-gap mode, only usage and licensing telemetry is sent back; the content of your meetings never leaves the infrastructure.

For details on the deployment models, see the Cloud vs On-Premise analysis.

Frequently asked questions

01

Is “hosted in Europe” enough to protect data from the CLOUD Act?

Server location meets the data-residency obligation, but on its own it does not neutralise exposure to US law when a provider in the chain is subject to it. Residency and jurisdiction are two distinct questions.

02

Does encryption prevent an AI vendor from accessing the content?

At rest and in transit, yes. But to transcribe or summarise, a model must process the text in clear: encryption does not cover the moment of inference.

03

What is the difference between inference and training?

Training improves a model from data; inference is the use of the model to produce a result. Data can pass through a third party for inference without being used for training. The transit, however, still takes place.

04

How do you remove the transfer risk entirely?

By running inference on controlled infrastructure, with no call to a third-party model. That is the principle of local processing and On-Premise: the data does not leave your perimeter.

05

Does Recapro send data to an American language model?

No. Inference (transcription and summarisation) is performed by open-weight models running on our own servers, in the sovereign French Cloud and On-Premise alike. No meeting excerpt is transmitted to a third-party model; only usage and licensing telemetry may be sent back, and the meeting content never leaves the infrastructure.