WIPO Speech-to-text – The Power of Transcription

WIPO Speech-to-Text is an advanced speech recognition service developed by the ATAC team of the World Intellectual Property Organization (WIPO). This cutting-edge technology allows for the conversion of spoken language into written text with high accuracy and speed. WIPO Speech-to-Text easily transcribes speeches, meetings, interviews, and any other spoken content into text format.



Key Features


WIPO Speech-to-Text transcribes content spoken in different languages, using dedicated models for:

  • Arabic
  • Chinese
  • English
  • French
  • Russian
  • Spanish

The tool also offers a multilingual model, which recognizes the different languages when speakers at a meeting switch between them.

Accuracy and Reliability

Powered by state-of-the-art machine learning algorithms, WIPO Speech-to-Text delivers accurate transcriptions. It recognizes and captures spoken content with high accuracy, even in the presence of various accents, languages, and background noise.

Secure and Confidential

At WIPO, data privacy and security are paramount. WIPO Speech-to-Text follows industry best practices to ensure the confidentiality and integrity of the transcribed content. Data remains protected throughout the transcription process.

Customization and Adaptation

WIPO Speech-to-Text can be adapted and customized to specific needs. The speech recognition models can be fine-tuned to improve accuracy for specific speakers, vocabulary, or domain-specific terminology.

Searching & highlighting

A search feature makes it possible to find specific keywords or phrases within the transcript, and keywords are highlighted when pronounced by the speaker.

The tool uses the latest neural machine learning technology and is particularly effective with audio from non-native speakers.
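
As a rough illustration of the keyword search and highlighting described in this section, the following Python sketch looks up a keyword in a timestamped transcript. It is not WIPO's implementation: the `Segment` structure, the `find_keyword` function, the sample sentences, and the use of square brackets to stand in for highlighting are all assumptions made for this example.

```python
import re
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed utterance with its start time in seconds (hypothetical format)."""
    start: float
    text: str


def find_keyword(segments, keyword):
    """Return (start_time, highlighted_text) for every segment containing the keyword."""
    # Case-insensitive match on whole words or whole phrases.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    hits = []
    for seg in segments:
        if pattern.search(seg.text):
            # Wrap each match in brackets as a plain-text stand-in for highlighting.
            highlighted = pattern.sub(lambda m: "[" + m.group(0) + "]", seg.text)
            hits.append((seg.start, highlighted))
    return hits


# Made-up sample transcript, for illustration only.
transcript = [
    Segment(0.0, "Welcome to the meeting on patent law."),
    Segment(6.5, "The committee will discuss trademarks first."),
    Segment(12.0, "Patent applications rose last year."),
]

for start, text in find_keyword(transcript, "patent"):
    print(f"{start:6.1f}s  {text}")
```

Keeping the start time with each segment is what would let a player jump to the moment a keyword is spoken. A production system would more likely store word-level timestamps, which this sketch does not attempt.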


WIPO Speech-to-Text is used at WIPO for most of its meetings, as well as in other intergovernmental organizations, including UNOG, WTO, and the Court of Justice.

WIPO welcomes requests to implement WIPO Speech-to-Text in intergovernmental entities that are dealing with large volumes of audio/video files.