WIPO Speech to Text

WIPO Speech to Text is a transcription tool that automatically converts audio and video content into text using artificial intelligence. It was designed specifically for international meetings and conferences. Once trained in a specific subject area, it outperforms general-purpose transcription tools.

The software was originally created and deployed to help transcribe official WIPO meetings, and it can be customized for other organizations worldwide.

Contact us to find out more
(Photo: Jaouad.K/Getty Images)

In practice: Find out how the Speech to Text tool is used at WIPO

What are the benefits of WIPO Speech to Text?

Performance

The tool can transcribe one hour of video/audio in five minutes when supported by the appropriate IT infrastructure.
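The stated throughput works out to roughly twelve times faster than real time. The sketch below only illustrates that arithmetic; it assumes processing time scales linearly with recording length, which the figures above do not guarantee, and the function name is hypothetical rather than part of the tool.

```python
# Illustrative arithmetic only; not part of WIPO Speech to Text.
# Assumption: processing time scales linearly with media duration.

STATED_MEDIA_MINUTES = 60      # one hour of audio/video, as stated above
STATED_PROCESSING_MINUTES = 5  # stated processing time on suitable infrastructure


def estimate_processing_minutes(media_minutes: float) -> float:
    """Estimate transcription time for a recording of the given length."""
    speedup = STATED_MEDIA_MINUTES / STATED_PROCESSING_MINUTES  # 12x real time
    return media_minutes / speedup


if __name__ == "__main__":
    # A three-hour meeting recording (180 minutes)
    print(estimate_processing_minutes(180))
```

Under this assumption, a three-hour meeting recording would take about 15 minutes to transcribe. Actual times depend on the IT infrastructure in use.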

Accuracy

The tool uses the latest neural machine learning technology and is particularly effective at transcribing audio from non-native speakers.

Security

WIPO Speech to Text can be installed on-site to guarantee security and confidentiality.

What do I need to use it?

  • GPU-based server infrastructure if you wish to run the tool independently. If not, cloud-based infrastructure will work.
  • For best results, a large collection of domain-specific data in the relevant language can be used to customize the tool.

Which languages does it work with?

WIPO Speech to Text is currently available only in English. We are working on extending the tool to the other five official United Nations languages (Arabic, Chinese, French, Russian and Spanish).

How can I get it for my organization?

WIPO Speech to Text is available via standard licensing agreements, and WIPO's team of experts can help you install and set up the tool.

Send us an email to find out more.

Resources

Our user guide gives you a quick walkthrough of how to use the WIPO Speech to Text tool.

Who uses it?

WIPO Speech to Text currently powers transcription for various international organizations.