WIPO Speech to Text is a powerful transcription tool which automatically converts audio and video content into text using artificial intelligence. It was specifically designed for international meetings and conferences.
The software was originally created and deployed to help transcribe official WIPO meetings and can be customized for other organizations worldwide.
Our user guide gives you a quick walkthrough of how to use our WIPO Speech to Text tool.
Who uses it?
The WIPO Speech to Text tool is used by organizations hosting international meetings to transcribe their proceedings accurately and efficiently.
UNOG’s project FAST has been looking for an automatic speech recognition solution that would be accurate for the UN context with its multilingual voices, diverse accents and specialized lingo, and has found the perfect match in WIPO’s Speech to Text tool.
In October 2019, it started running the customized WIPO Speech to Text for public calendar meetings, servicing its iconic clients, such as OHCHR, ODA and UNCTAD, and received overwhelmingly positive feedback from secretariats and delegates alike.
We are looking forward to further collaborating with WIPO, including for other official working languages.
Sofia Lobanova Zick, FAST Project Manager, United Nations Office at Geneva
CERN's Digital Memory project has digitized its massive back catalogue of audiovisual material. Faced with a lack of accurate text descriptions for much of the content, the project turned to audio-to-text transcription technologies that could provide full-text indexes to the entire collection. Combined with initiatives from CERN's Translation Service and Diversity Office, also keen to provide greater accessibility to historical content, CERN chose to collaborate with WIPO during the summer of 2019 to carry out an audiovisual transcription pilot project. The WIPO Speech-to-Text software, which was designed for conferences involving foreign English speakers with a wide variety of accents, was used to process a sample of historic recordings of High-Energy Physics seminars.
Despite the originality and complexity of the vocabulary, very encouraging results have been achieved, providing a strong incentive to set up a generalized transcription service open to the entire CERN audio-visual community, to process both the old data and the on-going events and content.
Jean-Yves Le Meur, Head of Digital Memory Project, CERN