Innovation Gender Gap: Using the World Gender Name Dictionary

Learn how to find and apply WIPO’s World Gender Name Dictionary. Disambiguate gender in any dataset of names by following a few simple steps.

Estimated reading time: 5 minutes
(Photo: damircudic/Getty images)

The World Gender Name Dictionary (WGND) is now available in a 2.0 version that covers more countries and territories and more names of physical persons. Its documentation files and libraries are available online, so users can start applying gender name dictionaries to any dataset in which names are linked to geographical codes.

Recent research has expanded the WGND, which now includes more than 26 million records linking names of physical persons to 195 different countries and territories. The WGND 2.0 updates its predecessor, the WGND 1.0, and is the result of compiling more than 50 new sources of gender data and updating the original list of sources.

Where to find the WGND 2.0?

The WGND 2.0 is available online in the IES Gender Open Source Project. It has a dedicated GitHub repository with documentation describing the different sets of unique observations on gender data linked to country and language codes. Supporting documentation for both the WGND 1.0 and 2.0 versions is also accessible in Harvard Dataverse.

How to use the WGND 2.0?

The first step is to prepare a dataset with names of physical persons and country codes. Once this dataset is ready, the next step is to perform four data cleaning checks as follows:

  • Remove family names or surnames from the name records, so that only first or main names remain in the name variable.
  • Convert the remaining name records to lowercase and remove blank spaces before and after each name.
  • Delete double spaces between the words that make up each final name record.
  • Ensure that the codes of countries and territories in the dataset are the two-letter (alpha-2) codes defined in ISO 3166-1. A full list of the ISO two-letter codes is available online in the ISO Online Browsing Platform.
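The four checks above can be sketched in a few lines of Python. This is only an illustration and is not taken from the Gender-it tool: the function names are hypothetical, the surname is assumed to be known and passed in separately, and the country check only verifies that a code has the two-letter shape, not that it is actually assigned in ISO 3166-1. Uppercase country codes are also an assumption, so check the convention used by the dictionary you match against.

```python
import re

# Hypothetical helpers for illustration; not part of the Gender-it tool.
TWO_LETTERS = re.compile(r"^[A-Z]{2}$")


def clean_first_name(full_name: str, surname: str = "") -> str:
    """Apply the first three cleaning checks to a single name record."""
    name = full_name
    # Check 1: drop the surname (assumption: the caller supplies it).
    if surname:
        name = re.sub(rf"\b{re.escape(surname)}\b", " ", name, flags=re.IGNORECASE)
    # Check 2: lowercase and remove leading and trailing blanks.
    name = name.lower().strip()
    # Check 3: collapse double (or longer) runs of spaces into one space.
    return re.sub(r"\s+", " ", name)


def normalize_country_code(code: str) -> str:
    """Check 4: return an uppercase two-letter code or raise ValueError.

    Only the shape is checked; whether the code is assigned in ISO 3166-1
    still has to be verified against the official list.
    """
    candidate = code.strip().upper()
    if not TWO_LETTERS.match(candidate):
        raise ValueError(f"not a two-letter country code: {code!r}")
    return candidate


if __name__ == "__main__":
    print(clean_first_name("  Maria  Jose GARCIA ", "Garcia"))
    print(normalize_country_code(" fr "))
```

Keeping the name cleaning and the country-code check in separate functions makes it easy to apply each one to its own column of a dataset.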

Then, users can access the Gender-it tool on GitHub to apply the WGND 2.0 libraries to the cleaned dataset. Gender-it contains WGND 2.0 libraries that can be used from Stata or Python (see the video tutorial below). The libraries for each of these software options contain detailed instructions and examples for downloading the documentation files, functions and packages needed to match the user’s cleaned dataset with the different gender name dictionaries available in the WGND 2.0. Stata users can get started by running the tutorial_genderit.do file, while Python users can do so by running the introduction to gender-it.ipynb file.
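To show what matching a cleaned dataset against a name-country dictionary involves, here is a minimal sketch in plain Python. It does not use or reproduce the Gender-it API: the function name, the toy dictionary entries and the "F"/"M" labels are all assumptions made for illustration, and the actual libraries and file formats are described in the Gender-it tutorials.

```python
from typing import Dict, List, Optional, Tuple

# Toy stand-in for a name-country-gender dictionary.
# Illustrative entries only; these are not records taken from the WGND.
TOY_DICTIONARY: Dict[Tuple[str, str], str] = {
    ("maria", "ES"): "F",
    ("john", "US"): "M",
    ("andrea", "IT"): "M",
    ("andrea", "DE"): "F",
}


def attribute_gender(
    records: List[Tuple[str, str]],
    dictionary: Dict[Tuple[str, str], str],
) -> List[Tuple[str, str, Optional[str]]]:
    """Look up each cleaned (name, country code) pair in the dictionary.

    Pairs that are not found get None, so unmatched records stay
    visible instead of being silently dropped.
    """
    matched = []
    for name, country in records:
        matched.append((name, country, dictionary.get((name, country))))
    return matched


if __name__ == "__main__":
    cleaned = [("andrea", "IT"), ("andrea", "DE"), ("kim", "FR")]
    for row in attribute_gender(cleaned, TOY_DICTIONARY):
        print(row)
```

The lookup key combines the name with the country code because the same given name can be attributed to different genders in different countries, which is why the dataset needs clean country codes before matching.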

Video tutorial

The video tutorial is divided into four sections that guide users through the WGND 2.0 libraries. Skip to the minute marks below to find the desired section.

  • Introduction: 00:00 min
  • Stata: 06:48 min
  • Python: 24:36 min
  • Tips and tricks: 32:40 min

Technical Workshop on Using WIPO’s World Gender Name Dictionary

Related resources

Expanding the World Gender-Name Dictionary: WGND 2.0

This paper revisits the first World Gender Name Dictionary (WGND 1.0), which makes it possible to disambiguate gender in data naming physical persons (Lax Martínez et al., 2016). We discuss its advantages and limitations and propose an expansion based on updated data and additional sources. By including more than 26 million records linking given names and 195 different countries and territories, the resulting WGND 2.0 substantially increases the international coverage of its predecessor. As a result, it is particularly suited to intellectual property unit-record data naming inventors, designers, individual applicants and other creators disclosed in these data.

Guidelines for producing gender analysis from innovation and IP data

Understanding how women and men can access and use the intellectual property (IP) system equally is key to ensuring that their ingenuity and creativity translate into economic, social and cultural development. This short guide summarizes best practices for producing innovation and IP gender indicators.