Using the World Gender Name Dictionary

May 30, 2023

Image: Getty Images/damircudic

The World Gender Name Dictionary (WGND) is now available in a 2.0 version that covers more countries and territories and more names of physical persons. Its documentation files and libraries are available online, so users can start applying gender name dictionaries to any dataset in which names are linked to geographical codes.

Recent research has expanded the WGND, which now includes more than 26 million records linking names of physical persons to 195 countries and territories. The WGND 2.0 updates its predecessor, the WGND 1.0, and is the result of compiling more than 50 new sources of gender data and updating the original list of sources.

Where to find the WGND 2.0?

The WGND 2.0 is available online in the IES Gender Open Source Project. It has a dedicated GitHub repository with documentation describing the various sets of unique observations on gender data linked to country and language codes. Supporting documentation for both the WGND 1.0 and 2.0 versions is also accessible in Harvard Dataverse.

How to use the WGND 2.0?

The first step is to prepare a dataset with names of physical persons and country codes. Once this dataset is ready, the next step is to perform four data cleaning checks as follows:

  • Remove family names or surnames from the name records, so that only first or main names remain in the name variable.
  • Convert the remaining name records to lowercase and remove blank spaces before and after the names’ text.
  • Delete double spaces between the words making up each final name record.
  • Ensure that the codes of countries and territories in the dataset are the two-letter alpha-2 codes defined in ISO 3166-1. A full list of alpha-2 codes is available online in the ISO Online Browsing Platform.
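
The four checks above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the Gender-it libraries: the helper names clean_name and is_alpha2_shaped are hypothetical, and the sketch assumes that the family name is the last word of each record, which does not hold for every naming convention.

```python
import re

# Hypothetical helpers for illustration; they are not Gender-it functions.

def clean_name(raw_name: str, surname_tokens: int = 1) -> str:
    """Return a lowercase first name with normalized spacing."""
    # Collapse runs of whitespace into single spaces, trim both ends, lowercase.
    words = re.sub(r"\s+", " ", raw_name).strip().lower().split(" ")
    # Assumption: the family name is the last `surname_tokens` words.
    if surname_tokens > 0 and len(words) > surname_tokens:
        words = words[:-surname_tokens]
    return " ".join(words)

def is_alpha2_shaped(code: str) -> bool:
    """Check the shape only (two ASCII letters), not ISO 3166-1 membership."""
    return re.fullmatch(r"[A-Za-z]{2}", code) is not None

# Made-up example records: (full name, country code).
records = [("  Maria  Elena   Lopez ", "es"), ("John Smith", "USA")]

# ISO 3166-1 writes alpha-2 codes in uppercase, so codes are uppercased here.
cleaned = [
    (clean_name(name), code.strip().upper(), is_alpha2_shaped(code.strip()))
    for name, code in records
]
```

Here the second record is flagged because "USA" is an alpha-3 code rather than an alpha-2 code. A shape check like this does not confirm that a code is actually assigned, so the codes should still be compared against the official ISO 3166-1 list.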

Then, users can access the Gender-it tool on GitHub to apply the WGND 2.0 libraries to the resulting cleaned dataset. Gender-it contains WGND 2.0 libraries that can be used from Stata or Python (see the video tutorial below). The libraries for both software options include detailed instructions and examples for downloading the necessary documentation files, functions and packages, and for matching the user’s cleaned dataset against the gender name dictionaries available in WGND 2.0. Stata users can get started by running the tutorial_genderit.do file, while Python users can do so by running the introduction to gender-it.ipynb file.
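
Conceptually, the matching step is a lookup on the pair of cleaned name and country code. The sketch below shows that idea with a toy in-memory dictionary. It does not use the Gender-it API: the function lookup_gender, the gender labels and all dictionary entries are hypothetical and are not taken from WGND 2.0.

```python
from typing import Optional

# Toy stand-in for a gender name dictionary: (name, country code) -> label.
# Illustrative entries only; they are not taken from WGND 2.0.
toy_dictionary = {
    ("maria", "ES"): "F",
    ("john", "US"): "M",
    ("andrea", "IT"): "M",
    ("andrea", "DE"): "F",
}

def lookup_gender(name: str, code: str) -> Optional[str]:
    """Return the label for a (name, code) pair, or None if there is no match."""
    return toy_dictionary.get((name, code))

# Made-up records that have already gone through the cleaning checks.
cleaned_records = [("maria", "ES"), ("andrea", "IT"), ("andrea", "DE"), ("xin", "CN")]

matched = [(name, code, lookup_gender(name, code)) for name, code in cleaned_records]

# Share of records the toy dictionary could not attribute.
unmatched_share = sum(label is None for _, _, label in matched) / len(matched)
```

Keying on both name and country code is the reason the dataset needs geographical codes: in this toy example the same name receives a different label depending on the country.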

Video tutorial

The video tutorial is divided into four sections that guide users through the WGND 2.0 libraries. Move the video slider to the minute marks below to find the desired section.

  • Introduction - 00:00 min
  • Stata - 06:48 min
  • Python - 24:36 min
  • Tips and tricks - 32:40 min
Technical Workshop on Using WIPO’s World Gender Name Dictionary

Related resources

Expanding the World Gender-Name Dictionary: WGND 2.0

Guidelines for producing gender analysis from innovation and IP data

Disclaimer: The short posts and articles included in the Innovation Economics Themes Series typically report on research in progress and are circulated in a timely manner for discussion and comment. The views expressed in them are those of the authors and do not necessarily reflect those of WIPO or its Member States.
