WIPO Sequence User Manual

This document provides users with instructions on how to perform basic operations with the WIPO Sequence desktop application. Typically, the user is a patent applicant, or their representative, seeking to submit a patent application which includes a sequence listing.

This user manual is for version 3.0.0.

1. Summary of functionalities

This table summarizes all the functionalities implemented by the tool in its current version, with links to their relevant section:

Add an invention title and its corresponding language code to a project
Add application information (either current or prior application) to a project
Add feature information to a sequence
Add qualifier information to a feature
Add sequence listing general information data to a project
Change the order in which the sequences will be listed in the generated sequence listing
Create and insert a sequence in another position in the listing
Create a translation qualifier for a selected CDS feature and its associated translated sequence
Delete a sequence
Export all data stored in a project so that they can be later imported into the same or a different instance of the system
Export generated sequence listing in human readable format (.html & .txt)
Generate sequence listing
Import all data stored in a project file
Import data from a FASTA file into an existing project
Import a multi-sequence file into a project
Print data from the project or generated ST.26 sequence listing
Translate a nucleic acid sequence according to a specified genetic code table number
Verify an ST.26 sequence listing file and list the issues as a verification report containing warning and error messages
Verify the data stored in a project and list the issues as a verification report containing warning and error messages
Bulk editing of sequence annotation including qualifier mol_type
Bulk editing and/or adding of a range of sequences' features
Bulk deletion of a range of sequences
Bulk skip of a range of sequences

2. Tool functionalities

2.1 Project Home View

This section details the different options accessible under Project Home.

A project is the object structure that the tool uses to store data necessary to generate a sequence listing. The tool uses data stored in the project, once these data have been validated as compliant with WIPO Standard ST.26, as the values within the generated sequence listing.

In this View, the list of the created projects is displayed, giving you the option to sort them or to use the search function to filter by project name, applicant file reference, applicant name, invention title, status or creation date.

Note: The tool displays a maximum of 1,000 projects. If a project is not displayed in the Project Home View, you should use the search function to identify the project by its name as it will still be stored locally but just not visible in this View.


Create project

To create a new project, you must begin from the main Project Home View shown below.

Click on the new project button on project home page

1) Click on the “NEW PROJECT” link at the top of the View. As shown, the tool will request a Name (mandatory) and a Description (optional).

Enter a project name and optionally a description

2) When a value is entered in the name field, the “Save” button will be enabled for you to save the new project. The list of projects will now include this new project in the Project Home View.

Import project

This functionality enables import of a previously exported project into the tool. To import a project file, you must begin from the Project Home View.

Click on the “IMPORT PROJECT” link at the top of the View as indicated then follow the steps in the video below:

Video which shows importing of an existing project into a new project



If “Select Range Sequences” remains unchecked, all the sequences will be imported. If you wish to select which sequences to import into the project, check the “Select Range Sequences” checkbox and enter the ID numbers of the desired sequences in the appropriate field. A single sequence can be entered, as well as a list of sequences separated by commas or a range of sequences in the form x-y. By default, the total number of sequences of the imported project will be displayed as a range, i.e., 1–total sequences.

Example: “1, 3, 7, 13-20, 30-50”
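The range syntax above can be illustrated with a short Python sketch. It is not the tool's own parser; the function name `parse_sequence_ranges` and its handling of duplicates and invalid input are assumptions made for illustration only.

```python
def parse_sequence_ranges(spec: str) -> list[int]:
    """Expand a spec such as "1, 3, 7, 13-20" into a sorted list of unique IDs."""
    ids: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas (assumption)
        if "-" in part:
            # A range in the form x-y, inclusive at both ends.
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"range start exceeds end: {part!r}")
            ids.update(range(start, end + 1))
        else:
            ids.add(int(part))
    if any(i < 1 for i in ids):
        raise ValueError("sequence ID numbers start at 1")
    return sorted(ids)


selected = parse_sequence_ranges("1, 3, 7, 13-20, 30-50")
print(len(selected), selected[:5])
```

For the example string, the sketch expands the two ranges and returns 32 sequence ID numbers in ascending order.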

If the project is successfully imported, the following blue banner and message will appear at the top of the View.

Note: You need to verify that the imported zip project corresponds to the current database. If the project was exported in a version earlier than 3.0.0, it will not function properly; the imported project must use the same database. This is due to the new database implementation from version 3.0.0 onwards.

Import sequence listing

From the Project Home View, you can import exclusively the sequence information from an ST.26 or ST.25 compliant sequence listing. The file formats for each are *.xml for ST.26 format and *.txt for ST.25 files. For further details on the steps involved, watch the video below.

Video showing the import of a sequence listing to create a project

Note: When importing a Sequence Listing, the Features and Qualifiers are case sensitive and should comply with the values provided in Annex I of WIPO ST.26.

It is also important to note that ST.25-compliant sequence listings must be valid; otherwise, the functionality of WIPO Sequence cannot be guaranteed during import.

The Import Report Table is shown only when an error occurs when importing a file and it displays the following columns:

  • Type of note: “INDIVIDUAL” for a message related to a specific sequence or “GLOBAL” for one or more sequences generally
  • Data element code: from the source file, for ST.25 sequence listings
  • Message text: detailed message with information on the identified issue in question and the changes made to rectify it (if any)
  • Detected sequence: sequence number of the imported sequence related to the message (when the type is “INDIVIDUAL”; otherwise, this field is blank).

Import report generated after successful import

If the file format was ST.25, then the Import Report View will include an Import Report first, as well as the Changed Data report. The Changed data report displays any data that have undergone a transformation or change during the importing process. The following data are presented in a summary table:

  • Origin Tag: data element code for the element type, when importing WIPO ST.25 sequence listings
  • Origin Element Name: corresponding name for the element type
  • Origin Element Value: corresponding value of the original element in the source file
  • Target Element Name: equivalent ST.26 element name where the information is going to be stored in the project
  • Target Element Value: value that will be set for the Target Element Name in the project
  • Transformation: description of the changes/transformations made to the element
  • Sequence ID Number: sequence ID number of the relevant sequence of the transformed element in the project

Example of changed data report

At this point, you can return to the Project Home View or print a report of these changes in PDF format. For instructions on how to download/print the PDF file, refer to the Display sequence listing section.

Conversely, the import process can fail if there are errors in the sequence listing file. In this case, after attempting to import, you will be notified with a red banner indicating that an error has occurred during import.

Red banner indicating import failed

Note: The tool performs best up to a threshold limit of 100 K sequences. When dealing with large sequence listings, you can use the following workaround: split the import process into a series of steps by choosing a specific range of the sequences to import, and then import these sequences into a project range by range. For example, a sequence listing of ~100 K sequences can be split into a series of 10 x 10 K sequences and these can be imported one by one. The first 10 K would be used in the creation of the project.
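As a rough illustration of the workaround, the following Python sketch computes the range strings to enter for each import step. The function name `batch_ranges` and the default batch size of 10,000 are assumptions taken from the example above, not settings of the tool.

```python
def batch_ranges(total: int, batch_size: int = 10_000) -> list[str]:
    """Return range strings in the form x-y that together cover 1..total."""
    if total < 1 or batch_size < 1:
        raise ValueError("total and batch_size must be positive")
    return [
        f"{start}-{min(start + batch_size - 1, total)}"
        for start in range(1, total + 1, batch_size)
    ]


for text in batch_ranges(100_000):
    print(text)
```

Each printed range can then be entered in the sequence range field, one import at a time.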


Validate sequence listing

You can validate an ST.26 sequence listing file by clicking on the “VALIDATE SEQUENCE LISTING” button at the top right of the Projects View.

Validate sequence listing by clicking on the button at the top right-hand side

If the sequence listing passes validation, a banner will appear as shown.

Blue banner indicates sequence listing is valid

If the sequence listing fails the validation, a verification report will be opened in your browser with the validation errors listed in a table as shown in the example. The location of the HTML file will be displayed alongside the XML verification report in case you wish to copy the files to a different location.

When using an IE browser, in order to allow the format to load correctly you must allow an internal script to be run on your computer. Otherwise, the sequences will not be displayed in the standard format and will be less readable.

Please note that for validating a sequence listing, the ST.26 file should comply with the following requirements:

  • Must be encoded in UTF-8 and must contain valid characters according to XML 1.0 specifications

  • Must contain a DOCTYPE line as follows:

<!DOCTYPE ST26SequenceListing PUBLIC "-//WIPO//DTD Sequence Listing 1.3//EN" "ST26SequenceListing_V1_3.dtd">

The file must comply with DTD file ST26SequenceListing_V1_3.dtd.
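Before submitting a file for validation, the first two requirements can be checked with a small script. The sketch below is only a pre-check under stated assumptions: the function name `precheck_st26` is hypothetical, it does not validate the file against the DTD, and it does not test for characters that are invalid in XML 1.0.

```python
import re

EXPECTED_DTD = "ST26SequenceListing_V1_3.dtd"

# Matches the DOCTYPE declaration, tolerating variable whitespace between tokens.
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+ST26SequenceListing\s+PUBLIC\s+"([^"]*)"\s+"([^"]*)"\s*>'
)


def precheck_st26(data: bytes) -> list[str]:
    """Return a list of problems found; an empty list means the pre-check passed."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return [f"not valid UTF-8: {exc}"]
    problems = []
    match = DOCTYPE_RE.search(text)
    if match is None:
        problems.append("DOCTYPE declaration not found")
    elif match.group(2) != EXPECTED_DTD:
        problems.append(f"unexpected DTD file: {match.group(2)}")
    return problems


sample = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<!DOCTYPE ST26SequenceListing PUBLIC '
    b'"-//WIPO//DTD Sequence Listing 1.3//EN" "ST26SequenceListing_V1_3.dtd">\n'
    b"<ST26SequenceListing/>"
)
print(precheck_st26(sample))
```

A file that passes this pre-check can still fail validation in the tool, which remains the authoritative check.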

Delete project

To delete a project, you must begin from the Project Home View.

Click on the trash can icon to delete

Click on the button with the trash can icon on the row within the Project Home table that you wish to delete.

In the pop-up click “Delete” to confirm that you want to delete the selected project.

Want to delete a project



2.2 Project Details View

The Project Details View contains all the information specific to a single patent application or sequence listing. It is broken into two sections: general information and sequence data. At the top there is a table containing the basic information about the project, including the following:

  • Name of the project
  • Date and time of creation of the project
  • Date and time of last updates made to the project
  • Project status (possible values: “new”/“modified”/“generated”/“invalid”/“valid”/“warnings”) - this is not an editable field!
  • Project description – optional
  • Name of the imported file (in the event that the project was imported)
  • Original free text language code
  • Number of sequences (labeled: “Sequences”)
  • A checkbox for invoking the automatic addition of a translation qualifier when a CDS feature is created (a project-level function)
  • Non-English free text language.

Look at the project metadata at the top

There are two levels of menus, the first related to the data in the Project Details View (shown in yellow) and the second a navigation menu to other Views related to the project (shown in blue). You can exit from the project by clicking on “Return to project home”.

For the navigation menu, there are seven different Views which are accessible from the Project Details page:

  1. Project Details View (current), shown with the name of the project
  2. Verification Report View, where the verification report can be accessed
  3. Language-Dependent Qualifiers View, where the language dependent free text qualifiers can be accessed and exported/imported
  4. Import Report View, where the import report can be accessed
  5. Display Sequence Listing View, where human-readable formats of the generated ST.26 sequence listings can be accessed
  6. Help Menu, which includes references to the user manual and WIPO Sequence and ST.26 Knowledge Base
  7. Preferences View, which is relevant to all projects in this instance of WIPO Sequence.

To print a project, you must enter the Project Details View of the desired project and click on the “Print” button at the top of the View.

Next, you will be shown two checkboxes to clarify what information you want to print from the project: General Information and/or Sequence Information.

Look at the project metadata at the top

If “Print Sequences” is selected, you will have the choice to specify which sequences are to be printed by specifying the range of ID numbers within the “Sequence IDs” field, or to simply print all if this field is left blank.

By default, the total number of sequences of the project will be displayed as a range.

Export project

A project can be exported to a .zip file, either to back up the project data or to import it on another desktop computer with WIPO Sequence installed. Simply click on the “Export” button and select a location to save the .zip file. If the export was a success, you will see the following blue banner:

Shows blue banner indicating project has been successfully exported

Import another project into the current one

You can copy information from other projects stored in the tool, into the currently open project. This imported information can be either for the “General Information” Section, “Sequences” Section, or both. Imported General Information will replace the currently existing General Information in the project, while imported Sequences will be appended to the current list of sequences within the project.

Shows the import of another project data into the current project

You must first select the project from which you wish to import information, from the drop-down shown. You can select whether you wish to include parts of the details provided in the General Information Section of the project and also if you wish to import sequences by providing a range of sequence ID numbers to specify which of the sequences are to be imported into the project. By default, the total number of sequences of the project will be displayed as a range.

If the General Information checkbox is checked, a table will appear displaying the entire General Information Section of both projects: the currently selected (destination) project, and the imported project (alternative). You must then select which of the General Information elements are to be replaced by the corresponding imported project’s General Information.

Finally, when you have decided on which General Information elements and sequences are to be imported into the project, you must then click on the blue “Import Project” button. A blue banner appears if the elements have been imported correctly.

Validate project

Before the sequence listing is generated as an ST.26-compliant XML file, the project undergoes a validation check. This step is always conducted prior to generating the sequence listing but can also be performed on its own.

To validate a project, you must click on the “Validate” button at the top of the Project Details View.

Indicates the validate project button

Once the validation has finished, you will be brought to the “Verification Report” View, which displays any verification errors/warnings that may be generated. If the validation was concluded successfully, a blue banner will be shown.

If the validation process finds any errors or warnings, a Verification Report will be generated with a Table detailing the detected verification rules and guidelines that have been broken. Each row identifies whether this is an error, which must be addressed, or a warning, which can be ignored.

Example verification report

Generate sequence listing

The final action that can be performed on a project, and perhaps the most important, is to generate the sequence listing. To generate the sequence listing, you must click on the blue “Generate Sequence Listing” button, at the top of the Project Details View. This will automatically trigger the validation process to be run on the project first.

If the project passes the validation process, a dialog box will open for you to select where to save the generated ST.26-compliant sequence listing (.xml).

Generate sequence listing protocol

If the project fails validation, the Verification Report View will be presented instead, accompanied by a red banner. If the project is valid, a blue banner will be presented.

Example of a generated sequence listing

Display sequence listing

WIPO Sequence allows you to generate a sequence listing in a more human-readable format than XML. After a sequence listing is generated, the XML file may be displayed in HTML format or exported as a text file. Using any internet browser, you may also save the displayed HTML format sequence listing as a PDF file. The export functionality is accessible from the “Display Sequence Listing” View.

If a sequence listing has not been successfully generated for a given project, then the Display Sequence Listing View will disable the “Display Sequence Listing” and “Export Sequence Listing as .txt file” buttons and you will see an error.

Language-dependent free text qualifiers

The qualifiers which allow a “free text” value in a project are further referenced within the “LANGUAGE-DEPENDENT QUALIFIERS” View of the Project page. Whenever a language-dependent qualifier is added to the current project, the qualifier will also be displayed in this View.

You can modify a translated free text value associated with a qualifier by clicking on the “Qualifier Name” value, which will open an overlay with an edit panel underneath the table.

You will need to provide the source language code and target language code for XLIFF file format export of the free text qualifiers. The translated values will need to be provided by translators before reimporting the XLIFF file.

Please take note that the translated qualifier value, appearing in the column “Non-English Qualifier Value,” corresponds to the selected language specified by the non-English free text language code.

If you click on the “IMPORT FREE-TEXT QUALIFIERS” button, the tool will open the file explorer so you can browse to find and select the “XLIFF” file to import. Multiple validation steps are provided to ensure that the correct mappings between the source and target language values are conducted.

Import free-text qualifiers from an XLIFF file

The selected file must be in XLIFF format and contain the following data items:

  • Project name
  • The target language code
  • The source language code
  • For each XLIFF unit element:
    • The qualifier's unique ID (format: a number preceded by the letter “q”)
    • The qualifier value in the source language tag
    • The qualifier value in the target language tag
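The data items listed above could be read from an XLIFF file roughly as follows. This is a sketch only: this manual does not specify which XLIFF version the tool writes or where the project name is stored, so the XLIFF 2.0 namespace, the `srcLang`/`trgLang` attributes, the use of the `original` attribute of the `file` element for the project name, and the sample content are all assumptions.

```python
import re
import xml.etree.ElementTree as ET

# Assumed XLIFF 2.0 namespace; adjust if the exported file uses another version.
NS = {"x": "urn:oasis:names:tc:xliff:document:2.0"}

# Hypothetical sample; a real export from the tool may be structured differently.
SAMPLE = """<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0"
       version="2.0" srcLang="en" trgLang="fr">
  <file id="f1" original="MyProject">
    <unit id="q1">
      <segment>
        <source>artificial linker</source>
        <target>lieur artificiel</target>
      </segment>
    </unit>
  </file>
</xliff>"""


def read_qualifiers(xml_text: str) -> dict:
    """Collect language codes, project name and (id, source, target) rows."""
    root = ET.fromstring(xml_text)
    file_el = root.find("x:file", NS)
    rows = []
    for unit in root.iterfind(".//x:unit", NS):
        unit_id = unit.get("id", "")
        # Qualifier IDs are a number preceded by the letter "q".
        if not re.fullmatch(r"q\d+", unit_id):
            raise ValueError(f"unexpected qualifier ID: {unit_id!r}")
        source = unit.findtext(".//x:source", default="", namespaces=NS)
        target = unit.findtext(".//x:target", default="", namespaces=NS)
        rows.append((unit_id, source, target))
    return {
        "project": file_el.get("original") if file_el is not None else None,
        "source_lang": root.get("srcLang"),
        "target_lang": root.get("trgLang"),
        "qualifiers": rows,
    }


print(read_qualifiers(SAMPLE))
```

Inspecting a file exported by the tool itself is the reliable way to confirm the actual element and attribute names before adapting this sketch.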

Once you have confirmed the selected file for import, the tool will ask you to verify if you want to proceed by confirming a series of verification steps:

  • The system compares the project name from the input file with the name of the selected project.
  • The system will inform you if any qualifiers could not be mapped.
  • The system will inform you of the changes related to the source language and the qualifiers values.
  • The system will inform you of the changes related to the target language and the qualifiers translated values.

After the completion of these steps, you will see a blue banner at the top: “SUCCESS: THE FREE TEXT QUALIFIER HAS BEEN IMPORTED SUCCESSFULLY” along with an import report displaying in detail the previous and current imported values for the language-dependent free text qualifiers. You can return to the Free Text Qualifier View by clicking the “RETURN TO FREE TEXT QUALIFIERS” button.

Exporting free text qualifiers in XLIFF format

If you click on the “EXPORT FREE TEXT QUALIFIERS” button at the top of the View and then, in the dialog box, select the file name and location to save the qualifier text file, all the free text qualifiers of the project will be exported and saved in XLIFF file format.

The file will include:

  • The project source language
  • The project target language
  • The free text qualifiers
  • The translated qualifier free text values
  • The associated qualifier and feature information provided in the table.

This XLIFF format file can be viewed, edited and imported in the tool again after providing the appropriate translation values.

Entering project data

A project is broken into two sections in the same way as the generated sequence listing: general information and sequence data. You can cut and paste data into a project or import data from another project or sequence listing. Generally, clicking on the pencil icon will enable you to edit project fields. Mandatory fields are indicated with an asterisk (“*”).

General Information

The General Information section enables you to enter information related to the patent application itself, which is used to associate the generated sequence listing with this application. The first subsection, Application Identification, is related to the selected project’s patent application status and information. The instructions below take you step by step through the information that must be provided in this section.

Step 1: Application identification

To edit information within the Application Identification subsection, click on the pencil icon highlighted, to the right of the subsection. You must then provide information based on the following steps:

  1. If the application already has an assigned application number, you must select the code of the Intellectual Property Office (IP Office) at which the application was filed. This is the WIPO Standard ST.3 code.
  2. You must select whether or not you have already received an application number or else have provided an applicant file reference, by selecting the appropriate radio button.
  3. If there is no application number, you MUST provide the applicant file reference in this field.
  4. If an application number has already been assigned, you should enter the application number provided.
  5. Select the filing date of the application with the Date Picker if a date has been assigned.
  6. Click the blue “Save” button.

In the example shown, all of the optional values have also been entered:

Screenshot showing the application identification fields

Note: Even if the mandatory values are entered, a warning will always appear in the verification report indicating that “The application identification number is absent. The application number is mandatory if the application number has been assigned.”

Step 2: Priority application/s

Next, if there is a priority application related to the present one, these details are entered in the Priority Application subsection. To add a priority application to the project, you must click on the “Add Earliest Priority Identification” button in the General Information Section.

To set the currently selected priority application as the earliest, you must select “Yes” in the “Selected Earliest Priority Application” drop-down. This will set or modify this as the earliest priority application when the sequence listing is generated. To finish, click on the blue “Add Earliest Priority Application” button.

Screenshot showing the priority application fields

Step 3: Applicant/Inventor

To add data regarding a new applicant or inventor to the project, you must click on the “Add Inventor” or “Add Applicant” button within the General Information Section of the Project Details View. As the steps for performing both these actions are identical, only general instructions will be provided, but this process must be repeated twice if both an applicant and an inventor are to be included within the project, even if the applicant is also the inventor.

When you click on the button, an overlay will open with two radio buttons. If “Existing applicant/inventor” is selected, you can choose from a drop-down box which lists currently saved persons and organizations within the local instance of the desktop tool.

If “New applicant/inventor” is selected, you must fill out the details required when a new person/organization is being created.

Once the details are complete, click on the “Add Applicant” or “Add Inventor” button.

Screenshot showing addition of new applicant or inventor

Note: Only one applicant is required for the sequence listing to be considered valid. As such, one applicant and/or inventor must be marked as primary. This is the applicant/inventor that will appear in the generated sequence listing.

Step 4: Invention title

The final step is the addition of an “Invention Title” within the General Information Section.

To add a new invention title:

  1. Click on the “Add Invention title” button.
  2. In the overlay that appears, you must enter the title of the invention and also indicate the language in which the title is provided.
  3. Click the blue “Add Invention title” button.

Screenshot showing addition of new invention title

According to WIPO ST.26, it is mandatory for a sequence listing to have the invention title provided in the language of filing. However, a project can optionally include more than one invention title, in additional languages, but only one invention title per language. Each new invention title can be added using the steps above.

Sequence data

The “Sequences” section of the Project Details View is where you provide the technical information related to the sequences themselves. WIPO Sequence provides a number of means to populate sequence data within a project, including manually creating, importing and inserting a sequence. The subsections below provide further details on the steps required to perform these actions.

Create a sequence

The steps to create a sequence in the project are as follows:

  1. Click the “Create new sequence” button. The “Sequence Panel” will come into focus.
  2. Name the sequence by typing the desired name into the “Sequence Name” field. Alternatively, leave the “Sequence Name” field blank and WIPO Sequence will assign a default name for the new sequence. Default names consist of “Seq_” followed by an incrementing number (“Seq_1”, “Seq_2”, “Seq_3”).

    Note: The purpose of the “Sequence Name” is only to make it easier for the user to distinguish between sequences within the project; this name will not appear in the sequence listing XML file.


  3. Use the “Molecule Type*” drop down box to select one of the three molecule types permitted by WIPO ST.26 (“DNA”, “RNA”, “AA”). If you wish to create a sequence with both “DNA” and “RNA” segments, “DNA” must be selected as the “Molecule Type”.
  4. Type or paste the sequence residues into the “Residues*” field.

    Note: If the sequence is a nucleotide sequence (“DNA” or “RNA”), only the symbols listed in WIPO ST.26, Annex I, Table 1 are permitted (a, c, g, t, m, r, w, s, y, k, v, h, d, b, n). Note that the symbol “u” is not permitted. Symbols entered as upper-case letters will be converted to lower-case letters. Any non-letter symbols that are entered will be automatically removed (spaces, numbers, *, -, etc.).

    If the sequence is an amino acid sequence (“AA”), only the symbols listed in WIPO ST.26, Annex I, Table 3 are permitted (A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, O, S, U, T, W, Y, V, B, Z, J, X). Note that only one-letter amino acid symbols are permitted. If your amino acid sequence is represented by three-letter amino acid symbols (for example, Met-Arg-Leu-Trp-Ile), it must first be converted to one-letter symbols before typing or pasting into the “Residues” field.


  5. Type or paste the source organism name in the “Organism name*” field. To select an organism name from the pre-defined list of organism names, simply start typing the name of the organism and a drop-down list will appear. Select the desired organism name, if present. Otherwise, the tool may prompt you to add the organism to the custom organism database stored locally.
  6. If the molecule type is “DNA” or “RNA”, then you must select a specific molecule type from the “Qualifier Molecule Type” drop-down box, which will populate with values appropriate for the molecule type selected in step 3. Note that if the molecule type is “AA”, then the “Qualifier Molecule Type” will automatically populate with the value “protein”.
  7. If the “Mark as an intentionally skipped sequence” box is checked, the sequence will be included in the resulting sequence listing XML file as an “intentionally skipped sequence”; i.e., it will be in the format specified in WIPO ST.26 paragraph 58. Note that the “Mark as an intentionally skipped sequence” box must be checked to save a sequence with fewer than 10 non-“n” nucleotide residues or fewer than 4 non-“X” amino acid residues. When a sequence is intentionally skipped, the “Sequence panel” will remove all constraints on providing values for mandatory elements, and the resulting saved sequence will be ignored when validating the project.
  8. If the molecule type of the sequence is “DNA”, a box labeled “The sequence contains both DNA & RNA fragments” will be visible. If the sequence contains both DNA and RNA fragments, checking this box allows the easy addition of the features required by WIPO ST.26, paragraph 55, for describing DNA/RNA hybrids. Upon checking the “The sequence contains both DNA & RNA fragments” box, the “Sequence panel” will expand to include fields allowing you to describe each DNA and RNA fragment with the feature key “misc_feature” and corresponding location. For each fragment, select “DNA” or “RNA” from the drop-down menu in the “Molecule Type” field, then enter a location in the “Location” field. Optionally, additional text can be included in the “Further text” field. You can create as many of these features as necessary by clicking on the “Add new ‘misc_feature’ feature” button. Note that the entire length of the sequence must be covered by “misc_feature” feature keys such that each and every residue is indicated as either “DNA” or “RNA”.
  9. To finish and save the sequence, click on the grey “Create sequence” button or the blue “Create & Display Sequence” button. If you click on the blue “Create & Display Sequence” button, a collapsible sequence display will open after creating the sequence for you to review the values. It will be visible within the Project Details View, beneath the Sequences Section.
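The residue rules described in the steps above can be sketched in Python, for example to prepare sequences before pasting them into the “Residues” field. This is an illustration, not the tool's implementation: the function names are hypothetical, and raising an error on a non-permitted letter is an assumption about how such input might be handled.

```python
NUCLEOTIDE_SYMBOLS = set("acgtmrwsykvhdbn")  # symbols permitted per the list above

# Standard three-letter to one-letter amino acid symbols.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C", "Gln": "Q",
    "Glu": "E", "Gly": "G", "His": "H", "Ile": "I", "Leu": "L", "Lys": "K",
    "Met": "M", "Phe": "F", "Pro": "P", "Pyl": "O", "Ser": "S", "Sec": "U",
    "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V", "Asx": "B", "Glx": "Z",
    "Xle": "J", "Xaa": "X",
}


def clean_nucleotides(raw: str) -> str:
    """Drop non-letter symbols, lower-case the rest, reject other letters."""
    letters = "".join(ch for ch in raw if ch.isalpha()).lower()
    invalid = sorted(set(letters) - NUCLEOTIDE_SYMBOLS)
    if invalid:
        raise ValueError(f"symbols not permitted: {invalid}")
    return letters


def three_to_one(text: str) -> str:
    """Convert e.g. "Met-Arg-Leu" to one-letter symbols; KeyError if unknown."""
    codes = text.replace("-", " ").split()
    return "".join(THREE_TO_ONE[code.capitalize()] for code in codes)


def needs_skip_flag(residues: str, molecule_type: str) -> bool:
    """True if the sequence is too short to save without the skipped box."""
    if molecule_type == "AA":
        return sum(1 for ch in residues if ch != "X") < 4
    return sum(1 for ch in residues if ch != "n") < 10


print(clean_nucleotides("ATG cgt-12*"))
print(three_to_one("Met-Arg-Leu-Trp-Ile"))
print(needs_skip_flag("acgtnnnnnn", "DNA"))
```

The last call reports that a sequence with only four non-“n” residues could be saved only as an intentionally skipped sequence.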

The newly created sequence can be found in the last position in the list of sequences, with the next available Sequence ID Number. To move it to a different position in the list, see the Reorder sequence section.

Import sequence

Sequences can also be imported directly from files into a project. The accepted file formats are listed in section 3. When an input file is selected, the desktop tool will automatically detect the format used in the file.

  1. To get started, click on the “Import sequence” button.
  2. Click on the “Upload file [.txt, .xml]” button. When the dialog box opens, select the file containing the sequence data to be imported. The desktop tool will detect the format being used and will perform some validation checks on import. There are five formats that the tool will accept for importing sequences into a project: raw, multi-sequence, FASTA, ST.26 and ST.25.
  3. In the case of selecting a file that is in WIPO ST.25 or ST.26 format, you will first see a “Select Range Sequences” checkbox. When checked, this will open a table listing the Sequence ID Numbers of each sequence in the file and the order in which they will be appended to the list of sequences provided in the project. If you do not wish to import all the sequences to the project, you can provide the desired range of sequence ID numbers. A single sequence can be entered, as well as a list of sequences separated by commas or a range of sequences in the form x-y. For example: “1, 3, 7, 13-20, 30-50”.
  4. In the case of importing a multi-sequence format file, you will see a “Select Range Sequence IDs” checkbox, which, when checked, will display a preview table listing the Sequence ID Numbers of the corresponding sequences in the file as well as the description for each sequence, including sequence name, molecule type and organism name. You must select the range of Sequence ID numbers that you wish to import to the list of sequences within the project. By default, the total number of sequences of the selected sequence listing file will be displayed as a range.
  5. The last two formats that are accepted by the import Sequence process are the raw and FASTA file formats. These formats only define a single sequence per file. When a raw or FASTA file is selected for import, the tool will display the relevant panel. You should proceed by providing the mandatory fields.
  6. After a successful import, the tool will navigate to the “Import Report” View.
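The range syntax described in step 3 can be illustrated with a short Python sketch. The function name `parse_sequence_ranges` is hypothetical and is not part of WIPO Sequence; the sketch only shows how an expression such as “1, 3, 7, 13-20, 30-50” could expand into individual Sequence ID Numbers, and the tool's own validation rules may differ.

```python
def parse_sequence_ranges(expression: str) -> list[int]:
    """Expand a range expression such as "1, 3, 7, 13-20" into sorted IDs.

    Hypothetical helper for illustration only.
    """
    ids = set()
    for part in expression.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing comma
        if "-" in part:
            # A range in the form x-y, both ends included.
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"range start exceeds end: {part!r}")
            ids.update(range(start, end + 1))
        else:
            ids.add(int(part))
    if any(number < 1 for number in ids):
        raise ValueError("Sequence ID Numbers start at 1")
    return sorted(ids)


selected = parse_sequence_ranges("1, 3, 7, 13-20, 30-50")
print(len(selected), "sequences selected")
```

Collecting the numbers in a set means that overlapping entries, such as “5, 3-7”, select each sequence only once.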

Insert sequence

To insert a sequence into a specific position of the list of sequences, you must click on the “Insert Sequence” button. An overlay panel will then appear. In addition to filling out all the information required for creating a sequence, at the top-left of the panel, you must enter the position in which the sequence should appear in the list of sequences. To finish, you can click on either the “Insert sequence” or “Insert & Display Sequence” buttons.

Video showing the insertion of a new sequence into the listing

Reorder sequence

You can change the order in which the sequences appear within the list of sequences provided in a project by using the steps shown in the following video.

Video showing the reordering of a sequence in the listing

Bulk edit

While any sequence can be edited individually by clicking on the pencil icon, this would be unfeasible for projects with a large number of sequences. Use the bulk edit feature when changes are needed for multiple sequences. A number of fields can be edited in this way, with further details provided below.

  1. Start by clicking on the “Bulk Edit” button.
  2. Choose the “Type of bulk edit” from the drop-down list.
  3. If “Qualifier molecule type” is selected, the system will then prompt you to select the type of nucleic acid sequences to which the bulk edit will apply. The system also gives a warning that the qualifier “mol_type” for sequences where organism = “synthetic construct” must be “other DNA” or “other RNA”, and that if you change these values an error will be generated on project validation. The system will display a preview of the sequences for bulk edit with the specified characteristics. During editing, the system will inform you that ONLY nucleic acid sequences can have the value of the qualifier “mol_type” edited (because the same value for the amino acid sequences is automatically set by the system to “protein”).
  4. If “Organism” is selected, you must enter the range of sequence IDs to be edited. Then if you have chosen to modify the value of the organism to “synthetic construct”, the system will notify you that the qualifier “mol_type” will be automatically changed to “other DNA” or “other RNA” according to the sequence molecule type (e.g., DNA).
  5. If “Features” is selected, you then need to specify if you want to edit existing features or add new ones. You must enter the “Molecule Type” and the range of sequence IDs to be edited.
    • If you select the type of bulk feature edit as “Edit feature”, for example by modifying the value of a CDS feature location to “complement(join(1..30,61..90))”, you should start by entering the following in the “Select Range of Sequence IDs” text box: the relevant sequence IDs, the “Molecule Type”, and the relevant “Feature Key” and its “Feature Location”. To finish, select “Edit sequences”.
    • If you select the type of bulk feature edit as “Add feature”, for example by adding a new “CHAIN” feature with feature location “1..4”, you should start by entering the following in the “Select Range of Sequence IDs” text box: the relevant sequence IDs, the “Molecule Type”, and the relevant “Feature Key” and its “Feature Location”.
  6. If “Bulk skip” is selected, you must then enter, in the “Select Range of Sequence IDs” text box, the range of sequence IDs you wish to skip.
  7. If “Bulk delete” is selected, you must provide, in the “Select Range of Sequence IDs” box, the range of sequences you would like to delete.
  8. For all bulk edit operations, after clicking the “Edit sequences” button, the tool will inform you of the operation’s success by displaying a blue banner. An example is shown below.

Video showing the addition of a CHAIN feature key to sequences 2 and 12
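The “mol_type” behaviour described in steps 3 and 4 can be summarized in a small Python sketch. The function `automatic_mol_type` is a hypothetical name used only for illustration; it encodes just the two rules stated above (amino acid sequences are set to “protein”, and nucleic acid sequences with organism “synthetic construct” are set to “other DNA” or “other RNA”) and does not reproduce the tool's full validation logic.

```python
from typing import Optional


def automatic_mol_type(sequence_type: str, organism: str) -> Optional[str]:
    """Return the mol_type that is set automatically, or None if the user chooses it.

    Hypothetical helper; sequence_type is expected to be "DNA", "RNA" or "AA".
    """
    if sequence_type == "AA":
        # Amino acid sequences always receive the value "protein".
        return "protein"
    if organism == "synthetic construct" and sequence_type in ("DNA", "RNA"):
        # Synthetic constructs must use "other DNA" or "other RNA".
        return f"other {sequence_type}"
    return None  # no automatic value: the user selects the mol_type


print(automatic_mol_type("DNA", "synthetic construct"))
```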

Entering feature data

According to WIPO ST.26, every sequence MUST have at least one “source” feature associated with it. Each source feature must have two mandatory qualifiers: “organism” and “mol_type”.

The Features Table has three columns: the feature key, the location of the feature within the genetic sequence and the qualifiers associated with an individual sequence feature.

The feature location indicates in which segment of the sequence the feature exists. The allowable formats to specify the feature location are provided in WIPO ST.26 and are as follows:

  • Single residue number: x
  • Residue numbers delimiting a sequence span: x..y
  • Residues before the first or beyond the last specified residue number: <x, >x, <x..y, x..>y, <x..>y
  • A site between two adjoining nucleotides: x^y
  • Residue numbers joined by an intrachain cross-link: x..y

Location operators can be used to form complex location descriptions:

  • “join(location, location, … location)”: The locations are joined (placed end-to-end) to form one contiguous sequence.
  • “order(location, location, … location)”: The elements are found in the specified order, but nothing is implied about whether joining those elements is reasonable.
  • “complement(location)”: Indicates that the feature is located on the strand complementary to the sequence span specified by the location descriptor, when read in the 5′ to 3′ direction or in the direction that mimics the 5′ to 3′ direction.
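As a rough illustration of the location formats and operators listed above, the following Python sketch checks whether a location string is syntactically plausible. The names `SIMPLE_LOCATION`, `split_top_level` and `is_plausible_location` are hypothetical, and the check is deliberately simplified: it looks at syntax only, does not compare residue numbers with the sequence length, and is not the validation performed by WIPO Sequence or required by WIPO ST.26.

```python
import re

# x, <x, >x, x..y, <x..y, x..>y, <x..>y and x^y
SIMPLE_LOCATION = re.compile(r"(?:[<>]?\d+|<?\d+\.\.>?\d+|\d+\^\d+)")
OPERATOR = re.compile(r"(join|order|complement)\((.*)\)")


def split_top_level(arguments: str) -> list[str]:
    """Split on commas that are not nested inside parentheses."""
    parts, current, depth = [], [], 0
    for char in arguments:
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
        if char == "," and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(char)
    parts.append("".join(current))
    return parts


def is_plausible_location(location: str) -> bool:
    """Return True if the string looks like a simple or operator-based location."""
    text = location.replace(" ", "")
    if SIMPLE_LOCATION.fullmatch(text):
        return True
    match = OPERATOR.fullmatch(text)
    if not match:
        return False
    operator, arguments = match.groups()
    parts = split_top_level(arguments)
    if operator == "complement" and len(parts) != 1:
        return False  # complement takes exactly one location
    # Operators may be nested, so each argument is checked recursively.
    return all(is_plausible_location(part) for part in parts)


print(is_plausible_location("complement(join(1..30,61..90))"))
```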

To add a new feature to the sequence, click the “Add feature” button in the Features Section of the selected sequence. Qualifiers can also be added to the feature at this stage; these will be covered in more detail in the next section.

Screenshot showing the add feature button

CDS features

The CDS feature type is used to describe the coding sequence for a protein. A CDS feature may optionally include the amino acid translation of the segment of the sequence to which it belongs. If this translation satisfies the minimum length requirement, it will appear as a separate sequence within the project. Within the CDS feature of the original sequence, there is a reference to the Sequence ID of the translated amino acid sequence provided in the “protein_id” qualifier.

When creating a “CDS” feature for a sequence, the “translation” qualifier (with default “Genetic Code” value of 1 – “Standard Code”) can be automatically added to the CDS feature, with a qualifier value equal to the translation of the segment of the sequence indicated by the feature location. An associated “protein_id” qualifier and separate amino acid sequence may also be generated by checking the checkbox in Basic Information provided at the top of the project details page. However, this qualifier is not mandatory and can be deleted after generation. You can also manually create a “translation” qualifier and a “protein_id” qualifier which references the associated translated Sequence ID, which you have also created.

Note: From version 2.1.0, the “Automatically add a translation qualifier” checkbox is ticked by default.

Screenshot showing the add feature button

The steps for automatically creating a CDS feature qualifier are as follows:

  1. In the specific sequence display, click the “Add feature” button and select “CDS” as the feature key. If the checkbox “Automatically add a translation qualifier” in Basic Information is checked, it will automatically add a translation qualifier, its value, and a protein_id qualifier and its associated separate amino acid sequence (if appropriate) when a CDS feature is added to a nucleotide sequence.
  2. You also have the option of manually creating a translation qualifier.
  3. When you are finished editing the feature and its related qualifiers, you must click the “Create Feature” button to save it. The resulting CDS feature is then shown associated with the sequence.

If the translation qualifier value meets the minimum length requirements, then the tool creates a new sequence for the project with the following attributes:

  • Sequence ID Number = the next available value for Sequence ID Number
  • Length = length of the translated sequence
  • Sequence Name = the value given in the “Sequence Name” field of the “translation” qualifier. If no name was provided, the default sequence name will be provided (“Seq_#”).
  • Molecule Type = “AA”
  • Organism Name = the same value as provided for the original sequence
  • Qualifier Molecule Type = “protein”
  • Sequence Residues = translated values of the original sequence

Note: The separate translated sequence is created only if it has at least four specifically defined residues (e.g., “AXTG” counts as three specifically defined residues). In the case of modifying the “translation” qualifier, if the qualifier value includes fewer than four specifically defined residues, then the associated sequence translation will be removed, as will the “protein_id” qualifier.
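The minimum length rule in the note above can be sketched in Python as follows. The helper names are hypothetical, and the sketch assumes that “X” is the only amino acid symbol that is not specifically defined, which is consistent with the “AXTG” example but is an assumption about how the tool counts residues.

```python
MIN_DEFINED_RESIDUES = 4  # threshold stated in the note above


def count_specifically_defined(translation: str, undefined_symbol: str = "X") -> int:
    """Count residues other than the undefined symbol (assumed to be "X")."""
    return sum(1 for residue in translation.upper() if residue != undefined_symbol)


def creates_separate_sequence(translation: str) -> bool:
    """Return True if the translation is long enough to become its own sequence."""
    return count_specifically_defined(translation) >= MIN_DEFINED_RESIDUES


print(count_specifically_defined("AXTG"), creates_separate_sequence("AXTG"))
```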

Advice on CDS features when including a pseudo or pseudogene qualifier:

Make sure auto-translation is turned off when adding a pseudo or pseudogene qualifier to a CDS feature. If auto-translation is not turned off when a pseudo or pseudogene qualifier is added to a CDS feature, then, when the CDS feature is updated, a translation qualifier will automatically be added. To correct this error, turn off auto-translate for the project, then open the CDS feature and delete the translation and “protein_id” qualifiers, and then update the feature.

If you wish to automatically generate the translation qualifier, the translation table value and sequence name can be set from the Edit Panel of the qualifier. When you create the feature, the tool will perform the translation and then add a “protein_id” qualifier to the feature and a new sequence with the value of the translation.

The translation will be performed again, only if the feature location or one of the qualifiers “transl_table”, “transl_except”, or “codon_start” changes its values, in which case the linked sequence will be updated.

Note: If the translation value is changed, the linked sequence will update its value automatically. However, if the linked nucleotide sequence is modified, the value of the translation qualifier will not change. If the “protein_id” qualifier is modified after creation, then the linked sequence will lose its association to the original sequence.

Advice on use of stop codon:

Typically, stop codons should only be found at the end of a CDS feature, indicating the end point of the encoded amino acid sequence. They should never be found in the middle of a CDS feature unless there is a “transl_except” qualifier that indicates that the stop codon is to be translated into a particular amino acid.

If a stop codon is found in the middle of a CDS feature (highlighted in yellow below), and there is no “transl_except” qualifier indicating that the stop codon is to be translated into a particular amino acid then the tool should stop translation at that point and a red banner will be displayed informing you that no translation will be generated.

An error should be listed in the validation report to alert you that there is a problem with your coding sequence.
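The stop codon behaviour can be illustrated with a simplified Python sketch of translation using the standard genetic code (table 1). The function `translate_cds` is hypothetical and is not the tool's implementation: it ignores the “transl_except”, “transl_table” and “codon_start” qualifiers, translates any codon containing an ambiguity symbol as “X”, and raises an error for an internal stop codon instead of displaying a banner.

```python
from itertools import product

BASES = "tcag"
# Amino acids of the standard genetic code, with codons ordered t, c, a, g.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(cds: str) -> str:
    """Translate a coding sequence, rejecting stop codons before the final codon."""
    cds = cds.lower()
    usable_length = len(cds) - len(cds) % 3  # ignore an incomplete final codon
    codons = [cds[i:i + 3] for i in range(0, usable_length, 3)]
    protein = [STANDARD_CODE.get(codon, "X") for codon in codons]
    if protein and protein[-1] == "*":
        protein.pop()  # a terminal stop codon ends the protein and is dropped
    if "*" in protein:
        position = protein.index("*") * 3 + 1
        raise ValueError(f"internal stop codon at nucleotide {position}")
    return "".join(protein)


print(translate_cds("atggcctaa"))
```

Calling `translate_cds("atgtaagcc")` raises a `ValueError`, because the stop codon “taa” occurs before the last codon of the coding sequence.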

Entering qualifier data

Qualifiers are used to provide information about features in addition to that conveyed by the feature key and feature location. There are three types of value formats to accommodate different types of information conveyed by qualifiers:

  1. free text
  2. controlled vocabulary or enumerated values (e.g., a number or date)
  3. sequences.

To view the qualifiers for a feature, you must first select the relevant feature from the Feature Table of the relevant sequence, then click on the pencil icon which will open an overlay.

Existing qualifiers can be edited by clicking on the pencil icon to the right of each row, or you can add a new qualifier to the currently selected feature by clicking the “Add qualifier” button.

Screenshot showing qualifier overlay

When editing or adding a qualifier, you will be presented with two fields: the “Qualifier name” (to be selected from a drop-down list) and the “Qualifier value”.

The Qualifier Value field will have a different behavior depending on the type of qualifier:

  • Qualifiers with predefined values: the value field is a drop-down field where you can select one of the predefined values for the qualifier.
  • Qualifiers with free text: the value field is a free text field. In addition to the Qualifier Name and the Qualifier Value, which holds the English value only, two additional fields appear to allow you to provide both the language code (e.g., “ru”) and the corresponding language value in the Non-English Qualifier Value. The language code field is assigned the same value as the “Non-English free text language code” field in the Project Details Information. You can provide a series of non-English values for each selected language either by manual input or by importing the proper associated language from an XLIFF file.
  • Qualifiers with predefined format: the value field is a free text field, but the value entered is validated to ensure it matches the specific rules provided in WIPO ST.26 Annex I, Section 6.
  • Qualifiers with no value allowed: the qualifier value field is not editable.

When finished, you must click the blue “Create Qualifier” button to add the newly created qualifier, or “Save”, to save the changes made to the existing qualifier. As a last step, once the qualifier(s) have been added/modified, you must click on the “Update feature” button at the bottom of the overlay to proceed.

2.3 Persons & Organizations View

This View manages all of the persons and organizations saved locally.

Create person/organization

To create a new Person or Organization, you must begin from the Persons & Organizations View. Click on the “CREATE NEW PERSON OR ORGANIZATION” link at the top of the View, as shown:

Video showing the creation of a new organization

In the new View, you must at least fill in the mandatory fields (indicated with an asterisk “*”) corresponding to the details of the new person/organization. For the applicant/inventor, this is the name (if provided in Latin characters) and the language only.

When the name of the person or organization is not in Latin characters, then the Latin character version of the name should be provided in the “Name Latin” field. If this information is not provided, then the project will not validate when the ST.26 sequence listing is validated or generated.

2.4 Custom Organisms View

To create, edit, import, export or delete Custom Organisms, you must begin from the Organisms View.

Create custom organism

Create new custom organism locally

To create a new custom organism, click the “CREATE NEW ORGANISM” link at the top of the View. In the next screen, enter the name of the new Organism and click “Save”. If a description of this custom organism is required, this can be optionally added as shown. To edit the details, click on the name of the organism.

Export custom organism

All the custom organisms and their descriptions that are stored in the tool can be exported and saved to a text file, to be modified outside the tool or imported at a later date. To export this list, start by selecting “EXPORT CUSTOM ORGANISMS”, as highlighted below.

Want to export a custom organism

Next, a dialog box will open allowing you to choose the name of the file and the desired file location.

Enter the details in the dialog

The file that is exported is a text file including both the name and the description of the organism which could be edited and imported into the tool. Download an example.

Import custom organisms

Want to import a custom organism

Firstly, in order to import a list of custom organisms, you must click on the “IMPORT CUSTOM ORGANISMS” link at the top of the View. This will open an Overlay below the custom organisms summary table.

  1. Click on the “Upload file [.txt]” button.
  2. Select the file with the custom organism names from within the dialog box.
  3. Finally, click on the blue “Import Custom Organisms” button.

Note: The file to be imported will be a text file (*.txt) with a list of custom organism names in plain text (UTF-8), each item on a new line.
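A file in the layout described in the note above (plain text, UTF-8, one organism name per line) could be prepared outside the tool, for example with the following Python sketch. The function name `write_custom_organisms`, the file name and the organism names are all invented for illustration.

```python
import tempfile
from pathlib import Path


def write_custom_organisms(names: list[str], path: Path) -> None:
    """Write one organism name per line, encoded as UTF-8."""
    cleaned = [name.strip() for name in names if name.strip()]
    path.write_text("\n".join(cleaned) + "\n", encoding="utf-8")


# Usage: the organism names below are invented placeholders.
with tempfile.TemporaryDirectory() as folder:
    target = Path(folder) / "custom_organisms.txt"
    write_custom_organisms(["Example organism alpha", "Exemple d'organisme bêta"], target)
    written = target.read_text(encoding="utf-8").splitlines()

print(written)
```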

2.5 System Preferences View

The System Preferences View allows the modification of several configuration parameters of WIPO Sequence. These parameters will apply to every project created or edited by the tool.

Want to update preferences for all projects

In order to modify the system preferences, you should click on the pencil icon shown above to open the Edit Panel.

Update the default values provided

The list of configuration items that can be modified from this View (in order) are:

  • Maximum number of residue symbols to be displayed: This parameter sets the number of residues that will be displayed per row when displaying a sequence. The default is 60.
  • Default location where the ST.26 sequence listing file (.xml) will be generated. There is no need to provide this.
  • Maximum number of sequences to print (leave empty for all): the default is 1,000.
  • Maximum number of residues to print (leave empty for all): the default is 1,200.
  • Original free text language code: If this checkbox is checked, then a warning will be shown during validation if the original free text language code is not provided. By default, this is unchecked.
  • Enable XQV_49: If this checkbox is checked, then a warning will be shown if there is no English value for a language-dependent free text qualifier provided. By default, this is off.
  • Default interface language: This is the language in which the interface will appear when WIPO Sequence is launched. By default, this is English.

Note: The third and fourth items are relevant when printing the project as a PDF. For very large sequence listings, the resulting PDF can have several thousand pages and be impossible to display.
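To make the first parameter more concrete, the following Python sketch splits a sequence into rows of a fixed number of residue symbols, using the default of 60 stated above. The function `format_residue_rows` is a hypothetical name and only approximates how a display routine might wrap residues; it is not taken from WIPO Sequence.

```python
def format_residue_rows(residues: str, residues_per_row: int = 60) -> list[str]:
    """Split a residue string into rows of at most residues_per_row symbols."""
    if residues_per_row < 1:
        raise ValueError("residues_per_row must be a positive number")
    return [
        residues[start:start + residues_per_row]
        for start in range(0, len(residues), residues_per_row)
    ]


rows = format_residue_rows("acgt" * 35)  # 140 residues
for row in rows:
    print(row)
```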

3. File format

WIPO Standard ST.25

For details on the format of WIPO Standard ST.25 files, please refer to the Standard.

An example is provided as Annex III to the Standard.

Raw

This format can only describe one sequence. The genetic code is written in its basic form with no additional information. When imported, molecule type, features and name must be added to the sequence through the tool.

Example:

aggatatagatagtatatgatagtatgatatgatgatgtatgtatagtgtagttatga

Multi-sequence

The multi-sequence format can describe one or multiple sequences, along with their name, the type of molecule and the name of the organism. It is one of the allowable formats for import using “PatentIn”. The first line of non-blank text is the header and is comprised of the following components:

<Sequence Name; Sequence Type; Organism Name>

Note: The < and > and ; characters are mandatory elements (spaces are obligatory).

The information of the header will be interpreted by WIPO Sequence as follows:

Header entry: Sequence Name
  • Allowable input: name of the sequence (free text)
  • Interpreted as: name of the sequence in the WIPO project file (will not be part of the XML file)

Header entry: Sequence Type
  • Allowable input: one of DNA, RNA or AA
  • Interpreted as: mol_type. Note that, depending on the Organism input, it will be required to further define the mol_type for DNA and RNA (ST.26, paragraphs 75-84)

Header entry: Organism Name
  • Allowable input: organism name (free text)
  • Interpreted as: source/organism. Note: if the input is a synthetic construct, the mol_type for DNA and RNA will automatically be identified as “other DNA” or “other RNA” (ST.26, paragraph 84(a)).

Sequence data begins on the line after the header. A new sequence is delineated by a new line in the file, after the end of the sequence information of the previous sequence. There may be one or more empty lines between the end of a sequence and the start of the next header. Amino acid sequences must be in one letter code (Annex I, Section 3, Table 3). Allowable nucleotide symbols are those of Annex I, Section 1 Table 1. Note that “u” in RNA sequences will not be automatically converted to “t” as required by ST.26 paragraphs 14 and 19 and will require manual intervention after import. It is recommended to convert “u” to “t” before performing a multi-sequence import of a file containing RNA sequences.

The following is an example of a set of three sequences defined in multi-sequence format.

Example:

<First Sequence; RNA; Abies alba>
uuuucuuauuguuucuccuacugcuuaucauaaugauugucguaguggcuuccucaucgucucccccaccgccuaccacaacgacugccgcagcggauuacuaauaguaucaccaacagcauaacaaaaagaaugacgaagaggguugcugauggugucgccgacggcguagcagaaggaguggcggagggg
<Second Sequence; DNA; synthetic construct>
attgacgtcagtgacgcggtactgacgtcagctgcagtactgacgtaccaaccacgtggtgagctctcgacatgcaactgactcgtcgctattgacgtcagtgacgcggtactgacgtcagctgcagtactgacgtaccaaccacgtggtgagctctcgacatgcaactgactcgtcgctcagt
<Third Sequence; AA; Mus musculus>
SPPGKPQGPPPQGGNQPQGPPPPPGKPQGPPPQGGNRPQGPPPPGKPQGPPPQGDKSRSPR
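The multi-sequence layout described in this section can be read with a short Python sketch such as the one below. The names `SequenceRecord` and `parse_multi_sequence` are hypothetical and the parser is a simplification, not the import routine of WIPO Sequence: it expects each header on its own line, followed by the residues on the next line or lines. It also applies the recommended conversion of “u” to “t” for RNA sequences, so that the text is ready for import.

```python
import re
from dataclasses import dataclass

HEADER = re.compile(r"<([^;>]+);([^;>]+);([^>]+)>")


@dataclass
class SequenceRecord:
    name: str
    sequence_type: str  # "DNA", "RNA" or "AA"
    organism: str
    residues: str = ""


def parse_multi_sequence(text: str, convert_u_to_t: bool = True) -> list[SequenceRecord]:
    """Parse <Name; Type; Organism> headers and the residue lines that follow them."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # empty lines between sequences are allowed
        match = HEADER.fullmatch(line)
        if match:
            name, sequence_type, organism = (group.strip() for group in match.groups())
            if sequence_type not in ("DNA", "RNA", "AA"):
                raise ValueError(f"unknown sequence type: {sequence_type!r}")
            records.append(SequenceRecord(name, sequence_type, organism))
        elif not records:
            raise ValueError("sequence data found before any header")
        else:
            records[-1].residues += line  # residues may span several lines
    if convert_u_to_t:
        for record in records:
            if record.sequence_type == "RNA":
                record.residues = record.residues.replace("u", "t").replace("U", "T")
    return records


sample = "<Probe A; RNA; synthetic construct>\nuuuc\nuuau\n\n<Peptide B; AA; Mus musculus>\nSPPGKPQG\n"
for record in parse_multi_sequence(sample):
    print(record.name, record.sequence_type, len(record.residues))
```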

FASTA

This format contains residues and description. While importing, you have the option to save the description as a note qualifier.

Example:

>AJ011880.1 Artificial oligonucleotide sequence SSR primer (CAC13R)
CTCAACAATCTGAAGCATCG
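The structure of a single-sequence FASTA file can be illustrated with the following Python sketch. It assumes the conventional FASTA layout, in which the description line starts with “>” and the residues follow on one or more lines; the function `parse_fasta` is a hypothetical name and is not the parser used by WIPO Sequence.

```python
def parse_fasta(text: str) -> tuple[str, str]:
    """Return (description, residues) for a FASTA file holding a single sequence."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a description line starting with '>'")
    if any(line.startswith(">") for line in lines[1:]):
        raise ValueError("only one sequence per file is expected")
    description = lines[0][1:].strip()
    residues = "".join(lines[1:])  # residues may be wrapped over several lines
    return description, residues


description, residues = parse_fasta(
    ">AJ011880.1 Artificial oligonucleotide sequence SSR primer (CAC13R)\n"
    "CTCAACAATCTGAAGCATCG\n"
)
print(description)
print(len(residues), "residues")
```

The description returned here corresponds to the text that can optionally be saved as a note qualifier during import.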

4. Troubleshooting

WIPO Sequence provides a “Help” functionality, accessible from the top menu.

Common problems and workarounds

The import report or validation report indicates that the project contains multiple qualifier IDs.

Workaround: Delete the feature which includes a duplicate qualifier ID. Recreate the feature at the same location but do not add the qualifier before saving. Next, edit the feature to add the relevant qualifier and update the feature.

The suggested filename with Linux distribution is incorrect.

There is a known issue when using the Linux distribution: an extra “\” appears in the suggested file name by default. In order to resolve this, please manually remove the extra “\” before saving.

Refer to the WIPO Sequence knowledge base for other answers

An error is shown when trying to display the sequence listing in HTML format

If the generated sequence listing in XML format is greater than 100 Mb in size, instead of displaying the sequence listing in HTML format, an error indicates that the sequence listing is too large to be rendered in HTML format.

For any other outstanding questions, contact us.