Administrative Instructions under the Patent Cooperation Treaty 

Annex C, Appendix 2

Nucleotide and Amino Acid Symbols and Feature Table

Table 6: List of Feature Keys Related to Protein Sequences

key
description
CONFLICT

different papers report differing sequences

VARIANT

authors report that sequence variants exist

VARSPLIC

description of sequence variants produced by alternative splicing

MUTAGEN

site which has been experimentally altered

MOD_RES

post-translational modification of a residue

      ACETYLATION

N-terminal or other

      AMIDATION

generally at the C-terminal of a mature active peptide

      BLOCKED

undetermined N- or C-terminal blocking group

      FORMYLATION

of the N-terminal methionine

      GAMMA-CARBOXYGLUTAMIC ACID

      HYDROXYLATION

of asparagine, aspartic acid, proline or lysine

      METHYLATION

generally of lysine or arginine

      PHOSPHORYLATION

of serine, threonine, tyrosine, aspartic acid or histidine

      PYRROLIDONE CARBOXYLIC ACID

N-terminal glutamate which has formed an internal cyclic lactam

      SULFATATION

generally of tyrosine

LIPID

covalent binding of a lipidic moiety

     MYRISTATE

myristate group attached through an amide bond to the N-terminal glycine residue of the mature form of a protein or to an internal lysine residue

     PALMITATE

palmitate group attached through a thioether bond to a cysteine residue or through an ester bond to a serine or threonine residue

     FARNESYL

farnesyl group attached through a thioether bond to a cysteine residue

     GERANYL-GERANYL

geranyl-geranyl group attached through a thioether bond to a cysteine residue

     GPI-ANCHOR

glycosyl-phosphatidylinositol (GPI) group linked to the alpha-carboxyl group of the C-terminal residue of the mature form of a protein

      N-ACYL DIGLYCERIDE

N-terminal cysteine of the mature form of a prokaryotic lipoprotein with an amide-linked fatty acid and a glyceryl group to which two fatty acids are linked by ester linkages

DISULFID

disulfide bond; the ‘FROM’ and ‘TO’ endpoints represent the two residues which are linked by an intra-chain disulfide bond; if the ‘FROM’ and ‘TO’ endpoints are identical, the disulfide bond is an interchain one and the description field indicates the nature of the cross-link

THIOLEST

thiolester bond; the ‘FROM’ and ‘TO’ endpoints represent the two residues which are linked by the thiolester bond

THIOETH

thioether bond; the ‘FROM’ and ‘TO’ endpoints represent the two residues which are linked by the thioether bond

CARBOHYD

glycosylation site; the nature of the carbohydrate (if known) is given in the description field

METAL

binding site for a metal ion; the description field indicates the nature of the metal

BINDING

binding site for any chemical group (co-enzyme, prosthetic group, etc.); the chemical nature of the group is given in the description field

SIGNAL

extent of a signal sequence (prepeptide)

TRANSIT

extent of a transit peptide (mitochondrial, chloroplastic, or for a microbody)

PROPEP

extent of a propeptide

CHAIN

extent of a polypeptide chain in the mature protein

PEPTIDE

extent of a released active peptide

DOMAIN

extent of a domain of interest on the sequence; the nature of that domain is given in the description field

CA_BIND

extent of a calcium-binding region

DNA_BIND

extent of a DNA-binding region

NP_BIND

extent of a nucleotide phosphate binding region; the nature of the nucleotide phosphate is indicated in the description field

TRANSMEM

extent of a transmembrane region

ZN_FING

extent of a zinc finger region

SIMILAR

extent of a similarity with another protein sequence; precise information relative to that sequence is given in the description field

REPEAT

extent of an internal sequence repetition

HELIX

secondary structure: Helices, for example, Alpha‑helix, 3(10) helix, or Pi‑helix

STRAND

secondary structure: Beta‑strand, for example, Hydrogen bonded beta‑strand, or Residue in an isolated beta‑bridge

TURN

secondary structure: Turns, for example, H‑bonded turn (3‑turn, 4‑turn or 5‑turn)

ACT_SITE

amino acid(s) involved in the activity of an enzyme

SITE

any other interesting site on the sequence

INIT_MET

the sequence is known to start with an initiator methionine

NON_TER

the residue at an extremity of the sequence is not the terminal residue; if applied to position 1, this signifies that the first position is not the N-terminus of the complete molecule; if applied to the last position, it signifies that this position is not the C-terminus of the complete molecule; there is no description field for this key

NON_CONS

non-consecutive residues; indicates that two residues in a sequence are not consecutive and that there are a number of unsequenced residues between them

UNSURE

uncertainties in the sequence; used to describe region(s) of a sequence for which the authors are unsure about the sequence assignment
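
The endpoint rules stated above for DISULFID and NON_TER can be illustrated with a short Python sketch. It is not part of the Administrative Instructions. The `Feature` class, its field names (`start` for the ‘FROM’ endpoint, `end` for the ‘TO’ endpoint), the two helper functions, and the use of 1-based residue positions are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """Hypothetical container for one feature table entry."""
    key: str               # feature key from Table 6, e.g. "DISULFID"
    start: int             # 'FROM' endpoint (assumed 1-based residue position)
    end: int               # 'TO' endpoint (assumed 1-based residue position)
    description: str = ""  # free-text description field, where the key has one


def disulfide_kind(feature: Feature) -> str:
    """Classify a DISULFID feature from its endpoints.

    Per the table: identical 'FROM' and 'TO' endpoints mean an interchain
    bond; otherwise the endpoints are the two residues linked by an
    intra-chain bond.
    """
    if feature.key != "DISULFID":
        raise ValueError(f"expected DISULFID, got {feature.key}")
    return "interchain" if feature.start == feature.end else "intrachain"


def non_ter_meaning(feature: Feature, sequence_length: int) -> str:
    """Interpret a NON_TER feature at either extremity of the sequence."""
    if feature.key != "NON_TER":
        raise ValueError(f"expected NON_TER, got {feature.key}")
    if feature.start == 1:
        return "position 1 is not the N-terminus of the complete molecule"
    if feature.start == sequence_length:
        return "the last position is not the C-terminus of the complete molecule"
    # The table describes NON_TER only for a residue at an extremity.
    raise ValueError("NON_TER applies only to the first or last position")
```

For example, `disulfide_kind(Feature("DISULFID", 23, 88))` evaluates to `"intrachain"`, while a DISULFID feature whose two endpoints are both 45 is classified as `"interchain"`, in which case the description field would indicate the nature of the cross-link.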

           


 