Administrative Instructions under the Patent Cooperation Treaty
Annex C, Appendix 2
Nucleotide and Amino Acid Symbols and Feature Table
Table 6: List of Feature Keys Related to Protein Sequences
| Key | Description |
|---|---|
| CONFLICT | different papers report differing sequences |
| VARIANT | authors report that sequence variants exist |
| VARSPLIC | description of sequence variants produced by alternative splicing |
| MUTAGEN | site which has been experimentally altered |
| MOD_RES | post-translational modification of a residue |
| ACETYLATION | N-terminal or other |
| AMIDATION | generally at the C-terminal of a mature active peptide |
| BLOCKED | undetermined N- or C-terminal blocking group |
| FORMYLATION | of the N-terminal methionine |
| GAMMA-CARBOXYGLUTAMIC ACID | of glutamate |
| HYDROXYLATION | of asparagine, aspartic acid, proline or lysine |
| METHYLATION | generally of lysine or arginine |
| PHOSPHORYLATION | of serine, threonine, tyrosine, aspartic acid or histidine |
| PYRROLIDONE CARBOXYLIC ACID | N-terminal glutamate which has formed an internal cyclic lactam |
| SULFATATION | generally of tyrosine |
| LIPID | covalent binding of a lipidic moiety |
| MYRISTATE | myristate group attached through an amide bond to the N-terminal glycine residue of the mature form of a protein or to an internal lysine residue |
| PALMITATE | palmitate group attached through a thioether bond to a cysteine residue or through an ester bond to a serine or threonine residue |
| FARNESYL | farnesyl group attached through a thioether bond to a cysteine residue |
| GERANYL-GERANYL | geranyl-geranyl group attached through a thioether bond to a cysteine residue |
| GPI-ANCHOR | glycosyl-phosphatidylinositol (GPI) group linked to the alpha-carboxyl group of the C-terminal residue of the mature form of a protein |
| N-ACYL DIGLYCERIDE | N-terminal cysteine of the mature form of a prokaryotic lipoprotein with an amide-linked fatty acid and a glyceryl group to which two fatty acids are linked by ester linkages |
| DISULFID | disulfide bond; the ‘FROM’ and ‘TO’ endpoints represent the two residues which are linked by an intra-chain disulfide bond; if the ‘FROM’ and ‘TO’ endpoints are identical, the disulfide bond is an interchain one and the description field indicates the nature of the cross-link |
| THIOLEST | thiolester bond; the ‘FROM’ and ‘TO’ endpoints represent the two residues which are linked by the thiolester bond |
| THIOETH | thioether bond; the ‘FROM’ and ‘TO’ endpoints represent the two residues which are linked by the thioether bond |
| CARBOHYD | glycosylation site; the nature of the carbohydrate (if known) is given in the description field |
| METAL | binding site for a metal ion; the description field indicates the nature of the metal |
| BINDING | binding site for any chemical group (co-enzyme, prosthetic group, etc.); the chemical nature of the group is given in the description field |
| SIGNAL | extent of a signal sequence (prepeptide) |
| TRANSIT | extent of a transit peptide (mitochondrial, chloroplastic, or for a microbody) |
| PROPEP | extent of a propeptide |
| CHAIN | extent of a polypeptide chain in the mature protein |
| PEPTIDE | extent of a released active peptide |
| DOMAIN | extent of a domain of interest on the sequence; the nature of that domain is given in the description field |
| CA_BIND | extent of a calcium-binding region |
| DNA_BIND | extent of a DNA-binding region |
| NP_BIND | extent of a nucleotide phosphate binding region; the nature of the nucleotide phosphate is indicated in the description field |
| TRANSMEM | extent of a transmembrane region |
| ZN_FING | extent of a zinc finger region |
| SIMILAR | extent of a similarity with another protein sequence; precise information relative to that sequence is given in the description field |
| REPEAT | extent of an internal sequence repetition |
| HELIX | secondary structure: Helices, for example, Alpha‑helix, 3(10) helix, or Pi‑helix |
| STRAND | secondary structure: Beta‑strand, for example, Hydrogen bonded beta‑strand, or Residue in an isolated beta‑bridge |
| TURN | secondary structure: Turns, for example, H‑bonded turn (3‑turn, 4‑turn or 5‑turn) |
| ACT_SITE | amino acid(s) involved in the activity of an enzyme |
| SITE | any other interesting site on the sequence |
| INIT_MET | the sequence is known to start with an initiator methionine |
| NON_TER | the residue at an extremity of the sequence is not the terminal residue; if applied to position 1, this signifies that the first position is not the N-terminus of the complete molecule; if applied to the last position, it signifies that this position is not the C-terminus of the complete molecule; there is no description field for this key |
| NON_CONS | non-consecutive residues; indicates that two residues in a sequence are not consecutive and that there are a number of unsequenced residues between them |
| UNSURE | uncertainties in the sequence; used to describe region(s) of a sequence for which the authors are unsure about the sequence assignment |
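
The endpoint conventions stated for DISULFID and NON_TER can be illustrated with a short Python sketch. It is not part of the Administrative Instructions. The `Feature` record and the functions `disulfide_kind` and `non_ter_meaning` are hypothetical names introduced only for illustration, and rejecting a NON_TER position that is neither the first nor the last residue is an assumption, since the table describes only those two cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """Hypothetical record for one feature entry (not a format defined by the table)."""
    key: str               # feature key, e.g. "DISULFID"
    start: int             # 'FROM' endpoint, 1-based residue position
    end: int               # 'TO' endpoint, 1-based residue position
    description: str = ""  # free-text description field, if any


def disulfide_kind(feature: Feature) -> str:
    """Classify a DISULFID feature from its 'FROM' and 'TO' endpoints."""
    if feature.key != "DISULFID":
        raise ValueError("expected a DISULFID feature")
    # Identical endpoints denote an interchain bond; distinct endpoints
    # name the two residues linked within the same chain.
    return "interchain" if feature.start == feature.end else "intra-chain"


def non_ter_meaning(feature: Feature, sequence_length: int) -> str:
    """Interpret a NON_TER feature relative to the length of the listed sequence."""
    if feature.key != "NON_TER":
        raise ValueError("expected a NON_TER feature")
    if feature.start == 1:
        return "not the N-terminus of the complete molecule"
    if feature.start == sequence_length:
        return "not the C-terminus of the complete molecule"
    # Assumption: any other position is treated as an error in this sketch.
    raise ValueError("NON_TER applies to the first or last position only")


if __name__ == "__main__":
    print(disulfide_kind(Feature("DISULFID", 23, 88)))
    print(disulfide_kind(Feature("DISULFID", 45, 45, "interchain cross-link")))
    print(non_ter_meaning(Feature("NON_TER", 120, 120), sequence_length=120))
```

For an interchain bond the nature of the cross-link would still have to be read from the description field; the sketch only distinguishes the two cases by their endpoints.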


