Guidelines for Nomenclature of Cloned Genes or DNA Fragments in Rice

Ray WU1, Atsushi HIRAI2, John MUNDY3, Rebecca NELSON4 and Ray RODRIGUEZ5

1) Biochemistry, Molecular and Cell Biology, Cornell University, Ithaca, New York 14853, U.S.A.

2) Graduate Division of Biochemical Regulation, School of Agriculture, Nagoya University, Nagoya, 464-01, Japan

3) Carlsberg Research Lab, Gamle Carlsberg Vej 10, DK-2500 Valby, Copenhagen, Denmark

4) Plant Breeding/Plant Pathology, IRRI, P. O. Box 933, Manila 1099, Philippines

5) Department of Genetics, UC Davis, CA 95616, U.S.A.


The Committee used previously published information on naming other plant genes (Hart and Gale, 1988; Price, 1989) as a starting point in drafting the following guidelines for naming cloned genes or DNA fragments in rice.

I. (a) A 3-letter acronym is recommended for naming cloned rice genes, because it would provide 17,576 symbols (Price, 1989). For example, an amylase gene may be named Amy because it is already used by geneticists for genes encoding isozymes (Kinoshita, 1986). Moreover, it will be distinct from the naming of bacterial genes, which usually use 3 lower-case letters, such as amy. One advantage is that if one transforms rice with a bacterial gene, one can distinguish it from the plant gene. RFLP markers are to be named using a different system which will be reported by another committee.

(b) If there are two related or homologous amylase genes, they can be named Amy 1 and Amy 2 (Hart and Gale, 1988). Amy 1 represents the first cloned amylase gene of rice. However, there is no implication that the rice Amy 1 is homologous to Amy 1 of other plants. On the other hand, if sufficient information is available to show that the first rice amylase gene to be published is closely related to a wheat gene, such as Amy 2, then the rice gene should be named Amy 2 instead of Amy 1.

(c) For rice genes that were already in the literature, we suggest that future authors follow the old nomenclature to avoid confusion. However, if the new gene does not belong to a known family, then adopt the above recommended system by giving the gene a systematic name using the next available number. For example, if a new α-amylase gene is different from Amy 1, 2 or 3, then the new gene should be named Amy 4.

(d) Different alleles at a single locus can be designated as Amy 1a and Amy 1b, respectively.

(e) To distinguish between genes encoding α-amylase and β-amylase, α-Amy 1 and β-Amy 1, respectively, can be used.

(f) A capital C preceded by a dash can be used for a cDNA clone. For example, Amy 1a-C refers to a cDNA clone closely related to the genomic clone Amy 1a.

(g) A pseudogene is named by placing the symbol ψ in front of the gene name. For example, ψ Amy 1 is a pseudogene closely related to Amy 1.

(h) A gene that regulates α-Amy 1 can be named α-Amy 1-R. This designation should be used only if one is certain that the gene is really a regulator gene.
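The rules in section I can be sketched as a small helper that assembles a symbol from its parts. This is only an illustration: the function name gene_symbol and its parameters are hypothetical, the capitalization check (one capital followed by two lower-case letters) is inferred from the example Amy, and combining the -C and -R suffixes is rejected because the guidelines do not address it.

```python
import string

# 26 letters in each of 3 positions gives the 17,576 symbols cited in I(a).
N_SYMBOLS = len(string.ascii_uppercase) ** 3


def gene_symbol(acronym, number, allele="", greek="",
                cdna=False, pseudogene=False, regulator=False):
    """Assemble a symbol such as 'Amy 1a-C' from its parts (hypothetical helper)."""
    # Assumption: one capital followed by two lower-case letters, as in "Amy".
    well_formed = (
        len(acronym) == 3
        and acronym.isascii()
        and acronym.isalpha()
        and acronym[0].isupper()
        and acronym[1:].islower()
    )
    if not well_formed:
        raise ValueError(f"not a 3-letter acronym like 'Amy': {acronym!r}")
    if cdna and regulator:
        # The guidelines do not say how -C and -R would combine.
        raise ValueError("combining -C and -R is not covered by the guidelines")

    symbol = f"{acronym} {number}{allele}"   # I(b), I(d): number and allele letter
    if greek:
        symbol = f"{greek}-{symbol}"         # I(e): e.g. alpha vs. beta amylase
    if cdna:
        symbol += "-C"                       # I(f): cDNA clone
    if regulator:
        symbol += "-R"                       # I(h): regulator gene
    if pseudogene:
        symbol = f"ψ {symbol}"               # I(g): pseudogene
    return symbol


print(N_SYMBOLS)                                        # 17576
print(gene_symbol("Amy", 1, allele="a", cdna=True))     # Amy 1a-C
print(gene_symbol("Amy", 1, pseudogene=True))           # ψ Amy 1
print(gene_symbol("Amy", 1, greek="α", regulator=True)) # α-Amy 1-R
```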

II. To name cloned DNAs of rice using the above-described system, one of the following two criteria should be met.

(a) Extensive nucleotide sequence (and deduced amino acid information if it is a gene) should be provided, and highly significant sequence identity to known genes or sequences should be demonstrated.

(b) A significant level of DNA/DNA hybridization between a known gene or DNA sequence and the DNA in question should be demonstrated. This information should be accompanied by consistent restriction enzyme mapping data.

In the absence of this information, investigators are advised to use clone designations (e.g. pXY 123, λ AB 45) or DNA database accession numbers from EMBL or GenBank. The following abbreviations for the type of vector are recommended: p for plasmid, λ for lambda, c for cosmids, m for M 13, and y for yeast. Later on, when more information becomes available, the name can be changed to the appropriate 3-letter acronym described in section I.
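As a minimal sketch, the vector abbreviations can be held in a lookup table keyed on the first character of a clone designation. The names VECTOR_PREFIXES and vector_type are hypothetical, and reading the vector type from the first character alone is an assumption based on the examples pXY 123 and λ AB 45.

```python
# Vector-type abbreviations recommended in section II.
VECTOR_PREFIXES = {
    "p": "plasmid",
    "λ": "lambda",
    "c": "cosmid",
    "m": "M 13",
    "y": "yeast",
}


def vector_type(clone):
    """Return the vector type implied by a clone designation (hypothetical helper)."""
    # Assumption: the first character of the designation is the vector prefix.
    prefix = clone.strip()[:1]
    if prefix not in VECTOR_PREFIXES:
        raise ValueError(f"unrecognized vector prefix in {clone!r}")
    return VECTOR_PREFIXES[prefix]


print(vector_type("pXY 123"))  # plasmid
print(vector_type("λ AB 45"))  # lambda
```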

III. The Committee suggests the names for the following cloned DNA fragments.

Rep for repetitive DNA,
Cen for centromere DNA,
Tel for telomeric DNA,
Ori for origin of replication.

For example, Rep 1-G represents a repetitive DNA, isolate 1, from genomic DNA. If the repetitive DNA comes from the B genome type, it can be abbreviated as Rep(B)1-G.

IV. The abbreviations for several bacterial genes that have commonly been used to transform rice include the following:

gus (or more correctly uidA) for β-glucuronidase gene,
hpt for hygromycin phosphotransferase gene,
nptII for neomycin phosphotransferase II gene,
bar (or more correctly pat) for phosphinothricin acetyltransferase gene.

V. For naming a chimeric gene construct, the Committee suggests the use of a capital P for promoter, a capital I for intron, a capital S for signal peptide coding sequence, a capital C for coding sequence and a capital T for 3' terminator regions. Each symbol is enclosed in parentheses and placed after the name of the component. A slash is placed between components to separate them clearly. For example, CaMV 35S(P)/Act 1(I)/α-Amy 1a(S)/gus(C)/nos(T) is a chimeric plasmid that contains the promoter CaMV 35S, the intron of Act 1, the signal peptide coding sequence of α-Amy 1a, the coding region of the bacterial gus and the 3' terminator region of the bacterial nos.
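The construct notation in section V lends itself to a short sketch that composes a name from (component, type) pairs and parses one back. The function names are hypothetical, names are written without a space before the parenthesis (a formatting assumption), and the parser tolerates an optional space there.

```python
import re

# Component-type symbols defined in section V.
COMPONENT_TYPES = {
    "P": "promoter",
    "I": "intron",
    "S": "signal peptide coding sequence",
    "C": "coding sequence",
    "T": "3' terminator region",
}

# A component is a name followed by one type symbol in parentheses.
_COMPONENT = re.compile(r"^(?P<name>.+?)\s*\((?P<kind>[PISCT])\)$")


def construct_name(components):
    """Join (name, type) pairs into a construct name (hypothetical helper)."""
    parts = []
    for name, kind in components:
        if kind not in COMPONENT_TYPES:
            raise ValueError(f"unknown component type: {kind!r}")
        parts.append(f"{name}({kind})")
    return "/".join(parts)


def parse_construct(text):
    """Split a construct name back into (name, type) pairs."""
    pairs = []
    for part in text.split("/"):
        match = _COMPONENT.match(part.strip())
        if match is None:
            raise ValueError(f"cannot parse component: {part!r}")
        pairs.append((match.group("name"), match.group("kind")))
    return pairs


example = [
    ("CaMV 35S", "P"),
    ("Act 1", "I"),
    ("α-Amy 1a", "S"),
    ("gus", "C"),
    ("nos", "T"),
]
name = construct_name(example)
print(name)
for component, kind in parse_construct(name):
    print(f"{component}: {COMPONENT_TYPES[kind]}")
```

Splitting on the slash assumes that no component name itself contains a slash, which holds for the example given in the text.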

VI. The Committee suggests that in naming rice genes, the nomenclature should not include an abbreviation for rice. For example, R has been used to indicate rice genes, but R can also stand for rye or rose genes. The use of Os for Oryza is also not satisfactory.

VII. The symbol for the protein product (or phenotype) of a cloned gene should be identical to the basic symbol for the gene, except that each letter in the symbol should be a capital Roman letter. For example, the product of the Adh 1 gene is designated ADH 1 (Hart and Gale, 1988).
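The conversion in section VII amounts to capitalizing the Roman letters of the gene symbol. The sketch below uses a hypothetical function protein_symbol; it deliberately changes only lower-case Roman (ASCII) letters and leaves every other character as it is, which is an assumption about symbols carrying prefixes such as α or ψ.

```python
import string


def protein_symbol(gene):
    """Capitalize the Roman letters of a basic gene symbol (hypothetical helper)."""
    # Only a-z are changed; digits, spaces and non-Roman characters pass through.
    return "".join(
        ch.upper() if ch in string.ascii_lowercase else ch
        for ch in gene
    )


print(protein_symbol("Adh 1"))  # ADH 1
```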

VIII. The Committee suggests that the recommendations be considered as tentative. We welcome suggestions for improvement or additions from other scientists. A revised guideline will be published every year or two.


In general, the naming system for rice chloroplast and mitochondrial genes should be the same as that for other higher plants. Detailed rules and recommended nomenclature for known genes can be found in the following articles: Plant Mol. Biol. Reporter 1, 38-43 (1983); 6, 14-21 (1988); 6, 266-273 (1988); 7, 266-275 (1989).

I. The gene name should include two parts. The first part is a three-letter code in lower-case letters that designates the group to which the gene product belongs. The second part is one or more capitalized letters or numbers that designate the specific gene. The gene name is italicized.

II. Groups of genes that are related in coding function should have the same three-letter code. For groups of genes encoding polypeptides, the capitalized letters or numbers used to designate specific genes within a group do not necessarily carry any connotation about the molecular weight of the polypeptide or its relative gel electrophoretic mobility.
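The two-part organelle gene name can be checked with a simple pattern, sketched below. The function name split_organelle_gene is hypothetical, and rbcL and psbA are used only as familiar chloroplast gene names for illustration; they are not taken from this text. Italicization is a typographic matter and is not represented.

```python
import re

# Three lower-case letters (group code) followed by one or more
# capital letters or digits (specific gene).
_ORGANELLE_GENE = re.compile(r"^([a-z]{3})([A-Z0-9]+)$")


def split_organelle_gene(name):
    """Split an organelle gene name into (group code, specific gene)."""
    match = _ORGANELLE_GENE.match(name)
    if match is None:
        raise ValueError(f"not a two-part organelle gene name: {name!r}")
    return match.group(1), match.group(2)


print(split_organelle_gene("rbcL"))  # ('rbc', 'L')
print(split_organelle_gene("psbA"))  # ('psb', 'A')
```

Genes that share a group code, such as psbA and a hypothetical psbB, would fall in the same functional group under rule II.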


G. H. Hart and M. D. Gale, 1988. Guidelines for nomenclature of biochemical/molecular loci in wheat and related species. Draft copy of July 4, 1988.

T. Kinoshita, 1986. Report on the Committee on Gene Symbolization. RGN 3: 4-7.

C. Price, 1989. Nomenclature for cloned plant genes. Plant Mol. Biol. Reporter 7: 99-103.

