The Profiling of Escherichia coli chromosome (PEC) database has been constructed to compile any relevant information that could help to characterize the E. coli genome, especially with respect to discovering the function of each gene. The database is intended to provide an interface comprehensible to most experimental researchers. The E. coli genetic resource committee of Japan supports the construction and maintenance of this database.

The following information is available from PEC.
(1) Basic information about each gene (gene name, direction, length, location, etc.)
(2) The essentiality of each gene for cell growth, classified as essential, non-essential, or unknown based on information from research reports and deletion mutation studies.
The criteria for classification are described below (*1).
The references on which the classification is based are listed in the format "Medline (PMID)" in the section for each gene.
(3) The names of the strains related to each gene, which are stored in the stock center of the National Institute of Genetics, Japan.
(4) Results of similarity searches for each gene product (BLAST, PSI-BLAST, FASTA)
(5) Structural features (domain, motif, etc.) for each gene product
(6) Results of comparative analyses of each gene (homologs in other sequenced bacteria, etc.).
(7) Information on relatively long deletion mutations.
(8) Locations of Kohara clones.

PEC provides the following overviews.
Circular: The E. coli genome is displayed as a circle on which the locations of the tRNA genes are marked. Essential, non-essential, and unknown genes are shown in different colors, making their distribution on the genome easy to see.
Linear: The E. coli genome is displayed linearly. Each gene is shown along the genome together with its name, direction, size, location, and class (essential, non-essential, unknown), as are the regions deleted in the deletion mutant(s) for each gene. This view also shows classical markers whose exact locations remain unknown, contig information, and the locations of the Kohara clones. Because the Kohara clones are derived from strain W3110, their positions were assigned by homology search against the MG1655 genome.
Motif: Structural domains and motifs of a gene product are displayed graphically, along with those of other genes having the same domains or motifs.
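
The Linear view relates each gene to the regions removed in deletion mutants. As a rough illustration only, the sketch below finds the genes that lie entirely inside one deleted region. The `Gene` record, the function name, the gene names, and all coordinates are hypothetical and are not taken from PEC. It also assumes that "within" means fully contained and that the deletion does not span the origin of the circular genome.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    """Hypothetical gene record; coordinates are 1-based and inclusive."""
    name: str
    start: int
    end: int


def genes_within_deletion(genes: List[Gene], del_start: int, del_end: int) -> List[str]:
    """Return names of genes that lie entirely inside [del_start, del_end]."""
    return [g.name for g in genes if del_start <= g.start and g.end <= del_end]


# Made-up example: only geneB is fully covered by the deletion 550..1000.
example_genes = [
    Gene("geneA", 100, 500),
    Gene("geneB", 600, 900),
    Gene("geneC", 850, 1500),
]
print(genes_within_deletion(example_genes, 550, 1000))  # ['geneB']
```

A gene that only partly overlaps the deletion (geneC above) is not reported. Whether such genes should count is a design decision that this sketch does not settle.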

PEC is based on the sequence information of E. coli strain MG1655. Basic information about each gene (gene name, direction, length, location, etc.) and other data were retrieved from the databases listed below (*2) and annotated before incorporation into the PEC database.

We appreciate any comments on how to improve the database.

yyamazak[at] (database construction)
jkato[at] (research activity)

(*1) Gene classification based on essentiality for cell growth

All E. coli genes were classified into three groups: (1) genes essential for cell growth (essential), (2) genes dispensable for cell growth (non-essential), and (3) genes whose essentiality is unknown (unknown). The classification relies mainly on information from journal articles and on the following criteria.

A. Experimental evidence known
As a rule, a gene is classified as "non-essential" when a strain carrying only a null-type mutation in that gene (with no suppressor mutations) is able to grow, even if it grows only at certain temperatures or under certain nutrient conditions.
(1) Genes for which null-type mutants (carrying deletion or Tn (transposon) insertion mutations) have been isolated are classified into the "non-essential" category.
(2) Genes located within the deleted regions of identified deletion mutations are classified into the "non-essential" category.
(3) Genes that do not fall under (1) or (2) and for which conditional lethal mutants have been isolated are classified into the "essential" category.
B. No experimental evidence known
The genes described below are classified as follows in the absence of experimental evidence.
(4) The structural genes of ribosomal proteins are classified into the "essential" category except for those that have been reported to be dispensable.
(5) The argX, cysT, glyT, hisR, leuU, leuW, leuZ, proL, proM, serT, serV, thrU, and trpT genes, which code for unique tRNAs, are classified into the "essential" category.
(6) The hisS and argS genes, which code for unique aminoacyl-tRNA synthetases, are classified into the "essential" category.
(7) The genes involved in flagellation, motility, and chemotaxis (flg, flh, fli, mot, che, tap, tar) are classified into the "non-essential" category.
(8) The hem genes, whose mutants can grow when porphyrin is added to the culture media, are classified into the "non-essential" category.
Genes that do not meet any of the criteria listed in A and B are classified into the "unknown" category.
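
The criteria above can be read as an ordered set of rules. The following is a minimal sketch of that reading, not PEC's actual implementation. The `GeneEvidence` record, its field names, and the `classify` function are hypothetical. The sketch assumes that experimental evidence takes precedence over the rules that need none, and that gene families can be recognized from the first three letters of the gene name.

```python
from dataclasses import dataclass

# Gene names and families are taken from criteria (5)-(8) above.
UNIQUE_TRNA_GENES = {
    "argX", "cysT", "glyT", "hisR", "leuU", "leuW", "leuZ",
    "proL", "proM", "serT", "serV", "thrU", "trpT",
}
UNIQUE_SYNTHETASE_GENES = {"hisS", "argS"}
MOTILITY_FAMILIES = {"flg", "flh", "fli", "mot", "che"}  # matched by prefix (assumption)
MOTILITY_GENES = {"tap", "tar"}                          # matched by full name (assumption)


@dataclass
class GeneEvidence:
    """Hypothetical summary of what is known about one gene (not a PEC schema)."""
    name: str
    null_mutant_isolated: bool = False         # criterion (1)
    in_deleted_region: bool = False            # criterion (2)
    conditional_lethal_isolated: bool = False  # criterion (3)
    ribosomal_protein: bool = False            # criterion (4)
    reported_dispensable: bool = False         # exception in criterion (4)


def classify(gene: GeneEvidence) -> str:
    family = gene.name[:3]
    # Rules that rely on experimental evidence: (1), (2), then (3).
    if gene.null_mutant_isolated or gene.in_deleted_region:
        return "non-essential"
    if gene.conditional_lethal_isolated:
        return "essential"
    # Rules applied without experimental evidence: (4) to (8).
    # A ribosomal protein reported to be dispensable simply falls through here;
    # how such a gene should be labeled is not decided by this sketch.
    if gene.ribosomal_protein and not gene.reported_dispensable:
        return "essential"
    if gene.name in UNIQUE_TRNA_GENES or gene.name in UNIQUE_SYNTHETASE_GENES:
        return "essential"
    if family in MOTILITY_FAMILIES or gene.name in MOTILITY_GENES:
        return "non-essential"
    if family == "hem":
        return "non-essential"
    return "unknown"


print(classify(GeneEvidence("fliC")))                              # non-essential
print(classify(GeneEvidence("hisS")))                              # essential
print(classify(GeneEvidence("argX", null_mutant_isolated=True)))   # non-essential
```

The last call shows the assumed precedence: an isolated null mutant overrides the default "essential" label that criterion (5) would otherwise assign.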

(*2) References:

Essential Gene & Minimal Genome, Deletions:
"Cell Size and Nucleoid Organization of Engineered Escherichia coli Cells with a Reduced Genome."
Hashimoto M, Ichimura T, Mizoguchi H, Tanaka K, Fujimitsu K, Keyamura K, Ote T, Yamakawa T, Yamazaki Y, Mori H, Katayama T, Kato J.
Molecular Microbiology (2005) 55(1), 137-149.

"Construction of consecutive deletions of the Escherichia coli chromosome"
Jun-ichi Kato and Masayuki Hashimoto
Mol Syst Biol. 2007; 3: 132.
Sequence Data:
U00096.2, 26-FEB-2013
AE000111-AE000510, last updated in 1998
Frederick R. Blattner et al., University of Wisconsin-Madison, USA
Linkage map of E. coli K-12, Edition 10: The traditional map (PubMed)
Mary K. B. Berlyn, Yale University, USA
Microbiol. Mol. Biol. Rev., 62, 814-984, 1998
"E. coli database collection" (ECDC)
Justus-Liebig-University, Germany
"Verified Protein Starts"
Data compiled from the literature and the appropriate citations are available from EcoGene.
"Completing the E.coli Proteome"
Dr. Gavin H. Thomas,
Department of Molecular Biology and Biotechnology,
University of Sheffield
"Operon"
Data sets of the RegulonDB database