Immunoglobulin superfamily proteins in Caenorhabditis elegans

J Mol Biol. 2000 Mar 10;296(5):1367-83. doi: 10.1006/jmbi.1999.3497.

Abstract

The predicted proteins of the genome of Caenorhabditis elegans were analysed by various sequence comparison methods to identify the repertoire of proteins that are members of the immunoglobulin superfamily (IgSF). The IgSF is one of the largest families of protein domain in this genome and likely to be one of the major families in other multicellular eukaryotes too. This is because members of the superfamily are involved in a variety of functions including cell-cell recognition, cell-surface receptors, muscle structure and, in higher organisms, the immune system. Sixty-four proteins with 488 I set IgSF domains were identified largely by using Hidden Markov models. The domain architectures of the protein products of these 64 genes are described. Twenty-one of these had been characterised previously. We show that another 25 are related to proteins of known function. The C. elegans IgSF proteins can be classified into five broad categories: muscle proteins, protein kinases and phosphatases, three categories of proteins involved in the development of the nervous system, leucine-rich repeat containing proteins and proteins without homologues of known function, of which there are 18. The 19 proteins involved in nervous system development that are not kinases or phosphatases are homologues of neuroglian, axonin, NCAM, wrapper, klingon, ICCR and nephrin or belong to the recently identified zig gene family. Out of the set of 64 genes, 22 are on the X chromosome. This study should be seen as an initial description of the IgSF repertoire in C. elegans, because the current gene definitions may contain a number of errors, especially in the case of long sequences, and there may be IgSF genes that have not yet been detected. However, the proteins described here do provide an overview of the bulk of the repertoire of immunoglobulin superfamily members in C. elegans, a framework for refinement and extension of the repertoire as gene and protein definitions improve, and the basis for investigations of their function and for comparisons with the repertoires of other organisms.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Caenorhabditis elegans / chemistry*
  • Caenorhabditis elegans / enzymology
  • Caenorhabditis elegans / genetics
  • Cell Adhesion Molecules, Neuronal / chemistry
  • Cell Adhesion Molecules, Neuronal / genetics
  • Computational Biology*
  • Genes, Helminth / genetics
  • Helminth Proteins / chemistry*
  • Helminth Proteins / genetics
  • Humans
  • Immunoglobulins / chemistry*
  • Immunoglobulins / genetics
  • Leucine / genetics
  • Leucine / metabolism
  • Markov Chains
  • Multigene Family* / genetics
  • Muscle Proteins / chemistry
  • Muscle Proteins / genetics
  • Nerve Tissue Proteins / chemistry
  • Nerve Tissue Proteins / genetics
  • Physical Chromosome Mapping
  • Protein Structure, Tertiary
  • Protein Tyrosine Phosphatases / chemistry
  • Protein Tyrosine Phosphatases / genetics
  • Protein-Tyrosine Kinases / chemistry
  • Protein-Tyrosine Kinases / genetics
  • Sequence Alignment
  • Sequence Homology*
  • X Chromosome / genetics

Substances

  • Cell Adhesion Molecules, Neuronal
  • Helminth Proteins
  • Immunoglobulins
  • Muscle Proteins
  • Nerve Tissue Proteins
  • Protein-Tyrosine Kinases
  • Protein Tyrosine Phosphatases
  • Leucine