Cloning and sequencing of the cDNA encoding the major albumin of Theobroma cacao

Identification of the protein as a member of the Kunitz protease inhibitor family

The major albumin, a polypeptide of 21 kilodaltons (kDa), from the seeds of cocoa (Theobroma cacao L.), has been identified and partially purified by preparative gel electrophoresis. A partial N-terminal sequence was obtained, permitting the construction of an oligonucleotide probe. This probe was used to isolate the corresponding copy DNA (cDNA) clone from a library made from poly(A)+ RNA from immature cocoa beans. The cDNA sequence has a single major open reading frame that translates to give a 221-amino-acid polypeptide of Mr 24003. The existence of a precursor to the 21-kDa polypeptide of this size was confirmed by immunoprecipitation from total poly(A)+ RNA translation products. The polypeptide has a hydrophobic signal sequence of 26 amino acids preceding the start of the mature protein, and the mature polypeptide would have an Mr of 21223. The protein sequence is homologous with sequences of the Kunitz protease and α-amylase inhibitor family, and the protein probably functions to defend the seed's protein reserves from the digestive enzymes of invading pests. However, because the protein comprises 25–30% of the total seed protein, it may itself also function as a storage protein. Electron micrographs of immunogold-labelled embryo sections show that the protein is located in membrane-enclosed organelles.
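The Mr figures quoted above are the kind of value obtained by summing average residue masses over a translated sequence. The following Python sketch illustrates that arithmetic only; it is not the authors' procedure. The residue-mass table holds standard average masses, the function names (`relative_molecular_mass`, `split_precursor`) are hypothetical, and no part of the cocoa sequence is reproduced here, so only the lengths and Mr values stated in the abstract are used.

```python
# Standard average residue masses (Da) for the 20 common amino acids,
# i.e. the mass of each amino acid minus one water lost in the peptide bond.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added back for the free N- and C-termini


def relative_molecular_mass(sequence: str) -> float:
    """Return the average Mr of an unmodified polypeptide (hypothetical helper)."""
    seq = sequence.upper()
    unknown = set(seq) - set(AVERAGE_RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    if not seq:
        return 0.0
    return sum(AVERAGE_RESIDUE_MASS[residue] for residue in seq) + WATER


def split_precursor(sequence: str, signal_length: int) -> tuple[str, str]:
    """Split a precursor into (signal peptide, mature polypeptide)."""
    return sequence[:signal_length], sequence[signal_length:]


# Figures quoted in the abstract (not recomputed from sequence data).
PRECURSOR_LENGTH = 221   # amino acids encoded by the open reading frame
SIGNAL_LENGTH = 26       # amino acids in the hydrophobic signal sequence
PRECURSOR_MR = 24003
MATURE_MR = 21223

mature_length = PRECURSOR_LENGTH - SIGNAL_LENGTH   # 195 residues
mr_removed = PRECURSOR_MR - MATURE_MR              # 2780, lost with the signal

if __name__ == "__main__":
    print("mature length:", mature_length)
    print("Mr removed with signal sequence:", mr_removed)
    print("mean residue mass of signal:", round(mr_removed / SIGNAL_LENGTH, 1))
```

Given the full translated sequence, `relative_molecular_mass(split_precursor(seq, 26)[1])` would give an estimate comparable to the mature Mr stated above, assuming no post-translational modification.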

Abbreviations

  • cDNA: copy DNA
  • IgG: immunoglobulin G
  • kb: kilobase pairs
  • Mr: relative molecular mass
  • SDS-PAGE: sodium dodecyl sulphate-polyacrylamide gel electrophoresis


The authors are very grateful to Dr R. Jennings of the Virology Department, Sheffield University Medical School, for help in raising antibodies, and to Dr G. Cope, of the Biological Sciences Electron Microscopy Unit, Sheffield University, for taking the electron micrographs.


Spencer, M.E., Hodge, R. Cloning and sequencing of the cDNA encoding the major albumin of Theobroma cacao. Planta 183, 528–535 (1991).

Key words

  • Albumin cDNA
  • Kunitz protease inhibitor
  • Theobroma