Theory/Principles

Solubility as a Function of Protein Structure and Solvent Components
Schein, C. H. (1990) Nature Biotechnology 8: 308. (Adapted)

1. Introduction

This review deals with ways of stabilizing proteins against aggregation and with methods to determine, predict, and increase solubility. Solvent additives (osmolytes) that stabilize proteins are listed with a description of their effects on proteins and on the solvation properties of water. Special attention is given to areas where solubility limitations pose major problems, as in the preparation of highly concentrated solutions of recombinant proteins for structural determination with NMR and X-ray crystallography, refolding of inclusion body proteins, studies of membrane protein dynamics, and in the formulation of proteins for pharmaceutical use. Structural factors relating to solubility and possibilities for protein engineering are analyzed.

It is generally known that proteins must be stored in an appropriate temperature and pH range to retain activity and prevent aggregation. Proteins are often most soluble in solution conditions mimicking their natural environment. Serum proteins are soluble in a pH and salt range where mature insulin, which is stored in acidic granules in the cell, precipitates. Bacterial proteins may prefer buffers containing glutamate or betaine, compounds that accumulate in response to high concentrations of Cl⁻ in the medium.2 Caseins and other Ca²⁺-associated proteins may require small amounts of the ion to maintain their native structure during purification.4,5 The stability of lactase (β-galactosidase) is greatly increased in the presence of milk proteins.6 But for most proteins, experimental determination of the solution properties can help in solvent design.
Low solubility in aqueous solvents is often regarded as an indication that a protein is "hydrophobic," because aggregation of integral membrane proteins after transfer to a hydrophilic environment is a well-described phenomenon.7 But all proteins are to some extent hydrophobic, with tightly packed cores that exclude water.8,9 As native, properly folded structures aggregate less than unfolded, denatured ones, there is an intimate relationship between solubility and stability. The free energy of stabilization of proteins in aqueous solution is very low (ca. 12 kcal/mole at 30°C);10 consequently, proteins are on the verge of denaturation.10,11 Protein stability can be increased by solvent additives or by alteration of the protein structure itself.

2. The Properties of Proteins in Solution

2.1 Defining Solubility. The chemist's definition of solubility, parts purified substance per 100 parts of pure water, is not useful in a biological frame, as proteins in nature are never found in pure water. Blood and eukaryotic cytoplasm contain on the order of 0.15 M salt, with large quantities of trace metals, lipids, and other proteins. The cytoplasm of bacteria is more variable, with salt concentration ranging from 0.3 to 0.6 M.2 The solubilizing effects of small molecules and even other proteins mean that protein solubility does not correlate with purity.12 Operationally, solubility is the maximum amount of protein in the presence of specified cosolutes that is not sedimented by centrifugation at 30,000 × g for 30 minutes.13 An even stricter criterion, function retained after centrifuging for 1 hour at 105,000 × g, has been suggested for membrane proteins.14 If one has a pure, lyophilized protein or a salt precipitate, one can determine solubility by adding increasing amounts of weighed solid, centrifuging, and measuring the protein content of the supernatant. Dissolved protein should reach a maximum (the solubility) and level off.
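The titration just described (add weighed solid, centrifuge, measure the supernatant) amounts to finding the level at which dissolved protein stops rising. A minimal sketch of that read-out follows. The function name `estimate_solubility`, the 5% plateau tolerance, and the requirement of at least two plateau points are illustrative assumptions, not part of the original method, and the sample numbers are invented for the example.

```python
from statistics import mean


def estimate_solubility(added, dissolved, tolerance=0.05):
    """Return the plateau of dissolved protein (same units as `dissolved`),
    or None if the series never levels off.

    added      -- solid added per sample (e.g. mg/ml), in any order
    dissolved  -- protein measured in each supernatant after centrifugation
    tolerance  -- assumed cutoff: readings within this fraction of the
                  maximum are treated as part of the plateau
    """
    if len(added) != len(dissolved) or not added:
        raise ValueError("need one supernatant reading per amount added")
    # Order the supernatant readings by the amount of solid added.
    ordered = [d for _, d in sorted(zip(added, dissolved))]
    ceiling = max(ordered)
    # Walk back from the largest addition while readings stay near the maximum.
    plateau = []
    for value in reversed(ordered):
        if value < ceiling * (1.0 - tolerance):
            break
        plateau.append(value)
    # A single high point is not evidence that saturation was reached.
    if len(plateau) < 2:
        return None
    return mean(plateau)


if __name__ == "__main__":
    # Hypothetical titration: supernatant protein levels off near 4 mg/ml.
    added = [1, 2, 4, 8, 16]
    dissolved = [1.0, 2.0, 3.9, 4.0, 4.1]
    print(estimate_solubility(added, dissolved))
```

Returning `None` when the curve is still rising is a deliberate choice in this sketch: a series that never levels off gives only a lower bound on solubility, not an estimate of it.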
However, in the food industry, solubility is defined by sediment (in ml) remaining after centrifuging; the solubility index is thus inverse to the actual solubility.13 The method described in Figure 1 allows definition of the solubility range of a protein in solution. A protein solution is diluted into a buffer series and the samples are centrifuged in microconcentrators. As one can conveniently concentrate about 50-fold, a relatively small amount of protein is sufficient for the estimation.

[Figure 1: plot; horizontal axis: KCl concentration (M)]
Figure 1 - Solubility of T7 RNA polymerase as a function of salt concentration in 10 mM cacodylate buffer, 1 mM DTT, and 0.02 mM PMSF. The polymerase solution (ca. 1 mg/ml in 0.1 M KCl, 20 mM Tris-HCl, pH 7.9, 5% glycerol) was diluted 1:10 with the indicated buffers, and each sample was individually concentrated in 30-kDa MW cutoff "Centricons" (Amicon). The protein concentration in the supernatant after concentration (measured by the Coomassie blue assay) is indicated (Protein A, squares). The top curve (Protein B, circles) is from a second measurement using a finer salt gradient and more protein per sample.

2.2 Measuring Stability. Methods for determining the thermodynamic stability of proteins use pH and temperature extremes, or high concentrations of denaturants.10 Although useful for discerning changes in the structural stability of mutant proteins that are not clear from activity data, these measurements are not directly correlated with the half-life of proteins in solution. Since aggregation occurs at temperatures well below the Tm for proteins, additives that stabilize proteins against aggregation may not necessarily affect the Tm. The major problem with using thermodynamic measurements is their failure to account for the kinetic effects that lead to aggregation.
Both the enthalpy (ΔH) and entropy (ΔS) of hydration vary greatly with temperature, but they cancel to give a relatively small measured free energy (ΔG) of hydration that seems to vary little with temperature. Most of the temperature-dependent kinetic contribution, which is the more important in explaining hydrophobic effects, dissipates in alterations of the solvent structure around the protein and reversible deformation of the protein structure itself.10,16 Accurate discrimination of hydration shells can be done only from crystal structures. Clearly, other methods of determining protein stability are needed. Proteins with shorter half-lives generally have larger subunit molecular weights, lower isoelectric points (pI), higher affinity for hydrophobic surfaces, and greater susceptibility to proteases. The last two characteristics can be used as the basis for determining enzyme stability in less extreme environments as well as the effect of additives on stability. As less stable proteins have a higher tendency to adsorb to surfaces,17 resistance to mechanical shaking may be a useful indicator of solution half-life.18 Trypsin digestion has been used to define the salt stabilization of hyalin.5

2.3 Determining Surface Charge. Isoelectric focusing gives the pI, the pH at which the protein shows no net charge in isoionic conditions. However, due to the binding of salt, one cannot assume that a protein in solution will be negatively charged at pH values above its pI (e.g., acidic caseins bind Ca²⁺ and appear positively charged at pH 7.4). At pH 7.5 and 50 mM salt, most proteins will bind to DEAE-coupled resins if they are negatively charged and to phospho- and other negatively charged resins if they are positively charged. The charge strength can be estimated from the salt concentration required for elution.
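As an illustration of reading charge strength from an elution profile, the sketch below converts the volume at which a protein elutes into the salt concentration at that point and ranks proteins by it. It assumes a linear salt gradient and ignores column dead volume; the function names, gradient volumes, and the 1.05 M end point are hypothetical choices for the example. Only the 50 mM starting salt comes from the text.

```python
def salt_at_elution(elution_ml, gradient_start_ml, gradient_end_ml,
                    salt_start_m, salt_end_m):
    """Salt concentration (M) at a given elution volume.

    Assumes a linear gradient between the two volumes and no dead volume.
    Volumes outside the gradient are clamped to its end points.
    """
    if gradient_end_ml <= gradient_start_ml:
        raise ValueError("gradient must end after it starts")
    fraction = (elution_ml - gradient_start_ml) / (gradient_end_ml - gradient_start_ml)
    fraction = min(1.0, max(0.0, fraction))
    return salt_start_m + fraction * (salt_end_m - salt_start_m)


def rank_by_charge_strength(elution_salt_by_protein):
    """Order protein names from most to least tightly bound.

    A higher salt concentration at elution is taken as a stronger charge
    interaction with the resin.
    """
    return sorted(elution_salt_by_protein,
                  key=elution_salt_by_protein.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical run: gradient from 0.05 M to 1.05 M salt between 10 and 50 ml.
    peaks_ml = {"protein A": 18.0, "protein B": 34.0, "protein C": 24.0}
    salts = {name: salt_at_elution(v, 10.0, 50.0, 0.05, 1.05)
             for name, v in peaks_ml.items()}
    for name in rank_by_charge_strength(salts):
        print(name, round(salts[name], 3))
```

The ranking is comparative only: it orders proteins run on the same resin under the same buffer conditions and says nothing about absolute net charge.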
Gel methods for following the changes in surface charge during protein folding and aggregation have also been developed.19 Generally, charged proteins can be "salted in" by counterions. Binding of salts to proteins decreases bound water as well as the net charge at the surface. The solubility of lysozyme, a positively charged protein, was shown to vary more with the anion added than with the cation; the anion dependence followed the Hofmeister series.20 The solubility of caseins with pI between pH 3 and 5 varies with the cation: sodium, potassium, and ammonium caseinates are all more soluble than those prepared with calcium or aluminum.4,15

2.4 Determining Hydrophobicity. Binding to resins coupled with hydrophobic groups, like Phenylsepharose (Pharmacia), indicates the presence of hydrophobic residues at the protein surface. Proteins are applied in high salt (0.7-1 M ammonium sulfate), which furthers hydrophobic interactions, and then eluted with a decreasing salt gradient. Most proteins elute between 0.5 and 0.1 M salt; very hydrophobic proteins will not elute into low-salt buffer unless the polarity is decreased by adding ethylene glycol. If a protein does not bind to Phenylsepharose, it either has a very hydrophilic surface (e.g., RNase A) or it is aggregated. One can determine the hydrophobicity of a purified protein or follow changes in exposure of hydrophobic groups during folding by measuring interaction with a hydrophobic dye or radioactive tracer (e.g., 1-anilino-8-naphthalenesulfonate21 or 125I-TID, 3-(trifluoromethyl)-3-(m-[125I]iodophenyl)diazirine3).

2.5 Aggregation and Precipitation. Precipitation via any agent can be:

2.5.1 Reversible, as after precipitation with salts or large organic molecules like polyethylene glycol (PEG).
Because PEG molecules are excluded from the surface of the protein, a two-phase system develops and the protein is concentrated into a smaller volume, where its chances of interacting with another protein molecule to form an aggregate are increased ("excluded volume" model).22 When the precipitant is removed, the water layer around the original molecule can reform and the protein molecules separate into soluble monomers. The protein structure does not significantly change during reversible aggregation. A plot of protein in solution versus the concentration of the precipitant should look the same whether it is made with increasing precipitant (to precipitation) or decreasing precipitant (to solubility). Reversibility is assumed for most mathematical models of salting out12 as well as some recent models of low-salt aggregation phenomena.25,26

Table 1 - Protein Cosolutes
(Columns: Compounds; Mode of Action; Amount Used)

A. Osmotic stabilizers
Generally have little direct interaction with proteins; they affect the bulk solution properties of water.

i. Polyols and sugars
Compounds: glycerol, erythritol, arabitol, sorbitol, mannitol, xylitol, mannisidomannitol (Man-Man), glucosylglycerol, glucose, fructose, sucrose, trehalose, isofloridoside
Mode of action: These stabilize the lattice structure of water, thus increasing surface tension and viscosity. They stabilize hydration shells and protect against aggregation by increasing the molecular density of the solution without changing the dielectric constant.
Amount used: 10-40%

ii. Polymers
Compounds: dextrans, levans, polyethylene glycol
Mode of action: Polymers increase the molecular density and solvent viscosity, thus lowering protein aggregation in a single-phase system. At high polymer concentration, a two-phase system develops and the protein aggregates in the phase where its concentration is the highest.
Amount used: 1-15%

iii. Amino acids and derivatives
Compounds: glycine, alanine, proline, taurine, betaine, octopine, glutamate, sarcosine,