Application of Proteomics to the Discovery & Verification of Biomarkers for Prion Diseases
Proteomics Approaches for Biomarker Discovery
The biomarker development pipeline is generally composed of three stages: biomarker discovery, verification and clinical validation (Figure 1). The process begins with biomarker discovery where unbiased analyses are carried out to compare the differences between normal and prion-infected samples to identify proteins differentially expressed in prion-infected samples as potential biomarker candidates. In the past decade, advances in MS have made MS-based proteomics a promising tool for protein profiling and biomarker discovery in various types of samples such as cell cultures, tissues and biological fluids. With the latest technology in chromatographic separations and state-of-the-art MS instrumentation, hundreds of proteins can be identified in a single run. However, the biggest challenges for using proteomics for prion disease biomarker discovery are the complexity of the sample and the presence of high-abundance proteins in body fluids such as the cerebrospinal fluid (CSF) and blood. To overcome these difficulties, several separation techniques can be applied to facilitate the protein identification by MS. Based on different separation strategies, three commonly used techniques for biomarker discovery are 2D gel electrophoresis MS (2D-GE MS) and its variation 2D-DIGE MS, SELDI-TOF MS and shotgun proteomic analysis by liquid chromatography (LC)-MS/MS.
2D-GE MS is the most widely used method in studies of proteomic profiling of prion-infected samples, in which thousands of proteins can be separated on a single gel according to isoelectric point (pI) and molecular weight. Individual protein spots that show differences in abundance between different samples can then be excised from the gel, digested into peptides and analyzed by MALDI-TOF MS or LC-MS/MS for protein identification. Built on 2D-GE, 2D-DIGE adds a quantitative dimension to this technique by enabling multiple protein extracts to be separated on the same 2D gel, thus providing a promising approach to comparative analysis of proteomes in complex samples. In 2D-DIGE, up to three different protein extracts (control, infected samples and an internal standard) can be labeled with fluorescent dyes, for example, Cy3, Cy5 and Cy2, respectively, prior to 2D electrophoresis. Because the samples being compared are run on the same gel together with the internal standard, 2D-DIGE overcomes the intergel variation problem of traditional 2D-GE. Studies using 2D-GE MS or 2D-DIGE MS to profile proteins in CSF have led to the identification of the currently accepted biomarker for sCJD, 14-3-3 protein, and other potential biomarkers for prion diseases (Table 1).
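The role of the internal standard in 2D-DIGE can be illustrated with a minimal sketch. It assumes that each spot is quantified as a fluorescence volume per dye channel and that abundance is expressed as a ratio to the Cy2 internal standard on the same gel; the spot volumes and the function name are hypothetical and serve only to show why gel-to-gel differences cancel out.

```python
from statistics import mean


def standardized_abundance(sample_volume, standard_volume):
    """Spot volume of a sample expressed relative to the internal standard on the same gel."""
    if standard_volume <= 0:
        raise ValueError("internal-standard volume must be positive")
    return sample_volume / standard_volume


# Hypothetical spot volumes for one protein spot on three gels.
# The absolute volumes differ between gels (intergel variation),
# but the ratio to the Cy2 standard does not.
gels = [
    {"cy3_control": 1000.0, "cy5_infected": 2100.0, "cy2_standard": 1000.0},
    {"cy3_control": 1500.0, "cy5_infected": 3000.0, "cy2_standard": 1500.0},
    {"cy3_control": 800.0, "cy5_infected": 1600.0, "cy2_standard": 800.0},
]

control = [standardized_abundance(g["cy3_control"], g["cy2_standard"]) for g in gels]
infected = [standardized_abundance(g["cy5_infected"], g["cy2_standard"]) for g in gels]

# Fold change of the infected samples relative to the controls
fold_change = mean(infected) / mean(control)
print(f"fold change (infected/control): {fold_change:.2f}")
```

In this toy data set the raw control volumes range from 800 to 1500, yet the standardized control abundance is identical on every gel, which is the property that makes comparisons across gels meaningful.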
SELDI-TOF MS is a variation of MALDI-TOF MS that selectively captures proteins of interest on a chemically modified surface prior to co-crystallization with an energy-absorbing matrix, followed by laser ionization for TOF MS analysis. It is another useful method for protein profiling in which proteins spotted on a 'protein chip' interact with the chromatographic surface through adsorption, electrostatic interaction and affinity. The proteins of interest selectively bind to the surface, while the others are removed by washing. This method allows rapid concentration and detection of proteins with good sensitivity below a molecular weight of 20 kDa[32,33] and therefore complements gel-based approaches. SELDI-TOF MS suffers from low resolution and low mass accuracy and lacks tandem MS capability, all of which make direct protein identification difficult. Nevertheless, its ability to compare spectra from a large number of samples in a relatively short time with little sample preparation is a clear advantage for profiling. SELDI-TOF MS analysis recognizes differentially expressed proteins and provides the foundation for the isolation of proteins of interest for further identification by high-resolution MS and tandem MS analysis.
In contrast to 2D-GE and SELDI-TOF MS where intact proteins are separated, liquid-based shotgun proteomic separation techniques have become increasingly popular in proteomic research because they are reproducible, highly automated and have a greater likelihood of detecting low-abundance proteins. In this case, the protein mixture is digested and the resulting peptides are separated by LC prior to tandem MS detection, which allows for the identification of different proteins by peptide sequencing. However, due to the complexity of the digested protein samples from biological fluids such as CSF and plasma, 1D separation is often insufficient, and thus an orthogonal separation of complex samples is needed prior to LC-MS/MS. More commonly used strategies include MudPIT and gel-based LC.[35,36] Another advantage of liquid-based shotgun proteomics is the suitability for isotope labeling for protein quantification. Examples of quantification methodology include the use of chemical reactions to introduce isotopic tags at specific functional groups on peptides or proteins, such as isotope-coded affinity tags, in which cysteine residues are specifically derivatized with a reagent containing zero or eight deuterium atoms as well as a biotin group for affinity purification of cysteine-containing peptides[37,38] and isobaric tags for relative and absolute quantitation, which labels primary amines at the N-termini and lysine side chains of peptides and enables up to eight samples to be labeled and relatively quantified in a single experiment. Isotope labeling has been demonstrated to be a powerful tool for protein quantitative analysis in CSF and blood samples.[41,42]
All methodologies of proteomics mentioned above have played critical roles in biomarker discovery. With advances and innovations in current techniques, these methods and tools will be even more powerful than before in terms of accuracy, sensitivity or analytical throughput. However, each proteomic approach has its advantages and disadvantages and cannot cover the entire proteome. Therefore, a comprehensive study of prion disease biomarkers will rely on the results obtained from various strategies and different instruments. Table 1 provides a list of potential biomarkers for prion diseases identified by different proteomic technologies.
Searching for Prion Disease-related Biomarkers in the CSF
Being in close anatomical contact with brain interstitial fluid, CSF has been the most studied sample for diagnostic biomarker identification for neurological diseases due to its obvious association with the CNS. Biochemical changes in CSF caused by disease can be reflected by altered expression or post-translational modification (PTM) of certain proteins. Various brain-derived proteins have been shown to be differentially expressed in CJD; of these, the CSF proteins commonly tested as potential CJD biomarkers include 14-3-3 protein, total tau (T-tau), S100B and neuron-specific enolase.[43–46] To date, only the detection of 14-3-3 protein is included in the diagnostic criteria approved by the WHO for the premortem diagnosis of clinically suspected cases of sCJD. However, the diagnostic accuracy of this test for large-scale screening purposes is still debated because of its high false-positive rate: elevated 14-3-3 protein levels have also been reported in other conditions associated with acute neuronal damage.[48,49] Although false positives caused by conditions such as inflammatory, ischemic and neoplastic disorders can be ruled out by a routine CSF test and brain imaging, the differential diagnosis between CJD and other neurodegenerative diseases can be challenging, and the use of a combination of multiple biomarkers is often necessary. To improve the sensitivity and specificity of CSF 14-3-3 protein analysis, it is desirable to identify new biomarkers that are specific to sCJD and can be used in conjunction with 14-3-3 protein to support diagnosis. So far, different proteomic approaches have been used for comparative CSF proteome analysis, including 2D-GE, 2D-DIGE, SELDI-TOF MS,[50,52] MALDI-TOF MS and MALDI-FTMS. These methods have resulted in significant progress toward identifying sensitive and specific biomarkers for CJD. Qualtieri et al. summarized the existing proteins proposed as surrogate biomarkers for sCJD diagnosis in their review.
These proteins showed significant differential expression in CSF samples from CJD patients, but further investigation of each protein is required to examine the specificity and sensitivity for early diagnosis of sCJD in a larger cohort. One of the proposed biomarkers, heart-type fatty acid binding protein (H-FABP) was initially investigated in a small group of CJD affected patients (n = 8) by 2D-GE MS and immunoassay approach. The amount of H-FABP appeared to be significantly increased in both CSF and plasma of CJD patients in this study, but application of H-FABP as a diagnostic tool was not certain due to the limited number of CJD patients and patients with other causes of rapid progressive dementia included in the investigation. In a recent study, H-FABP level in CSF samples from CJD patients (n = 124) was investigated by Rapicheck® kit assay, which is a qualitative one-step immunochromatography system, and ELISA assay. Results from both assays demonstrated satisfying sensitivity and specificity comparable with 14-3-3 protein and T-tau protein assay, suggesting that the use of H-FABP as diagnostic biomarker in conjunction with 14-3-3 protein and T-tau may improve diagnostic accuracy.
The levels of 14-3-3 protein and T-tau are also elevated in several dementias other than CJD, which is the main reason for the high false-positive rates observed. It is hoped that the search for proteins that are exclusively or directly involved in the pathological process of the disease will result in the identification of CJD-specific biomarkers. One protein that might play a pathophysiological role in CJD is extracellular signal-regulated kinase (ERK1/2), as MAP kinases are activated during pathogenesis in prion diseases. Steinacker et al. measured ERK1/2 levels in CSF samples from patients with CJD, other dementias or neurological disorders by an electrochemiluminescence assay and Western blotting, and identified significantly elevated mean levels of total ERK1/2 and phosphorylated ERK1/2 in the CJD patients. The increase of ERK1/2 was also observed in a CJD case that was negative for 14-3-3 protein or had low levels of tau protein, suggesting that ERK1/2 can be used as an alternative CSF biomarker for CJD. In another study, the level of CSF transferrin was examined by rigorous statistical analysis in premortem CSF collected between 2006 and 2008 from 99 patients with autopsy-confirmed sCJD (CJD+) and 75 patients with dementia of non-CJD origin (CJD-). The level of transferrin was significantly lower in the CSF of CJD+ cases than in that of CJD- patients. The decrease in transferrin level allows sCJD to be distinguished from dementia of non-CJD origin with an accuracy of 80%, and when combined with T-tau, the diagnostic accuracy increased to 86%. This combination enables more accurate diagnosis than the current method of sCJD diagnosis, in which levels of 14-3-3 protein and T-tau are detected.
Identifying Biomarkers in Blood or Urine for Easy & Fast Premortem Screening Assays
Blood is a good alternative biological material in which to search for biomarkers that can be used for routine premortem diagnostic tests because its collection is non-invasive, low-cost, safe and simple. Although blood is not in direct contact with the brain, the exchange between CSF and blood allows some brain-derived proteins to leak into the blood, and such exchange could be enhanced in cases of neurodegenerative diseases where the blood–brain barrier is damaged. Recently, the evidence that transmission of vCJD in humans can occur through blood transfusion has raised concerns of prion contamination in human blood or blood products.[16,62] Therefore, an effective diagnosis of TSEs during the early phase by a reliable and sensitive test performed using blood samples is desired. Unfortunately, very few proteomic analyses for biomarkers of prion disease from blood samples have been reported, partially due to the scarcity of samples from CJD patients and prion-infected animal models. The biggest challenge that impedes biomarker discovery from blood samples, however, is the large dynamic range of protein concentrations, which has been reported to exceed 10^10, together with the high abundance of 12 proteins that comprise approximately 95% of the serum proteome and mask the detection of lower abundance proteins. As a result, the successful detection of low-abundance proteins that are likely to be disease-indicative biomarkers relies heavily on an effective prefractionation or enrichment strategy to remove highly abundant proteins from the serum/plasma prior to proteomic analysis. The most common strategy to reduce dynamic range is immunodepletion, in which several major high-abundance proteins in human or mouse plasma are removed by commercially available multiaffinity columns. Alternatively, the complexity of blood samples can be reduced by using affinity separations to enrich a subproteome of interest, for example, the phosphoproteome, glycoproteome or low molecular weight subproteome.
Our group utilized lectin affinity chromatography to enrich glycoproteins in healthy and prion-infected mouse plasma followed by shotgun proteomic analysis by MudPIT. Using this strategy, we identified 708 proteins in mouse plasma from three time points of disease progression.
SELDI-TOF MS has been proposed as a powerful approach for blood biomarker discovery as this technology focuses on a particular subset of the proteome for each of the capture conditions, and additional prefractionation or enrichment steps help reduce sample dynamic range prior to SELDI-TOF MS, which further improves sensitivity.[66,67] A recent study by Batxelli-Molina et al. analyzed a large number of serum samples from control sheep and animals with early-phase (EP) or late-phase (LP) scrapie to detect biomarkers characteristic of the EP and LP of scrapie by SELDI-TOF MS analysis. To reduce sample complexity, serum samples were prefractionated by anion exchange chromatography according to their charge characteristics that resulted in six fractions being eluted at different pH values. SELDI-TOF MS analysis of the profiles of proteins included in the 2–20 kDa range in all fractions obtained from serum samples of sheep with (EP or LP) scrapie and healthy controls led to the detection of 15 peaks found to significantly differentiate EP or LP animals from control animals. The authors suggested that a combination of four differentially expressed peaks in either EP or LP scrapie could be used as biomarkers for scrapie at different stages. Moreover, the mass of three of the 15 peaks with differential expression in sheep could be detected in hamsters infected with scrapie and used as EP or LP biomarkers of disease in the hamster model. One peak that was detected in both sheep and hamsters was identified as a fragment of the transthyretin monomer. In previous studies, transthyretin has been shown as a potential CSF biomarker with altered levels in sCJD patients.
Urine is another easily accessible body fluid that contains a complex mixture of proteins and peptides and can serve as a good reservoir for biomarker discovery because the protein content of urine is relatively low and its protein composition appears to be less complex than that of serum. Moreover, several studies have confirmed that PrPSc is excreted into the urine and that PrPSc can be detected in the urine of hamsters infected with scrapie.[69–72] Using proteomic approaches, two groups identified prion protein in urine-derived gonadotropins,[73,74] raising concern about the presence of infectious prion protein in urinary-derived injectable fertility products. Therefore, the development of a sensitive and robust diagnostic test for prion disease in urine is also highly desirable. Simon et al. utilized 2D-DIGE MS to examine the urine of infected cattle over the course of the disease and found that the immunoglobulin γ-2 chain C region and clusterin were significantly increased in abundance. Increased abundance of immunoglobulins has also been reported in the urine of scrapie-infected hamsters and sheep.[72,76] An elevated level of clusterin has previously been reported in astrocytes, as well as a significant accumulation in CSF and blood plasma. Simon et al. demonstrated that only certain isoforms of clusterin exhibited differential abundance in urine samples collected from control and infected cattle. The specificity of these particular isoforms was further investigated by using clusterin β subunit-specific antibodies to confirm that it was the β-subunits of clusterin that exhibited differential abundance in response to BSE infection. Moreover, further examination revealed that the differentially expressed protein isoforms are β-subunits of the glycoprotein clusterin that possess N-linked glycans (also discussed in the next section).
One issue concerning the reliability and accuracy of urine-based biomarkers is whether protein biomarkers identified from a homogeneous population, where the infected subjects and control subjects are matched by breed, gender and age, can be applied to a heterogeneous population, as the urine proteome will be affected by these factors. To address this issue, Plews et al. utilized 2D-DIGE to profile the urine proteome of both a known set and an associated blinded unknown set that contained control and BSE-infected cattle of different breeds, genders and ages. Based on the gel images obtained from the known set, nine selected spot features were used to create a classifier that could be used to distinguish the infected from the uninfected samples. When applied to the 19 samples in an independent unknown data set, this classifier was able to discriminate between control and infected samples with 74% accuracy. However, five samples that were collected relatively early in the disease course were misclassified, which indicates that the disease stage of the animals influenced the differential abundance of individual proteins in urine. To further analyze the influence of breed, age and gender on the protein profiles in urine, a merged sample set was created in which all control or infected animals from both known and unknown sets were included. All three factors were demonstrated to significantly affect the urinary proteome. However, despite the presence of confounding factors, a new classifier composed of the proteins best able to discriminate between the samples based on different factors was generated by a regularized discriminant analysis algorithm.
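The reported accuracy can be checked with simple arithmetic. The sketch below assumes that the five early-stage samples were the only misclassified samples among the 19 in the unknown set, which the text implies but does not state explicitly; the function name is illustrative.

```python
def accuracy(n_correct, n_total):
    """Fraction of samples assigned to the correct class."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_correct / n_total


n_total = 19          # samples in the blinded unknown set (from the text)
n_misclassified = 5   # early-stage samples reported as misclassified (from the text)

acc = accuracy(n_total - n_misclassified, n_total)
print(f"accuracy: {acc:.1%}")
```

Under that assumption, 14 of 19 samples are classified correctly, which rounds to the 74% accuracy quoted above.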
Delving into the Glycoproteome
As previously mentioned, direct profiling of the CSF or blood proteome by conventional MS-based approaches to search for prion disease biomarkers is challenging because of the complexity of the proteome and the presence of very high-abundance proteins. An effective strategy is to focus on a subproteome that is more relevant to disease development and progression. PTMs play a key role in modulating the activity and function of most proteins in biological systems. Important PTMs such as phosphorylation, glycosylation and ubiquitination also regulate the functions, cellular targeting and degradation of proteins in the CNS.[80–82] Thus, in addition to changes in protein concentration, aberrant PTM patterns of various proteins could be associated with the onset and progression of several neurodegenerative diseases. In recent years, MS-based proteomics has been widely used both to discover novel modifications and to study the differential PTMs of proteins in a disease state.[83,84] Among the various PTMs, glycosylation is one of the most common and most complex. Protein glycosylation is categorized as N-linked, where glycans are attached to asparagine residues in the consensus sequence N-X-S/T (X can be any amino acid except proline) via an N-acetylglucosamine (N-GlcNAc) residue, or O-linked, where the glycans are attached to serine or threonine. Glycosylation plays a fundamental role in numerous biological processes, and aberrant alterations in protein glycosylation are associated with neurodegenerative disease states, such as CJD and Alzheimer's disease (AD).[85,86] It was found that PrPSc has a decreased level of bisecting GlcNAc and increased levels of tri- and tetraantennary glycans compared with PrPC, suggesting a decrease in the activity of the enzyme N-acetylglucosaminyltransferase III.
This possible perturbation to the glycosylation machinery of the cells might cause changes in other glycosylation events and lead to glycoprotein profile changes. Liu et al. showed that aberrant glycosylation may modulate the tau protein at a substrate level, stabilizing its phosphorylated isoforms from brains in AD patients. In another study, Reelin, a glycoprotein that is essential for the correct cytosolic organization of the CNS, is upregulated in the brain and CSF in several neurodegenerative diseases, including frontotemporal dementia, progressive supranuclear palsy, Parkinson's disease and AD. As mentioned earlier, Lamoureux et al. found the level of protein β-subunits of clusterin was elevated in urine of BSE-infected cows. Presence of multiple isoforms of clusterin led to the more in-depth glycosylation analysis. Briefly, samples of BSE-infected urine were treated with enzyme PNGase F to release N-linked glycan chains. Western blotting analysis of PNGase F-treated and mock-treated control and infected urine samples revealed that the differentially expressed protein is highly glycosylated clusterin. Given that aberrant glycosylation and glycoproteins play a critical role in neurodegenerative disorders, in-depth glycoproteomic analyses of body fluids from patients would provide great potential for the discovery of new diagnostic markers. The glycoproteomic approaches have been applied to biomarker discovery in the CSF of AD patients. A previous study used 2D-GE to identify several potential glycoprotein biomarkers, including apolipoprotein E, clusterin and α-1-β-glycoprotein in the CSF from AD patients. In a similar study, Sihlbom et al. utilized albumin depletion prior to 2D gel electrophoresis to enhance glycoprotein concentration in the CSF of AD patients and healthy control individuals for image analysis and determined the structure of N-linked glycans of apolipoprotein J by FT-ICR MS. 
However, the analysis of glycoproteome in tissues or body fluids is challenging due to the low abundance of glycosylated forms of proteins compared with non-glycosylated proteins. Therefore, the analysis of glycoproteins by MS-based proteomics requires effective approaches for enrichment.
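The N-X-S/T consensus sequence described above lends itself to a simple sequence scan. The following is a minimal sketch that lists candidate N-glycosylation sites in a protein sequence given in one-letter code; the example sequence is hypothetical, and the presence of a sequon indicates only a potential site, not that the asparagine is actually glycosylated.

```python
import re

# N followed by any residue except proline, then serine or threonine.
# The lookahead lets overlapping sequons (e.g. "NNTS") both be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_n_glycosylation_sequons(sequence):
    """Return the 1-based positions of asparagines that lie in an N-X-S/T sequon (X != P)."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Hypothetical peptide: N3 (NAS) and the overlapping N11 (NNT) and N12 (NTS)
# are candidate sites, whereas N7 (NPT) is excluded because X is proline.
example = "MKNASLNPTQNNTSA"
print(find_n_glycosylation_sequons(example))  # [3, 11, 12]
```

A scan of this kind is useful for checking whether a peptide identified after PNGase F release contains a plausible attachment site.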
The two most common methods to enrich glycoproteins are lectin affinity chromatography and hydrazide chemistry. In the hydrazide chemistry method, N-glycans of glycoproteins are conjugated to a solid support with hydrazide chemistry after periodate-mediated oxidation of the carbohydrate. Peptide moieties of the covalently captured glycopeptides are then released with PNGase F treatment to allow the peptide and glycosylation site to be identified. Lectins, on the other hand, are a class of proteins isolated from plants, fungi, bacteria and animals that recognize carbohydrates on the surface of proteins. Lectins can specifically capture distinct oligosaccharide epitopes, thus not only allowing the isolation and enrichment of glycoproteins and glycopeptides, but also enabling discrimination of glycan structures among different proteins and different glycoforms of the same protein. Lectins are usually immobilized onto appropriate matrices such as agarose or magnetic beads in a number of chromatographic formats, including tubes, columns and microfluidic channels.[92,93] Two lectins, Concanavalin A (Con A) and wheat germ agglutinin (WGA), are commonly used in lectin affinity chromatography due to their broad selectivity. Con A has a high affinity for high-mannose type N-glycans,[94,95] whereas WGA is selective for N-GlcNAc and sialic acids. Because individual lectins can specifically enrich only a subset of glycoproteins, multilectin columns are applied to maximize the coverage of glycoproteomes in biological fluids.[97–99] By utilizing a multilectin column that contained both Con A and WGA to enrich N-linked glycoproteins in mouse plasma, followed by multidimensional liquid chromatography to separate tryptic peptides prior to MS analysis, our group identified a low-abundance protein, serum amyloid P-component (SAP), in control and prion-infected mouse plasma collected at 108, 158 and 198 days post inoculation (dpi).
Relative quantitative analysis by isotopic formaldehyde labeling revealed that the level of SAP was significantly elevated in infected 108 dpi samples, which was also validated by Western blotting using an SAP antibody. Interestingly, the Western blot experiments showed two bands of SAP at 26 and 30 kDa. The intensity of the 26 kDa band remained relatively constant regardless of the physiological status, whereas the level of the 30 kDa band was significantly elevated in the infected sample at 108 dpi (Figure 2A). SAP is a 224-residue secreted glycoprotein with a single N-glycosylation site at position 52.[100,101] PNGase F digestion, in which N-glycans were cleaved prior to Western blotting, confirmed that the 30 kDa band is the glycosylated form of SAP (Figure 2B). The elevation of SAP observed in infected samples at 108 dpi in the MS analysis can therefore be fully attributed to the increase of glycosylated SAP following enrichment by lectin affinity chromatography, whereas the level of the nonglycosylated isoform remained relatively stable. SAP has previously been reported to be associated in vivo with all types of amyloid deposits, and has been found to co-localize with neurofibrillary pathology in various neurodegenerative diseases including AD, CJD, Parkinson's disease and diffuse Lewy body disorders.[102–104] The results from this study suggested that the glycosylated form of SAP could be used as a potential preclinical biomarker for prion diseases and that glycosylation of SAP plays an important role in the progression of prion disease.
Verification of Biomarkers by Targeted Proteomics
After potential biomarkers are discovered by means of MS-based proteomic analysis, validations that require a large number of clinical samples and have a much higher threshold for accuracy have to be performed to examine their sensitivity and specificity before they can be developed into clinical assays. However, due to the enormous expense and effort spent in the validation phase and the risk that very few candidates will be qualified, a verification stage is usually needed with the goal of selecting those biomarkers that can potentially pass a final validation from the list of candidates generated in the biomarker discovery phase. At present, although the number of potential biomarkers for prion diseases has been increasing, the verification of these proteins has become the major bottleneck in the biomarker development pipeline, allowing few proposed biomarkers to enter clinical validation. One of the main reasons that hindered the process of the verification phase is the lack of technologies that can verify most of the candidates identified in proteomics analysis across a large population of clinical samples. Currently, antibody-based methods such as Western blotting and ELISA are the most widely used method for biomarker verification. These methods rely on specific antibodies for the proteins of interest and cannot be applied to all biomarker candidates because commercial assays or antibodies are only available for a limited number of proteins, and the development of new assays will be an expensive and time-consuming process. Also, it is important that multiple biomarker candidates be validated in the same assay not only due to time and cost concerns, but also because a panel of biomarkers rather than a single protein will be implemented in diagnostic screening for higher specificity and accuracy. 
Furthermore, biomarker validation by ELISA or Western blotting is limited by crossreactivity of different antibodies and the difficulty of having a uniform assay condition that is suitable for all antibodies in multiplexed assays.
Recently, quantitative assays based on MRM-MS using a triple quadrupole mass spectrometer in combination with stable isotope-labeled internal standards have been used in biomarker verification as an alternative to ELISA.[105,106] In this approach, the targeted parent ion is selected in the first quadrupole (Q1) and enters the second quadrupole (Q2), where it undergoes collision-induced dissociation. One or more fragment ions are then selected in the third quadrupole (Q3) according to the predefined transitions, and the resulting signal intensity is used for quantification. MRM-MS has shown good sensitivity, reproducibility and linear dynamic range for direct quantification of proteins in serum samples. The high throughput and multiplexing capability of this approach for biomarker verification in complex samples was also demonstrated by the quantification of 45 endogenous proteins of moderate to high abundance in plasma without fractionation or enrichment in a single LC-MS run. For the verification of low-abundance biomarkers, a recent approach using stable isotope-labeled standards with capture on anti-peptide antibodies (SISCAPA) was applied to enrich for targeted peptides prior to MRM-MS, thereby significantly improving the sensitivity. In this approach, digested plasma samples are spiked with known amounts of stable isotope-labeled internal standard peptides, and the peptides of interest are enriched, together with their labeled counterparts, by immobilized antibodies generated against the specific peptides. The peptides are then released from the antibody and quantified using MRM-MS.
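The quantification step with a stable isotope-labeled internal standard reduces to a ratio calculation, sketched below. The sketch assumes that the endogenous (light) and labeled (heavy) peptides give the same MS response, that signal is measured as integrated peak area per transition, and that areas are summed over transitions; the transition labels, peak areas and spiked amount are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransitionPair:
    """Peak areas for one fragment ion, measured for the light and heavy peptide."""
    label: str         # hypothetical fragment name, e.g. "y5"
    light_area: float  # endogenous peptide
    heavy_area: float  # stable isotope-labeled internal standard


def endogenous_amount(transitions, spiked_amount_fmol):
    """Estimate the endogenous peptide amount from the light/heavy area ratio.

    amount_light = (sum of light areas / sum of heavy areas) * spiked heavy amount
    """
    light = sum(t.light_area for t in transitions)
    heavy = sum(t.heavy_area for t in transitions)
    if heavy <= 0:
        raise ValueError("no signal for the internal standard")
    return light / heavy * spiked_amount_fmol


# Hypothetical measurement: two transitions, 50 fmol of labeled standard spiked in
transitions = [
    TransitionPair("y5", light_area=2000.0, heavy_area=4000.0),
    TransitionPair("y7", light_area=1000.0, heavy_area=2000.0),
]
amount = endogenous_amount(transitions, spiked_amount_fmol=50.0)
print(f"estimated endogenous peptide: {amount:.1f} fmol")
```

Because the labeled standard is added at a known amount and co-elutes with the endogenous peptide, losses during sample handling or SISCAPA enrichment affect both forms equally and largely cancel in the ratio.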
Currently, the verification of protein biomarkers for prion diseases is still conducted using antibody-based platforms, and only a small number of proteins have been selected for verification based on the consideration of the availability and the cost of antibodies, which created a big gap between the discovery and validation phase in the biomarker development pipeline. The application of the MRM-MS assay for the verification of more potential biomarkers identified from large-scale comparative proteomics experiments holds promise to accelerate the development process and generate more candidates for clinical validation. However, an accurate and reproducible MRM-MS assay requires a careful method optimization including sample preparation, LC condition and MS parameters that is specific to each potential biomarker, which presents new challenges in biomarker verification studies.
Expert Rev Proteomics. 2012;9(3):267-280. © 2012 Expert Reviews Ltd.