Towards a Brief Definition of Burnout Syndrome by Subtypes

Development of the "Burnout Clinical Subtypes Questionnaire" (BCSQ-12)

Jesús Montero-Marín; Petros Skapinakis; Ricardo Araya; Margarita Gili; Javier García-Campayo


Health and Quality of Life Outcomes. 2011;9(74) 



Design and Study Population

A cross-sectional design was utilized by means of the self-report technique through an online questionnaire completed by selected subjects who had provided informed consent.

The study population comprised the entire workforce of the University of Zaragoza employed in January 2008 (N = 5,493). The sample size was calculated with a 95% confidence interval and a margin of error of 3.5%. The prevalence of burnout was estimated at 18%,[9] giving a result of 427 subjects. As the expected response rate in web-mail surveys is approximately 27%,[10,11] and in order to perform both an exploratory and a confirmatory factor analysis on the different groups, 3,200 employees were selected by stratified probability sampling with proportional allocation by occupation (58% teaching and research staff or 'TRS', 33% administration and service personnel or 'ASP' and 9% trainees or 'TRA').
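The sample size reported above can be reproduced with a short sketch. It assumes the usual formula for estimating a proportion with a finite population correction; the text does not state which formula Epidat applied, so the function names and the choice of formula are illustrative assumptions, not the authors' procedure.

```python
import math
from statistics import NormalDist


def sample_size_proportion(population, prevalence, margin, confidence=0.95):
    """Sample size for estimating a proportion, with finite population correction.

    Assumption: standard normal-approximation formula; not confirmed by the source.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    n0 = z ** 2 * prevalence * (1 - prevalence) / margin ** 2  # infinite population
    n = n0 / (1 + (n0 - 1) / population)  # finite population correction
    return math.ceil(n)


def proportional_allocation(total, shares_percent):
    """Split `total` across strata in proportion to integer percentage shares."""
    return {name: total * pct // 100 for name, pct in shares_percent.items()}


# Figures taken from the text: N = 5,493, prevalence 18%, margin of error 3.5%.
n_required = sample_size_proportion(population=5493, prevalence=0.18, margin=0.035)

# 3,200 selected employees allocated by occupation (58% / 33% / 9%).
strata = proportional_allocation(3200, {"TRS": 58, "ASP": 33, "TRA": 9})

print(n_required)
print(strata)
```

Under these assumptions the calculation agrees with the 427 subjects given in the text.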

The final total sample of participants (nT = 826) was divided randomly into two equal halves (n1 = 413 and n2 = 413). The size of the resulting sub-samples permitted the established margin of error to be maintained and exceeded the construct validity evaluation criterion, making it possible to perform the analysis on both groups with psychometric adjustment.[12–15] The sample size calculation, subject selection and sample division were performed with Epidat 3.1 software.
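A random division into two equal halves can be sketched as follows. The subject identifiers and the seed are hypothetical; the study itself used Epidat 3.1, whose exact randomization procedure is not described in the text.

```python
import random


def split_in_halves(subject_ids, seed=2008):
    """Shuffle the identifiers and cut the list in two.

    The seed is an arbitrary, hypothetical value used only for reproducibility.
    """
    rng = random.Random(seed)
    shuffled = list(subject_ids)  # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical identifiers for the 826 participants.
ids = list(range(1, 827))
n1, n2 = split_in_halves(ids)
print(len(n1), len(n2))
```

One half can then be used for the exploratory analysis and the other for the confirmatory analysis, so that the factor structure is not confirmed on the same data that suggested it.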


An e-mail was sent to the selected subjects explaining the aims of the research. This message contained a link to an online questionnaire and two access passwords that enabled the subjects to complete the questionnaire during the month of February 2008. The first page of the protocol provided a further explanation of the aims of the study, the participants to whom it was addressed, the voluntary nature of participation, the possible benefits and risks entailed and the confidentiality of the information given. All participants received an anonymous report with an explanation of their results. The project was approved by the regional Clinical Research Ethics Committee of Aragon.


Sociodemographic and Occupational Factors

Subjects were first asked a set of questions dealing with socio-demographic and occupational characteristics including: age, sex, whether they were in a stable relationship ('yes' vs 'no'), level of education ('secondary or lower', 'university degree', 'doctorate'), occupation type ('TRS', 'ASP', 'TRA'), years of service ('<4', '4–16', '>16'), type of employment contract ('permanent' vs 'part time') and whether they had taken sick leave in the previous year ('yes' vs 'no').

Burnout Clinical Subtype Questionnaire (BCSQ-12)

Next, subjects were given the "Burnout Clinical Subtype Questionnaire" in its brief Spanish version, the BCSQ-12 (Additional file 1, Appendix 1: Spanish language version of BCSQ-12; Appendix 2: English language version of BCSQ-12). This questionnaire consists of 12 items equally distributed between the dimensions of 'overload' (e.g. "I overlook my own needs to fulfil work demands"), 'lack of development' (e.g. "My work doesn't offer me opportunities to develop my abilities") and 'neglect' (e.g. "When things at work don't turn out as well as they should, I stop trying"). Subjects had to indicate their degree of agreement with each of the statements presented according to a Likert-type scale with 7 response options, scored from 1 (totally disagree) to 7 (totally agree). The results were presented as scalar scores. Cronbach's α coefficient showed the internal consistency of these dimensions, with values of α ≥ 0.85 in all cases in the present study.
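As an illustration of how internal consistency is obtained, the following sketch computes Cronbach's α from its standard definition, α = k/(k − 1) · (1 − Σ item variances / variance of the total score). The responses are hypothetical values for one four-item dimension on the 1 to 7 scale; they are not study data.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Sample variance of the summed score across respondents.
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses (5 respondents x 4 items), for illustration only.
responses = [
    [6, 5, 6, 7],
    [2, 3, 2, 2],
    [4, 4, 5, 4],
    [7, 6, 6, 7],
    [3, 2, 3, 3],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Values approach 1 when the items of a dimension rise and fall together across respondents.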

Maslach Burnout Inventory General Survey (MBI-GS)

Subjects were also given the "Maslach Burnout Inventory-General Survey" (MBI-GS)[2] in its validated Spanish language version.[16] This adaptation consists of 15 items grouped into three dimensions: 'exhaustion' (e.g. "I feel emotionally drained from my work"), 'cynicism' (e.g. "I've become more callous towards people since I took this job") and 'efficacy' (e.g. "I deal very effectively with the problems of my work"). Responses were arranged in a Likert-type scale with 7 response options, scored from 0 ('never') to 6 ('always'). Results are presented in scalar scores. All of the questionnaire dimensions showed an internal consistency of α ≥ 0.78.[16]

Data Analysis

A descriptive analysis of the participants' socio-demographic and occupational characteristics was conducted, using means and standard deviations for age and percentages for the other variables. The two sub-samples were compared using Student's t-test for age and χ2 for the remaining variables.

The construct validity of the BCSQ-12 was first tested by means of an exploratory factor analysis (EFA) over n1. The maximum likelihood (ML) extraction method was used with varimax orthogonal rotation to facilitate interpretation, enabling relatively unrelated dimensions to be obtained. We had previously verified that: the correlation matrix presented a large number of significant values; all variables presented a value of r > 0.30; the absolute values of the anti-image matrix were close to 0; the determinant of the matrix was very low; the Kaiser-Meyer-Olkin (KMO) index was > 0.70; Bartlett's test of sphericity was statistically significant; and the measures of sampling adequacy (MSA) were above 0.80.[13] The number of components was decided using Kaiser's criterion, which requires eigenvalues > 1,[17] in addition to Cattell's test on the scree plot.[18] The factor to which each item belonged was determined by means of the factor loading criterion w > 0.5 on only one of the factors,[12] and the percentage of variance explained in each variable by means of the h2 communality values.
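The item assignment rule and the communality can be illustrated with a minimal sketch. It assumes an orthogonal rotation, in which case the communality h2 of an item is the sum of its squared loadings. The loadings are hypothetical, and applying the threshold to the absolute value of the loading is an assumption not stated in the text.

```python
def communality(loadings):
    """h2 under an orthogonal rotation: sum of squared loadings across factors."""
    return sum(w ** 2 for w in loadings)


def assigned_factor(loadings, threshold=0.5):
    """Index of the only factor whose loading exceeds the threshold, else None.

    Assumption: the criterion is applied to |w|, so negative loadings count too.
    """
    salient = [i for i, w in enumerate(loadings) if abs(w) > threshold]
    return salient[0] if len(salient) == 1 else None


# Hypothetical rotated loadings on three factors, for illustration only.
rotated = {
    "item_a": [0.82, 0.10, 0.05],  # loads clearly on the first factor
    "item_b": [0.55, 0.52, 0.08],  # cross-loads on two factors
    "item_c": [0.20, 0.15, 0.30],  # no loading above the threshold
}

for name, loadings in rotated.items():
    print(name, assigned_factor(loadings), round(communality(loadings), 3))
```

Items such as the hypothetical `item_b` and `item_c` would fail the criterion, the first because it is not specific to one factor and the second because it is poorly explained by all of them.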

Confirmatory factor analysis (CFA) was performed over n2 in order to ensure the clear distinction between the factors. The covariance matrix was used for data entry as it enables robust analysis of ordinal data when the latent variables present more than one indicator.[19] This analysis was carried out using the ML method, which assumes multivariate normality, although it is relatively insensitive to violations of this assumption.[20,21] Nevertheless, we ensured that Mardia's coefficient for kurtosis was < 70,[22] given that below this limit the ML method provides consistent parameter estimates.[23] All components of the model were introduced as latent factors, taking the items of the BCSQ-12 as observable variables distributed according to the original proposal.[7] From an analytical perspective, factor loadings (λ) > 0.5,[24–26] the explained variance of each observable variable (R2) and the degree of association between latent factors (φ), all of which were standardized, were taken into account. From a general perspective, absolute fit and incremental fit indices were considered.

The absolute fit indices used were: chi-square (χ2), chi-square/degrees of freedom (χ2/df), goodness-of-fit index (GFI), adjusted goodness-of-fit index (AGFI), root mean square error of approximation (RMSEA) and standardized root mean square residual (SRMR). χ2 is highly sensitive to sample size,[24] so χ2/df was also used, which indicates a good fit with a value < 5 or, more strictly, < 3.[20,21,24,25] GFI measures explained variance and presents the same limitation as χ2, while AGFI corrects this limitation depending on the degrees of freedom and number of variables. Both are considered acceptable at values ≥ 0.9.[26–29] RMSEA is a measurement of the error of approximation to the population and is considered acceptable < 0.08,[30] although values of < 0.06[28] and < 0.05[24] have also been proposed. Generally speaking, values < 0.05 are good, while those close to 0.08 are reasonable and values > 0.1 are unacceptable.[31] SRMR is the standardized difference between the observed and the predicted covariance, indicating a good fit for values < 0.08.[21]
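Two of these indices can be computed directly from the model χ2, as the sketch below shows. It assumes the common point-estimate formula RMSEA = sqrt(max(χ2 − df, 0) / (df · (N − 1))); some programs use N instead of N − 1, and the text does not say which variant AMOS reported. The χ2 and df values are hypothetical, and the label for values between 0.08 and 0.1 is an assumption, since the text does not classify that range.

```python
import math


def chi2_df_ratio(chi2, df):
    """Relative chi-square: chi-square divided by its degrees of freedom."""
    return chi2 / df


def rmsea(chi2, df, n):
    """RMSEA point estimate (assumed formula, using N - 1 in the denominator)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def rmsea_label(value):
    """Rough reading of RMSEA following the thresholds quoted in the text."""
    if value < 0.05:
        return "good"
    if value <= 0.08:
        return "reasonable"
    if value <= 0.10:
        return "borderline"  # assumption: this range is not classified in the text
    return "unacceptable"


# Hypothetical model results for a sub-sample of 413 subjects.
chi2_model, df_model, n_subjects = 102.0, 51, 413

ratio = chi2_df_ratio(chi2_model, df_model)
error = rmsea(chi2_model, df_model, n_subjects)
print(ratio, round(error, 3), rmsea_label(error))
```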

The incremental fit indices used were: normed fit index (NFI), non-normed fit index (NNFI), incremental fit index (IFI) and comparative fit index (CFI). NFI measures the proportional reduction in the fit function when going from the null model to the proposed model; it does not take into account the parsimony of the model and is considered acceptable at values > 0.9.[32,33] NNFI considers the degrees of freedom of the proposed model and of the independence model, and ≥ 0.9 is recommended,[26] although > 0.9[33] and ≥ 0.95[34] have also been proposed. IFI also introduces a factor of scale, with values > 0.9 being acceptable.[35] CFI measures improvement in the measurement of non-centrality, also taking into account the parsimony of the model, and indicates good fit at values ≥ 0.9,[26] although > 0.9[30] and ≥ 0.95[34] have also been proposed.
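The incremental indices compare the proposed model with the null (independence) model. The sketch below uses their usual textbook definitions, which are assumed here rather than taken from the text, and hypothetical χ2 and df values.

```python
def nfi(chi2_null, chi2_model):
    """Normed fit index: proportional reduction in chi-square from the null model."""
    return (chi2_null - chi2_model) / chi2_null


def nnfi(chi2_null, df_null, chi2_model, df_model):
    """Non-normed fit index, based on chi-square/df ratios of both models."""
    r_null = chi2_null / df_null
    r_model = chi2_model / df_model
    return (r_null - r_model) / (r_null - 1)


def ifi(chi2_null, chi2_model, df_model):
    """Incremental fit index."""
    return (chi2_null - chi2_model) / (chi2_null - df_model)


def cfi(chi2_null, df_null, chi2_model, df_model):
    """Comparative fit index, based on the non-centrality of both models."""
    d_model = max(chi2_model - df_model, 0.0)
    d_null = max(chi2_null - df_null, d_model)
    return 1.0 if d_null == 0 else 1 - d_model / d_null


# Hypothetical values: null model and proposed model.
chi2_null, df_null = 2000.0, 66
chi2_model, df_model = 102.0, 51

indices = {
    "NFI": nfi(chi2_null, chi2_model),
    "NNFI": nnfi(chi2_null, df_null, chi2_model, df_model),
    "IFI": ifi(chi2_null, chi2_model, df_model),
    "CFI": cfi(chi2_null, df_null, chi2_model, df_model),
}
for name, value in indices.items():
    print(name, round(value, 3))
```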

Criterion validity was estimated using ROC curve analysis over nT. The area under this curve was taken as a representation of the discriminatory capacity of the 'overload', 'lack of development' and 'neglect' dimensions (BCSQ-12) to differentiate between 'cases' and 'non-cases' of 'exhaustion', 'cynicism' and 'lack of efficacy' (MBI-GS), respectively. 'Case'/'non-case' status was established in the criterion dimensions taking as the cut-off the 75th percentile of the norms for the general Spanish population, corresponding to high or very high scores ('exhaustion' ≥ 2.90; 'cynicism' ≥ 2.26 and 'efficacy' ≤ 3.83).[16] The χ2 test was used to contrast the area under the ROC curve against the hypothesis of random behaviour. Cut-off points were chosen for the BCSQ-12 dimensions at scores that optimized the sensitivity-specificity ratio, marking the difference between 'exposed' and 'non-exposed' in each of the conditions.
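The following sketch illustrates the two quantities involved: the area under the ROC curve, computed here as the probability that a randomly chosen case scores higher than a randomly chosen non-case, and a cut-off search. The text does not name the optimization rule, so Youden's index (sensitivity + specificity − 1) is used as a stand-in assumption. All scores are hypothetical.

```python
def auc(case_scores, noncase_scores):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    wins = 0.0
    for c in case_scores:
        for n in noncase_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_scores) * len(noncase_scores))


def best_cutoff(case_scores, noncase_scores):
    """Cut-off maximizing Youden's index (an assumed criterion, see lead-in).

    A subject is classified as 'exposed' when the score is >= the cut-off.
    Returns (cutoff, youden_index, sensitivity, specificity).
    """
    best = None
    for cut in sorted(set(case_scores) | set(noncase_scores)):
        sens = sum(s >= cut for s in case_scores) / len(case_scores)
        spec = sum(s < cut for s in noncase_scores) / len(noncase_scores)
        j = sens + spec - 1
        if best is None or j > best[1]:
            best = (cut, j, sens, spec)
    return best


# Hypothetical dimension scores for MBI-GS 'cases' and 'non-cases'.
cases = [5.0, 5.5, 6.0, 4.5, 6.5]
noncases = [2.0, 3.0, 3.5, 4.5, 2.5, 4.0]

area = auc(cases, noncases)
cutoff, youden, sensitivity, specificity = best_cutoff(cases, noncases)
print(round(area, 3), cutoff, round(sensitivity, 2), round(specificity, 2))
```

An area of 0.5 corresponds to random behaviour, which is the hypothesis the observed area is tested against.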

Accuracy was also calculated by means of negative predictive values, overall misclassification rate, positive likelihood ratios (the quotient of sensitivity and 1 − specificity) and negative likelihood ratios (the quotient of 1 − sensitivity and specificity). Likelihood ratios between 0.5–2 are regarded as poor; between 2–5 or 0.2–0.5 as good; 5–10 or 0.1–0.2 as very good, and > 10 or < 0.1 as excellent.[36] The size of the effect was estimated by using multivariate logistic regression (LR) models by means of the calculation of adjusted odds ratios (OR), controlling for the variables of age, sex, stable relationship, level of education, occupation type, years of service and duration and type of work contract, described in the preceding section. The statistical significance of the effect was estimated by the Wald test and the goodness of fit of the models by means of the Hosmer-Lemeshow (H-L) χ2 test. Confidence intervals at 95% (CI 95%) were calculated for all measures of accuracy and effect.
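These accuracy measures all follow from a 2x2 table of 'exposed'/'non-exposed' against 'case'/'non-case'. The sketch below uses hypothetical counts; the way exact boundary values (for example a ratio of exactly 5) are assigned to a band is an assumption, since the quoted ranges share their endpoints.

```python
def accuracy_measures(tp, fp, fn, tn):
    """Accuracy measures from a 2x2 table (true/false positives and negatives).

    Assumes specificity is strictly between 0 and 1, so no division by zero.
    """
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "npv": tn / (tn + fn),  # negative predictive value
        "misclassification": (fp + fn) / (tp + fp + fn + tn),
        "lr_positive": sens / (1 - spec),
        "lr_negative": (1 - sens) / spec,
    }


def lr_label(lr):
    """Band for a likelihood ratio, following the ranges quoted in the text."""
    if lr > 10 or lr < 0.1:
        return "excellent"
    if lr >= 5 or lr <= 0.2:
        return "very good"
    if lr >= 2 or lr <= 0.5:
        return "good"
    return "poor"


# Hypothetical counts: 100 cases and 200 non-cases.
result = accuracy_measures(tp=80, fp=30, fn=20, tn=170)
for name, value in result.items():
    print(name, round(value, 3))
print(lr_label(result["lr_positive"]), lr_label(result["lr_negative"]))
```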

The distribution of items and factors was described by means of the mean, standard deviation, median, 25–75 percentiles, minimum-maximum scores, skewness and kurtosis. Internal consistency was assessed by means of the item-rest correlation, Cronbach's α and the changes in α produced by eliminating each individual item. Comparisons by sex and occupation were made using the Mann-Whitney and Kruskal-Wallis tests, given the non-normal distribution of the dimensions in these groups.

The level of significance adopted in the tests was p < 0.05, and p < 0.017 for multiple comparisons owing to the Bonferroni correction. Data analysis was carried out using the SPSS-15, AMOS-7 and Epidat 3.1 software packages.
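The corrected threshold is consistent with dividing the overall significance level by the number of comparisons. The sketch below assumes three comparisons (for instance the three pairwise contrasts among the TRS, ASP and TRA groups); the text does not state the number explicitly.

```python
def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level under the Bonferroni correction."""
    return alpha / n_comparisons


# Assumption: three comparisons, which reproduces the threshold quoted in the text.
per_test = bonferroni_alpha(0.05, 3)
print(round(per_test, 3))
```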

