Effectiveness of Second-generation Antipsychotics: A Naturalistic, Randomized Comparison of Olanzapine, Quetiapine, Risperidone, and Ziprasidone

Erik Johnsen; Rune A Kroken; Tore Wentzel-Larsen; Hugo A Jørgensen


BMC Psychiatry. 2010;10:26 



Study Design

The Bergen Psychosis Project (BPP) is a 24-month, prospective, rater-blind, naturalistic, randomized, head-to-head comparison of the effectiveness of olanzapine, quetiapine, risperidone, and ziprasidone. All patients were recruited from the Division of Psychiatry at Haukeland University Hospital, which has a catchment population of about 400,000. The BPP was approved by the Regional Committee for Medical Research Ethics and the Norwegian Social Science Data Services. Funding of the project was initiated by the Research Council of Norway, followed by Haukeland University Hospital, Division of Psychiatry. The BPP has not received any financial or other support from the pharmaceutical industry.


The Regional Committee for Medical Research Ethics allowed eligible patients to be included before informed consent was provided, thus ensuring a clinically relevant representation in the study. In medical research, the provision of informed consent by the participants is fundamental. Excluding the most gravely ill patients from trials nevertheless represents an ethical dilemma, as these patients will most likely receive the drugs once they are approved for marketing despite the lack of evidence from this population. Trial inclusion of patients without informed consent is justifiable on 2 conditions: that no other context exists in which the research question can be answered, and that all patients get clear clinical benefit from whatever treatment they are allocated to.[10] These criteria are fulfilled in some mental conditions from which important studies have been published.[11,12]

Patients (age ≥ 18 years) were eligible for the study if they were admitted to the emergency ward for symptoms of psychosis, as determined by a score of ≥ 4 on one or more of the items Delusions, Hallucinatory behavior, Grandiosity, Suspiciousness/persecution, or Unusual thought content in the Positive and Negative Syndrome Scale (PANSS),[13] and were candidates for oral antipsychotic drug therapy. Eligible patients met ICD-10[14] diagnostic criteria for schizophrenia, schizoaffective disorder, schizophreniform disorder, brief psychotic episode, delusional disorder, drug-induced psychosis, or major depressive disorder with psychotic features. The diagnoses were determined by experienced clinicians. Patients were excluded from the study if they were unable to use oral antipsychotics, were suffering from manic psychosis, were unable to cooperate reliably during investigations, did not understand spoken Norwegian, were candidates for electroconvulsive therapy, or were medicated with clozapine on admission. Patients with drug-induced psychoses were included only when the condition did not resolve within a few days and when antipsychotic drug therapy was indicated.
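The symptom and age criteria above can be expressed as a simple screening check. The following is a minimal sketch only: the `Candidate` record, its field names, and the exclusion flags are hypothetical and not taken from the study's own data handling, and the ICD-10 diagnosis (made by clinicians) is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Dict

# PANSS items named in the inclusion criterion (score >= 4 on at least one).
PSYCHOSIS_ITEMS = (
    "delusions",
    "hallucinatory_behavior",
    "grandiosity",
    "suspiciousness_persecution",
    "unusual_thought_content",
)


@dataclass
class Candidate:
    """Hypothetical screening record; field names are illustrative."""
    age: int
    panss: Dict[str, int]                  # item name -> score (1 to 7)
    oral_antipsychotic_candidate: bool
    # Hypothetical exclusion flags, e.g. {"manic_psychosis": False}.
    exclusions: Dict[str, bool] = field(default_factory=dict)


def is_eligible(c: Candidate) -> bool:
    """Apply the age, PANSS, and exclusion criteria described in the text."""
    # A missing item is treated as the minimum score of 1 (an assumption).
    has_psychosis = any(c.panss.get(item, 1) >= 4 for item in PSYCHOSIS_ITEMS)
    excluded = any(c.exclusions.values())
    return (
        c.age >= 18
        and has_psychosis
        and c.oral_antipsychotic_candidate
        and not excluded
    )
```

A candidate aged 30 with a Delusions score of 5 and no exclusion flags passes the check. The same candidate fails it if any exclusion flag, such as clozapine use on admission, is set.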


The evidence thus far shows that it is not possible to prospectively predict which antipsychotic might be optimal for a given patient with regard to effect and tolerability, and that antipsychotic therapy currently involves a trial-and-error approach.[15] A prior history of antipsychotic drug use may provide some information, though. Taking these factors into account, the BPP protocol mimicked the normal clinical situation in which oral antipsychotic drug therapy is initiated, with one exception: at admission, a sealed and numbered envelope was opened by the attending psychiatrist, and the patient was then offered the first drug in a random sequence of the first-line second-generation antipsychotics (SGAs) in Norway: olanzapine, quetiapine, risperidone, or ziprasidone. The randomization was open to the treating psychiatrist or physician and to the patient. However, the treating clinician, the patient, or both could discard the SGA listed as number 1 because of medical contraindications to, or prior negative experiences with, the drug, in which case the next drug on the list could be chosen. The same principle was followed if the next drug could not be used. A reason for discarding drugs was sought. In each sequence, the SGA listed as number 1 defined the randomization group (RG). The actual SGA chosen, regardless of randomization group, defined the first-choice group (FCG). Further dosing, combination with other drugs, or switching to another antipsychotic drug was then left to the clinician's discretion. Apart from sporadic use, the patients in the project could use only one antipsychotic drug, except during the cross-taper period associated with a change of antipsychotic drug. This corresponds with leading treatment guidelines, which mention combinations of antipsychotics only as a last resort. In cases where concomitant use of more than one antipsychotic drug was found unavoidable, the patient was excluded from the project. Any investigation that was beyond normal clinical practice was introduced only after informed consent was obtained.
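The relationship between the random sequence, the randomization group (RG), and the first-choice group (FCG) can be illustrated with a short sketch. The function names are hypothetical, and the use of a uniform shuffle is an assumption, since the text does not state how the envelope sequences were generated.

```python
import random
from typing import Iterable, List, Optional, Tuple

FIRST_LINE_SGAS = ["olanzapine", "quetiapine", "risperidone", "ziprasidone"]


def draw_sequence(rng: random.Random) -> List[str]:
    """Return one random ordering of the four SGAs (uniform shuffle assumed)."""
    sequence = list(FIRST_LINE_SGAS)
    rng.shuffle(sequence)
    return sequence


def assign_groups(
    sequence: List[str], discarded: Iterable[str]
) -> Tuple[str, Optional[str]]:
    """Return (RG, FCG) for one patient.

    RG is always the drug listed first in the sequence. FCG is the first
    drug in the sequence that was not discarded by clinician or patient,
    or None if every drug was discarded.
    """
    rejected = set(discarded)
    rg = sequence[0]
    fcg = next((drug for drug in sequence if drug not in rejected), None)
    return rg, fcg
```

For a sequence starting with quetiapine followed by olanzapine, a patient who discards quetiapine stays in the quetiapine RG but falls in the olanzapine FCG. This is why the intention-to-treat and first-choice analyses can differ.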


Study visits were at baseline, at discharge or at 6 weeks from baseline at the latest, and at 3, 6, 12, and 24 months from baseline.

All assessments were performed by one trained investigator. Before inclusion, eligible patients were interviewed by the investigator, using the PANSS, the Calgary Depression Scale for Schizophrenia (CDSS),[16] and the Clinical Drug and Alcohol Use Scales (CDUS/CAUS),[17] and were rated according to the Clinical Global Impression--Severity of Illness scale (CGI-S),[18] and the Global Assessment of Functioning--Split Version, Functions scale (GAF-F).[19] The patients received a physical examination by the admitting physician, and standard blood samples were collected according to the hospital's routine. At discharge from the hospital, or at 6 weeks if not discharged, the tests and examinations were repeated by the rater, who was unaware of the treatment. Patients were also asked to complete the patient-administered version of the UKU Side Effect Rating Scale (UKU-SERS-Pat),[20] and serum level measurements of the antipsychotics were conducted. Up to this point, all investigations and tests were part of the hospital's routine for the management of patients suffering from psychosis and became part of the patient's medical record. At this point, the patients were asked for informed consent to be contacted and included in the follow-up project.

At follow-up visits 3, 6, 12, and 24 months after baseline, measures of psychopathology, function, and tolerability, as well as clinical and laboratory assessments were repeated by the rater blind to treatment.

The global outcome measures were the time until discontinuation of the initial SGA for any cause, the time until discharge from index hospitalization, and the time until readmission to the emergency ward for any reason. Symptoms were assessed by the PANSS, the CDSS, the CGI-S, and the GAF-F. Tolerability was measured by the UKU-SERS-Pat, physical examinations, and laboratory tests. The repeated physical examinations included Body Mass Index (BMI), waist and hip circumferences, and blood pressure. Laboratory tests included electrocardiogram (ECG) and blood tests for glucose, lipids, prolactin, and liver function. The patients were fasting before the drawing of blood, as defined by no intake of food or caloric drink during the preceding 9 hours.

At each visit, all medications were recorded, and the mean antipsychotic drug doses were calculated. Doses of antipsychotics other than the SGAs were converted to chlorpromazine equivalent doses.[21] In cases where chlorpromazine equivalent doses could not be found in the literature, doses were instead converted to defined daily doses (DDDs) as developed by the World Health Organization Collaborating Centre for Drug Statistics Methodology.[22] The basic definition of the DDD unit is the assumed average maintenance dose per day for a drug used for its main indication in adults.
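The DDD conversion amounts to dividing the prescribed daily dose by the drug's DDD. The sketch below illustrates this together with one possible way of averaging doses over time. The text does not say how the mean doses were computed, so the time-weighted mean is an assumption, and the DDD value in the usage note is a placeholder rather than an official WHO figure.

```python
from typing import Iterable, Tuple


def dose_in_ddd(daily_dose_mg: float, ddd_mg: float) -> float:
    """Express a daily dose as a multiple of the drug's defined daily dose."""
    if ddd_mg <= 0:
        raise ValueError("ddd_mg must be positive")
    return daily_dose_mg / ddd_mg


def time_weighted_mean(periods: Iterable[Tuple[float, float]]) -> float:
    """Mean daily dose over consecutive periods given as (days, daily_dose).

    Time weighting is an assumption of this sketch, not a detail
    taken from the study.
    """
    periods = list(periods)
    total_days = sum(days for days, _ in periods)
    if total_days <= 0:
        raise ValueError("total duration must be positive")
    return sum(days * dose for days, dose in periods) / total_days
```

With a placeholder DDD of 10 mg, a daily dose of 15 mg corresponds to `dose_in_ddd(15.0, 10.0)`, that is 1.5 DDDs. The same function applies to chlorpromazine equivalents if the published equivalence ratio is supplied in place of the DDD.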

Statistical Procedures

The primary analyses were intention-to-treat (ITT) analyses based on the randomization groups (RGs); that is, trial participants were analyzed in the group to which they were randomized, regardless of which treatment they actually received or how much treatment they received.[23] Secondary analyses were based on first-choice groups (FCGs). Baseline data of FCGs were analyzed using SPSS software, version 15 (SPSS, Chicago, IL), by means of exact χ² tests for categorical data and one-way ANOVAs for continuous data. For multiple comparisons, Benjamini-Hochberg adjustments were applied. For continuous data that were not approximately normally distributed, a Kruskal-Wallis nonparametric test was used. For baseline comparisons between those lost to follow-up before retesting and those who were retested, independent samples t-tests were used for continuous data and exact χ² tests for categorical data.
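For readers unfamiliar with the Benjamini-Hochberg adjustment, the following is a minimal sketch of the standard step-up procedure for computing adjusted p-values. It is a generic illustration in Python, not the SPSS or R routine used in the study, and the function name is hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling each by m / rank and
    # taking a running minimum so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A comparison is then declared significant when its adjusted p-value falls below the chosen level, which controls the false discovery rate rather than the family-wise error rate.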

Global outcomes were analyzed using SPSS, version 15, with Kaplan-Meier analyses of survival. Changes in symptoms and tolerability outcomes were analyzed in R by means of linear mixed effects (LME) models.[24,25] Fixed effects, i.e., systematic differences between the drugs, were different linear slopes in the four treatment groups, technically a group-by-time interaction with no baseline group differences. The model calculates the overall change per time unit for each variable in the follow-up period, which can be visually represented by the slope of a straight line with time on the x axis and the respective variable on the y axis. The aim of the present study was to investigate the overall change during the follow-up period, and the LME model was considered the analysis of choice for this purpose. The model uses all available data and handles different numbers of visits by individual patients, as well as differences in times between visits. Furthermore, the mixed effects model has demonstrated superior statistical power when the missing data are non-ignorable.[26] A linear slope for the follow-up period may represent an oversimplification, however, as it does not capture slope differences at different times. Based on results from other effectiveness studies, symptom changes typically follow an initial steep decline followed by a flatter curve.[9,27] LME sensitivity analyses were therefore undertaken separately for the steep and for the flat part of the symptom curves. The choice of periods corresponding to the steep and flat parts was derived from visual inspection of plots of the individual symptom curves. The drawback of dividing the follow-up period is loss of statistical power and hence a risk of statistical type II errors.
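The Kaplan-Meier estimate used for the global outcomes (for example, time until discontinuation of the initial SGA) can be sketched in a few lines. This is a generic product-limit estimator written for illustration, not the SPSS procedure used in the study. The function name and input layout are assumptions.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(
    times: Sequence[float], events: Sequence[bool]
) -> List[Tuple[float, float]]:
    """Product-limit estimate of the survival function.

    times  -- follow-up time for each patient
    events -- True if the event (e.g. discontinuation) was observed,
              False if the patient was censored at that time
    Returns (time, survival probability) at each distinct event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after t.
        at_risk -= n_leaving
    return curve
```

Patients censored at a given time are counted as at risk at that time, which is the usual convention. Censoring therefore reduces the risk set for later times without itself lowering the survival estimate.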

Symptom ratings, laboratory tests, and physical examinations were administered at all visits. The UKU-SERS-Pat was administered at visit 2 and subsequent visits. Because differences between treatment groups on UKU-SERS-Pat measures could theoretically be present at visit 2, this was allowed for in the statistical model. For multiple comparisons, Benjamini-Hochberg adjustments were applied. The level of statistical significance was set at α = 0.05.
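The two model variants described in this section can be written out as follows. This is a sketch under stated assumptions: the symbols are introduced here for illustration, and the random-effects structure (a random intercept per patient) is assumed, as the text does not specify it. Here $y_{ij}$ is the outcome for patient $i$ at visit $j$, $t_{ij}$ is the time since baseline, and $g(i)$ is the treatment group of patient $i$.

```latex
% Symptom and most tolerability outcomes:
% common baseline level, group-specific slopes (group-by-time interaction)
y_{ij} = \beta_0 + \beta_{g(i)}\, t_{ij} + b_i + \varepsilon_{ij},
\qquad b_i \sim N(0, \sigma_b^2), \quad \varepsilon_{ij} \sim N(0, \sigma^2)

% UKU-SERS-Pat outcomes (first measured at visit 2):
% group-specific levels \alpha_{g(i)} are additionally allowed
y_{ij} = \beta_0 + \alpha_{g(i)} + \beta_{g(i)}\, t_{ij} + b_i + \varepsilon_{ij}
```

In both variants, the drug comparison of interest is the difference between the slopes $\beta_g$ of the four groups.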