A High Risk of Sleep Apnea Is Associated With Less Postoperative Cognitive Dysfunction After Intravenous Anesthesia

Results of an Observational Pilot Study

Soeren Wagner; Joerg Quente; Sven Staedtler; Katharina Koch; Tanja Richter-Schmidinger; Johannes Kornhuber; Harald Ihmsen; Juergen Schuettler


BMC Anesthesiol. 2018;18(139) 



In this prospective single-center study, two groups (OSAS group and a control group) were compared. The study was performed at the University Hospital of Erlangen, Germany between June 2012 and June 2013 in accordance with the guidelines for Good Clinical Practice and the Declaration of Helsinki. The study was approved by the local Ethics committee (Ethikkommission der Medizinischen Fakultät der Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany) on 19.04.2012 (reference number: 87_12 B).


After written informed consent, 51 adult patients of both genders who were undergoing surgery and had an estimated hospital stay of at least 3 days were enrolled in this study. Inclusion criteria were an age between 55 and 80 years and an American Society of Anesthesiologists physical status classification of I to III. Patients with a history of brain or head injury, cerebral ischemia, diseases of the central nervous system, psychological disorders, alcohol or illegal drug abuse, neuroenhancing or neurocompromising medication, a manifest diagnosis of pre-existing cognitive impairment or a severe cardiovascular disorder were excluded from the study. Patients were assessed by a detailed screening examination and clinical interview. None of the included patients had been treated for OSAS. Cognitive function was assessed using a neuropsychological test battery on the day before surgery as a baseline measurement. Postoperative testing was performed on the first or second postoperative day.

Assessment of the Risk for OSAS

The severity and classification of OSAS are most commonly characterized by the apnea/hypopnea index (AHI), which is derived from respiratory disturbances recorded while the sleeping patient is monitored in a sleep laboratory. Oxygen desaturation far below the physiological range is the key clinical parameter indicating an apnea episode during sleep. In the present study, however, we used the STOP-BANG test to identify patients with a high risk for OSAS. The STOP-BANG test is a brief, well-known and internationally accepted questionnaire for sleep disorders.[1] This validated test detects patients with an increased risk for OSAS.[11,12] The STOP-BANG questionnaire asks about snoring, fatigue and observed cessation of breathing, and also considers data on blood pressure, body mass index, age, neck size and gender. Patients meeting the study criteria were asked to participate in the study. The assessors were blinded to the screened and enrolled patients. Patients were screened during the premedication visit of the anesthesiology staff. Eligible patients were enrolled after giving written informed consent to the study. After consent, the patients were screened with the STOP-BANG test. If the resulting composite score (range 0 to 8) was 3 or higher, the patient was assigned to the OSAS group; otherwise the patient was assigned to the control group. Two assessors performed the testing so that each testing phase could be completed within the study schedule. No significant differences between the two assessors were detected with regard to the validity of the results.
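The group assignment described above can be sketched in a few lines of Python. This is only an illustration: the item cutoffs used here (BMI above 35 kg/m², age above 50 years, neck circumference above 40 cm, male gender) are those of the commonly published STOP-BANG questionnaire and are not stated in this text, and the names `StopBangAnswers`, `stop_bang_score` and `assign_group` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StopBangAnswers:
    """One patient's answers; field names are illustrative only."""
    snoring: bool
    tired: bool
    observed_apnea: bool
    high_blood_pressure: bool
    bmi: float          # kg/m^2
    age_years: int
    neck_cm: float      # neck circumference in cm
    male: bool


def stop_bang_score(a: StopBangAnswers) -> int:
    """Composite score from 0 to 8: one point per positive item.

    The numeric cutoffs are assumptions taken from the commonly
    published questionnaire, not from the study text.
    """
    items = [
        a.snoring,
        a.tired,
        a.observed_apnea,
        a.high_blood_pressure,
        a.bmi > 35,
        a.age_years > 50,
        a.neck_cm > 40,
        a.male,
    ]
    return sum(items)


def assign_group(score: int) -> str:
    """Study rule: a score of 3 or higher goes to the OSAS group."""
    if not 0 <= score <= 8:
        raise ValueError("STOP-BANG score must be between 0 and 8")
    return "OSAS" if score >= 3 else "control"


patient = StopBangAnswers(True, False, False, True, 31.0, 62, 38.0, True)
print(stop_bang_score(patient), assign_group(stop_bang_score(patient)))
```

Under the assumed age cutoff, every patient in the 55 to 80 year inclusion range would already score one point for age, so two further positive items would be enough to reach the threshold.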

Clinical Protocol

Preoperatively, we recorded all prescribed medications and ensured that no neuroenhancing or neurocompromising medication had been started recently or was given routinely. Benzodiazepines for premedication were strictly avoided in all patients; instead, 75 or 150 μg clonidine was given orally if needed. Total intravenous anesthesia with propofol and remifentanil was performed in all patients. Prior to intubation, patients received 0.1 mg fentanyl. Propofol and remifentanil were administered as target controlled infusion (TCI). Rocuronium was given as the neuromuscular blocking drug. Intraoperatively, hemodynamic stability was maintained in order to avoid phases of hypotension or hypertension and fluctuations of blood pressure during surgery.

Hemoglobin oxygen saturation was monitored in all patients by pulse oximetry. Prior to intubation and extubation, all patients received 100% oxygen for several minutes and were transferred quickly to the recovery room. If necessary, additional oxygen was supplied in the recovery room, and patients were weaned off oxygen support prior to discharge.

During the postoperative phase, control patients were monitored in the recovery room for at least two hours. Patients in the OSAS group were monitored overnight according to our hospital's clinical safety guidelines. Pain was evaluated using the 11-point numerical rating scale (NRS, 0 = no pain, 10 = maximum pain). If the patient's pain score was greater than 4, an infusion of 7.5 or 15 mg piritramide, depending on total body weight, was administered over 20 min.

Assessment of Cognitive Function

Cognitive function was assessed using a neuropsychological test battery comprising six different cognitive tests as a baseline measurement on the day before surgery. Postoperative testing was performed on the first or second postoperative day. All perioperative tests were carried out in a quiet, separate room during the daytime, and we attempted to perform pre- and postoperative tests at matching times of the day. The complete battery required approximately 60 min per testing phase and patient. To avoid learning effects, tests were presented in two different versions where necessary. The following tests were performed:


DemTect

The DemTect is a highly sensitive psychometric screening test to identify patients with mild cognitive impairment and patients with dementia in the early stages of the disease.[13] The test consists of five tasks, which survey the functions "verbal memory", "verbal fluency", "cognitive flexibility" and "attention". The transformed total score, which ranges from 0 to 18 points, is independent of age and education. The DemTect helps in deciding whether cognitive performance is adequate for age (13–18 pts.) or whether mild cognitive impairment (9–12 pts.) or dementia (0–8 pts.) is likely.
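The score bands given above translate directly into a small lookup. The following Python sketch only restates those cutoffs; the function name `demtect_category` and the label strings are hypothetical.

```python
def demtect_category(score: int) -> str:
    """Map a transformed DemTect total score (0 to 18) to the bands in the text."""
    if not 0 <= score <= 18:
        raise ValueError("DemTect total score must be between 0 and 18")
    if score >= 13:
        return "adequate for age"
    if score >= 9:
        return "mild cognitive impairment likely"
    return "dementia likely"


for s in (16, 10, 5):
    print(s, demtect_category(s))
```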

Rivermead Behavioural Memory Test (RBMT)

The Rivermead Behavioural Memory Test is a highly sensitive test of global memory impairment that examines immediate and delayed recall.[14] It is designed to predict everyday memory difficulties in people with an acquired, non-progressive brain injury and to monitor their capability over time. Because four parallel forms are available, learning effects can be avoided across the testing phases. For our study we selected the subtest "story", in which a story of nearly 55 words is read out to the subject, who has to recall the content immediately and after a 25 min interval. This subtest is known as a suitable evaluation of verbal memory function, logical memory and episodic memory. The maximum score is 84 points, and a higher test value means a better performance.

Zahlen-Verbindungs-Test (ZVT)

The ZVT measures general intelligence performance and analyzes non-verbal cognitive performance speed, which is independent of education but influenced by genetics.[15] The corresponding ability is referred to as "fluid intelligence" or "perceptual or processing speed". The test is a diagnostic tool that is used clinically in organic brain disease. Using a pen, tested subjects have to connect 90 numbers in ascending order; the numbers are arranged randomly on each of four different sheets of paper. The time needed is recorded, averaged and age-adjusted for interpretation. Thus, a higher test value means a worse performance.

Trail Making Test (TMT)

The Trail Making Test A/B assesses neurocognitive performance combined with psychomotor ability in a setting in which the patient has to connect ascending numbers in the correct order.[16] The B version adds a further task: the patient has to alternate between numbers and letters. The time needed is recorded; longer total times indicate greater impairment. In our study we used TMT B, which provides information about visual search speed, scanning, speed of processing and mental flexibility, all of which are good indicators of executive function.

Digit Span Test

The Digit Span Test used in our study is a subtest of the Wechsler Memory Scale.[17] It consists of random number sequences presented orally to the patient. The single digits have to be repeated in the same order. If a sequence is repeated correctly, a single digit is added to the sequence. The examination is performed twice forwards and twice backwards. Each examination uses six numerical series. Test scoring is based on the number of series recalled correctly, with one point for each correct answer. The maximum score is 24, and higher scores indicate better cognitive performance. The Digit Span Test investigates the patient's short-term memory capacity (digit span forward) and verbal working memory (digit span backward).

Short Cognitive Performance Test for Assessing Deficits of Memory and Attention (SKT)

The SKT assesses memory and attention deficits in a clinical setting.[18] It consists of nine subtests, which assess immediate and delayed verbal memory as well as attention, measured as the speed of information processing. Test materials are available in five parallel forms to avoid learning effects. Raw scores from each subtest are converted into age-adjusted norm values and a total score. Owing to its subtest structure, memory and attention can be assessed separately. The scores range from 0 to 9 points for memory and from 0 to 18 points for attention. Hence, the total score ranges from 0 to 27 points. Notably, higher scores indicate more severe cognitive impairment, and the degree of impairment is graded according to the test result.

Color-Word Interference Test (FWIT)

The FWIT[19] consists of the following three parts: reading written color-words, naming the ink color of a printed line, and naming the ink color of a written color-word that names a different color, instead of reading the word itself. In each part, processing time and the number of errors are recorded. Thus, a higher test value means a worse performance. The test measures naming, alertness, and selectivity (interference). The recorded data allow the interpretation of executive function. The advantage of this test is that it is not subject to learning effects upon repetition.

Statistical Analysis

Outliers were identified using the Grubbs test. In this test, the value Xmax that shows the largest deviation from the sample mean is identified as an outlier and removed from the data set if the test statistic Z = abs(mean - Xmax)/SD is larger than a critical value that depends on the sample size. In our case, with 21 and 22 patients in the two groups, the critical values for p < 0.05 were 2.73 and 2.76, respectively. The procedure is iterated until no further outliers are detected. Categorical data were tested for differences between the two groups using the chi-square test. Continuous data were tested for deviations from the normal distribution using the Shapiro-Wilk test. The primary outcome parameter was the change in cognitive function, assessed as the difference between postoperative and preoperative test scores within a subject. The change from baseline values to post-anesthesia values within a group was tested for statistical significance using the paired t-test or the Wilcoxon test, as appropriate. The change in cognitive function was further tested for significant differences between the two groups using the unpaired t-test or the Mann-Whitney test, as appropriate. In order to account for a different distribution of males and females within the two groups, we also analyzed the change in cognitive function by ANOVA with the factors "gender" and "group". The level of significance was defined as p < 0.01.
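The iterative outlier procedure described above might look as follows in Python. This is a minimal sketch, not the routine used in the study: it assumes that SD is the sample standard deviation, and it takes the critical value as a caller-supplied function of the current sample size, because the text reports critical values only for n = 21 and n = 22. The names `grubbs_statistic` and `remove_outliers` are hypothetical.

```python
from statistics import mean, stdev


def grubbs_statistic(values):
    """Return (Z, index) for the value deviating most from the sample mean.

    Z = abs(mean - Xmax) / SD, with SD assumed to be the sample SD.
    """
    m = mean(values)
    sd = stdev(values)
    idx = max(range(len(values)), key=lambda i: abs(values[i] - m))
    if sd == 0:
        return 0.0, idx  # all values identical: nothing can be an outlier
    return abs(values[idx] - m) / sd, idx


def remove_outliers(values, critical_value):
    """Repeat the test until no value exceeds the critical value.

    critical_value is a function n -> critical Z for a sample of size n;
    it has to be supplied by the caller (for example from a table).
    """
    data = list(values)
    removed = []
    while len(data) > 2:
        z, idx = grubbs_statistic(data)
        if z <= critical_value(len(data)):
            break
        removed.append(data.pop(idx))
    return data, removed


# Made-up scores and a made-up constant critical value, for illustration only.
scores = [10, 11, 9, 10, 12, 10, 11, 9, 10, 50]
kept, dropped = remove_outliers(scores, lambda n: 2.0)
print(kept, dropped)
```

Passing the critical value as a function keeps the sketch honest about sample size: each time a value is removed, n shrinks and the threshold should be looked up again.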

The sample size was estimated based on published results for the DemTect.[13] A standard deviation of approximately 3 points was reported for the score. We assumed a difference of at least 4 points between the two groups to be clinically relevant. Therefore, we needed at least 19 subjects per group to detect such a difference with an α-error of 0.01 and a power of 0.9. As we expected a 25% dropout rate for the postoperative testing, we aimed for a study size of at least 25 patients per group.
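The order of magnitude of this estimate can be checked with the standard two-group formula. The sketch below is an illustration under stated assumptions and not the authors' calculation, which is not described in detail: it assumes a two-sided test, uses the normal approximation n = 2 (z(1 - α/2) + z(power))² (SD/Δ)² per group, and then adds the common small-sample correction term z(1 - α/2)²/4 to approximate a t-test based result. The function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_normal(sd, delta, alpha, power):
    """Per-group n for a two-sided, two-sample comparison (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return 2 * ((z_alpha + z_power) * sd / delta) ** 2


def n_per_group_corrected(sd, delta, alpha, power):
    """Same estimate plus the z^2/4 small-sample correction (an assumption here)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    return n_per_group_normal(sd, delta, alpha, power) + z_alpha ** 2 / 4


def inflate_for_dropout(n, dropout_rate):
    """Enrolment needed so that n subjects remain after the expected dropout."""
    return n / (1 - dropout_rate)


n_normal = n_per_group_normal(sd=3, delta=4, alpha=0.01, power=0.9)
n_corr = n_per_group_corrected(sd=3, delta=4, alpha=0.01, power=0.9)
print(ceil(n_normal), ceil(n_corr))          # 17 19
print(round(inflate_for_dropout(19, 0.25), 1))  # 25.3
```

With these inputs the plain normal approximation gives 17 per group and the corrected estimate gives 19, which is in line with the figure reported above; dividing 19 by 0.75 gives about 25.3 patients per group to be enrolled.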

Categorical data are reported as numbers; continuous data are reported as median and range unless stated otherwise. Statistical analysis was performed using Statistica software (Statistica Version 6, Tulsa, USA).