How Do Expectant Fathers Respond to Infant Cry?

Examining Brain and Behavioral Responses and the Moderating Role of Testosterone

Hannah Khoddam; Diane Goldenberg; Sarah A. Stoycos; Katelyn Taline Horton; Narcis Marshall; Sofia I. Cárdenas; Jonas Kaplan; Darby Saxbe


Soc Cogn Affect Neurosci. 2020;15(4):437-446. 



Participants were drawn from the larger longitudinal Hormones and Attachment Across the Transition To Childrearing (HATCH) study, which follows couples from mid-to-late pregnancy across the first year post-partum. Recruitment occurred via flyers, social media advertising and word of mouth. The current study uses data from a prenatal laboratory visit, conducted in mid-to-late pregnancy, and a separate MRI visit that typically occurred within 2 weeks of the in-lab visit. Couples were eligible if they were cohabiting, pregnant for the first time with a singleton fetus and free of psychotropic medication use. Exclusion criteria included any contraindications for MR scanning, the use of psychotropic medication and left-handedness.

Data for the current study were available for 41 expectant fathers who provided handgrip and self-report data. Of these, 34 fathers also provided neuroimaging data, and 32 of these fathers provided testosterone data. Mean age was 31.7 years old (s.d. = 4.25 years). The sample was highly educated with 80% of participants achieving a college degree or higher, and the population was ethnically diverse (36% White, 7% Black, 26% Hispanic or Latin, 24% Asian or Pacific Islander and 5% others).


Expectant fathers participated in one in-lab visit scheduled mid-to-late pregnancy (average weeks pregnant = 29 weeks, s.d. = 4.7 weeks, range = 18–38 weeks) and an MRI visit an average of 1.05 weeks later (s.d. = 1.04 weeks, range = 0–4 weeks). The majority of fathers (31 out of 34) completed the scan visit within 2 weeks of the in-lab visit. During the prenatal in-lab visit, each father provided three saliva samples for testosterone sampling over 90 minutes and completed the handgrip task after saliva collection, as described below. Additionally, after completing the handgrip task, fathers were asked to listen to the infant cry noise and complete the Emotional Reactions Questionnaire (ERQ) and trait rating task. During the MRI visit, fathers completed the same infant cry task as part of a larger MRI data collection protocol. All procedures were approved by University IRB, and all participants signed informed consent forms prior to participation.

Infant Cry Task. Using the same infant cry and control sounds as a previous study (Riem et al., 2014), the cry task included six 30-second auditory clips of infant crying interspersed with six 30-second clips of frequency-matched white noise, counterbalanced across participants, for a total of 12 trials. The stimuli were presented electronically using the E-Prime 3.0 software (Psychology Software Tools, Pittsburgh, PA) in a block design. The task was 6 minutes long and administered in one run.
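The block timing described above can be sketched in a few lines of Python. This is only an illustration, not the E-Prime script used in the study: the function name `block_onsets`, the strict cry/noise alternation and the absence of rest gaps between blocks are all assumptions, since the text states only that the clips were interspersed and counterbalanced across participants.

```python
from typing import List, Tuple

BLOCK_SECONDS = 30          # each clip lasts 30 seconds (from the text)
BLOCKS_PER_CONDITION = 6    # six cry clips and six white-noise clips


def block_onsets(first: str = "cry") -> List[Tuple[str, int, int]]:
    """Return (condition, onset_s, duration_s) for each of the 12 blocks.

    ASSUMPTION: blocks strictly alternate and follow one another with no gap.
    `first` selects which condition starts the run, a simple stand-in for
    counterbalancing across participants.
    """
    if first not in ("cry", "noise"):
        raise ValueError("first must be 'cry' or 'noise'")
    second = "noise" if first == "cry" else "cry"
    order = [first, second] * BLOCKS_PER_CONDITION
    return [(cond, i * BLOCK_SECONDS, BLOCK_SECONDS) for i, cond in enumerate(order)]


schedule = block_onsets("cry")
# End of the last block = its onset plus its duration.
total_seconds = schedule[-1][1] + schedule[-1][2]
```

Under these assumptions the run lasts 12 × 30 = 360 seconds, which matches the stated 6-minute task length.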

Handgrip Modulation. Established procedures for handgrip dynamometer data collection (Crouch et al., 2008; Bakermans-Kranenburg et al., 2012; Riem et al., 2012) were followed. Prior to playing the infant cry task, a research assistant demonstrated correct hand placement on the dynamometer and modeled the handgrip task (with their dominant hand). Participants watched a line graph indicating grip-strength on the computer screen. The participant was asked to perform a full-strength squeeze and a half-strength squeeze while watching the line graph. The RA gave verbal feedback on each trial to demonstrate an accurate full-strength and half-strength grip. Once the participant performed this task accurately for three consecutive trials, the participant performed the infant cry task. During data collection, participants were prompted to do a full-strength squeeze, followed 2 seconds later by a half-strength squeeze one time per infant cry and white noise trial. Participants averaged 30 trials (of one full-strength grip and one half-strength grip) during training, with a range of 7–50 trials to master the task.

Testosterone. Saliva samples were collected in CryoSafe collection tubes using passive drool and then stored at −80°C before shipment on dry ice to the Technical University of Dresden (Kirschbaum, PI) to be assayed. Fathers were instructed not to eat, drink anything besides water or chew gum within an hour before collection. Timing of collection was held constant across participants to minimize variability. Testosterone samples were taken during the first 90 minutes of the prenatal laboratory visit and were not concurrent with the handgrip task described above, which occurred after all saliva samples were collected. Testosterone levels were averaged across the three samples.

Neuroimaging Protocol. Imaging was performed on a Siemens 3 Tesla MAGNETOM Prisma scanner using a 20-channel matrix head coil. Functional images were collected using a T2*-weighted echo planar (EPI) sequence (32 transversal slices; TR = 2000 ms; TE = 25 ms; flip angle = 90°) with a voxel resolution of 3 mm × 3 mm × 2.5 mm. Anatomical images were acquired using a magnetization-prepared rapid acquisition gradient echo (MPRAGE) sequence (TR = 2530 ms; TE = 3.13 ms; flip angle = 10°; isotropic voxel resolution = 1 mm³). Task sounds were transmitted using a Siemens V14 sound headphone system.


Emotional Reactions Questionnaire. Following the in-lab cry task, participants completed the ERQ (Milner et al., 1995) to indicate how well each adjective describes their present mood (1, not at all, to 7, extremely well). The negative emotion (bothered, irritated, annoyed and hostile) subscales were averaged to determine negative emotions after listening to infant cry (alpha = 0.95, Milner et al., 1995).

Trait Rating Task. Also following the in-lab infant cry task, each father was asked to rate the infant on nine traits (i.e. hostile, negative, difficult, friendly, cooperative, sweet, content, lively, attached). Following the procedures of previously conducted studies (Crouch et al., 2008; Bakermans-Kranenburg et al., 2012), the trait ratings were made on a 10-point scale (ranging from 1, not at all, to 10, extremely likely). Positive traits were also included to increase validity and decrease bias toward negative traits. Trait ratings were averaged across the three negative traits (hostile, negative and difficult) to obtain a composite negative trait rating.


Hypothesis 1. Neural responses to infant cry were analyzed using FEAT (FMRI Expert Analysis Tool) of FSL (FMRIB's Software Library; Smith et al., 2004). First, pre-processing consisted of motion correction using MCFLIRT, non-brain removal, spatial smoothing (5 mm FWHM Gaussian kernel) and registration to T1-weighted images using FSL FLIRT. Then, functional activation was examined with general linear model analyses. To identify regions involved in the perception of infant crying, contrasts of parameter estimates (COPEs) for cry > white noise and white noise > cry were assessed; these contrasts tested the primary hypotheses regarding response to infant cry vs a frequency-matched white noise. First-level COPEs served as inputs to higher-level group analyses conducted using FLAME to model random-effect components of mixed-effect variance. Images were thresholded with clusters determined by Z > 2.3 and a cluster-corrected significance threshold of P < 0.05 (Worsley et al., 2002) to identify regions that were activated during cry vs white noise across the six blocks for each sound. Father's age and weeks pregnant were mean-centered and included as confound regressors in all models. Models were run with and without covariates and yielded similar results. To visualize results, spherical ROIs (r = 5 mm) centered on activation peaks were used to extract signal change for each condition.

Additionally, given our a priori hypotheses focusing on the amygdala, ROI analyses of the bilateral amygdala were conducted. Parameter estimate values were converted to percentage signal change values via scaling of the PE or COPE values by (100*) the peak-peak height of the regressor (or effective regressor in the case of COPEs) and then by dividing by the mean over time of the filtered functional data. A report was generated using featquery with statistics derived from each image's values within the mask. Percent signal change was extracted from bilateral amygdala using anatomically defined masks created using the Harvard-Oxford subcortical atlas.
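The conversion described above amounts to simple arithmetic, which the following Python sketch spells out. It is not the featquery implementation; the function names `peak_to_peak` and `percent_signal_change` and the example numbers are hypothetical, chosen only to show the scaling.

```python
from typing import Sequence


def peak_to_peak(regressor: Sequence[float]) -> float:
    """Peak-to-peak height of a model regressor (max minus min)."""
    return max(regressor) - min(regressor)


def percent_signal_change(pe: float, regressor_height: float, mean_func: float) -> float:
    """Scale a PE/COPE value to percent signal change.

    pe               : parameter estimate (or COPE) value
    regressor_height : peak-to-peak height of the (effective) regressor
    mean_func        : mean over time of the filtered functional data
    """
    if mean_func == 0:
        raise ValueError("mean functional signal must be non-zero")
    return 100.0 * pe * regressor_height / mean_func


# Hypothetical values for illustration only (not data from the study).
height = peak_to_peak([0.0, 0.2, 1.0, 0.4])
psc = percent_signal_change(pe=50.0, regressor_height=height, mean_func=10000.0)
```

With these made-up inputs, a PE of 50 against a mean signal of 10 000 and a regressor height of 1 corresponds to a 0.5% signal change.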

Hypothesis 2. Consistent with previous studies (Crouch et al., 2008; Bakermans-Kranenburg et al., 2012; Riem et al., 2012), handgrip modulation was calculated by dividing the half-squeeze intensity by the maximum squeeze intensity per block, and an average ratio of half strength/full strength squeezes was calculated for infant cry and white noise per person.
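As a minimal sketch of this calculation, the Python below computes the half/full ratio per block and then averages the ratios within each sound condition for one participant. The function name `modulation_ratios`, the tuple layout and the example grip values are assumptions made for illustration; they are not the study's data or code.

```python
from statistics import mean
from typing import Dict, List, Tuple

# One record per block: (condition, full_strength, half_strength).
Block = Tuple[str, float, float]


def modulation_ratios(blocks: List[Block]) -> Dict[str, float]:
    """Average half-strength / full-strength ratio per condition."""
    per_condition: Dict[str, List[float]] = {}
    for condition, full, half in blocks:
        if full <= 0:
            raise ValueError("full-strength squeeze must be positive")
        per_condition.setdefault(condition, []).append(half / full)
    return {cond: mean(ratios) for cond, ratios in per_condition.items()}


# Hypothetical dynamometer readings (arbitrary units) for one participant.
example_blocks = [
    ("cry", 40.0, 22.0),
    ("cry", 40.0, 18.0),
    ("noise", 40.0, 20.0),
    ("noise", 50.0, 25.0),
]
ratios = modulation_ratios(example_blocks)
```

A ratio of 0.5 corresponds to a half-strength squeeze that is exactly half as strong as the full-strength squeeze; larger values mean the "half" squeeze was harder than intended.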

Hypothesis 3. Demeaned self-report ratings of infant cry and negative emotions were added (separately) as regressors into the general linear model described above. The first-level contrast images of cry > white noise and white noise > cry were submitted to second-level whole-brain analysis to determine differences in activation depending on interpretations of the infant as more negative and self-reported negative emotions after infant cry. Both positive and negative contrast weights were tested for each continuous predictor to determine whether it is related to increased or decreased neural response. Lastly, multivariate regression analyses were used to test the relationship between interpretations of the infant as more negative, and negative emotions during infant cry and signal change in the amygdala ROI.
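Demeaning a continuous predictor simply subtracts the sample mean from each participant's value, so that the regressor sums to zero across the group. A short Python sketch follows; the function name `demean` and the example scores are hypothetical.

```python
from typing import List, Sequence


def demean(values: Sequence[float]) -> List[float]:
    """Subtract the sample mean from every value."""
    if not values:
        raise ValueError("need at least one value")
    sample_mean = sum(values) / len(values)
    return [v - sample_mean for v in values]


# Hypothetical self-report scores for four participants.
centered = demean([28.0, 31.0, 34.0, 35.0])
```

The centered values keep the same spacing between participants as the raw scores; only the zero point moves to the group mean.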

Hypothesis 4. First, testosterone was added as a regressor to the FSL models testing the relationship between prenatal T and neural activation to infant cry as described above. Next, multivariate regressions were run to determine the relationship between (i) testosterone and negative infant interpretations and (ii) testosterone and self-reported negative emotions while listening to infant cry. All analyses included gestational age of the infant and father's age as covariates.

We adjusted for multiple comparisons using the Holm–Bonferroni method (Holm, 1979), in which the alpha value is adjusted such that the lowest P-value (i = 1) is expected to fall below α/k (where k is the number of analyses) and the higher P-values are compared against progressively less restrictive thresholds, α/(k − i + 1). Therefore, for six planned analyses, we would require at least one model be significant at P = 0.008 (0.05/6), one model significant at 0.01 (0.05/5), one at 0.013 (0.05/4), one at 0.017 (0.05/3) and one at 0.025 (0.05/2).
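The step-down procedure can be illustrated with a short Python sketch. It follows the standard Holm rule, in which the i-th smallest of k P-values is compared with alpha / (k - i + 1) and testing stops at the first failure. The function names and the example P-values are hypothetical and are not results from the study.

```python
from typing import List, Sequence


def holm_thresholds(alpha: float, k: int) -> List[float]:
    """Threshold for the i-th smallest P-value, i = 1..k."""
    return [alpha / (k - i + 1) for i in range(1, k + 1)]


def holm_reject(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Return, in the original order, whether each hypothesis is rejected."""
    k = len(p_values)
    order = sorted(range(k), key=lambda j: p_values[j])  # indices by ascending P
    reject = [False] * k
    for rank, j in enumerate(order, start=1):
        if p_values[j] <= alpha / (k - rank + 1):
            reject[j] = True
        else:
            break  # step-down: once one test fails, all larger P-values fail
    return reject


thresholds = holm_thresholds(0.05, 6)
# Made-up P-values for six analyses, for illustration only.
decisions = holm_reject([0.001, 0.04, 0.02, 0.5, 0.009, 0.012])
```

For six analyses the thresholds start at 0.05/6 and relax step by step to 0.05/1 for the largest P-value.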