1 Introduction

Understanding pre-instrumental variability in the El Niño Southern Oscillation (ENSO) is a key challenge for palaeoclimatology. Instrumental records are too short to fully capture the natural range of ENSO variability and to assess ENSO behaviour in the pre-industrial era (Braganza et al. 2009). Therefore, in order to understand the natural variability of ENSO, proxy records and historical documents that extend beyond the instrumental period need to be used (Gergis et al. 2006). A companion paper (Barrett et al. 2016, hereafter referred to as B16) presented new ENSO reconstructions which utilise the wind data found within ships’ logbooks during the period 1815/16–1853/54. The data from ships’ logbooks provide a record with low dating uncertainty and are based on direct, in-situ observations of the weather, thus providing a unique reconstruction from a period when routine observations of the weather were uncommon over land. Four historical reconstructions of the boreal winter (December, January, February; DJF) Southern Oscillation Index (SOI) were presented in B16 using methods of principal component regression (PCR) and composite-plus-scale (CPS). The logbook-based reconstructions were found to have good agreement with a Jakarta-based SOI, which uses rain-day counts, from 1830 to 1850 (Können et al. 1998), suggesting a common signal from these early observation-based records. The reconstructions from ships’ logbooks are compared here to previous proxy and documentary-based reconstructions of ENSO, in order to add to the understanding of ENSO variability during the early to mid-nineteenth century.

Proxy data are natural archives of the climate and environment of the past. They are indirect measures of past climatic changes and have been used extensively in the reconstruction of ENSO (Gergis et al. 2006). Here we focus on ENSO reconstructions that cover the historical period and which coincide with the reconstruction from ships’ logbooks. Recent ENSO reconstructions take a multi-proxy approach, using a range of data sources including corals, tree rings, speleothems and ice-cores (McGregor et al. 2010; Emile-Geay et al. 2013a). The multi-proxy approach captures ENSO signals from different regions and different sources. This helps to address the problem of the assumption of stationarity in the relationships between ENSO and the climate signal in the regions in which the proxies are located. This is useful as no two ENSO events, or their resulting teleconnections, are the same so combining the different signals from multiple regions provides the most robust ENSO signal (Braganza et al. 2009; Capotondi et al. 2015). The differences in the spatial distribution of proxy data used in teleconnection regions can result in differences in the ENSO reconstructions. Therefore, it is important to consider the locations of the proxies from which reconstructions are made, and the implications of these differences.

Additional indications of ENSO behaviour in the pre-instrumental era can be obtained from documentary records. Documentary records are the focus of studies within historical climatology and include sources such as weather journals, governmental and military records, and documents from religious institutions. These historical documents are ‘human’ climate proxies and provide data from sources that were often not originally intended for use in climatology (Nash and Adamson 2014). The work of Quinn et al. (1987) played a key role in promoting the use of historical documents for El Niño reconstruction. Quinn et al. (1987) created an El Niño events-based chronology using publications from the west coast of northern South America and the neighbouring parts of the Pacific Ocean, extending back to AD 1525. This record provides a benchmark in historical ENSO reconstruction studies (Gergis and Fowler 2009).

The data from ships’ logbooks are an alternative source of documentary data, with weather observations taken daily on board ships from across the world’s oceans. Although the use of data from ships’ logbooks has increased in the field of historical climatology recently, there has only been one prior use of this data source which directly focused on ENSO reconstruction before that presented in B16. Jones and Salmon (2005) developed a preliminary reconstruction of the SOI using ships’ logbook data, but were hindered by poor data availability. They found weak correlations with previous ENSO reconstructions over the period 1750–1850. The logbook-based SOI reconstructions presented in B16 built upon Jones and Salmon’s (2005) initial reconstruction by using additional data, concentrating on regions with strong ENSO signals and developing their methodologies further. Therefore, comparing the B16 logbook-based reconstructions with their work, and with the proxy studies against which they compared their SOI index, allows the improvement resulting from these changes to the logbook-based reconstruction methods to be assessed.

A recent increase in the understanding of ENSO event diversity raises questions over the differences in the climatic response to the range of ENSO flavours (Capotondi et al. 2015). Two main spatial patterns of sea surface temperature (SST) anomalies across the Pacific can classify El Niño events into two categories: Eastern Pacific (EP), also referred to as canonical or classic events, and Central Pacific (CP), also referred to as El Niño Modoki (Ashok et al. 2007), warm pool (Kug et al. 2009), or date-line El Niño (Larkin and Harrison 2005). EP events are defined by the SST pattern typically associated with El Niño, with the strongest positive anomalies in the eastern equatorial Pacific. CP events have a zonal tripole pattern of SST anomalies, with the strongest positive anomalies in the central Pacific, and negative anomalies in the western and eastern Pacific (Ashok et al. 2007). Recent decades have seen an increase in the occurrence and strength of CP El Niño events, so it is important to understand the implications of these events (Lee and McPhaden 2010). Recent studies have identified some key differences in the teleconnections of EP and CP events, for example in the spatial patterns of temperature and precipitation anomalies over the USA, variables that influence ENSO reconstructions from tree rings (Infanti and Kirtman 2016). The relationship between the western North Pacific monsoon and ENSO is found to be weaker for EP events than CP events (Weng et al. 2011), and Australian rainfall and Indian monsoon rainfall are more sensitive to CP events than EP events (Wang and Hendon 2007; Kumar et al. 2006). Precipitation anomalies in South America are strong during EP events but not during CP events (Andreoli et al. 2016). Therefore, EP and CP El Niño have distinct climate impacts that leave different signals in the palaeoclimate proxies, a distinction that early ENSO reconstructions could not have taken into account (Wang et al. 2012). The influence of the two types of El Niño on the variables to which proxies are sensitive should be taken into account when using proxy records to identify ENSO behaviour (Karamperidou et al. 2015). Nevertheless, there is still poor consistency amongst previous ENSO reconstructions (Emile-Geay et al. 2013a). Here, the effects of CP and EP events and their teleconnections are explored as a potential source of these inconsistencies.

This paper is organised as follows: Sect. 2 introduces the range of proxy and documentary reconstructions used for comparison of ENSO in this paper. Section 3 describes the methods of comparison used. Section 4 presents the results of the ENSO reconstruction comparisons, with a focus on the continuous ENSO indices from proxy sources, followed by events-based chronologies from both proxy and documentary evidence. Section 5 summarises the key findings and areas of uncertainty, while Sect. 6 concludes this study and indicates areas of further research.

2 Data

2.1 Logbook DJF SOI reconstructions

Four reconstructions of the DJF SOI using wind observations from ships’ logbooks were presented in B16. These extend from 1815/16 to 1853/54, a period dictated by the availability of a sufficient number of digitised logbook observations. The wind force and direction observations were taken from the ships’ logbooks and converted into seasonal zonal wind anomalies at a 7.5° by 8° latitude-longitude gridded resolution. Nine predictor grid boxes from a number of regions were selected on the basis of a statistically significant (p < 0.1) correlation between ERA-Interim zonal wind and SOI during the modern period 1979–2013, and suitable logbook data availability. These grid boxes are located in the Indian Ocean, around the southern tip of Africa and in the Atlantic Ocean. Firstly, PCR was carried out, followed by a simpler method of CPS. Two unadjusted reconstructions were obtained and labelled PCR A and CPS A, with variant A referring to reconstructions which were based on a climatological average calculated over the entire reconstruction period, 1815–1854. These methods were then repeated, but with adjustments made to the climatological means to take into account the shift from dominance of English East India Company (EEIC) observations in the first part of the record to Dutch observations in the latter half. A statistically significant difference in the wind force observations between vessels from these two countries was found in B16, and the adjustment in climatological means attempts to address this bias. Therefore, a separate climatological mean was applied to the earlier period (1815–1833) and the later period (1834–1853), and the two time series were combined. The resulting reconstructions are labelled variant B (PCR B and CPS B). B16 suggested that the CPS B method performs best, based on higher RE skill scores from the CPS method and the adjustment in climatology to reduce the bias. However, all four are investigated here because, although we strongly suspect an artificial change in the mean, we cannot be completely certain that this change is not real.
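The difference between the variant A and variant B climatologies can be illustrated with a minimal sketch. This is not the B16 code: the function names, the `split_year` parameter and the example values are hypothetical, and the sketch only shows how a single climatological mean differs from two period-specific means applied to one series.

```python
from statistics import mean


def anomalies_single_climatology(values):
    """Variant A style: anomalies relative to one mean over the whole record."""
    clim = mean(values)
    return [v - clim for v in values]


def anomalies_split_climatology(years, values, split_year):
    """Variant B style: separate means before and from split_year.

    Each value is expressed as an anomaly from the mean of its own period,
    and the two anomaly series are then joined in time order.
    """
    early = [v for y, v in zip(years, values) if y < split_year]
    late = [v for y, v in zip(years, values) if y >= split_year]
    early_clim, late_clim = mean(early), mean(late)
    return [
        v - (early_clim if y < split_year else late_clim)
        for y, v in zip(years, values)
    ]


# Hypothetical wind values with an artificial step change after the third year.
years = [1815, 1816, 1817, 1818, 1819, 1820]
wind = [1.0, 2.0, 3.0, 5.0, 6.0, 7.0]
variant_a = anomalies_single_climatology(wind)
variant_b = anomalies_split_climatology(years, wind, split_year=1818)
```

With the periods given in the text, `split_year` would be 1834. In the hypothetical example the step change dominates the variant A anomalies but is removed in variant B, which is the intended effect of the adjustment if the shift is indeed artificial.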

Over the modern era, 1979–2013, ERA-Interim data are used to test the two reconstruction methods. It was found, in B16, that using the full ERA-Interim dataset over this period, strong positive correlation coefficients with the instrumental SOI are obtained for both PCR and CPS. Additionally, in order to obtain a dataset which best represents the data availability of the logbook SOI reconstructions during the modern era, ERA-Interim is sub-sampled at the availability similar to that from the logbook era in the predictor grid boxes. This was carried out in B16 in order to assess the sensitivity of the reconstruction methods to limited data availability. It was found that even with a limited number of observations a reasonable level of reconstruction skill can be obtained using the PCR and CPS methods. Therefore, a sub-sampled PCR and CPS time series which replicates this data availability in the modern period, 1979–2013, is also used in comparison to the instrumental record of ENSO. The sampling replicates the average number of observations per predictor grid box over the logbook period, 1815–1853 (Table S1) for one random iteration and the mean of five iterations as in Küttel et al. (2010). This provides the best possible replication and representation of the logbook-based reconstruction during the modern period and therefore allows the most realistic comparison between analysis of the modern period and that of the logbook period.

2.2 Instrumental SOI and modern ENSO diversity chronologies

A seasonal SOI was calculated from monthly values of the instrumental pressure difference between Tahiti and Darwin, which is available from 1876 to present from the Australian Bureau of Meteorology (BoM 2015). The SOI for the boreal winter (DJF) season is used as a key indicator of ENSO behaviour. As the SOI does not provide a distinction between EP and CP events, methods which address the difference in SST anomalies in regions of the equatorial Pacific are used to distinguish between EP and CP events (Ashok et al. 2007; Kao and Yu 2009; Yeh et al. 2009). There exists a multitude of methods, indices, datasets and time periods over which classification of EP and CP events has been carried out, with no consensus over which produces the best chronology (Yu et al. 2012; Pascolini-Campbell et al. 2015). Here, we use a simple approach defined by Kug et al. (2009), which provides one of the longest records of El Niño type, extending back to 1870. The NINO method uses the NINO3 (5°N–5°S 150°W–120°W) and NINO4 (5°N–5°S 160°E–150°W) SST indices, which are commonly used to identify Eastern Pacific and Central Pacific El Niño events (Kug et al. 2009; Yeh et al. 2009). EP (CP) events are defined by the normalised NINO3 (NINO4) SSTA being greater than one and being greater than the NINO4 (NINO3) (Ham and Kug 2012). The NINO3 and NINO4 indices were obtained from the US National Oceanic and Atmospheric Administration (NOAA 2017a, b), and were calculated from the Hadley Centre Sea Ice and Sea Surface Temperature dataset (HadISST1). However, it must be noted that using alternative ENSO diversity chronologies produces different results due to the variations in methods used by the numerous classification schemes, and in the datasets used (Yu et al. 2012). This is a key area in which further consolidation is needed in order to provide a more definite record of the diversity of past ENSO events.
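The NINO classification rule described above can be sketched as follows. This is an illustrative reading of the rule rather than the code used by Kug et al. (2009) or in this study: the function names are hypothetical, and normalising by the population standard deviation of the supplied series is an assumption, since the normalisation period is not stated here.

```python
from statistics import mean, pstdev


def normalise(series):
    """Standardise a series to zero mean and unit standard deviation.

    Assumption: the population standard deviation of the supplied
    series is used; the source does not specify the reference period.
    """
    m, s = mean(series), pstdev(series)
    return [(x - m) / s for x in series]


def classify_pair(nino3_norm, nino4_norm, threshold=1.0):
    """Label one season as 'EP', 'CP' or None from normalised indices.

    EP: normalised NINO3 exceeds the threshold and exceeds NINO4.
    CP: normalised NINO4 exceeds the threshold and exceeds NINO3.
    """
    if nino3_norm > threshold and nino3_norm > nino4_norm:
        return "EP"
    if nino4_norm > threshold and nino4_norm > nino3_norm:
        return "CP"
    return None


def classify_series(nino3, nino4, threshold=1.0):
    """Normalise both SST anomaly series, then classify each season."""
    pairs = zip(normalise(nino3), normalise(nino4))
    return [classify_pair(a, b, threshold) for a, b in pairs]


# Hypothetical, already-normalised values for four seasons.
labels = [
    classify_pair(a, b)
    for a, b in [(1.5, 0.9), (0.8, 1.4), (0.2, 0.1), (1.2, 1.2)]
]
```

In the hypothetical example the last season exceeds the threshold in both indices but neither index is larger, so no label is assigned. How such ties are treated in practice is not specified in the text.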

2.3 Multi-proxy ENSO reconstructions

A range of multi-proxy ENSO reconstructions exist from previous studies, which use a number of different sources and methodologies. There is no consensus on which multi-proxy ENSO reconstruction provides the most realistic record of past ENSO behaviour. Proxy data sources which have been used to reconstruct ENSO include tree-rings, ice cores, corals and multi-proxy databases (Gergis and Fowler 2009; McGregor et al. 2010; Emile-Geay et al. 2013a). Methods of proxy-based reconstruction have a range of uncertainties and limitations associated with them, such as dating errors, local environmental influences and non-stationary teleconnections. Therefore, comparison and calibration in multi-proxy analysis is suggested as the best way to gain a comprehensive record of climatic changes from the range of sources (Gergis and Fowler 2009).

Stahle et al. (1998) used tree rings from southern US and Mexico, a region with strongly established teleconnections to ENSO, along with one teak tree from Java to reconstruct a boreal winter (DJF) Southern Oscillation Index extending from 1706 to 1997. This ENSO reconstruction is directly comparable to the logbook-based reconstructions as both produce a time series of DJF SOI, and PCR methods were used in both studies. This was one of the reconstructions used by Jones and Salmon (2005) for comparison with their SOI reconstruction. Mann et al. (2000a), also used by Jones and Salmon (2005) for ENSO comparison, present a reconstruction of Oct-March Niño 3 SST, 1408–1978, compiled from 112 annually resolved proxies. The use of such annually resolved proxies provides good comparability to the annual logbook-based SOI reconstruction. In both cases, use of proxies from tropical and extra-tropical regions ensures multiple ENSO teleconnections are used to inform the reconstruction.

McGregor et al. (2010) used ten commonly used ENSO reconstructions, including Stahle et al. (1998) and Mann et al. (2000a). This combined ENSO reconstruction resulted from the use of 155 proxies, including tree rings, corals and ice cores from across the Pacific and from key teleconnection regions such as the US. The inclusion of the data used by the reconstructions by Stahle et al. (1998) and Mann et al. (2000a) within McGregor et al.’s (2010) proxy index means that an additional degree of similarity will be expected when comparing these three reconstructions. The locations of these three proxy networks can be seen in Supplementary Information, Fig. S1.

Wilson et al. (2010) provide a normalised NINO3.4 reconstruction which uses proxy data from teleconnection (TEL) regions. Annually resolved proxies were used from teleconnection regions within the tropics. Eleven coral proxies and one tropical ice core record were used for their TEL reconstruction, which spans from 1540 to 1998 (Wilson et al. 2010). A smaller number of proxies is therefore used compared to McGregor et al. (2010). However, the focus on tropical teleconnections only suggests that there may be less noise in the ENSO signals from the proxies used in this reconstruction. The proxies used are located mostly in the western Pacific and Maritime Continent, with a few in the central Pacific (Fig. S2).

Emile-Geay et al. (2013a) analysed 57 proxies from the centre of action of ENSO and a number of teleconnection regions including the Maritime Continent, Southern Africa and Mexico/Southern USA. A distinguishing feature of this reconstruction was its focus on low-frequency ENSO variability and the inclusion of proxies with a temporal resolution of 5 years or less and with dating errors of up to ±5 years. Dating errors would be expected to be smaller in annually resolved proxies, which may therefore contain less uncertainty than this low-frequency reconstruction (McGregor et al. 2010). Another criterion for their proxy selection was for the records to be equatorward of 35°, which differs from the other multi-proxy reconstructions, which use some records from higher latitudes. They present three different Niño 3.4 SST (DJF) reconstructions, based on the first principal component of three historical SST analyses. Of the initial 57 proxies, reduced networks are used for each of the three reconstructions based on the existence of a significant correlation between the proxy record and Niño 3.4 from each historical SST analysis (Emile-Geay et al. 2013a). For the purpose of this study, these three reconstructions will be referred to as EG13A, EG13B and EG13C, relating to those produced from ERSSTv3 (Smith et al. 2008), HadSST2i (Rayner et al. 2006) and Kaplan SST (Kaplan et al. 1997) respectively. The screening process resulted in 35 (EG13A), 36 (EG13B), and 37 (EG13C) records used in each case, as seen in Fig. S3, with the networks differing by at most four records.

Li et al. (2013) also made use of a large network of proxy data sources, producing a seven-century-long reconstruction of ENSO from 2222 tropical and sub-tropical tree rings. They reconstructed the mean Niño 3.4 SST anomaly for November, December, January (NDJ). The data are clustered in seven main regions, as seen in Fig. S4: Central Asia [Monsoon Asia Drought Atlas (MADA) PC1], Maritime Continent (MADA PC2), Southwest North America [North American Drought Atlas (NADA) PC1], Pacific Northwest/Texas-Mexico, Northern New Zealand, South American Altiplano and West-central Argentina.

Table 1 summarises the eight multi-proxy reconstructions used in this paper. Abbreviations for each of the reconstructions are shown in Table 1 and are used hereafter to refer to these reconstructions. Datasets from these ENSO reconstructions are available from the US National Oceanic and Atmospheric Administration’s paleoclimatology archive (https://www.ncdc.noaa.gov/paleo-search). The period covered by all of these proxy data sources spans from 1706 to 1977, thus covering the logbook-based reconstruction period (1815–1854), and a period of overlap with the instrumental SOI (1876–1977).

Table 1 Proxy based ENSO reconstructions used in this work

The proxies used within these studies are sensitive to different climatic variables and are located in a range of teleconnection regions. Key variables driving proxy records include precipitation (tree rings, ice cores and corals), air temperature (tree rings and ice cores) and SST (Coral δ18O) (Mann et al. 2000a). The differences in these teleconnections during EP and CP events could be influencing the ENSO reconstructions. Hereid et al. (2013) found corals used in a number of ENSO reconstructions originating from different regions have better or worse skill at reconstructing either El Niño or La Niña. Following on from their proxy ENSO skill assessment, the skill of the various reconstructions at capturing EP and CP events is assessed in Sect. 4.

2.4 Documentary ENSO reconstructions

A number of documentary-based ENSO reconstructions have been published over the past few decades, making use of archived documents that provide indications of weather conditions typical of ENSO events. Quinn et al. (1987) presented the first of these, with an extensive analysis of South American records extending back to AD 1525. This focused on El Niño events only and used key indicators such as heavy rainfall, flooding, the effects of winds and currents on ships, and changes in the productivity of coastal fisheries to infer El Niño events. Cross checks between sources were carried out to help establish the reliability of the source.

However, since its publication, Quinn’s El Niño chronology has been subject to revisions and the reliability of some of the sources used has been placed under scrutiny, with reliability found to reduce prior to 1800 (Thompson 1993). Ortlieb (2000) examined the chronology, questioning some of the sources used and the teleconnections on which they were based. For example, El Niño events that had been identified based on flooding events on the Rimac River, Western Peru, were removed as there is no evidence within the instrumental record of an association between floods along this river and El Niño events. Events relating to abundant rainfall in southeastern Peru were also removed as they are now thought to be indicative of La Niña events (Nash and Adamson 2014). Overall, 42 of the 86 El Niño events from Quinn’s original chronology were questioned, 25 were suggested for removal and seven previously unrecognised events were proposed as additions (Ortlieb 2000). This highlights the problems and ambiguities associated with historical reconstructions of ENSO using documentary sources. To address this, Garcia-Herrera et al. (2008) carried out documentary analysis back to 1550 using only primary documentary sources from a local region of northern Peru with previously established associations between El Niño events and heavy rainfall, with impacts such as large river discharge.

These three documentary-based El Niño reconstructions used for comparison to the logbook-based SOI reconstruction are listed in Table 2. These provide events-based chronologies of El Niño from 1525 onwards. A disadvantage is the focus only on El Niño events, with La Niña events not identified in these studies. Another limitation of these three historical reconstructions is their focus on only one main region, Peru and surrounding regions in South America. Collecting documentary evidence from this region is justified as it is a key region which is impacted by ENSO. However, previous studies have suggested that a broader spatial coverage helps to fully capture ENSO events and diversity within reconstructions (Braganza et al. 2009). A number of the multi-proxy reconstructions have very few records from South America, so the core data regions vary between reconstructions.

Table 2 Documentary-based ENSO chronologies used in this work

Our logbook-based reconstructions use data from the Indian Ocean, south of Africa and the Atlantic. Therefore, they are more representative of ENSO signals from different regions than the previous documentary-based reconstructions. Despite these differences, documentary sources often have fewer dating uncertainties associated with them than proxy records, so they provide a good source of comparison for the logbook-based reconstructions, whose dating errors are also minimal. Finally, Gergis and Fowler (2009), GF09, present a chronology of ENSO events since AD 1525, the same period as the Quinn chronology, which looked at both El Niño and La Niña events. They used a range of complementary proxy data and documentary evidence. This record provides an events-based chronology similar to that available from the documentary-based records, and was therefore a useful addition to the comparison of events-based chronologies. Events classified as weak events by GF09 were not used here due to conflicting signals which resulted in a given year being classified as both El Niño and La Niña. For all other documentary chronologies, the events, regardless of strength, were used for this comparison study.

A number of previous ENSO-multi-proxy comparison studies provide a comprehensive overview of this topic, including Gergis et al. (2006). This current paper builds upon this work, and provides an up-to-date ENSO comparison using a number of key multi-proxy reconstructions published over the past few years (MCG10, W10, EG13, LI13) as well as some which were included in previous comparisons. These provide the most up-to-date source of comparison for the logbook-based reconstructions presented in B16. This paper also investigates the spatial patterns of the ENSO teleconnections of EP and CP events along with the location of proxy records in order to investigate any potential bias towards the different types of ENSO. This is the first time that a number of recent multi-proxy ENSO reconstructions have been investigated for their EP and CP reconstruction skill.

3 Methods

Comparisons were carried out between the different ENSO reconstructions over a number of periods in which data overlap (Fig. 1). Firstly, the fitting period 1979–2013 was used to assess the event-capture skill of the two reconstruction methods applied to the logbook data, PCR and CPS. This was tested with a full dataset and sub-sampled in order to provide the most realistic comparison to the logbook data. The sub-sampled dataset is a reconstructed SOI calculated using a sub-sample of the ERA-Interim data which replicates the average number of observations per DJF season in the period of the logbook-based reconstruction (Küttel et al. 2010). This is tested using one iteration and the mean of five iterations, as in Küttel et al. (2010). Secondly, a period in which the multi-proxy reconstructions overlap with the instrumental DJF SOI (1876–1977) was assessed. Analysis using longer twentieth century reanalysis for a logbook-type reconstruction over this period is not carried out due to additional sources of uncertainty when using twentieth century reanalyses which span further back in time and are informed from a smaller number of observations (Ferguson and Villarini 2014; Poli et al. 2016; Bett et al. 2017). The documentary chronologies have a more limited period of overlap with the instrumental SOI, with ORT00 and GH08 only providing an ENSO chronology up to 1900. As a result, only the QUINN87 record, which extends up to 1987, was compared to the instrumental SOI and events chronologies. The events-based chronology provided by GF09 was also used for comparison over this period. Finally, the logbook-based reconstruction period, 1815/6–1853/4 was used for comparison between the new logbook-based SOI reconstructions of B16 and the previously published ENSO reconstructions during this time period.

Fig. 1 Time periods over which comparisons are made between various ENSO reconstructions and records

The multi-proxy reconstructions used in this comparison are based on a range of ENSO indices. ST98 provide a DJF SOI reconstruction, directly comparable to that obtained from the logbook-based reconstructions, whereas MANN00 reconstructs Niño 3 SST over Oct–March and W10, EG13 and LI13 reconstruct Niño 3.4 SST, Nov–Dec, DJF and NDJ seasons respectively. Finally, MCG10 present a unified ENSO proxy from July–June. Although these all cover the majority of the DJF period analysed by the logbook-based reconstructions, some discrepancies may occur as a result of slightly different seasons used for reconstruction, most likely with W10 and MCG10 which cover the entire year. To compare the various reconstructions Pearson’s correlation coefficients were calculated during the overlap period (1876–1977) and the logbook-based reconstruction period (1815–1854). SST-based reconstructions were inverted to take into account the negative correlation between SOI and SST-based Niño indices during the instrumental record.
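As a minimal sketch of the comparison step described above, the following computes Pearson’s correlation coefficient between an SOI series and another index over a common period, inverting the index first when it is SST-based. The function names, the `sst_based` flag and the example values are hypothetical and are not taken from the analysis code of this study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def compare_to_soi(soi, index, sst_based):
    """Correlate an index with the SOI over a common period.

    SST-based Nino indices are multiplied by -1 first, because warm
    (El Nino) SST anomalies correspond to negative SOI values.
    """
    series = [-v for v in index] if sst_based else list(index)
    return pearson_r(soi, series)


# Hypothetical overlapping values: an SST index that mirrors the SOI exactly.
soi = [1.0, -2.0, 0.5, -1.0]
nino34 = [-1.0, 2.0, -0.5, 1.0]
r_inverted = compare_to_soi(soi, nino34, sst_based=True)
r_raw = compare_to_soi(soi, nino34, sst_based=False)
```

Inverting the SST-based series changes only the sign of the coefficient, so its purpose is to make correlations with SOI-based and SST-based reconstructions directly comparable.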

The multi-proxy and logbook-based reconstructions are continuous ENSO time series. In contrast, the documentary-based records provide an events-based chronology, listing El Niño and in some cases La Niña events. Therefore, El Niño and La Niña events had to be defined from the ENSO indices to enable comparison with the events-based chronologies. The common way of defining ENSO events from continuous data is through the use of standard deviation thresholds. In their reconstruction of the SOI Stahle et al. (1998) investigate both ±1.0 and ±2.0 standard deviations from the mean, as thresholds for defining ENSO events. When using Niño SST indicators of ENSO, it is typical to define an ENSO event based on a 5-month running mean of SST anomalies, using a threshold of ±0.5 °C, which is equal to roughly 0.5 standard deviations from the modern climatological Niño 3.4 mean (McGregor et al. 2010). However, the one standard deviation threshold is used in a number of previous studies (Mann et al. 2000b; Braganza et al. 2009). Therefore, here we used the one standard deviation threshold as an indicator of ENSO events, with means and standard deviations calculated per series.
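The one standard deviation threshold can be sketched as below for an SOI-like series, in which El Niño corresponds to negative values and La Niña to positive values (SST-based indices are assumed to have been inverted beforehand). The function name and example values are hypothetical, and the use of the sample standard deviation is an assumption, as the text does not state which estimator was applied.

```python
from statistics import mean, stdev


def define_events(years, soi_like, n_sd=1.0):
    """Return (el_nino_years, la_nina_years) from a continuous SOI-like index.

    The mean and standard deviation are calculated from the series itself.
    El Nino: value below mean - n_sd * SD (negative SOI).
    La Nina: value above mean + n_sd * SD (positive SOI).
    """
    m, s = mean(soi_like), stdev(soi_like)  # sample SD: an assumption
    el_nino = [y for y, v in zip(years, soi_like) if v < m - n_sd * s]
    la_nina = [y for y, v in zip(years, soi_like) if v > m + n_sd * s]
    return el_nino, la_nina


# Hypothetical index values for seven seasons.
years = [1979, 1980, 1981, 1982, 1983, 1984, 1985]
index = [-3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.0]
el_nino_years, la_nina_years = define_events(years, index)
```

Because the threshold is relative to each series, reconstructions with different units or variances can be converted to event chronologies in a consistent way.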

Once event-based chronologies were constructed, hit rates were calculated to assess ENSO detection skill. This skill is defined as the percentage of correctly identified events in the target record by the index under comparison. The target record varies depending on the period of analysis. For example, the target record in the fitting period (1979–2013) and historical period (1876–1977) is the instrumental DJF SOI. In the assessment of ENSO flavours the target record is the NINO record 1876–2013 (Kug et al. 2009). Hit rates have been used in previous ENSO studies as an indication of skill when analysing ENSO on an event-by-event basis (Gergis and Fowler 2009; McGregor et al. 2010). A hit rate of over 40% is suggested by Hereid et al. (2013) as one which indicates good skill. Year0 of the ENSO event refers to that of the December of the DJF season, therefore assuming the El Niño impacts preceded the peak in the El Niño event. Year1 analysis compares the DJF SOI with the year of January and Feb, therefore assuming the El Niño impacts followed the peak in the El Niño event. For multi-proxy analysis DJF 1815/1816 is compared to 1816 from the proxy reconstructions. For this period of logbook data, correspondence rates replace hit rates. These indicate how well the multi-proxy and documentary reconstructions correspond with the logbook-based reconstructions. Correspondence rates were used for this period as there is no ‘true’ (i.e. instrumental) ENSO index to compare to the reconstructions.
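The hit rate defined above can be written as a short function. This is a sketch under stated assumptions: the function name and the `lag` argument are hypothetical, with `lag=0` standing for a Year0 comparison (index year equal to the December year of the DJF season) and `lag=1` for a Year1 comparison (index year equal to the January and February year). The same calculation gives a correspondence rate when the target is a logbook-based reconstruction rather than the instrumental record.

```python
def hit_rate(target_years, index_years, lag=0):
    """Percentage of target events that also appear in the index chronology.

    target_years: event years in the target record, labelled by the
                  December of the DJF season.
    index_years:  event years in the chronology under comparison.
    lag:          0 for Year0, 1 for Year1 (hypothetical parameter).
    """
    target = set(target_years)
    if not target:
        raise ValueError("target record contains no events")
    index = set(index_years)
    hits = sum(1 for year in target if year + lag in index)
    return 100.0 * hits / len(target)


# Hypothetical event lists.
year0_rate = hit_rate([1982, 1986, 1991, 1997], [1982, 1991, 1994])
year1_rate = hit_rate([1815, 1820], [1816, 1821], lag=1)
```

Note that events in the index that are absent from the target do not lower the hit rate; these are counted separately as false alarms.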

The spatial patterns of wind anomalies from EP and CP El Niño events, and La Niña events, were also analysed to further investigate any potential bias to a specific flavour of ENSO in the logbook-based reconstruction predictor grid boxes. For this, ERA-Interim reanalysis data are used, as in B16. Composite plots are composed of the mean DJF wind anomaly from the years classified as EP and CP El Niño by the NINO method (Kug et al. 2009). The years used in the La Niña composite plot are 1988/89, 1998/99, 2007/08, 2008/09, 2010/11 and 2011/12, which are those identified from the instrumental SOI. Using the spatial patterns of ERA-Interim zonal wind anomalies in the predictor grid boxes during modern El Niño events, an experimental attempt to categorise the historical logbook events into EP and CP El Niño was carried out. A simple agreement methodology was used in which the spatial wind patterns from modern composites were compared to those in individual historical events. The percentage of the predictor grid boxes in the logbook-defined events with wind anomalies of the same sign as seen in the modern EP and CP composites was calculated. For individual events within the period of the logbook-based reconstruction, if a higher percentage of EP (CP) grid boxes have the expected zonal wind anomaly than CP (EP), then the event is labelled as EP (CP). This provides an indication of the frequency of CP and EP events during the reconstruction period.
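The sign-agreement methodology can be illustrated with a minimal sketch. The function names and all anomaly values below are hypothetical, and two details are assumptions not stated in the text: grid boxes with a zero anomaly are counted as not agreeing, and ties between the EP and CP percentages are left unlabelled.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def agreement_pct(event, composite):
    """Percentage of grid boxes where the event anomaly has the composite's sign."""
    matches = sum(
        1 for e, c in zip(event, composite)
        if sign(e) != 0 and sign(e) == sign(c)
    )
    return 100.0 * matches / len(composite)


def label_event(event, ep_composite, cp_composite):
    """Label a historical event 'EP' or 'CP' by the higher sign agreement."""
    ep = agreement_pct(event, ep_composite)
    cp = agreement_pct(event, cp_composite)
    if ep > cp:
        return "EP"
    if cp > ep:
        return "CP"
    return None  # tie: treatment not specified in the source


# Hypothetical zonal wind anomalies in nine predictor grid boxes.
ep_comp = [1, 1, 1, -1, -1, -1, 1, 1, 1]
cp_comp = [-1, -1, 1, 1, 1, -1, 1, -1, 1]
event = [0.5, 0.2, 0.3, -0.4, -0.1, -0.2, 0.6, 0.1, -0.3]
label = label_event(event, ep_comp, cp_comp)
```

In this hypothetical case eight of the nine grid boxes share the sign of the EP composite and three share the sign of the CP composite, so the event is labelled EP. Only the sign, and not the magnitude, of the anomalies is used.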

4 Results

4.1 Fitting period events analysis, 1979–2013

Firstly, the fitting reconstructions from both PCR and CPS methodologies were compared to the instrumental DJF SOI, assessing the event capture ability of these indices. B16 found that the two methods used in the logbook-based reconstructions produced an SOI which was strongly and significantly correlated with the observed DJF SOI, [PCR (r = 0.89) and CPS (r = 0.87)]. The non-detrended DJF SOI values were used to create an events-based chronology, based on the one standard deviation threshold. The PCR (CPS) method identifies 8 (6) El Niño and 5 (4) La Niña events over the period 1979–2013, whilst the instrumental DJF SOI identifies 6 El Niño and 6 La Niña events (Fig. 2). Hit rates, indications of El Niño and La Niña skill, were calculated based on the number of events in the instrumental DJF SOI captured by the reconstruction indices for El Niño and La Niña events (Table 3). Hit rates are higher for El Niño (83%, 67%) than La Niña events (50% for both methods), and PCR hit rates for El Niño are higher than that for CPS. They are all higher than the hit rate of 40% used as a threshold for indicating a relatively high level of skill (Hereid et al. 2013). The better correspondence of the PCR-based reconstruction with the instrumental record was also found in the correlation analysis. Figure 2 shows the ENSO events from the three different data sets and helps to illustrate the hit rates. It also shows that both reconstruction methods create ‘False alarm’ events, ones which appear in the reconstructed SOI but not in the observed record. The false alarm rate indicates the percentage of events predicted by the PCR or CPS that were not events within the instrumental SOI. False alarm rates of 25% for El Niño and 40% for La Niña for PCR and 33% for El Niño and 50% for La Niña for CPS were found.
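For contrast with the hit rate, the false alarm rate described above can be sketched as below. The function name and event years are hypothetical and serve only to show which quantity forms the denominator.

```python
def false_alarm_rate(target_events, candidate_events):
    """Percentage of events in the candidate index that are absent from the target record."""
    candidate = set(candidate_events)
    if not candidate:
        raise ValueError("candidate index contains no events")
    false_alarms = candidate - set(target_events)
    return 100.0 * len(false_alarms) / len(candidate)


# Hypothetical event years, for illustration only
instrumental = [1983, 1987, 1992, 1998]
reconstructed = [1983, 1992, 1995, 1998]

# One of the four reconstructed events (1995) is not in the instrumental record
print(false_alarm_rate(instrumental, reconstructed))
```

The hit rate is normalised by the number of target events, whereas the false alarm rate is normalised by the number of reconstructed events, so the two need not sum to 100%.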

Fig. 2

El Niño (orange) and La Niña (blue) events from the instrumental DJF SOI and the full PCR and CPS fitting reconstructions, 1979–2013

Table 3 Hit rates for El Niño and La Niña events from PCR and CPS fitting full and mean of five sub-sampled iterations compared to the instrumental DJF SOI

The sub-sampled PCR and CPS fitting reconstructions, which were constructed using a sample of ERA-Interim data that replicates the mean data availability over the reconstruction period, provide an indication of the hit rates which would be obtained for a logbook-based reconstruction over this modern period. For El Niño events the hit rates for the sub-sampled PCR are the same as that for the full PCR fitting, (83%), while for the sub-sampled CPS, the El Niño hit rates are higher (83%) than that from the fitting (67%). La Niña hit rates are lower for the sub-sampled than for the full fitting for both methodologies. The performance of the two methodologies for El Niño events is clearly better than their performance for La Niña events. Overall, the sub-sampled fitting produces an ENSO record that has less correspondence with the DJF instrumental SOI, but still captures events well, especially El Niño events, with hit rates not dropping below 67% for either reconstruction method.

Table 4 shows the hit rates for the instrumental DJF SOI, PCR and CPS full and sub-sampled fitting when compared to the NINO method of El Niño event classification (Kug et al. 2009). This method identified three EP (1982/83, 1991/92 and 1997/98) and three CP El Niño (1994/95, 2002/03 and 2009/10) events since 1979. All three EP events were identified as El Niño in the DJF SOI record, whereas only one of the three CP events was found from the instrumental DJF SOI classification. This immediately suggests that the instrumental DJF SOI classification may identify EP events better than CP events. The PCR reconstruction identified 100% of the EP events, and two of the three (66%) CP events, while the CPS reconstruction identified all three (100%) EP events, but only one of the three (33%) CP events. This suggests that the two methods used in the logbook-based reconstruction also capture EP events more successfully than CP events, with a stronger bias towards EP events in the CPS-based reconstruction than in the PCR-based reconstruction. These conclusions are also drawn from the sub-sampled reconstructions.

Table 4 Hit rates for EP and CP El Niño from PCR and CPS for the instrumental DJF SOI, PCR and CPS full fitting and the mean of five sub-sampled iterations compared to the NINO method, 1979–2013

To further investigate this potential bias, the spatial wind patterns of individual ENSO events were analysed, specifically within the nine predictor grid boxes, highlighted in Fig. 3 (see supplementary info Table S4 for locations). Composite plots of the DJF mean seasonal zonal wind anomalies for EP and CP events were constructed using ERA-Interim Reanalysis data based on 1979–2013 climatology (Fig. 3).

Fig. 3

DJF zonal wind composite anomalies (ms−1) during EP El Niño events (1982/83, 1991/92 and 1997/98) and CP El Niño events (1994/95, 2002/03 and 2009/10), relative to 1979–2013 climatology. Zonal wind data are taken from the ERA-Interim reanalysis. Predictor grid boxes used in the reconstruction are outlined in black, numbered 1–9 from west to east

The mean pattern of zonal wind anomalies during EP and CP events is different over the nine predictor grid boxes, with stronger anomalies seen in a higher number of predictor grid boxes during EP events compared to CP events. Positive wind anomalies are found during EP events in grid boxes 2, 3, 4 (South African grid boxes), 5 and 6 (Indian Ocean) and negative anomalies in grid boxes 1, 8 and 9. In CP events, the only visible signal in the predictor grid boxes is a negative one in grid boxes 3, 4 and 9. It should be noted that the sign of the anomaly in grid boxes 3 and 4 is the opposite in EP compared to CP events. Looking at the individual events (Figs. S5, S6), the wind patterns are more varied due to the inter-event diversity of ENSO. Nevertheless, when considering the composite wind anomaly patterns (Fig. 3) the higher number of predictor grid boxes with stronger wind anomalies during EP compared to CP events, and the increased skill of the two methodologies at capturing EP events over CP events suggests that the logbook-based reconstructions would favour reconstruction of EP over CP events. This has implications for the ability of the historical reconstruction to capture CP events and is considered later when interpreting the historical logbook-based reconstructions.

It is also possible that reconstruction methods could result in a bias towards El Niño rather than La Niña or vice versa, as previously found for individual coral records assessed across the Pacific (Hereid et al. 2013). Figure 4 shows a composite plot of wind anomalies for modern La Niña events. Spatial patterns during these individual events are shown in Fig. S7, and demonstrate differences between individual events, as found in the El Niño analysis. Strong, but opposite, zonal wind anomalies are found in four of the same predictor grid boxes for La Niña and EP events, whereas opposite anomalies from the CP grid boxes are found in only one of the La Niña grid boxes. Overall, strong zonal wind patterns are found in five of the predictor grid boxes during La Niña events, compared with eight predictor grid boxes during EP El Niño events. This, along with the higher hit rate during El Niño compared to La Niña events (Table 3), suggests that El Niño events are more likely to be identified by this reconstruction methodology than La Niña events. Overall, these zonal wind patterns, along with the calculated hit rates, suggest that East Pacific El Niño events are most likely to be captured by reconstructions using these grid boxes, followed by La Niña events, with CP El Niño events captured least well.

Fig. 4

DJF zonal wind composite anomalies (ms−1) during La Niña events (1988/89, 1998/99, 2007/08, 2008/09, 2010/11, 2011/12), relative to 1979–2013 climatology. Zonal wind data are taken from the ERA-Interim reanalysis. Predictor grid boxes used in the reconstruction are outlined in black

4.2 Multi-proxy comparison to instrumental DJF SOI, 1876–1977

The multi-proxy ENSO reconstructions were then compared to the instrumental DJF SOI. Pearson’s correlation coefficients were calculated over the period of overlap 1876–1977. Figure 5 shows a good agreement between the normalised ENSO reconstructions from these records, with reconstructions inverted on the figure to match the sign of the SOI for a more direct comparison.

Fig. 5

Eight (ST98, MANN00, MCG10, W10, EG13A, EG13B, EG13C and LI13) reconstructed ENSO indices and the instrumental DJF SOI (black), 1876–1977. Non-SOI indices are inverted

A significant positive correlation is present between the instrumental DJF SOI and ST98. All others have negative correlations, as they are oceanic indices (e.g. Niño SST) which are negatively correlated with the SOI. The strongest correlation (r = −0.82) exists between the instrumental SOI and the Niño 3.4 SST reconstruction from EG13B (Table S5). All correlations are statistically significant at the 99% level, with the lowest correlation of −0.48 (W10). There is good agreement between the multi-proxy reconstructions and the instrumental DJF SOI, which suggests that the use of the DJF SOI as the predictand for the logbook-based reconstructions, as opposed to other ENSO indices, should not impact the correspondence between ENSO reconstructions significantly.
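The sign convention in these correlations can be illustrated with a short sketch. The series are hypothetical values, not the reconstructions themselves. The t-statistic shown is the standard one for a Pearson coefficient with n − 2 degrees of freedom; its use here is an assumption, as the significance test applied in this study is not described in the text.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("series must have equal length of at least 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t-statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical normalised series: an SOI-like index and an SST-like index
soi = [1.2, -0.8, 0.3, -1.5, 0.9, -0.2, 0.6, -1.1]
sst = [-1.0, 0.7, -0.1, 1.4, -0.6, 0.3, -0.9, 0.8]

r = pearson_r(soi, sst)                         # negative for an oceanic index
r_inverted = pearson_r(soi, [-v for v in sst])  # inverted, as plotted in Fig. 5
print(r, r_inverted, t_statistic(r, len(soi)))
```

Inverting an oceanic index changes only the sign of the coefficient, so the strength of agreement with the SOI is unaffected.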

These multi-proxy reconstructions and the DJF SOI were then converted into discrete event chronologies. The mean and standard deviation are calculated separately for each reconstruction, over the period 1876–1977, and ENSO events from each reconstruction defined by the one standard deviation threshold. Between 1876/77 and 1977/78, 18 (17) El Niño (La Niña) events are found from the instrumental DJF SOI record. The El Niño and La Niña events from the instrumental SOI and the proxy reconstructions are shown in Fig. 6.
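The conversion of a continuous index into a discrete event chronology can be sketched as below, for an index with the sign convention of the SOI (El Niño negative). The index values are hypothetical, and the use of the sample standard deviation and of inclusive thresholds are assumptions, since the text does not specify either.

```python
from statistics import mean, stdev


def classify_events(index_by_year, threshold=1.0):
    """Return (el_nino_years, la_nina_years) for an SOI-like index.

    Events are years at least `threshold` standard deviations from the mean,
    with both statistics computed over the supplied period.
    """
    values = list(index_by_year.values())
    mu = mean(values)
    sd = stdev(values)  # assumption: sample standard deviation
    el_nino = sorted(y for y, v in index_by_year.items() if v <= mu - threshold * sd)
    la_nina = sorted(y for y, v in index_by_year.items() if v >= mu + threshold * sd)
    return el_nino, la_nina


# Hypothetical normalised index values, for illustration only
index = {1877: -2.0, 1878: 0.1, 1879: 1.8, 1880: 0.2, 1881: -0.3, 1882: 0.4, 1883: -0.2}
el_nino_years, la_nina_years = classify_events(index)
print(el_nino_years, la_nina_years)
```

For oceanic indices such as Niño 3.4 SST, the index would first be inverted so that El Niño events fall below the mean.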

Fig. 6

El Niño (negative) and La Niña (positive) events, based on classification of ±1.0 standard deviation from the mean (1876–1977) from all the multi-proxy reconstructions and the instrumental DJF SOI. Shown for Year1 (year that contains Jan)

There is generally good agreement between the proxy reconstructions; however, not all events are found within the entire set of reconstructions. In some years as few as one reconstruction indicates that an ENSO event is present, whereas in other years an event is found within all of the reconstructions. Over this period, the El Niño hit rates are higher than the La Niña hit rates for all but ST98 (Table 5). This is consistent with findings from previous analysis of ENSO multi-proxy reconstructions which stated that proxy reconstructions have better skill at capturing El Niño compared to La Niña (Braganza et al. 2009; McGregor et al. 2010). It is also consistent with findings from Sect. 4.1, which found that the logbook-based reconstruction methods have better event capture skill for El Niño events compared to La Niña. EG13C has some of the highest hit rates for both El Niño and La Niña, suggesting this reconstruction best captures the events indicated by the instrumental DJF SOI. W10 performs worst, with only 35% of La Niña events captured, the only one of the reconstructions that is not above the 40% threshold suggested by Hereid et al. (2013).

Table 5 Hit Rates (Year1) between the instrumental DJF SOI and multi-proxy ENSO events with classification thresholds of ±1.0 standard deviation 1876–1977 mean

4.3 Multi-proxy comparison to instrumental EP and CP El Niño, 1876–1977

The multi-proxy reconstructions were then assessed for their skill in reconstructing ENSO flavours: EP and CP El Niño events. Using the NINO method with HadISST NINO3 and NINO4 indices, 14 EP events and 9 CP events are identified. Table 6 shows the EP and CP skill of the multi-proxy reconstructions when compared to this record. All of the proxy reconstructions have better skill at capturing EP events than CP, except EG13A which has a slightly higher CP score compared to EP. With this exception, these results show the same bias towards EP events as was found for the B16 method above. EG13B has the highest EP (92%) and joint highest CP skill score of the multi-proxies.

Table 6 East Pacific and Central Pacific skill of multi-proxy reconstructions compared to target record of events from the NINO method, 1876–1977

A number of these ENSO reconstructions were produced before the different types of ENSO events were widely appreciated, and so did not explicitly consider the effect of this diversity on their reconstruction. Therefore, it is possible that the ENSO reconstructions could portray slightly different flavours of past ENSO, and be biased towards reconstructing one type over another (Karamperidou et al. 2015).

4.4 Event chronology comparison to instrumental DJF SOI, 1876–1977

The correspondence between the ENSO events captured by documentary evidence and by the instrumental DJF SOI was also assessed. OTR00 and GH08 only extend up to 1900, therefore only QUINN87 was compared over the period, 1876–1977. The hit rates when comparing to the instrumental DJF SOI are presented in Table 7, along with GF09’s events-based reconstruction which also extends over this entire period. La Niña skill is available only for GF09, as the QUINN87 record only focuses on El Niño events. Comparisons are made for Year0 and Year1 to account for teleconnections that occur prior and post the ENSO-event peak (DJF).
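The Year0 and Year1 comparisons can be sketched as follows. Each DJF season is labelled here by the calendar year of its December, and the documentary events by calendar year. The function names and all event years are hypothetical, chosen only to show how the two conventions shift the comparison by one year.

```python
def season_to_year(december_year, convention):
    """Calendar year used to compare a DJF season with an annual chronology."""
    if convention == "Year0":
        return december_year       # year containing the December
    if convention == "Year1":
        return december_year + 1   # year containing January and February
    raise ValueError("convention must be 'Year0' or 'Year1'")


def hit_rate_by_convention(target_december_years, chronology_years, convention):
    """Hit rate (%) of an annual chronology against DJF target events."""
    target = {season_to_year(y, convention) for y in target_december_years}
    if not target:
        raise ValueError("target record contains no events")
    return 100.0 * len(target & set(chronology_years)) / len(target)


# Hypothetical DJF El Niño seasons (by December year) and documentary event years
djf_el_nino = [1877, 1888, 1896, 1902]
documentary = [1877, 1888, 1891, 1897, 1902]

year0 = hit_rate_by_convention(djf_el_nino, documentary, "Year0")
year1 = hit_rate_by_convention(djf_el_nino, documentary, "Year1")
print(year0, year1)
```

A higher Year0 than Year1 rate, as in this illustration, would indicate that the recorded impacts tend to precede the DJF peak.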

Table 7 Year0 and Year1 (in parentheses) correspondence between instrumental DJF SOI ENSO and QUINN87 and GF09, 1876–1977

There are 18 El Niño events identified from the instrumental DJF SOI between 1876 and 1977. There are 27 El Niño events identified by QUINN87 during this same period, with 11 of these extending for more than one year. The majority of the events identified in the instrumental record are the same as those captured by QUINN87 (Table 7). This is based on comparisons in Year0, whereas a lower hit rate is found when comparing to Year1. This suggests that the El Niño impacts found in these documentary sources are those which precede the DJF season. GF09 has similar skill to QUINN87, with the highest hit rate, 83%, for El Niño in Year0, and with La Niña in Year1, 64%. The documentary records have El Niño hit rates which are higher than any of the multi-proxy reconstructions (Table 5), suggesting that the documentary records perform best over this period.

4.5 Event chronology comparison to EP and CP El Niño events, 1876–1977

In terms of ENSO flavours, hit rates were calculated for QUINN87 and GF09 in line with those presented in Table 6 for the multi-proxy reconstructions. Table 8 shows the EP and CP El Niño hit rates for QUINN87 and GF09 for both Year0 and Year1. Both records have higher skill at capturing EP events compared to CP events. This was also found to be the case for the multi-proxy reconstructions and the logbook-based reconstructions.

Table 8 East Pacific, Central Pacific and La Niña skill of events reconstructions compared to target record of events from the NINO method

4.6 Multi-proxy comparison over logbook reconstruction period, 1815–1854

Comparisons were then made over the logbook-based reconstruction period, 1815–1854. Table S6 and Table S7 show the Pearson’s correlation coefficients between the multi-proxy reconstructions during the two periods and there is reduced agreement between different multi-proxy ENSO reconstructions in 1815–1854 compared to 1876–1977. A number of the correlation coefficients are also no longer significant at the 95% level during this earlier period, which also suggests a reduced agreement between the reconstructions during 1815–1854. Figure 7 shows the four logbook-based reconstructions and the eight multi-proxy ENSO reconstructions. The reduced agreement between the multi-proxy reconstructions during this period can be seen when compared to the agreement between the reconstructions in Fig. 5. Pearson’s correlation coefficients between the four logbook-based reconstructions and these multi-proxy indices, 1815–1854 are shown in Table 9. The best agreement is found with Emile-Geay et al.’s (2013a, b) three Niño 3.4 reconstructions, EG13A/B/C. All three of these reconstructions have significant correlations with both the CPS A and the CPS B reconstructions. For CPS B, correlations significant at the 95% level are also found with ST98 and MANN00. Therefore, five of the eight proxy-based reconstructions are significantly correlated with CPS B, more than with any of the other logbook-based reconstructions. CPS A is significantly correlated with three of the multi-proxy reconstructions, EG13A/B/C. The better agreement with CPS B suggests that the adjustment in the mean carried out due to the shift in data availability from the different logbook countries in B16 was justified. As we hypothesised in B16, CPS B is the most robust reconstruction of our four logbook-based reconstructions.

Fig. 7

Historical normalised ENSO indices from eight multi-proxy reconstructions (colour) and the four logbook SOI reconstructions (black) 1815/16 to 1853/54. The missing years in the logbook-based reconstruction are 1829/30 and 1844/45

Table 9 Correlation coefficients between the logbook-based reconstructions and multi-proxy reconstructions 1815/16–1853/54

The continuous proxy reconstructions were then used to identify El Niño and La Niña events between 1815 and 1853, using the one standard deviation threshold, shown in Fig. 8. There is good agreement between the logbook-based and the proxy-based reconstructions in 1818/19 and 1819/20, suggesting a persistent La Niña event. However, there are also a number of events that are identified in only the logbook-based reconstructions, suggesting a degree of disagreement with the multi-proxy reconstructions. The percentages of the logbook El Niño and La Niña events found in each of the multi-proxy reconstructions were analysed using correspondence rates (Table 10). The highest correspondence rate is 60%, between ST98 and CPS B El Niño events. CPS B has an El Niño correspondence of 40% for three reconstructions, MCG10, EG13C and LI13. Apart from this, the only other correspondence rates of 40% or more are three La Niña rates from CPS A compared to MANN00, MCG10 and EG13A. A number of correspondence rates of 0% were found, indicating inconsistency between some of the reconstructions. Overall, CPS B has the highest correspondence rates and the highest number of significant correlations during this period, further confirming that out of the four logbook-based reconstructions this has the best agreement with the multi-proxy reconstructions.

Fig. 8

Correspondence (Year1) between logbook ENSO events and multi-proxy ENSO events with classification thresholds of ±1.0 standard deviation 1815–1854 mean. Year1 (that which contains Jan) shown

Table 10 Correspondence rates (Year1) for El Niño and La Niña events 1815–1854 between logbook and multi-proxy reconstructions

An attempt to categorise the El Niño events identified by the logbook-based reconstructions was then carried out. CPS B events are analysed as this logbook-based reconstruction has the best agreement with the other ENSO reconstructions (Table 10). Using the ERA-Interim zonal wind composites from modern EP and CP events, the zonal wind anomalies within the predictor grid boxes were compared to those seen in the logbook-defined events. In the ERA-Interim composites, eight of the grid boxes (1–6 and 8–9) show wind anomalies in EP events, whereas only three (3, 4, 9) display visible zonal wind anomalies in CP events. Figure 9 shows the DJF zonal wind anomalies from logbook data for the 5 years defined as El Niño events in the CPS B reconstruction. Compared to the El Niño zonal wind anomalies in the fitting period, there is a less consistent pattern in the direction and strength of the anomalies in the logbook-derived winds. Any attempt to classify these events as EP and CP is very much experimental; nevertheless, a simple agreement methodology was applied in which the spatial wind patterns from modern composites were compared to those in historical individual events.

Fig. 9

DJF Zonal wind anomalies (ms−1) from logbook data for CPS B El Niño years with 1824/25 and 1830/31 anomalies relative to the mean of 1815–1833 and 1833/34, 1838/39 and 1851/52 relative to the mean of 1834–1854

From Table 11 it can be seen that four of the historical events are suggested to be EP events and only one to be a CP event. The event chronology from this reconstruction suggests a lower frequency of CP El Niño events in this period compared to the reported increase in CP El Niño in recent decades (Lee and McPhaden 2010). Further work is needed in order to fully understand the reasons for the different event-capture skill of the logbook-based reconstructions.

Table 11 Classification of CPS B El Niño events into EP and CP

4.7 Documentary chronologies, 1815/16–1853/54

The correspondence between the documentary chronologies and the logbook-based reconstructions was also assessed for the logbook-based reconstruction period, 1815/16–1853/54. Figure 10 shows the timing of these events from the various records. A clear pattern is that there are more El Niño events in the earlier part of the period in the documentary chronologies, whereas El Niño events are more prevalent in the later part of the logbook PCR reconstructions. There is a very limited correspondence between the logbook-based reconstructions and the documentary-based ENSO chronologies. The highest correspondence rate is 20%, found between CPS A and QUINN87 (Year0). The cluster of El Niño events identified in the documentary-based reconstructions at the start of the record is, in fact, opposite from what is seen in both the logbook-based and multi-proxy reconstructions, with these suggesting a persistent La Niña (Fig. 7).

Fig. 10

El Niño chronologies from documentary sources, Year0 (black) and logbook-based reconstructions (grey) 1815/16–1853/54. Bars indicate El Niño event

4.8 Combined multi-proxy and documentary, 1815/16–1853/54

Finally, a comparison to the reconstructions of Gergis and Fowler (2009) was carried out. GF09 identify seven moderate to extreme El Niño and seven moderate to extreme La Niña events in the period 1815–1854. Correspondence rates for these events are shown in Table 12. They are much lower than the high hit rates found for GF09 over the more modern period, 1876–1977.

Table 12 Correspondence rates Year0 (Year1 in parentheses) for El Niño and La Niña events 1815–1854 between logbook and GF09 ENSO reconstruction

For CPS B, the correspondence rate of GF09 for Year0 is 40% for El Niño and 17% for La Niña, and when comparing to Year1 this becomes 0% for El Niño and 50% for La Niña. Therefore, GF09 corresponds better to CPS B El Niño events in Year0 but to CPS B La Niña events in Year1. This is consistent with the findings in Table 7, which suggested that La Niña events are better captured by GF09 in Year1.

Overall it is clear that there are many disagreements between the numerous ENSO reconstructions covering the period of the logbook-based reconstruction, 1815–1854. The agreement between multi-proxy reconstructions reduces during this period compared to the more modern period analysed, 1876–1977, with a drop in the number of significant correlations and the lower correspondence rates. All correlation coefficients are significant at the 95% level during the latter period, whereas a number of weak correlations in the earlier period drop below the 95% significance threshold. The documentary chronologies do not show an ENSO signal similar to the logbook-based reconstructions over this period. Overall, CPS B has the best agreement with the multiple ENSO reconstructions.

The logbook SOI reconstructions presented by B16 built upon the initial work of Jones and Salmon (2005), who first attempted a DJF SOI reconstruction using the data from ships’ logbooks. They found four major El Niño events during their study period, 1750–1854. Two of these are within the study period of the present paper: 1833 and 1834. Of the four new logbook-based reconstructions, only CPS B finds 1833 to be an El Niño year, and none of them highlights 1834 as an ENSO year. Jones and Salmon (2005) correlated their reconstructed Oct–Mar SOI with two early multi-proxy studies. Low correlation coefficients were obtained, 0.10 with ST98 and 0.08 with MANN00. These were calculated over a longer period, 1750–1850, than the new logbook SOI reconstructions and therefore the strengths of the correlation coefficients are not directly comparable. However, considering the new logbook-based reconstructions, correlations significant at the 95% level are found between the CPS B reconstruction and both ST98 (r = 0.32) and MANN00 (r = −0.28). Therefore, there is better agreement of the new logbook reconstructions with the alternative ENSO reconstructions, compared to that found in earlier work from Jones and Salmon (2005). This suggests that the additional data and revised methodology have improved the ability to reconstruct ENSO using the data from ships’ logbooks.

5 Discussion

The key findings from these results are now discussed. Firstly, our discussion is driven by the differences in the skill of the various reconstructions in representing ENSO diversity. Secondly, the strong La Niña signal from the logbook and multi-proxy reconstructions during 1818–1820 is investigated. This event occurs in the years following the 1815 Tambora eruption, which is known to have had widespread effects on global climate. Here we explore the influence of this non-ENSO signal on the ENSO reconstructions during these few years. Finally, a number of reasons for lack of agreement between datasets are assessed, focusing on the differences between the documentary chronologies compared to the other reconstructions of ENSO.

5.1 Skill of reconstructions in representing ENSO diversity

Firstly, it was found that the logbook-based reconstructions (Table 3) and most of the multi-proxy reconstructions (Table 5) have higher hit rates for El Niño events compared to La Niña events. One reason for differences in the ENSO chronologies overall could be the use of the one standard deviation as a threshold to define ENSO events. The capturing of events is sensitive to this, somewhat arbitrary, but commonly used, threshold (Braganza et al. 2009). Slight differences in reconstruction method and parameters can lead to quite different event classifications, as seen with the differences in events from the logbook-based reconstruction methodologies (Fig. 8). A limitation of this methodology is the non-linearity of ENSO teleconnections, with El Niño teleconnections often stronger than La Niña teleconnections, and thus more likely to provide a stronger El Niño signal in the proxy records compared to the signal from a La Niña event (Hoerling et al. 1997; Batehup et al. 2015). This in part could explain the higher hit rates for El Niño compared to La Niña found in this paper. However, the one standard deviation threshold is most suitable for use here with the logbook-based reconstructions as it is more likely to capture real events than incorrectly interpreting noise, and is used within a number of the multi-proxy reconstructions compared here (Stahle et al. 1998; Mann et al. 2000b).

Another important finding from this paper is the varied ability of the ENSO reconstructions to capture the different flavours of ENSO. The instrumental DJF SOI, multi-proxy reconstructions and documentary chronologies were found to have better skill at capturing EP events than CP events over the common period, 1876–1977. The logbook-based reconstruction methodology and the predictor grid boxes used also appear to capture EP events better than CP events during the fitting period. This would suggest that the historical logbook-based reconstructions may also contain an EP bias. It was found that of the five El Niño events identified by CPS B logbook-based reconstruction, four are suggested to be EP events (Table 11). One reason for this is that in the fitting and calibration of the logbook methodologies, the SOI was used as the target reconstruction. The instrumental DJF SOI has been shown to have an EP bias (Table 4). Thus the choice of calibration target could be an influence on the EP bias within the reconstructions. Another reason for the EP bias within the logbook-based reconstruction could be the spatial distribution of predictor grid boxes. Further work could address the differences in EP and CP teleconnections across the range of regions from which data are obtained to investigate this further. Overall it remains unclear as to the main cause of the EP bias within the logbook-based ENSO reconstructions.

Similarly, there is no clear reason for the higher skill of the multi-proxy reconstructions at capturing EP rather than CP events. One factor that might influence their skill in capturing ENSO diversity is the locations and weightings of the proxy records used. The multi-proxy reconstructions compared in this study contain large proxy data networks which include ENSO signals from a number of teleconnection regions. Regions with higher weightings which have a consistent teleconnection in both EP and CP events will be most likely to produce an ENSO reconstruction that can capture both types of events. MCG10 noted that their reconstruction might be influenced by an over-representation of North American tree rings due to the common use of these tree rings within the multiple reconstructions they used, as well as their higher weightings. This leads to a potential source of bias within their ENSO reconstruction as the temperature and precipitation teleconnections during EP and CP events differ in this highly weighted North American teleconnection region (McGregor et al. 2010; Infanti and Kirtman 2016).

Emile-Geay et al. (2013b) showed that during the period coincident with the logbook-based reconstruction lower weightings were given to the tree-ring networks of North America, with highest weightings given to corals in the western Pacific, eastern Indian Ocean, Red Sea and a South American ice core. Therefore, EG13 reconstructions are more influenced by the Indo-Pacific region, a less remote teleconnection region. LI13 used a tree ring network that was divided into seven regions. The highest weightings were given to those networks from the Maritime Continent (EOF loading = −0.54), South American Altiplano (0.44) and Central Asia (0.45). Therefore, the LI13 reconstruction had high weightings on either side of the Pacific. These multi-proxy reconstructions, despite using different networks, methods and weightings, were all found to capture EP events better than CP events. However, the difference between the EP and CP skill of each reconstruction is varied, with LI13 having the largest difference in skill of EP versus CP events, and EG13 reconstructions having the lowest difference. LI13 addressed ENSO diversity in their study stating that they carry out a reconstruction of canonical (EP) El Niño. This in part explains the reason for the difference between EP and CP skill in this record. However, as for the logbook-based reconstruction, it is suggested that the higher EP skill of the multi-proxy reconstructions could be a result of the index they are calibrated to (Table 1). Therefore, it is clear that reconstructions with records located in different regions and with different weightings have varying skill at capturing ENSO diversity.

Further work is needed in order to fully understand the differences in the proxy signals during EP and CP El Niño events. Schollaen et al. (2015) were the first to use oxygen isotope records from tree rings to detect the different flavours of ENSO. Using proxy records from regions which have uniquely different responses during EP and CP events can help to gain this degree of ENSO classification. Only once these proxies have been fully exploited will we be able to gain a full understanding of the behaviour of ENSO and its diversity during the pre-instrumental period (Karamperidou et al. 2015). This would also help inform future ENSO reconstructions using data from ships’ logbooks.

5.2 Influence of non-ENSO factors on reconstruction: Case study 1818–1820 La Niña

Another key finding was that, compared to the poor agreement between records over much of the logbook-based reconstruction period, there is a striking degree of agreement during the years 1818–1820. The logbook-based reconstructions indicate La Niña during 1818/19 and 1819/20, with 1818/19 the second strongest La Niña event in CPS A and CPS B. The zonal wind patterns from the logbook data are largely consistent with expected La Niña patterns for these 2 years (Fig. S8). Gergis and Fowler (2009) suggest a strong La Niña event in the years 1819 and 1820. The strongest La Niña values of the entire logbook-based reconstruction period are found in 1818/19 for EG13A, EG13B and EG13C. The peak of the La Niña event in the other proxy reconstructions falls in 1819/20 (ST98, MANN00, MCG10 and LI13). Only W10 does not identify this as a La Niña event. The high level of agreement between the multi-proxy and logbook-based reconstructions during these few years is notable, as such good agreement is not seen at any other time during the logbook-based reconstruction period.

A key feature of global climate during the early nineteenth century was the eruption of Tambora in 1815. The volcanic eruption on the Indonesian island of Sumbawa led to the ‘Year without a summer’ in 1816 (Chenworth 2001) and a reduction in global temperature (Raible et al. 2016). It has been suggested that large volcanic eruptions could be associated with specific ENSO behaviour. Adams et al. (2003) used the ENSO reconstructions of ST98 and MANN00 to investigate the relationship between ENSO and volcanic activity beyond the range of the instrumental record. It was found that the probability of an El Niño event doubled in the winter following a tropical volcanic eruption. Emile-Geay et al. (2008) built upon this previous work and found that large volcanic eruptions appear to ‘load the dice’ in favour of El Niño events. Only the largest eruptions are thought to have a noticeable effect on ENSO behaviour (Emile-Geay et al. 2008). A number of the multi-proxy reconstructions indicate an El Niño event in 1816, supporting this ENSO-volcanic link (Fig. 8). Although the CPS B logbook-based reconstruction does not indicate an El Niño event in 1815/16, the SOI is negative, just not at a magnitude that would flag this year as an El Niño event. More recently, McGregor et al. (2010) concluded that there is also an increased likelihood of La Niña 3 years after a volcanic eruption. This 3-year association would cover the La Niña event of 1818/19. It is therefore possible that the eruption of Tambora in 1815 increased the likelihood of the strong La Niña event that is indicated so clearly in the ENSO reconstructions.

Additional volcanic eruptions during the reconstruction period occurred in 1831 and 1835 (Mann et al. 2005). The logbook-based reconstruction CPS B identified an El Niño event in 1835/36 followed by a La Niña event in 1836/37. However, this is picked up in very few of the multi-proxy reconstructions, suggesting that an association between the volcanic eruption of 1835 and ENSO is not strong in this case. A main reason for this could be that the 1835 eruption was not as large as the Tambora eruption, and the association between ENSO and volcanic eruptions was found to be most evident for only the largest eruptions (Emile-Geay et al. 2008). It is also possible that the ENSO signal is masked by the climatic effects of volcanic eruptions in the predictor region used in the reconstructions, or that an eruption causes an ENSO-type response when an ENSO event is not actually present. Destructive interference between these different signals would result in an unclear indication of ENSO variability during, and following, a volcanic eruption. This is a possible factor masking a volcanic-ENSO response, but is not explored further here. Overall, in agreement with previous studies, it appears likely that there is a strong ENSO-volcano association for the Tambora eruption.
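One common way to examine a lagged association of this kind is to composite an index over the years following each eruption (often called superposed epoch analysis). The following is a minimal sketch of that idea only; it is not the method of any study cited here. The function name `superposed_epoch`, the abstract year labels and all index values are hypothetical and synthetic, chosen purely for illustration, with negative values standing for El Niño-like conditions as for the SOI.

```python
from statistics import mean


def superposed_epoch(index_by_year, event_years, max_lag=3):
    """Return the mean index value at each lag (0..max_lag) after the event years.

    Years missing from index_by_year are skipped; a lag with no data maps to None.
    """
    composite = {}
    for lag in range(max_lag + 1):
        values = [index_by_year[year + lag]
                  for year in event_years
                  if (year + lag) in index_by_year]
        composite[lag] = mean(values) if values else None
    return composite


# Synthetic SOI-like values on abstract year labels (NOT reconstruction output).
toy_index = {10: -0.4, 11: 0.1, 12: 0.2, 13: 1.1,
             20: -0.8, 21: 0.9, 22: 0.0, 23: 0.3}
toy_eruption_years = [10, 20]  # hypothetical eruption years

composite = superposed_epoch(toy_index, toy_eruption_years)
for lag, value in composite.items():
    print(f"lag {lag}: mean index {value:+.2f}")
```

With only two synthetic events the composite carries no statistical weight; in practice the significance of such a composite would need to be assessed, for example by resampling, before any lagged response is inferred.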

5.3 Disagreement with documentary chronologies

Despite the strong La Niña signal from the multi-proxy and logbook-based reconstructions, the documentary ENSO chronologies suggest an El Niño event during this period: 1818–1819 according to QUINN87, 1817–1819 according to ORT00, and a persistent El Niño from 1816 to 1819 according to GH08. These historical records are based on conditions in South America, such as flooding and water abundance in Northern Peru (Garcia-Herrera et al. 2008). The multi-proxy reconstructions, which have a larger spatial coverage, suggest a La Niña event, in agreement with the signal apparent in the logbook-based reconstructions. This conflict between the documentary chronologies and the other ENSO reconstructions is found throughout the period of the logbook-based reconstruction.

For example, QUINN87 and ORT00 identify a ‘Very Strong’ El Niño event occurring in 1828. This was based on documentary evidence of flooding and exceptional rainfall in Northern Peru and Ecuador (Ortlieb 2000). GH08 reported 1827/28 as an El Niño year based upon additional records from Northern Peru, including flood damage to bridges and roads (Garcia-Herrera et al. 2008). However, none of the logbook-based reconstructions identified 1827/28 or 1828/29 as an El Niño event, and PCR A even suggested that 1827/28 and 1828/29 were La Niña events. Of the multi-proxy reconstructions, three suggest 1827/28 as an El Niño year, while for 1828/29 only one indicates El Niño and one indicates La Niña (Fig. 8). Therefore, the signals based on historical documentary evidence, largely of rainfall and flooding from South America, appear to be inconsistent with the multi-proxy reconstructions, which are based on records from multiple teleconnection regions. The latter are likely to provide a more robust ENSO signal. The same inconsistency was found when comparing the timing of El Niño events in the documentary chronologies and the logbook-based reconstructions (Fig. 10). Despite the Quinn El Niño chronology being commonly used as a reference for historical events (Gergis and Fowler 2009), it appears to have poor agreement with both the multi-proxy and logbook-based reconstructions. These results suggest that further work incorporating records from multiple regions should be carried out in order to provide a more consistent chronology of past ENSO events.

An additional reason for disagreement between the ENSO reconstructions is the difference in the scale of dating uncertainty between the proxy records and the documentary records. Documentary records, including the ships’ logbooks, are often dated to the day, and yearly summaries are based on a collection of individual weather events. In contrast, proxy records often represent seasonal or annual climate conditions. Dating uncertainty in proxy records depends on the sources used, with tree rings having lower uncertainty than corals (Gergis and Fowler 2009). One of the criteria used for proxy selection by Emile-Geay et al. (2013a) was an absolute dating uncertainty of less than ±5 years. This is poor compared to the daily resolution of logbooks and historical documents, and indeed compared to the typical duration of ENSO events. Therefore, in terms of dating, the documentary and logbook data are more accurate than the multi-proxy reconstructions. This could be one of the key reasons for the lack of agreement between different ENSO reconstructions, especially when comparing on an event-by-event basis. However, it cannot account for the differences between the logbook-based reconstructions and the documentary chronologies, both of which are precisely dated.
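The effect of dating uncertainty on event-by-event comparisons can be illustrated with a small sketch. It is an illustration under stated assumptions rather than part of the analysis: the function name `matched_fraction` and both lists of event years are hypothetical (abstract year labels, not taken from any reconstruction), and the matching rule, counting a reference event as matched if the other record has an event within a given number of years, is one simple choice among many.

```python
def matched_fraction(reference_years, candidate_years, tolerance=0):
    """Fraction of reference events with a candidate event within +/- tolerance years."""
    if not reference_years:
        return 0.0
    hits = sum(
        any(abs(ref - cand) <= tolerance for cand in candidate_years)
        for ref in reference_years
    )
    return hits / len(reference_years)


# Hypothetical event years on an abstract time axis (NOT from any reconstruction).
well_dated_events = [5, 13, 21, 30]    # e.g. a record dated to the day
offset_proxy_events = [6, 11, 21, 35]  # same events, shifted by dating errors

exact = matched_fraction(well_dated_events, offset_proxy_events, tolerance=0)
relaxed = matched_fraction(well_dated_events, offset_proxy_events, tolerance=2)
print(f"exact-year agreement: {exact:.2f}, within 2 years: {relaxed:.2f}")
```

In this toy case only one of four events matches on the exact year, whereas three of four match once a 2-year tolerance is allowed. Widening the tolerance also increases the number of chance matches, so it is no substitute for records with lower dating uncertainty.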

6 Conclusions

Here, we present a comparison of a number of reconstructions of the El Niño Southern Oscillation. Particular focus is given to the DJF SOI reconstruction carried out using meteorological data from ships’ logbooks, recently presented in a companion paper, B16. This logbook-based reconstruction built upon initial work by Jones and Salmon (2005), and the improved logbook-based reconstruction was found to agree better with multi-proxy ENSO reconstructions than the Jones and Salmon (2005) reconstruction does. Therefore, the use of additional logbook data, the methodological changes and the focus on strong teleconnection regions in the B16 logbook-based reconstruction have shown that the use of observations from ships’ logbooks to reconstruct an SOI is a viable approach to ENSO reconstruction. The logbook data can therefore provide an additional indication of the behaviour of ENSO during the pre-instrumental period. Further digitisation of wind observations from ships’ logbooks could significantly increase the temporal extent and robustness of our reconstruction.

We identify a difference in the skill of the reconstructions in capturing the different flavours of ENSO. All but one of the ENSO reconstructions capture EP events better than CP events. Further work should be done to address the differences between EP and CP events and their representation in proxy records. Similar to the need for increased logbook digitisation, there is also large scope for increased coverage of proxy networks. An increase in the number of corals from the Pacific Ocean would help to improve the reconstruction of the different flavours of ENSO (Wilson et al. 2010). Recent work has looked at using proxies from regions where the EP and CP signals are distinctly different, in order to distinguish between ENSO flavours (Schollaen et al. 2015). This approach demands more attention in order to fully capture ENSO behaviour using proxy records. Additionally, there is a need for greater consensus on methods of classifying EP and CP events over the twentieth century in order to provide a more consistent understanding of ENSO diversity. Currently, the multiple methods and datasets used for creating EP and CP event chronologies result in differences between records, which hinders complete comparisons of the different types of ENSO.

The impact of the 1815 Tambora volcanic eruption on ENSO behaviour was also investigated, in the context of previous research. McGregor et al. (2010) suggested an increased likelihood of El Niño in the year of a volcanic event. This is supported by the El Niño event suggested in 1815/16 by a number of the multi-proxy reconstructions. The likelihood of a La Niña event doubled 3 years after a large volcanic eruption (McGregor et al. 2010). A strong La Niña was suggested in the logbook-based reconstruction 3 years after the Tambora eruption and was also found in the multi-proxy reconstructions. The link between volcanic eruptions and ENSO is still contested and further work is needed to explore this fully (McGregor et al. 2010). However, here we present evidence that this suggested link was found for the Tambora eruption, which was one of the strongest eruptions of the last millennium (Emile-Geay et al. 2008).

Overall, it is clear that there is still disagreement between reconstructions of ENSO in the early to mid-nineteenth century. The logbook-based reconstruction provides an additional, well-dated indication of ENSO behaviour during this period, being based on direct weather observations from the pre-instrumental period. It also benefits from not being subject to some of the uncertainties and limitations that often surround proxy records. Further work is still needed in the digitisation of additional logbook data and the collection of additional multi-proxy records in order to produce an ENSO history based on records with stronger agreement than is currently attainable. The data from ships’ logbooks have the potential to be a key data source for this future work, and offer the possibility of distinguishing the different flavours of El Niño.