Supplementary Information, McIntyre and McKitrick (2003)

Proxy Data

For the emulation of MBH98, the data set pcproxy used is here. This data set was originally obtained from Mann's ftp site here according to directions from Scott Rutherford (at Mann's direction), but this file was deleted after the publication of MM03. This file was only used for our emulation of MBH98. 

Recollated Tree Ring Data

We noticed problems with the principal component series in pcproxy.txt, most notably that they were not correctly calculated. Accordingly, we carried out a complete re-collation of the tree ring proxy series used in MBH98 for 5 of 6 regions, using site identification information at the Nature SI and archived series at WDCP.

In each of the 5 regions below, the "sitelist" is the list of WDCP identifications used in the collations and the "dataset" is the freshly collated tree ring information used in principal component calculations. For these calculations, the portions of the series prior to 1400 were not included in the collation. 

NOAMER (North America region): dataset sitelist  The Nature SI provided a list of ITRDB identifications. In nearly all cases, the SI identifications directly matched a WDCP site. A few aliases were noticed in the MBH98 listing and the following WDCP identifications substituted in our collation: ak001 for AK001C; wy005 for WY005B; the site AR045 could not be identified at WDCP.  

SOAMER (South America): sitelist dataset. Again, the ITRDB identifications were obtained from the Nature SI. The following aliases were identified: chil002 for CHIL002S; chil007 for CHIL007S.

AUSTRAL (Australia-New Zealand) sitelist dataset. The ITRDB identifications were obtained from the Nature SI and no aliases were noticed. 

STAHLE/OK (Texas-Oklahoma) sitelist dataset. In this case, the Nature SI listed the names for 16 sites, but did not provide ITRDB identifications. The names for the 16 sites were looked up at the WDCP database and the identifications shown in the sitelist assigned and then used for re-collation.

STAHLE/SWM (Southwestern US- Mexico) sitelist dataset. The same procedure was followed as for the Stahle/OK series.  Subsequent to MM03, we noticed that a couple of sites had more than one version archived at WDCP and that the start dates of the version here did not match the start date of the version listed in the Nature SI. Subsequent work has used an updated version of this dataset.

VAGANOV: There was no site information for the Vaganov region. In this instance, we were unable to re-collate and used the 3 Vaganov series (#81-83) as supplied by Mann.

Updated Data Versions

For non-PC series, we noticed that archived versions at WDCP differed from MBH98 versions. MBH98 do not provide data citations (see AGU for a data citation policy) and the basis for data citation applied here is noted in the comments.

# Series Comment
1 Burdekin River, Australia coral fluorescence

Graph  Script  Data URL MBH98 data for  series 1 Burdekin River, Australia coral fluorescence has correlation of 0.42 with WDCP series. Lough (pers. comm. Oct. 2003) confirms validity of WDCP series over earlier data. Plot shows visual coherence, but considerable shifting. 

2 Great Barrier Reef coral calcification

Graph  Script  Data URL-Havannah Island MBH98 data for  series 2 (Great Barrier Reef) is incorrectly labelled coral thickness, while the data should refer to coral calcification (Lough, pers. comm., Oct. 2003). The MBH98 data appears to be an average of the following 5 corals for the period 1615-1982: Abraham Reef, Britomart Reef, Havannah Island, Lodestone Reef and Sanctuary Reef (correlation - 0.99). A 5-reef average is used by Jones et al. (1998) and appears to be the same data. The MBH data seems to be Z-transformed, although the basis of the Z-transform is not clear.

3 Urvina Bay, Galapagos coral δO18  

Graph  Script  Data URL  MBH98 data for series 3 Urvina Bay, Galapagos coral δO18 has correlation of -0.9992951 with WDCP series, which is reversed in sign during transformation. MBH overwrite actual data in 1962-64 and fill for 1907-1909 and 1953-61 as noted above.  The missing data results from a splice between two corals, which are spliced by adjusting the readings of the second coral.

6 Vanuatu coral δO18

Graph  Script  Data URL MBH98 data for  series 6 Vanuatu coral δO18 has correlation of 0.93 with WDCP series. MBH have one plugged year in 1980. 

7 New Caledonia  δO18 

Graph  Script  Data URL MBH98 data for series 7, New Caledonia  δO18 has correlation of only 0.618 with WDCP data. 

8 Secas, Panama  δO18

Graph  Script  Data URL MBH98 data for series 8, Secas, Panama  δO18 has correlation of 0.983 with WDCP annualized series (annual data calculated from WDCP 10 per year data). 

9 Secas, Panama  δC13

Graph  Script  Data URL MBH98 data for series 9, Secas, Panama  δC13 has correlation of 0.991 with WDCP annualized series (annual data calculated from WDCP 10 per year data). 

10 Central England historical

Graph  Script  Data URL The use of summer data in series 10 (Central England) is shown first through correlation of >0.99 with JJA series and only 0.62 with annual data and secondly through direct inspection of the 3 series taken together. The truncation of data from 1659 to 1730 can be seen by examination of the two series together. 

11 Central Europe historical

Graph  Script  Data URL  The  use of summer data in series 11 (Central Europe) was established by extremely high correlation with JJA values (as with Central England) and lower correlations with annual data. The truncation of 1525-1550 values can be seen by comparing series. 

12-15 Quelccaya These series exactly match WDCP data.
16 Dunde Lonnie Thompson has not archived this series. MBH98 version used.
17 S. Greenland ice melt No archived version located. MBH98 version used
18 Svalbard ice melt MBH98 version reconciles to annualized interpolation of WDCP version at C1500 location and is used.
19 Penny No archived version located and MBH98 version is used
20 Crete, Greenland δO18 This is a stacked version of several series; the underlying series are at WDCP, but not the stacked version. The MBH98 version was used.
    Series 21-31 are instrumental temperature series attributed to Jones and Bradley (1992). Identifications considered in  JB92 Table 13.1.  JB92 series for Central England, Berlin, Sverdlovsk and Toronto (all digitally available at WDCP) are compared to MBH series 21-31 and no correlations are found to permit identification.
21 42.5N, 92.5W 

MBH98 data for series 21 has correlation of 0.889 with JB92 Minnesota (adjacent grid box) annual data. JB92 version used. There are many differences in the plotted series. Graph  Script  Data URL

22 47.5N, 2.5E No candidate at JB92 and MBH98 version used 
23 47.5N, 7.5E

MBH98 data for series 23 has correlation of 0.81 with JB92 Geneva annual data, which has identical start date (1753) and location. There are many differences in the plotted series, including a notable downspike in the MBH data in the early 19th century that is not present in the JB92 data. JB92 version used. Graph  Script  Data URL

24 47.5N 12.5E  No counterpart in JB92 data and MBH98 version used.
25 7.5N 17.5E No counterpart in JB92 data and MBH98 version used.
26 52.5N 17.5E  No counterpart in JB92 data and MBH98 version used.
27 57.5N, 17.5E  

MBH98 data for series 27 has correlation of >0.99 with JB92 Stockholm annual data, which has identical start date (1756) and location, confirming identification. The MBH series is linearly transformed from the JB92 series. JB92 version used. Graph  Script  Data URL

28 57.5N, 37.5E

MBH98 data for series 28 has correlation of 0.96 with JB92 Leningrad annual data, which has identical start date (1752) and location, confirming this identification. The MBH series is transformed from the JB92 series. JB92 version used.  Graph  Script  Data URL

29 62.5N, 7.5E No counterpart in JB92 data and MBH98 version used.
30 62.5N, 12.5E  

MBH98 data for series 30 has correlation of 0.998 with JB92 Trondheim annual data, which has identical start date (1761) and location, confirming identification. The MBH series is transformed from the JB92 series. Graph  Script  Data URL

31 62.5N 42.5E No counterpart in JB92 data and MBH98 version used.
    Series 32-42 are instrumental precipitation series attributed to Jones and Bradley (1992). There are many problems in these series. JB92 series
32 12.5N 82.5E  No counterpart in JB92 data and MBH98 version used.
33 17.5N 72.5E  No counterpart in JB92 data and MBH98 version used.
34 37.5N 77.5W No counterpart in JB92 data and MBH98 version used.
35 42.5N 2.5E 

MBH98 data for series 35 has correlation of 0.95 with JB92 Marseilles (43.3N, 5.4E) annual data, which has identical start date (1749), confirming identification. The MBH98 gridcell is one grid-box to the west of the correct location. Both the JB92 series at WDCP and MBH series are transformed, but transformations are different. The JB92 version is used. Graph  Script  Data URL

36 42.5N 7.5E  No counterpart in JB92 data and MBH98 version used.
37 42.5N 72.5W

MBH98 data for series 37, precipitation 42.5N, 72.5W has correlation of 0.92 with JB92 Paris annual data, which has identical start date (1770), confirming identification. The MBH98 gridcell location is wildly incorrect. The JB92 version is used. Both the JB92 series at WDCP and MBH series are transformed, but transformations are different. Graph  Script  Data URL

38 47.5N 2.5E This is the gridcell for Paris, but is not Paris data. No counterpart in JB92 data and MBH98 version used.
39 47.5N 12.5E  No counterpart in JB92 data and MBH98 version used.
40 52.5N 12.5E No counterpart in JB92 data and MBH98 version used.
41 52.5N 2.5W No counterpart in JB92 data and MBH98 version used.
42 57.5N 7.5W No counterpart in JB92 data and MBH98 version used.
43 Tasmania temperature reconstruction 

MBH98 data for series 43, Tasmania T-reconstruction has correlation of 0.82 with updated WDCP series. Plot shows visual coherence, but considerable shifting. WDCP series used.  Graph  Script  Data URL

44 Java tree ring No archived data located (Jacoby) and MBH98 version used.
45 New Zealand temperature reconstruction Exact match in C1500 location at WDCP.
46 "Central Patagonia" temperature reconstruction An exact match was found in the C1500 location at WDCP in the Northern Patagonia series. MBH98 has incorrectly transposed the locations of these two series.
47 "Northern Patagonia" temperature reconstruction

An exact match was found in the C1500 location at WDCP in the Central Patagonia series. MBH98 has incorrectly transposed the locations of these two series.

48 Upper Kolyma Some tree ring series from this area are archived, but no identification of this series (Schweingruber, pers. comm.) could be made and MBH98 version used. Postscript: This series matches Figure 4 of Earle, Brubaker et al., 1994. Arc Alp Res 26: 60-65.
49 Western US temperature reconstruction (MXD) An exact match was found at WDCP by splicing two columns in Briffa's western US dataset. The citation to Fritts is incorrect.
50 Western US temperature reconstruction (RW) All values in this series after 1961 were set equal to the corresponding values of series 49. Fritts archived data at WDCP, but the corresponding version could not be identified. The MBH98 version was used (without fills).
    The WDCP identifications of MBH98 series 51-61 (Jacoby northern treeline series) are not shown in MBH98. These identifications are straightforward, as shown here.
51 Four Twelve AK

MBH98 data for  series 51 (Four Twelve AK) has correlation of 0.86 with WDCP. Comparison of end values shows that WDCP continues to 1990, as compared to MBH end in 1976 (with plugs to 1980). Plotting shows that MBH98 has pervasive and increasing over-statement in 20th century values and peaks in the 1920s.  Graph  Script  Data URL

52 Fort Chimo PQ

MBH98 data for  series 52 (Fort Chimo PQ) has correlation of 0.93 with WDCP. Comparison of end values shows that WDCP continues to 1990, as compared to MBH end in 1976 (with plugs to 1980). Plotting shows that MBH98 has pervasive and increasing over-statement in 20th century values. Series peaks in 1960s. Graph  Script  Data URL

53 Gaspe PQ The first 4 years of this series have been plugged in MBH98; otherwise the series matches the WDCP version.
54 Arrigetch AK

MBH98 data for  series 54 (Arrigetch AK) has correlation of 0.96 with WDCP. Comparison of end values shows that WDCP continues to 1990, as compared to MBH end in 1976 (with plugs to 1980). Plot shows series peak in early 1980s with downturn to series end in 1990. Graph  Script  Data URL

55 Sheenjek River AK

MBH98 data for  series 55 (Sheenjek River AK) has correlation of 0.70 with WDCP. Comparison of end values shows that both WDCP and unplugged MBH98 end in 1979. Comparison of start values (and plot) shows WDCP starts much earlier. Considerable overstatement of values in MBH98 in the 1940s and in the 18th century. Graph  Script  Data URL

56 Twisted Tree, Heartrot Hill NWT

MBH98 data for series 56 (Twisted Tree, Heartrot Hill (TTHH), Canada) has correlation of 0.699 with WDCP. Comparison of end values shows that WDCP continues to 1990, while unplugged MBH98 ends in 1976. Comparison of start values (and plot) shows that the two versions do not start in the same year. WDCP values peak in the 1960s and reduce sharply thereafter. Increasing MBH overstatement in the 20th century. Graph  Script  Data URL

57 Mackenzie Mts NWT Exact match at WDCP
58 Coppermine River NWT

MBH98 data for series 58 (Coppermine River, Canada) has correlation of 0.99 with WDCP. MBH plug three years (1978-1980), but otherwise the coverage period is the same. Values are nearly identical at the beginning, but there are pervasive changes later in the series.  Graph  Script  Data URL

59 Hornby Cabin NWT Exact match at WDCP
60 Churchill MB Exact match at WDCP
61 Castle Peninsula PQ Exact match at WDCP
62 NC precipitation reconstruction No archived series located and MBH98 version used
63 SC precipitation reconstruction No archived series located and MBH98 version used
64 GA precipitation reconstruction No archived series located and MBH98 version used
65 Tarvagatny Pass, Mongolia

MBH98 data for series 65, Tarvagatny Pass, Mongolia has correlation of 0.94 with updated WDCP series. WDCP version used. MBH data shows increasing over-estimate in 20th century.  Graph  Script  Data URL

66 Yakutia temperature reconstruction No archived series located (Hughes) and MBH98 version used
67 Fennoscandia temperature reconstruction No archived series located  at WDCP (Briffa) at time of MM03 and MBH98 version used. Subsequently, a version was located at Briffa's website, which coincided with MBH98 version.
68 Polar Urals temperature reconstruction No archived series located  at WDCP (Briffa) at time of MM03 and MBH98 version used. Subsequently, a version was located at Briffa's website, which coincided with MBH98 version.
69-71 Stahle/OK PCs These were calculated ab initio as discussed below. The fresh calculations are in columns 69-71 of the Corrected Dataset.
72-80 Stahle/SWM PCs These were calculated ab initio as discussed below. The fresh calculations are in columns 72-80 of the Corrected Dataset.
81-83 Vaganov PCs No site locations were available and MBH98 versions used.
84-92 NOAMER PCs These were calculated ab initio as discussed below. The fresh calculations are in columns  84-92 of the Corrected Dataset.
93-95 SOAMER PCs These were calculated ab initio as discussed below. The fresh calculations are in columns 93-95 of the Corrected Dataset.
96-99 AUSTRAL PCs These were calculated ab initio as discussed below. The fresh calculations are in columns 96-99 of the Corrected Dataset.
100 chin04 This is alias for chin004. First two years were truncated in MBH98 version without annotation. WDCP version used.
101 chin04x This is alias for chin004x, but is an exact match to WDCP
102 fran009 Exact match at WDCP
103 fran010 Exact match at WDCP
104 fran011 Exact match at WDCP
105 indi008x

There is no series with this identification at WDCP. MBH98 data for series 105 INDI008X has correlation of 0.83 with  WDCP indi002x and this identification is applied (and WDCP version used.)   Graph  Script  Data URL

106 mexi001 Exact match at WDCP
107 MOR003 This is an incorrect identification, but the data is an exact match to morc011 (Ifrane)
108 MOR007 This is an incorrect identification, but the data is an exact match to morc001 (Tounfite)
109 MOR008 This is an incorrect identification, but the data is an exact match to morc014 (Col du Zad)
110 spai011 Exact match at WDCP
111 spai012 Exact match at WDCP
112 SWED002B

This is presumably alias for swed002. MBH98 data for series 112, SWED002B is WDCP swed002.  Correlation is 0.977.  Graph  Script  Data URL

   

Series which were successfully located in digital form in the MBH form are noted here; comments on digitally unavailable series are here
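
The correlations quoted in the comments above compare an MBH98 series with an archived series over the years the two have in common. A minimal sketch of that kind of comparison follows; it is not the script linked above. The function name and the toy values are hypothetical, and series are assumed to be held as year-to-value mappings. It also illustrates why a linearly transformed (even sign-reversed) copy still has a correlation of magnitude 1, while a copy shifted by one year does not.

```python
import math

def overlap_correlation(a, b):
    """Pearson correlation of two {year: value} series over their shared years."""
    years = sorted(y for y in a if y in b and a[y] is not None and b[y] is not None)
    if len(years) < 3:
        raise ValueError("need at least 3 overlapping years")
    xs = [a[y] for y in years]
    ys = [b[y] for y in years]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical toy values, not taken from any archived series.
archived = {1900 + i: v for i, v in enumerate([1.0, 2.5, 2.0, 3.5, 3.0, 4.5])}
rescaled = {y: -2.0 * v + 10.0 for y, v in archived.items()}  # sign-reversed linear transform
shifted = {y + 1: v for y, v in archived.items()}             # same values, one year late

r_rescaled = overlap_correlation(archived, rescaled)
r_shifted = overlap_correlation(archived, shifted)
print(r_rescaled, r_shifted)
```

A correlation near 1 in magnitude therefore supports an identification even where the archived and MBH98 values differ by a linear transformation, whereas dating shifts reduce the correlation sharply.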

Retained Principal Component Series

For each region, we used the number of principal components listed in the weights file, which coincided with information at the Nature SI: NOAMER - 9; SOAMER - 3; AUSTRAL - 4; OK - 3; SWM - 9; VAGANOV - 3. For each region, as stated in MM03, we used a conventional principal components calculation over the period for which all sites have values. Conventional principal component calculations, which MBH98 claim to use, require that there be no missing data.  In the calculations of MM03, we calculated principal components for each region over the maximum available period. The script is here. The relationship between the periods in which MBH principal components were applied (according to the pcproxy.txt file) and the period during which all selected sites in the region are available is shown here. (Postscript: subsequent to MM03, Mann et al. reported that they dealt with missing data in tree ring calculations by "stepwise" calculation of principal components in which the rosters were periodically changed. The difference in procedure resulted in the NOAMER PC1 and the STAHLE/SWM PC1 not being available to the AD1400 roster. This difference in procedure has been characterized by Mann et al. in very invidious terms as "selective censoring", but was actually a reasonable attempt to interpret MBH98 procedures from the public record, since Mann had refused to clarify procedures upon request. We have subsequently examined the principal component calculations of MBH98 in considerable detail and have carried out a reconstruction with the inclusion of the NOAMER PC1 and STAHLE/SWM PC1 in the AD1400 roster, showing the same results as MM03. This has been submitted to a senior journal and is under review as of June 2004.)
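
As an illustration of what a conventional calculation over the common period involves, the following is a minimal sketch and not the script linked above. It assumes each site is a gap-free year-to-value mapping, finds the span in which every site has a value, standardizes each site over that span, and extracts only the first principal component by power iteration on the correlation matrix. All names and values are hypothetical.

```python
import math

def common_period(sites):
    """(first, last) year for which every site has a value (gap-free series assumed)."""
    first = max(min(s) for s in sites.values())
    last = min(max(s) for s in sites.values())
    if first > last:
        raise ValueError("no common period")
    return first, last

def standardize(col):
    """Subtract the mean and divide by the sample standard deviation."""
    m = sum(col) / len(col)
    sd = math.sqrt(sum((v - m) ** 2 for v in col) / (len(col) - 1))
    return [(v - m) / sd for v in col]

def first_pc(columns, iterations=500):
    """Weights and scores of PC1 via power iteration on the correlation matrix."""
    z = [standardize(c) for c in columns]
    n, p = len(z[0]), len(z)
    corr = [[sum(z[i][t] * z[j][t] for t in range(n)) / (n - 1) for j in range(p)]
            for i in range(p)]
    w = [1.0] * p  # starting vector; assumed not orthogonal to the leading eigenvector
    for _ in range(iterations):
        w = [sum(corr[i][j] * w[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    scores = [sum(w[i] * z[i][t] for i in range(p)) for t in range(n)]
    return w, scores

# Hypothetical toy sites with different start and end years.
sites = {
    "a": dict(zip(range(1398, 1406), [0.9, 1.1, 1.0, 1.3, 0.8, 1.2, 1.5, 0.7])),
    "b": dict(zip(range(1400, 1407), [1.0, 1.2, 0.9, 1.1, 1.6, 0.6, 1.0])),
    "c": dict(zip(range(1401, 1406), [2.1, 1.7, 2.2, 2.6, 1.5])),
}
first, last = common_period(sites)
columns = [[s[y] for y in range(first, last + 1)] for s in sites.values()]
weights, scores = first_pc(columns)
print(first, last, weights)
```

Because the calculation is restricted to the common period, a single late-starting site shortens the span over which the region's principal components are available.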

Corrected Dataset

The corrected dataset incorporating updated versions of series as annotated above and freshly calculated principal component series as used in MM03 is here.

Calculations

To say that the description within MBH98 of their methodology is terse is an understatement. The algorithms described here were successful in replicating the MBH98 reconstruction in the 20th century to a high degree of precision; the correlations declined in earlier centuries, but still captured most MBH98 features. I (SM) notified Prof. Mann of this in a private email and requested clarification, perhaps in the form of a less terse description of the MBH98 methodology. Prof. Mann stated that he was unable to deal with this request because of numerous other time commitments. The algorithms described here are accordingly based upon a careful consideration of the public data and considerable experimentation.  (Postscript June 2004 - this represents the position as at October 2003 and is provided here to represent our approach as at MM03.)

(a) Temperature Principal Components

MBH98 purports to establish relationships between the proxies and 16 temperature principal components calculated from the Climate Research Unit (CRU) instrumental temperature database, using a subset of 1,082 out of 2,592 cells and the 79-year period from 1902-1980 as a calibration period.  MBH98 reported that they carried out a principal component analysis of the above temperature data, using "conventional" principal component analysis.  There is considerable missing data. Since "conventional" methods fail in the presence of missing data, the precise methodology used by MBH98 remains undisclosed. 

MBH have archived the gridcell locations, temperature principal components, EOFs and eigenvalues used in MBH98 at two locations: ftp://eclogite.geo.umass.edu/pub/mann/MANNETAL98 also ftp://ftp.ngdc.noaa.gov/paleo/paleocean/by_contributor/mann1998. These were downloaded and re-collated.  A script for downloading, collating and saving these and other MBH98 datasets is here.

MBH did not archive the gridcell standard deviations used to normalize the gridcell temperature series prior to principal component calculations. This information is required to calculate the NH average temperature after reconstruction of the temperature principal components using proxies. The current CRU dataset differs from the version used in MBH98. Because the now-obsolete version was unavailable, the gridcell standard deviations were estimated from the current CRU dataset. We do not expect the calculations to be especially sensitive to this approximation, or, alternatively, if they are, then this would raise questions about the robustness of the procedure.

The CRU dataset was downloaded from CRU in July 2003 and collated into an R-table of dimension 1769 (months from 1856 Jan to 2003 May) x 72 (longitude groups W to E) x 36 (latitude groups N to S).  The Jones marker of -9999 was changed to NA. The dataset was truncated to the 1082 MBH cells, the locations of which were obtained from the MBH98 FTP site. Following MBH98, the dataset was then truncated to 1902-95 and a Z-transformation (subtracting the mean and dividing by the standard deviation) was carried out cellwise. The scaling factor for each cell was saved as "mann.scale.tab".  Four cells were found to have no observations. Their locations were saved as a tracking vector (of length 1082), "nil.tab".

For each cell, the cosine of the latitude is used as an area-weighting factor. The scaled dataset obtained above is then multiplied by this area-weighting factor to obtain a scaled latitude-weighted temperature dataset, which is saved for further use.  This dataset is not used further in this paper, but was used in calculations regarding the MBH98 calculations of temperature principal components, which will be discussed in a later paper. The script for the above is here.
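
The cellwise steps described above (missing-value marker, Z-transformation, cosine-latitude weighting) can be sketched for a single gridcell as follows. This is an illustration under stated assumptions, not the script linked above: the function name and the toy values are hypothetical, and the sample standard deviation is assumed.

```python
import math

MISSING = -9999  # missing-value marker named in the text

def scale_cell(values, latitude_deg):
    """Z-transform one gridcell series, then apply a cos(latitude) area weight.

    Returns (scaling factor, weighted series); the factor is None for a cell
    with no usable observations, which is the case tracked by the nil vector.
    """
    clean = [None if v == MISSING else v for v in values]
    present = [v for v in clean if v is not None]
    if len(present) < 2:
        return None, [None] * len(values)
    mean = sum(present) / len(present)
    sd = math.sqrt(sum((v - mean) ** 2 for v in present) / (len(present) - 1))
    if sd == 0:
        return None, [None] * len(values)
    weight = math.cos(math.radians(latitude_deg))
    scaled = [None if v is None else weight * (v - mean) / sd for v in clean]
    return sd, scaled

# Hypothetical toy cell at 60N with one missing month.
sd, weighted = scale_cell([0.5, MISSING, 1.5, 1.0], 60.0)
empty_sd, empty = scale_cell([MISSING, MISSING], 12.5)
print(sd, weighted, empty_sd)
```

Saving the scaling factor separately, as the text does with "mann.scale.tab", is what later allows a reconstructed field to be returned to temperature units.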

(c) Reconstruction

The reconstruction of the NH temperature average is done through 4 functions.  

RPC  for a dataset of proxies proxy calculates the reconstructed TPCs.  The periodization defined in MBH98 is parameterized in the R-object period.m and the selection of eigenvectors in each period is parameterized in the R list select of length 12. The proxies are Z-transformed basis 1902-1980. For all periods prior to 1970, the roster of proxies used in the period is defined as the set of proxies available in the first year of the period.  After 1970, the roster is the year-by-year available proxies.  For each period, using the roster of proxies and selection of eigenvectors, a matrix of coefficients G is calculated using Gcalc.  Then for each year in the period, the proxies are regressed against the selection of coefficients (using the proxy weighting factors as sent by Prof. Mann's associate), thereby yielding the "reconstructed" TPCs for the selected eigenvectors.
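
The roster rule and the Z-transformation described for RPC can be sketched as follows. This is a hedged illustration, not the RPC function itself: the function names, the period start years and the toy proxies are all hypothetical, and proxies are assumed to be year-to-value mappings in which a missing year is simply absent.

```python
def zscore_on_basis(series, start=1902, end=1980):
    """Z-transform a {year: value} series using the mean and sd of the basis years."""
    basis = [v for y, v in series.items() if start <= y <= end]
    mean = sum(basis) / len(basis)
    sd = (sum((v - mean) ** 2 for v in basis) / (len(basis) - 1)) ** 0.5
    return {y: (v - mean) / sd for y, v in series.items()}

def roster(proxies, year):
    """Logical vector: True where a proxy has a value in the given year."""
    return [year in series for series in proxies]

# Hypothetical toy proxies with different coverage.
proxies = [
    {y: float(y % 7) for y in range(1400, 1981)},
    {y: float(y % 5) for y in range(1450, 1981)},
    {y: float(y % 3) for y in range(1600, 1972)},
]
period_starts = [1400, 1450, 1500, 1600]  # hypothetical, not the MBH98 periodization

# Early periods: the roster is fixed by availability in the period's first year.
rosters = {start: roster(proxies, start) for start in period_starts}
# Late years: the roster is re-evaluated year by year.
late = roster(proxies, 1975)

scaled = zscore_on_basis(proxies[0])
print(rosters, late)
```

Under this rule a proxy that starts partway through a period is not used until the next period begins.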

Gcalc   (see after function RPC)  calculates a matrix G of regression coefficients given a roster of proxies (as a logical vector of length 112) and a selection of eigenvectors (as a logical vector of length 16) for the relationship between the proxies and TPCs for the calibration period 1902-1980. (All calculations are done after Z-transforming both proxies and TPCs to period 1902-1980, but this is not done in this function.)  This calculation is formally equivalent to MBH98's "least-squares solution to the overdetermined matrix equation".
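
One reading of the two least-squares steps described above is sketched below: a calibration step that regresses each proxy on the selected TPCs to give G, and a reconstruction step that solves for the TPCs in a given year from that year's proxy values. This is an assumption-laden illustration, not the Gcalc or RPC code. The proxy weighting factors are omitted, all names are hypothetical, and the normal equations are solved directly, which is adequate only for small, well-conditioned toy problems.

```python
def transpose(m):
    return [list(r) for r in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def solve(a, b):
    """Solve a x = b (a square, b a matrix) by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    m = [row[:] + rhs[:] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * c for v, c in zip(m[r], m[col])]
    return [row[n:] for row in m]

def calibrate(tpcs, proxies):
    """G (k x p): least-squares coefficients of each proxy on the k TPCs."""
    ut = transpose(tpcs)
    return solve(matmul(ut, tpcs), matmul(ut, proxies))

def reconstruct(g, proxy_row):
    """Least-squares TPC estimates (length k) for one year: proxy_row ~ u G."""
    rhs = matmul(g, [[v] for v in proxy_row])
    return [r[0] for r in solve(matmul(g, transpose(g)), rhs)]

# Hypothetical toy calibration: 4 years, 2 TPCs, 3 proxies built exactly from G_true.
tpcs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 2.0]]
g_true = [[1.0, 2.0, 0.0], [0.0, 1.0, -1.0]]
proxies = matmul(tpcs, g_true)

g = calibrate(tpcs, proxies)
u0 = reconstruct(g, proxies[0])
print(g, u0)
```

In this noise-free toy case the reconstruction step recovers the calibration TPCs exactly; with real proxies the system is overdetermined and the solution is only a best fit.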

 The script for generating Figures 7 and 8 is here. The MBH98 NH reconstruction (nhmann) is loaded, as are the MBH98 TPCs (pc), eigenvectors (eof), eigenvalues (lambda), weights (weights1) and proxies (proxy), together with the above functions.  Figure 7a is column 2 of nhmann; Figure 7b is column 1 of rpc. By applying the function RPC to proxy, a matrix MBH of reconstructed TPCs is obtained; Figure 7c is column 1 of MBH.  Then the corrected and updated proxy table proxy4 is loaded; a matrix new of reconstructed TPCs is obtained by applying the function RPC to the new proxy table. Figure 7d is column 1 of new. Figure 7 is then plotted.

The NH average using the reconstructed TPCs is now calculated. The reconstructed TPCs are expanded to 1082 cells using the eigenvectors eof, eigenvalues lambda, and the selection of eigenvectors for each period in list select. The MBH98 periodization is included as data within the function.  The location of the Mann cells is loaded within the function as is the vector nil of cells with no data.  This dataset is re-scaled using the scaling factors calculated in the original Z-transform of the CRU temperature data. The average is then taken of the dataset truncated to NH cells (using information on location of Mann cells). Areal weighting was already allowed for in the transformation described above. This gives Figure 8b; Figure 8a is the same as Figure 7a above.
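
The expansion and averaging described above can be sketched for a single year as follows. This is an illustration only, not the function used for Figure 8. The way the eigenvalues enter the expansion (each reconstructed TPC multiplied by its eigenvalue and eigenvector) is an assumption, as are all names and toy values. A plain mean over Northern Hemisphere cells is taken, with no further area weighting applied.

```python
def nh_average(rpc, eof, lam, scale, lat, nil):
    """Northern Hemisphere mean for one year from reconstructed TPCs.

    rpc:   k reconstructed TPC values for the year
    eof:   k rows of eigenvector loadings, one value per cell
    lam:   k eigenvalues
    scale: per-cell scaling factor saved from the Z-transform
    lat:   per-cell latitude in degrees (positive north)
    nil:   per-cell flag, True where the cell has no observations
    """
    total, count = 0.0, 0
    for c in range(len(lat)):
        if nil[c] or lat[c] <= 0:
            continue  # skip empty cells and Southern Hemisphere cells
        scaled_value = sum(rpc[k] * lam[k] * eof[k][c] for k in range(len(rpc)))
        total += scaled_value * scale[c]  # undo the Z-transform scaling
        count += 1
    if count == 0:
        raise ValueError("no Northern Hemisphere cells with data")
    return total / count

# Hypothetical toy field: one eigenvector, four cells.
nh = nh_average(
    rpc=[2.0],
    eof=[[1.0, 3.0, 5.0, 7.0]],
    lam=[0.5],
    scale=[1.0, 2.0, 1.0, 1.0],
    lat=[12.5, 47.5, -12.5, 62.5],
    nil=[False, False, False, True],
)
print(nh)
```

Only the first two toy cells contribute here: the third is in the Southern Hemisphere and the fourth is flagged as having no data.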

Oct. 2003.

Feb. 4, 2005 - 2 hyperlinks edited.