The structural approach follows the original potential yield setting

The result of this exercise is the first successful recovery of the nonlinear yield response to winter chill in commercial pistachio production. I apply my findings to climate predictions in the current growing areas to show the potential impact of climate change on California pistachios in the next 20 years, and predict that a significant decline can be expected.

The potential yield is the yield that would have been attained at temperatures that cause zero damage. Note that this is not the maximum yield, but the average yield that would be attained with zero damage stemming from our input of interest. This definition allows the average yield to be lower than an “ideal” yield, as the crop might experience sub-optimal levels of other inputs even when the damage from pests or temperature is zero. In fact, the setup assumes that damages from temperature are orthogonal to damages from other factors. The log form, which also approximates the effect of temperature as a percent change in yield, allows us to separate the temperature effect from background potential yield and noise, and estimate it by OLS or similar methods.

What happens if we only have the aggregated yields at the county level? Barrows, Sexton, and Zilberman deal with a similar question when estimating the yield effect of GE varieties in cotton, corn, and soybeans. They have a panel of country-level yields, but these are not partitioned into GE and non-GE yields. However, they do have the shares of GE and non-GE planted acres in each country. Treating the total yields as weighted averages of the GE and non-GE yields, they can estimate the yield effects of GE traits using OLS. The problem here is similar. I only have county-level yields for pistachios, but I do have the shares of each county experiencing various CP levels. The total effect of temperature at the county level is the average of its effects on the individual orchards, weighted by their acreage share.
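The aggregation step can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the orchard-level effect is a polynomial in CP, so that the county-level regressors are the share-weighted averages of each power of CP. The function name, the CP levels, and the shares below are hypothetical and are not taken from the data.

```python
import numpy as np

def county_regressors(cp_levels, shares, degree=2):
    """Share-weighted averages of CP, CP^2, ..., CP^degree for one county-year.

    cp_levels : CP values observed across the county's pistachio pixels
    shares    : acreage share of the county at each CP level (must sum to 1)
    """
    cp = np.asarray(cp_levels, dtype=float)
    w = np.asarray(shares, dtype=float)
    if not np.isclose(w.sum(), 1.0):
        raise ValueError("acreage shares must sum to 1")
    # Averaging each polynomial term with the acreage shares gives the
    # county-level regressor for that term.
    return np.array([np.sum(w * cp ** k) for k in range(1, degree + 1)])

# Hypothetical county-year: three CP levels with made-up shares.
cp_levels = [60, 62, 64]
shares = [0.25, 0.50, 0.25]
x = county_regressors(cp_levels, shares)
print(x)  # share-weighted CP and share-weighted CP^2
```

In this made-up example the share-weighted square of CP is 3846, while the square of the mean CP is 3844. The two differ whenever chill varies within the county, which is why a single representative CP per county-year does not carry the same information as the full distribution of shares.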

When PY is viewed as the zero loss yield, the function f becomes a net-of-loss function, ranging between 0 and 1. In practice, however, the potential yield is unknown, and researchers will often use a fixed-effects setup to model the heterogeneity in potential yields between countries or regions. For example, Barrows, Sexton, and Zilberman motivate their model with an “expected efficacy” function that is bounded in [0, 1] but later set up and run a linear fixed-effects model. This econometric practice is very common in many settings, but since the potential yield is no longer modeled as an expected maximum, the function f can no longer be interpreted as a net-of-loss damage function which is bounded in [0, 1]. I suggest the following setup, which follows the original motivation more closely. The agronomic literature, introduced more thoroughly in a subsequent section of this chapter, looks at the yield response in pistachios as a satiated process. When the weather conditions are too warm, there would be virtually no yield. When the weather conditions are cold enough, yield would be normal. However, colder conditions will not have further yield effects. I therefore take the potential yield to be the average yield, considering only years when chill is deemed sufficient by the existing literature in the entire county. This would be equivalent to Barrows, Sexton, and Zilberman having a measure of pest infestation on non-GE areas, and using the country-average yields when the infestation level is relatively low as the potential yield. For this, I take a CP level of 65, which has not been shown to reduce yields in previous publications. Of my 165 observations, 101 serve to calculate the potential yield, and 64 do not. The rate seems high, but it ensures that at least two observations are used per county-decade to calculate the potential yield.
Now, the ratio of yield to potential yield in the panel should theoretically be bounded between 0 and 1, with deviation from these bounds attributed fully to the disturbance term.
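The construction of the potential yield and the yield ratio can be sketched in Python as follows. This is an illustration under stated assumptions, not the code used for the estimates: the record fields (`county`, `year`, `yield`, `min_cp`), the use of the lowest pixel-level CP in the county as the test for chill being sufficient "in the entire county", the inclusive comparison with 65, and the grouping by calendar decade are all hypothetical choices.

```python
from collections import defaultdict

def yield_ratios(records, cp_threshold=65):
    """Return yield / potential yield for every county-year record.

    Potential yield is the mean yield over the years of the same
    county-decade in which the lowest CP in the county reached the threshold.
    """
    # Accumulate [sum of yields, count] per (county, decade) over sufficient-chill years.
    totals = defaultdict(lambda: [0.0, 0])
    for r in records:
        if r["min_cp"] >= cp_threshold:
            key = (r["county"], r["year"] // 10)
            totals[key][0] += r["yield"]
            totals[key][1] += 1

    ratios = []
    for r in records:
        total, count = totals[(r["county"], r["year"] // 10)]
        if count == 0:
            raise ValueError("no sufficient-chill year in this county-decade")
        ratios.append(r["yield"] / (total / count))
    return ratios

# Hypothetical records for one county-decade (yields in arbitrary units).
records = [
    {"county": "A", "year": 2011, "yield": 3000.0, "min_cp": 70},
    {"county": "A", "year": 2012, "yield": 3400.0, "min_cp": 66},
    {"county": "A", "year": 2014, "yield": 1600.0, "min_cp": 55},
]
ratios = yield_ratios(records)
print(ratios)
```

In this made-up example one of the sufficient-chill years has a ratio above 1. That is the kind of deviation from the [0, 1] bounds that the setup attributes to the disturbance term.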

Assuming that this increase is smooth and monotonic, the logistic probability function is a good candidate to model the process.

California pistachios are a high-value crop, with grower revenues of $1.8 billion in 2016. The most common variety is “Kerman”, and almost all the California acreage is planted in five adjacent counties in the southern part of the San Joaquin Valley. In recent years, rising winter daytime temperatures and decreasing fog incidence have lowered winter CP counts. Climatologists have concluded that winter chill counts will continue to dwindle, putting pistachios in danger at their current locations. To better predict the trajectory for this crop and make informed investment and policy decisions, the yield response function to chill must first be assessed. This task has proven quite challenging. The effects of chill thresholds on bloom can be explored in controlled environments, but for various reasons these relationships are not necessarily reflected in commercial yield data. For example, Pope et al. report that the threshold level of CP for successful bud breaking in California pistachios was experimentally assessed at 69, but they could not identify a negative response of commercial yields to chill portions at that level or even lower. They use a similar yield panel of California counties, but only have one “representative” CP measure per county-year. Using Bayesian methodologies, they fail to find a threshold CP level for pistachios, and reach the conclusion that “Without more data points at the low amounts of chill, it is difficult to estimate the minimum-chill accumulation necessary for average yield.” The statistical problem of low variation in treatment at the growing area, encountered by Pope et al., is very common in published articles on pistachios. Simply put, pistachios are not planted in areas with adverse climate.
Too few “bad” years are therefore available for researchers to work with when trying to estimate commercial yield responses. An ideal experiment would randomize a chill treatment over entire orchards, but that is not possible. Researchers resort either to small-scale experimental settings, with limitations as mentioned above, or to yield panels, which usually are small in size, length, or both. Zhang and Taylor investigate the effect of chill portions on bloom and yields in two pistachio growing areas in Australia, growing the “Sirora” variety. Using data from “selected orchards” over five years, they note that in two years where chill was below 59 portions in one of the locations, bloom was uneven. Yields were observed, and while no statistical inference was made on them, the authors noted that “factors other than biennial bearing influence yield”. Elloumi et al. investigate responses to chill in Tunisia, where the “Mateur” variety is grown. They find highly non-linear effects of chill on yields, but this stems from one observation with a very low chill count. Standard errors are not provided, and the threshold and behavior around it are not really identified.

Kallsen uses a panel of California orchards, with various temperature measures and other control variables, to find a model which best fits the data. Unfortunately, only 3 orchards are included in this study, and the statistical approach mixes a prediction exercise with the estimation goal, potentially sacrificing the latter for the former. Besides the potential over-fitting using this technique, the explanatory variables in the model are not chill portions but temperature hour counts with very few degree levels considered, and no confidence interval is presented. Finally, Benmoussa et al. use data collected at an experimental orchard in Tunisia with several pistachio varieties. They reach an estimate for the critical chill for bloom, and find a positive correlation between chill and tree yields, with zero yield following winters with very low chill counts. However, they also have many observations with zero or near-zero yields above their estimated threshold, and the external validity of findings from an experimental plot to commercial orchards is not obvious.

Pistachio growing areas are identified using USDA satellite data with a pixel size of roughly 30 meters. About 30% of pixels identified as pistachios are singular. As pistachios don’t grow in the wild in California, these are probably misidentified pixels. Aggregating to 1 km pixels, I keep those pixels with at least 20 acres of pistachios in them. Looking at the yearly satellite data for 2008-2017, I keep those 1 km pixels with at least six positive pistachio identifications. These 2,165 pixels are the grid on which I do temperature interpolations and calculations. Observed temperatures for 1984-2017 come from the California Irrigation Management Information System, a network of weather stations located in many counties in California, operated by the California Department of Water Resources. A total of 27 stations are located within 50 km of my pistachio pixels.
Missing values at these stations are imputed as the temperature at the closest available station plus the average difference between the stations at the week-hour window. Future chill is calculated at the same interpolation points, with data from a CCSM4 model (CEDA). These predictions use an RCP8.5 scenario. This scenario assumes a global mean surface temperature increase of 2°C for 2046-2065. The data are available with predictions starting in 2006, and include daily maximum and minimum temperatures on a 0.94 degree latitude by 1.25 degree longitude grid. Hourly temperatures are calculated from the predicted daily extremes, using the latitude and date. I then calibrate these future predictions with a quantile calibration procedure, using a week-hour window. Past observed and future predicted hourly temperatures in the dormancy season are interpolated at each of the 2,165 pixels, and chill portions are calculated from these temperatures.
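The imputation rule for missing station readings can be sketched in Python as follows. This is a sketch under assumptions rather than the procedure actually used: it takes the week-hour window to be a (week of year, hour of day) key, computes the average difference only over hours in which both stations report, and leaves a value missing when no such hours exist. The function and variable names are hypothetical.

```python
def impute_from_neighbor(target, neighbor, windows):
    """Fill gaps in `target` with neighbor reading + mean (target - neighbor) difference.

    target, neighbor : hourly temperatures, with None marking a missing reading
    windows          : one (week_of_year, hour_of_day) key per observation
    """
    # Collect the station differences within each week-hour window.
    diffs = {}
    for t, n, w in zip(target, neighbor, windows):
        if t is not None and n is not None:
            diffs.setdefault(w, []).append(t - n)
    mean_diff = {w: sum(d) / len(d) for w, d in diffs.items()}

    filled = []
    for t, n, w in zip(target, neighbor, windows):
        if t is None and n is not None and w in mean_diff:
            filled.append(n + mean_diff[w])  # neighbor reading shifted by the typical gap
        else:
            filled.append(t)  # observed value, or still missing
    return filled

# Hypothetical readings in degrees Celsius; the second target value is missing.
target = [10.0, None, 12.0, 8.0]
neighbor = [9.0, 11.0, 10.0, 7.5]
windows = [(1, 6), (1, 6), (1, 6), (1, 7)]
filled = impute_from_neighbor(target, neighbor, windows)
print(filled)
```

Here the two complete pairs in window (1, 6) differ by 1.0 and 2.0 degrees, so the missing reading is filled with the neighbor's 11.0 plus the mean difference of 1.5.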

Erez and Fishman produced an Excel spreadsheet for chill calculations, which I obtain from the University of California Division of Agriculture and Natural Resources, together with instructions for growers. For speed, I code the calculations in an R function.

The data above are used for estimation and later for prediction of future chill effects. For the estimation part, I have a yield panel with 165 county-year observations. For each year in the panel, I calculate the share of county pixels that had each CP level. For example: in 2016, Fresno County had 0.4% of its pistachio pixels experiencing 61 CP, 1.8% experiencing 62 CP, 12% experiencing 63 CP, and so on. The support of CP through the panel is [36, 86]. Past county yields are from crop reports published by the California Department of Food and Agriculture.

Figure 3.1 presents chill counts and their estimated effects in percent yield change for two time periods: 2000-2018 and 2020-2040. The top left panel shows the chill counts in the warmest quarter of years between 2000 and 2018. The top right panel shows the chill counts in the warmest quarter of years in climate predictions between 2020 and 2040. Chill at the pistachio growing areas is likely to drop substantially within the lifespan of existing trees.

Results from the polynomial regression are presented in Table 3.2. The first coefficient is for an intercept term, and it is zero with very wide error margins. This makes sense, as centering around the means also gets rid of intercepts. The second coefficient is positive, as we would expect, and statistically significant. The third coefficient is negative, as we would also expect since the returns from chill should decrease at some point, but it is not statistically significant even at the 10% level. However, as dropping it would eliminate the decreasing returns feature, I keep it at the cost of having a wide confidence area.
With the estimated coefficients, I build the polynomial curve that represents the effect of temperatures on yields. It is presented in Figure 3.2 with a bold dashed line. The 90% confidence area boundaries are the dotted lines bounding it above and below. Note that the upper bound of the confidence area does not curve down like the lower one. This is the manifestation of the third coefficient’s p-value being greater than 0.1. In both cases, the confidence area was calculated by bootstrapping. The data were resampled and the model re-estimated 500 times, producing 500 curves with the resulting parameters. At each CP level, I take the 5th and 95th percentiles of the bootstrapped curve values as the bounds for the confidence area. This approach also deals with the potential spatial correlation in error terms.
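The percentile bootstrap described above can be sketched in Python. This is a simplified illustration on synthetic data, not a reproduction of the estimates: it fits a plain quadratic in CP with `numpy.polyfit` rather than the share-weighted panel model, and it resamples individual observations because the resampling unit is not stated here (resampling whole years would be one alternative). The function name, the synthetic response, and the seeds are hypothetical.

```python
import numpy as np

def bootstrap_band(cp, ratio, grid, n_boot=500, degree=2, seed=0):
    """5th and 95th percentile bounds of bootstrapped polynomial curves on `grid`."""
    rng = np.random.default_rng(seed)
    cp = np.asarray(cp, dtype=float)
    ratio = np.asarray(ratio, dtype=float)
    grid = np.asarray(grid, dtype=float)
    center = cp.mean()  # center CP so the polynomial fit is well conditioned
    n = cp.size
    curves = np.empty((n_boot, grid.size))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample observations with replacement
        coefs = np.polyfit(cp[idx] - center, ratio[idx], degree)
        curves[b] = np.polyval(coefs, grid - center)
    # Pointwise percentiles across the bootstrapped curves give the 90% band.
    lower, upper = np.percentile(curves, [5, 95], axis=0)
    return lower, upper

# Synthetic data only: 165 made-up observations over the CP support [36, 86].
rng = np.random.default_rng(1)
cp = rng.uniform(36, 86, size=165)
ratio = 1.0 - 0.0004 * np.clip(70 - cp, 0, None) ** 2 + rng.normal(0, 0.1, size=165)
grid = np.arange(36, 87)
lower, upper = bootstrap_band(cp, ratio, grid)
```

Because the bounds are taken pointwise at each CP level, the upper and lower bounds need not share the shape of the fitted curve. This matches the asymmetry noted for the confidence area in Figure 3.2.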