Four replicates composed of 12 fruits from three separate trees, which were from the same fruits used for physiological measurements, were frozez. Frozen tissues were ground into a fine powder with the Retsch Mixer Mill MM 400 . RNA extractions were performed on samplings from 865-2564 GDD for hull tissues, 865-2139 GDD for shell tissues, and 1106-2564 GDD for kernel tissues. One gram of ground tissue was used for RNA extraction as described in . RNA concentrations were quantified with Nanodrop One Spectrophotometer and Qubit 3 . RNA integrity was then assessed on an agarose gel.cDNA libraries were prepared with Illumina TruSeq RNA Sample Preparation Kit v.2 from the extracted RNA. The quality of the barcoded cDNA libraries was assessed with the High Sensitivity DNA Analysis Kit in the Agilent 2100 Bioanalyzer and then sequenced on the Illumina HiSeq 4000 platform by the DNA Technologies Core at UC Davis Genome Center. Raw reads were trimmed for quality and adapter sequences using Trimmomatic v0.39 with the following parameters: maximum seed mismatches = 2, palindrome clip threshold = 30, simple clip threshold = 10, minimum leading quality = 3, minimum trailing quality = 3, window size = 4, required quality = 15, and minimum length = 36. Trimmed reads were then mapped using Bowtie2 to the pistachio transcriptome assembly. Count matrices were made from the Bowtie2 results using sam2counts.py v0.913. A summary of all read mapping results can be found in Supplementray Table 1.
To annotate transcripts with GO terms and Pfam, the best blast hit from the NR database was imported to OmicsBox software . Likewise, plastic garden container the predicted protein sequences were searched against Pfam, ProDom, ProSiteProfiles by InterProScan in OmicsBox to retrieve conserved domains/motifs and corresponding GO terms. The Kyoto Encyclopedia of Genes and Genome orthology and pathways were identified using the Automatic Annotation Server with bi-directional best hit and BLAST as search program against closely related organisms, mallow, rose, mustard, and papaya. The carbohydrate active enzyme was annotated using trinotate software hmmscan function by searching the proteome against the Carbohydrate Active Enzymes database with default settings . The plant Transcription factor Protein Kinase Identifier and Classifier online tool was used to identify transcription factor families with default settings . Enrichment analysis for all functional annotations was performed using a Fisher test. The P-values obtained from the Fisher test were adjusted with the Benjamini and Hochberg method .Count matrices were normalized with the variance stabilizing transformation function from the DESeq2 package in R. The normalized reads were then used for weighted gene co-expression networking analysis by WGCNA package in R . The WGCNA was conducted for each tissue separately. Transcripts were checked by the goodSamplesGenes function, and transcripts that had no expression across all time points were removed. The dendrogram plots were made to identify and remove sample outliers. The network construction was done step by step. The soft thresholding power was selected to obtain the approximate scale-free topology, and adjacency was calculated with power equals selected soft power and type is signed type. TOM was calculated with signed type. The initial module eigengenes were detected by default hierarchical clustering function with default parameters other than the appropriate minModuleSize selected foreach tissue, then final module eigengenes were obtained by merging close modules with cutHeight = 0.25.
The module eigengenes values were then used to identify modules that are significantly associated with measured physiological traits by calculating correlations using Pearson’s product-moment correlation coefficient. The intramodular connectivity was assessed using the edge file output based on TOM. Each transcript connectivity was calculated by how many toNode each fromNode connects, and transcripts that had the top 5% highest number of connections in each module were identified as having high connectivity.The annotation of protein-coding regions was performed with primary contig assembly using transcript evidence from RNA- and Iso-seq data plus ab initio predictions trained on RNA- and Iso-seq data. In the transcriptome assembled with Iso-seq data from all five tissue types , a total of 16,366 transcripts were identified compared to the final gene models and the number of transcripts with tissue-specific expression was variable between different tissues, with the highest number in developing buds , followed by leaves , dormant buds , fruits , and flowers . The initial ab initio gene prediction using Augustus with the training set evidenced by Iso-seq data was followed by EVM with weighted evidence. After the final updates with the addition of different isoforms and UTR regions using PASApipeline, in total, 38,315 protein-coding genes were annotated,of which 37,521 were lifted over onto the 15 chromosomes . Genic regions constituted 19.3% of the genome with, on average, 65.94 genes per Mb, gene size of 2,890 bp, exon size of 285 bp, and intron size of 436 bp . The final gene models on the 15 chromosomes contained 1,596 of BUSCOs out of 1,614 genes in the Embryophyta-db10 database . We performed a comparative analysis of orthology-constrained synteny between the pistachio, mango , and sweet orange genomes, all belonging to the Spinadales order. Overall, extensive structural variation among the three genomes was detected .
More collinearity from macrosynteny was found between pistachio and mango chromosomes although one region of pistachio chromosomeswas aligned to two different regions of mango chromosomes, which supports the lineage specific whole genome duplication event in mango. The new genome annotations and the insights to the relationship between pistachio and its close relatives further motivated analysis of the pistachio fruit developmental program.Pistachio fruit tissues and the seed develop asynchronously. To redefine the developmental stages occurring in pistachio, we sampled fruit clusters weekly for 24 weeks from one week after fruit set to two weeks after commercial harvest would have occurred in the 2019 season . Because heat accumulation regulates enzymatic activities that influence many areas of metabolism including fruit growth , we use accumulated heat expressed as growing degree days and calculated as described in the Section 3.3.6, in addition to calendar time in our analysis to allow for more transferable results. Approximately 48 clusters were collected per time point from 12 trees using a randomized block design. Across sampling, we focus on three main tissues–the mesocarp , endocarp , and embryo . To capture environmental variability and provide a comprehensive timeline of fruit development we assessed a large sample size of individual fruit data and validated our results across two additional independent field seasons in distinct orchards across multiple geographical locations in the California Central Valley. By assessing physiological characteristics of fruit and kernel growth we redefined the stages of fruit development in relation to the previous proposed stages of growth and division of hull/shell, shell hardening, plastic pot and kernel growth. From late April through May, the shell and hull tissues grew in a logarithmic growth pattern plateauing at 500 GDD . During this time , the hull and shell tissues were fused together and became increasingly green, losing the initial red pigmentation. Stage II was redefined as a period before embryo development, where the fruit does not experience any expansion but continues to accumulate dry weight. The initiation of embryo growth began after 900 GDD , marking the beginning of Stage III. Kernels accumulated fats during Stage III while the shell simultaneously became increasingly woody and firm . While the kernel had reached its maximum size at the end of Stage III, it continued to accumulate dry weight up until the end of the season. During Stage IV, the hull underwent final changes consistent with fruit ripening. Hull softening occurred concurrently with changes in red and yellow coloration. . Shells began to split during Stage IV with most fruits splitting by 2600 GDD. We fitted linear and linear-mixed polynomial models for each physiological trait as a function of accumulated heat to describe their behavior across the season using tree as a random variable. We decided to test if the behavior of these traits was consistent across multiple seasons and could be anticipated across different environmental conditions. To do this, we integrated multiple years datasets and fitted models and found that fruit size and weight could be accurately modeled across years and separate locations . Although color and texture changes followed similar patterns across locations/years, they were more subject to environmental variation.To investigate the molecular events occurring during pistachio ripening, we performed transcriptomic analysis of the hull, shell, and kernel tissues across up to 15 weeks of development using the same samples used for the physiology experiment. Samplings correspond to the end of Stage II, 886 GDD, through Stage IV, 2564 GDD. RNA was extracted and sequenced from at least three replicates per time point and tissue. The resulting reads were mapped to the predicted transcriptome producing an average of 76.41% mapping percentage across all samples. On average, 25,941 genes total were expressed across each tissue. We performed a principal component analysis with the normalized reads of all samples .
The PCA revealed that the main driver of separation was tissue type with the kernel samples being very distinct from the hull and shell samples . An additional PCA was performed for maternal tissues only, which showed that the shell and hull tissues were separated by tissue type , increasing in similarity with age . We performed a weighted gene co-expression analysis for each tissue type to identify gene modules associated with specific pistachio developmental stages. The WGCNA yielded 19 modules for hull samples, 17 modules for shell samples, and 22 modules for kernel samples . In each set of modules, we focused on patterns of expression corresponding to the stages of development identified in the assessment of the fruit physiology for each tissue . We then performed trait-module correlations to identify module trends significantly correlated to the patterns observed in hull color and texture, shell texture, and kernel firmness, weight and fat content. Highly correlated modules were further investigated to identify regulatory factors or molecular pathways related to each trait.We hypothesized that plant hormones may regulate the initiation and progression of developmental changes in pistachio fruits. We analyzed the expression of various hormone biosynthesis and signaling genes . We focused on hormones known to be related to fruit ripening because of the relevance of Stage IV for the gain of fruit quality attributes. ABA biosynthesis was highest during the beginning of Stage IV in the hull and shell as ripening began, and was also significantly enriched in the H-IV-2 module composed of genes with elevated expression during Stage IV. Among biosynthesis genes, the rate-limiting step NCED had the highest expression and drives the pattern observed in Figure 3.4A. Similar to ABA, auxin and cytokinin related genes increased in expression at the beginning of ripening. The kernel showed the highest levels of GA compared to the hull and shell, peaking at the beginning of ripening. ABA signaling was also higher in the kernel during Stage IV and was also significantly enriched in the kernel modules associated with Stage IV . We found a significant enrichment of ethylene biosynthesis genes in the H-III-3 module . While many of these genes were expressed in this module, the total normalized reads of biosynthesis genes remained constant during Stage III and Stage IV, meaning that ethylene was not likely involved in fruit ripening. Instead, the highest total normalized reads for ethylene biosynthesis in the hull and shell occurred at the beginning of Stage II. JA biosynthesis and signaling genes were also very highly expressed at the same time as ethylene in the hull, corresponding to a significant enrichment of JA signaling genes in the H-II-1 . The signal transduction network for JA biosynthesis including homologs for JAZ TFs , MYC2 TF , and COI-1 were also coexpressed in the H-II-1 module along with many of the JA biosynthesis genes.Because hull integrity is important for maintaining the quality of the shell and kernel, we were interested in changes in the chemical composition of the hull during Stage III and IV that may influence ripening and hull breakdown. We measured volatile organic compounds weekly from 1106 GDD to 2564 GDD from the same samples collected and analyzed for physiological data and gene expression analysis. At the transition between Stage III and IV, we observed a rise in total VOC production . Terpenes, the main volatile fraction components in pistachio hulls, were quantified acrossStage III and IV and all metabolites exhibited the same pattern peaking at the transition between stages .