The distal region of Chr 3C was associated with six volatiles, including five terpenes.


It is possible that retention of gene copies within metabolic networks increases flux, providing a selective advantage, or maintains gene balance, according to the gene dosage balance hypothesis. Network analysis pointed to the leading role of plant cell wall metabolism in determining quality attributes. In particular, xyloglucan endotransglycosylase/hydrolases (XTHs) emerged as central hubs in the network, being correlated both with other members of the gene family and with sensory attributes related to tomato texture and taste. Texture is one of the critical components of tomato fruit quality perception. The activation of genes related to cell wall polysaccharide synthesis affects the structure and properties of the cell wall and hence the texture and taste attributes. XTH enzymes are involved in remodeling plant cell wall hemicelluloses and in disassembling the cellulose–xyloglucan matrix, a process that contributes to fruit softening, or in maintaining cell wall integrity. To date, genetically engineered tomatoes with altered expression of xyloglucan endotransglucosylase/hydrolase have shown that it affects texture. The role of individual cell wall–modifying enzymes in fruit softening and the composition of polymers in the fruit cell wall differ between fruit species and within cultivars of the same species. Important XTH genes, physically located in a cluster on chromosome 3, display similar expression patterns in all three genotypes and tend to conserve some specific interactions with the other members of the family. On the other hand, a few XTHs display specific links in only one environment, suggesting that cell wall gene remodeling is involved in the adaptation.

In tomato the XTH family was highly expanded, suggesting that xyloglucan-modifying enzymes may play a more important role in fruit quality than previously suspected. Network analysis revealed a multifaceted role for these enzymes: first, they are hubs able to tune network relationships; second, they are involved in regulating different sensory attributes, mainly textural ones such as flouriness, hardness, turgidity, juiciness and skin resistance, but also attributes related to fruit taste and appearance. In SM, texture emerged as a highly dynamic sensory parameter in terms of the number of links between the two environments, including links to the taste attribute saltiness. In RSV too, the textural attributes as well as the taste attribute sweetness were highly dynamic traits, showing a larger number of changing interactions between the two environments. The differing magnitudes of variability in network connectivity across environments reflect differences in cultivar response to the environment, deriving from the conservation and divergence of gene regulation in response to different environments. In addition, the ACSs were confirmed as master regulators of ethylene biosynthesis and fruit quality, as were the ERF transcription factors, downstream components of ethylene signaling that regulate the expression of ethylene-responsive genes, which in turn regulate quality-related traits such as color, firmness, aroma, and taste. Finally, the combined analysis of RNA-seq and metabolome data showed a good correspondence between transcript levels and metabolite abundances. The main pathways related to fruit quality showed a coherent pattern between changed metabolites and changed transcripts. Changes in both primary and secondary metabolism between Acerra and Sarno resulted from differential gene expression between environments. For example, the downregulation in one environment of genes involved in metabolite degradation was consistent with the accumulation of the corresponding metabolite in that environment.

Fruit flavor is an elusive trait, influenced by many factors including genetics, environment and cultural practices. Breeders are increasingly focused on meeting the needs of consumers, but genetic improvement of flavor is challenging because of the chemical and genetic complexity of the flavor phenotype. These challenges are accentuated in heterozygous, polyploid species. For example, fewer significant single nucleotide polymorphisms (SNPs) were detected in a genome-wide association study (GWAS) of tetraploid blueberry when diploid models were applied; in octoploid strawberry, structural variation underlying a locus affecting volatile production was difficult to resolve using a single reference genome. Recent advances have been made via chemical–sensory studies to identify specific volatiles associated with consumer preference. Although important volatile compounds in fruit crops are being identified, too little is known about the metabolomic and genetic diversity within species and breeding populations. Some volatiles have been lost during domestication and breeding as a combined result of negative selection and linkage drag in tomato and watermelon. Likewise, the gain and loss of terpene compounds during strawberry domestication and their genetic causes have been investigated. Recent advances in sequencing technology and analytical approaches have opened new opportunities to understand the chemistry and genetics of fruit flavor. Genome-wide association studies have revealed loci for flavor in a variety of fruit crops. Meanwhile, genome-wide expression quantitative trait loci (eQTL) studies have the capability to bridge the gaps between GWAS signals and their underlying causative genes. Integration of GWAS and eQTL studies has led to the discovery of a master metabolite regulator in tomato and a flesh-color-determining gene in melon.
Long-read sequencing now allows assembly of genomes with high contiguity, and when coupled with parental short-read data, the two haplotypes of a heterozygous individual can be fully resolved. Phased assemblies have improved variant discovery, especially for large structural variants (SVs). The extent, diversity and impact of SVs are increasingly being studied in horticultural crops, and SVs have been shown to alter fruit flavor, fruit shape and sex determination. Great opportunity exists to coherently integrate these multi-omics resources for the discovery of flavor genes.

Garden strawberry is an allo-octoploid species with highly palatable non-climacteric fruit. It has increasingly been used as a model for genomics and flavor research in Rosaceae fruit crops because of its short generation time, wide cultivation and high value. Through exploration of spatiotemporal changes in gene expression and homolog searches, several flavor genes have been cloned and validated, including an alcohol dehydrogenase and several alcohol acyltransferases for esters, a nerolidol synthase 1 for terpenes and a quinone oxidoreductase for furaneol. Recently, QTL studies and transcriptome data analyses for strawberry volatiles using biparental crosses have detected QTL and causative genes for mesifurane and gamma-decalactone. Nevertheless, low mapping resolution and a lack of subgenome-specific markers have hampered further characterization of causal genes underlying other QTL. This problem was recently addressed by the development of the 50K Fana SNP array using probe DNA sequences physically anchored to the octoploid ‘Camarosa’ genome. High heterozygosity combined with an allopolyploid genome presents difficulties for resolving causative genes and their haplotypes. To further the goal of discovering causative genes affecting flavor in strawberry, association studies with larger sample sizes and additional genetic resources such as eQTL and additional genomes are required. Furthermore, these resources must span the breadth of natural variation in breeding germplasm. Here we present multi-omics resources consisting of an eQTL study representing the genetic diversity of strawberry breeding programs in the US, phased genome assemblies of a highly flavored University of Florida breeding selection, a structural variation map in octoploid strawberry and a volatile GWAS of 305 individuals. These are combined to leverage the extensive metabolomic, genomic and regulatory complexity in strawberry for the discovery of natural variation in genes affecting flavor. Ultimately, the functional alleles identified will be selected in breeding to achieve superior flavor.

The eQTL population consisted of 196 genotypes including 133 newly sequenced accessions. The University of Florida genotypes were grown at GCREC and collected in the spring of 2020 and 2021. The University of California-Davis collection of diverse selections from multiple breeding programs was grown at either Santa Maria, CA or Oxnard, CA, for day-neutral and short-day accessions, respectively, and collected in the spring of 2021. Four UC genotypes were collected at both sites to ensure sequencing and SNP quality. Total RNA was extracted from a bulk of three fully ripe fruits using a Spectrum™ Plant Total RNA Kit, after flash freezing in liquid nitrogen. Illumina 150-bp paired-end sequencing was performed on the Illumina NovaSeq platform by Novogene Co. On average, 6.9 Gb of sequence data were obtained for each sample. Raw RNA-Seq data of 63 samples from previously published studies were retrieved from the NCBI SRA database. To quantify gene expression, short reads were trimmed to remove adapter sequences and low-quality reads with TRIMMOMATIC v.0.39 and aligned against the reference genome using STAR v.2.7.6a in two-pass mode.

Only uniquely aligned reads were counted by HTSEQ v.0.11.2 in union mode with the ‘--nonunique none’ flag, supplied with the latest Fragaria_ananassa_v1.0.a2 annotation. All count files were compiled in R and normalized with the DESEQ package. To generate the marker dataset for eQTL mapping, SNPs and InDels were called using the mpileup and call commands. Markers were further hard-filtered using BCFTOOLS in the following steps: individual calls with a sequencing depth lower than three were set to missing using the +setGT plugin; marker sites with quality < 30, missing rate > 0.3, heterozygous call rate > 0.98, minor allele frequency < 0.05, or more than one alternative allele were purged; the filtered markers were imported and analyzed in R, and only markers showing more than three matched calls in four duplicated sample pairs were retained. A total of 491 896 markers passed the three stages of filtering. The missing calls were imputed, and all calls were phased using BEAGLE v.5.2 with the default settings. The eQTL mapping was performed for 62 181 fruit-expressed genes using the filtered markers. Linear mixed models implemented in GEMMA were used for association analysis. The relationship matrix was computed in GEMMA and supplied to account for relatedness within the population, and the top five principal components, which together explained 25.0% of the variance, were included as covariates to reduce the effects of population stratification and to highlight the genetic variance underlying the target traits. The Bonferroni-corrected 5% significance threshold was used, determined by the number of LD-pruned markers. The approach to defining an eQTL was similar to that used in previous studies. Briefly, we first clustered all significant markers with distance < 100 kb and purged clusters with fewer than three markers. The lead marker with the lowest P-value was used to identify the eQTL, and the boundaries of the eQTL were defined as the furthest flanking significant markers.
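The clustering rule just described can be illustrated with a minimal Python sketch. It is not the code used in the study: the function and class names (`call_eqtl`, `EQTL`) are hypothetical, the input is assumed to be markers that have already passed the significance threshold, and "distance < 100 kb" is interpreted here as the gap between adjacent significant markers on the same chromosome. LD-based merging of clusters is not included.

```python
from dataclasses import dataclass
from typing import List, Tuple

Hit = Tuple[str, int, float]  # (chromosome, position in bp, P-value); hypothetical layout


@dataclass
class EQTL:
    chrom: str
    start: int      # furthest flanking significant marker on the left
    end: int        # furthest flanking significant marker on the right
    lead_pos: int   # position of the marker with the lowest P-value
    lead_p: float
    n_markers: int


def _flush(cluster: List[Hit], out: List[EQTL], min_markers: int) -> None:
    """Turn one cluster into an eQTL unless it has too few markers."""
    if len(cluster) < min_markers:
        return  # purge small clusters
    lead = min(cluster, key=lambda h: h[2])
    out.append(EQTL(cluster[0][0], cluster[0][1], cluster[-1][1],
                    lead[1], lead[2], len(cluster)))


def call_eqtl(hits: List[Hit], max_gap: int = 100_000,
              min_markers: int = 3) -> List[EQTL]:
    """Cluster significant markers and summarize each cluster as an eQTL.

    Assumption: adjacent significant markers on the same chromosome that are
    closer than max_gap belong to the same cluster.
    """
    eqtl: List[EQTL] = []
    cluster: List[Hit] = []
    for hit in sorted(hits, key=lambda h: (h[0], h[1])):
        new_chrom = bool(cluster) and hit[0] != cluster[-1][0]
        big_gap = bool(cluster) and hit[1] - cluster[-1][1] >= max_gap
        if new_chrom or big_gap:
            _flush(cluster, eqtl, min_markers)
            cluster = []
        cluster.append(hit)
    if cluster:
        _flush(cluster, eqtl, min_markers)
    return eqtl


# Toy input: three nearby markers on Chr 3C, one isolated marker, two on Chr 6A.
example_hits = [
    ("3C", 1_000, 1e-9), ("3C", 50_000, 1e-12), ("3C", 120_000, 1e-8),
    ("3C", 900_000, 1e-10),
    ("6A", 10, 1e-9), ("6A", 20, 1e-9),
]
for q in call_eqtl(example_hits):
    print(q)
```

In the toy input only the first three markers form a cluster large enough to be kept; the isolated marker and the two-marker cluster are purged by the three-marker minimum.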
Clusters in LD were merged and boundaries were updated.

To investigate the genetic control of fruit volatiles, we performed volatile phenotyping and SNP array genotyping with 49 330 markers on a panel of 305 accessions from the UF strawberry breeding program, 59 of which overlapped with the eQTL panel. A total of 97 volatiles including esters, terpenes, aldehydes, alcohols, acids, ketones and lactones were quantified. Based on relationships among volatiles, we identified at least five clusters belonging to the same chemical class or biosynthetic pathway, including clusters of eight aldehydes, three ethyl esters, three hexanoic acid derivatives, seven medium-chain esters and three terpenes. Narrow-sense heritability (h2) was generally high across volatiles, ranging from 0.212 to 0.916, with a mean of 0.660. The highest value of h2 was found for mesifurane and the lowest for octanoic acid, ethyl ester. The GWAS identified 62 signals for 35 volatiles. The lead SNP effects varied from 0.27 to 2.44, with the largest effect for methyl anthranilate. Two hotspots that contained multiple signals for volatiles belonging to the same class or pathway were found, one for medium-chain esters and one for terpenes; both were also detected in previous studies and are reflected in the chemical relationships. Our GWAS results confirmed previous homoeologous group assignments for these volatile QTL and clarified their subgenome and physical positions. The SNP AX-166515537 was the lead SNP for three esters, and a 14 Mb region on Chr 6A shared signals for six medium-chain esters. An LD analysis revealed three linkage blocks. This 3.1-Mb region did not display clear LD block separation. Two significant markers for the medium-chain ester hotspot and methyl thiolacetate were tested for their ability to predict flavor characteristics.
Some abundant volatiles, including 2-hexenal, -; butanoic acid, 2-methyl-; and pentanal, were associated with multiple DNA variants, suggesting polygenic inheritance. Pentanal was associated with three loci, together explaining 30.7% of phenotypic variation in a GLM model. Significantly higher pentanal content was observed in genotypes with three doses of the alternative allele at two loci.

In this study we leveraged eQTL, GWAS and haplotype-resolved genome assemblies of a heterozygous octoploid to identify allelic variation in flavor genes and their regulatory elements.
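To make the "phenotypic variation explained" figure reported above for the three pentanal loci concrete, the sketch below computes R² for a multi-locus linear model. It assumes an ordinary least-squares fit with an intercept and additive allele dosages coded 0, 1 or 2 at each locus, with no further covariates; the exact model specification is not given in the text, so this is an illustration only. The function names and the simulated data are hypothetical and bear no relation to the study's measurements.

```python
import random
from typing import List, Sequence


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix (collinear loci?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def variance_explained(dosages: Sequence[Sequence[float]],
                       phenotype: Sequence[float]) -> float:
    """R^2 of phenotype ~ intercept + additive dosage at each locus."""
    rows = [[1.0] + [float(d) for d in row] for row in dosages]
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, phenotype)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(phenotype) / len(phenotype)
    ss_res = sum((y - f) ** 2 for y, f in zip(phenotype, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in phenotype)
    return 1.0 - ss_res / ss_tot


# Simulated example: 200 individuals, three loci, made-up effect sizes.
rng = random.Random(0)
sim_dosages = [[rng.randint(0, 2) for _ in range(3)] for _ in range(200)]
sim_pheno = [1.0 + 0.6 * d[0] + 0.4 * d[1] + 0.3 * d[2] + rng.gauss(0, 1)
             for d in sim_dosages]
r2 = variance_explained(sim_dosages, sim_pheno)
print(f"Proportion of phenotypic variance explained: {r2:.3f}")
```

A joint fit of this kind reports the variance explained by the loci together, which is generally not the same as the sum of the single-locus values when the loci are correlated.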