The statistical technique chosen for the design of the models was multiple linear regression.

The values for the 10-minute run were then downloaded from GitHub into an Excel file. The mean value for each sensor was calculated for that sample and recorded in an Excel file alongside the corresponding Myron Ultrameter value for the same sample. From the remaining ~900 mL of sample, another 100 mL was set aside and refrigerated for additional laboratory analytical testing.

The subsequent phase of the experimental process was the testing of each of the 34 tap water samples for EPA primary and secondary drinking water standards. The purpose was to develop models relating the sensor values recorded in phase one to other important drinking water parameters that are not measured directly by the sensors but were hypothesized to correlate with them due to physical or chemical phenomena. Turbidity and dissolved organic carbon were chosen because of their potential to be correlated with dissolved oxygen, since these parameters are often indicators of microbial activity, which consumes oxygen in water. Various transition metals were also chosen because metals would presumably contribute to the overall electrical conductivity of a sample. In addition, ion-specific sensors are extremely cost prohibitive, so there are economic benefits to attempting to model metal concentrations in water.

Free chlorine was chosen because it relates strongly to the pH and ORP sensor values, as discussed in the related work section. Lastly, color was chosen because it is one of the main visual indicators of poor water quality, and therefore a major consumer concern, although it was not expected that any of the sensor values could be used to predict color, as the sensors do not include any optical measurement. Table 2 lists the parameters tested, the standard method performed for each, and the instrument utilized.

The results obtained from the additional laboratory testing of the samples were matched to the sensor values obtained in phase one and added to the database. Multiple linear regression uses several explanatory variables to predict the outcome of a response variable. The technique can be used to determine how strong the relationship is between two or more independent variables and one dependent variable, and to obtain predicted values for specific variables. Several assumptions must be met to perform multiple regression.
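
As an illustration of the technique, the following minimal Python sketch fits a multiple linear regression by ordinary least squares and reports the adjusted coefficient of determination and the RMSE. It is not the analysis performed in this work: the function names (`solve_linear_system`, `fit_mlr`) and all sample numbers are hypothetical and serve only to show the calculation.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations act on both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows: one list of explanatory values per sample; y: response values.
    Returns (coefficients with the intercept first, adjusted R^2, RMSE).
    """
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    n, k = len(X), len(X[0])
    xtx = [[sum(X[i][p] * X[i][q] for i in range(n)) for q in range(k)]
           for p in range(k)]
    xty = [sum(X[i][p] * y[i] for i in range(n)) for p in range(k)]
    beta = solve_linear_system(xtx, xty)

    pred = [sum(b * v for b, v in zip(beta, row)) for row in X]
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    mean_y = sum(y) / n
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    p = k - 1  # number of explanatory variables
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    rmse = (ss_res / n) ** 0.5
    return beta, adj_r2, rmse


# Made-up example: two explanatory variables and one response per sample.
demo_rows = [[7.1, 310], [7.4, 295], [6.9, 350], [7.8, 280], [7.2, 330], [7.6, 300]]
demo_y = [0.21, 0.18, 0.27, 0.15, 0.24, 0.19]
demo_beta, demo_adj_r2, demo_rmse = fit_mlr(demo_rows, demo_y)
print("coefficients:", demo_beta)
print("adjusted R^2:", demo_adj_r2, "RMSE:", demo_rmse)
```

The adjusted coefficient of determination penalizes the ordinary R² for the number of explanatory variables, which matters when the number of samples is small relative to the number of inputs.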

Four principal assumptions were tested to assess the suitability of the data for multiple regression: 1) there must be a linear relationship between the outcome variable and the independent variables; 2) the residuals are normally distributed; 3) there is no multicollinearity, that is, the independent variables are not highly correlated with each other; and 4) homoscedasticity must be satisfied, meaning the variance of the error terms is similar across the values of the independent variables.

This result is expected for ORP sensors. YSI reports that the most common problem with ORP measurements for environmental water samples is that readings from various instruments for the same water sample can differ by a significant margin even with the same sensor type and electronics, yet the sensors show identical or similar readings in ORP standards. This is explained by the fact that the tap water sampled is expected to be clean by most standards. Therefore, it would likely contain few redox-active species, and those present would have low concentrations, which was also confirmed by the results from the free chlorine testing of the samples. In standard solutions, however, the two probes read the expected values because the concentration of redox-active species is much higher.

Therefore, it is suggested that historical data be used to help determine the validity of probes for environmental water samples; inconsistent data between probes do not always indicate a malfunctioning probe. As for the discrepancies between the pH values, the issue is believed to be sensor drift. This is presumed because the two probes show similar behavior patterns, with the Atlas pH probe values becoming slightly shifted above the Myron Ultrameter values through the month of October 2021. This indicates that more frequent calibration than recommended by the manufacturers may be necessary for long-term sensor deployment. To resolve the discrepancy in pH probe performance, another set of tap water samples could be collected in the future with a recalibration frequency of every 2-4 weeks to determine whether the new data allow for acceptance of the null hypothesis. Despite the difference in performance between the Atlas Scientific sensor probes and the Myron Ultrameter, the average values for the measured parameters are still in the expected ranges for tap water.

The zinc correlations had the best results for both the correlation coefficient and the adjusted coefficient of determination, indicating a strong linear relationship for the model. The adjusted coefficient also indicates that 73% of the variation of the zinc values around the mean is explained by the sensor inputs. Of the metal ions, the models for lead performed the worst. This is likely due to the low lead concentration in the water, which led to negative values for some samples from the ICP-OES, indicating that many of the samples had lead values below the detection limit, which may cause erroneous results. Multiple regression coefficients can also be reduced by collinearity between the input variables. This was tested using the “correl” function in Excel to create a correlation matrix, shown in Table 6.
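
The collinearity check described above can be sketched in Python as follows. Excel's CORREL function returns the Pearson correlation coefficient, which the hypothetical helper `pearson` reproduces; `correlation_matrix` applies it to every pair of inputs. The sensor names come from the text, but the readings are made-up numbers for illustration and are not the values in Table 6.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Pairwise Pearson correlations for a dict of name -> list of values."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Made-up sensor readings, one value per sample.
readings = {
    "pH": [7.1, 7.4, 6.9, 7.8, 7.2, 7.6],
    "EC": [310, 295, 350, 280, 330, 300],
    "DO": [8.2, 8.0, 8.5, 7.9, 8.3, 8.1],
    "ORP": [420, 405, 450, 390, 430, 410],
}
matrix = correlation_matrix(readings)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

Off-diagonal values close to 1 or -1 would flag pairs of inputs that carry largely the same information.
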
Some correlation was observed between pH and EC and between DO and ORP; however, this should not affect the predictive capabilities of the models, and the correlations are low enough to still satisfy the assumption of independence between explanatory variables required by the model.

Lastly, the metal dosing was conducted. The last 8 minutes of each run represent the time frame when the metal solutions were introduced. The “isoutlier” function in MATLAB was used to calculate upper and lower thresholds for each sensor. The program uses a True/False logical array to determine whether a value is outside either threshold and marks values identified as outliers with an “x.” The sensors that recognized outliers are plotted in Figures 16-26. The low pH values flagged as outliers can be attributed to the acidity of the metal solutions used for dosing, which ranged in pH from approximately 3.5 to 6. Based on the point in time at which outliers were detected, it appears that the sensors detected the lowest concentrations for copper, followed by zinc, then lead, and finally iron. EC appears to have some false alarms for the zinc and iron contaminations, as outliers are detected before the metal contamination was introduced into the flowing water. Otherwise, the EC sensors appear to have responded well to the introduction of zinc and sodium chloride, as many outliers are detected after the additions of those contaminants. Lastly, many outliers were detected for DO upon the addition of sodium chloride; however, DO values would be expected to decrease because increasing salinity reduces oxygen solubility, which indicates poor performance of the DO sensor for anomaly detection.

The design and prototype proposed present a wireless water quality monitoring system for point-of-use applications using commercially available sensor probes.
The system architecture is built on a Raspberry Pi microprocessor and includes software that allows users and operators to access sensor data remotely. The experimental results demonstrate the ability of the system to predict other primary and secondary drinking water standards with satisfactory RMSE values for the dataset that was tested, as well as to recognize outliers caused by contamination events. The project could be expanded upon by integrating the predictive models into automated data processing scripts and generally making the system more autonomous.
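
As one possible starting point for such an automated script, the outlier flagging used in the metal dosing experiment can be sketched in Python. The sketch assumes MATLAB's documented default for `isoutlier`, which flags values more than three scaled median absolute deviations (MAD) from the median; the text does not state which `isoutlier` method was actually used. The function name `outlier_flags` and the pH series are hypothetical.

```python
from statistics import median

# Approximately -1 / (sqrt(2) * erfcinv(3/2)); scales the MAD so that it
# estimates the standard deviation for normally distributed data.
MAD_SCALE = 1.4826


def outlier_flags(values, n_mads=3.0):
    """Flag values outside median +/- n_mads * scaled MAD.

    Assumption: mirrors the default 'median' method of MATLAB's isoutlier.
    Returns (list of True/False flags, lower threshold, upper threshold).
    """
    med = median(values)
    scaled_mad = MAD_SCALE * median([abs(v - med) for v in values])
    lower = med - n_mads * scaled_mad
    upper = med + n_mads * scaled_mad
    flags = [v < lower or v > upper for v in values]
    return flags, lower, upper


# Made-up pH series: stable tap water, then a drop after an acidic dose.
ph_series = [7.02, 7.05, 6.98, 7.01, 7.03, 6.99, 7.00, 7.04, 6.40, 5.90]
flags, lower, upper = outlier_flags(ph_series)
for value, is_outlier in zip(ph_series, flags):
    print(value, "x" if is_outlier else "")
```

Because the thresholds are based on the median rather than the mean, a short burst of contaminated readings has little influence on where the thresholds fall.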

The proposed algorithms can be improved upon with more robust data analysis and machine learning techniques, given the complexity of a data set with five explanatory variables. Future work includes retrofitting the sensor prototype into mobile water filtration units deployed in areas with known water quality issues and developing an app to accompany the system that alerts users to water quality issues when they are detected. Overall, the viability of the system is promising for combining many of the advantages of wireless sensor networks for remote monitoring of drinking water.

Dissecting complex environments into biotic and abiotic components for individual study may lead to surprising and novel discoveries about how organisms respond phenotypically and evolutionarily to their habitats. This is because the net effect of a given environment is the accumulation of effects due to all relevant components of the ecosystem. Selection on floral characteristics, for instance, is mediated by many agents including pollinators, herbivores, temperature, and water availability. Thus, the complexity of natural habitats can preclude identification of the precise stimulus for phenotypic change or natural selection in the field. By isolating simpler components of complex habitats, we can test whether each affects the phenotype expressed by a given genotype, the adaptive value of that phenotype, or both. These two effects of environment (phenotypic plasticity and differential selection) are crucial to understanding trait evolution and fitness both in a historical context and in the context of a changing planet. Despite their importance, most ecological drivers of trait expression and natural selection are unknown. Because environmental effects are best understood as responses of one trait to a specific stimulus, disentangling the precise ecological interactions that cause plasticity and differential selection in nature is an important goal.
For plants, soil is a key component of the complex natural habitat. Soils contain intricate patterns of chemical, physical, and microbial variation that are linked on continental and centimeter scales. Feedbacks between above-ground plant communities, below-ground microbial communities, and nutrient availability are common. At the level of populations and individual plants, soil microbes can affect plant growth, resistance to infection, and above-ground herbivory. Additionally, microbes can mediate adaptation to novel environments. Subsets of the soil microbiome interact with plants by colonizing aerial plant tissues or the rhizosphere or root. Thus, soil chemistry, plant biology, and microbial ecology are intricately linked, and soils comprise many potential biotic and abiotic selective agents.

In this study, we investigated the role of soil as a driver of plasticity and as an agent of selection on flowering time, an important ecological trait for plants and their communities. Flowering phenology has a strong genetic component but also responds to stimuli including temperature, water availability, pathogen infection, and herbivory. While soil chemistry is known to affect flowering time, soil microbial communities have rarely been acknowledged as possible drivers of reproductive timing in plants. Furthermore, previous studies that explored the relationship between soil microbiome, flowering, and selection used domesticated plants, artificial microbial communities, and/or biota from heavily disturbed soils. Although these experiments provide evidence that soil microbes change plant reproductive timing and selection pressures, they allow us to draw only limited conclusions about the evolutionary importance of this process. Here, we asked whether flowering phenology is plastic in response to different soil microbiomes, and whether soil microbial communities alter the intensity of selection on flowering phenology.
We further asked whether the relative abundances of specific members of the microbiome predict the observed effects of microbial treatments on flowering time. To test these hypotheses, we grew gnotobiotic seedlings of the non-mycorrhizal wild mustard Boechera stricta Al-Shehbaz in sterilized potting soil inoculated with microbial communities extracted from soils of four undisturbed natural habitats. To enable comparison of biotic and abiotic soil variables, we also grew seedlings in field soils collected from the same habitats, which were sterilized to eliminate their natural microbiomes but retained their chemical and physical differences. For each individual plant, we recorded the day of first flowering, height, number of leaves, and number of fruits. We quantified selection on phenology as the linear relationship between flowering time and fruit production. Rather than manipulating microbial communities, we used presumably intact communities extracted from soils collected from undisturbed field sites near wild B. stricta populations. Our experiment included 48 natural, inbred accessions that represent the breadth of genetic diversity harbored by B. stricta in the study region. Furthermore, we focused on a phenotype that is under selection in this species in a nearby field site.