Fields were subdivided into plots if several crops or intercropping patterns were found in one field.


The rationale behind selecting the case studies was to cover the relevant agroecological, agronomic and commercial contexts in which organic agriculture is implemented in SSA. Kenya and Ghana were selected as focal countries for the following reasons: i) a substantial share of area under organic agriculture in both countries, ii) the existence of organic crop production for export, and iii) the existence of local scientific partners with whom we had prior experience of collaboration and who could implement the study. In both countries, relevant organic farming initiatives were mapped, visited and evaluated according to the following criteria: a) a sufficient number of individual smallholder farms that complied with the farm selection criteria, b) the willingness of the organic initiative operators to cooperate with the research team, and c) coverage of a wide range of agroecological, agronomic and commercial contexts. Out of the 13 organic initiatives evaluated, we selected five, referred to as case studies hereafter. The selected case studies (two in Ghana, one certified and one non-certified, and three in Kenya, one of which was certified and two non-certified) are described in Section 3.1 and Fig. 2. As a first step, we characterized the population of organic farms in each case study area using the socio-demographic and agronomic data collected by the organic interventions, and defined criteria for selection: farms had to a) be located not more than 50 km away from each other, b) have been exposed to the intervention, which aimed at the adoption of organic agriculture, at least three years prior to the start of the data collection period in July 2014, and c) meet or exceed the minimum farm size. In a second step, we stratified the organic farms according to village, and randomly selected organic farms in each stratum. In a third step, we randomly selected similar conventional farms in each stratum. 
These conventional farms needed to meet the same size criteria as the organic farms. In total, the local research partners randomly selected 300 farms for each of the three case studies in Kenya and 400 farms for each of the two case studies in Ghana and, in a series of workshops, sensitized the farmers to the study’s objectives and the intensive data collection involved.

In each case study, the research team assigned a data collection team with a site manager and a group of 10–20 enumerators. The enumerators were trained and monitored extensively in order to ensure homogeneous, comparable and high-quality data. Data was collected for five cropping seasons from August 2014 to March 2017. Season 1 was used as a pre-test for training the enumerators and for tailoring the questionnaire contents and procedures to the needs of the research. Farmers who were literate entered all relevant information into farmers’ field books designed by the researchers. Less literate farmers were supported in keeping regular records of their farming activities by literate family members or farmers’ secretaries. In Ghana, the enumerators or farmers’ secretaries paid each farmer about 200 visits of an average of one hour each. In Kenya, the enumerators transferred the information from farmers’ field books to the electronic questionnaire fortnightly, taking 2–3 h per visit over two years. The questionnaire was an electronic Excel file with an automatic upload function to a database in which all data was stored. The questionnaire contained 20 sheets comprising all the relevant information about each farm concerning inputs, outputs and processes. For each farm, fields were identified and marked on a sketch map. We measured the size of all fields on a farm using handheld GPS devices. For each plot, all crops were documented. For each crop on each field, we documented inputs and outputs as well as which agronomic activities were performed. Finally, all inputs and outputs were documented in physical and monetary units. Physical quantities of yields were determined by using standardized measuring containers and calibrating the farmers’ own containers accordingly. This allowed the farmers to measure the quantities used and harvested with their own containers while keeping the measurements comparable across farms.

Achieving high data quality standards with survey data from smallholder farmers is challenging. Therefore, we implemented a complex iterative data verification and correction process alongside the data collection to ensure complete, valid and consistently high-quality data. To minimize possible response errors, participating farmers and field secretaries were trained in record keeping prior to and during the data collection. Several other measures were applied to reduce respondent/farmer fatigue: interviews were kept brief, but were performed on a regular basis. To maintain farmers’ motivation over the entire course of the project, all the participants received small yearly tokens of appreciation, which did not influence their farming practices. To reduce data entry errors, ongoing training of enumerators, together with support and supervision, was established in all five case study sites, including seasonal workshops, video tutorials and peer review sessions among the enumerators as well as regular checks of enumerators’ performance. Carefully designed questionnaires, including instant validity checks, were used to ease the data entry and reduce mistakes. This enabled enumerators to identify and correct errors directly. After data collection, the completed questionnaires were uploaded to the central Microsoft Access database and passed through multiple data quality checking procedures to detect syntactical, semantic and coverage anomalies within each questionnaire. Automated database queries were set up, resulting in enumerator-specific data quality reports, each encompassing more than 50 validity checks. To also ensure the consistency between different questionnaires and identify enumerator biases, agronomic parameters such as yields, inputs and labour hours were calculated and compared between and within case studies as well as between enumerators. 
We further established processes for identifying outliers for monetary parameters as well as for physical inputs and outputs. Outliers were identified by calculating lower and upper fences. Monetary outliers were replaced with the case study median. For the data entries that caused outliers in physical inputs and outputs, we followed a multiple imputation approach, applying the multivariate imputation by chained equations method for replacing outliers through the R-package “mice”.
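The fence-based outlier rule for monetary parameters can be illustrated with a minimal Python sketch. It is not the authors' code (the study used R), and several details are assumptions: the fences are taken to be Tukey fences with a multiplier of 1.5, the quartiles use the "inclusive" method of the standard library, the replacement median is computed on all values of the group, and the function names and price data are hypothetical.

```python
import statistics

def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR.

    k=1.5 is an assumed multiplier; the source does not state one.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def replace_outliers_with_median(values, k=1.5):
    """Replace values outside the fences with the median of the group."""
    lower, upper = tukey_fences(values, k)
    median = statistics.median(values)
    return [v if lower <= v <= upper else median for v in values]

# Hypothetical unit prices reported within one case study.
prices = [10, 12, 11, 13, 12, 11, 95]
fences = tukey_fences(prices)
cleaned = replace_outliers_with_median(prices)
```

In this toy example only the value 95 lies outside the fences and is replaced by the group median. Outliers in physical quantities were handled differently in the study (multiple imputation), which this sketch does not cover.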

Predictive mean matching was used as the imputation model and 22 variables were included as predictors for output, labour and input quantity outliers. Five imputed datasets were generated through this method and analysed for differences through a MANOVA test. In a sensitivity analysis, the results proved to be stable and, consequently, the initial imputed data set was used for further analysis. The datasets and source code generated during and/or analysed during the current study are available from the corresponding author on request. We used an entropy balancing approach to correct for potential selection bias in each case study with regard to participation in the OA interventions. The exact adjustment of covariate moments makes it an appealing alternative to standard matching or reweighting methods when estimating causal effects from observational studies. Farm-specific weights were generated in STATA using a large range of covariates covering the characteristics of farms and farmers. Unobservable characteristics, such as motivation or risk aversion, were assumed to be implicitly captured through family labour, gender, experience and other covariates. Based on the entropy weights, key performance indicators reflecting immediate and intermediate outcomes were used to compare farms in the intervention groups with farms in the control groups at the crop and farm levels. More immediate outcome variables include compliance with minimum requirements for OA and the uptake of AOM practices. More intermediate outcomes include changes in yield and economic performance in terms of gross margins. As the weights were only assigned to untreated units in the control group through the data preprocessing, the entropy balancing produced estimates of the average treatment effect on the treated. The effects were estimated using a probit regression for the binary compliance outcome. 
A generalized linear model with a binomial family for the error distribution and a logit link for the dependent variable was used to estimate the effect for the AOM scores, as recommended for dependent variables scaled as proportions. For the gross margin estimations, standard ordinary least squares regression was employed. For farm-level estimations of economic performance effects, the robustness of the method was tested by comparing the results generated by entropy balancing with the results produced through propensity score matching. This confirmed the findings. For the impact analysis at crop level, we concentrated on the four crops or crop categories in each case study that were most commonly grown by farmers. Crops were aggregated to crop categories according to Table S9.
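The idea behind the entropy balancing step described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the Stata procedure used in the study: it balances only the mean of a single covariate, starts from uniform base weights, and uses hypothetical data and names (`entropy_balance`, farm size as covariate, yield as outcome). Control units receive weights proportional to exp(lam * x), and lam is chosen so that the weighted control mean equals the mean of the treated group.

```python
import math

def entropy_balance(control_x, target_mean, tol=1e-10, max_iter=100):
    """Weights for control units whose weighted mean of x equals target_mean.

    Weights are proportional to exp(lam * x_i); lam is found by Newton's
    method. target_mean must lie strictly between min and max of control_x,
    otherwise no solution exists.
    """
    lam = 0.0
    weights = [1.0 / len(control_x)] * len(control_x)
    for _ in range(max_iter):
        shift = max(lam * x for x in control_x)  # avoids overflow in exp
        raw = [math.exp(lam * x - shift) for x in control_x]
        total = sum(raw)
        weights = [r / total for r in raw]
        mean = sum(w * x for w, x in zip(weights, control_x))
        gap = mean - target_mean
        if abs(gap) < tol:
            break
        var = sum(w * (x - mean) ** 2 for w, x in zip(weights, control_x))
        lam -= gap / var  # Newton step: d(mean)/d(lam) is the weighted variance
    return weights

# Hypothetical data: farm size (ha) as covariate, yield (t/ha) as outcome.
treated_size = [2.5, 3.0, 3.5]
treated_yield = [1.9, 2.1, 2.3]
control_size = [1.0, 2.0, 3.0, 4.0]
control_yield = [1.5, 1.8, 2.2, 2.6]

target = sum(treated_size) / len(treated_size)
w = entropy_balance(control_size, target)
balanced_mean = sum(wi * x for wi, x in zip(w, control_size))

# ATT estimate: treated mean minus reweighted control mean of the outcome.
att = (sum(treated_yield) / len(treated_yield)
       - sum(wi * y for wi, y in zip(w, control_yield)))
```

Because only the control units are reweighted to resemble the treated group, the resulting contrast estimates the average treatment effect on the treated. With many covariates, as in the study, the same principle applies with one multiplier per balancing constraint.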

The organic to non-organic yield ratio, input cost ratio, labour cost ratio, and gross margin ratio were calculated using a bootstrap procedure to estimate a single confidence interval on the ratio of medians. The systems were deemed significantly different from each other if the 95% confidence interval of the ratio did not include 1. The analysis was implemented using the R-package boot and figures were produced using the R-package ggplot2. Gross margin ratios are not displayed for crops whose organic and conventional gross margins had different signs. The sensitivity analysis, with an assumed general price premium of 20% as a conservative estimate, was based on data from a meta-study. We, however, deducted the estimated cost of maintaining a functioning internal control system and of external certification, as we did not want to overestimate the potential profitability of the smallholder systems. The productivity effects of AOM were assessed across all five case studies using a production function framework. In all cases, a Cobb-Douglas specification was used. Due to the large number of zero values for mineral fertiliser and synthetic pesticide inputs, dummy variables associated with the incidence of zero observations were included in the analysis. In this article, we use the term “pesticide” as an umbrella term for fungicides, insecticides, herbicides and other plant protection substances, unless specified differently. The endogeneity of AOM was tested in all five cases, and the significant correlation of error terms required a regression model that treats this covariate as endogenous. Program participation and experience with organic management were employed as instruments in the first stage equation. We applied the evaluation framework to a broad set of case studies covering different agroecological and market contexts, as well as various types of interventions that had introduced OA to African smallholder farmers. 
Three of the selected case studies were in Kenya and two in Ghana. In three of these case studies the interventions aimed at implementing non-certified OA and in the other two, certified OA was introduced.
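For readers who want to see how the ratio comparisons reported for these case studies were constructed, the bootstrap of the ratio of medians described in the methods can be sketched as follows. This is an illustrative Python version, not the authors' R code: the percentile interval, the number of resamples, the independent resampling of the two groups, and the yield data are all assumptions, and `bootstrap_median_ratio` is a hypothetical name.

```python
import random
import statistics

def bootstrap_median_ratio(organic, conventional, n_boot=2000, alpha=0.05, seed=0):
    """Point estimate and percentile confidence interval for the
    organic-to-conventional ratio of medians."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping its sample size.
        o = rng.choices(organic, k=len(organic))
        c = rng.choices(conventional, k=len(conventional))
        ratios.append(statistics.median(o) / statistics.median(c))
    ratios.sort()
    lower = ratios[int((alpha / 2) * n_boot)]
    upper = ratios[int((1 - alpha / 2) * n_boot) - 1]
    point = statistics.median(organic) / statistics.median(conventional)
    return point, lower, upper

# Hypothetical yields (t/ha) for one crop in one case study.
organic_yield = [1.8, 2.0, 2.1, 2.3, 2.4, 2.6, 2.2]
conventional_yield = [2.9, 3.1, 3.0, 3.3, 3.4, 3.2, 3.5]

ratio, ci_low, ci_high = bootstrap_median_ratio(organic_yield, conventional_yield)
# The systems are treated as different when the interval excludes 1.
significant = not (ci_low <= 1.0 <= ci_high)
```

A ratio below 1 with an interval that excludes 1 would indicate a lower organic median for that parameter. The conventional median must be non-zero for the ratio to be defined.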

The implicit assumption behind the non-certified organic interventions is that they lead to a healthier environment and that farmers benefit from applying agroecological principles and technologies, avoiding the use of synthetic pesticides and mineral fertiliser. The certified organic interventions combine the capacity development efforts with a formal certification to secure price premiums and thereby further improve farmers’ livelihoods. Each case study comprised 280–398 smallholder farmers over the period April 2015 to May 2017. A proportion of these farmers had been exposed to interventions that introduced them to organic farming practices at least three years prior to data collection, while the remaining farmers were selected from similar socio-ecological contexts. Average farm sizes were 2–3 ha in Ghana and around or below 1 ha in Kenya. The most labour-intensive case studies were KE-NC1, which was dominated by vegetable production, and KE-C. In the other case studies, labour hours were between 55 and 147 h/ha*a. In GH-NC, family labour was more dominant than in the other case studies. All five organic interventions significantly reduced the number of farmers using conventional inputs, including synthetic pesticides and/or mineral fertilisers, compared to the control groups. However, the share of farmers not using any conventional inputs differed substantially between case studies. In KE-C, 92% of the farmers who were exposed to the interventions did not use any conventional inputs, while in the control groups the share of farmers not using conventional inputs was low. The farmers in GH-C had low compliance rates with organic standards, as the business partners of the cooperative failed to sell their produce on the organic market and the farmers thus did not receive the expected organic premium price. This contributed to a large number of farmers in the intervention group continuing to use mineral fertilisers and/or synthetic pesticides.