Appendix E. Evaluation Methods
There are four main types of evaluation that may be included in State SNAP-Ed Plans. The following table from Addressing the Challenges of Conducting Effective Supplemental Nutrition Assistance Program Education (SNAP-Ed) Evaluations: A Step-by-Step Guide (Cates et al., 2014) provides definitions and SNAP-Ed relevant example questions:
| Type of Evaluation | Definition | Example Questions |
|---|---|---|
| Formative research | Application of qualitative and quantitative methods to gather data useful for the development and implementation of intervention programs | Do elementary-age school children served by our Implementing Agency eat the recommended daily servings of fruit and vegetables? |
| Process (implementation) study | Measurement and tracking of activities associated with the implementation and fidelity of an intervention program | How many SNAP participants and low-income eligibles are enrolled in the intervention? How many attended each of the six classes offered? |
| Outcome assessment | Examination of the extent to which an intervention program achieves its stated goals; does not establish cause and effect conclusions | Did the Healthy Kid program meet stated goals of increasing use of fat-free or 1% milk by 25% among participating families? |
| Impact evaluation | Measurement of the net change in outcomes for a particular group of people that can be attributed to a specific program | Did children in the Color Your Plate program increase the number and types of vegetables eaten by at least 0.25 cups per day compared with children who did not participate in the program? |
SNAP-Ed providers can use all four types of evaluation to measure indicators in the SNAP-Ed Evaluation Framework. Baseline measurement of relevant individual-level indicators can describe the current state of target behaviors and help answer formative assessment questions. At the environmental settings level, other indicators relevant to formative assessment include identifying a need for improved access or appeal in SNAP-Ed eligible sites and organizations, and identifying and measuring the strength of key partnerships.
The framework contains fewer process indicators because most process data for SNAP-Ed are collected through reporting systems that inform the Education and Administrative Reporting System (EARS). The process indicators that are included in the framework cover reach, resources put towards policy, systems, and environmental (PSE) change activities, and sustainability.
Outcome assessments examine the extent to which an intervention achieves its stated goals. The example in the table above depicts outcomes at the Individual level, and the medium- and long-term individual-level indicators of the framework can be measured pre- and post-intervention to assess whether your intervention achieved its behavioral change goals. The framework also specifies outcomes at the Environmental Settings and Sectors of Influence levels. At these levels, for example, evaluators may assess the food and/or physical activity environments in sites, organizations, communities, or other jurisdictions at baseline and follow-up to determine the extent to which those environments changed as a result of PSE changes that were adopted and implemented.
Impact evaluation measures changes in outcomes attributable to the program and includes the use of control or comparison groups. The step-by-step guide by Cates and colleagues (2014) mentioned above focuses on impact evaluation and discusses important considerations for quality impact evaluation studies.
Considerations in SNAP-Ed Evaluation
Both process and outcome evaluations of SNAP-Ed programming are essential: process evaluation to ensure the fidelity of program delivery, and outcome evaluation to assess its effectiveness. Some SNAP-Ed programs may have internal evaluators on staff who work hand-in-hand with program staff; others may contract with external evaluators from a separate agency; and some may use both. You can find experts with experience in community-based evaluation programs like SNAP-Ed in your state or a neighboring state who can help you evaluate your SNAP-Ed interventions. Many community outreach evaluation services at colleges and universities, including land-grant institutions, are already evaluating SNAP-Ed services. CDC Prevention Research Centers, state and local health departments, and public health institutes can also assist with community-based evaluation for SNAP-Ed programs.
SNAP-Ed evaluations should focus on specific current interventions or initiatives in your approved SNAP-Ed Plan. Evaluations of projects or initiatives beyond the scope of SNAP-Ed interventions or the low-income population, and projects that intend to generate new knowledge or theory in the field of obesity prevention, are considered research and, therefore, will not be approved for funding. For example, requests to fund the creation or validation of an evaluation tool that is not specific to the SNAP-Ed intervention would not be approved. States interested in broad research may wish to seek alternate sources of funding. SNAP-Ed will pay for data collection from a low-income control group (no intervention) or comparison groups (different intervention) when such data are necessary and justified to conduct an impact evaluation of the SNAP-Ed intervention. However, SNAP-Ed funds cannot be used to pay for the portion of data collection or surveillance that covers populations whose incomes exceed 185 percent of the federal poverty level or the general population. Whenever a state carries out a SNAP-Ed evaluation activity that costs more than $400,000 in total, whether spent in one year or over multiple years, FNS strongly recommends that an impact evaluation be conducted.
As with most population-based public health programs, SNAP-Ed programs or interventions generally are not designed to establish cause-and-effect relationships between programming or exposure and outcomes, as a randomized control group design would. This is due to ethics (programming cannot be withheld from SNAP-Ed participants so that they can serve as a control group) and the multi-level complexity of SNAP-Ed programming. However, SNAP-Ed programming may be evaluated to assess correlations or associations among variables (e.g., programming and outcomes) at one, or preferably more, points in time and at multiple programming levels throughout the funding period.
SNAP-Ed Plans may range from 1- to 3-year plans. One-year SNAP-Ed plans and funding cycles pose a challenge for states, which are limited in the length of their program delivery and evaluation designs, particularly when evaluating PSE interventions and social marketing campaigns. Within the Environmental Settings, Sectors of Influence, and Social Norms and Values levels of the framework, some medium- and long-term outcomes and population results cannot be adequately measured within a 1- or even 3-year period, depending upon the indicator(s) and/or measure(s).
As the multi-level, multi-sectoral field of SNAP-Ed programming expands, so does the need for reliable and validated environmental scanning and evaluation tools, measures, and secondary data sources such as those referenced throughout this interpretive guide.
Effective evaluation will help to build the evidence base and identify effective and promising or emerging obesity prevention strategies and interventions. Knowledge of effective obesity prevention strategies and interventions is evolving. Examples of success can be found across the nation in states, cities, towns, tribes, and communities. But there is still much to learn, and programs are challenged to stay up-to-date, to be culturally relevant, and to help establish the evidence-based practices needed to meet the evidence-based requirement of the Healthy, Hunger-Free Kids Act.
As part of your SNAP-Ed evaluation plan, it is helpful to develop a sampling plan, a description of who will participate in the evaluation. Some evaluations may include all participants in an activity or all sites, while others may assess a sample or subset of participants or sites. Sampling in evaluation is useful for programs that do not have the time, funds, or staff to include all participants or sites in their evaluations. If your program does not have internal evaluation expertise, partnering with evaluators or researchers to plan evaluation methods, including sampling plans, will benefit your SNAP-Ed outcomes.
At the Individual level of the framework, when measuring goals, intentions, behavior change, and sustained behaviors, evaluators may wish to collect pre-, post-, and in some cases, follow-up data on a subset or sample of participants. Ideally, this sample will represent the entire group that participated in the SNAP-Ed intervention. A sample is more likely to be representative of the total population from which it is drawn when it is selected using a random sampling method, such as:
- Simple random sampling: All individuals in the population have an equal opportunity of being selected (e.g., a lottery system or draw numbers/names from a hat).
- Systematic sampling: Individuals are selected according to a random starting point and a fixed, periodic interval (e.g., starting from a random point on a list, every 8th individual is selected).
- Stratified sampling: The population is divided into groups (e.g., by grade level), then individuals from each group are randomly selected.
- Cluster sampling: Clusters or groups are randomly selected (e.g., classrooms in a school); all individuals within the selected clusters are included in the study (e.g., all students within the classrooms that were randomly selected).
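The four random sampling methods above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed SNAP-Ed procedure: the roster of 120 students, the classroom names, the sample sizes, and the fixed seed are all hypothetical values chosen for the example.

```python
import random

def simple_random_sample(population, n, rng):
    """Every individual has an equal chance of being selected."""
    return rng.sample(population, n)

def systematic_sample(population, interval, rng):
    """Pick a random starting point, then take every `interval`-th individual."""
    start = rng.randrange(interval)
    return population[start::interval]

def stratified_sample(population, stratum_of, n_per_stratum, rng):
    """Divide the population into groups, then draw randomly within each group."""
    strata = {}
    for person in population:
        strata.setdefault(stratum_of(person), []).append(person)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, min(n_per_stratum, len(members))))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Randomly select whole clusters and include everyone inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [person for name in chosen for person in clusters[name]]

# Hypothetical roster: 120 students, 40 in each of grades 3 to 5,
# spread over 6 classrooms of 20 students each.
rng = random.Random(2014)  # fixed seed so the draw can be reproduced
roster = [
    {"id": i, "grade": 3 + i // 40, "classroom": f"room-{i // 20}"}
    for i in range(120)
]
classrooms = {}
for student in roster:
    classrooms.setdefault(student["classroom"], []).append(student)

srs = simple_random_sample(roster, 24, rng)
systematic = systematic_sample(roster, 8, rng)           # every 8th student
stratified = stratified_sample(roster, lambda s: s["grade"], 8, rng)
clustered = cluster_sample(classrooms, 2, rng)           # 2 whole classrooms
```

Recording the seed alongside the roster lets another evaluator reproduce exactly which participants were drawn.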
Often in SNAP-Ed evaluation, random sampling is not feasible or practical. For example, if you wanted to administer a survey to people who attended a one-time community nutrition event and you did not have a roster of all attendees, you would not be able to select a random sample of attendees. Conclusions drawn from a non-random sample may not generalize to all participants. Specific sampling methods for non-random samples include:
- Convenience sampling: Individuals who are readily available are selected from the population you are studying.
- Purposive sampling: The evaluator selects a sample he or she believes represents the population.
- Snowball sampling: A small group of participants who have the desired characteristics is selected; those participants in turn recommend others with similar characteristics to participate.
- Quota sampling: A sample is selected based on pre-specified characteristics in the same proportion as the population (e.g., if 60% of the population is female, then 60% of the sample will be female; recruitment of females stops when that 60% quota is reached).
- Self-selection sampling: Participants volunteer to participate.
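Of the non-random methods above, quota sampling is the one with an explicit stopping rule, so it lends itself to a short sketch. The example below is only an illustration under stated assumptions: the function name `quota_sample`, the arrival list, and the 60/40 split with a target of 20 respondents are hypothetical.

```python
def quota_sample(arrivals, group_of, population_shares, sample_size):
    """Accept people in the order they arrive until each group's quota is full.

    population_shares maps each group to its share of the population,
    e.g. {"female": 0.6, "male": 0.4}.
    """
    quotas = {g: round(share * sample_size) for g, share in population_shares.items()}
    counts = {g: 0 for g in quotas}
    sample = []
    for person in arrivals:
        group = group_of(person)
        if counts.get(group, 0) < quotas.get(group, 0):
            sample.append(person)
            counts[group] += 1
        if counts == quotas:  # every quota reached, stop recruiting
            break
    return sample

# Hypothetical stream of people approached at an event, in arrival order.
arrivals = [{"id": i, "sex": "male" if i % 3 == 0 else "female"} for i in range(60)]
sample = quota_sample(arrivals, lambda p: p["sex"], {"female": 0.6, "male": 0.4}, 20)
n_female = sum(1 for p in sample if p["sex"] == "female")
n_male = sum(1 for p in sample if p["sex"] == "male")
```

Because people are accepted in arrival order rather than at random, the sample matches the population on the quota characteristic only; it may still differ on characteristics that were not given a quota.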
At the environmental or organizational settings level, programs may choose to measure PSE changes in a sample of sites in which they are working to effect change. The sampling will vary based on the stage at which programs are engaged in PSE changes. The Needs and Readiness Assessment Flow Chart that follows outlines a process to establish readiness to implement PSE interventions.
Needs and Readiness Assessment Flow Chart
As indicated on the flow chart, the readiness process has 5 stages: 1) needs assessment, 2) coalition building, 3) assessing coalition readiness, 4) gaining support from other groups and community members, and 5) assessing site, organization, or community readiness. The stages build on one another and require information from different sources, so the sampling varies by stage. A sampling recommendation for each stage is provided below.
- Needs assessment: For a defined issue, sample 10 issue-specific sites in the community, or within the relevant domains of environmental settings (i.e., eat, learn, live, play, shop, and work). If the community or domain has fewer than 10 sites, then all of the issue-specific sites will be assessed.
- For example, a group wants to address access to parks. Google Maps or other maps could be used to identify public parks in the community. Community A has 20 parks and community B has 5 parks. For community A, 10 parks will be chosen using one of the sampling methods described for the Individual level. For community B, all 5 parks will be assessed.
- Coalition building: A minimum of 3 partnerships, community groups and/or members actively engaged in the issue. Using the access to parks issue above, the three groups and/or members could be: (1) a representative from the department of parks and recreation; (2) a member of a neighborhood board; and (3) a mother who organizes a playgroup at the park.
- Assessing coalition readiness to collaborate: One representative from each group in the coalition. For example, if a park coalition consists of 5 groups, one member of each of the 5 groups would complete the assessment.
- Gaining support from other groups and community members: Five key informants active in the issue. For example, the park coalition could get input from the following: local business leader, after-school program director, president of a local running club, principal of a nearby school, and a representative from the local police department.
- Assessing community readiness: At least 5 key informants who are familiar with the community and the issue.
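The needs assessment rule in the list above (assess 10 sites, or every site when a community has 10 or fewer) can be written as a small helper. This is a sketch only: the function name `select_sites`, the park names, and the seed are hypothetical, and simple random sampling is used here as one of several acceptable ways to choose the 10 sites.

```python
import random

def select_sites(sites, max_sites=10, rng=None):
    """Return all sites if there are max_sites or fewer, else a random sample."""
    rng = rng or random.Random()
    if len(sites) <= max_sites:
        return list(sites)
    return rng.sample(sites, max_sites)

# Hypothetical park lists matching the example: 20 parks and 5 parks.
community_a = [f"Park A{i}" for i in range(1, 21)]
community_b = [f"Park B{i}" for i in range(1, 6)]

rng = random.Random(5)  # fixed seed so the selection can be reproduced
selected_a = select_sites(community_a, rng=rng)  # 10 of the 20 parks
selected_b = select_sites(community_b, rng=rng)  # all 5 parks
```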
The sites selected should represent the SNAP-Ed sites where one or more PSE changes are being made (MT5).
When sampling sites that are already actively addressing the issue, you may want to include the sites or organizations that represent the most advanced sites in terms of their progression through the stages of change at the organizational level (e.g., Weiner, 2009). In this case, the sample of sites would not necessarily be representative of all sites in which you are working, but it would represent those sites that have made significant progress in adopting and implementing PSE changes.
Sectors of Influence
As stated in the introduction to the Sectors of Influence chapter, most indicators will be measured using existing or secondary data sources, and you will likely include all relevant jurisdictions in your evaluation. For example, indicator MT8a1 is the total number of farmers markets that accept SNAP benefits per 10,000 SNAP recipients. If a state has 178 farmers markets that accept SNAP and an average monthly SNAP caseload of 950,000 participants, the rate would be 178/(950,000/10,000), or 1.87 farmers markets for every 10,000 SNAP participants. This rate is calculated from data for the whole state, not from a sample of state data.
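The rate calculation above is simple enough to script, which helps when the same indicator is computed for several jurisdictions or years. The helper name `rate_per_10k` is an assumption made for this sketch; the figures are the ones from the example.

```python
def rate_per_10k(count, caseload, per=10_000):
    """Number of sites per `per` participants, e.g. markets per 10,000 SNAP recipients."""
    return count / (caseload / per)

# 178 SNAP-authorized farmers markets, average monthly caseload of 950,000
rate = rate_per_10k(178, 950_000)
print(f"{rate:.2f} farmers markets per 10,000 SNAP participants")  # 1.87
```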
Most of the data for the indicators at the population level will come from existing data sources at the state level, such as the Behavioral Risk Factor Surveillance System and the Youth Risk Behavior Surveillance System. If states choose to collect data to represent population-level changes, sampling methods described for individual-level sampling would apply. For example, in California, the Champions for Healthy Change study is a 4-year longitudinal survey of mothers, teens, and children from randomly selected SNAP households in 17 local health departments. Telephone interviews collect comprehensive information on dietary behaviors during the previous day using the Automated Self-Administered Recall System (ASA24), types and levels of physical activities, and self-reported height and weight to calculate BMI (https://www.cdph.ca.gov/Programs/CCDPHP/DCDIC/NEOPB/Pages/ChampionsforChangeProgram.aspx).
Determining the Size of Your Sample
Consider partnering with an evaluator or researcher to assist in your evaluation design and selection of an appropriate sample. An experienced evaluator or researcher will be able to assist with a power analysis, which estimates how likely a study is to detect a statistically significant program effect given a specific sample size. Power represents the chance that you will find an effect of your intervention if one exists. Typically, researchers aim for a power of .80, meaning you have an 80 percent chance of finding a statistically significant difference between two groups when there is one. A similar analysis can be conducted to determine the necessary sample size for different types of studies, including one-time surveys, comparisons of pre/post means, comparisons of two proportions, and more sophisticated statistical tests.
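To make the idea of power concrete, the sketch below approximates the power of a two-group comparison of means using the normal distribution from Python's standard library. It is a rough illustration, not a substitute for an evaluator's power analysis: the function name `power_two_group`, the standardized effect size of 0.5, and the group sizes are hypothetical, and dedicated software based on the t distribution will give slightly different values.

```python
from math import sqrt
from statistics import NormalDist

def power_two_group(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    effect_size is the standardized difference (difference in means divided
    by the common standard deviation). Uses a normal approximation.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)             # critical value, about 1.96
    shift = effect_size * sqrt(n_per_group / 2)   # expected test statistic
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

# Hypothetical scenario: a medium effect (0.5) with two different group sizes
power_small = power_two_group(0.5, n_per_group=30)
power_large = power_two_group(0.5, n_per_group=64)
print(f"30 per group: {power_small:.2f}; 64 per group: {power_large:.2f}")
```

Under these assumptions, 64 participants per group brings power to roughly the conventional .80 target, while 30 per group leaves about an even chance of missing a real effect.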
Several factors influence sample size, depending on the type of analysis. For example, if you want to determine how many people you should include in a survey in order to report the percentage of the population who intends to eat more fruit, you will need to know the population size, the margin of error, and the desired confidence level to calculate the sample size (see, for example, https://www.surveymonkey.com/mp/sample-size-calculator/). If you want to compare averages, such as the average cups of fruit per day, before and after an intervention, you will need to know the Type I error rate (typically .05), the Type II error rate (typically .20), the effect size, and the standard deviation for the change (see, for example, http://www.sample-size.net/sample-size-study-paired-t-test/). There are other online calculators available for many study designs. Because the body of evidence for PSE interventions is still building, the anticipated effect sizes needed to calculate an accurate sample size for detecting significant changes in outcomes may not always be known.
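The two calculations described above can be sketched with commonly used textbook formulas. These are assumptions made for illustration and are not necessarily the formulas used by the linked calculators: the survey function uses the standard proportion formula with a finite population correction and a conservative proportion of 0.5, and the pre/post function uses a normal approximation. The population of 10,000, the 0.25-cup change, and the standard deviation of 1.0 cup are hypothetical inputs.

```python
from math import ceil
from statistics import NormalDist

def survey_sample_size(population_size, margin_of_error,
                       confidence=0.95, proportion=0.5):
    """Respondents needed to estimate a percentage within the margin of error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z**2 * proportion * (1 - proportion) / margin_of_error**2
    # Finite population correction: smaller populations need fewer respondents
    return ceil(n0 / (1 + (n0 - 1) / population_size))

def paired_means_sample_size(mean_change, sd_change, alpha=0.05, beta=0.20):
    """Participants needed to detect a pre/post mean change (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # Type I error rate, two-sided
    z_beta = z.inv_cdf(1 - beta)         # Type II error rate (power = 1 - beta)
    return ceil(((z_alpha + z_beta) * sd_change / mean_change) ** 2)

# Hypothetical inputs: 10,000 eligible adults, +/-5 points, 95% confidence
n_survey = survey_sample_size(10_000, 0.05)
# Hypothetical inputs: 0.25-cup increase, standard deviation of change 1.0 cup
n_paired = paired_means_sample_size(0.25, 1.0)
print(n_survey, n_paired)
```

Both results are minimum numbers of completed responses, so recruitment targets should be raised to allow for nonresponse and for participants lost between the pre- and post-intervention surveys.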
The following figure depicts the factors that influence sample size.
There are other considerations such as time and resources that may influence your sampling decisions. For example, in Maryland, the Text2BHealthy program delivers nutrition and physical activity tips via text messages to parents of elementary children who are receiving classroom-based nutrition education. The program staff is also working with grocery stores and parks near the participating schools. To evaluate behavior change, pre- and post-intervention surveys were conducted with parents in a sample of participating schools. The number of schools selected, 50, was based on available funds to provide one prize for a drawing at each school. Schools that had at least two intervention elements were included in the initial list of eligible schools; schools were randomly selected for participation. All parents at the randomly selected schools were invited to participate in the survey.