Overview: Evaluation
Program evaluation is an essential part of the design and implementation
of any intervention. Evaluation uses research methodologies to address
fundamental questions of program development, including what should
be attempted, what was done, to whom, how, and what effect the intervention
had, if any.[1] Evaluation draws on research methodologies used in
the social and behavioral sciences to answer these questions, but it
is not equivalent to scientific research: it usually involves more
collaboration and cooperation with program administrators and designers.
A well-conceived evaluation is an iterative process that provides
program administrators and designers with information
critical to program development and implementation. To do this,
evaluators and program staff must work “up front” to
design an evaluation with clear goals and a well-developed plan
that fits seamlessly into the overall program design and implementation.
This “up front” work, in turn, enables the evaluation
to provide an assessment of whether the program achieved its goals.
Evaluation is commonly associated with educational and behavioral
intervention programs, such as AIDS prevention, smoking prevention/cessation,
and sexual abstinence programs, but it is applicable to any type of
program development.[1] The purpose of the MCHB Bright Futures Resource
Center for Curricula was to develop a curriculum to enhance residency
training in the areas of Behavioral Pediatrics and Adolescent Medicine.
Evaluation of the program was an integral part of its design. This
chapter provides an overview of three major types of evaluation
(formative, process, and outcome) and describes how our Center has
evaluated the program to date.
Formative Evaluation
Formative evaluation is akin to needs assessment and is performed
early in the evaluation process. The purpose of formative evaluation
is to understand the need for the intervention and begin making
decisions about how to implement or improve the intervention. Thus,
information gleaned through the formative evaluations is critical
to designing a targeted, effective intervention. Our Center undertook
a major formative evaluation effort. In order to understand residency
training needs in Behavioral Pediatrics and Adolescent Medicine,
all pediatric residency training programs in the U.S. were surveyed.[2]
The survey instrument consisted of three parts, one for the residency
training director, one for the Adolescent Medicine director, and
one for the Behavioral Pediatrics director. The surveys assessed
current training practices and areas of the curriculum that directors
felt needed enhancement. The surveys revealed that, though reading
lists were numerous, there was a lack of case-based materials to
teach residents about behavioral pediatrics and adolescent medicine.
The need for case-based materials identified through the above
survey formed the basis for the curriculum that we developed. Critical
areas in behavioral pediatrics and adolescent medicine were identified
by program staff and targeted for case development. Educational
and teaching goals were identified for each of these areas and then
cases were written and adjunctive materials developed to create
a complete curriculum package to meet these goals.
Case development itself constituted the second formative evaluation
effort of this program. Cases were written, and then reviewed extensively
by experts in the field and revised. They were then pilot tested
on a small number of pediatric residents at a teaching conference.
Each pilot test was accompanied by a survey of both the facilitator
and the learners. These surveys assessed how well both the facilitator
and the learners felt the case and ensuing discussion met the identified
educational and teaching goals. There was also opportunity for open-ended
feedback. In addition, program staff attended many of these sessions
to directly observe problems with the cases and how they were used
and received. All this information was then brought back to program
administrators and case writers to inform the process of case revision.
Cases and the evaluation forms were revised multiple times. This
provides a clear example of how important it is for evaluation and
program staff to collaborate closely and how critical evaluation
is in the iterative process of intervention development. It also
illustrates how formative evaluation differs from both outcome
evaluation and scientific research. Because the cases and the evaluation
instruments were undergoing multiple revisions, the teaching sessions
based on these early cases were not equivalent: learners and
facilitators were receiving different “interventions” (cases). Thus,
although the goals and the evaluation forms used in these sessions
were essentially identical, the results are not comparable across
sessions and cannot be used in the final assessment of outcomes.
Process Evaluation
The second type of evaluation is process evaluation. Process evaluation
is used, ideally periodically, after a program is implemented.
For the Bright Futures Resource Center, this began after cases were
finalized. Process evaluation helps program staff determine
whether program goals are being adequately met by answering the
questions of what was done, how, and to whom. Like formative evaluation
efforts, the results of process evaluation can be used to guide
changes in the program that would improve the ability to meet stated
programmatic goals. A number of the methods are similar to those
used in formative evaluation, including surveys, direct observation,
and open-ended interviews. The first two of these are performed
during the intervention; the last is performed outside the intervention.
In addition, administrative records provide another important source
of information for process evaluation. Surveys are used to obtain
information on program participants and determine characteristics
of those who received the intervention. The learner and facilitator
evaluation forms that this project developed and used in the formative
evaluation stage also served as instruments to obtain process
evaluation information on those receiving the newly derived curriculum.
Direct observation was also used. During process evaluation, it
is important that those performing the direct observation be unobtrusive
and not disrupt the intervention. Observers need to be well-trained
in how to systematically and uniformly make observations and record
the encounter. To minimize inter-observer variability, our Center
used one program staff person trained in adolescent medicine to
observe all adolescent medicine case-based teaching sessions and
another person trained in behavioral pediatrics to observe all behavioral
cases.
Lastly, monitoring standardized administrative records provides
an important source of data for process evaluation. These records
provide information on who received the intervention and what, specifically,
was provided. The development of database templates for these standardized
records is another critical piece of program design. Our administrative
records consist of computerized databases. These databases maintain
information on all programs requesting and receiving any or all
parts of the curriculum. This provides, on a programmatic level,
information about which programs are being reached and what cases
are requested.
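
As a rough illustration of what such a standardized record might look
like, the following is a minimal sketch in Python. It is not the
Center's actual database: the record fields, program names, case
titles, and dates are all hypothetical and were chosen only to show how
program-level questions (which programs are reached, which cases are
requested) could be answered from uniform records.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CurriculumRequest:
    """One administrative record: a program requesting one case (hypothetical fields)."""
    program: str        # requesting residency program
    case_title: str     # case that was requested
    requested_on: date  # date the request was logged


def requests_per_case(records):
    """Count how many times each case was requested."""
    return Counter(r.case_title for r in records)


def programs_reached(records):
    """Return the set of distinct programs that requested any case."""
    return {r.program for r in records}


# Made-up records, for illustration only.
records = [
    CurriculumRequest("Program A", "Case 1", date(1999, 1, 5)),
    CurriculumRequest("Program A", "Case 2", date(1999, 1, 5)),
    CurriculumRequest("Program B", "Case 1", date(1999, 2, 10)),
]

print(requests_per_case(records))
print(sorted(programs_reached(records)))
```

Keeping every request in the same shape is what makes these summaries
a one-line query rather than a manual tally.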
Outcome Evaluation
The last type of evaluation is the one that often receives the
most attention. Outcome evaluation attempts to determine the meaning
or effect of an intervention. Like process evaluation, it should
be used periodically to assess if and how well program goals are
being achieved. The ideal design for program evaluation would be
one in which the same individuals could be compared to themselves
both with and without the intervention. Because this is not possible,
evaluators have turned to study design to try to infer
what the effect of the intervention is in the population of interest.
Three basic types of design are used in outcome
evaluation: non-experimental, quasi-experimental, and randomized.[1]
Non-Experimental Designs: Non-experimental designs are the most
commonly utilized outcome evaluation technique. These designs do
not employ a comparison group. Instead, individuals receiving the
intervention are compared with themselves before and after the intervention
in terms of variables the intervention is designed to influence.
Changes in any of these variables are then ascribed to the intervention.
The Center assessed residents’ level of comfort with, and skill in
completing, a Denver II developmental screen using a case that focuses
on this screen. The evaluation revealed that use of the case significantly
increased residents’ knowledge in interpreting the Denver
II.[4, 5] Thus, this technique provided useful information as to whether
programmatic goals of these cases were being met. One problem with
non-experimental designs is that the inference that change was due
to the intervention is subject to confounding. Therefore, it is
not possible to state with certainty that changes are, in fact, due
to the intervention itself.
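
The before-and-after comparison described above can be sketched as
follows. This is a minimal illustration, not the analysis the Center
performed: the function name and the scores are invented, and the
paired t statistic is simply one common way to summarize change within
the same individuals.

```python
import math
from statistics import mean, stdev


def paired_change(pre, post):
    """Return (mean change, paired t statistic) for matched pre/post scores."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need at least two matched pre/post pairs")
    # Each learner is compared with himself or herself.
    diffs = [after - before for before, after in zip(pre, post)]
    mean_diff = mean(diffs)
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t_stat = mean_diff / (sd / math.sqrt(len(diffs)))
    return mean_diff, t_stat


# Made-up knowledge scores for five learners, for illustration only.
pre_scores = [4, 5, 3, 6, 5]
post_scores = [6, 7, 5, 7, 8]

mean_diff, t_stat = paired_change(pre_scores, post_scores)
print(mean_diff, round(t_stat, 2))
```

A large statistic shows only that scores changed. Without a comparison
group, it cannot show that the intervention caused the change.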
Quasi-Experimental Designs: Quasi-experimental design is more rigorous
than non-experimental design because it uses a separate, non-randomized
comparison group. However, because the comparison group is not randomized,
this design is also subject to bias and confounding. Factors associated
with the outcome may not be equally distributed between the intervention
and comparison groups. The use of matched controls decreases bias and
confounding. However, matching requires extensive knowledge of the
literature to identify the confounding variables on which the cohorts
should be matched. In addition, once matching variables are identified,
it is often extremely difficult to actually match intervention and
comparison participants. Beyond difficulties of sample size and
recruitment, relevant factors are often difficult to measure or, for
that matter, unknown.
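
The difficulty of matching can be seen even in a small sketch. The
example below assumes exact matching on two hypothetical variables
(training year and program size). The variable names and the
participants are invented for illustration, and exact matching is only
one of several matching strategies.

```python
from collections import defaultdict


def match_exact(intervention, comparison, keys):
    """Pair each intervention unit with an unused comparison unit that has
    identical values on every variable named in `keys`."""
    pool = defaultdict(list)
    for unit in comparison:
        pool[tuple(unit[k] for k in keys)].append(unit)

    pairs, unmatched = [], []
    for unit in intervention:
        candidates = pool[tuple(unit[k] for k in keys)]
        if candidates:
            # Each comparison unit is used at most once.
            pairs.append((unit, candidates.pop(0)))
        else:
            unmatched.append(unit)
    return pairs, unmatched


# Made-up participants, for illustration only.
intervention = [
    {"id": "I1", "year": 1, "size": "large"},
    {"id": "I2", "year": 2, "size": "small"},
    {"id": "I3", "year": 3, "size": "large"},
]
comparison = [
    {"id": "C1", "year": 2, "size": "small"},
    {"id": "C2", "year": 1, "size": "large"},
    {"id": "C3", "year": 1, "size": "small"},
]

pairs, unmatched = match_exact(intervention, comparison, ["year", "size"])
print([(a["id"], b["id"]) for a, b in pairs])
print([u["id"] for u in unmatched])
```

Even with only two matching variables, one participant finds no match.
Each additional variable shrinks the pool of eligible comparisons
further.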
Randomized Experimental Designs: Randomized experiments are considered
the sine qua non of outcome evaluation precisely because randomization
reduces bias between intervention and control groups. However, randomized
designs are the most difficult and most expensive to perform, and
can be ethically challenging. Randomized designs require the development
of only one cohort. That cohort is then randomly divided into those
who receive the intervention and those who do not. Thus, they require
that those targeted to receive the intervention and those requesting
it be willing to not obtain the intervention until after the outcome
evaluation is completed. For the Bright Futures Center, this would
mean withholding the curriculum from half of the programs that request
it until after the outcome evaluation is completed at their site.
Because this was not felt to be acceptable for the Center, one of
whose major goals was to make curricular resources readily available
to those in need, randomized designs were not employed as part of
the evaluation. However, randomized trials of these cases as an
educational intervention may be possible in a future project and
we would encourage all faculty who use this curriculum to think
creatively about ways to conduct such testing.
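
For faculty considering such a trial, the random division of a single
cohort described above can be sketched as follows. This is a minimal
illustration under simple assumptions (one cohort, two arms of nearly
equal size, no stratification). The function name and program labels
are hypothetical, and the fixed seed is used only so that the
allocation can be reproduced and audited.

```python
import random


def randomize(cohort, seed=None):
    """Randomly split one cohort into intervention and control arms."""
    rng = random.Random(seed)   # private generator; a seed makes the split reproducible
    shuffled = list(cohort)     # copy so the original cohort order is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Made-up cohort of eight requesting programs, for illustration only.
programs = [f"Program {i}" for i in range(1, 9)]
intervention, control = randomize(programs, seed=0)

print(intervention)
print(control)
```

In a delayed-intervention version of this design, the control arm would
receive the curriculum once the outcome evaluation at its site is
complete.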
Elizabeth Goodman, M.D.
References:
1. Coyle S, Boruch R, Turner C. Evaluating AIDS Prevention Programs.
Washington, DC: National Academy Press; 1991.
2. Emans SJ, Bravender T, Knight J, Frazer C, Luoni M, Berkowitz
C, Armstrong E, Goodman E. Adolescent medicine training in residency
programs: Are we doing a good job? Pediatrics. 1998;102:588-595.
3. Frazer C, Emans SJ, Goodman E, Luoni M, Bravender T, Knight J.
Teaching residents about development and behavior: Meeting the challenge.
Archives of Pediatrics & Adolescent Medicine. 1999;153:1190-1194.
4. Knight J, Frazer C, Goodman E, Blaschke G, Bravender T, Luoni
M, Hall M, Emans SJ. Case-based teaching by pediatric residents
(abstract). Ambulatory Pediatric Association; San Francisco; 1999.
5. Knight JR, Frazer CH, Goodman E, Blaschke GS, Bravender TD, Emans
SJ. Development of a Bright Futures curriculum for pediatric residents.
Ambulatory Pediatrics (In Press).