MDDE 617 - Program Evaluation in Distance Education



My instructor was: Mary Kennedy

Overview

In MDDE 617, Program Evaluation in Distance Education, you will be introduced to the field of program evaluation, which spans a number of social science disciplines including education and training, nursing and medical education, social work, military training, and business. The course is designed to increase the student's knowledge and understanding of major forms, approaches, and models of program evaluation and to apply that knowledge in the production of a program evaluation proposal.

Resources



My Assignments


Grade: 25/25


Geez... I was thinking that telework had something to do with telecommunications, like the installation of computer networks, LANs, or the internet. It's always good to Google terminology before you read a paper!

  • "Telecommuting, e-commuting, e-work, telework, working from home (WFH), or working at home (WAH) is a work arrangement in which employees enjoy flexibility in working location and hours. In other words, the daily commute to a central place of work is replaced by telecommunication links. Many work from home, while others, occasionally also referred to as nomad workers or web commuters utilize mobile telecommunications technology to work from coffee shops or myriad other locations. Telework is a broader term, referring to substituting telecommunications for any form of work-related travel, thereby eliminating the distance restrictions of telecommuting. All telecommuters are teleworkers but not all teleworkers are telecommuters. A frequently repeated motto is that "work is something you do, not something you travel to". A successful telecommuting program requires a management style which is based on results and not on close scrutiny of individual employees. This is referred to as management by objectives as opposed to management by observation." (Wikipedia)

  • "The term telework is the term favored in Europe and other countries while telecommute is used more often in the U.S and Canada." (webopedia)

Mike and I used http://ietherpad.com/ to work collaboratively. There is no sign-up with EtherPad and it is very easy to use. This assignment was true collaboration. Normally, I would have done my part and Mike would have done his, and hopefully when we put our parts together it would be readable--most collaborative assignments that I have worked on have been less than stellar. This assignment was definitely greater than the sum of its parts.


Grade: 25/25


Grade: 34/35 (Did this assignment while on holiday in Bali.)


Unit 1 Overview of History and Theory of Program Evaluation


1. Fitzpatrick, J. L., Sanders, J. R., & Worthen, B. R. (2004). Program evaluation: Alternative approaches and practical guidelines (pp. 3-52). White Plains, NY: Longman.
2. Gredler, M. E. (1996). Program evaluation (pp. 3-19). Columbus, OH: Merrill.
3. Migotsky, C., et al. (1997). Probative, dialectic, and moral reasoning in program evaluation. Qualitative Inquiry, 3(4), 453-467.
4. Levin-Rozalis, M. (2003). Evaluation and research: Differences and similarities. Canadian Journal of Program Evaluation, 18(2), 1-31.


Fitzpatrick, Sanders & Worthen, 2004, pp. 3-52.

1. What are the 10 characteristics of a profession?
2. Is evaluation a profession, according to these 10 characteristics? (p.7)
3. List three examples of evaluation uses.
4. What is the basic difference between formative and summative evaluation? (p. 16/7)
  • formative: if the evaluation's primary purpose is to provide information for program improvement. Often, such evaluations provide information to judge the merit or worth of a part of a program.
  • summative: if the evaluation is concerned with providing information to serve decisions or assist in making judgments about a program's overall worth or merit in relation to important criteria. Scriven defines it as "evaluation done for, or by, any observers or decision makers who need valuative conclusions for any other reasons besides development."
  • Robert Stake: "When the cook tastes the soup, that's formative evaluation; when the guest tastes it, that's summative evaluation."
5. What is the difference between program evaluation and evaluation research, according to Fitzpatrick, Sanders, and Worthen? (p.17)
6. Who are the audiences for formative evaluation? For summative evaluation? (p.18)
  • formative: the audience is generally the people delivering the program or those close to it (examples given: those responsible for developing the new schedule, delivering the training program, or managing the mentoring program). Because formative evaluations are designed to improve programs, it is critical that the primary audience be people who are in a position to make changes in the program and its day-to-day operations.
  • summative: the audience includes potential consumers (students, teachers, employees, managers, or health officials in agencies that could adopt the program), funding sources (taxpayers or a funding agency), and supervisors and other officials, as well as program personnel. The audiences for summative evaluations are often policymakers or administrators, but can, in fact, be any audience with the ability to make a "go or no-go" decision.
7. What is the difference between internal and external evaluation? (p.23)
8. Which era (in terms of decades) saw the most growth and development of program evaluation as a field of study and practice? (p.)
9. Could you make a time line of the key events in the history of program evaluation? What would be the highlights? (p.)
10. Is evaluation a profession or a discipline? Or both? (p.42 last paragraph)

Gredler, 1996.

1. What is the basic difference between assessment and evaluation, according to Gredler? (p.5, 2nd last paragraph)
2. How does Gredler define evaluation - in one sentence? (p.)
3. What global event launched evaluation into national prominence in the USA? (p.)
4. Why is the production factor of early evaluation studies a poor fit with social reality? (p.9 end)
5. What are alternative assessments? (p.)
6. How does program evaluation relate to and differ from educational research, accountability, and accreditation movements? (p.)

Migotsky, C., et al. (1997).

1. What do these authors object to, in Scriven's position on program evaluation practice?
  • They object to Scriven's persistent drive for objective, analytical tactics in the conduct of evaluations, in particular the final synthesis procedure. Scriven encourages the move away from a model of personal, intuitive judgment to one of analytic, objective heuristics. (pp. 454/463)
2. What do the authors mean by the term probative?
  • probative: tending to prove or seek the truth; the objective, analytical, and rational approach to program evaluation.
  • Scriven notes that probative inferences should be believed until discredited. (p. 457)
  • Scriven argues that we need to (1) compile a comprehensive list of possible merits (using conceptual analysis and empirical needs assessment), (2) look at performance on each of the criteria, (3) in some cases a more complex inference is required - rework things, and (4) then criteria are weighted by relative importance. (p. 457)
3. What do these authors mean by the term dialectic?
  • dialectic: A method of argument or exposition that systematically weighs contradictory facts or ideas with a view to the resolution of their real or apparent contradictions.
  • need to concentrate resources on providing better review mechanisms to improve the value resolution process: focus on opportunities and expectations for self-reflection and opportunities for professional critique (including casual critiques and just talking it over). Sharpen the judgmental powers of program evaluators and increase external critiquing of the judgments. (p. 460/1)
4. What are the four elements of Scriven's evaluation logic?
Scriven talks about a logical sequence of concepts that defines how people try to connect data to value judgments that the evaluand is good or bad, better or worse, passing or failing, or the like. Scriven outlined the four steps:
  • selecting criteria of merit, those things the evaluand must do to be judged good
  • setting standards of performance on those criteria, comparative or absolute levels that must be exceeded to warrant the appellation "good"
  • gathering data pertaining to the evaluand's performance on the criteria relative to the standards
  • integrating the results into a final value judgment.
5. Is Scriven promoting an absolute stance or a relative stance, in his determination of evaluation standards? Which stance are these authors promoting?
  • Scriven claims we can rank simple cases and even reach some conclusions of "absolute" merit. The authors are promoting a more relative stance on merit.
6. What argument does Schwandt make about moral reasoning and its dichotomy with Scriven's probative, value-controlled thinking?
  • Schwandt describes the process of interpreting ethical and moral dilemmas as technical problems as a denial of the "messiness" of social inquiry and thereby a limitation on the expertise of the professional evaluator; such knowledge, disengaged from the subjectivity of morality, would therefore have little direct applicability to the competing values that make up human interaction and social life. (p. 458)
7. Is judgment probative? Can it be?
  • Judgment is an intrinsic process. Evaluation is a perception-judgment unity. There is no perception independent of conceptualization. There is no recognition without meaning; there is no meaning without valuing. Perception is the process through which sensations are interpreted, using knowledge and understandings of the world, so that they become meaningful experiences. Formal evaluation can emphasize either the suppressing of these personal judgments or the refining of them. Judging goodness requires cultivating perceptual awareness of those particularities. (p. 460)
8. Do these authors wish to expand the evaluator's ability and use of judgment, or to control it?
  • Definitely expand the evaluator's ability and use of judgment.
  • experienced evaluators gain understanding by studying the evaluand from different points of view and from different frames of reference (sounds like social constructivism). In creating valid representations and value judgments, evaluators expose their interpretations to rival conceptual perspectives, using different conceptual frames and modifying and refining notions of merit. This triangulation process assures multiple viewpoints and provokes new interpretations. It can sometimes produce conflicting representations and create tension, but this can encourage a deeper/fuller understanding of the evaluand. (p. 461)
9. What is meant by triangulation, and why is it valuable to program evaluators?
  • a process for assuring multiple viewpoints rather than reducing to a single viewpoint. (p. 461)
  • see #8
  • (Reminds me of social negotiation in social constructivism.)
10. Merit does not equal productivity, these authors claim. What does it equal?
  • Scriven defines merit in terms of performance. (p.462)
  • Alternatively, the authors define merit not only in terms of outcomes: summative evaluation also portrays how well the program was constituted and operated.
  • Programs that contribute to the improvement in social well-being are meritorious just by existing.
  • Following state-of-the-art practices to some extent is meritorious.
  • For different purposes/populations/contexts/times programs have different values, not as Scriven notes, in productivity alone. (p.462)

Levin-Rozalis, M. (2003).

[Image: Evaluation_vs_Research_Table.gif (table from Levin-Rozalis' article, p. 5)]

1. Which domain signals the main difference between research and evaluation?
  • The main difference between research and evaluation lies in the domain of application. (p. 5)
2. How does research look at knowledge?
  • research is meant to formulate general knowledge as general laws. The aim of the knowledge generated is to create an abstract, generalized law that can be applied to as many events as possible. (p. 5/6)
  • research is intended to enlarge a body of knowledge. (p. 6)
3. How does evaluation look at knowledge?
  • evaluation is intended to gather knowledge and understanding of a concrete activity/project and to give this understanding back to the project as feedback. (p. 6)
  • evaluation attempts to investigate the mutual influences between a maximum number of variables at a given time and place. (p. 6)
  • knowledge is only a means to an end and its value is first and foremost in the feedback for the evaluated project. (p. 6)
4. What is theory-driven evaluation?
  • Some evaluators have found solutions to this problem by borrowing methods from other disciplines; others, by relating to theory. (p. 4)
  • Evaluators facilitate program stakeholders to clarify their program theory or model. (program theory: stakeholders’ implicit and explicit assumptions on what actions are required to solve the problem and how the problem will respond to the actions.) The program theory is then used as a framework to guide the design of evaluation design, the selection of research methods, and the collection of data. link to a PowerPoint or (p. 9)
5. How is deductive logic in opposition to the logic used in evaluation?
  • because the evaluation examines the object in order to reveal the variables and the elements that play a role and the connections between them. It does not use the object (evaluand) to validate variables and suppositions stemming from an existing theory. (p.9)
  • using deductive logic one would never know the real causes, because the theory leads to the mechanisms, contexts, and outcomes that are part of its frame of reference. (In reference to contraception use and counseling.) It is not important when examining a theory, but it is crucial when we want to know what it is that works in a project.
6. Evaluation is not theory-dependent; it is field-dependent. Do you agree with this statement made by the author? (yes)
  • evaluators rely on theoretical knowledge, but do not attempt to validate theories.
  • attempting to draw hypotheses from some specific theoretical framework will limit the scope of the evaluation and prevent the evaluator from making hypotheses that do not arise from this theoretical framework.
  • theoretical frameworks dictate the concepts we use and their expected relationships with each other.
  • In an evaluation, a theory that is suitable to the project can be chosen at a later stage, when the evaluator is drawing conclusions and explaining the findings. (p. 11)
7. What are the principles of abduction?
  • situated logic - look locally and then forward.
  • The principles of abduction are based on the notion that there are no a priori hypotheses, no presuppositions, and no advance theorizing. Each event is scrutinized and its importance examined. Hypotheses are then formed about the event: Is it connected to other events and, if so, how? Perhaps it is an isolated event and, if so, what is its meaning? The explanations we form for these new events are “hypotheses on probation,” and a cyclical process of checking and rechecking against our observations takes place, widening and modifying the explanation through this process. Link to another paper by Levin-Rozalis or (p. 12)
8. How do research methods and evaluation methods differ?
  • The main aim of research is to increase our understanding and knowledge concerning humanity in general, and thus it must find variables that can be generalized and formulated into rules. In order to do this, the researcher must sieve out all the variables connected to a specific event that are not subject to generalization or do not belong to the system of suggested explanations.
  • In contrast, the activity of evaluation tries to use all the phenomena that have been discovered in the examination in order to present a coherent system of explanation for the project being examined, thus diminishing the ability to generalize the findings. (p. 14)
  • Research stems from a theory and its hypotheses and empirical generalizations. It is the fruit of pedantic attention to planning a strategy for operationalizing the hypotheses - formulating the variables in observational terms. An observational term has to be anchored in reality; a suitable situation for observing must be found or artificially created before the appropriate scientific observations are carried out to test the hypothesis. To do this, the researcher must choose the variables that seem most suitable to the concepts of the hypothesis: the most suitable subjects, the most felicitous field and the most appropriate set-up.
  • Evaluators cannot choose the participants, set-ups, or variables of the project. The field is given, the participants are given, and, at least in part, the variables are not known in advance. There is a general definition of the evaluation questions, but they are not defined in terms of a hypothesis and the variables are not operational. The instruments of evaluation (interviews, discussions, observations, questionnaires, videos, analysis of protocols, dialogue analysis, or any other tool) are planned and chosen according to the population involved, the activities to be checked, the question to be evaluated, the time and money available, and the contract signed between the operators of the project or the initiators and the evaluation team.
  • Evaluators do not examine isolated variables; they examine events, which include most of the possible variables together with their interconnections and contexts, as well as factors that are not variables (the type of neighbourhood or the character of the school, for example). In their analysis of the events, evaluators define categories and variables, based on previous knowledge and according to either the aims of the project or the finding. For example, if a variable like opacity in definition of roles occurs repeatedly in different aspects and locales of the project, it then becomes possible to see if its occurrence and influence are identical in each case, what its origins are, and so on.
  • As in anthropological or qualitative research, the evaluator him/herself is an instrument. In addition to their academic and professional qualifications, it is important for evaluators to have the ability to make contact and communicate, for example, to win trust and to present their findings in a way that will make them not only understandable but also acceptable for application. Despite the importance of the evaluator’s personality in this process, evaluation must be systematic, structured, and professional, and not “art” or intuition-based. (p. 15/16)
9. What is the function of internal and external validity in evaluation?
  • A threat to validity would be any factor that influences the results of the evaluation.
  • Internal validity is the degree to which the treatment or intervention effects change in the dependent variable. The greater the evaluator's ability to attribute the effect to the cause, rather than to extraneous factors, the higher the degree of confidence that the treatment or intervention caused the effect.
  • Inferences are said to possess internal validity if a causal relation between two variables is properly demonstrated. A causal inference may be based on a relation when three criteria are satisfied:
    1. the "cause" precedes the "effect" in time (temporal precedence),
    2. the "cause" and the "effect" are related (covariation), and
    3. there are no plausible alternative explanations for the observed covariation (nonspuriousness).
    In scientific experimental settings, researchers often manipulate a variable (the independent variable) to see what effect it has on a second variable (the dependent variable). For example, a researcher might, for different experimental groups, manipulate the dosage of a particular drug between groups to see what effect it has on health. In this example, the researcher wants to make a causal inference, namely, that different doses of the drug may be held responsible for observed changes or differences. When the researcher may confidently attribute the observed changes or differences in the dependent variable to the independent variable, and when he can rule out other explanations (or rival hypotheses), then his causal inference is said to be internally valid. Wikipedia
  • Inferences about cause-effect relationships based on a specific scientific study are said to possess external validity if they may be generalized from the unique and idiosyncratic settings, procedures and participants to other populations and conditions. Causal inferences said to possess high degrees of external validity can reasonably be expected to apply (a) to the target population of the study (i.e. from which the sample was drawn) (also referred to as population validity), and (b) to the universe of other populations (e.g. across time and space).
  • The most common loss of external validity comes from the fact that experiments using human participants often employ small samples obtained from a single geographic location or with idiosyncratic features (e.g. volunteers). Because of this, one can not be sure that the conclusions drawn about cause-effect-relationships do actually apply to people in other geographic locations or without these features. Wikipedia
  • But evaluators only stress internal validity, as they are interested in the here and now of a specific project and not in the generalizability of the evaluation. (p. 17)
  • However, evaluators are interested in whether the effect of the project can be generalized, or what the scope of the project's application may be. (They generalize their conclusions, but not their findings.) (p. 18)
  • since most of the concepts of evaluation are indirect/abstract/theoretical constructs, one can never be completely confident that what they are measuring is indeed the thing that they want to measure. But this is okay because the operational definition and the question used in the evaluation stem from the field and not from a theory. (p. 17)
10. Is relevance an acceptable replacement for external validity, in terms of evaluation rigor?
  • definitely, for the evaluation to be effective, the feedback and value embodied in it must be relevant. (p. 20)
11. Is causality an essential outcome of program evaluation?
  • A significant element in feedback from evaluation is causality. When we know the cause then we can try to control the effect. (p.23)
12. The author sees research and evaluation as separate and distinct disciplines, despite their sharing concepts, data collection methods and instruments. Does her article provide solid argument for this stance? Has it convinced you of the separateness of these two disciplines?
  • yes and yes.


Unit 2 Program Evaluation Models and Approaches

Fitzpatrick, J. L., Sanders, J. R., & Worthen, B. R. (2004). Program evaluation: Alternative approaches and practical guidelines (pp. 53-167). White Plains, NY: Longman.
Patton, M. Q. (2001). Evaluation, knowledge management, best practices, and high quality lessons learned. American Journal of Evaluation, 22(3), 329-336.
Morris, D. B. (2002). The inclusion of stakeholders in evaluation: Benefits and drawbacks. Canadian Journal of Program Evaluation, 17(2), 49-58.


Phase 1 Models: Objectives-oriented, Management-oriented, Consumer-oriented
  • Early attempts to describe and provide guidelines for doing evaluation; they didn't break away from the scientific research/objectivist/quantitative measurement framework.

Phase 2 Models: Expertise-oriented, Participant-oriented
  • These models broke new ground and borrowed from other research paradigms in delineating guidelines for doing evaluation. They opened possibilities for the use of an anthropological humanist research/subjectivist/qualitative measurement framework.

Fitzpatrick, Sanders & Worthen, 2004, pp. 57-167



Stakeholders: individuals/groups who have a direct interest in and may be affected by the program being evaluated or the evaluation's results. They hold a stake in the future direction of that program and deserve to play a role in determining that direction by
  1. identifying concerns and issues to be addressed in evaluating the program, and
  2. selecting the criteria that will be used in judging its value.
Evaluators need to identify stakeholders for an evaluation and involve them early, actively, and continuously. (p. 54)

Program: activities that are provided on a continuing basis; typically what is evaluated.

Epistemologies (philosophies of knowing)
  • Objectivism requires evaluation to be "scientifically objective" - that is, to use data-collection and analysis techniques that yield results reproducible and verifiable by others using the same techniques. The evaluation procedure is 'externalized' - existing outside of the evaluator. (social science base of empiricism; replicate) (p. 60) Logical positivism is a 20th-century philosophical movement that holds characteristically that all meaningful statements are either analytic or conclusively verifiable or at least confirmable by observation and experiment, and that metaphysical theories are therefore strictly meaningless. (p. 61) (Merriam-Webster)
  • Subjectivism is based on the experience of the evaluator rather than scientific method. Knowledge is understood as being largely tacit rather than explicit. The validity/accuracy of an evaluation depends on the evaluator's experience/background/qualifications/keenness of perceptions. The evaluation procedure is "internalized" - existing within the evaluator in ways that are not explicitly understood or reproducible by others. (experientially-based; tacit knowledge; phenomenology) (p. 60-61)

Principles for assigning value (parallel objectivism/subjectivism)
  • Utilitarian Evaluation: focus on group gains (avg scores); greatest good for the greatest number (p. 62)
  • Intuitionist-Pluralist Evaluation: value of impact of program on each individual; all who are affected by program are judges (stakeholders) (p. 62)
  • Are these principles mutually exclusive? No. The purist view that looks noble in print yields to practical pressures demanding that the evaluator use appropriate methods based on alternative epistemologies within the same evaluation (impractical). Choose the methods right for THAT evaluation and understand the assumptions/limitations of the different approaches.

Quantitative: numerical - statistics
Qualitative: non-numerical - narratives and verbal descriptions
- Evaluation is a transdiscipline - crosses many disciplines. "Law of the instrument" fallacy - with hammer/nails, everything appears to need hammering (p. 64)
- Identify what is useful in each evaluation approach, use it wisely & avoid being distracted by approaches designed to deal w/ different needs (p. 66)

Practical Considerations
  1. Evaluators disagree about whether or not the intent of evaluation is to render a value judgment. Some are concerned only with the usefulness of the evaluation to decision makers and believe that they, not the evaluator, should render the value judgment. Others believe the evaluator's report to the decision maker is complete only if it contains a value judgment. (Do decision makers or the evaluator render judgment?) (p. 66)
  2. Evaluators differ in views of evaluation's political role. Who has the authority? Responsibility? These will dictate evaluation style.
  3. Evaluators are influenced by their prior experience.
  4. Evaluators differ in who they think should conduct the evaluation and the nature of expertise needed to do so.
  5. Evaluators differ in their perception of whether it is desirable to have a wide variety of evaluation approaches.

[Image: Evaluation_Approaches_on_the_Dimension.jpg (Figure 3.1 from Fitzpatrick, 2004, modified)]





Objectives-oriented
  • Some proponents: Tyler, Provus, Metfessel & Michael, Hammond, Popham, Taba, Bloom, Talmage
  • Purpose of evaluation: Determining the extent to which objectives are achieved
  • Distinguishing characteristics: Specifying measurable objectives; using objective data; looking for discrepancies between objectives and performance
  • Past uses: Program development; monitoring participant outcomes; needs assessment
  • Contributions to the conceptualization of an evaluation: Pre-post measurement of performance; clarification of goals; use of objective measurements that are technically sound
  • Criteria for judging evaluations: Measurability of objectives; measurement reliability and validity
  • Benefits: Ease of use; simplicity; focus on outcomes; high acceptability; forces objectives to be set
  • Limitations: Oversimplification of evaluation and programs; outcomes-only orientation ("tunnel vision"); reductionistic; linear; overemphasis on outcomes

Management-oriented
  • Some proponents: Stufflebeam, Alkin, Provus, Wholey
  • Purpose of evaluation: Providing useful info to aid in making decisions
  • Distinguishing characteristics: Serving rational decision making; evaluating at all stages of program development
  • Past uses: Program development; institutional management systems; program planning; accountability
  • Contributions to the conceptualization of an evaluation: Identify & evaluate needs & objectives; consider alternative program designs & evaluate them; watch the implementation of a program; look for bugs & explain outcomes; see if needs have been reduced or eliminated; metaevaluation; guidelines for institutionalizing evaluation
  • Criteria for judging evaluations: Utility; feasibility; propriety; technical soundness
  • Benefits: Comprehensiveness; sensitivity to information needs of those in a leadership position; systematic approach to evaluation; use of evaluation throughout the process of program development; well operationalized with detailed guidelines for implementation; use of a wide variety of info
  • Limitations: Emphasis on organizational efficiency and a production model; assumption of orderliness and predictability in decision making; can be expensive to administer and maintain; narrow focus on the concerns of leaders

Consumer-oriented
  • Some proponents: Scriven, Komoski
  • Purpose of evaluation: Providing info about products to aid decisions about purchases or adoptions
  • Distinguishing characteristics: Using criterion checklists to analyze products; product testing; informing consumers
  • Past uses: Consumer reports; product development; selection of products for dissemination
  • Contributions to the conceptualization of an evaluation: Lists of criteria for evaluating educational products & activities; archival references for completed reviews; formative-summative roles of evaluation; bias control
  • Criteria for judging evaluations: Freedom from bias; technical soundness; defensible criteria used to draw conclusions and make recommendations; evidence of need and effectiveness required
  • Benefits: Emphasis on consumer information needs; influence on product developers; utility; availability of checklists
  • Limitations: Cost and lack of sponsorship; may suppress creativity or innovation; not open to debate or cross-examination

Expertise-oriented
  • Some proponents: Eisner, accreditation groups
  • Purpose of evaluation: Providing professional judgments of quality
  • Distinguishing characteristics: Basing judgments on individual knowledge and experience; use of consensus standards; team/site visitations
  • Past uses: Self-study; blue-ribbon panels; accreditation; examination by committee; criticism
  • Contributions to the conceptualization of an evaluation: Legitimation of subjective criticism; self-study with outside verification; standards
  • Criteria for judging evaluations: Use of recognized standards; qualifications of experts
  • Benefits: Broad coverage; efficiency (ease of implementation, timing); capitalizes on human judgment
  • Limitations: Replicability; vulnerability to personal bias; scarcity of supporting documentation to support conclusions; open to conflict of interest; superficial look at context; overuse of intuition; reliance on qualifications of the "experts"

Participant-oriented
  • Some proponents: Stake, Patton, Guba and Lincoln, Rippey, MacDonald, Parlett and Hamilton, Cousins and Earl
  • Purpose of evaluation: Understanding & portraying the complexities of programmatic activity; responding to an audience's requirements for info
  • Distinguishing characteristics: Reflecting multiple realities; use of inductive reasoning and discovery; firsthand experience on site; involvement of intended users; training intended users
  • Past uses: Examination of innovations or change about which little is known; ethnographies of operating programs
  • Contributions to the conceptualization of an evaluation: Emergent evaluation designs; use of inductive reasoning; recognition of multiple realities; importance of studying context; criteria for judging the rigor of naturalistic inquiry
  • Criteria for judging evaluations: Credibility; fit; auditability; confirmability
  • Benefits: Focus on description & judgment; concern with context; openness to evolve the evaluation plan; pluralistic; use of inductive reasoning; use of a variety of info; emphasis on understanding
  • Limitations: Nondirective; tendency to be attracted by the bizarre or atypical; potentially high labor-intensity and cost; hypothesis generating; potential for failure to reach closure

(Fitzpatrick, pp. 160-162)

Application Exercise (Fitzpatrick, 2004, p. 166)


2. Below is a list of evaluation purposes. Which approach would you choose to use in each of these examples? Why? What would be the advantages and disadvantages of this approach in each setting?
a. Determining whether to continue a welfare-to-work program designed to get full-time, long-term employment for welfare recipients
  • management-oriented evaluation: the approach is meant to serve decision makers. Its rationale is that evaluative information is an essential part of good decision making and that the evaluator can be most effective by serving administrators, managers, policy makers, boards, practitioners, and others who need good evaluative information. (p. 88) Also, the evaluation is summative and most likely aimed at the costs/benefits of the program.
b. Describing the implementation of a distance- learning education program for college students
  • Consumer-oriented: providing info to the consumers (the college students) about products to aid decisions about purchases or adoptions (from the table above).
  • Or participant-oriented, as it relies heavily on descriptive information and considers the views and needs of all the stakeholder groups.
c. Making recommendations for the improvement of a conflict-resolution program for middle-school students
  • Participant-oriented: evaluators work to portray the multiple needs, values, and perspectives of program stakeholders to be able to make judgments about the value or worth of the program being evaluated. (p. 149)
d. Determining whether reading levels of first graders, in Ms. Jones' class, at the end of the year are appropriate.
  • Objectives-oriented: we are looking for discrepancies between the established reading-level objectives and the students' performance; any discrepancies will be measured. The established objectives are straightforward, so it should be easy to compare end-of-year assessments with the expected level.



Good quote: "Cronbach (as cited in Fitzpatrick, 2004) notes that one important role of the evaluator is to illuminate, not dictate, the decision. Helping clients to understand the complexity of issues, not to give simple answers to narrow questions, is a role of evaluation."



1. Which evaluation model emulates the systems approach, or systems thinking, most closely?
  • hmm.. not done reading yet, but objective-orientated evaluation talks about inputs and outputs.. I will keep reading. (p.76)
  • I was wrong - developers of the management-oriented evaluation approach have relied on a systems approach to evaluation in which decisions are made about inputs, processes, and outputs much like the logic models and program theory. By highlighting different levels of decisions and decision makers, this approach clarifies who will use the evaluation results, how they will use them, and what aspect(s) of the system to make decisions about. (p. 88)
  • Also, Provus' Discrepancy Evaluation Model, described as an objectives-oriented evaluation model, is systems-oriented, focusing on input, process, and output at each of five stages of evaluation: program definition, program installation, program process, program products, and cost-benefit analysis. (p. 93)
2. In rejecting objectives-oriented evaluation, did evaluators believe that it was fundamentally flawed? If so, how?
*
3. Which evaluation model relates to the accreditation movement?
  • Expertise-orientated
4. Which evaluation model has a built-in sort of meta-evaluation?
  • Management-oriented approach (see Table 9.1, p. 160). Fitzpatrick recognizes that one of the contributions to the conceptualization of an evaluation is metaevaluation. "This link to metaevaluation was made because models like CIPP and UCLA are actually 4 to 5 evaluations in one – so people could think of them as internal evaluation of evaluations, with a stretch" (M. Kennedy, personal communication, May 25, 2009).

5. Is goal-free evaluation really goal free? (Scriven developed this evaluation method) (p. 84)
  • Goal-Free Evaluation: (1) goals should not be taken as given, they should be evaluated; (2) goals are generally little more than rhetoric and seldom reveal the real objectives of the project or changes in intent; (3) many important program outcomes are not included in the list of original program goals or objectives.
  • Scriven believes that the most important function of goal-free evaluation is to reduce bias and increase objectivity (sounds like Scriven!!).
  • Goal-free evaluation is intentionally the opposite of objectives-oriented evaluation.
  • Goal-directed evaluation and goal-free evaluation can work well together, as it is important to judge the program not only on the basis of how well it does what it is supposed to do but also on the basis of what it does in all areas, on all its outcomes, intended or not. (p. 85)
6. Why would an evaluator avoid knowing the program goals and objectives? (p. 84)
  • In objectives-orientated evaluations, an evaluator is told the goals of the program and is therefore immediately limited in perceptions - the goals act like blinders, causing one to miss important outcomes not directly related to those goals. (hmmm...)
7. We still write program and course goals and objectives. Are they of any use to evaluators?
  • Damn yes! Getting tired...
8. What does CIPP represent, in the CIPP Model? (p. 89)
  • Context Evaluation - planning decisions: Determining what needs are to be addressed by a program and what programs already exist helps in defining objectives for the program.
  • Input Evaluation - structuring decisions: Determining what resources are available, what alternative strategies for the program should be considered, and what plan seems to have the best potential for meeting needs facilitates design of program procedures.
  • Process Evaluation - implementing decisions: How well is the plan being implemented? What barriers threaten its success? What revisions are needed? Once these questions are answered, procedures can be monitored, controlled, and refined.
  • Product Evaluation - recycling decisions: What results were obtained? How well were needs reduced? What should be done with the program after it has run its course? These questions are important in judging program attainments.

9. How are administrative decisions made in your organization? Do they always follow a logical approach?
  • I don't know, and I would hope so, but I am sure they are not always.
10. Do adversary evaluations always have to present two opposing views?
*
11. List three fundamental characteristics of participant-oriented evaluation. (p. 133/4)
  • A more holistic approach which admits to the complexity of humans. Instead of simplifying the issues, we should attempt to understand ourselves and human services in the context of their complexity.
  • Value pluralism (a theory assuming more than one principle or basic substance as the ground of reality) is recognized, accommodated, and protected.
  • Depend on inductive reasoning. Understanding comes from grassroots observation and discovery.
  • Multiplicity of data. Understanding comes from the assimilation of data from a number of sources. Subjective/objective, qualitative/quantitative representations of the evaluand.
  • Do not follow a standard plan. The evaluation process evolves as participants gain experience. Often the important outcome of the evaluation is a rich understanding of one specific entity with all of its idiosyncratic contextual influences, process variations, and life histories.
  • Record multiple rather than single realities. No one perspective is accepted as the truth, all perspectives are accepted as correct, and a central task of the evaluator is to capture these realities and portray them without sacrificing the program's complexity.
12. With all these evaluation models to choose from, what influences an evaluator's decision?
  • each evaluation must be judged by its usefulness, not its label. (p. 139)
  • first make certain the proposed strategy and tactics fit the terrain and will attain the desired outcomes of the campaign (p. 154)
  • evaluation practitioners should use these approaches as heuristic tools, selecting from a variety of evaluation approaches one appropriate for the situation rather than distort the interests and needs of the evaluation's audience(s) to make them fit a preferred approach. (p. 154)
  • How will one know which approach is best for a given situation? There is almost no research to guide one's choice. (p. 156)
  • eclectic in program evaluation, choosing and combining concepts from the evaluation approaches to fit particular situations, using pieces of various evaluation approaches as they seem appropriate. (p. 164)
  • Much of evaluation's potential lies in the scope of strategies it can employ and in the possibility of selectively combining those approaches. Narrow, rigid adherence to single approaches must give way to more mature, sophisticated evaluations that welcome diversity. (p. 165)
  • The way in which evaluators determine which approach(es) to employ in a given situation is not based on scientific inquiry or empirical testing; rather, it is based on philosophical, methodological, and client preferences. Often, evaluators will not adhere to one specific approach, but instead will opt for a combination of several approaches in a more eclectic approach to evaluation. (p. 165/66)

Patton, 2001

1. Why does Patton think that the term "best practices" is a bad idea? (p. 330/331)
  • The widespread and indiscriminate use of the terms "lessons learned" and "best practices" has devalued them both conceptually and pragmatically because they lack any common meaning, standard, or definition.
  • The assumption behind "best practices" (that there must be a single best way to do something) is highly suspect, as the world values diversity and recognizes that many paths exist for reaching a destination; some may be more difficult and costly, but those are criteria that take us beyond just getting there and reveal the importance of asking: "best" from whose perspective, using what criteria? (p. 331)
  • From a systems point of view, a major problem with "best practices" is the way that they are offered without attention to context - a lot of "best practices" rhetoric presumes context-free adoption. (p. 331)
  • "Best practices" that guide practice can be helpful, but ones that are highly prescriptive and specific represents bad practice of best practices.
  • Calling something "best" is typically more a political assertion than an empirical conclusion.
2. What does Patton mean by pragmatic utilitarian generalization, and how is it achieved? (p. 334)
  • Theory is an abstraction from direct experience, and thus Patton is asserting that high-quality principles should be generalized from existing evaluations such that they can be transferable and applied to new situations. (Not sure if this is clear - need more tea.)
3. Is this statement by Patton fair criticism, do you think? "Seldom do such statements [best practices and lessons learned] identify for whom the practice is best, under what conditions it is best, or what values or assumptions undergird its best-ness." (p. 330)
  • I think it is fair, as evaluations are generating a lot of knowledge and there may be a way to harvest common principles and generic patterns of program effectiveness, within certain contexts.

Morris, 2002, p. 49-58

Morris notes that stakeholder involvement empowers the stakeholders, increases the utilization of results, and increases the validity of the evaluation.

1. How does Morris define stakeholder?
  • Morris notes that the term stakeholder has had different definitions for different researchers over the years and researchers are now moving toward explicitly defining how the term "stakeholder" is used. Morris defines stakeholder very broadly, ranging from token to active participation by recipients, distributors, or financial supporters of program services. (p. 49/50)
2. Is the inclusion of stakeholders in evaluation activity related to a quantitative research perspective or a qualitative research perspective?
  • The origins of participant involvement in evaluation can be found in the social constructivist perspective employed in qualitative research. (P. 50)
3. What is the basic premise of social constructivism?
  • A learning theory that emphasizes that learning is an active social process in which individuals make meanings through the interactions with each other and with the environment they live in. Knowledge is thus a product of humans and is socially and culturally constructed. (p. 50/51)
  • Thus all stakeholders' perspectives related to an evaluation are valued and should be actively sought to gain a complete picture of program rationale, impact, and alternatives. Using the principles of social constructivism should lead to a more valid evaluation that empowers and increases the likelihood of the utilization of results, as stakeholder investment is greater than would otherwise be the case. (p. 51/52)
4. How is stakeholder inclusion in evaluation activity related to the concept of empowerment?
  • Empowerment is the increased feeling or sense of power stemming from a given action, in this case participation. (p. 52)
  • Feeling listened to, being taken seriously, and making an important contribution are characteristic of empowerment. Studies show that participation leads to empowerment.
5. Does the inclusion of stakeholders increase or decrease conflict, overall?
  • Conflict should be anticipated in participation evaluations. If not anticipated, conflict can easily lead to feelings of disempowerment. (p. 54)
  • If stakeholder views are not considered to be of equal value, the participant evaluation process may actually increase the conflict and power differential among the groups, thereby polarizing participants and impeding any productive discussion. (p. 54)
6. List three benefits of including stakeholders (p. 50)
  1. Empowering stakeholders by encouraging active participation,
  2. Increasing the utilization of the program evaluation results, and
  3. Increasing the validity of the results.
7. List three drawbacks to including stakeholders. (p. 50)
  1. additional time,
  2. personnel, and
  3. expenses required.
Stakeholder participation can lead to feelings of lack of power by participating stakeholders, disregarding of results by decision makers, and questionable validity. Thus, stakeholder participation in evaluations can influence the same outcomes both positively and negatively.



Unit 3 Program Evaluation Forms and Purposes

Owen, J. M., & Rogers, P. J. (1999). Program evaluation: Forms and approaches (pp. 1-62; 170-307). Thousand Oaks, CA: Sage.
Chen, H. T. (1996). A comprehensive typology for program evaluation. Evaluation Practice, 17(2), 121-130.

Owen, 1999, pp. 1-85

Evaluation:
  • negotiating an evaluation plan;
  • collecting and analyzing evidence to produce findings;
  • disseminating the findings to identified audiences for use in (a) describing or understanding an evaluand or (b) making judgments and/or decisions related to that evaluand. (p. 4)
1. What are the four knowledge products of evaluation? (p. 4)
  • evidence: the data which has been collected during the evaluation. This could be regarded as information.
  • conclusions: the synthesis of data and information. These are the interpretations or meanings made through analysis. Conclusions result from analytical processes involving data display, data reduction, and verification.
  • judgments: in which values are placed on the conclusions. Criteria are applied to the conclusions stating that the program is 'good' or 'bad' or that the results are 'positive', 'in the direction desired' or 'below expectations'.
  • recommendations: these are suggested courses of action, advice to policy-makers, program managers or providers about what to do in the light of the evidence and conclusions.

2. What is the meaning of the phrase logic of evaluation? (p. 14)
  • Owen (2007) describes the logic of evaluation as defining how people try to connect data to value judgments that the evaluand is good or bad, better or worse, passing or failing, or the like. Without this logic any review or evaluation will be found wanting. Evaluators must be familiar with the following basic issues:
  1. Establishing criteria: What is the underlying basis for the criteria used to judge the worth of what is being evaluated? On what dimensions must the evaluand do well? It is important to be explicit about the criteria being used to determine worth. (breakfast should be nutritious)
  2. Constructing standards: What evidence and standards/criteria need to be used to make a judgment of worth? How well should the evaluand perform? (characteristics: fiber, fat, sugar, sodium; good = derived from nutrient profiles of raw cereals)
  3. Measuring performance: What standards were applied and how will conclusions be made and presented? How well did the evaluand perform? (e.g., highly recommended <= 4% fat)
  4. Synthesizing and integrating evidence into a judgment of worth: Decision making on the basis of the evaluation. What is the worth of the evaluand? (categorized the cereals and highly recommended some, but the ultimate choice is left up to the consumer)
  • The "logic of evaluation" is important to evaluators who are concerned with determining impact - evaluations of this nature are generally summative (they report on what the program has achieved and should be undertaken on programs that are settled or stabilized). (p. 13)
  • The application of the logic of evaluation to real settings involves the evaluator, client, or some other stakeholder holding a view about the worth of a given program based on a defensible empirical inquiry. (p. 18)
  • Worth = extrinsic value of an evaluand within a given context. (Cereal: the worth of the cereals might be regarded within decision making about the most appropriate foods for people who are undertaking an extended nutritional regime.) (p. 14)
  • Merit = intrinsic value. (p. 14)
  • The generality of the logic is its strength: it gives us a base from which we can delve into different approaches to evaluation practice. (p. 14)
  • Refer to this lesson plan: Evaluating Chocolate Chip Cookies Using Evaluation Logic (it's a Word document). A small code sketch of this four-step logic, using the cereal example, follows below.
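
To make the four steps concrete, here is a minimal Python sketch of the cereal example above. The criteria names, nutrient values, weights, and thresholds (apart from the "highly recommended <= 4% fat" cut-off noted above) are invented for illustration and are not taken from Owen and Rogers; the point is only to show criteria, standards, measured performance, and a weighted synthesis into a judgment of worth.

```python
# A toy illustration of the four-step logic of evaluation (criteria, standards,
# performance, synthesis), using a breakfast-cereal example. All figures except
# the "<= 4% fat" threshold mentioned in the notes are hypothetical.

# Step 1 - Establish criteria: dimensions on which the evaluand must do well.
# Step 2 - Construct standards: thresholds a "good" cereal should meet
#          (higher is better for fiber; lower is better for fat, sugar, sodium).
standards = {
    "fiber_g":   lambda v: v >= 3.0,
    "fat_pct":   lambda v: v <= 4.0,    # e.g., highly recommended <= 4% fat
    "sugar_g":   lambda v: v <= 8.0,
    "sodium_mg": lambda v: v <= 200,
}

# Relative importance of each criterion for the synthesis step (hypothetical).
weights = {"fiber_g": 0.3, "fat_pct": 0.3, "sugar_g": 0.2, "sodium_mg": 0.2}


def evaluate_cereal(name, measurements):
    """Steps 3 and 4: measure performance against the standards and
    synthesize the results into a single judgment of worth."""
    # Step 3 - Measure performance on each criterion.
    met = {c: standards[c](measurements[c]) for c in standards}

    # Step 4 - Synthesize: weighted share of criteria met, mapped to a verdict.
    score = sum(weights[c] for c, ok in met.items() if ok)
    verdict = ("highly recommended" if score >= 0.8
               else "recommended" if score >= 0.5
               else "not recommended")
    return name, met, round(score, 2), verdict


# Hypothetical data for two cereals.
print(evaluate_cereal("Bran Crunch", {"fiber_g": 5.0, "fat_pct": 2.5,
                                      "sugar_g": 6.0, "sodium_mg": 150}))
print(evaluate_cereal("Choco Pops",  {"fiber_g": 1.0, "fat_pct": 3.0,
                                      "sugar_g": 14.0, "sodium_mg": 220}))
```

The interesting part is Step 4: the weights and cut-offs embody value judgments, so who sets them (evaluator, client, or other stakeholders) matters as much as the arithmetic, which ties into question 3 below.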

3. According to Owen, who makes evaluation judgments?
  • Clients and evaluators should determine, in advance of any evidence gathering, who is to make the judgment about the worth of the program. In some instances, the client of an evaluation is more than happy to let the evaluator do so. But in other cases, the client prefers to take on this responsibility. The fact is that, in studies which use the logic of evaluation, some judgment of worth must be made by someone. (p. 20)

4. List Owen's five evaluation forms. (a major part for each definition is from this website)
  • Proactive: Guides the early planning so that it incorporates the views of stakeholders and the accumulated knowledge from previous work in the field. Focuses on the actual need for a program. The main use of these data is to help planners determine what type of program would meet the identified social need or problem. This type of evaluation is carried out before a program is developed.
  • Clarificative: Quantifies both the program’s process and objectives - make program assumptions explicit. Concerned to clarify the underlying rationale of a program. Program developers use this information to think through, and make explicit, the logic that supports the program, including assumptions about how its components link together to produce the desired outcomes. While design evaluation would usually occur before the implementation of a program, it may also be carried out while a program is operating if it is not clear how it was intended that the program was to be delivered. As such, it has a formative evaluation orientation.
  • Interactive: Think of this as evaluation designed to enable the program to make "mid-course corrections". Examines program implementation, including the extent to which a program is being delivered in the way it was intended to be delivered. The information gained is used to determine how the implementation of the program could be improved, and so it has a strong formative evaluation emphasis. Consequently, this form of evaluation is conducted as the program is being delivered within its various settings. The information is of particular use to those implementing the program.
  • Monitoring: focuses on program outcomes and delivery for management decision-making and accountability purposes. These data are used primarily to account for the expenditure of program funds, including the extent to which key accountabilities have been met by program managers. This type of evaluation 'is appropriate when a program is well established and ongoing' (Owen, 1993: 24). It frequently involves keeping track of how the program is progressing. Real-time feedback to managers is an important feature of this type of evaluation. (Targets are already identified and implementation is taking place; managers need an indication of the performance of components of the program.)
  • Impact: establishes the effects of a programme once it has been implemented and settled for a period of time. This may involve determining the degree to which program objectives have been met or documenting both intended and unintended outcomes. The main use of these data is to justify whether the program should continue to be implemented or implemented in other settings and, if so, whether any modifications are required. Thus, it has a strong summative evaluation emphasis. Because of this, impact evaluation is usually conducted after some logical end point in the program has been reached, such as where a neighbourhood watch program has been fully operational for a year.
5. Define each of the five evaluation forms. (p. 53/54)

Proactive
  • Orientation: Synthesis
  • Typical issues: Is there a need for the program? What do we know about the problem that the program will address? What is recognized as best practice* in this area? Have there been other attempts to find solutions to this problem? What does the relevant research or conventional wisdom tell us about this problem? What could we find out from external sources to rejuvenate an existing policy or program?
  • Key approaches: Needs assessment; research review; review of best practice [benchmarking]
  • State of program: None
  • Major focus: Program content
  • Timing (relative to program delivery): Before
  • Assembly of evidence: Review of documents & databases, site visits & other interactive methods. Focus groups, nominal groups & the Delphi technique are useful for needs assessments.

Clarificative
  • Orientation: Clarification
  • Typical issues: What are the intended outcomes and how is the program designed to achieve them? What is the underlying rationale for this program? What program elements need to be modified in order to maximize the intended outcomes? Is the program plausible? Which aspects of this program are amenable to a subsequent monitoring or impact assessment?
  • Key approaches: Evaluability assessment; logic/theory development; accreditation
  • State of program: Development
  • Major focus: All elements
  • Timing (relative to program delivery): During
  • Assembly of evidence: Generally relies on a combination of document analysis, interview, and observation. Findings include the program plan and implications for organizations. Can lead to improved morale.

Interactive
  • Orientation: Improvement
  • Typical issues: What is this program trying to achieve? How is this service going? Is the delivery working? Is the delivery consistent with the program plan? How could delivery be changed to make it more effective? How could this organization be changed so as to make it more effective?
  • Key approaches: Responsive; action research; quality review; developmental; empowerment
  • State of program: Development
  • Major focus: Delivery
  • Timing (relative to program delivery): During
  • Assembly of evidence: Relies on intensive onsite studies, including observation. The degree of data structure depends on the approach. May involve providers & program participants.

Monitoring
  • Orientation: Justification/fine-tuning
  • Typical issues: Is the program reaching the target population? Is implementation meeting program benchmarks? How is implementation going between sites? How is implementation now, compared with a month ago? Are our costs rising or falling? How can we fine-tune the program to make it more efficient? How can we fine-tune the program to make it more effective? Is there a program site which needs attention to ensure more effective delivery?
  • Key approaches: Component analysis; devolved performance assessment; systems analysis
  • State of program: Settled
  • Major focus: Delivery/outcomes
  • Timing (relative to program delivery): During
  • Assembly of evidence: A systems approach requires the availability of Management Information Systems [MIS], the use of indicators, & the meaningful use of performance information.

Impact
  • Orientation: Justification/accountability
  • Typical issues: Has the program been implemented as planned? Have the stated goals of the program been achieved? Have the needs of those served by the program been achieved? What are the unintended outcomes? Does the implementation strategy lead to intended outcomes? How do differences in implementation affect program outcomes? Has the program been cost-effective?
  • Key approaches: Objectives based; process-outcome studies; needs based; goal free; performance audit
  • State of program: Settled
  • Major focus: Delivery/outcomes
  • Timing (relative to program delivery): After
  • Assembly of evidence: Traditionally required the use of preordinate research designs, where possible the use of treatment & control groups, & the use of tests & other quantitative data. Studies of implementation generally require observational data. Determining all the outcomes requires the use of more exploratory methods & qualitative evidence.
*Remember Patton's comments on best practices: Patton, M. Q. (2001). Evaluation, knowledge management, best practices, and high quality lessons learned. American Journal of Evaluation, 22(3), 329-336.*

  • Program Logic: describes the nature of social and educational programming; the causal mechanisms which are understood to link program activities with intended outcomes. (p. 43)
  • Evaluation Theory: a body of knowledge that conceptualizes, aids in understanding, and predicts action in the area of evaluation inquiry. (p. 43)
  • Program Logic Development: involves the development of program logic using a range of analytical methods, including analysis of documentation and interviews with program staff and other stakeholders, with a view to constructing a map of what the program is intended to do. (from clarificative evaluation, p. 43/44)

Page 50 Scenarios
  • Scenario A: Clarificative (Logic development... clarify internal structure and function... causal mechanisms which link program activities with intended outcomes)
  • Scenario B: Interactive (Needs-based evaluation... judging the worth of a program... does the program meet the needs of the participants?) OR
  • Scenario B: Impact (it is an established program adopted from elsewhere... how do the students perform as a result of the program (outcome measures)?)
  • Scenario C: Proactive (Research review... use pure/applied research to impact on social and educational planning... takes place before a program is designed)
  • Scenario D: Impact (Performance audit... analysis of program efficiency and effectiveness... judgment on continuance or cancellation)
  • Scenario E: Monitoring (Systems analysis... setting up procedures by which central management institutes common evaluation procedures to be used uniformly across an organization???)
  • Scenario F: Interactive (Empowerment evaluation... involves helping program providers to develop and evaluate their own programs... action research)
  • Scenario G: Impact (Devolved performance assessment... the organization sets up evaluation procedures by which components can report regularly on their progress.)

6. What is PEC and its implication for the five evaluation forms? (p. 55/56)
  • PEC = Program Evaluation Continuum
  • highlights the need for evaluation to contribute to decision-making at every key point in program design linked to: (a) pre-program, (b) during implementation, and (c) post-completion.
  • Pre-program Stage: Proactive and Clarificative evaluations for program identification and for program design and appraisal. These seek to identify worthwhile investments which address needs or tap developmental opportunities. Normally involves some form of situational or trend analysis, problem identification and comparison with a desired state... (p. 55)
  • Implementation Stage: Monitoring evaluation is used to check that the program is on target in terms of its stated objectives. It seeks to provide up-to-date info on actual implementation progress as compared with targets, and to suggest corrective action. Impact evaluation is also involved, to assess the likelihood that the stated objectives can be attained, with a view to identifying changes. (p. 56)
  • Post-completion Stage: Terminal evaluation focuses on immediate outcomes; seen as an end-of-project status statement that could be assembled by project management, it focuses on resource use and the actual outcomes to this point in time, and seeks to provide an end-of-implementation summary to interested parties, including the funding agency. Impact evaluation follows after sufficient time has passed to allow the full effects of the program to be noted; it provides info about the 'final' outcomes of the program, both expected and unexpected, and is likely to be done by an outside evaluation consultant who reports to the funding agency. (p. 56)

7. Which type of question (who, what, when, where, why) does Owen's evaluation forms address?


8. What is a post hoc evaluation?
  • post hoc = Latin for after this.
  • Many times evaluators have a limited range of evidence on which to reach conclusions. Post hoc evaluations are those in which researchers rely on data collection for a program that has already been completed or has already started. (p. 65)

9. What are the ten elements of an evaluation plan, as suggested by Owen? (p. 72/73)
  1. Specify the evaluand: What is the focus of the evaluation?
  2. Orientation or purpose(s) of the evaluation: Why is the evaluation being done?
  3. Clients/primary audiences: Who will receive and use the information?
  4. Evaluation resources: what human and material resources are available?
  5. Evaluation focus(es): Which element(s) of the program will need to be investigated? -- program context, program design, program implementation, program outcomes or combination?
  6. Key evaluation issues/questions: Assembly of evidence/data management - What are the key questions, and how can we collect and analyze data to answer them? (For each question, outline the data management techniques to be used.) Key questions - To what extent does... ? Is there... ? In what way does... ? Data management - What are the most appropriate methods of data collection and data reduction? Collection - Is sampling important? Is anything known about this from other sources? How will the data be collected? Analysis and interpretation - How will the data be analyzed to address the key evaluation question?
  7. Dissemination: What strategies for reporting will be used? When will reporting take place? What kinds of information will be included (findings, conclusions, judgments, recommendations)?
  8. Codes of behaviour: What ethical issues need to be addressed?
  9. Budget and timeline: Given the resources, what will be achieved at key time points during the evaluation?
  10. Other considerations which emerge in the course of the negotiation.

Owen, 1999, p. 170-307

  1. What is proactive evaluation concerned with?
  2. List the key approaches to proactive evaluation.
  3. What is clarificative evaluation concerned with?
  4. List the key elements of clarificative evaluation.
  5. What is interactive evaluation concerned with?
  6. List the key elements of interactive evaluation.
  7. What is monitoring evaluation concerned with?
  8. List the key elements of monitoring evaluation.
  9. What is impact evaluation concerned with?
  10. List the key elements of impact evaluation.

Chen, 1996, p. 121-130

1. Why does Chen criticize the dichotomy of formative and summative evaluation?
  • Chen believes that program evaluation has outgrown the formative-summative distinction.
  • The formative-summative dichotomy does not cover many relevant, important kinds of evaluations; as a result, there are discrepancies between the concepts as defined by Scriven and the actual usage of these concepts.
2. Chen proposes a four-cell typology of evaluation, in lieu of the formative/summative dichotomy. Is the functions/stages typology more inclusive of all types of evaluation? (p. 3)
  • It is more inclusive: in Scriven's formative-summative dichotomy, the cook tastes the soup only to improve it, not to make an evaluative conclusion (to see whether it is good enough to serve to the guests).
  • When a cook tastes the soup, sometimes the soup is not good enough for the guests and/or improving it is thought to be beyond the cook's capabilities; therefore, the tasting is not just formative or summative. Scriven's formative-summative distinction is narrow. (p. 122)
  • Scriven considers the moment when the guests taste the soup to be summative, since the guests provide a conclusive opinion of the soup; however, the guests' opinions could be used to improve the soup in the future (= formative).
  • Scriven's formative-summative dichotomy and Stake's soup-tasting analogy lead to problems in classifying relevant evaluation activities, indicating the need for a broader conceptual framework that can provide a more complete classification of evaluation activities. (p. 123)

Basic_Types_of_Evaluation_-_Chen_1996.png



2_Conceptual_Framework_for_Basic_Evaluation_Types_Chen.jpg

I made a little summary of Chen's typology. (p. 123-125)

3. How does Chen's typology relate to Owen's five evaluation forms - or does it?
I mashed up Chen's typology and Owen's evaluation forms. They complement each other nicely, as Owen lacks the dimension of function and Chen has only two program stages (process and outcome). I like that Owen's five evaluation forms make it easier to communicate with others - they are more specific. But I like Chen's argument that at any stage the function of the evaluation can be assessment or improvement.

Mix_Chen_and_Owen_smaller.jpg



4. Chen expands his four-cell typology to include an additional six cells for mixed-purpose evaluations. What are the three circumstances that indicate the use of mixed-purpose evaluations?
Basic_and_Mixed_Types_of_Evaluation.jpg

(Figure 2 from p. 128)
Examples of Mixed Types of Evaluation:
  • Comprehensive Process Evaluation: used to strengthen the implementation process and to judge the merits of a program
  • Comprehensive Outcome Evaluation: elaborates causal mechanisms underlying a program so that it examines not only whether the program has an impact, but why, and which mechanisms influence program success or failure.
Sequential Integration - links different types of evaluations in sequential order
  • Comprehensive Assessment Evaluation: first evaluates the merits of a program implementation and then evaluates program effectiveness. Info from implementation and outcome assessment provides a comprehensive merit judgment of the overall program.
Examples of circumstances where mixed evaluation is used (p. 129):
  1. When evaluation is not a one-shot study, but used as a mechanism for continuous feedback to program stakeholders. As the program progresses and changes, the need for evaluation is different from time to time. Under such conditions, mixed types of evaluation allow the evaluator flexibility in meeting the changing needs.
  2. When evaluation has to respond to multiple stakeholder groups with different needs. Some stakeholders, such as clients and funding agencies, may require information on merit assessment; others, such as program implementers, may need information on program improvement. Under this condition, mixed-type evaluations provide information to meet the diversity of needs.
  3. When stakeholders require comprehensive information about the program. Chen (1994) argued that, due to the complexity of program implementation processes, the effectiveness of labor-intensive programs cannot be appropriately assessed without answering both of the following questions: "Does the program achieve its goals?" and "How does the program achieve its goals?" Mixed-type evaluations provide such comprehensive information.



=== Unit 4 Program Evaluation Methods and Applications

=

Fitzpatrick, J. L., Sanders, J. R., & Worthen, B. R. (2004). Program evaluation: Alternative approaches and practical guidelines (pp. 260-300). White Plains, NY: Longman.
Fitzpatrick, J. L., Sanders, J. R., & Worthen, B. R. (2004). Program evaluation: Alternative approaches and practical guidelines (pp. 301-374). White Plains, NY: Longman.
Janesick, V. J. (1998). Stretching exercises for qualitative researchers (pp. 13-43). Thousand Oaks, CA: Sage.
Poulin, M. E., Harris, P. W., & Jones, P. R. (2000). The significance of definitions of success in program evaluation. Evaluation Review, 24(5), 516-536.

Fitzpatrick, Sanders & Worthen, 2004, p. 260-300

Stufflebeam's (1973a) resultant structure for developing evaluation designs includes these six activities/functions: (p. 260)
  1. Focusing the evaluation
  2. Collecting information
  3. Organizing information
  4. Analyzing information
  5. Reporting information
  6. Administering the evaluation

(Questions 1 to 4 are not from the assigned readings, not sure about question 5.)
1. The authors delineate criteria and standards hierarchically, with criteria being more general than standards. Is this representative of the way most evaluators use these terms?
2. What is the stakeholders' function in delineating evaluation questions?
3. Is the process of determining standards and criteria dependent on type of evaluation model selected?
4. In evaluation planning, is the client involved?
5. Are research designs useful in evaluation contexts?

6. What do the authors suggest as a classification scheme for data collection methods? (p. 268)
  • I. Data collected directly from individuals identified as sources of information
    • A. Self-reports: (1) Paper-and-pencil methods (e.g., structured questionnaires, unstructured surveys, checklists, inventories, rating scales); (2) Interviews (structured or unstructured, personal or telephone); (3) Focus groups; (4) Personal records kept at evaluator's request (e.g., diaries, logs)
    • B. Personal products: (1) Tests: a. Supplied answer (essay, completion, short response, problem solving); b. Selected answer (multiple-choice, true-false, matching, ranking); (2) Performances (simulations, role-playing, debates, pilot competency testing); (3) Samples of work (portfolios, work products of employees)
  • II. Data collected by an independent observer
    • A. Narrative accounts
    • B. Observation forms (observation schedules, rating scales, checklists)
  • III. Data collected by a technological device
    • A. Audiotape
    • B. Videotape
    • C. Time-lapse photographs
    • D. Other devices: (1) Physical devices (blood pressure, air quality, blood-alcohol content, traffic frequency or speed); (2) Graphic recordings of performance skills; (3) Computer collation of participant responses
  • IV. Data collected with unobtrusive measures
  • V. Data collected from existing information resources or repositories
    • A. Review of public documents (federal, state, or local department reports, databases, or publications)
    • B. Review of organizational documents or files (files of client records, notes or products of employees or program deliverers, manuals, reports, audits, publications, minutes of meetings)
    • C. Review of personal files (correspondence or e-mail)
7. What are the elements of an evaluation management plan, as described by the authors? (p. 277)
  • A management plan is needed to structure and control resources, including time, money, and people.
  • A good management plan must specify, for each evaluation question:
    1. the tasks to be performed and the time lines for each task,
    2. the personnel and other resources required to complete the task, and
    3. the cost.

Sample headings for a management plan worksheet: Evaluation Question | Tasks | Estimated Task Beginning & Ending Dates | Personnel Involved & Estimated Costs | Other Resources Needed & Costs | Total Task Cost
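
Not from the text, but as a quick illustration: one row of such a worksheet could be captured as structured data and the total task cost computed from its parts. All questions, tasks, dates, and figures below are invented.

```python
# Hypothetical management-plan worksheet rows, one per evaluation question.
plan = [
    {
        "evaluation_question": "Is the program reaching the target population?",
        "tasks": ["Draft intake survey", "Administer survey", "Analyze responses"],
        "begin": "Month 1", "end": "Month 3",
        "personnel_cost": 4000, "other_resources_cost": 500,
    },
]

for row in plan:
    total = row["personnel_cost"] + row["other_resources_cost"]
    print(f'{row["evaluation_question"]} -> total task cost ${total:,}')
```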

Fitzpatrick, Sanders & Worthen, 2004, p. 301-374
Data Collection Designs
  • Case studies may be used to accumulate evidence for causal relationships and are invaluable for exploring issues in depth, providing "thick descriptions" of programs in implementation, different outcomes, contextual issues, and needs and perspectives of various stakeholders.

  • Experimental designs may be used when the stakes are high. An experimental design is a research design for investigating cause-and-effect relationships between interventions and outcomes: subjects are randomly assigned to control and experimental groups, and the differences in outcomes are then compared. Experimental designs, if feasible, are preferable to quasi-experimental designs in that they can counter more threats to the internal validity of the study. (p. 311)
    • Posttest-only designs: (1) decide what comparisons are desired and meaningful, (2) ensure that the 'students' in the two (or more) comparison groups are similar, and (3) collect information after the program ends (posttest) to determine whether differences occurred. The name of the design, posttest-only, does not dictate the measure to be used. Post-treatment measures can be surveys, interviews, observations, tests, or any other measure or measures deemed appropriate for getting a full picture of the outcomes. (p. 311/312)
    • Pre-post designs: used when a pretreatment measure can supply useful information. For example, if the groups are small, there may be concern about their equivalence. A pretest can help confirm their equivalence, though only on the measures collected. If there is concern that many participants may drop out of the program, and thus scores on the posttest may not represent equivalent groups, pretest scores can be used to examine differences in the two groups as a result of dropouts. Many use pretests as benchmarks to report the change that has occurred in those participating in the program from before the program to its conclusion. These reports are often appealing to stakeholders; however, pre-post comparisons can be misleading because the change from pre to post can be due to the program and/or other factors in the participants' lives (e.g., natural changes that occur with the passage of time, other learning, and intervening events). Instead, the post measure of the comparison group is generally the more appropriate comparison because it better represents what the treatment group would have been like, at that point in time, if they had not received the new curriculum or program. In other words, the comparison group experienced the same other factors and passage of time that contributed to the change from the pre to the post measure in the group receiving the new program; they simply did not experience the program. So, the difference between the comparison group and the treatment group more clearly reflects the effect of the program than does a comparison of the change from the pre to post measures.
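    • A hypothetical illustration (numbers invented): if the treatment group moves from 50 to 70 on the outcome measure while the comparison group moves from 50 to 65 over the same period, the naive pre-post gain of 20 points overstates the program's contribution; comparing the two posttests puts the program effect closer to 70 - 65 = 5 points, because the comparison group's 15-point gain reflects the maturation and other concurrent influences that both groups experienced.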

  • Quasi-experimental designs can be useful when random assignment is difficult or inappropriate. A research design that does not involve random allocation of subjects between control and experimental groups, but approximates that design in other respects, e.g., using a comparison group that is matched on all apparently relevant characteristics. The term, due to Campbell and Cook, is misleading because it suggests that these designs are less adequate for establishing causal connections, when in fact they are often just as adequate, and often have other huge advantages (e.g., reduced cost, shorter timelines, less ethical problems). The key error here is to think that 'adequacy' involves some standard that goes beyond the scientific and legal requirement of 'beyond reasonable doubt' which can readily be obtained by other designs. (link)
    • Interrupted time-series designs (p. 316) to establish a trend for change/cause
      • Random assignment is inappropriate or impractical (e.g., program is a policy or law that applies to everyone)
      • Existing data are available that have consistently measured the construct of interest
      • Quite a few data collection points exist to establish a trend prior to the new program or policy
      • Few, if any, other factors are anticipated to occur concurrently that could also change the construct
      • The program or policy should have a relatively quick impact
    • Nonequivalent comparison group design is similar to pre-post design, but there is no random assignment to groups. Try to find an existing group very similar to the one that will receive the new program. The pretest is a more important component of this design than it is in the experimental designs because it helps us examine similarities between the groups. Of course, the goal is to establish the equivalence of the groups, if only on the pre-measure. (p 316)
    • Regression-discontinuity design is used when eligibility for the program to be studied is determined by a person's "scoring" above or below a certain point on the eligibility criterion (e.g., high blood pressure or cholesterol levels). The design then compares outcomes for people in the program with outcomes for people who were not eligible for the program, using regression methods. A "discontinuity" in the line, or a difference in the regression line, for the two groups suggests a program effect. This design can be useful when programs are limited to those most in need or most qualified, such as a program for highly gifted students, and eligibility is determined by a clearly defined cut point. (p. 316/7)
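
As an aside (not from the readings), the core logic of a regression-discontinuity check can be sketched in a few lines: fit a line on each side of the eligibility cutoff and look for a jump at the cutoff. The cutoff, scores, and outcomes below are all simulated purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
score = rng.uniform(0, 100, 400)                # hypothetical eligibility score
eligible = score >= 60                          # assumed cutoff of 60
# Simulated outcome: a smooth trend in score plus a jump of about 8 for eligibles
outcome = 0.3 * score + 8 * eligible + rng.normal(0, 5, 400)

# Fit a simple regression line on each side of the cutoff
left = np.polyfit(score[~eligible], outcome[~eligible], 1)
right = np.polyfit(score[eligible], outcome[eligible], 1)

# Estimated program effect = gap between the two fitted lines at the cutoff
print(f"Estimated jump at the cutoff: {np.polyval(right, 60) - np.polyval(left, 60):.1f}")
```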

  • Descriptive Designs: Case studies are often used for descriptive purposes when the desire is to examine an issue from many different perspectives. Unlike the qualitative case study design, these designs do not provide in-depth descriptions. They are fairly simple designs but are used frequently to answer rather straightforward questions.
    • Cross-sectional designs provide useful quantitative information on large numbers of individuals and groups to show a "snapshot in time". Cross-sectional studies (also known as cross-sectional analyses) form a class of research methods that involve observation of some subset of a population of items all at the same time, in which groups can be compared at different ages with respect to independent variables, such as IQ and memory. The fundamental difference between cross-sectional and longitudinal studies is that cross-sectional studies take place at a single point in time, whereas a longitudinal study involves a series of measurements taken over a period of time. Cross-sectional research takes a 'slice' of its target group and bases its overall finding on the views or behaviours of those targeted, assuming them to be typical of the whole group. (link)
    • Time-series design is intended to demonstrate trends or changes over time. Unlike the interrupted time-series design, the purpose of the design is not
      to examine the impact of an intervention, but simply to explore and describe changes in the construct of interest. The results of a time-series design can be very useful at the beginning stage of a case study if the evaluator explores with stakeholders their interpretations of the ups and downs exhibited in the results. Their perspectives may point the way to the next steps in data collection. (p. 318)

  • Mixed Methods Designs: There is no one design that is best for all settings. A good design is one that matches the evaluation questions developed during the planning phase, the context of the evaluation, and the information needs and values of stakeholders. When using mixed methods, the evaluator should consider her purpose or purposes in using those mixed methods and select the design or approach most appropriate for achieving that purpose. (p. 319)
    • Component Design: These designs are common, and do expand our knowledge of the program, but are not necessarily the best use of mixed methods designs because the contrasting methods focus on such different issues. (p. 320) [I am not sure if they include triangulation in this, as it measures the same thing using tools that have different biases.]
      • Triangulation refers to the convergence or corroboration of data gathering and interpretation about the same phenomenon. The exact approach or form of data gathering and/or interpretation can vary. For example, researchers sometimes state they are using data triangulation, investigator triangulation, theoretical triangulation, or methodological triangulation. Data triangulation refers to the convergence or corroboration of data about the same phenomenon. Investigator triangulation refers to the collaboration of two or more investigators to gather and interpret the data. Theoretical triangulation refers to the use of more than one theoretical framework to guide the conceptualization of the study and the interpretation of the data. And, methodological triangulation refers to the use of more than one method to gather the data. The terms methodological triangulation and triangulation are often used by different researchers as being synonymous with the broader designation of mixed or multiple methods. The use of these terms can be confusing. (link)
      • Complementarity reaches beyond triangulation by focusing not only on overlapping or converging data, but also on the different facets of a phenomenon, providing a greater range of insights and perspectives. (link) It is designed to gain a fuller understanding or picture of the construct of interest. Methods with different biases may still be selected, but not with the hope of results converging and increasing validity. Instead, the hope is for somewhat different results that, when combined across methods, will provide a fuller picture of the abstract constructs we tend to examine in evaluation. (p. 306)
      • Expansion is the overall widening of the scope, breadth, or range of a study. (link) and provides a fuller picture of the program, but not of any individual construct. (p. 306)
    • Integrated Designs mix methods and paradigms at many different stages. Caracelli and Greene see these designs as more desirable, writing that they "have the potential to produce significantly more insightful, even dialectically transformed, understandings of the phenomenon under investigation." (p. 320)
      • Iterative_Design_Process.jpg [Iterative Design]
      • Iterative/spiral design (link)
        • The evaluator uses different methodologies, from different paradigms, in sequence with the results of each informing the next stage of data collection and interpretation. Thus, interviews might be conducted to begin tapping the construct and to provide information for constructing a survey or some other paper-and-pencil measure such as a test. This measure could be given to a broader sample of people to further explore perspectives gained from the initial interviews and study the degree to which such views are held across many different people and subgroups.
          These results might then be used to conduct focus groups or more intensive interviews with subsets of the population tested or surveyed. This stage of data collection can probe for greater understanding of the survey results. (p. 320)
        • [From gaming] Iterative design, graphic above, is a design methodology based on a cyclic process of designing, testing, analyzing, and refining a work in progress. In iterative design, interaction with the designed system is used as a form of research for informing and evolving a project, as successive versions, or iterations of a design are implemented. Test; analyze; refine. And repeat. Because the experience of a viewer/user/player/etc cannot ever be completely predicted, in an Iterative process design decisions are based on the experience of the prototype in progress. The prototype is tested, revisions are made, and the project is tested once more. In this way, the project develops through an ongoing dialogue between the designers, the design, and the testing audience. (link)
      • Embedded - Different methods are embedded within each other (p. 320)
      • Holistic - Use program theory or concept mapping as structure for integrating mixed methods throughout (p. 320)
      • Transformative - Mix methods, values, stakeholders; use participatory, empowerment, action-oriented (p. 320)


The qualitative researcher concentrates on the instance, trying to pull it apart and put it back together again more meaningfully--analysis and synthesis in direct interpretation. The quantitative researcher seeks a collection of instances, expecting that, from the aggregate, issue-relevant meanings will emerge. (p. 360)

1. Define quantitative methods.
  • Quantitative research uses methods adopted from the physical sciences that are designed to ensure objectivity, generalizability and reliability. These techniques cover the ways research participants are selected randomly from the study population in an unbiased manner, the standardized questionnaire or intervention they receive and the statistical methods used to test predetermined hypotheses regarding the relationships between specific variables. The researcher is considered external to the actual research, and results are expected to be replicable no matter who conducts the research.
  • The strengths of the quantitative paradigm are that its methods produce quantifiable, reliable data that are usually generalizable to some larger population. Quantitative measures are often most appropriate for conducting needs assessments or for evaluations comparing outcomes with baseline data. This paradigm breaks down when the phenomenon under study is difficult to measure or quantify. The greatest weakness of the quantitative approach is that it decontextualizes human behavior in a way that removes the event from its real world setting and ignores the effects of variables that have not been included in the model. (link)

2. Define qualitative methods.
  • Qualitative research methodologies are designed to provide the researcher with the perspective of target audience members through immersion in a culture or situation and direct interaction with the people under study. Qualitative methods used in social marketing include observations, in-depth interviews and focus groups. These methods are designed to help researchers understand the meanings people assign to social phenomena and to elucidate the mental processes underlying behaviors. Hypotheses are generated during data collection and analysis, and measurement tends to be subjective. In the qualitative paradigm, the researcher becomes the instrument of data collection, and results may vary greatly depending upon who conducts the research.
  • The advantage of using qualitative methods is that they generate rich, detailed data that leave the participants' perspectives intact and provide a context for health behavior. The focus upon processes and "reasons why" differs from that of quantitative research, which addresses correlations between variables. A disadvantage is that data collection and analysis may be labor intensive and time-consuming. In addition, these methods are not yet totally accepted by the mainstream public health community and qualitative researchers may find their results challenged as invalid by those outside the field of social marketing. (link)

3. What is critical multiplism?
  • Critical implies that, as in positivism, the need for rigor, precision, logical reasoning and attention to evidence is required, but unlike positivism, this is not confined to what can be physically observed. Multiplism refers to the fact that research can generally be approached from several perspectives. Multiple perspectives can be used to define research goals, to choose research questions, methods, and analyses, and to interpret results (Cook 1985). (link)
  • The method of critical multiplism involves testing competing models in order to determine the most inclusive and parsimonious [frugal or economical] explanation.
  • Critical Multiplism is a framework that constitutes the combination of two prescriptions for research planning: critical thinking and multiple representation. The critical part refers to the fact that, in planning a research effort, decisions about how to carry out the research should be made critically; the opposite would presumably be mindlessly. For instance, one must decide what the outcome measure will be for an intervention study. One might simply elect to use whatever outcome measure is common in the field and let it go at that. That could be fairly mindless decision making if one knew that the common outcome measure had deficiencies or if it were unlikely to capture exactly the outcomes intended from the intervention. If one were interested in the effect of treatment of a seizure disorder on the reading ability of children, one would use as research subjects children of one gender or the other, or else children of both genders. Critical thinking would lead to an active, rather than a passive, decision about what to do, along with articulated reasons for doing it that way. Less critical approaches would be of the "convenient" or "seemed like a good idea at the time" variety.
  • The multiplism component has to do with the fact that only if facets of research are allowed to vary, or are planned to vary, will we have good grounds for generalizing findings. Multiplism is precisely the thinking behind recent federal legislation that mandates that research supported by the National Institutes of Health (NIH) will be expected, unless otherwise justified, to include both male and female subjects and subjects from minority groups as well as Whites. Too much important biomedical research was being done on White males and was regarded as dubiously applicable to females and minorities. That thinking, multiplism, is also behind frequent complaints that so much psychological research is done on college sophomores. (link: search for Critical Multiplism)

4. How is critical multiplism related to triangulation? (p. 224 of Philosophical and theoretical perspectives for advanced nursing practice)
  • Critical multiplism has been called a method of 'elaborated triangulation' and, as such, is purported to add nothing new to contemporary research methodology. Post positivists counter that triangulation is part of, but not equal to, the critical multiplist approach. Triangulation has been defined as the combination of two or more theories, data sources, methods or investigators in the study of a single phenomenon. The goal of triangulation is to circumvent the personal biases of investigators and overcome the deficiencies intrinsic to a single-investigator, single-theory or single-method study to promote greater confidence in the observed findings.
  • Critical multiplism is also concerned with reducing bias, in the recognition that no one approach or measure is perfect or without bias. As a result, both triangulation and critical multiplism seek to eliminate inherent bias in the research methods chosen. However, critical multiplism goes further in that it encourages the exhaustive study of phenomena from as many different perspectives as possible, given the recognition that theory is a huge fishnet of complex, mutually interacting relationships among constructs or variables. Further, triangulation frequently is presented as being conducted by a lone researcher or a group of researchers working in tandem to study a phenomenon. In contrast, critical multiplism does not require that researchers work in tandem; rather, they may be working in vastly different regions of the globe, in isolation, yet studying similar phenomena in different ways.
  • It is the essential critique and scrutiny of the multiple ways of studying the phenomena that distinguishes critical multiplism from triangulation. Critical multiplism is an encouragement to all researchers to be open to all of the possible ways of examining phenomena in an effort to arrive at warranted knowledge claims. This openness is part of the reason for critical multiplists' desire to involve multiple stakeholders, beyond mere investigators, in the process of research.

5. List a few major technical problems with data collection in evaluation. (p. 358)
  • Unclear directions lead to inappropriate responses, or the instrument is insensitive or off-target. (Always pilot-test your methods.)
  • Inexperienced data collectors reduce the quality of the information being collected. (Always include extensive training and trial runs. Eliminate potential problem staff before they hit the field. Monitor and document data collection procedures.)
  • Partial or complete loss of information occurs. (Duplicate files and records; keep records and raw data under lock and key at all times.)
  • Information is recorded incorrectly. (Always check data collection in progress. Cross-checks of recorded information are frequently necessary.)
  • Outright fraud occurs. (Always have more than one person supplying data. Compare information, looking for the "hard to believe.")
  • Procedures break down. (Keep logistics simple. Supervise while minimizing control for responsible evaluation staff. Keep copies of irreplaceable instruments, raw data, records, and the like.)
  • One of the newer issues in experimental designs concerns the failure to adequately consider statistical power in planning designs. As a result, Type II errors, or failure to find significant differences between groups when such differences really exist, occur far more frequently than we are aware. Such errors can cause us to reject beneficial programs because we believe they make no difference when, in fact, small sample sizes and/or large group variability may have limited our ability to detect differences. Lipsey (1990) discusses methods for planning designs to avoid such problems. (p. 317)
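
A back-of-the-envelope power check before finalizing a design can flag this problem early. The sketch below is my own illustration (not from the text), using the standard normal approximation for comparing two group means; the effect size, alpha, and power values are arbitrary defaults.

```python
from scipy.stats import norm

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group to detect a standardized mean difference."""
    z_alpha = norm.ppf(1 - alpha / 2)   # two-sided test
    z_beta = norm.ppf(power)
    return 2 * ((z_alpha + z_beta) / effect_size) ** 2

# A "medium" effect (d = 0.5) needs roughly 63 per group by this approximation
# (an exact two-sample t-test calculation gives closer to 64).
print(round(n_per_group(0.5)))
```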

6. How are experimental designs related to data collection methods employed?
  • I think this question should be, "How are experimental designs, related to data collection methods, employed?" (insert commas)
  • Experimental designs are controlled studies in which subjects are randomly assigned to control and experimental groups, and the differences in outcomes are then compared. (p. 311)
7. Is sampling an issue in performing program evaluations, normally? When is it not an issue?
  • To answer most evaluation questions, data will be gathered from the entire population because the population of interest is relatively small and external validity, or generalizability, beyond the group of interest is not a priority. Methods of random sampling can be used when the group is large and generalizability is important. Purposive sampling is useful when information is being collected from small numbers of individuals or units. In these cases, the purpose is not generalizability, but description and generation of new ideas. (Not an issue: small units; an issue: larger units.) (p. 331/2)
  • Must choose the appropriate sample size to have confidence in validity and credibility for the general audience for the study. (p. 322)
  • When groups are small it is best to collect information from all of them. However, when data collection becomes costly, as with interviews or observations, sampling is necessary. Some might think that random sampling should always be used. However, random sampling for interviews and observations could result in data that are representative but not very useful for the intended purpose. (p. 330)
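
For survey-style data collection from a large population, a standard rule-of-thumb calculation gives a feel for the numbers involved. This is my own sketch (not from the text) and assumes simple random sampling and a proportion-type question.

```python
from math import ceil
from scipy.stats import norm

def survey_sample_size(margin_of_error=0.05, confidence=0.95, p=0.5):
    """Sample size needed to estimate a proportion within the given margin of error."""
    z = norm.ppf(1 - (1 - confidence) / 2)
    return ceil(z**2 * p * (1 - p) / margin_of_error**2)

print(survey_sample_size())  # about 385 respondents for +/-5% at 95% confidence
```
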
8. Define cost-benefit analysis, cost-effectiveness analysis, cost-utility analysis, and cost-feasibility analysis.
  • cost analysis- The process of determining all the significant costs of something. This is a highly skilled process and for a long time in the history of evaluation was ignored or treated as a minor consideration, probably because in the early days, the evaluands were all programs already paid for by the state. Various species of cost analysis are useful for different purposes (link):
    • cost-benefit analysis - This approach requires, and so will only work when it is possible, to reduce all costs and benefits of an option to monetary terms (this can be difficult). It has the advantage of giving a simple answer. Each alternative is examined to see whether benefits exceed costs, and the ratios of the alternatives are compared. The alternative with the highest benefit-to-cost ratio is then selected. When that is impossible, use cost-effectiveness analysis. Often, simple cost analyses will suffice to satisfy the client. Given their costs, cost-benefit studies are only cost-effective when stakeholders are trying to make summative decisions about programs with quite different outcomes. Should we rebuild the playground or purchase new books? When a choice is to be made among programs with like outcomes, other types of cost studies that do not require monetizing benefits can be more appropriate. (link and p. 325)
    • cost-effectiveness analysis involves comparing the costs of programs designed to achieve the same or similar outcomes. When the task for the administrator or stakeholder is to choose from among several different ways to achieve the same goal, this method would be the correct choice. Here we simply provide a list of all costs and all benefits, when it is not possible to reduce them all to money terms. This greatly improves the quality of choices made by those who see the analysis, since it usually uncovers many hidden costs and hidden benefits. (link and p. 326)
    • cost-utility analysis is used to analyze alternatives by comparing their costs and their utility as perceived by users. Utility can be measured by assessing users' preference for or satisfaction with each option. The results are ratios quite similar to cost-effectiveness ratios except the ratio reflects cost for satisfaction, not effect. (p. 327)
    • cost-feasibility analysis - The most modest use of cost analysis is to determine whether one can in fact afford a particular option at all, when all the costs, monetary and other, are correctly computed. (link)
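
A toy example (all figures invented) of how the two most common ratios differ: a benefit-cost ratio requires benefits expressed in dollars, while a cost-effectiveness ratio divides cost by units of outcome, and the two can rank programs differently.

```python
# Hypothetical programs; costs, dollar benefits, and outcome units are invented.
programs = {
    "Program A": {"cost": 50_000, "benefit_dollars": 120_000, "outcome_units": 40},
    "Program B": {"cost": 30_000, "benefit_dollars": 60_000, "outcome_units": 30},
}

for name, p in programs.items():
    bcr = p["benefit_dollars"] / p["cost"]   # cost-benefit: higher ratio is better
    cer = p["cost"] / p["outcome_units"]     # cost-effectiveness: lower cost per unit is better
    print(f"{name}: benefit-cost ratio {bcr:.1f}, cost per outcome unit ${cer:,.0f}")
```

Here Program A has the higher benefit-cost ratio (2.4 vs. 2.0), but Program B costs less per unit of outcome ($1,000 vs. $1,250), which is why the appropriate cost analysis depends on whether the programs being compared share the same kind of outcome.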

9. What is the focus of a case study? (p. 307/8)
  • The focus of a case study is on the case itself. Such an approach may be particularly appropriate in evaluation when there is a need to provide in-depth information about the unit, or case, at hand, and not so much to generalize to a larger population. Because evaluation is typically intended to be situation specific, a case study design provides the opportunity to discover the unique attributes of an individual case.
    • description - describe something in depth
    • explanation - give the reader a real understanding of the program and the many different ways it might be viewed. The voices and perspectives of many different stakeholders involved with the program are heard.
    • exploration - exploring the "hows" and "whys" of an evaluand, encouraging deeper exploration of the issues and recognizing that there are many different perspectives.

10. Define focus groups.
  • This term refers to an important part of what is conventionally classified as qualitative methodology and to the small groups involved. Considerable expertise and a number of highly evolved techniques are involved in focus group methodology, which offers many advantages over one-on-one interviewing, such as lower cost (per head), resilience to cancellations, and the chance to draw out comments that would not surface in the personal interview, to set off against the weaknesses of reduced privacy, complex scheduling, and the need for a different kind of specialized skill. (link)
  • Focus groups are particularly useful in needs assessments and monitoring studies and for formative evaluations. (p. 351)
  • Focus groups can help confirm or disconfirm program theories during the planning stages of programs.
  • They can raise novel ideas based on participants' own experiences.
  • Focus groups can also be useful in discovering more about program outcomes, such as how participants have used what they gained, what barriers they faced, or what changes they would make in the program.

11. Is focus group interviewing the same as interviewing any small group of program participants together?
  • Focus groups are like an interview in that they involve face-to-face interaction, but they build on the group process. A skilled focus group facilitator will make use of ideas or issues raised by participants in the focus group to obtain reactions from others in the group. Discussion in focus groups is not always interviewer to interviewee, but often dialogue continues among focus group participants themselves. Thus, the interview is very much of a group process. (p. 351)
  • The role of the leader is to facilitate discussion by posing initial and periodic questions, moderating the responses of more vocal members, and encouraging responses of quieter members. The leader may also ask questions to clarify ambiguities or get reactions from other group members. Participating in groups can be intimidating for some. Sensitive topics can be difficult and there can be a tendency for individuals to acquiesce by agreeing with the majority of the group, i.e., group think. Fontana and Frey (2000) note that the skills required for leading a focus group are similar to those required for a good interviewer, but the leader must also be knowledgeable about methods for managing group dynamics. (p. 351)
  • Focus on higher-order questioning, or the focus group really becomes a structured group interview rather than a focus group, because it will have lost the key focus group characteristics of member interaction, openness, and exploration. (p. 351)

12. What is content analysis, and how is it applied in program evaluation?
  • Content_Analysis.jpg [made from info on page 362]
  • Content analysis is a special type of analysis of qualitative information collected in textual form (e.g., field notes, narrative interviews, newspaper articles, minutes of meetings). These procedures may be used to describe, analyze, and summarize the trends observed in these documents. Coding categories may focus on either the actual content of the document ("what is said") or underlying motives, emotions, or points of view ("how it is said"). (Remember the set of content analysis categories for newspaper articles on sex education.) (p. 362)
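
Once segments of text have been hand-coded into categories, the tallying step is mechanical. A minimal sketch (categories invented) of that counting step:

```python
from collections import Counter

# Imagine each interview segment has already been hand-coded into a category
# ("what is said" or "how it is said"); the codes below are made up.
coded_segments = [
    "satisfaction", "barriers", "satisfaction", "unintended outcome",
    "barriers", "satisfaction", "barriers",
]

for category, n in Counter(coded_segments).most_common():
    print(f"{category}: {n}")
```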

  • Advantages of Content Analysis (link to ppt)
    • Transparent, replicable method
    • Counting involves minimal interpretation
    • Allows for longitudinal analysis
    • Relatively unobtrusive - no reactive effects
    • Flexible - can be applied to various texts
    • Provides information about populations that are difficult to access directly

  • Disadvantages of Content Analysis (link to same ppt)
    • Only as good as the quality of the documents
    • Coding manuals have to be interpreted
    • Variant interpretations of latent content
    • Descriptive rather than explanatory--no answers to ‘why’ questions
    • Atheoretical? [Unrelated to or lacking a theoretical basis]

13. What are mixed-method evaluation designs?
  • See above.

Janesick, 1998, p. 13-58
1. How does the author define empirical?
  • "The meaning of the term empirical at this point means, "relying on direct experience and observation" and is the cornerstone of qualitative work." (p. 30)
  • I am not sure what she means by at this point; however, I have just started reading the article.

2. What are the six types of interview questions described by the author? (p. 46)
  1. Basic Descriptive Questions: Can you talk to me about your car accident? Tell me what happened on that evening. Describe how you felt that evening.
  2. Follow-up Questions: You mentioned that "planning time" is important to you. Can you tell me how you use planning time?
  3. Experience/Example Questions: You mentioned that you loved going to Paris. Can you give me an example or two of what made you love Paris? Talk about your impressions of Paris.
  4. Simple Clarification Questions: You have used the term constructivist teacher today. Can you clarify that for me? What exactly can you tell me about your constructivist teaching?
  5. Structural/Paradigmatic Questions: You stated that this class was a problematic one. What would you describe as the cause of these problems? Of all the things you have told me about being a critical care nurse, what is the underlying premise of your workday? In other words, what keeps you going every day?
  6. Comparison/Contrast Questions: You said there was a big difference between a great principal and an ordinary principal. What are some of these differences? Can you describe a few for me?

3. Does the author's interview question types mesh with Patton's four categories, presented in the Commentary of this unit?
  • Patton's four categories seem to be divided into the cognitive/affective domains and are very general, whereas Janesick's interview questions all deal with the cognitive domain (describe, plan, experience, ...) and are more specific, and thus can more easily guide novice program evaluators. [Probably more to it, but can't think of anything else right now.]

4. What are the four advantages and four disadvantages of focus groups, as listed by the author? (p. 36)
  • Strengths
    1. The major strength of focus groups is the use of the group interaction to produce data that would not be as easily accessible without the group interaction.
    2. Focus groups combine elements of both individual interviews and participant observation, the two principal data collection techniques of qualitative researchers.
    3. One can observe a great deal of interaction in a given limited time period on a particular topic.
    4. Participants' interaction among themselves replaces the interaction with the interviewer, leading to a greater understanding of participants' points of view.
  • Trade-offs/Disadvantages
    1. Focus groups are fundamentally unnatural social settings, when compared to participant observation.
    2. Focus groups are often limited to verbal behavior.
    3. Focus groups depend on a skilled moderator, not always available when needed.
    4. Do not use focus groups if the intent is something other than research; for example, conflict resolution, consensus building, staff retreats, and work to change attitudes.

5. How does the author describe analysis of interview data?


Poulin, Harris & Jones, 2000, p. 516-535
1. Why does the author emphasize the need for evaluators to know the organization's goals - both formal stated goals and informal goals?
  • Knowledge of goals gives insight into why an organization functions a particular way. Understanding the goals may also allow the evaluator to understand why actual organizational conduct differs from official policy. Having this knowledge allows the evaluator to recommend appropriate implementation strategies for change. Furthermore, awareness of organizational goals suggests to the organization (those responsible for implementing change) that the evaluators are committed to using the goals specifically identified by the organization when suggesting appropriate indicators of success. (p. 516)

  • In addition, the formal goals of an organization may not be representative of the actual behavior of the organization. Everyday decision making and behavior of members of the organization may conflict with the formal goals. (p. 517)

  • Without measuring the stated goals of the organizations within the system, we risk two things: (a) not learning key pieces of information about the desired outcomes of organizations and (b) producing an evaluation that will be ignored by the organizations we seek to aid in development. (p. 517)

2. Does the approach outlined in this article apply to internal evaluation, external evaluation, or both?

3. How does the author define evaluability assessment?
  • An evaluability assessment is a program analysis tool that helps the evaluator learn about the program in practice in addition to the formal, theoretical program or the program "on paper." This tool is used as a means to identify the stated goals of a program. (p 519)

  • Other definitions are,
    • a study to determine whether or not a program or project can be evaluated
    • a systematic process used to determine the feasibility of a program evaluation. It also helps determine whether conducting a program evaluation will provide useful information that will help improve the management of a program and its overall performance. (link)

4. Who made decisions on definitions of success in the case cited? (p. 519/520)
The process was
  1. Implement an evaluability assessment to help identify the stated goals of a program (admin/staff were interviewed and asked to specify the most important goals of their program and what factors they would use to define a successful youth at their program).
  2. Review of the goals revealed that an overlap existed in definitions of success across agencies. The selection of outcome measures was based on the most frequently cited definitions of success.
  3. A pilot test of potential outcome measures, as well as a survey of program staff, was conducted to determine which measures worked best to collect the information desired and to gather program staff opinions regarding the measures. Measures included in the pilot test were those that measured the concepts desired with sufficient validity and reliability as tested by other studies.
  4. Final selection of measures was based on program staff members' sense of their face validity, how they seemed to work with youths, and their ability to gather the information desired in a concise manner. The measures finally selected were those that would fit with how programs felt they should be evaluated. That is, outcome measures were not imposed on programs without their input but, rather, were selected based on what the programs said that they actually tried to achieve. In this way, program evaluations could be conducted that would report information that the programs would find useful for program development and policy making in the larger system of prevention services in Philadelphia.

5. How are definitions of success used by evaluators? (p. 517)
  • Examining programs' definitions of success, over time, facilitates program development and policy making.

  • Organizations are dynamic. Continual review of the goals of an organization is necessary to remain "in the know" and to respond to changes in the environment that affect the organization. Adaptation of the evaluation to the changing needs and context of the organization will optimize the proposed reforms and produce information for the organization to make informed choices about change.

  • The evaluator should try to be aware of what each organization states it is actually trying to do and why it is trying to achieve these goals. Without this knowledge it is likely that the suggested reforms would not be viewed as viable or appropriate by the organizations involved, and it is unlikely that change will occur. Each organization acts as a key stakeholder in the system. These stakeholders are responsible for the implementation of changes within the system and within their organization. If the evaluator ignores the stated goals of each organization within the system, the result will likely be an evaluation of little value to these organizations, an evaluation that will be ignored.

6. How are definitions of success used to assist the organization and its programs?
  • The inclusion of program goals with each program outcomes report informs the reader of what the program is trying to do and how well it is doing it, and delineates the program's top goals (which may not otherwise be apparent). (p. 528)

  • Also, periodic review of providers' definitions of success renews confidence that the outcome reports produced contain the information about which programs are most concerned. (p. 529)

7. Is the author's view of program development aligned with Owen's view of program development?
  • I see a connection with Owen's Monitoring Form, which provides insight into how programs can be fine-tuned and justify their existence.
  • Participant-oriented evaluations involve working cooperatively with stakeholders to gain evidence and empirically based knowledge to enhance decision-making and the effectiveness of organizations.
  • Anyone else have other ideas??? The phrase "program development" throws me off, as Owen's book is focused on program evaluation, not development.


===Unit 5 Overview: Program Evaluation and Distance Education

=

Achtemeier, S. D., Morris, L. V., & Finnegan, C. L. (2003). Considerations for developing evaluations of online courses. Journal of Asynchronous Learning Networks, 7(1). Available at http://www.aln.org/publications/jaln/v7n1/pdf/v7n1_achtemeier.pdf
Cyrs, T. E. (2001). Evaluating distance learning programs and courses. Available at: http://www.zianet.com/edacyrs/tips/evaluate_dl.htm (June 2003).
Joint Committee on Standards for Educational Evaluation. (1994). The program evaluation standards: How to assess evaluations of educational programs (pp. 23-24, 63, 81-82, 125-126). Thousand Oaks, CA: Sage.
Lockee, B., Moore, M., & Burton, J. (2002). Measuring success: Evaluation strategies for distance education. EDUCAUSE Quarterly, 25(1). Available at: http://www.educause.edu/ir/library/pdf/eqm0213.pdf
Moore, M. G. (1999). Editorial: Monitoring and evaluation. American Journal of Distance Education, 13(2) p. 1-5.
Whalen, T., & Wright, D. (1999). Methodology for cost-benefit analysis of web-based tele-learning: Case study of the Bell Online Institute. American Journal of Distance Education, 13(1), p. 22-44.

Achtemeier, Morris & Finnegan, 2003, online
1. What do these authors believe is the focus of evaluation plans?
  • The focus of the evaluation plans should be the Seven Principles for Good Practice in Undergraduate Education, as they reflect good practice in teaching and learning.

2. The authors concentrate their efforts on the development of a course evaluation instrument. What do they base their instrument development on?
  • They based their instrument development on principles of effective teaching identified in the literature:
    • Seven Principles for Good Practice in Undergraduate Education
    • Principles of Effective Teaching in the Online Classroom
    • Tom C. Reeves' fourteen pedagogical dimensions (p. 6)
    • Alexander Astin’s input-environment-outputs (frequently designated as I-E-O) model (p. 6)
    • ...

3. What do student evaluations of courses usually measure?
  • amount of student-faculty contact that took place during the course (p. 8)
  • student’s time on task
  • questions pertaining to effective teaching in the online environment
    • course goals were clearly articulated
    • what degree the student was satisfied with the learning activities in the course
    • questions about having necessary skills and equipment (few)
    • format easy to use (few)
    • having sufficient instructions (very few)
  • questions pertaining to students’ perceptions of the appropriateness, reasonableness, and fairness of the course (p. 9)

4. The authors use what process in trying to define quality distance education?
  1. to investigate the definitions and principles of effective teaching and learning in undergraduate education, generally, and distance education, specifically;
  2. to perform a content analysis of instruments currently in use in the online environment using as a frame of reference the concepts and principles drawn from the literature; and
  3. to develop considerations for the design of evaluation instruments in the online environment.

5. Is the authors' process, as described, a sound process?
  • It seems to be a sound process, as they attempt to create 'benchmarks' for the design of evaluation instruments that can be used to create valuable feedback loops for course and teaching improvement.

6. List the seven principles for good practice in undergraduate distance education. (p. 3)
  1. Encourage student-faculty contact,
  2. Encourage cooperation among students,
  3. Encourage active learning,
  4. Give prompt feedback,
  5. Emphasize time on task,
  6. Communicate high expectations, and
  7. Respect diverse talents and ways of learning.

7. Is student evaluation of courses or programs considered to be program evaluation? (p. 6)
  • It is only meant to be a part of a multiple-methods assessment and evaluation process.

8. The researchers found a "great disjuncture." What was the great disjuncture between? (p. 11)
  • This research found a great disjuncture between the guidelines suggested for effective teaching and learning and the principles that were evaluated by the end-of-course evaluation instruments. The absence of questions dealing specifically with the online environment suggests that many instruments used in the evaluation of online instruction were likely taken from traditional course settings and applied directly to evaluate computer-mediated instruction.


Cyrs, 2001, online

1. According to Cyrs, what is the purpose of evaluating distance education programs and courses?
  • Who are the stakeholders that need to know the outcomes of a program or course?
  • What needs to be known? What is the purpose of the evaluative data?
  • Why do these stakeholders need to know?
  • When do the stakeholders need to know, during and/or after the completion of a distance learning program or course?
  • How should the data be presented?
  • Is the evaluation design empirical or anecdotal?
  • How often do the stakeholders need the data?
  • How will the data be used?

2. What is Cyrs' stance on the formative/summative debate in evaluation?
  • Formative evaluation is conducted formally and informally throughout a course/program to provide corrective feedback to the stakeholders that need the data. This can be accomplished through scored tests and quizzes, self-tests that are not scored, and one-minute evaluations given at the end of a class. The latter asks one pertinent question such as “What was the most important thing that you learned in class today?”

  • Summative evaluation takes place at the end of a course or program. These data are used to re-design a course or program. This type of evaluation includes attitudes towards the course/program as well as learning outcomes. In addition, summative evaluation would also include administration of the program/course. Sample summative evaluation questions could include open ended constructed response questions such as:
    • Identify the strengths and weaknesses of the course/program.
    • Would you recommend this course/program to your colleagues or other students?
    • What would you do differently?
    • What would you add or eliminate?
    • How relevant and useful was the content?
    • What are some of the benefits that you gained during the course/program?

3. Would you label Cyrs' stance on evaluation to be traditionalist or modern? Why?
  • On the continuum from a traditionalist to a modern stance on evaluation, I would place Cyrs more towards the modern stance. Cyrs delineates the reasons instructors choose to present a course on the WWW. I have bolded the reasons that demonstrate recent shifts in pedagogy/andragogy.
  1. The student has access to the WWW on demand. The student will explore web resources independently or as guided learning with specific guidance from the instructor.
  2. Web-based instruction functions best in a constructionist environment.
  3. The instructor is not the sole source of information and therefore becomes a guide on the side-a facilitator rather than disseminator of information.
  4. The amount of time that a student has to learn something is variable unless for some reason it is specified.
  5. Students must accept responsibility for their own learning.
  6. Learning does not take place in a fixed location. It takes place at home, at work, in a library, as well as in a classroom.
  7. The primary content resource shifts from a single text and teacher to a variety of information resources, multimedia as well as print, available worldwide.
  8. Content is presented using a hypertext format with links to further levels of detail and elaboration that is under the control of the student.
  9. Student identified resources (URLs) must be evaluated as to validity and reliability.

4. What are the www criteria suggested by Cyrs?
  1. The class size was appropriate.
  2. There was a reasonable balance between real-time and delayed-time classes.
  3. The course was totally asynchronous.
  4. The instructor used:
    1. listservs
    2. bulletin boards
    3. chat rooms
    4. audio conferencing
    5. e-mail
    6. voice-mail
  5. Student e-mail messages were answered within 24 hours.
  6. Navigation through the course was easy.
  7. Students were able to communicate with each other.
  8. Students received adequate feedback on assignments and projects.
  9. Students worked primarily in teams.
  10. The navigation icons were consistent through all of the web pages.
  11. Graphics were effective.
  12. Graphics and pictures were easily downloaded.
  13. All computer conferencing dialogue was available to all students at any time.
  14. The instructor provided some useful URLs.
  15. There was good discussion among teams during project work.
  16. The course syllabus was clear and directive.
  17. There was adequate real-time interaction with the instructor.
  18. Students were taught how to identify, access, and evaluate URLs.
  19. WWW contextual assumptions

Joint Committee on Standards, 1994, p. 22-23, 63, 81-82, 125-126

1. The Joint Committee has developed professional evaluation standards for what purpose?
  • The goal of The Program Evaluation Standards is the development of evaluation standards to help ensure useful, feasible, ethical, and sound evaluation of educational programs, projects, and materials. Taken as a set, the thirty standards provide a working philosophy for evaluation. They define the Joint Committee's conception of the principles that can guide and govern program evaluation efforts. They are intended both for users of evaluation and for evaluators. (link to doc)

2. What are the four categories of Joint Committee Standards?
  • The four attributes of sound program evaluation are:
    1. The utility standards are intended to ensure that an evaluation will serve the information needs of intended users.
    2. The feasibility standards are intended to ensure that an evaluation will be realistic, prudent, diplomatic, and frugal.
    3. The propriety standards are intended to ensure that an evaluation will be conducted legally, ethically, and with due regard for the welfare of those involved in the evaluation, as well as those affected by its results.
    4. The accuracy standards are intended to ensure that an evaluation will reveal and convey technically adequate information about the features that determine worth or merit of the program being evaluated.

3. How might an evaluator make use of these standards in his/her professional practice? (link to same doc above)
  • These standards guide the decisions, employment, and assessment of evaluations.
  • Stufflebeam (1992) describes sets of standards as noteworthy because they provide:
  • an operational definition of … program evaluation;
  • evidence about the extent of agreement concerning the meaning and appropriate methods of educational evaluation;
  • general principles for dealing with a variety of evaluation problems;
  • practical guidelines for planning evaluation;
  • widely accepted criteria for judging evaluation plans and reports;
  • conceptual frameworks by which to study evaluation;
  • evidence of progress ... to professionalize evaluation;
  • content for evaluation training;
  • descriptions of “best evaluation practices”
    Fournier (1994) recommends that “The Program Evaluation STANDARDS is a ‘must have’ book for anyone responsible for reviewing evaluation proposals, planning and conducting evaluations, managing evaluation projects, or judging the merit and worth of evaluations once completed. For experienced practitioners, it provides a set of values and principles by which to guide successful practice… For newcomers and the less experienced who may be responsible for commissioning and using evaluations, the STANDARDS supply a useful framework for generating a list of questions to raise about any evaluation plan… an invaluable ‘how to’ resource for graduate students venturing out into the field, and it instills a sense of what it means to be a responsible evaluator…”

    Patton (1994), states that, “Certainly no contemporary student of evaluation should come through a training program without studying the STANDARDS.” (p. 195) “There can be no question that these are the evaluation profession’s definitive statement of STANDARDS. I use knowledge of the STANDARDS as an indicator for knowledge of evaluation… the STANDARDS represent much more than a set of professional guidelines; they constitute a philosophy of evaluation that emphasizes and values utility, feasibility, propriety, and accuracy.” (p. 198)


4. What does the Joint Committee mean by meta-evaluation?
  • Meta-evaluation is the process of delineating, obtaining, and applying descriptive information and judgmental information about an evaluation’s utility, feasibility, propriety, and accuracy and its systematic nature, competence, integrity/honesty, respectfulness, and social responsibility to guide the evaluation and publicly report its strengths and weaknesses.
  • Formative meta-evaluations—employed in undertaking and conducting evaluations—assist evaluators to plan, conduct, improve, interpret, and report their evaluation studies.
  • Summative meta-evaluations—conducted following an evaluation—help audiences see an evaluation’s strengths and weaknesses, and judge its merit and worth.
  • Meta-evaluations are in public, professional, and institutional interests to assure that evaluations provide sound findings and conclusions; that evaluation practices continue to improve; and that institutions administer efficient, effective evaluation systems. (link to an article abstract)



Lockee, Moore & Burton, 2002, online

1. Which four groups of issues do the authors include in evaluating distance education?
  • A few of the factors to consider are instructional, technological, implementation, and organizational issues.

  • Formative evaluation issues: these largely fall into two primary categories (p. 22)
  • Instructional design issues (such as teaching strategy choices and assessment methods). Evaluators seek answers to the primary question of learning effectiveness. If these questions can be addressed within the formative evaluation stage, then corrective measures can produce more effective learning experiences for distance students
    • Did students learn what the goals and objectives intended? If not, why?
    • Was the instruction well written?
    • Were the objectives clearly stated and measurable?
    • Were appropriate instructional strategies chosen?
    • Was there enough practice and feedback? Were examples provided?
    • Did assessment methods correlate with instructional content and approaches?
  • Interface design issues (Web site navigation, aesthetics, and so forth). In evaluating the interface design of a Web-based course, a few simple questions can provide insight into the strengths and weaknesses of a site’s look and feel.
    • Was the Web site easy to navigate?
    • Was it aesthetically pleasing, as well as legible?
    • Did each page in the site download easily?
    • If special plug-ins were needed, were links provided to acquire them?
    • Also, consideration of learners with special needs should be addressed at this stage.
      • If graphics or images were used, were alternative ways provided for sight-impaired learners to get the intended information?
      • If course information was presented using audio, could hearing-impaired learners access transcriptions?
      • Was information clearly available to learners with disabilities on where to get assistance if needed?

Summative evaluation issues. (But remember that if a program is ongoing, summative evaluation can be used as formative evaluation for course/program improvement. Also, dividing evaluation into the polarized categories of formative and summative is not a true reflection of reality.)
  • Organizational issues.
    • (One example: use of faculty time.) Many critics have raised the issue of the time it takes to deliver a course online, given the increased amount of direct communication with students, plus the frequent increase in student numbers. Good questions to ask relate to the balance of workload and efficient use of time by faculty who teach at a distance. (p. 24)
      • Is time spent on DE courses significantly detracting from research and scholarship?
      • Are DE faculty designing efficient strategies to implement their courses?
    • An example of an organizational issue from the article: It became clear to us upon review of initial assignments in the first online course that no mechanism existed to receive and manage student work. The university established a filebox server system for all students that we leveraged for our distance learners. Dealing with this issue during the formative stage of evaluation led to a workable solution before it became a crisis. (p. 25)
  • Implementation issues: the process of DE has a variety of stakeholders, from students to faculty to support personnel to the host institution itself. (p. 25)
    • Some implementation concerns are shared by all stakeholders, such as the reliability of the delivery technology and the accessibility and effectiveness of the student support services.
    • Other concerns are specific to individual stakeholders.
      • For example, distance learners must understand the distance environment and be prepared to engage in self-directed learning.
      • Also, distance learners should clearly understand faculty expectations and know who to contact for technological and instructional needs.
      • Regarding faculty concerns, their preparedness to teach in distance settings is important, while accessibility to appropriate professional development activities is essential. These factors, as well as incentives and rewards for teaching at a distance, are very real issues that faculty face, hence worthy of evaluation.
      • Finally, educational providers, such as institutions of higher education, are concerned with quality assurance. Are our distance courses and programs of strong quality and rigor? Do they meet our professional accreditation criteria? These questions can also be answered within a summative evaluation effort.

2. How do the authors view these four sets of issues? (p. 21)
  • While these factors can be isolated and itemized, by no means are they independent of each other. As in any system, the separate components must work together effectively so that the whole DE system can operate holistically. When DE delivery technologies break down, distance learners cannot engage in the planned instructional event. Without institutional policies that provide for online support services, distance learners can find it difficult or impossible to get assistance with matters necessary for their basic participation in a higher education program. Thus, a comprehensive review of DE efforts must not only scrutinize the individual system components, but also attempt to get a clear picture of how the parts work together as a whole to create positive outcomes (learning, satisfaction, matriculation, and so on).

3. What do the authors mean by incremental analysis? (p. 21)


4. How do these authors view formative and summative evaluation?
  • Formative evaluation serves to improve products, programs, and learning activities by providing information during planning and development. Data collected during the design and development process provides information to the designers and developers about what works and what doesn’t, early enough to improve the system while it remains malleable.

  • Summative evaluation determines if the products, programs, and learning activities, usually in the aggregate, worked in terms of the need addressed or system goal. Simply, formative and summative evaluations differ in terms of the audience for the information collected, the time in the development cycle when the information is collected, and the intention behind the data collection. Summative evaluation is information provided to audiences external to the design and development team (such as funding agencies, clients, or accreditation agencies) about how the entire package works in a real setting. Although this information might be used to suggest changes, additions, segmentations, and such, it’s more likely that the information will be used to make fiscal and policy decisions to use, or continue funding, a learning system.

5. Do the authors have anything in common with Owen, in terms of the program aspect of evaluation? If so, what?



Moore, 1999, p. 1-5

1. Why does Moore believe that monitoring is required in distance education? (p. 1)
  • A good monitoring system tells administrators what problems instructors and students are experiencing and indicates if delays or breakdowns occur in the communication systems, while there is still enough time to take remedial action.

2. Is Moore's idea of monitoring the same as Owen's idea of monitoring evaluation?


3. What is the one generalization that Moore suggests holds true for any distance education program?
  • One of the few generalizations that can be made about any distance education program, whatever the communications media used and the content level, is that a good monitoring and evaluation system is likely to lead to a successful program, and a poor system is almost certain to lead to failure.

4. List the three elements of a program that are essential for effective monitoring, according to Moore.
  • The first is the preliminary specification of good learning objectives. (p. 1)
  • The second key to successful monitoring and evaluation is the construction and, later, the handling of the products submitted by students or trainees as evidence of learning, commonly known as assignments. (p. 2)
  • The third key to good monitoring and evaluation is a good data-gathering and reporting system. (p. 3)


Whalen & Wright, 1999, p 22-44

1. What are the two common measures incorporated in cost-benefit analysis?
  • Two common measures are:
    • the break-even point, the point at which costs are recovered, and
    • the return on investment (ROI), which illustrates the economic gain or loss from having undertaken a project.

2. Define fixed costs and variable costs. (link to a glossary)

From Wikipedia:

  • fixed costs: costs that are incurred by a business whether or not it is operating to generate income, and that do not necessarily increase or decrease as the total volume of production increases or decreases. Rent, for example, must be paid whether or not any business is accomplished.
    • For example, fixed costs for videoconferencing include videoconferencing equipment, technicians' salaries for running the equipment, installation costs, and fees for basic telephone lines.
  • variable costs: the costs, in addition to fixed costs, of running a business that vary depending on the level of demand and activity.
    • Variable costs for videoconferencing delivery of distance courses included fees for long distance network usage, shipping charges for supplementary print materials, honoraria for professors, and salaries paid for the preparation of course materials.
    Thus, variable costs increase with the number of students, while fixed costs are incurred before a course is even offered.

[Figure: break-even number of students, from p. 26 of Whalen & Wright]
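
A minimal sketch (not from Whalen & Wright) of the break-even idea behind the figure above: with made-up fixed and variable costs for classroom and web delivery, find the enrolment at which web-based delivery's higher fixed (development) costs are recovered through its lower per-student variable costs. All dollar figures and the `total_cost` helper are illustrative assumptions only.

```python
# Illustrative sketch only: the cost figures are invented, not taken from
# Whalen & Wright (1999). The break-even number of students is the enrolment
# at which web-based delivery becomes cheaper than classroom delivery.

def total_cost(fixed: float, variable_per_student: float, students: int) -> float:
    """Total delivery cost = fixed costs + per-student variable cost x enrolment."""
    return fixed + variable_per_student * students

# Hypothetical costs (dollars)
classroom_fixed, classroom_variable = 20_000, 1_500   # low development cost, high per-student cost
web_fixed, web_variable = 120_000, 400                # high development cost, low per-student cost

# Smallest enrolment at which the web-based course is no longer the more expensive option.
students = 1
while total_cost(web_fixed, web_variable, students) > total_cost(classroom_fixed, classroom_variable, students):
    students += 1

print(f"Break-even at roughly {students} students")
# With these made-up numbers: 120,000 + 400n <= 20,000 + 1,500n  =>  n >= 100,000 / 1,100 ≈ 91
```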

  • Examples of costs for traditional classroom delivery include
    • instructor's salary and benefits; number of courses taught by the instructor; costs of course development, course materials, administrative support, classroom overhead, and any additional time the instructor spent on the course for activities such as grading and meeting with students
    • development time, classroom overhead costs, instructor costs, and travel costs for the participants
    • high travel costs for students and length of time spent away from the job
  • Examples of costs for web-based courses include
    • development costs were high for a CD-ROM course, totaling $1,205,394 over three years.
    • equipment costs and course development costs
    • License Fees for Learning Platform Software
    • Hardware
    • Content Development. Content development variables include: instructional design; multimedia design; the production of text, audio, motion video, graphics, and photos in machine-readable format; course authoring; software development; integration of content and testing; modification/adjustment; training; course testing; and motion video elements.
    • Developer Salaries.
    • Note an example: the synchronous course required far less development time, primarily due to less use of multimedia.
    • Because employers must pay employees for the time they spend in training, student salary costs are a significant factor in the costing analysis. More time spent in course delivery translates into higher student salary costs and less cost savings.

3. In web-based education and training, what does the term capital costs refer to? (p. 28-29)
  1. Capital costs include the server platform shared by all courses mounted on that server as well as the cost of the content development shared by all students taking that course. Operating costs represent the costs for the time students and instructors spend using the courses.
  2. Content development includes six items:
    1. instructional and multimedia design;
    2. the production of text, audio, video, graphics, and photographs;
    3. the development of authoring and delivery software, or the cost of licensing commercial software;
    4. the integration, modification, and testing of course content;
    5. student and instructor training; and
    6. course testing.

4. In comparing web-based education/training and conventional education, which has the higher fixed cost? (p. 42)
  • Web-based training has higher fixed costs than classroom based training; however, these higher course development costs are offset by lower variable costs in course delivery. This is primarily due to the reduction in course delivery time (course compression) and the potential to deliver courses to a larger number of students than is possible in a traditional classroom without incurring significant incremental costs. Realizing savings for Web-based courses requires a sufficient number of students in order to recover course development costs. Since employees must be paid for time they spend taking a course, student salaries are an important consideration in this costing study.

5. In comparing web-based education/training and conventional education, which has the higher variable cost?
  • Conventional education has higher variable costs; see above.

6. Is web-based training cost-effective, according to the authors?
  • Web-based training is cost-effective if the return on investment (ROI) is greater than 1 (i.e., 100%).
  • The return on investment (ROI) is the percentage that represents the net gain or loss of using Web-based training instead of classroom delivery.
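
A minimal worked sketch of the ROI test in the note above, reading ROI as a benefit-to-cost ratio (classroom delivery cost avoided per dollar spent on web-based delivery), so that a value above 1 (100%) indicates cost-effectiveness. Whalen & Wright's exact formula may differ, and the `roi` helper and dollar amounts below are invented for illustration.

```python
# Illustrative only: a generic benefit-to-cost reading of ROI, consistent with
# the "greater than 1 (100%)" threshold noted above. Figures are invented.

def roi(classroom_cost: float, web_cost: float) -> float:
    """Classroom delivery cost avoided per dollar spent on web-based delivery."""
    return classroom_cost / web_cost

# Hypothetical total costs for delivering one course to the same cohort
classroom_total = 300_000   # instructor time, travel, classroom overhead, student salaries
web_total = 180_000         # course development, servers, licences, reduced delivery time

ratio = roi(classroom_total, web_total)
print(f"ROI = {ratio:.0%}")                                   # 167%
print("Cost-effective" if ratio > 1 else "Not cost-effective")
```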