
The Postpositivist Paradigm and the Methods Branch

The Postpositivist Paradigm

Abstract

Weiss (1998) asserted that the traditional and still dominant role conceptualization of evaluators is methods-based and representative of a neutral, detached social scientist: “The traditional role of the evaluator has been one of detached objective inquiry. . . . She puts her trust in methodology” (p. 98). . . . Mark (2002) cited Campbell’s (1969) view of an evaluator’s role as a “technical servant” to an experimenting society as an example of the traditional methods-based understanding of an evaluator’s role. (Skolits et al., 2009, p. 277)

The Methods Branch reflects the roots of the evaluation field in applied social research involving the use of rigorous methods of inquiry, largely based in the assumptions of the positivist and postpositivist paradigms. The origins of the positivist paradigm can be traced back to the writings of Sir Francis Bacon (1561-1626), in which he articulated the principles of the scientific method (Turner, 2001). In the 1800s, Comte and Spencer contributed to the development of the positivist paradigm in the social sciences, seeing it as a means of improving society by applying scientific methods to discover laws about human behavior. Under this philosophical banner, social research was viewed as the search for general laws of human organization through the conduct of empirical observations.
In the 1950s in the United States, positivism became associated with quantitative research, measurement, and statistical analysis as a way of testing hypotheses about the general laws applied to human behavior. Positivists hold the ontological belief that one reality exists and that it is independent of the observer (Fielding, 2009). Their epistemological belief is that distance from the object of study contributes to reducing bias in the research. The positivist paradigm’s methodological belief is associated with an approach that prioritizes the use of “true experiments,” which require random selection of subjects and random assignment to interventions; these conditions can be very difficult to satisfy in the world of social research and evaluation.
Campbell (1991) envisioned the role of researchers in terms of an “experimenting society” that would make use of social science research methods to test theories to improve society. He offered a way for researchers to adapt the principles of positivism by the development of quasi-experimental methods (i.e., designs sharing many characteristics with experimental designs, but adapted for use with human populations) (Shadish & Cook, 1998). This topic is discussed at great length in Chapter 9. Hence the research and evaluation worlds began to operate more under the belief systems of the postpositivist paradigm than of the positivist paradigm. Postpositivists still hold to the methodological belief in quantitative approaches; however, they have reframed their ontological view of reality to take into account the complexity of human behavior. The ontological perspective adheres to a belief in one reality; postpositivists have added the notion that reality can be known within a certain level of probability. Distance from the object of study continues to be a hallmark of the epistemological belief system in postpositivism. Researchers strive to be “objective” by limiting their contact or involvement with people in the study. Campbell did not view experimental and quasi-experimental approaches as the only possible methods for conducting social research and evaluation; however, he did hold that true experiments are superior to other approaches because of their potential to control for bias.
Postpositivism is a major paradigm that guides many evaluators in their work. The axiological assumption of this paradigm is intertwined with the methodological assumption, in that the conduct of “good research” is a fundamental requirement for ethical conduct. Good research is described as that which reflects “intellectual honesty, the suppression of personal bias, [and] careful collection of empirical studies” (Jennings & Callahan, cited in Christians, 2005, p. 159).
The axiological assumption of the postpositivist paradigm is closely aligned with the ethical principles articulated by the National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research (1979) in its Belmont Report (see Sieber, 1992, pp. 18-19). This report identifies three ethical principles and six norms that should guide scientific research:

Ethical Principles

  1. Beneficence: Maximizing good outcomes for science, humanity, and the individual research participants and minimizing or avoiding unnecessary risk, harm, or wrong.
  2. Respect: Treating people with respect and courtesy, including those who are not autonomous (e.g., small children, people who are intellectually challenged or senile).
  3. Justice: Ensuring that those who bear the risk in the research are the ones who benefit from it; ensuring that the procedures are reasonable, nonexploitative, carefully considered, and fairly administered.

Norms of Scientific Research

  1. A valid research design must be used. Faulty research is not useful to anyone and not only is a waste of time and money, but also cannot be described as ethical, in that it does not contribute to the well-being of participants.
  2. The researcher must be competent to conduct the research.
  3. Consequences of the research must be identified: Procedures must respect privacy, ensure confidentiality, maximize benefits, and minimize risks.
  4. The sample selection must be appropriate for the purposes of the study, representative of the population to benefit from the study, and sufficient in number.
  5. The participants must agree to participate in the study through voluntary informed consent; that is, without threat or undue inducement (voluntary), knowing what a reasonable person in the same situation would want to know before giving consent (informed), and explicitly agreeing to participate (consent).
  6. The researcher must inform the participants whether harm will be compensated.
These principles and norms apply to all evaluators, no matter what their philosophical or theoretical beliefs. However, evaluators who hold different paradigmatic beliefs typically interpret these principles and norms differently, as will be seen in subsequent chapters.

EXTENDING YOUR THINKING

Ethical Principles and Norms

  1. Goode (1996) created personal ads in a newspaper in order to gather data to learn more about courtship through personal advertisements. He did not respond to any of the men and women who answered his ads, and their identities remained anonymous. Do you think that this kind of research follows ethical principles and norms of scientific inquiry as described above? Explain.
  2. Locate and review one website or article about unethical practices in evaluation. Why are ethics needed in the world of evaluation?
Situated in the postpositivist paradigm, Mark and Gamble (2009) suggest that the methodological choice of a randomized experimental design is ethically justified when the purpose of the study is to establish a cause-and-effect relationship and there is uncertainty about the effect of a particular intervention, because this design provides greater value in terms of demonstrating the effects of a treatment than do other approaches. According to Mark and Gamble, “a case can be made that good ethics justifies the use of research methods that will give the best answer about program effectiveness as this may increase the likelihood of good outcomes especially for those initially disadvantaged” (p. 205).

EXTENDING YOUR THINKING

Philosophical Assumptions of the Postpositivist Paradigm and the Methods Branch

Using the following table, answer these questions:
  1. Can you imagine what a postpositivist evaluation would look like?
  2. How would the evaluator set up the evaluation?
  3. Would the evaluator be involved with the stakeholders or not?
  4. How would the evaluator’s assumptions guide his/her decisions?

The Postpositivist Paradigm and the Methods Branch

Description: Focuses primarily on quantitative designs and data.

| Philosophical assumption | Guiding question | As experienced in life |
| :--- | :--- | :--- |
| Axiology | What is the nature of ethics? | - Respect <br> - Justice <br> - Beneficence |
| Ontology | What is the nature of reality? | One reality knowable within a certain level of probability |
| Epistemology | What is the nature of knowledge, and what is the relationship between the knower and that which would be known? | - Distant <br> - Objective |
| Methodology | What are the systematic approaches to gathering information about what would be known? | - Scientific method <br> - Hypothesis <br> - Quantitative methods or quantitatively dominant mixed methods |

Methods Branch Theorists

Theorists of the Methods Branch have had, and continue to have, a great deal of influence on what is considered to be good evaluation. This brief history provides you with the necessary building blocks to proceed with the planning of evaluations that are reflective of lessons learned by eminent evaluators. Alkin (2013) identifies the following evaluation theorists in the Methods Branch: Tyler, Campbell, Cook, Shadish, Cronbach, Rossi, Chen, Henry, Mark, and Boruch.¹ We add Donaldson as a contributor to the Methods Branch, especially as it is envisioned in a theory-based evaluation. And we add Kirkpatrick as another contributor to this branch, because of his model of the evaluation of training. These theoretical positions are explained briefly and used to build a composite picture of current views of “methods” as a theoretical basis for evaluation practice. Advantages and challenges associated with approaches that reflect this branch are discussed. Examples of studies that reflect the Methods Branch illustrate how these theoretical perspectives are manifested in practice. Specific emphasis is placed on those aspects of these approaches that lead to designs addressing issues of causality.

Methods Branch Theorists

Ralph Tyler, Donald Campbell, Thomas Cook, William Shadish, Robert Boruch, Peter Rossi, Gary Henry, Mel Mark, Huey-Tsyh Chen, Stewart Donaldson, Donald Kirkpatrick, Robert Brinkerhoff

Early Theorists and Theories

Ralph Tyler used the term “educational evaluation” back in the 1930s, making him one of the earliest scholars in this field (Stufflebeam & Shinkfield, 2007). His approach to evaluation consisted primarily of establishing educational objectives and then determining whether those objectives had been met. An evaluator met with educators to determine broad goals and the desired student behaviors that the teachers hoped to see following instruction (the “objectives,” now more commonly known as “student outcomes”). Then the educators were supposed to design the curriculum to teach what was needed in order to achieve the objectives. The evaluator gave advice on the development of measures to determine whether the objectives were achieved. The results of the assessment were compared with the desired results to reach a judgment about the effectiveness of the instruction. Tyler is perhaps best known for his evaluation study known as the Eight-Year Study, which involved evaluation of the effectiveness of educational initiatives across the nation (Smith & Tyler, 1942). Although he did not employ experimental and control groups, he did posit that establishment of clear objectives and rigorous measurement of outcomes were key components of educational evaluation.
Evaluators from the social sciences contributed to theories of evaluation that were connected to the use of experiments. For example, Donald Campbell’s work in the early years of evaluation discussed the use of experimental and quasi-experimental designs and their role in controlling for extraneous variables in determining causal relationships. He did identify himself with the experimental approach and quantification, but at the same time realized that other approaches could contribute to increased understanding of program effects (Campbell, 1991; Shadish & Cook, 1998). Thomas Cook, William Shadish, and Robert Boruch extended the evaluation community’s understanding of the Methods Branch through their support for the use of experiments and quasi-experiments in evaluation studies.
Peter Rossi contributed his thinking to evaluation from the perspective of evaluation research in the form of survey methods and social science experiments. He began writing about the connection between evaluation and social policy in the early 1970s. He continued making contributions along this line through his multiple books, the last of which was published in 2004 with coauthors Mark Lipsey and Howard Freeman (Rossi et al., 2004). The strength of his contribution is in the explanation of how randomized control designs and quasi-experimental designs could be used in evaluation, and how evaluation findings could be tied to national policy in education and human services. Gary Henry and Melvin Mark extended theories of evaluation in the context of the use of causal modeling for evaluation, with particular attention to ethical issues (Mark & Gamble, 2009; Mark & Henry, 2006). They also presented extensions of a theory of evaluation that Pawson and Tilley (1997) discussed, known as “emergent realist evaluation” (ERE) theory (Henry, Julnes, & Mark, 1998). For Henry et al., ERE is a theoretical position based on the philosophy of neorealism, which holds that a reality exists independently of the observer and that regularities in the patterns of events can be explained by generative mechanisms (e.g., we can observe a tree falling and infer that gravity is a generative mechanism that pulls the tree to earth). ERE reflects earlier Methods Branch theorists’ views in these terms, as well as in the view that evaluation has a role to play in making sense of what is going on in the world.

Theory-Based Evaluation

“Theory-based evaluation” is an approach that focuses on the theories people have about what it takes to have a successful program. In simple terms, you could think about how people learn or how they change their behavior. What conditions need to be in place for that to happen?
Huey-Tsyh Chen (2005; Chen & Rossi, 1983) worked with Peter Rossi to develop the concept of theory-based evaluation as the logical extension of quantitative models that permit identification of variables contributing to the outcomes of a program. The “theory” part of this approach consists of the social science theories and stakeholders’ beliefs (theories) about what is necessary for a program to succeed. Lipsey (2007) describes Chen and Rossi’s early arguments as follows:
Each social program embodies a theory of sorts-an action theory that reflects the assumptions inherent in the program about the nature of the social problem it addresses and the way it expects to bring about change in that problem. Chen and Rossi argued that evaluators should bring that theory to the surface and, if necessary, draw on other sources to further differentiate it. (p. 200)
One approach to theory-based evaluation involves the use of sophisticated statistical analyses, such as path analysis and structural equation modeling (discussed in Chapter 12), to determine the significant contributions of theoretically derived variables to the outcomes. Interestingly, Chen (1994) saw theory-based evaluation as a move away from methods-driven evaluation. He argued that if evaluators started with a method (e.g., experimental design), then that would lead them to specific directives for how to conduct the evaluation. However, if the evaluators started with the theory of what was supposed to make the program work, then they would consider different methodological options. He recommends the use of both quantitative and qualitative methods in evaluation; however, for outcome evaluations he supports the use of randomized experimental designs in order to control threats to validity. Because he divorces himself from method as the determinant factor for evaluation decisions, his work provides a bridge to other branches in the evaluation tree.
Stewart Donaldson (2007) offers a change in labeling for evaluations that have program theory at their core. He defines “program-theory-driven evaluation science” as “the systematic use of substantive knowledge about the phenomena under investigation and scientific methods to improve, to produce knowledge [and feedback about, and to determine the] merit, worth, and significance of evaluands such as social, educational, health, community, and organizational programs” (Donaldson, 2007, p. 9). The rationale for this label is that evaluators use the program theory to define and prioritize evaluation questions. The evaluators build a program theory with stakeholders by reviewing documents and prior research, talking with stakeholders, and observing the program in operation. They then use scientific methods to answer the evaluation questions.

Evaluation of Training Programs

The “Kirkpatrick model” for evaluation of training programs dominated human resource development evaluations for many decades (Kirkpatrick, 1998). It essentially has four levels of evaluation: participant reactions, learning, behavior, and results. This model was extended to consider the financial return on investment (ROI) of training by Phillips (1997). Reaction evaluation is probably a familiar format for most of you who have participated in training programs. At the end of the program, your reactions are evaluated by means of a questionnaire that asks whether you found the training relevant, interesting, worthwhile, and appropriately conducted. Learning is evaluated in terms of the knowledge or skills gained or the changes in attitudes from the training. “Behavior changes” refer to changes in performance on the job or in a simulated situation. “Results” refer to the impact of the training on the organization in terms of its effectiveness in facilitating successful achievement of its mission. ROI measures how the results of the training affect the organization’s bottom line. A list of resources for evaluations of training programs is provided in Box 3.1.

Box 3.1. Resources for Evaluating Training Programs

A section of the Business Performance website updates the Kirkpatrick model (www.businessper-form.com/workplace-training/evaluating_training_effectiven.htmi).

The journal of the National Staff Development and Training Association (NSDTA), Training and Development in Human Services, published a special issue on training evaluation; its articles are available at NSDTA’s website if you scroll down to “The New Key to Success” (https://aphsa.org//SM/NSDTA/ Resources.aspx).

The American Society for Training and Development website (www.astd.org) is another valuable resource.

Russ-Eft and Preskill (2005) criticize the Kirkpatrick model and ROI because these are based on an assumption that if people like the training (i.e., have a positive reaction), their response will affect the bottom-line results. A model for the evaluation of training programs, developed by Russ-Eft and Preskill, focuses on training programs conducted in learning organizations and is discussed in Chapter 4 on the Use Branch of evaluation.
Brinkerhoff (2003) also developed an impact model for training evaluations called the “success case method,” which includes both quantitative and qualitative data. He recommends its use in contexts in which a full experimental study is not feasible. This method has six steps:
  1. Create a focus for the evaluation and develop a plan.
  2. Develop an impact model for the intervention that depicts how it will achieve its results (akin to a logic model).
  3. Conduct a survey with all participants to identify those who were successful and those who were not.
  4. Select a random sample from each group and interview these individuals to get their stories.
  5. Prepare a report with the findings, conclusions, and recommendations; these sometimes take the form of “success stories.”
  6. Report on ROI in terms of the benefit to the company; divide the benefit (a performance measure) by the cost of the training to obtain the ratio of ROI.
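
Steps 3, 4, and 6 of the success case method can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the survey scores, the success threshold, the per-group sample size, and the dollar figures are all invented, and the function names `success_case_sample` and `roi_ratio` are hypothetical rather than part of Brinkerhoff's method.

```python
import random


def success_case_sample(survey, threshold, per_group, seed=0):
    """Steps 3-4: split respondents by a success threshold, then draw a
    simple random sample from each group to interview."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    successful = [name for name, score in survey.items() if score >= threshold]
    unsuccessful = [name for name, score in survey.items() if score < threshold]
    return {
        "successful": rng.sample(successful, min(per_group, len(successful))),
        "unsuccessful": rng.sample(unsuccessful, min(per_group, len(unsuccessful))),
    }


def roi_ratio(benefit, training_cost):
    """Step 6: divide the benefit (a performance measure) by the training cost."""
    if training_cost <= 0:
        raise ValueError("training_cost must be positive")
    return benefit / training_cost


# Hypothetical survey results: participant -> self-reported success score (0-10).
survey = {"Ana": 9, "Ben": 3, "Cruz": 8, "Dee": 2, "Eli": 7, "Fay": 4}

interviews = success_case_sample(survey, threshold=6, per_group=2)
ratio = roi_ratio(benefit=150_000, training_cost=50_000)

print(interviews)
print(ratio)
```

With these invented figures the ratio is 3.0, meaning the measured benefit is three times the training cost. In a real evaluation the threshold and the benefit measure would come from the impact model developed in step 2.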

Theory to Practice

This section of the chapter is divided into three parts. The first part considers practice based on the work of Methods Branch theorists who prioritize the use of experimental and quasi-experimental designs (e.g., Brady & O’Regan, 2009; Duwe & Kerschner, 2008). The second part examines practice based on the work of Methods Branch theorists who prioritize theory-based evaluation approaches (e.g., Fredericks et al., 2008). The third part provides examples of the evaluation of training programs via methods-based approaches (e.g., Busch et al., 2005). Here is a “map” of this section, showing that we begin by covering experimental and quasi-experimental approaches to evaluation.

Theory to Practice: Methods Branch

  1. Experimental and quasi-experimental approaches
  2. Theory-based evaluation
  3. Evaluation of training programs

Experimental and Quasi-Experimental Approaches

Independent and Dependent Variables; Experimental and Control Groups

Experimental and quasi-experimental studies have an intervention designed to create change in knowledge, behavior, attitudes, aptitude, or some other construct. The independent variable is the program (or policy or process) that is implemented in hopes of seeing a change in knowledge, behavior, attitude, aptitude, or some other relevant construct (the dependent variable). Because there are many terms in this chapter that are used in a unique way in research and evaluation, we provide a list of the terms, definitions, and examples in Box 3.2. Some of the examples in this box are taken from a study that evaluated whether having students draw pictures of words that they were learning helped them remember the words better. The researchers (Wammes, Meade, & Fernandes, 2016) had an evaluation question: Does a program that encourages students to draw pictures of words help build stronger and more reliable memories of those words? They hypothesized that drawing words would improve memory of those words because a more cohesive memory trace would be created by the act of drawing that would integrate semantic, motor, and visual information.

Box 3.2. Definitions for Experimental and Quasi-Experimental Evaluations

| Term | Definition | Example |
| :--- | :--- | :--- |
| Quantitative research | Objective research that involves the collection of empirical numerical data in a systematic manner. | Compare two groups to measure which group remembers more words. |
| Random assignment | When conducting experiments, each participant or group has the same probability of being assigned to a particular condition of the experiment. | Two groups of students randomly assigned to two different groups. |
| Control group | The control group is not exposed to the variable under investigation. | This group wrote the word list out normally. |
| Experimental group | The experimental group is exposed to the variable under investigation. | This group drew the words instead of writing them. |
| Independent variable | The program (or policy or process) that is implemented in hopes of seeing a change in knowledge, behavior, attitude, aptitude, or some other relevant construct. The independent variable is manipulated by the researcher. | Rather than writing out a word list to remember, students were asked to draw the word. |
| Dependent variable | The dependent variable is the variable that is measured by the researcher to see if there is a change in knowledge, behavior, attitude, aptitude, or some other relevant construct. It is the variable that demonstrates the influence of the independent variable. | After the experimental group drew the words and the control group wrote the words, they tested who could remember more. |
| Quasi-experimental methods | The two groups studied in an experiment are nonequivalent and do not involve random assignment to the experimental and control groups. | If the researchers were short on time or lacked funds, maybe they would use Psychology 101 students for the control group and Psychology 102 students as the experimental group. |

Source: Examples are based on Wammes, Meade, and Fernandes (2016).
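The random assignment entry in Box 3.2 can be illustrated with a short sketch. The function name `randomly_assign`, the student labels, and the fixed seed are assumptions made for illustration; none of them come from Wammes, Meade, and Fernandes (2016).

```python
import random


def randomly_assign(participants, seed=None):
    """Split participants into an experimental and a control group at random.

    Each participant has the same chance as every other participant of
    being placed in a given condition, which is what random assignment means.
    """
    rng = random.Random(seed)      # local generator; a seed makes the split reproducible
    shuffled = list(participants)  # copy so the caller's list is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"experimental": shuffled[:half], "control": shuffled[half:]}


# Hypothetical roster of 20 students (labels are illustrative only).
students = [f"student_{i:02d}" for i in range(1, 21)]
groups = randomly_assign(students, seed=42)
print(len(groups["experimental"]), len(groups["control"]))
```

In a quasi-experimental version of the same study, the shuffle step would be absent: group membership would follow an existing grouping, such as course enrollment, so the two groups could differ before the intervention begins.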
One application of the experimental approach is embodied in the U.S. government’s What Works Clearinghouse (WWC; ies.ed.gov/ncee/wwc), which was established to identify effective programs in human service areas. The WWC is a product of the U.S. Department of Education’s Institute of Education Sciences. The institute has established standards for the review of methods used to indicate the effectiveness of programs funded by the Department of Education, including programs for reading, dropout prevention, early childhood education, elementary school math, English language learners, and middle school math. Each intervention is rated on the degree to which it meets the WWC standards as having either strong evidence (“meets evidence standards”), weaker evidence (“meets evidence standards with reservations”), or insufficient evidence (“does not meet evidence standards”). The standards are defined on the WWC website as follows:
Currently, only well-designed and well-implemented randomized controlled trials (RCTs) are considered strong evidence, while quasi-experimental designs (QEDs) with equating [i.e., the researchers compared the experimental and control groups to show their equivalence] may only meet standards with reservations. (What Works Clearinghouse, 2010, p. 11)
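The three rating levels and the quoted 2010 rule can be read as a simple decision procedure. The sketch below is a deliberate simplification: the function name `wwc_rating` and its parameters are hypothetical, and treating every case not named in the quotation as "does not meet evidence standards" is an assumption of this sketch, not a statement of the full WWC review protocol.

```python
def wwc_rating(design, well_implemented, groups_equated=False):
    """Map a study design to one of the three rating labels quoted in the text.

    design: "RCT" (randomized controlled trial) or "QED" (quasi-experimental design).
    well_implemented: whether the study was well designed and well implemented.
    groups_equated: for a QED, whether the groups were shown to be equivalent.
    """
    if design == "RCT" and well_implemented:
        return "meets evidence standards"
    if design == "QED" and groups_equated:
        return "meets evidence standards with reservations"
    # Assumption of this sketch: anything else is treated as insufficient evidence.
    return "does not meet evidence standards"


print(wwc_rating("RCT", well_implemented=True))
print(wwc_rating("QED", well_implemented=True, groups_equated=True))
print(wwc_rating("QED", well_implemented=True, groups_equated=False))
```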
In 2018, the WWC added guidance on standards for regression discontinuity designs and single-case designs. The Brady and O’Regan (2009) youth mentoring study of the effects of having a Big Brother or Big Sister is summarized in Box 3.3; this is an example of a Methods Branch study that used randomized control trials (RCTs), that is, an experimental design. Another sample study is presented later (in Box 3.4): Duwe and Kerschner’s (2008) “boot camp” study used a quasi-experimental design to evaluate the effects of a program to reduce recidivism for people who had spent time in jail.

Box 3.3. Sample Study with an Experimental (Randomized Control) Design: The Youth Mentoring Study

| Sample study | Evaluation approach | Document title |
| :--- | :--- | :--- |
| Brady and O'Regan (2009) | Mixed methods with randomized control design (also, theory-based evaluation; logic model) | "Meeting the Challenge of Doing an RCT Evaluation of Youth Mentoring in Ireland: A Journey in Mixed Methods" |

The Evaluators

Bernadine Brady is a Social Science Researcher and Connie O’Regan is a Doctoral Fellow in the Child and Family Research Centre, National University of Ireland, Galway.

Philosophical and Theoretical Lenses

This study is situated in the Methods Branch and illustrates the use of a randomized control design, thus exemplifying the approach most closely linked with the postpositivist paradigm. However, the evaluators describe their initial philosophical stance as reflective of the pragmatic paradigm, in that they used both quantitative and qualitative approaches. As they progressed through the study and began to use both types of data, they describe a shift from a pragmatic to a dialectical stance, because they contrasted the findings from the quantitative impact study with the inductive findings of the qualitative data. The dialectical stance allowed them to compare and contrast the quantitative and qualitative approaches to the exploration of youth mentoring.

The evaluators are part of a small evaluation team headed by Professor Pat Dolan, Principal Investigator. The role of the evaluation team encompasses study design, data collection, and analysis.

The Evaluand and Its Context

Big Brothers Big Sisters (BBBS) is an international youth mentoring program. The evaluand in this study was a BBBS program in the western part of Ireland that had been in operation for 5 years. When the study began, the program supported 60 pairs of adult mentors and youth. Box Figure 3.1 displays the underlying theory that guided the program.

Method

Design
The evaluation team used a concurrent embedded mixed methods design that included randomized control trials as well as collection of qualitative data. Again, Box Figure 3.1 illustrates the theoretical model the evaluators tested to discover whether the treatment (the BBBS mentoring program) decreased the teens’ risky health behaviors and improved their socially appropriate behaviors, attitudes toward schoolwork, and peer/family relationships. Youth were randomly assigned either to participate in the BBBS program or not to participate. Given the integration of a theoretical model for youth mentoring programs, this evaluation could also be described as a theory-based evaluation.

[Box Figure 3.1 (diagram not reproduced). Moderators shown in the model: interpersonal history, social competencies, relationship duration, developmental stage, family and community context.]

Box Figure 3.1. Rhodes's model of mentoring. Source: DuBois and Karcher (2005). Copyright © 2005 Sage Publications. Reprinted by permission.

Evaluation Purposes

The study aimed to measure the impact of the BBBS mentoring program on the development of youth in the community. The evaluation started from a point of view of uncertainty about the effectiveness of mentoring as a policy intervention in an Irish context, and randomized control trials are deemed to be valuable in terms of exploring the impact (be it positive or negative) of interventions about which there is uncertainty (Oakley, 2000). Furthermore, it was decided that the randomized control trials should be undertaken in conjunction with a qualitative study to examine implementation and stakeholder perspectives.

Evaluation Questions

  1. What is the impact of the BBBS program on the participating youth?
  2. How is the program experienced by stakeholders?
  3. How is the program implemented?
  4. From comparing the outcome data from the impact study with the case study data from the mentoring pairs, what results emerge regarding the potential of this youth mentoring program?

Stakeholders and Participants

Stakeholders included the agency funding the evaluation (the Atlantic Philanthropies), which strongly supported the use of randomized control trials in evaluations. The host agency for the BBBS program is the Irish national youth organization Foroige. An expert advisory group (EAG) was formed of leading researchers and academics to guide the research team. Youth in the evaluation study ranged in age from 10 to 14, although youth in the entire program ranged in age from 10 to 18. Project staff included BBBS managers and project workers.

Data Collection

The evaluators used a survey-based methodology to answer the first evaluation question, concerning the impact of the program on the youth. The second question, concerning the experiences of the stakeholders, was addressed by means of “interviews conducted with key program participants, including youth, mentors, parents, and staff” (p. 275). Data for the third question about program implementation were collected by means of reviewing documents from case files and focus groups conducted with staff. Data were collected at baseline and at 12, 18, and 24 months. An integrated analysis based on both quantitative and qualitative data was used for the fourth question.

The sample for the first question consisted of 164 youth representing all those who were available in the western region’s program. Parents, mentors, and teachers also completed surveys. The interviews were conducted with a purposive sample of 10 mentoring pairs who were selected to reflect “differences in age, gender, and location [i.e., rural or urban]” (p. 275).
For the focus groups, the research team approached the BBBS caseworkers seeking an opportunity sample of matches within the study who would be willing to participate. The staff identified matches that were established and that would be willing to participate in a series of interviews with the research team. A total of 21 matches agreed to participate. The research team then reviewed this sample and selected a purposive sample representing a balance across characteristics of age, gender, location, family situation, and reason for referral. As the team members rolled out the design, they decided to reduce the number from 12 to 10 mentoring pairs, as this would provide them with a spread of participants across the characteristics of interest and would be more feasible for the research team to follow over two time periods: (1) when the match was established but less than 6 months old, and (2) once the pairs had been meeting for 1 year. Interviews were conducted separately at both time periods with each young person and with his/her mentor, parent, and caseworker.
One-off focus groups were conducted with members of the Foroige staff to review their attitudes about the program and gain their perspective on the potential of youth mentoring. Three such focus groups were held, involving 12 project staffers. An additional 12 individual interviews were held with BBBS caseworkers and line managers.

Management

The study covered a 3-year period. Its EAG was chaired by the Foroige CEO and was composed of representatives of the study funders, the Atlantic Philanthropies, and international experts in the area of mentoring research and research methodology. Meetings of the EAG were convened at critical times to advise the research team on design, implementation, and analysis issues and to provide feedback on draft reports. At the local level, the research team held regular meetings with Foroige and BBBS staff to ensure that the study was implemented as planned and to troubleshoot in relation to any issues that arose. Good working relationships with all stakeholders greatly facilitated the successful implementation of the research. The research team consisted of the Principal Investigator, Researcher, and Doctoral Fellow, with additional support brought in as required.

Meta-Evaluation

Evaluations of social interventions in Ireland have rarely drawn upon randomized control trials. This study was the first of its kind, so there was naturally a sense of apprehension among the research team and the study commissioner regarding the task of designing and implementing such a study. The support of the EAG was critical, therefore, in terms of ensuring that advice was provided at key stages of the study design and implementation. The process was a transparent one, in which the study commissioner, funder, research team, and experts were all aware of the issues and challenges encountered and could collaborate in addressing them.

Reports and Utilization

The full study findings will be used by Foroige to inform the ongoing development of the BBBS mentoring program in Ireland. It is envisaged that the findings will also be of interest to mentoring programs in other countries.


REFLECTIONS FROM THE EVALUATORS

  • It is useful to see a study design as a framework, the finer details of which will evolve as you encounter challenges and develop a greater understanding of the program context. In this study, we had to constantly reflect on progress and change our plans to respond to the realities of the program’s constraints and ethical issues. Our experience was a “journey,” as the title of our 2009 article suggests. The study design can change even during the implementation stage. For example, recruitment of the sample took longer than anticipated. Once recruited and randomized, matching of intervention group youth also took longer than planned. As a result of these issues, the overall study’s time frame was extended.
  • Good working relationships with program staff are absolutely critical. If we had attempted to impose a study design, devised in the ivory tower of academia, without ongoing consultation with staff on the ground, it would have been a failure. Instead, we had to work with staff on all levels, acknowledging different ways of working, to try to ensure that the study could be implemented as consistently as possible across the 10 sites in which we were evaluating the program. It took some time to get the lines of communication clear and ensure that all staff members felt included and up to date with what was expected of them in the study. At the same time, we had to be careful not to overload them with too much complexity or detract from their ability to do their job. The time required for planning and relationship building is usually underestimated in studies of this nature, which is why flexibility in timescale is important.
  • It’s important to be aware that responding to ethical concerns can have implications for the study design. For example, in our case, one implication of the ethical protocols was a reduction in the study sample size, due to a compressed target age range and requirements for full consent from both young people and parents. The evaluators and other stakeholders must jointly agree on what compromises must be made.
  • Having a theoretical model upon which to base the study was important in terms of facilitating the selection of relevant measures and guiding analysis of findings through testing a series of hypotheses.
  • The input of an expert group is very valuable. In our case, the fact that the evaluand and study funder were also represented on the EAG ensured that there was direct communication and transparency between all key stakeholder groups. It is our belief that the communication process was more seamless as a result.
  • There is scope to be creative regarding how qualitative and quantitative methods can work together to enhance understanding of the issue under study. In our case, the same team members were involved in both qualitative and quantitative work, which helped in terms of allowing a dialectical approach to emerge. [The dialectical approach is explained further in Chapter 9; essentially, this means that the evaluators were able to compare results from the quantitative and qualitative parts of their study and reach conclusions based on an integration of these two methods.]
  • We came to the mixed methods literature as a way of resolving our concerns with implementing the experimental design within the limitations of the real-world setting. We had been concerned that incorporating qualitative elements would result in an evaluation with two separate parts that did not "speak" to each other, either epistemologically or methodologically. However, we were much heartened to discover that many of these arguments had been developed throughout the mixed methods literature. This provided us with useful frameworks and design options from which to integrate the study design into a coherent, "whole" evaluation design.

An Example of an Experimental Design

In the Brady and O’Regan (2009) study, the independent variable was the mentoring program, with two levels: participation in the mentoring program and nonparticipation. The group that received the mentoring program is called the experimental group; the group that did not receive the program is the control group. The dependent variables included risky behaviors for the youth’s health (e.g., use of alcohol and drugs), socially appropriate behaviors (e.g., nonviolent resolution of conflicts), attitudes toward schoolwork, and peer and family relationships.
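The roles of the independent and dependent variables described above can be made concrete with a small sketch. The scores below are invented purely for illustration and are not results from Brady and O’Regan (2009); the names `CONDITIONS`, `scores`, and `mean_difference` are likewise hypothetical.

```python
from statistics import mean

# Independent variable: the mentoring program, with two levels
# (participation = experimental group, nonparticipation = control group).
CONDITIONS = ("experimental", "control")

# Hypothetical scores on one dependent variable, such as a count of risky
# health behaviors per youth. These numbers are invented for illustration.
scores = {
    "experimental": [2.0, 3.0, 1.0, 2.0, 2.0],
    "control": [3.0, 4.0, 3.0, 4.0, 1.0],
}


def mean_difference(scores_by_condition):
    """Return the experimental-group mean minus the control-group mean."""
    experimental_mean, control_mean = (
        mean(scores_by_condition[condition]) for condition in CONDITIONS
    )
    return experimental_mean - control_mean


difference = mean_difference(scores)
print(f"Experimental minus control: {difference:+.2f}")
```

With these made-up numbers, a negative difference would mean fewer risky behaviors in the experimental group. A real analysis would also ask whether a difference of that size could plausibly arise by chance.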

EXTENDING YOUR THINKING

Experimental Design

Methods Branch

| Sample study | Evaluation approach | Topical area |
| :--- | :--- | :--- |
| Brady and O'Regan (2009) | Mixed methods with randomized control design (also, theory-based evaluation; logic model) | Youth mentoring in Ireland |

Using the description of the Brady and O’Regan (2009) study in Box 3.3, answer the following questions:
  1. What theory drove this evaluation?
  2. Which parts of this study were quantitative, and which were qualitative?
  3. What data were the evaluators able to gather by using concurrent mixed methods, rather than a purely positivist, quantitative, pre-post survey (using only evaluation question 1)?
  4. What do you think the evaluators mean when they say that a mixed method design allowed them to use a “coherent, ‘whole’ evaluation design”?
  5. Do you think the random assignment was the best way to decide which teens would be in the experimental group? What concerns would you have about this method?
  6. In Box 3.3, the evaluators state: “We came to the mixed methods literature as a way of resolving our concerns with implementing the experimental design within the limitations of the real-world setting.” What does this statement illustrate about the modifications the evaluators felt were needed to replace conducting a straightforward experimental design?

Theory to Practice: Methods Branch

  1. Experimental and quasi-experimental approaches
  2. Theory-based evaluation
  3. Evaluation of training programs

An Example of a Quasi-Experimental Design

Quasi-experimental designs are used when evaluators are not able to assign participants randomly to treatment groups. Duwe and Kerschner (2008) used a quasi-experimental design for their evaluation of an alternative “boot camp” program for nonviolent drug and property offenders. They could not randomly assign individuals to be in the experimental program or the existing program, because when they started the evaluation study, the experimental and control treatments had already been implemented with large groups. Therefore, they had to use a quasi-experimental approach, which does not require randomly assigning participants to groups. (See Box 3.4.) They also included a cost-benefit component to their study to compare costs of the existing program with the new program.

Box 3.4. Sample Study with a Quasi-Experimental Design: The Boot Camp Study

| Sample study | Evaluation approach | Document title |
| :--- | :--- | :--- |
| Duwe and Kerschner (2008) | Quasi-experimental design; cost-benefit analysis | "Removing a Nail from the Boot Camp Coffin: An Outcome Evaluation of Minnesota's Challenge Incarceration Program" |

The Evaluators  評估員

Grant Duwe is Senior Research Analyst with the Minnesota Department of Corrections. Deborah Kerschner is Senior Program Manager for the Minnesota Department of Corrections.
Grant Duwe 是明尼蘇達州懲教署的高級研究分析師。Deborah Kerschner 是明尼蘇達州懲教署的資深計畫經理。

Philosophical and Theoretical Lenses
哲學與理論鏡頭

The evaluators work within the postpositivist paradigm. In this study, they pursued objectivity by keeping their distance from the program, avoiding interaction with the offenders to prevent bias, and using a quasi-experimental design as their methodology. The two evaluators worked independently from the program staff. They analyzed extant data on characteristics of offenders and outcomes after release from the program or the traditional incarceration facilities.

The Evaluand and Its Context
Evaluand 及其背景

In 1992, the state of Minnesota opened an alternative incarceration program for nonviolent drug and property offenders called the “Challenge Incarceration Program” (CIP), which included 6 months of chemical dependency treatment combined with a physically demanding program (called “boot camp”), followed by two 6-month phases of community service with intensive supervision.

Method  方法

Design  設計

The evaluators used a retrospective quasi-experimental design with two groups. The experimental group participated in CIP; the control group was incarcerated in a Minnesota correctional facility, but did not participate in CIP. Individuals were not randomly assigned to their groups; hence this was a quasi-experimental study. The researchers also included a cost-benefit component as part of the design that allowed them to assess the cost savings as a result of early release and reduced recidivism.
評估人員使用回顧性的準實驗設計,分為兩組。實驗組參加了 CIP;對照組被監禁在明尼蘇達州的一家懲教所,但沒有參加 CIP。個人並非隨機分配到各組,因此這是一項准實驗研究。研究人員還將成本效益部分作為設計的一部分,以便評估提早釋放和減少累犯所節省的成本。

Evaluation Purposes and Questions
評估目的與問題

“The present study evaluates CIP since its inception, focusing on two main questions: (a) Does CIP significantly reduce offender recidivism? and (b) Does CIP reduce costs?” (Duwe & Kerschner, 2008, p. 616).
*本研究評估CIP自創立以來的情況,著重於兩個主要問題:(a)CIP是否大幅降低罪犯的累犯率? 及(b)CIP是否降低成本?"。(Duwe & Kerschner, 2008, p. 616)。

Stakeholders and Participants
利害關係人與參與者

The evaluators do not explicitly discuss the stakeholders and participants, although they do provide an extensive discussion of the offenders from whom the data were collected.
評估人員並沒有明確討論利害關係人和參與者,不過他們確實對資料收集對象的犯罪者進行了廣泛的討論。

Data Collection  資料收集

This study used “four different measures: rearrest, reconviction, reincarceration for a new crime, and any return to prison (for either a new offense or a technical violation)” (Duwe & Kerschner, 2008, p. 619). “Recidivism was operationalized as a rearrest, a felony reconviction, a return to prison for a new criminal offense (i.e., reimprisonment), and any return to prison (i.e., reincarceration because of a new crime or technical violation). It is important to emphasize that the first three recidivism measures contain only new criminal offenses, whereas the fourth measure is much broader in that it includes new crimes and supervised release violations” (p. 622).
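The four recidivism measures quoted above can be sketched as a small data structure. This is only an illustration of how the measures relate to one another, not the authors' analysis code; the record type and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReleaseRecord:
    """Post-release events for one offender (hypothetical field names)."""
    rearrested: bool
    felony_reconvicted: bool
    returned_new_offense: bool          # back in prison for a new crime
    returned_technical_violation: bool  # back in prison for a supervised release violation


def recidivism_measures(record: ReleaseRecord) -> dict[str, bool]:
    """Derive the four recidivism measures described in the text."""
    return {
        "rearrest": record.rearrested,
        "reconviction": record.felony_reconvicted,
        "reimprisonment": record.returned_new_offense,
        # The fourth measure is broader: any return to prison,
        # whether for a new crime or for a technical violation.
        "any_return": record.returned_new_offense
        or record.returned_technical_violation,
    }


# An offender who returned to prison only for a technical violation
# counts on the fourth measure but on none of the first three.
example = ReleaseRecord(False, False, False, True)
print(recidivism_measures(example))
```

Keeping the first three measures separate from the fourth makes the distinction in the quotation explicit: only the fourth responds to supervised release violations.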
Cost data were obtained based on program operation costs by multiplying the total number of days in the program or in prison by the per diem rate of the prison. If an individual dropped out of the experimental program or was rearrested and incarcerated, then costs were collected for additional time spent in the prison. For those who participated in the program, costs were collected for the total number of bed days saved due to early release.
成本資料是根據計畫運作成本取得,方法是將計畫中或監獄中的總天數乘以監獄的每日費率。如果有個人退出實驗計畫,或再次入獄,則收集在監獄中多花時間的成本。對於參加計劃的人,則收集因提早釋放而節省的總住院日數。
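The cost arithmetic described above (days multiplied by a per diem rate) can be sketched as follows. The function names and the numbers in the usage line are hypothetical, and treating net savings as bed-day savings minus the cost of additional prison time is an assumption made for illustration, not a formula reported by the evaluators.

```python
def confinement_cost(days: int, per_diem: float) -> float:
    """Cost of confinement: number of days multiplied by the daily rate."""
    if days < 0 or per_diem < 0:
        raise ValueError("days and per_diem must be non-negative")
    return days * per_diem


def net_savings(bed_days_saved: int, extra_prison_days: int, per_diem: float) -> float:
    """Assumed net figure: value of bed days saved through early release,
    minus the cost of additional prison time (e.g., after dropping out)."""
    saved = confinement_cost(bed_days_saved, per_diem)
    added = confinement_cost(extra_prison_days, per_diem)
    return saved - added


# Hypothetical participant: 180 bed days saved, 30 extra prison days,
# at an invented per diem rate of 100.0.
print(net_savings(180, 30, 100.0))
```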
The sample consisted of an experimental group and a control group. The experimental group included “all offenders who entered CIP from the time it opened, October 1992, through the end of June 2002. During this time, there were 1,347 offenders (1,216 male and 131 female) who entered CIP” (p. 621). The control group included “offenders who were released from a Minnesota Correctional Facility within a similar timeframe, January 1, 1993, to December 31, 2002” (p. 621). Violent offenders were eliminated because they would not have been eligible for CIP. The evaluators randomly sampled from the remaining individuals to end up with a control group of 1,555 people who were similar to the experimental group on a number of defined variables, such as sex, age, race, and prior arrests.

Management and Budget  管理與預算

The evaluation was conducted retrospectively; that is, the data used were already collected, and the evaluators conducted the analysis independently of the implementation of the program. No mention is made of the management plan or budget.
評估是以回溯方式進行的;也就是說,所使用的資料是已經收集好的,評估人員是在獨立於計劃執行的情況下進行分析的。未提及管理計劃或預算。


Reports and Utilization
報告與使用

The results revealed that the experimental program participants stayed out of prison longer than control group participants. Also, even when experimental group participants returned to prison, they spent significantly less time there than did the control group participants, because their crimes were less serious. Finally, CIP saved the state of Minnesota $6.2 million because of the reduced costs associated with early release. Duwe and Kerschner suggest that their findings can be used to justify the use of a boot camp approach if it is combined with drug treatment and is followed up with intensive supervision over a year.

Meta-Evaluation  元評估

The evaluators do not mention meta-evaluation strategies.
評估人員沒有提及元評估策略。

EXTENDING YOUR THINKING Quasi-Experimental Design
擴展您的思考 準實驗設計

Methods Branch

| Sample study | Evaluation approach | List the distinguishing characteristics |
| :--- | :--- | :--- |
| Duwe and Kerschner (2008) | Quasi-experimental design; cost-benefit analysis | |

Using the description of the Duwe and Kerschner (2008) study in Box 3.4, respond to the following questions:
使用方框 3.4 中關於 Duwe 和 Kerschner (2008) 研究的說明,回答下列問題:
  1. The authors used a quasi-experimental design because the groups had already been chosen, so they could not be randomly assigned (as is required with an experimental design). Do you think there are any ethical concerns to be considered around the fact that those in the control group did not receive the alternative program?
    作者使用了准實驗設計,因為組別已經選好,所以無法隨機分配(實驗設計要求)。您認為對照組的人沒有接受替代方案,是否有任何倫理問題需要考慮?
  2. What was the independent variable? The dependent variable?
    自變量是什麼?因變量是什麼?
  3. What theory drove this evaluation?
    是什麼理論驅使這項評估?
  4. What were the results of this study, and what was concluded from the results?
    這項研究的結果是什麼,從結果中得出了什麼結論?
  5. Were you expecting different results than those reported by the authors? If so, what results were you expecting?
    您期望的結果是否與作者報告的結果不同?如果是,您預期的結果是什麼?
  6. The authors state that CIP reduced costs for the state of Minnesota. Do you think that there may have also been other variables at play instead of, or concurrently with, CIP? If yes, how would you investigate what those variables might have been?
作者指出 CIP 降低了明尼蘇達州的成本。您是否認為可能還有其他變數在起作用,而不是 CIP?如果是,您會如何調查這些變數?

Theory-Based Evaluation  以理論為基礎的評估

Theory to Practice: Methods Branch
理論到實踐:方法分支

  1. Experimental and quasi-experimental approaches
    實驗和准實驗方法
  2. Theory-based evaluation  以理論為基礎的評估
  3. Evaluation of training programs
    訓練計畫評估
This second part of the “Theory to Practice” section considers practice based on the work of methods theorists who prioritize theory-based evaluation approaches. The waters that flow through the evaluation landscape took a distinctive turn when Lipsey (1998) and Chen (1994) first introduced the concept of theory-based evaluation. Building a program theory involves first identifying those elements that the stakeholders believe are necessary to achieve their desired results, and then developing a model that shows how the elements relate to each other in that process. As mentioned briefly in Chapter 2, theory-based evaluations sometimes result in tables, charts, or diagrams that are called “logic models,” “log frames,” or “program theory models.” The Brady and O’Regan (2009) youth mentoring study summarized in Box 3.9 provides a diagram that represents the theory underlying the Big Brother Big Sister (BBBS) program in western Ireland. Fredericks et al.'s (2008) quality-of-life study used a theory-based approach to evaluate a program designed to provide individualized services for people with developmental disabilities, with an improved quality of life as the desired result. They used a logic model as a way to represent the program theory. A summary of this study appears in Box 3.5.
Box 3.5. Sample Study Using Theory-Based Evaluation: The Quality-of-Life Study
方框 3.5.使用理論為基礎的評估的樣本研究:生活品質研究
| Sample study | Evaluation approach | Document title |
| :--- | :--- | :--- |
| Fredericks, Deegan, and Carman (2008) | Theory-based evaluation | "Using System Dynamics as an Evaluation Tool: Experience from a Demonstration Program" |

The Evaluators  評估員

The evaluation was conducted under contract with a local university. Members of the team included three doctoral students studying public administration: Kimberly Fredericks, Joanne Carman, and Michael Deegan. Kimberly Fredericks is Assistant Professor and Coordinator of the Graduate Health Services Administration Program at the Sage Colleges in Albany, New York. Michael Deegan is a policy analyst and is Postdoctoral Research Fellow at the National Academies of Science in Alexandria, Virginia. Joanne Carman is Assistant Professor and Coordinator of the Graduate Certificate Program in Nonprofit Management at the University of North Carolina at Charlotte.

Philosophical and Theoretical Lens
哲學與理論鏡頭

The evaluators situated their work in the social science theoretical framework known as the system dynamics approach, which involves building a model to capture “the dynamic structures and processes of complex systems” (Fredericks et al., 2008, p. 252). Their overarching philosophical beliefs are in accord with the postpositivist paradigm, in that their goal is to develop a mathematical relationship between the variables in a system that influences the outcomes of a program. The first step in the process was to identify relevant variables and create a conceptual model that shows their relationships to each other and the desired results. The second step involved creating a mathematical model for examining the variables over time in order to capture the structure and processes as they relate to outcome variables.

The Evaluand and Its Context
評估標準及其背景

The evaluand was a demonstration program for people with developmental disabilities being delivered by six nonprofit agencies over 5 years. The goal of the program was to provide individualized services to these people that would lead to improved quality of life. The project had three primary goals: to provide individualized service in response to a person’s specific needs (rather than services delivered to a group), to provide flexible funding for service providers, and to streamline regulatory and administrative processes.
該評估是一項針對發展障礙人士的示範計畫,由六個非營利機構在五年內執行。該計畫的目標是為這些人提供個性化服務,從而改善他們的生活品質。該計畫有三個主要目標:針對個人的特定需求提供個人化服務(而非向群體提供服務)、為服務提供者提供靈活的資金,以及簡化監管和行政流程。
Services provided by agencies included evaluation and assessment; early childhood development; day care and universal PreK; school-age education; adult day programs (including day habilitation and day treatment); vocational and supported employment programs; after-school and weekend recreation programs; summer day camp; assistive technology resources; health care (including medical care, rehabilitation, dental care, audiology, and augmentative communication); residential programs (ranging from community residences to supported apartments to independent living); and family support services (including service coordination, family reimbursement, recreation, afterschool and overnight respite, and housing and accessibility assistance). Most consumers received residential habilitation services, day habilitation services, or both (Fredericks, cited in Fredericks et al., 2008, p. 254).
各機構提供的服務包括評估和評量、早期兒童發展、日間照顧和普及學前教育、學齡教育、成人日間計劃(包括日間適應訓練和日間治療)、職業和輔助就業計劃、課後和週末娛樂計劃、夏令營、輔助技術資源;健康照護(包括醫療照護、復健、牙科照護、聽力學及輔助溝通);住宿計畫(從社區住宅到支援公寓再到獨立生活); 以及家庭支援服務(包括服務協調、家庭補償、娛樂、課後及通宵暫顧,以及住房和無障礙協助)。大多數消費者接受住宿適應訓練服務、日間適應訓練服務,或兩者兼有(Fredericks, cited in Fredericks et al, 2008, p.254)。
The evaluand was represented by a logic model capturing the inputs (what resources are needed?); the process (activities); the outputs and short-term outcomes (evidence that the project is accomplishing its goals in the short term); and the long-term outcomes and impacts (the goals in the long term). This logic model is the portrayal of the program theory; that is, what needs to occur in order to achieve the desired effects? The logic model for this study appears in Chapter 7 (Box 7.5).
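The four components of a logic model named above can be sketched as a simple data structure. This is a minimal illustration, not the authors' logic model (which appears in Box 7.5). The class and method names are hypothetical, and the placement of the example entries, which are drawn from the project goals listed in this box, is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class LogicModel:
    """A program theory expressed as four ordered components."""
    inputs: list[str] = field(default_factory=list)               # resources needed
    activities: list[str] = field(default_factory=list)           # the process
    short_term_outcomes: list[str] = field(default_factory=list)  # outputs and early evidence
    long_term_outcomes: list[str] = field(default_factory=list)   # long-term goals and impacts

    def stages(self) -> list[tuple[str, list[str]]]:
        """Return the components in the order the program theory runs."""
        return [
            ("inputs", self.inputs),
            ("activities", self.activities),
            ("short_term_outcomes", self.short_term_outcomes),
            ("long_term_outcomes", self.long_term_outcomes),
        ]

    def missing_stages(self) -> list[str]:
        """Name the components that have not been specified yet."""
        return [name for name, items in self.stages() if not items]


# Illustrative entries only; inputs are deliberately left unspecified.
model = LogicModel(
    activities=["individualized service planning and delivery"],
    short_term_outcomes=["increased consumer choice", "increased community integration"],
    long_term_outcomes=["improved quality of life"],
)
print(model.missing_stages())
```

A check such as `missing_stages` reflects the point of the logic model: every link from resources to long-term results should be stated before the evaluation tests it.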

Method  方法

Design  設計

This evaluation used a theory-based approach. The evaluators worked with the project’s Steering Committee to develop a comprehensive evaluation for the project. The Steering Committee consisted of members of the evaluation team, directors from the agencies, and representatives from the state agency that funded the project.
本次評估採用了理論為基礎的方法。評估人員與該專案的指導委員會合作,為該專案制定全面的評估。指導委員會的成員包括評估小組的成員、各機構的主管以及資助該專案的州政府機構的代表。

Evaluation Purposes and Questions
評估目的與問題

The evaluation was organized around four primary evaluation questions: (1) Who was served during the project? (2) What services did they get, and how much? (3) What were the outcomes? How did they relate to the project’s goals of increases in individualized service planning and delivery; increases in person-centered planning; increases in consumer choice; increases in community integration; and improved quality of life for consumers in terms of home, relationships, personal life, work/school, and community? (4) How much did the services cost?
評估圍繞四個主要評估問題進行:(1) 誰在專案期間獲得了服務?(2) 他們獲得了哪些服務,以及獲得了多少服務?(3) 結果如何?這些結果與計劃的目標有什麼關係:增加個人化的服務規劃和提供;增加以個人為中心的規劃;增加消費者的選擇;增加社區融合;以及改善消費者在家庭、人際關係、個人生活、工作/學校和社區方面的生活品質?(4) 服務的費用是多少?

Data Collection  資料收集

Data collection consisted of review of case records for demographic and disability diagnosis, Medicaid billing and expenditure data, site visits to the agencies that provided the services, and interviews. An outcome survey was also used to collect data from project staff, families, and participants each year. Service providers (direct care, supervisory, and support staff) were interviewed anonymously about their attitudes and perceptions about the services they provided. Data were compiled and analyzed each year by site and in the aggregate.

Management and Budget  管理與預算

The evaluation was staffed by two doctoral students working 20 hours a week. The students were responsible for managing the data collection, data analysis, and report writing, with additional support provided by the evaluation team’s director (as well as members of the Steering Committee, as needed). The annual budget for the evaluation was approximately $90,000.

Meta-Evaluation  元評估

The evaluation team met with the Steering Committee periodically to discuss the progress of the evaluation. The content and discussions that emerged from these meetings helped the evaluators to realize they needed to add another component to the evaluation, which is described in the next section of this summary.
評估小組定期與指導委員會開會討論評估進度。從這些會議中得出的內容和討論幫助評估人員意識到他們需要在評估中增加另一個組成部分,這將在本摘要的下一節中描述。

Reports and Utilization  報告與使用

The evaluation information was reported back to the sites on a regular basis. The sites received multiple copies of detailed reports, as well as one- to two-page executive summaries that were intended to be distributed to front-line staff, board members, and other interested stakeholders.
The evaluators noted that data from the fourth year of the evaluation revealed discrepancies from site to site in implementation of the various program components, and that sites were experiencing challenges that limited their ability to provide services as specified in the logic model. They shared these findings with the Steering Committee and recommended that the evaluation team use a qualitative system dynamics approach to see whether it could identify implementation problems and ways to ameliorate these problems. This involved three meetings between the evaluators and the stakeholders to develop a diagram that captured how the project worked in practice (not in theory). Through this process, they were able to “identify and conceptualize several issues that may have been hindering the success of the program, including competing goals, capacity limitations in the agencies, community constraints, and time-management problems for employees. Specifically, the model identified the pressures that were inhibiting the project’s ability to increase individualized services and to improve certain aspects of the consumer’s quality of life” (Fredericks et al., 2008, p. 257). The evaluators provide a detailed description of the various diagrams that were developed and how they enhanced the project’s ability to gain knowledge about its effectiveness.
評估人員注意到,第四年的評估數據顯示,各個地點在實施計劃各個組成部分方面存在差異,而且各個地點遇到的挑戰限制了他們按照邏輯模型的規定提供服務的能力。他們與指導委員會分享了這些發現,並建議評估小組使用定性系統動力方法,看看能否找出實施問題以及改善這些問題的方法。這涉及到評估人員與利益相關者之間的三次會議,以制定一個圖表來捕捉項目在實踐中(而非理論上)是如何運作的。通過這個過程,他們能夠「找出並概念化可能妨礙計劃成功的幾個問題,包括相互競爭的目標、機構的能力限制、社區限制以及員工的時間管理問題」。具體來說,該模型找出了妨礙計劃增加個性化服務和改善消費者生活品質某些方面的壓力」 (Fredericks et al., 2008, p.257)。評估人員詳細描述了所建立的各種圖表,以及這些圖表如何增強該專案獲取有關其成效的知識的能力。

REFLECTIONS FROM THE EVALUATORS
評估人員的反思

During the course of this evaluation, we learned several valuable lessons. The first lesson has to do with how the stakeholders in an evaluation can have different understandings of specific concepts and terms. For example, the funder approached the evaluation team and asked us to design a “comprehensive evaluation.” As evaluators, we took this to mean exactly that: developing an evaluation that looks at the project’s implementation, outcomes, and program theory, and answers some efficiency and effectiveness questions. We collected the baseline data during the first year of the project, and a few months into the second year of the project, we presented our first report summarizing the baseline data to the Steering Committee. We described the population being served at each of the sites, as well as a summary of the baseline measures for the outcome data that we would be tracking for the next 4 years.
在評估過程中,我們汲取了幾個寶貴的教訓。第一個教訓是關於評估中的利益相關者如何對特定的概念和術語有不同的理解。例如,資金提供者找到評估小組,要求我們設計一個 「全面的評估」。作為評估人員,我們認為這正是這個意思--制定一個評估,考察專案的實施、結果、計劃理論,並回答一些效率和效果問題。我們在專案第一年收集了基線資料,在專案第二年的幾個月後,我們向指導委員會提交了第一份報告,總結了基線資料。我們描述了每個地點的服務對象,以及未來四年我們將追蹤的結果資料的基線測量摘要。
In spite of the project’s being somewhat participatory in nature, in that we asked for input from the Steering Committee at every stage of the evaluation design (they approved the survey instruments, data collection plans, etc.), the report was not very well received. Immediately following the presentation, the committee members started asking us, “Where are the recommendations?” As the evaluators, we were a bit surprised by this. We asked, “What do you mean by recommendations? We are only in the second year of the program. We don’t even have any outcome data to report.” They responded by explaining that they wanted us to tell them what was working well so far, what wasn’t, and what changes they should be making to improve the implementation of the program. It was very clear, by the end of this meeting, that what the group had really wanted was for us to design a “formative evaluation.” In hindsight, this made sense, given that this was a demonstration program. Yet, when the funder and the sites approached the evaluators, they used the phrase “comprehensive evaluation.” This was the language that ended up in the evaluation contract. Whereas we interpreted the phrase “comprehensive evaluation” in terms of evaluation design, the funder and the sites used the phrase so that it would give them great flexibility over what they could ask us to deliver over time.
Not surprisingly, after this meeting, we evaluation team members had to regroup and change our data collection strategies, change the way we allocated resources, and find money in the budget to visit all of the sites. We added a person to the evaluation team, and traveled to each of the six sites to conduct personal and group interviews with different levels of staff (e.g., direct care workers, supervisors, financial administrators) to find out how implementation of the project was going. We tape-recorded the interviews, transcribed the data, and wrote a new report answering the questions that they were most interested in: what was working so far and what could be improved.
毫不奇怪,在這次會議之後,我們評估小組的成員必須重新組合,並改變我們的資料收集策略、改變我們分配資源的方式,並在預算中找錢去訪問所有的地點。我們在評估小組中增加了一個人,並前往六個地點中的每一個,與不同階層的員工(例如,直接照護工作者、主管、財務管理員)進行個人和小組訪談,以瞭解計劃的實施情況。我們將訪談內容錄音、轉錄資料,並撰寫一份新的報告,回答他們最感興趣的問題 - 目前為止有哪些成效以及哪些地方可以改善。
Another lesson that we learned has to do with the assumptions that well-intended and presumably informed stakeholders can make about the population being targeted by an evaluation. In designing the evaluation, we had a meeting where the subject of attrition came up. Given that this was going to be a 5-year project, we knew that tracking people over time might be difficult. The members on the Steering Committee who represented the agencies, however, assured us that attrition would not be a problem, in that this was not a transient group of people. In analyzing the data during the second and third years of the evaluation, we began to realize that attrition was indeed a problem at some of the program sites. Follow-up interviews confirmed that at some of the program sites, consumers dropped in and out of the program on a fairly regular basis. Typically, this occurred at the smaller sites that had collaborative relationships with other service providers. As the consumers’ needs changed, they were being referred to other providers to meet those needs.
我們汲取的另一個教訓,是關於用心良苦且可能知情的利害關係人,對評估所針對的人口所做的假設。在設計評估的過程中,我們開了一次會議,會議上提到了流失的問題。由於這將會是一個為期五年的專案,我們知道在一段時間內追蹤人們的情況可能會很困難。但是,指導委員會中代表各機構的成員向我們保證,流失不會是一個問題,因為這不是一個短暫的群體。在評估的第二年和第三年分析資料時,我們開始意識到,在一些計劃地點,流失確實是一個問題。追蹤訪談證實,在某些計畫地點,消費者經常退出或退出計畫。這種情況通常發生在與其他服務提供者有合作關係的小型地點。當消費者的需求改變時,他們會被轉介到其他服務提供者,以滿足這些需求。
The final lesson that we learned has to do with the importance of the political and institutional support for demonstration projects and their evaluations. The project was originally created in response to political and institutional fears that the Medicaid funding stream might be converted to a block grant (just as Aid to Families with Dependent Children was converted to Temporary Assistance for Needy Families). By the end of the third year of the project, there had been a shift in the larger political environment, and the funder and agencies were no longer worried that this was going to happen. This realization had profound effects on the momentum of the evaluation. Data collection for the evaluation was no longer a priority for most of the sites. Evaluation team members had to expend considerably more time and effort to ensure that data were being collected, and interest in the evaluation reports declined markedly. In fact, when the final report was delivered, it was emailed to the funder. There was no Steering Committee meeting, no official presentation, and no feedback.
我們汲取的最後一個教訓與示範專案及其評估的政治和機構支持的重要性有關。該專案最初是為了回應政治和機構對 Medicaid 資金流可能轉換為整筆補助的恐懼而建立的(就像「受扶養兒童家庭援助」轉換為「貧困家庭臨時援助」一樣)。在專案的第三年結束時,大的政治環境已經改變,資金提供者和機構不再擔心這會發生。這個意識對評估的動力有深遠的影響。對於大多數機構來說,評估的資料收集不再是優先項目。評估小組成員不得不花更多的時間和精力來確保數據的收集,對評估報告的興趣也明顯下降。沒有指導委員會會議,沒有正式報告,也沒有回饋。
Theory-Based Evaluation  以理論為基礎的評估
Methods Branch

| Sample study | Evaluation approach | List the distinguishing characteristics |
| :--- | :--- | :--- |
| Fredericks, Deegan, and Carman (2008) | Theory-based mixed methods evaluation | |

Using the description of the Fredericks et al. (2008) study in Box 3.5, please answer the following questions:
請利用方框 3.5 中對 Fredericks 等人(2008 年)研究的描述,回答下列問題:
  1. Building a program theory involves identifying those elements that the stakeholders believe are necessary to achieve their desired results. Name three elements or variables that Fredericks et al. believed were necessary to improve the quality of life for people with developmental disabilities.
    建立計畫理論需要找出利害關係人認為達到預期結果所需的那些要素。請說出 Fredericks 等人認為改善發展障礙人士生活品質所需的三個要素或變數。
  2. Building a program theory involves developing a model that shows how the elements relate to each other in that process. What do you call the kind of model Fredericks et al. developed to illustrate their program?
    建立一個程式理論需要開發一個模型,說明在這個過程中各元素之間的關係。您如何稱呼 Fredericks 等人為了說明他們的程式而開發的那種模型?
  3. Return to Box 3.5 and note the questions the evaluators asked and then how the data were collected. Do you think they missed collecting data from any group that might have wanted to give their input? Was everything and everyone covered, in your opinion?
    回到方框 3.5,注意評估員提出的問題,然後再注意如何收集資料。您認為他們是否遺漏了向任何可能想提供意見的團體收集資料?在您看來,是否涵蓋了所有的人和事?
  4. Why couldn’t the evaluators continue using exclusively quantitative methods for the study during the fourth year? What happened to make them include qualitative methods in the study? What do you notice about the use of a mixed methods design in this study in terms of how it changes the study’s processes and findings?
    為什麼評估人員不能在第四年繼續只使用定量方法進行研究?發生了什麼事讓他們在研究中加入定性方法?在本研究中使用混合方法設計會如何改變研究的過程和結果?
  5. The evaluators were quite kind in sharing with us, the authors and readers of this book, three major problems they encountered in the study. Select one of the three and explain how you think the evaluators could have avoided the problem, or explain why you believe none of the problems could have been avoided.
    評估人員非常友善地與我們--本書的作者和讀者--分享了他們在研究中遇到的三個主要問題。請從這三個問題中選擇一個,並解釋您認為評估人員可以如何避免這個問題,或者解釋為什麼您認為這些問題都無法避免。
  6. The findings were emailed to the funder. How else could the findings from this evaluation have been disseminated, and to whom? How do you think the final report was eventually used? After the amount of time and money invested in this project, and the findings that might improve the demonstration program, do you feel that this method of disseminating the findings was ethical? What power do evaluators have in this situation?
    將評估結果以電子郵件寄給資助者。這次評估的結果還可以如何發佈,發佈給誰?您認為最終報告是如何被使用的?在這個專案中投入了大量的時間和金錢,而且評估結果可能會改善示範專案,你覺得這種傳播評估結果的方法合乎道德嗎?評估人員在這種情況下有什麼權力?

Evaluation of Training Programs
訓練計畫評估

Theory to Practice: Methods Branch
理論到實踐:方法分支

  1. Experimental and quasi-experimental approaches
    實驗和准實驗方法
  2. Theory-based evaluation  以理論為基礎的評估
  3. Evaluation of training programs
    訓練計畫評估
This final part of the “Theory to Practice” section includes early theoretical approaches to the evaluation of training programs. One of the earliest approaches to training evaluation, and one that continues to be used in many organizations, was developed by Kirkpatrick (1975). As mentioned previously in this chapter, the Kirkpatrick model has four levels or stages (see Box 3.6). A sample study that used the Kirkpatrick model to evaluate training is presented in Box 3.7. Busch et al. (2005) conducted a study of school leadership that illustrates Kirkpatrick’s four-level model of evaluation.
理論到實踐 "部分的最後一部分包括了早期的培訓計劃評估的理論方法。Kirkpatrick (1975)是最早的培訓評估方法之一,也是許多組織一直沿用的方法。正如本章之前提到的,Kirkpatrick模型有四個層次或階段(見方框3.6)。方框 3.7 中介紹了一項使用 Kirkpatrick 模型來評估培訓的示例研究。Busch 等人(2005)對學校領導力進行了一項研究,說明了 Kirkpatrick 的四層評估模型。

Box 3.6. Kirkpatrick's Model of Evaluation
方框 3.6.Kirkpatrick 的評估模型

| Level  層次 | Stage  階段 | Focus  重點 |
| :--- | :--- | :--- |
| 1 | Reaction stage  反應階段 | Measuring how much the participants enjoyed the training  衡量參與者對訓練的喜好程度 |
| 2 | Learning stage  學習階段 | Looking at what skills or information was absorbed by the participants during and immediately after the training  檢視參加者在訓練期間和訓練結束後立即吸收了哪些技能或資訊 |
| 3 | Behavior stage  行為階段 | Testing the transfer of learning and the application of knowledge and skills by the participants after training, back in the workplace  測試參加者在受訓後回到工作場所的學習轉移以及知識和技能的應用情況 |
| 4 | Results stage  結果階段 | Attempting to capture the effect of a training program on the organization’s performance  嘗試捕捉訓練計畫對組織績效的影響 |

Source: Based on O’Toole (2009).
資料來源:根據 O'Toole (2009)。

Box 3.7. Training Evaluation of School Leadership Using Kirkpatrick's Model
方框 3.7. 使用 Kirkpatrick 模型的學校領導力培訓評估

The Evaluators  評估員

Joseph R. Busch is an associate dean at the Fielding School of Psychology, Thomas P. O’Brien is the principal at Brentwood High School on Long Island, NY, and William D. Spangler teaches at the School of Management, Binghamton University, in New York.
Joseph R. Busch 是菲爾丁心理學院的副院長,Thomas P. O'Brien 是紐約長島布倫特伍德高中的校長,William D. Spangler 在紐約賓漢頓大學管理學院任教。

Philosophical and Theoretical Lenses
哲學與理論視角

The evaluators situate their work in the postpositivist paradigm, and they use the Kirkpatrick model of evaluation. They also make use of a leadership theory postulating that leadership development requires recognition of an individual’s leadership style, the development of a plan for enhancing leadership skills, and mentoring and reflection.
評估人員將他們的工作置於後實證主義範式中,並使用 Kirkpatrick 評估模型。他們還運用了領導力理論,該理論認為領導力發展需要認清個人的領導風格、制定提升領導技能的計劃,以及指導和反思。

The Evaluand and Its Context
評估對象及其背景

A leadership development program was implemented through a collaborative effort with the state department of education, the university, and local school superintendents. It was not a certificate- or degree-granting program; rather, it was designed to encourage potential leaders to consider pursuing leadership positions. It had several components: assessments of individuals’ leadership styles and competencies, workshops on leadership theory and practice, mentoring by school administrators, and opportunities for individual and group reflection.
通過與州教育部、大學和當地學校校長的合作,實施了一項領導力發展計劃。該計畫不是一項頒發證書或學位的計畫;相反,它旨在鼓勵潛在的領導人考慮追求領導職位。它有幾個組成部分:個人領導風格和能力評估、領導理論和實踐研討會、學校行政人員的指導,以及個人和團體反思的機會。

Method  方法

Design  設計

Kirkpatrick’s four-level model was used to design the evaluation approach (reaction, learning, behavior, results; see Box 3.6).
Kirkpatrick 的四層模型被用來設計評估方法(反應、學習、行為、結果;見方框 3.6)。

Stakeholders and Participants
利害關係人與參與者

The participants in the training program were teachers who showed promise as leaders. The evaluation reported on the experiences of three cohorts who completed the 8-month training (n's = 25, 10, and 22 for these cohorts).
參加培訓計劃的都是有希望成為領導者的教師。該評估報告了三批完成 8 個月培訓的教師的經驗(這三批教師的 n 分別為 25、10 和 22)。

Data Collection  資料收集

The evaluators used both quantitative and qualitative measures, including two reaction surveys; a quantitative assessment of learning and behavior through role plays and an in-basket scenario; the Multifactor Leadership Questionnaire (also for measuring learning and behavior); and a survey on the participants’ educational and career plans (results). Qualitative data included responses to open-ended questions in the reaction surveys, the mentors’ and superintendents’ written conclusions about the program, and participant journals.
評估人員使用了定量和定性的測量方法,包括兩項反應調查;通過角色扮演和籃子裡的情景對學習和行為進行的定量評估;多因素領導力問卷(也用於測量學習和行為);以及對參加者的教育和職業計劃(結果)進行的調查。定性資料包括對反應調查中開放式問題的回應、導師和主管對該計劃的書面結論以及參與者日記。

Management and Budget  管理與預算

No information is included in the article about this topic.
文章中未包含此主題的相關資訊。

Meta-Evaluation  元評估

The evaluators do not directly address meta-evaluation; however, they do offer commentary about the unreliability of some of their measures and suggest that these might be changed in future evaluations of this type.
評估人員並沒有直接討論元評估;然而,他們確實對某些測量工具的不可靠性提出了評論,並建議在未來的同類型評估中可能會改變這些工具。

Reports and Utilization
報告與使用

No specific uses of the evaluation are mentioned, but the evaluators do conclude that their study’s overall findings support the use of this type of program to identify potential leaders for school systems and to help program participants determine whether they should pursue administrative positions.
沒有提到評估的具體用途,但評估人員確實得出結論,他們的研究的整體結果支持使用這種類型的計劃來為學校系統發掘潛在的領導者,並幫助計劃參與者決定他們是否應該追求行政職位。

Evaluation of Training Programs
訓練計畫評估

Methods Branch  方法分支

| Sample study  樣本研究 | Evaluation approach  評估方法 | List the distinguishing characteristics  列出識別特徵 |
| :--- | :--- | :--- |
| Busch, O'Brien, and Spangler (2005) | Mixed methods training evaluation; Kirkpatrick's model  混合方法訓練評估;Kirkpatrick 模式 | |

Using the description of the Busch et al. (2005) study in Box 3.7, please respond to the following questions:
使用方框 3.7 中關於 Busch 等人(2005 年)研究的說明,請回答下列問題:
  1. What characteristics illustrate Busch et al.'s use of the Kirkpatrick model of evaluation?
    Busch 等人使用 Kirkpatrick 評估模型有哪些特點?
  2. What do you identify as the strengths and weaknesses of this model in terms of the type of data that were collected by the evaluators?
    就評估人員所收集的資料類型而言,您認為此模式的優缺點為何?
  3. What would you suggest modifying in this study to improve the usefulness of the results?
    為了提高結果的實用性,您建議修改本研究的哪些內容?
The school leadership study illustrates the Kirkpatrick four-level model as follows: The evaluators (1) used two reaction surveys to determine how much the participants enjoyed the training (the reaction stage); (2) tested the transfer of learning and changes in behavior through the use of role plays, a questionnaire, and an in-basket scenario (the learning and behavior stages); and (3) obtained information about the participants’ educational and career plans (the results stage). Other studies to be discussed in later chapters demonstrate additional approaches that are commensurate with the postpositivist paradigm and the Methods Branch.
學校領導力研究說明 Kirkpatrick 四層模型如下:評估人員(1)使用兩項反應調查來確定參加者對訓練的喜好程度(反應階段);(2)透過角色扮演、問卷調查和籃子裡的情境來測試學習的轉移和行為的改變(學習和行為階段);以及(3)取得有關參加者的教育和職業計畫的資訊(結果階段)。稍後各章將討論的其他研究展示了與後實證主義範式和方法分支相稱的其他方法。

Critiques of the Methods Branch
對方法分支的批評

Lack of Applicability in the "Real World"
在「現實世界」中缺乏適用性

Some criticisms of the approaches associated with the Methods Branch provide a transition to Chapter 4 on the Use Branch. For example, Stufflebeam (2003, p. 27) has critiqued the theory-based approach to evaluation, saying that it makes little sense because it “assumes that the complex of variables and interactions involved in running a project in the complicated, sometimes chaotic conditions of the real world can be worked out and used a priori to determine the pertinent evaluation questions and variables.” He continues:
對方法分支相關方法的一些批評,為第四章的使用分支提供了一個過渡。舉例來說,Stufflebeam (2003, p. 27)批判了以理論為基礎的評估方法,他說這種方法意義不大,因為它 「假設在複雜、有時混亂的現實世界條件下執行一個專案時所涉及的複雜變數和互動關係,可以事先計算出來並用來決定相關的評估問題和變數」。他繼續說:
Many evaluation plans that appeared in proposals were true to the then current evaluation orthodoxy, i.e., evaluations should determine whether valued objectives had been achieved and met requirements of experimental design and post hoc, objective measurement. This conceptualization was wrong for the situations I found in Columbus classrooms. At best, following this approach could only confirm schools’ failures to achieve (dubious) objectives. Such evaluations would not help schools get projects on track and successfully meet the education needs of poor kids. (Stufflebeam, 2003, p. 30)
許多出現在提案中的評估計畫都忠於當時的正統評估方法,即評估應該確定是否達成了有價值的目標,並符合實驗設計與事後客觀測量的要求。對於我在哥倫布教室發現的情況來說,這種概念是錯誤的。採用這種方法充其量只能證實學校未能達到(可疑的)目標。這樣的評估無法幫助學校讓計畫走上軌道,也無法成功滿足貧困孩子的教育需求。(Stufflebeam,2003,第 30 頁)

Pros and Cons of Experimental Approaches
實驗方法的利弊

A portion of the evaluation community agrees that well-conducted randomized experiments are best suited for assessing effectiveness when multiple causal influences create uncertainty about what caused results. However, they are often difficult (and sometimes impossible) to carry out. An evaluation must be able to control exposure to the intervention and to ensure that treatment and control groups’ experiences remain separate and distinct throughout the study.
部分評估社群同意,當多重因果影響造成結果的不確定性時,進行良好的隨機實驗最適合評估成效。然而,這些實驗通常很難進行(有時甚至不可能)。評估必須能夠控制干預的暴露,並確保治療組和控制組的經驗在整個研究過程中保持分離和不同。
Several rigorous alternatives to randomized experiments are considered appropriate for other situations: quasi-experimental comparison group studies, statistical analyses of observational data, and (in some circumstances) in-depth case studies. The credibility of their estimates of program effects relies on how well the studies’ designs rule out competing causal explanations. Collecting additional data and targeting comparisons can help rule out other explanations (Kingsbury, Shipman, & Caracelli, 2009).
在其他情況下,有幾種嚴謹的方法可以取代隨機實驗:準實驗性比較小組研究、觀察資料的統計分析,以及(在某些情況下)深入的個案研究。這些研究對計畫效果的估計是否可信,取決於研究的設計能否很好地排除相互競爭的因果解釋。收集額外的資料並針對比較,有助於排除其他解釋(Kingsbury, Shipman, & Caracelli, 2009)。
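
The reasoning above, that a design is credible to the extent it rules out competing causal explanations, can be illustrated with a small simulation. This is only a sketch under invented assumptions: the outcome scale, the "motivation" variable, the effect size of 2.0, and the function name `estimate_effects` are all hypothetical and do not come from any study discussed in this chapter. It compares a randomized comparison with one in which more motivated people select themselves into the program.

```python
import random
import statistics


def estimate_effects(n=20000, true_effect=2.0, seed=0):
    """Return (randomized_estimate, self_selected_estimate) of a program effect.

    All quantities are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    rand_treat, rand_ctrl = [], []
    self_treat, self_ctrl = [], []
    for _ in range(n):
        # Motivation is a competing causal influence: it raises the outcome
        # whether or not the person receives the program.
        motivation = rng.gauss(0, 1)
        baseline = 10 + 3 * motivation + rng.gauss(0, 1)

        # Design 1: random assignment, so motivation is balanced across groups.
        if rng.random() < 0.5:
            rand_treat.append(baseline + true_effect)
        else:
            rand_ctrl.append(baseline)

        # Design 2: self-selection, so only the more motivated people enroll.
        if motivation > 0:
            self_treat.append(baseline + true_effect)
        else:
            self_ctrl.append(baseline)

    randomized = statistics.mean(rand_treat) - statistics.mean(rand_ctrl)
    self_selected = statistics.mean(self_treat) - statistics.mean(self_ctrl)
    return randomized, self_selected


if __name__ == "__main__":
    randomized, self_selected = estimate_effects()
    print(f"randomized estimate:    {randomized:.2f}")
    print(f"self-selected estimate: {self_selected:.2f}")
```

In this toy setup the randomized difference in means should land close to the assumed effect, while the self-selected comparison mixes the program effect with the effect of motivation. A quasi-experimental evaluator would try to narrow that gap by collecting data on motivation and comparing like with like.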

EXTENDING YOUR THINKING  擴展思維

Methods Branch Evaluations
方法分支評估

  1. On the U.S. Department of Education’s What Works Clearinghouse website (ies.ed.gov/ncee/wwc), go to “Publications and Reviews” and then “Intervention Reports.” You will see a list of different topics that all lead to summaries of program evaluations done by the U.S. government or its grantees. After you find a study that is interesting to you, answer these questions:
    在美國教育部的 What Works Clearinghouse 網站上 ( ies.ed.gov/ncee/wwc),移至「出版與評論」,然後移至「干預報告」。您會看到一個不同主題的清單,這些主題都指向美國政府或其受資助者所做的計畫評估摘要。找到您感興趣的研究之後,請回答這些問題:

    a. Did the study use RCTs, or did it use a quasi-experimental design?
    a. 該研究是使用 RCT,還是使用準實驗設計?

    b. What was the independent variable?
    b. 自變量是什麼?

    c. What was the experimental group?
    c. 試驗組是什麼?

    d. What was the control group?
    d. 對照組是什麼?

    e. What were the dependent variables?
    e. 因變數是什麼?

    f. Who were the stakeholders?
    f. 誰是利害關係人?

    g. What methodology was used?
    g. 使用了什麼方法?

    h. What were the results of the study?
    h. 研究結果如何?

    i. What branch of our tree does this study illustrate and why?
    i. 本研究說明了樹的哪個分支,為什麼?
  2. As noted above, Stufflebeam (2003) has written: “This conceptualization was wrong for the situations I found in Columbus classrooms. At best, following this approach could only confirm schools’ failures to achieve (dubious) objectives” (p. 30). Do you agree with Stufflebeam that discovering failures is a reason to look for other approaches to evaluation? Why or why not?
    如上文所述,Stufflebeam (2003) 曾寫道:「對於我在哥倫布教室發現的情況來說,這種概念是錯誤的。充其量,採用這種方法只能證實學校未能達到(可疑的)目標」(第 30 頁)。你是否同意 Stufflebeam 的說法:發現失敗是尋找其他評估方法的理由?為什麼?
  3. Describe the similarities and differences among an experimental approach, a quasi-experimental approach, a theory-based approach, and training program evaluations.
    說明實驗法、準實驗法、理論法和訓練計畫評估之間的異同。
  4. From your perspective, what do you think are some of the pros and cons of working in the Methods Branch?
    從您的角度來看,您認為在方法科工作有哪些優點和缺點?

Your Evaluation Plan: Your Philosophical Stance
您的評估計劃:您的哲學立場

Begin writing your understandings of the postpositivist paradigm and the Methods Branch as a way of clarifying your own thinking about your philosophical beliefs and how they might influence the way you conduct an evaluation. This perspective can become part of your evaluation plan later, when you decide which approach you will use.
開始寫下你對後實證主義範式和方法分支的理解,以此來釐清你自己對哲學信念的思考,以及它們可能如何影響你進行評估的方式。稍後,當您決定使用哪一種方法時,這個觀點可以成為您評估計畫的一部份。

Moving On to the Next Chapter
邁向下一章

Evaluators expressing concerns with a narrow focus on methods have proposed that evaluators begin to give more attention to their interactions with stakeholders and to how those interactions influence the use of the evaluation findings. The Use Branch is the topic of Chapter 4, which explores those theorists who have focused on the need to be responsive to the stakeholders in order to improve the probability that their findings will be used to improve programs and for other uses that are appropriate.
對於狹隘地專注於方法的評價者,建議評價者開始更多地關注他們與利害關係人的互動,以及這些互動如何影響評價結果的使用。使用分支是第四章的主題,這一章探討了那些理論家,他們著重於回應利害關係人的需求,以提高他們的發現被用來改善計畫和其他適當用途的可能性。
Remember the studies below, as we refer to them again in later chapters.
請記住以下的研究,因為我們會在以後的章節中再次參考它們。
Methods Branch  方法分支

| Sample study  樣本研究 | Evaluation approach  評估方法 | Topical area  專題領域 |
| :--- | :--- | :--- |
| Brady and O'Regan (2009) | Mixed methods with randomized control design (also, theory-based evaluation; logic model)  採用隨機控制設計的混合方法(亦為以理論為基礎的評估;邏輯模型) | Youth mentoring in Ireland  愛爾蘭的青年指導 |
| Duwe and Kerschner (2008) | Quasi-experimental design; cost-benefit analysis  準實驗設計;成本效益分析 | Preventing return to prison in the United States  防止在美國重返監獄 |
| Fredericks, Deegan, and Carman (2008) | Mixed methods theory-based evaluation  以理論為基礎的混合方法評估 | Demonstration program for people with developmental disabilities  針對發展障礙人士的示範計畫 |
| Busch, O'Brien, and Spangler (2005) | Mixed methods training evaluation using Kirkpatrick's model  使用 Kirkpatrick 模型進行混合方法培訓評估 | School leadership  學校領導 |

Note  注意事項

  1. Alkin included Carol Weiss in the Methods Branch, but we think she belongs in the Use Branch because much of her work has focused on how to improve the use of evaluation findings.
    Alkin 將 Carol Weiss 列入方法分部,但我們認為她屬於使用分部,因為她的大部分工作都集中在如何改善評估結果的使用。

Preparing to Read Chapter Four
準備閱讀第四章

| Branch  分支 | Paradigm  範式 | Description  說明 |
| :--- | :--- | :--- |
| Methods  方法 | Postpositivist  後實證主義 | Focuses primarily on quantitative designs and data, but mixed methods can be used  主要著重於定量設計和資料,但也可使用混合方法 |
| Use  使用 | Pragmatic  實用主義 | Focuses primarily on data that are found to be useful by stakeholders; advocates for the use of mixed methods  主要著重於利害關係人認為有用的資料;提倡使用混合方法 |
| Values  價值 | Constructivist  建構主義 | Focuses primarily on identifying multiple values and perspectives through qualitative methods, but can be used for mixed methods  主要著重於透過定性方法識別多重價值觀與觀點,但也可用於混合方法 |
| Social Justice  社會正義 | Transformative  變革 | Focuses primarily on viewpoints of marginalized groups and interrogating systemic power structures through mixed methods to further social justice and human rights  主要著重於邊緣族群的觀點,並透過混合方法審視系統性的權力結構,以促進社會正義與人權 |

As you prepare to read this chapter, think about these questions:
當您準備閱讀本章時,請思考這些問題:
  1. What are the characteristics of the pragmatic paradigm?
    實用主義範式的特徵是什麼?
  2. How do those characteristics influence the practice of evaluation?
    這些特性如何影響評估的實踐?
  3. Which major thinkers have contributed to the approaches associated with the Use Branch?
    哪些主要思想家對與使用分支相關的方法做出了貢獻?
  4. How did the ideas grow from the early days to the present in this theoretical context?
    在這種理論背景下,思想是如何從早期發展到現在的?