Systematic Screening for Behavior Disorders (SSBD) Technical Manual
Universal Screening for PreK–9

Hill Walker, Ph.D.
Herbert H. Severson, Ph.D.
Edward G. Feil, Ph.D.

SECOND EDITION

Copyright © 2014 by Hill M. Walker, Herbert H. Severson, and Edward G. Feil
All rights reserved.
Cover and interior design by Aaron Graham

The purchaser is granted permission to use, reproduce, and distribute the reproducible forms in the book and on the CD solely for use in a single classroom. Except as expressly permitted above and under the United States Copyright Act of 1976, no parts of this work may be used, reproduced, or distributed in any form or by any means, electronic or mechanical, without the prior written permission of the publisher.

Published in the United States by Pacific Northwest Publishing
21 West 6th Avenue
Eugene, OR 97401

ISBN 978-1-59909-065-8

Pacific Northwest Publishing
Eugene, Oregon | www.pacificnwpublish.com

TABLE OF CONTENTS

List of Figures and Tables  iii
Acknowledgments  v
Introduction  1
SSBD National Standardization Sample  1
    Grades 1–6  2
    Preschool and Kindergarten  5
Supplemental SSBD Norms  6
SSBD Instrument Development Procedures  7
Phase 1: Initial Development of SSBD Instruments  7
    Stage 1 Instruments  7
        Interrater Agreement  8
        Test-Retest Reliability  8
        Sensitivity  9
    Stage 2 Instruments  9
    SIMS Behavior Observation Codes  10
Phase 2: Trial Testing, Field Testing, and Validation of SSBD Instruments  11
    Trial Testing  11
        Stage 1 Instruments  15
        Stage 2 Instruments  16
        SIMS Behavior Observation Codes  19
            Discriminating Externalizers, Internalizers and Controls  20
            Efficiency in Classifying Participant Groups  21
            Sex Differences  23
            Intercorrelations Among Stage 2 Measures and SIMS Behavior Observation Code Variables  24
    SSBD Field Testing and Replication  27
    Validation Studies of the SSBD  30
        Reliability  31
            Test-Retest  31
            Internal Consistency  31
            Interrater  32
        Validity  33
            Item Validity  33
            Factorial Analysis  34
            Concurrent Validity  36
            Discriminant Validity  36
            Construct Validity  53
            Social Validity  56
Phase 3: Extensions of SSBD Instruments  57
    The Early Screening Project: Using the SSBD with Preschool and Kindergarten Students  57
        History and Development of ESP  57
        Validation Studies: Reliability  59
            Interrater Reliability  59
            Test-Retest Reliability  60
            Consistency Across Measures  60
        Validation Studies: Validity  61
            Content Validity  61
            Concurrent Validity  62
            Discriminative Validity  62
            Treatment Utility  63
        Summary of ESP Technical Adequacy  64
    Using the SSBD with Students in Grades 7–9  65
Update on SSBD Research and Outcomes  66
    Research Conducted by Other Professionals  66
    Research Conducted by the SSBD Authors and Colleagues  68
Conclusion  70
References  73
Appendix A: Normative Comparisons: SSBD Original Norms and Updated Supplemental Normative Databases  79
Appendix B: SSBD Bibliography  89

LIST OF FIGURES AND TABLES

Figure 1: Means of Children Ranked Highest on Externalizing Dimension, Internalizing Dimension, and Nonranked Peers on T-Scores of ESP Measures  61
Figure 2: First Steps Intervention Results on the SSBD  63
Table 1: Proportion of Cases on SSBD Stage 2 and SIMS Behavior Observation Codes by Standardization Sample Size  2
Table 2: SSBD Standardization Sample Demographic Characteristics  4
Table 3: Number and Age of Children in the ESP Normative Sample  6
Table 4: Test-Retest Stability Coefficients for Individual Teachers on the Stage 1 Rank-Ordering Procedures  16
Table 5: Item-Total Correlations for the Stage 2 Behavior Scales Across Time 1 and Time 2 Rating Occasions  17
Table 6: Means, Standard Deviations, and ANOVAs for the Three Participant Groups on the Stage 2 Instruments  18
Table 7: Means, Standard Deviations, and ANOVAs for the Three Participant Groups on the Classroom and Playground Observation Measures  20
Table 8: Scheffé Analysis of Mean Differences on Discriminating SSBD Stage 2 and SIMS Behavior Observation Codes Variables  21
Table 9: Correlations Between Predictor Variables and Group Membership and Corresponding Beta Weights  23
Table 10: Sex Differences on Stage 2 and SIMS Behavior Observation Codes Variables for Combined Participant Groups  23
Table 11: Correlation Matrix for Stage 2 and SIMS Behavior Observation Codes Variables  24
Table 12: Item-Total Correlations for the Stage 2 Adaptive and Maladaptive Rating Scales  33
Table 13: Adaptive and Maladaptive Behavior Rating Scale Factor Structure and Item Loadings  35
Table 14: PSB Code Category and Code Category Combination Means, Standard Deviations, and Significance Tests by Participant Group  38
Table 15: Comparison of Frequency of Items Checked on the Critical Events Index for Externalizing and Internalizing Elementary Students Who Met Risk Criteria on the SSBD  40
Table 16: Means, Standard Deviations, and Significance Tests for Four Participant Groups of North Idaho Children's Home Residents  41
Table 17: Chi-square Analysis of Critical Events Items for Four North Idaho Children's Home Participant Groups  43
Table 18: Means, Standard Deviations, and Significance Tests for Fourth-Grade Externalizing, Internalizing, and Nonranked Students  45
Table 19: Means, Standard Deviations, and Significance Tests for Participant Groups on the Adaptive and Maladaptive Rating Scale Items  47
Table 20: SARS Behavior Profiles for Externalizing, Internalizing, and Nonranked Control Students  48
Table 21: Means, Standard Deviations, and Significance Tests for Isolate and Nonisolate Participants on Teacher Social Skills Ratings and SSBD Stage 2 and SIMS Observation Codes Measures  51
Table 22: Correlations Between Year One and Year Two Follow-up Scores on SSBD Stage 2 Measures for Combined Externalizing and Internalizing Groups  52
Table 23: SIMS Behavior Observation Codes Predictor Variables for Discriminant Analysis Classifying Previous Year's Participant Group Status  52
Table 24: Correlations Between SSBD Stage 2 Measures and Achenbach TRF Scales and SSRS Scales  69
Table 25: Similarities in SSBD Score Profiles for Normative and Research-Based Samples  81
Table 26: Original SSBD Norms vs. Supplemental Practice-Research Samples  83
Table 27: Profiles for Externalizing and Internalizing Students Meeting vs. Not Meeting SSBD Stage 2 Risk Criteria  84
Table 28: Descriptive Statistics for SSBD Stage 2 Measures  85
Table 29: Lane et al. Supplemental Norms From Research Conducted in the U.S. Southeast  86
Table 30: Meeting/Not Meeting SSBD Stage 2 Risk Criteria by Ethnicity and Externalizing vs. Internalizing Status  87

ACKNOWLEDGMENTS

A large number of professionals have made important contributions to the research and development of the SSBD system. Project staff members and colleagues of the authors who participated directly in the research process on the SSBD were Bonnie Todis, Alice Block-Pedego, Maureen Barckley, Greg Williams, Norris Haring, and Richard Rankin. Their contributions and dedication always met the highest standards of professionalism.

The SSBD system was field-tested at a number of sites around the country in order to develop its normative database and test its efficacy. In particular, Vicki Phillips, Marilyn McMurdie, and Gayle Richards of the Kentucky Department of Education and State of Utah made enormous contributions to this process. Their generosity, dedication, and contributed time were outstanding and are greatly appreciated.

Fulvia Nicholson of the Jordan School District in Utah conducted a full-scale, year-long replication of the SSBD through a grant from the Utah Department of Education. Her skill, professional dedication, and generosity were instrumental in making this a highly successful replication. The authors are indebted to her for these consistently high-quality efforts.

Linda Colson and Lisa York of Illinois also cooperated with the authors and their staff in testing the SSBD over a year-long period. Our thanks and gratitude are also extended to them for the quality and generosity of their efforts.

Other individuals who made important contributions to the SSBD's development include Ken Reavis, Stevan Kukic, Steve Forness, Bill Jenson, Mike Nelson, Ken Sturm, Ray Lamour, Kathy Ludholtz, Gary Adams, Hyman Hops, Lew Lewin, Peter Nordby, Bob Hammond, Bob Lady, and Kathy Keim-Robinson.

We would especially like to acknowledge the professional colleagues who more recently shared their SSBD research and practice databases with us to supplement and substantially expand our original normative base of 4,463 cases. These supplemental norms comprised nearly 7,000 additional cases drawn from five different regions of the United States. We acknowledge the following individuals for their invaluable efforts in this regard: Doug Cheney, Lucille Eber, Kathleen Lane, Gale Naquin, Jen Rose, Jason Small, Scott Stage, and Rich and Ben Young and their colleagues at Brigham Young University.
These individuals were also instrumental in conducting data runs and analyses that made it possible to maximally utilize these normative case data. We are most indebted to them for their generosity and support.

The validation and norming of the SSBD over a 5-year period were supported in part by research and model development grants to the authors from a series of federal and state agencies.

Finally, we would like to acknowledge the excellent work of Jason Small, a research analyst at the Oregon Research Institute, for his comprehensive analysis of the psychometric characteristics of the SSBD resulting from three large-scale evaluation studies of the First Step program in which the SSBD was used as a universal screener.

INTRODUCTION

This document describes the development, trial testing, validation studies, and norming procedures and outcomes for the Systematic Screening for Behavior Disorders (SSBD) screening system. These initial activities occurred over a 5-year period prior to the SSBD's publication. The development and validation of other measures in the Screening, Identification, and Monitoring System (SIMS), including the School Archival Records Search (SARS) and SIMS Behavior Observation Codes, occurred in conjunction with work around SSBD Stages 1 and 2. As part of SSBD development and validation, the SARS and SIMS Behavior Observation Codes were often administered to students who met risk criteria at Stage 2. Therefore, this manual also presents technical information around the development, validation, and use of these SSBD follow-up assessments in conjunction with Stage 1 and 2 measures.

SSBD NATIONAL STANDARDIZATION SAMPLE

At the completion of screening Stage 2 and/or use of the SIMS Behavior Observation Codes, data and information are available to make normative comparisons. These normative data allow schools to determine how an individual student compares with his or her peers on dimensions assessed by the SSBD, and can help determine the student's specific behavioral status and possible eligibility for referral, special education certification, access to interventions, and/or specialized services and supports. Normative data are presented in Appendix A tables of the Administrator's Guide and should be of value to professionals during decision making about potentially at-risk students. Normative data were also used to identify cutoffs for decision rules at Stage 2 that are associated with risk for externalizing or internalizing disorders. The composition of this national standardization sample is described in the next sections, by grade level range.

National Standardization Sample: Grades 1–6

The national standardization sample for the SSBD comprises approximately 4,400 cases (N = 4,463) on the Stage 2 measures and approximately 1,200 cases (N = 1,219) on the SIMS Behavior Observation Codes. These cases were developed within 17 school districts located in 8 states across the country: Oregon, Washington, Utah, Illinois, Wisconsin, Rhode Island, Kentucky, and Florida. Table 1 contains the proportion of the total cases in the standardization sample from each of these sites for both Stage 2 measures and SIMS Behavior Observation Codes variables.
Table 1
Proportion of Cases on SSBD Stage 2 Measures and SIMS Behavior Observation Codes by Standardization Sample Size

SSBD STAGE 2 MEASURES
State           n        % Total Sample
Florida         82       1.8
Washington      280      6.3
Illinois        198      4.4
Kentucky        1,144    25.6
Oregon          1,284    28.8
Rhode Island    261      5.8
Utah            1,038    23.3
Wisconsin       176      3.9
Total           4,463    100

SIMS BEHAVIOR OBSERVATION CODES
State           n        % Total Sample
Washington      77       6.3
Illinois        99       8.1
Kentucky        212      17.4
Oregon          455      37.3
Utah            316      25.9
Rhode Island    60       4.9
Total           1,219    100

This sample was developed over a 2-year period spanning the 1987–88 and 1988–89 school years. The development of the sample was made possible through state education department (Utah and Kentucky) and school district contacts of the authors. Two sites (Illinois and Wisconsin) contacted the authors regarding participation in the standardization process and in facilitating a field test of the SSBD.

Correlations were computed between the Stage 2 measures and the grade and sex of students in the standardization sample. For the Critical Events Index and the Adaptive and Maladaptive Behavior Scales, the correlations with grade were .02, −.04, and .00, respectively. The corresponding correlations with sex of student were −.18, .28, and −.26. Although several of these correlations reached significance (p < .05), they were clearly in the low range of magnitude and, in the authors' estimation, did not justify the creation of separate samples and distributions based on grade or sex. The SIMS Behavior Observation Codes AET code and some of the PSB code categories, however, showed substantial age (AET) and/or sex differences (e.g., participation, social engagement, as well as positive and negative social interaction). Thus, separate distributions were calculated by age and sex of student on these variables for externalizers, internalizers, and nonranked participants. (See the SSBD Administrator's Guide and SSBD Observer Training Manual for tables resulting from these distributions.)

The authors were able to obtain data on the demographic and socioeconomic status characteristics for 12 of the 17 school districts participating in the standardization sample development effort. Table 2 displays this information by total school district enrollment, total number and proportion of non-White students, and the total proportion of students coming from low-income homes. Non-White proportions of the school population across these districts ranged from less than 1% to 33%. The proportion of students coming from low-income families ranged from 4.3% to 40%. Across school districts in the standardization sample, both non-White and low-income student status appeared to be broadly represented.
Table 2
SSBD Standardization Sample Demographic Characteristics

State / District     Total Enrollment    Total Non-White Enrollment    %       Total Low-Income Enrollment    %
Oregon
  Springfield        —                   —                             3.0     —                              33.0
  Park Rose          1,254               179                           14.3    —                              35.0
Kentucky
  Fayette            17,686              —                             24.6    —                              25.9
  Ohio               4,157               55                            1.3     1,524                          37.0
  Owen               —                   —                             <1.0    —                              40.0
  Henderson          1,732               —                             7.3     —                              24.4
Illinois
  SASED              5,354               625                           12.0    12                             4.3
  Dist. #33          2,026               590                           29.0    468                            23.0
  Dist. #34          239                 5                             <1.0    0                              —
  Dist. #25          473                 1                             <1.0    25                             5.0
Utah
  Granite            76,799              6,374                         8.3     12,364                         16.1
  Jordan             62,281              3,346                         5.4     6,237                          10.6
Washington
  Tacoma             29,268              9,671                         33.0    11,414                         39.0
  Peninsula          7,064               376                           5.3     805                            12.3
Florida              62,778              25,738                        31.0    —                              —
Rhode Island         5,659               1,129                         20.0    2,175                          38.0
Wisconsin            879                 10                            1.1     59                             6.7

Standard score and percentile distributions of externalizing, internalizing, and nonranked student cases are presented and discussed in the SSBD Administrator's Guide. Cutoff scores based on these distributions are used as decision criteria for determining whether individual students meet criteria for risk at Stage 2 and may benefit from additional assessments, referral, certification, and access to needed supports and specialized services. Complete instructions for making these decisions are contained in the SSBD Administrator's Guide.

National Standardization Sample: Prekindergarten and Kindergarten

The normative sample for prekindergarten and kindergarten was developed as part of the Early Screening Project (ESP). The sample consisted of 2,853 children, aged 3 to 6 years old, who were enrolled in typical and specialized programs from 1991 to 1994. Because the SSBD uses a gating procedure and a comparison group, a decreasing number of children participated across stages. Of the 2,853 children beginning in Stage 1, 1,401 (49%) moved to Stage 2 and 541 (19%) were assessed with the SIMS Behavior Observation Codes. The participating children were from preschool and kindergarten classrooms in the following states: California (n = 517), Kentucky (n = 687), Louisiana (n = 386), Nebraska (n = 65), New Hampshire (n = 25), Oregon (n = 220), Texas (n = 612), and Utah (n = 341). The specialized preschools included programs for children identified as having serious emotional/behavioral disorders, having developmental and language delays, and living in families with low incomes (Head Start).

The sample consisted of 46% females and 54% males, with most of the children not eligible for special education services (78%). Of those who did qualify for special education services, 2% were eligible under the behavioral disorder category, 14% under developmental or language delay, and 6% under other categories (e.g., at risk and other health impaired). Sixty-nine percent of the children were White (as reported by their teachers), with 16%, 12%, and 3% reported as Hispanic, Black, and Native American or Asian, respectively. Family income (as reported by teachers) was "middle" income ($15,000–$75,000/year) for 39% of families, while a substantial portion of families (58%) were reported to be "low" income (less than $15,000/year or Head Start eligible). Of the 1,304 families with low incomes, 974 had children enrolled in Head Start. Community size was 10% urban (over 1 million), 6% semi-urban (between 250,000 and 1 million), 21% suburban, and 63% rural (less than 100,000).

Table 3 uses data from the Early Screening Project and concurrent measures collected over a 3-year period (from September 1991 through June 1994). This research involved separate but related studies conducted for the purpose of replicating and extending findings on the reliability and validity of the instrument with preschool and kindergarten students.
Table 3
Number and Age of Children in the ESP Normative Sample

Age             Stage 1    Stage 2    SIMS Behavior Observation Codes
Not reported    140        61         5
3 years old     260        137        61
4 years old     1,463      721        278
5 years old     915        448        179
6 years old     75         34         18
Total           2,853      1,401      541

SUPPLEMENTAL SSBD NORMS

In the last several years, the SSBD authors have been able to recruit new research and practice SSBD data and results from a series of ten sites and colleagues from across the United States. A number of professional colleagues have developed substantial databases from conducting research studies in which the SSBD was used as a study measure or was the focus of the research. We find that SSBD data from numerous research studies are a close match to our original norms when they are collected in exactly the same fashion and under similar conditions. This result argues for the relevance of the original norms in decision making regarding today's students, as such normative student profiles have remained stable across school years. Based on the stability of the SSBD normative behavior levels, it is justifiable to retain the original cutoff points when using the SSBD as a universal screener (i.e., decision rules and risk criteria cutoffs at Stage 2) and as a determinant for optional, additional screening and/or access to supports and intervention services. The new supplemental norms of 6,743 cases for externalizers and internalizers, generated for students who do and do not meet Stage 2 risk criteria, provide important and highly consistent benchmarks across regional sites for evaluating the behavioral status of today's students. A presentation of these updated norms is provided in Appendix A: Normative Comparisons: SSBD Original Norms and Updated Supplemental Normative Databases.

SSBD INSTRUMENT DEVELOPMENT PROCEDURES

Construction and testing of the measures that make up SSBD screening Stages 1 and 2 and the SIMS Behavior Observation Codes are described herein. Research on the SSBD's development has been conducted in three phases. In Phase 1, research efforts were focused on the initial development and testing of SSBD instruments, definitions, and response formats of measures used across the screening stages. These efforts occurred over a 1-year development period. In Phase 2, 4 years of research and development were devoted to validation and field testing of the developed measures. Lastly, Phase 3 is characterized by research and applied work in extending the SSBD to other populations and settings.

Phase 1: Initial Development of SSBD Instruments

Stage 1 Instruments

Three separate versions of the SSBD Stage 1 definitions and rating formats were investigated and evaluated prior to selection of those included in the final version of the SSBD. Each prototype version was trial-tested with teachers and aides in elementary classrooms in school districts in Oregon and Washington. Three criteria were used in evaluating these prototype versions:

•• Interrater Reliability: The degree to which teachers with identical amounts of exposure to the same students agree in their rank orderings of them on externalizing and internalizing behavioral dimensions.

•• Test-Retest Reliability: The extent to which teacher rankings of students are stable over time.
•• Sensitivity: The accuracy of the procedures in identifying students in general education classrooms who had been previously certified by a child study team as having behavioral disorders.

These criteria guided revisions of the SSBD Stage 1 instruments and procedures during the development process. Their application is described below.

Interrater Agreement

Initial testing of the first prototype version of the Stage 1 definitions and rank-ordering procedures yielded Spearman rank-order correlations (rhos) among pairs of teachers, and teachers and aides, ranging from .60 to .94 for the externalizing rank-ordering dimension and from .35 to .72 for the internalizing dimension. Although these agreement levels were promising, they were not sufficiently high to achieve the authors' Stage 1 goal of reliably identifying students potentially at risk for externalizing and internalizing behavior disorders. Stage 1 is arguably the most important of the SSBD screening stages because it determines which nominated students are included in subsequent screening stages and thereby qualify for further assessment(s), possible referral, and access to supports and services. Consequently, the Stage 1 procedures were revised to achieve greater behavioral specificity and precision in the externalizing and internalizing definitions and to simplify the ranking procedure.

The revised version of the Stage 1 procedures improved interrater agreement for the externalizing dimension but reduced it for the internalizing dimension. Spearman rhos, computed for two teachers and an aide on the externalizing dimension, ranged from .89 to .99; on the internalizing dimension, however, the range was a disappointing .11 to .28. The internalizing rhos and feedback from participating teachers indicated that the internalizing definition was still ambiguous and lacked sufficient clarity for effective use. Following this phase, the internalizing definition was rewritten a second time to provide additional behavioral specificity. This version of the Stage 1 procedures was then trial tested using eight pairs of teachers and teachers and aides. Noticeable improvements in agreement levels were obtained. Spearman rhos for participating teachers across the two sites ranged from .89 to .94 for the externalizing dimension and from .82 to .90 for the internalizing dimension. The agreement levels achieved were considered acceptable for achieving Stage 1 screening goals.

Test-Retest Reliability

A series of studies was conducted, also in the Oregon and Washington sites, on the temporal stability of the revised version of the Stage 1 procedures. Ten teachers participated in these studies, and temporal interval lengths ranged from 10 days to 1 month. Test-retest estimates over these time intervals ranged from .81 to .88 for the externalizing dimension and from .74 to .79 for the internalizing dimension. These estimates, in the authors' view, met acceptable standards of temporal stability.
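Both the interrater agreement and the test-retest estimates above are Spearman rank-order correlations between two orderings of the same students. The following sketch shows how such a coefficient could be computed; the rankings are hypothetical illustrations, not data from the studies described here.

```python
# Hedged sketch: computing a Spearman rank-order correlation (rho) between
# two raters' Stage 1 rank orderings of the same ten students.
# The ranks below are hypothetical, not SSBD study data.
from scipy.stats import spearmanr

# Rank 1 = most like the externalizing profile, 10 = least like it.
teacher_ranks = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
aide_ranks    = [2, 1, 3, 5, 4, 6, 8, 7, 10, 9]

rho, p_value = spearmanr(teacher_ranks, aide_ranks)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")

# The same computation applies to test-retest stability: substitute one
# teacher's Time 1 and Time 2 rankings for the two lists above.
```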
Sensitivity

To test the sensitivity of the Stage 1 procedures, nine general education teachers in kindergarten through sixth grade were identified in whose classrooms ten certified students with behavioral disorders (BD) had been placed previously by the school district. These teachers were not informed of the purpose of the study but were asked simply to complete the Stage 1 ranking procedures using all students enrolled in their classrooms. It was assumed that if the SSBD were sensitive to behavioral differences among students enrolled in least restrictive environment (LRE) settings, previously certified BD students would be ranked high relative to other students on the Stage 1 externalizing and internalizing behavioral dimensions. This proved to be the case. The Stage 1 procedures identified nine of the ten BD students as being within the highest three ranks on the externalizing dimension; the remaining pupil was ranked fifth on the internalizing dimension.

Stage 2 Instruments

The SSBD Stage 2 screening instruments (Critical Events Index and Combined Frequency Index of Adaptive and Maladaptive Behavior) were developed from prototype item lists contributed by Walker and his colleagues (Hersh & Walker, 1983; Walker, 1982; Walker, Reavis, Rhode, & Jenson, 1985). The items that made up these three lists had been trial tested extensively in prior studies, refined, and socially validated by both regular and special education teachers as measures of teacher behavioral standards and academic expectations for general education students (Walker, 1986; Walker & Rankin, 1983). Additional items included in the Critical Events Index (CEI) prototype list were based on externalizing and internalizing dimensions as conceptualized by Achenbach and Edelbrock (1979) and Ross (1980). These lists were informally trial tested in the Oregon site using a sample of 15 cooperating elementary teachers who rated the behavioral status of randomly selected students on them. Feedback from teachers regarding the items and inspection of means and variances were used as a basis for revising these items.

SIMS Behavior Observation Codes

The observation codes used in conjunction with SSBD Stages 1 and 2 were derived from codes developed by Walker and colleagues for recording pupil behavior within instructional and playground settings in prior research (Walker, Hops, & Greenwood, 1984). These two codes were trial tested extensively in school settings during the 1984–85 school year as part of a related research study (Shinn, Ramsey, Walker, Stieber, & O'Neill, 1987). Observer training times required for mastery of these codes were relatively brief. Interobserver agreement ratios were consistently in the .90 to .99 range for the Academic Engaged Time (AET) code and in the .78 to .90 range for the Peer Social Behavior (PSB) code during their testing and refinement. In Shinn et al. (1987), the AET code powerfully discriminated between a group of 39 antisocial and 41 at-risk control fifth-grade boys. The antisocial students averaged 70% academic engagement during structured classroom observations, while the at-risk controls averaged 83%.

Results of these instrument development procedures indicated that the SSBD measures appeared to have sufficient levels of reliability, sensitivity, and content validity to justify efforts for systematically investigating their psychometric characteristics and including them in the overall SSBD system. In 1986, the authors were awarded a 3-year field-initiated research grant from the U.S. Office of Special Education Programs to support these research efforts.
This grant made it possible to study the psychometric properties of the SSBD measures extensively, to field-test the SSBD system, and to collect normative data on the Stage 2 and SIMS Behavior Observation Codes instruments within 8 states and 18 school districts across the United States. Results of Phase 2 of the SSBD research and development process, supported by this external funding, are described in the next three sections under the headings of Trial Testing, Validation, and Field Testing.

Phase 2: Trial Testing, Field Testing, and Validation of SSBD Instruments

Trial Testing

A year-long study designed to trial test the instruments comprising the SSBD was implemented during the 1985–86 school year. The study posed a number of crucial questions regarding the reliability and validity of teacher judgments and the psychometric characteristics of the instruments comprising each of the SSBD assessment stages. This year-long trial test of the SSBD within a Springfield, Oregon, elementary school had two major goals:

•• To evaluate the psychometric characteristics of the instruments used at each SSBD screening stage

•• To evaluate teacher accuracy in identifying, via the SSBD Stage 1 ranking procedures, contrasted groups of students (i.e., high-ranked externalizers, high-ranked internalizers, and unranked students) who would be expected to behave differently from each other within instructional and free-play settings

This study also assessed teachers' general acceptance of the screening procedures in terms of their perceived value and consumer satisfaction. In this regard, the authors were particularly interested in how long it took to implement the Stage 1 and 2 procedures as well as their ease of use.

Participants in this study were 18 teachers assigned to grades 1 through 5 in a cooperating elementary school located in Springfield, Oregon, and the students enrolled in their classes (N = 454). All 18 teachers individually completed the SSBD Stage 1 and 2 assessment procedures on two occasions 31 days apart. In Stage 1, teachers nominated two mutually exclusive lists of students whose characteristic behavior patterns were best represented, respectively, by the externalizing or internalizing behavioral definitions (n = 10 each). Next, each teacher rank ordered the students within both lists in terms of the degree to which their characteristic behavior patterns matched the appropriate behavioral profile (i.e., externalizing or internalizing).

It was necessary for the pupil lists to be identical at both Stage 1 ranking occasions in order to assess the test-retest stability of teacher rank orderings using Spearman rhos. Therefore, after the participating teachers had completed their Time 2 rank orderings, the pupil membership of the externalizing (n = 10) and internalizing (n = 10) lists for Time 1 and Time 2 was compared. Those teachers whose pupil lists differed from their initial lists were given their original Time 1 lists (in scrambled order) for the externalizing and/or internalizing behavioral dimensions and asked to re-rank this list of students. Teachers were then asked to rate the top three ranked externalizers and top three internalizers from their Stage 1 lists on the Stage 2 Critical Events Index and the Combined Frequency Index. These procedures were completed in the same manner at both Time 1 and Time 2 (a 1-month follow-up interval).
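The Stage 1 to Stage 2 gating procedure just described can be summarized as a simple selection rule: each teacher nominates and rank orders ten externalizers and ten internalizers, and only the top three students on each list advance to Stage 2 rating. The sketch below is a minimal, hypothetical illustration of that logic; the data structures and names are ours, not part of the SSBD materials.

```python
# Minimal sketch of the multiple-gating selection used in the trial test:
# teachers rank order ten externalizers and ten internalizers (Stage 1), and
# the top three students on each list move on to Stage 2 ratings.
# Student names and rankings are hypothetical.

def advance_to_stage_2(ranked_list, top_n=3):
    """Return the students who pass the Stage 1 gate for one dimension."""
    # ranked_list is ordered from rank 1 (best match to the behavioral
    # profile) to rank 10 (weakest match).
    return ranked_list[:top_n]

externalizing_list = ["Ann", "Ben", "Cal", "Dee", "Eli", "Fay", "Gus", "Hal", "Ida", "Jo"]
internalizing_list = ["Kim", "Lee", "Max", "Nan", "Oli", "Pam", "Quin", "Rae", "Sam", "Tia"]

stage_2_candidates = {
    "externalizing": advance_to_stage_2(externalizing_list),
    "internalizing": advance_to_stage_2(internalizing_list),
}
print(stage_2_candidates)
# {'externalizing': ['Ann', 'Ben', 'Cal'], 'internalizing': ['Kim', 'Lee', 'Max']}
```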
The teacher form of the Child Behavior Checklist (Achenbach & Edelbrock, 1979) was also completed by the classroom teacher for the three top-ranked externalizing and internalizing students in each class following completion of the second set of SSBD rankings and ratings. In addition, the Stage 2 Combined Frequency Index was completed on two students from each teacher's classroom (n = 33) who did not appear on either the externalizing or internalizing lists in SSBD Stage 1. These students served as normative controls for both the Stage 2 and SIMS Behavior Observation Codes assessments.

Following completion of the Time 1 and Time 2 teacher ranking/rating tasks, a sample of students was selected from each of the 18 classrooms for direct observation within instructional and free-play settings using the SIMS Behavior Observation Codes. From each classroom, parental permission was sought to observe four students: one externalizer, one internalizer, and two unselected, nonranked students who served as controls. The participating externalizing and internalizing students were those who had been ranked highest on these behavioral dimensions across both ranking occasions. Letters of consent were sent first to parents of the students with the highest average rankings across Times 1 and 2; if consent was denied, consent was then sought for observation of the student with the second highest average ranking. Signed permission forms were returned for 16 externalizers (8 first choices and 8 second choices) and 15 internalizers (6 first choices and 9 second choices). Parental consent was not sought for students who ranked lower than second on either the externalizing or internalizing dimension. Thirty-three of 36 consent forms were signed and returned for the unranked control students. Thus, a total of 64 students were observed on the SIMS Behavior Observation Codes (33 controls, 16 externalizers, and 15 internalizers).

Each student from the three groups was observed on four occasions: twice under seatwork conditions in the regular classroom setting and twice under regular recess conditions on the playground. Observers were uninformed as to the group membership (externalizing, internalizing, or control) of any of the study participants. Classroom observations were 15 minutes in length, and these sessions were recorded only during reading, mathematics, social studies, and language periods. Whenever possible, observations were conducted during independent seatwork periods; no data were collected during teacher-led activities or classroom periods involving group unison responding, such as those used in direct instruction formats. Observers were provided with a stopwatch that was allowed to run whenever the target student was academically engaged (i.e., attending appropriately to academic materials and tasks, making appropriate motor responses, and requesting teacher assistance with academic tasks). Whenever the target student was not academically engaged (e.g., disturbing others, talking out, off task, out of seat, and so forth), the stopwatch was stopped and remained off until the student resumed being academically engaged.

Playground observations were scheduled for 15 minutes each but were sometimes shorter because recess periods did not always last for this length of time.
Observations were conducted under regular recess conditions only, using a partial-interval coding procedure; coding did not take place during playground activities that were actively led or controlled by an adult. Observers coded the target student's playground social behavior according to the following guidelines:

•• Only one code category could be recorded during any given observation interval.

•• The category of social engagement overrode all other categories. If any social engagement was observed, the participant's behavior was coded as socially engaged (SE) for that interval.

•• If the target student changed activities during an interval, the activity that occurred for most of the interval was coded.

•• All other code categories overrode the no codeable response (NC) category in the recording process. NC was coded only if no other category could be determined.

•• The student's behavior was coded as negative if any negative behavior occurred during the interval.

Five graduate student observers were trained on the SIMS Behavior Observation Codes by the authors' colleagues. Each observer received from 3.5 to 5 hours of direct, supervised training distributed across the classroom and playground codes. The first three training sessions occurred in a simulation training setting where observers practiced recording while viewing videotapes of classroom and playground behavior. The final two training sessions occurred in naturalistic classroom and playground settings. Reliability criteria for the termination of training were two consecutive sessions with a minimum of 90% agreement on the AET code, and two consecutive sessions with a minimum of 70% agreement per session and a mean of 80% or greater agreement on the PSB code.

Observer agreement for the AET code was calculated by dividing the larger amount of time on the stopwatch recorded by one observer into the smaller amount recorded by the other observer and multiplying by 100. Agreement on the playground code was determined by dividing the number of intervals in which there was complete agreement among the observer pair by the total number of intervals observed and multiplying by 100.

A total of 256 observations was completed on students in classroom and playground settings. Reliability checks were conducted by a colleague who served as the observer trainer/calibrator during 40 of these observation sessions (15.6% of total observations). The five observers conducted a total of 224 observation sessions, while the trainer/calibrator conducted the remaining 32.
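The agreement indices used for these reliability checks are the two simple ratios described above. The sketch below shows how they could be implemented; the stopwatch times and interval records are hypothetical examples, not trial-test data.

```python
# Hedged sketch of the interobserver agreement computations described above.
# The stopwatch times and interval records are hypothetical examples.

def aet_agreement(seconds_observer_1, seconds_observer_2):
    """AET agreement: smaller recorded engaged time divided by larger, times 100."""
    smaller = min(seconds_observer_1, seconds_observer_2)
    larger = max(seconds_observer_1, seconds_observer_2)
    return smaller / larger * 100

def psb_agreement(intervals_observer_1, intervals_observer_2):
    """PSB agreement: percentage of intervals with complete agreement."""
    agreements = sum(
        code_1 == code_2
        for code_1, code_2 in zip(intervals_observer_1, intervals_observer_2)
    )
    return agreements / len(intervals_observer_1) * 100

# Example: 720 vs. 750 seconds of recorded engagement in a 15-minute session.
print(f"AET agreement = {aet_agreement(720, 750):.1f}%")      # 96.0%

# Example: interval-by-interval PSB codes from two observers.
obs_1 = ["SE", "SE", "PLP", "A", "P", "SE", "NC", "SE"]
obs_2 = ["SE", "SE", "PLP", "A", "P", "P",  "NC", "SE"]
print(f"PSB agreement = {psb_agreement(obs_1, obs_2):.1f}%")  # 87.5%
```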
Results for the SSBD trial test are described below by each assessment stage. Correlations are reported between selected Stage 2 measures, SIMS Behavior Observation Codes outcomes, and the Achenbach Child Behavior Checklist (CBC). Further, additional analyses are reported that assess the combined effects of Stage 2 measures and SIMS Behavior Observation Codes outcomes in correctly classifying students assigned to the three participant groups (externalizers, internalizers, controls) by teacher rankings in SSBD Stage 1. Finally, score differences for males and females across the three participant groups are reported on selected measures.

Stage 1 Instruments

The gender ratio of students nominated by participating teachers to form the externalizing and internalizing groups differed markedly. For example, at ranking Time 1, there were 46 males and 8 females in the externalizing group and 27 males and 27 females in the internalizing group. These proportions were nearly identical for the Time 2 ranking occasion, with 45 males and 9 females in the externalizing group and 25 males and 29 females in the internalizing group.

The test-retest stability of teacher rankings was assessed in two ways. First, stability was measured by determining the percentage of students who were placed into the same participant groupings (externalizing, internalizing) by their teachers on the two ranking occasions. A statistically significant relationship between teachers' classifications of students on the externalizing and internalizing behavioral dimensions across ranking occasions was obtained, and the proportions of identical students comprising these participant groups from Time 1 to Time 2 exceeded chance expectations. Results indicated that of the 168 students who were classified by teachers as externalizers at Time 1, 130, or 77%, were so classified one month later. Using a chi-square analysis, this result was significant at p < .001. Further, of the 51 students ranked among the top three externalizers by each teacher at Time 1, 35, or 69%, were also ranked in the top three at ranking Time 2. For the internalizers, 132 of 165, or 80%, were classified as members of the same group on both ranking occasions (p < .001). A similar proportion of students (69%) ranked in the top three internalizers across the two ranking occasions.

The second method of assessing the stability of teacher rankings in Stage 1 involved computing Spearman rank-order coefficients (rhos) between the Time 1 and Time 2 data sets for each teacher. This analysis produced 34 rho coefficients (one classroom was excluded from this analysis because the teacher changed between the two data collection occasions). Across the 17 remaining teachers, these rho coefficients ranged from .33 to .98 for the externalizing dimension and averaged .76. For the internalizing dimension, the range was from .45 to .94 and averaged .74. Table 4 contains test-retest rhos over a 1-month period for individual teachers on the externalizing and internalizing dimensions.

Table 4
Test-Retest Stability Coefficients for Individual Teachers on the Stage 1 Rank-Ordering Procedures

Teacher    Externalizing Dimension    Internalizing Dimension
1          .81                        .73
2          .78                        .66
3          .82                        .74
4          .67                        .87
5          .89                        .73
6          .59                        .82
7          .92                        .57
8          .49                        .70
9          .96                        .81
10         .83                        .72
11         .77                        .94
12         .73                        .78
13         .87                        .45
14         .72                        .69
15         .84                        .87
16         .98                        .76
17         .33                        .67

Only two teachers on the externalizing dimension and one teacher on the internalizing dimension had stability coefficients of less than .50 for their rank orderings of students over the 1-month period. Some teachers had substantial discrepancies in their stability coefficients between the externalizing and internalizing dimensions, with one dimension being lower or higher than the other. However, there was no systematic pattern to such discrepancies.
Stage 2 Instruments

The stability of the Combined Frequency Index (CFI) Adaptive and Maladaptive Behavior Scales was assessed across rating occasions for the three highest ranked students on the externalizing and internalizing lists. Internal consistency of the scales was assessed at both rating time points. These analyses were not conducted for the Critical Events Index due to the scoring system used (1 or 0) and the extremely low frequencies of positively checked events for all three student groups.

Pearson correlations were computed for the top-ranked externalizers and top-ranked internalizers in order to assess the stability of teacher ratings over a 1-month period. Students in the externalizing and internalizing groups were combined for this analysis. The resulting correlations between the Time 1 and Time 2 ratings were .88 for the Adaptive Behavior Scale and .83 for the Maladaptive Behavior Scale. These correlations may be inflated, however, due to the combined influence of sex, grade, and group membership factors and should be interpreted cautiously. An inspection of the raw data for the two scales indicated a normal distribution for the Adaptive Behavior Scale; for the Maladaptive Behavior Scale, there was a generally normal distribution with a slightly positive skew.

Internal consistency analyses (coefficient alpha) were conducted on both CFI scales. For the Adaptive Behavior Scale, alpha was .85 and .88, respectively, for the two rating occasions. For the Maladaptive Behavior Scale, the comparable figures were .82 and .87. Item analyses were conducted on the Adaptive and Maladaptive Behavior Scales to determine which items correlated positively with total scale scores. Table 5 presents item-total correlations for the Adaptive and Maladaptive Behavior Scales across the Time 1 and Time 2 rating occasions.

Table 5
Item-Total Correlations for the Stage 2 Behavior Scales Across Time 1 and Time 2 Rating Occasions

        ADAPTIVE BEHAVIOR        MALADAPTIVE BEHAVIOR
Item    Time 1    Time 2         Time 1    Time 2
1       0.66      0.67           0.64      0.68
2       0.68      0.67           −0.24     −0.12
3       0.59      0.59           0.62      0.69
4       0.48      0.66           0.57      0.57
5       0.64      0.70           0.69      0.87
6       0.62      0.75           0.60      0.62
7       0.46      0.69           0.60      0.75
8       0.68      0.70           0.71      0.69
9       0.67      0.50           0.17      0.22
10      0.16      0.27           0.65      0.70
11      0.61      0.72           0.54      0.59
12      0.06      0.02           —         —

Across rating occasions and scales, the item-total correlations ranged from −.24 to .87. The deletion of one item each in the Adaptive and Maladaptive Behavior Scales would have increased alpha; deletion of any other items would have either lowered alpha or left it unchanged. One item each in the CFI Adaptive and Maladaptive Rating Scales was subsequently revised to improve clarity and ratability.
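Coefficient alpha and the item-total correlations in Table 5 are standard computations on an items-by-respondents rating matrix. The sketch below shows one way to compute them; the rating matrix is simulated, and the use of uncorrected item-total correlations (item included in the total) is our assumption, since the manual does not state whether corrected or uncorrected values were used.

```python
# Hedged sketch: coefficient (Cronbach's) alpha and item-total correlations
# for a Stage 2-style rating scale. The ratings below are simulated, and the
# use of uncorrected item-total correlations is an assumption on our part.
import numpy as np

rng = np.random.default_rng(0)
n_students, n_items = 50, 12            # e.g., a 12-item Adaptive Behavior Scale
ratings = rng.integers(1, 6, size=(n_students, n_items)).astype(float)

def cronbach_alpha(item_matrix):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = item_matrix.shape[1]
    item_variances = item_matrix.var(axis=0, ddof=1)
    total_variance = item_matrix.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

def item_total_correlations(item_matrix):
    """Correlation of each item with the total scale score (item included)."""
    totals = item_matrix.sum(axis=1)
    return np.array(
        [np.corrcoef(item_matrix[:, j], totals)[0, 1] for j in range(item_matrix.shape[1])]
    )

print(f"alpha = {cronbach_alpha(ratings):.2f}")
for item, r in enumerate(item_total_correlations(ratings), start=1):
    print(f"item {item:2d}  item-total r = {r:.2f}")
```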
Table 6 contains means and standard deviations for the three participant groups on the Stage 2 measures for the Time 2 rating occasion. Data on the Stage 2 rating scales for all three participant groups, controls included, were recorded only at rating Time 2.

Table 6
Means, Standard Deviations, and ANOVAs for the Three Participant Groups on the Stage 2 Instruments

Variable                             Externalizers    Internalizers    Controls      F Ratio    p Value
Adaptive Behavior Rating Scale       M = 36.38        M = 44.50        M = 54.68     40.56      <0.01
                                     SD = 6.16        SD = 7.69        SD = 4.37
Maladaptive Behavior Rating Scale    M = 29.61        M = 18.16        M = 13.71     50.90      <0.01
                                     SD = 6.90        SD = 4.83        SD = 3.30
Critical Events Index                M = 1.72         M = 1.57         —
Range of Critical Events             0–6              0–5              —

The mean differences on the CFI Adaptive and Maladaptive Behavior Scales among the three participant groups were highly significant. These differences were also in the predicted direction, with controls, internalizers, and externalizers rated, in that order, from most to least adaptive and from least to most maladaptive. The incidence of positive occurrences on the two SSBD Critical Events Indices was extremely low for both the externalizing and internalizing participant groups. A Critical Events Index was not completed by participating teachers for control participants.

Correlations were computed between the Adaptive and Maladaptive Behavior Scales and the externalizing and internalizing subscales of the Achenbach Child Behavior Checklist (CBC) in order to assess the concurrent validity of the Stage 2 instruments. For the Adaptive Behavior Scale, the correlations with the CBC externalizing scale at rating Times 1 and 2 were −.63 and −.68 (p < .001); for the Maladaptive Behavior Scale, these correlations were .81 and .77 (p < .001). Correlations between the SSBD Adaptive Behavior Scale and the CBC internalizing scale were .22 and .01 for the Time 1 and Time 2 ratings, respectively; neither of these correlations was significantly different from zero. Correlations were not computed between the CBC internalizing scale and the SSBD Maladaptive Behavior Scale due to the behavioral content differences between these two scales.

SIMS Behavior Observation Codes

Of the 64 students observed with the SIMS Behavior Observation Codes, there were 16 externalizers, 15 internalizers, and 33 controls. Fourteen were in first grade, 16 were in second grade, 11 were in third grade, 13 were in fourth grade, and 10 were in fifth grade. There were 36 males and 28 females. Again, there were gender proportion differences by participant group: the 16 externalizers consisted of 12 males and 4 females; the 15 internalizers consisted of 8 males and 7 females; controls consisted of 16 males and 17 females.

Reliability estimates were calculated on the AET and PSB codes by computing interobserver agreement coefficients between the observer trainer/calibrator and each study observer. The mean agreement level for the 19 AET reliability checks was .96, with a range from .86 to 1.00. The mean agreement level for the 21 reliability checks on the PSB code was .84, with a range from .65 to 1.00.

Table 7 presents means and standard deviations on measures derived from the AET and PSB behavior observation codes. Significance levels for mean differences among the participant groups are also reported for ANOVAs conducted on each measure.
Table 7
Means, Standard Deviations, and ANOVAs for the Three Participant Groups on Classroom and Playground Observation Measures

                               Externalizers       Internalizers       Controls            F       p Value
Variable                       M        SD         M        SD         M        SD
Academic Engaged Time (AET)    53.88    16.53      68.20    13.25      71.56    11.87      9.61    <0.01
Socially Engaged (SE)          28.22    19.85      28.36    19.51      39.25    20.78      2.34    0.10
Parallel Play (PLP)            21.09    19.34      25.63    22.34      23.38    16.64      0.22    0.80
Participation (P)              31.53    30.54      13.40    25.15      18.86    22.81      2.15    0.13
Alone (A)                      13.70    15.26      20.66    27.95      5.87     9.69       4.16    0.02
No Code (NC)                   0.59     1.50       0.93     1.48       0.84     1.56
Positive Behavior              54.91    21.59      45.28    32.15      62.95    18.15      3.14    0.05
Negative Behavior              13.18    13.33      1.70     2.74       6.19     7.18       7.27    <0.01

Table 7 indicates that the following measures discriminated between the three participant groups:

•• Academic Engaged Time
•• Alone
•• Total Positive Behavior
•• Total Negative Behavior

These observational measures were also correlated with the Achenbach CBC externalizing and internalizing scales. Academic Engaged Time correlated significantly with the CBC externalizing scale (r = −.42, p < .01). None of the SSBD Observation Codes or categories correlated significantly with the CBC internalizing scale.

Discriminating Externalizers, Internalizers, and Controls

The Scheffé procedure for the analysis of mean differences was applied to those Stage 2 measures and SIMS Behavior Observation Codes variables for which significant F ratios were obtained. Table 8 lists variables on which at least one pair of participant groups differed at the .05 level or beyond. A brief sketch of the omnibus ANOVA step behind these comparisons follows the summary points below.

Table 8
Scheffé Analysis of Mean Differences on Discriminating SSBD Stage 2 and SIMS Behavior Observation Codes Variables

PAIRS OF PARTICIPANT GROUPS SIGNIFICANTLY DIFFERENT AT THE .05 LEVEL (Group Means in Parentheses)

Adaptive Behavior Rating Scale
    Externalizers (36.38) and Controls (54.68)
    Internalizers (44.50) and Controls (54.68)
    Externalizers (36.38) and Internalizers (44.50)

Maladaptive Behavior Rating Scale
    Externalizers (29.61) and Controls (13.71)
    Internalizers (18.16) and Controls (13.71)
    Externalizers (29.61) and Internalizers (18.16)

SIMS Behavior Observation Codes: Academic Engaged Time (AET)
    Externalizers (53.88) and Controls (71.56)
    Externalizers (53.88) and Internalizers (68.20)

SIMS Behavior Observation Codes: Alone
    Internalizers (20.66) and Controls (5.87)

The results in Table 8 indicate that externalizers:

•• Were rated by teachers as engaging in significantly less adaptive behavior than both internalizers and controls.
•• Were rated as significantly more maladaptive than either internalizers or controls.
•• Spent less time academically engaged than internalizers and controls.

The results in Table 8 indicate that internalizers:

•• Engaged in significantly less adaptive behavior than controls.
•• Produced significantly more maladaptive behavior than controls.
•• Spent significantly more time alone than controls.
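The F ratios in Tables 6 and 7 come from one-way ANOVAs comparing the externalizing, internalizing, and control groups on each measure, with Scheffé contrasts as the follow-up. The sketch below illustrates the omnibus ANOVA step only, using simulated scores; it is not the authors' analysis code, and the Scheffé follow-up would require an additional post hoc procedure.

```python
# Hedged sketch: a one-way ANOVA of the kind summarized in Tables 6 and 7,
# comparing three participant groups on a single measure (e.g., AET).
# Group scores are simulated for illustration; Scheffé post hoc contrasts
# are not shown here.
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(1)
externalizers = rng.normal(loc=54, scale=16, size=16)   # simulated AET percentages
internalizers = rng.normal(loc=68, scale=13, size=15)
controls = rng.normal(loc=72, scale=12, size=33)

f_ratio, p_value = f_oneway(externalizers, internalizers, controls)
print(f"F = {f_ratio:.2f}, p = {p_value:.4f}")
```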
The SIMS Behavior Observation Codes variables used were Academic Engaged Time, Social Engagement, Social Involvement, Parallel Play, Participation, Alone, Positive Social Interaction, and Negative Social Interaction. Results of the discriminant analysis indicated that 89.47% of the study participants were correctly classified into their respective participant groups on the basis of their SSBD Stage 2 scores and SIMS Behavior Observation Codes variable scores. Of the 13 externalizers included in this analysis, one was misclassified as an internalizing student. Three of the 12 internalizers were misclassified as controls, and 2 of the 30 controls were misclassified, one as an externalizer and one as an internalizer. Incomplete data on three externalizers and three internalizers led to their deletion from this analysis. It should be noted that with 3 groups, 55 participants, and 10 discriminating variables, the participants-to-variables ratio was quite low in this analysis. Further, the achieved 89.47% correct classification rate was not adjusted for the prior probabilities of group membership. There were approximately twice as many controls as there were externalizers and internalizers in this analysis.

A multiple regression analysis was conducted to determine the extent to which scores on the SSBD variables that discriminated the participant groups could predict group membership. The six variables that discriminated the three participant groups were entered into this analysis. The multiple correlation (R) between group membership and these six variables was .849. In combination, these variables accounted for approximately 72% of the variance between groups. Group membership was dummy coded in this analysis. A simultaneous regression analysis was also conducted to determine the relative weights of the variables entered in the equation, thereby permitting an analysis of which variables were most effective in predicting group membership. Table 9 shows the correlations between each predictor variable and group membership and the corresponding beta weights. These results indicate that virtually all of the variance in group membership could be accounted for by Maladaptive Behavior Scale score, Adaptive Behavior Scale score, and scores on the Alone and Academic Engaged Time variables from the SIMS Behavior Observation Codes.

Table 9  Correlations Between Predictor Variables and Group Membership and Corresponding Beta Weights

Maladaptive Behavior Rating Scale: correlation with group membership −.79; beta weight −.42
Adaptive Behavior Rating Scale: correlation with group membership .77; beta weight .35
SIMS Behavior Observation Codes: Alone: correlation with group membership −.30; beta weight −.19
SIMS Behavior Observation Codes: Academic Engaged Time: correlation with group membership .45; beta weight .11
SIMS Behavior Observation Codes: Positive Social Interaction: correlation with group membership .35; beta weight .04
SIMS Behavior Observation Codes: Negative Social Interaction: correlation with group membership −.24; beta weight .04

Sex Differences

An analysis was conducted to determine which of the SSBD Stage 2 and SIMS Behavior Observation Codes variables registered sex differences across the three participant groups. Statistically significant mean differences between males and females were obtained for two of the playground measures (Social Engagement and Participation in structured games and activities) and for teacher ratings on the Adaptive Behavior Scale in Stage 2. The Social Engagement variable measures peer-to-peer social interactions in free-play contexts.
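Sex-difference comparisons of this kind reduce to independent-samples mean tests on each variable. The brief sketch below, which precedes Table 10, uses hypothetical score vectors rather than SSBD data.

```python
# Independent-samples t test for a sex-difference comparison on one variable.
# The score vectors are hypothetical placeholders, not SSBD data.
import numpy as np
from scipy import stats

female_se = np.array([52.0, 38.5, 47.1, 44.0, 49.3])  # Social Engagement, females
male_se = np.array([21.0, 30.2, 18.7, 27.5, 24.9])    # Social Engagement, males

t_stat, p_val = stats.ttest_ind(female_se, male_se)
print(f"t = {t_stat:.2f}, p = {p_val:.4f}")
```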
Table 10 presents means and standard deviations on these variables, by sex of student, for the combined participant groups.

Table 10  Sex Differences on Stage 2 and SIMS Behavior Observation Codes Variables for Combined Participant Groups

Social Engagement: Females (n = 28) 45.21 (17.06); Males (n = 36) 25.18 (16.88); p < .01
Participation: Females (n = 28) 9.65 (12.63); Males (n = 36) 29.39 (29.55); p < .01
Adaptive Behavior Scale Ratings: Females (n = 28) 51.49 (5.81); Males (n = 36) 45.71 (7.27); p < .03

The relatively small numbers of participants prohibited analysis of sex differences within each of the participant groups.

Intercorrelations Among Stage 2 Measures and SIMS Behavior Observation Code Variables

Because variables from Stage 2 measures and the SIMS Behavior Observation Codes were treated as independent measures in the analyses reported above, the authors examined the intercorrelations among these variables in order to assess the extent of their covariation. Table 11 shows intercorrelations among the Stage 2 measures and the SIMS Behavior Observation Codes reported for the SSBD trial test.

Table 11  Correlation Matrix for Stage 2 and SIMS Behavior Observation Codes Variables

Adaptive Scale: 1.00
Maladaptive Scale: −.60, 1.00
Academic Engaged Time: .02, −.21, 1.00
Social Engagement: .12, .15, −.16, 1.00
Participation: −.10, −.18, .24, −.40, 1.00
Parallel Play: −.27, .10, −.13, −.25, −.40, 1.00
Alone: .21, −.16, .03, −.26, −.05, −.20, 1.00

(Columns follow the same order as the rows: Adaptive Scale, Maladaptive Scale, Academic Engaged Time, Social Engagement, Participation, Parallel Play, Alone.)

Inspection of the correlation matrix in Table 11 indicates that the intercorrelations among these variables were in the low to moderate range. The highest correlations were between teacher ratings of adaptive and maladaptive student behavior (r = −.60) and between the Social Engagement and Participation categories and the Participation and Parallel Play categories (r = −.40 in both cases). Correlations of this magnitude among these variables are not unexpected because (a) teachers rated the same participants on both the Adaptive and Maladaptive Behavior Scale item lists, (b) the Positive Social Interaction category subsumes both the Social Engagement and Social Involvement codes, and (c) social interaction opportunities are severely restricted during the structured playground activities during which the Participation category was coded. However, statistically significant group differences on any one of these moderately correlated measures would be expected to predict similar outcomes on the others.

Overall, results of the initial trial testing of the SSBD were encouraging. Estimates of reliabilities for the instruments comprising each of the three SSBD screening stages were judged acceptable for their assessment purposes. Both the test-retest stability of teacher rankings of students on the final form of the Stage 1 procedures and interrater agreement levels among pairs of teachers and/or teachers and aides were satisfactory and provided a foundation for future research on the system. The accuracy of the teachers' classification of students, as indicated by the consistency of students' group membership (i.e., externalizing, internalizing) from Time 1 to Time 2, was substantial. However, while overall test-retest rhos for the Stage 1 ranking dimensions averaged .75, several teachers' rhos were in the low .30s and .40s. Similarly, some teachers were not as consistent as others in the accuracy of assigned group membership for students from Time 1 to Time 2.
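The test-retest rhos referred to above are Spearman rank-order correlations between a teacher's Time 1 and Time 2 rankings of the same students. A minimal sketch of that computation, using hypothetical rankings rather than SSBD data, follows.

```python
# Spearman rank-order correlation (rho) between two ranking occasions.
# The rankings are hypothetical placeholders, not SSBD data.
from scipy import stats

time1_ranks = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]   # Time 1 externalizing ranking
time2_ranks = [2, 1, 3, 5, 4, 6, 8, 7, 10, 9]   # same students one month later

rho, p_val = stats.spearmanr(time1_ranks, time2_ranks)
print(f"test-retest rho = {rho:.2f} (p = {p_val:.3f})")
```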
Overall, however, the Stage 1 procedures allow teachers to identify behavior patterns that remain quite stable over periods of 1 month or less, a finding that has important implications for the referral of students to special education and related services. Teachers participating in this initial trial testing of the SSBD, via their Stage 1 ranking tasks, also validated findings from the professional literature on the differential representation of sex differences within externalizing and internalizing behavior patterns and disorders. In this study, the ratio of boys to girls in the teacher-nominated Stage 1 sample was nearly six to one for members of the externalizing group. For members of the internalizing group, boys and girls each comprised about half of the sample.

At Stage 2, the CFI Adaptive and Maladaptive Behavior Scales demonstrated acceptable internal consistency and short-term stability, with all correlations in the mid to high .80s. As noted earlier, however, these coefficients should be interpreted cautiously. Item-total correlations for these scales were, with the exception of one item on each scale, adequate. Coefficient alpha for the two scales was in the mid to high .80s and would be improved by deletion of these items.

The SIMS Behavior Observation Codes proved to be highly reliable and sensitive in discriminating the three participant groups in both classroom and playground settings. The average interobserver agreement level for the AET code was .96 during the study. The PSB code was somewhat less reliable than the AET code, perhaps because of the greater complexity of peer social behavior and the uncontrolled stimulus conditions of playground settings. However, this code demonstrated an acceptable interobserver agreement level of .84 in this trial test. Leff and his colleagues, in a comprehensive review of over 80 coding systems, have rated the PSB code as one of the best coding systems available for recording playground social behavior (see Leff & Lakin, 2005).

The direction of the observed behavioral differences for the externalizing, internalizing, and control participants was consistent with the authors' expectations based on empirical evidence presented in the literature. Teachers' ability to identify intact groups of students using the Stage 1 ranking procedures, and the clear differentiation of these groups on teacher ratings and direct observational measures recorded by professionally trained observers, serve to validate both teacher judgment and the viability of the SSBD approach.

The construct validity of the SSBD rests upon the bipolar externalizing–internalizing behavioral classification of Achenbach (1978) and Ross (1980) and the assumption that it is possible to reliably differentiate externalizers and internalizers from each other, and both behavior patterns from nonranked control students. The discriminant function analysis conducted in this study addressed this question. Collectively, the Stage 2 measures and SIMS Behavior Observation Codes variables were efficient in correctly classifying students whose group membership was based on teacher assignments in Stage 1. As noted, these measures correctly classified 89.47% of the study participants.
Four variables in combination accounted for 72% of the variance in determining this group membership (i.e., Maladaptive Behavior Scale score, Adaptive Behavior Scale score, and SIMS Behavior Observation Codes scores for Academic Engaged Time and Alone). This level of overall precision in separating students into identifiable groups spoke well for the continued development of the SSBD.

Overall, results of the initial trial testing of the SSBD system and its component measures were quite encouraging and provided a basis for a series of more extensive validation studies investigating a range of validity types and psychometric characteristics of the SSBD instruments under field-test conditions. Descriptions of and results from these studies are reported in the next section.

SSBD Field Testing and Replication

The SSBD has been formally field tested in six sites across the country. These sites were located in the states of Oregon, Utah, Illinois, Wisconsin, Kentucky, and Rhode Island. SSBD Stage 1 and 2 measures were recorded in all six of these field sites. The SIMS Behavior Observation Codes was administered in all these sites except Wisconsin. The authors and their colleagues conducted on-site training of school district personnel involved in the field-testing process prior to initiation of any data collection within each site.

Attempts were made to field-test the SSBD in these sites under conditions approximating, as closely as possible, those that would exist under normal conditions of screening and SSBD usage. The adherence to a minimum set of research requirements across these sites no doubt attenuated achievement of this goal to some degree. However, these requirements were necessary to produce comparable, reliable, and generalizable data across these sites. Formal training in administration of the SSBD procedures and in the logistics involved in meeting field-test research requirements was usually accomplished within a single day; however, the training of school personnel as reliable observers on the SIMS Behavior Observation Codes procedures generally required a second day of training and supervised practice using the codes in in vivo settings. In some cases, follow-up visits were made to field-test sites to conduct additional training, coordinate the monitoring of observer cadres, and assist with the calibration of interobserver agreement indices. Telephone contacts were maintained with field-test sites throughout the field-testing process, which spanned periods ranging from 4 to 8 months. A supervisor/coordinator was identified within each field-test site to monitor and troubleshoot problems that arose during the field-testing process. These field-testing activities were supported by a 3-year, field-initiated research grant from the U.S. Office of Special Education Programs.

Field-test results from these sites allowed for intersite replication of SSBD procedures and outcomes and were quite consistent across sites. In addition, data and results from these sites were included in the SSBD national standardization sample for the Stage 2 measures (N = 4,463) and the SIMS Behavior Observation Codes (N = 1,219).
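The interobserver agreement indices mentioned here follow the two conventions described later in this manual under Interrater reliability: a smaller-over-larger duration ratio for the AET code and interval-by-interval percent agreement for the PSB code. The sketch below illustrates both computations with hypothetical observer records.

```python
# Two interobserver agreement computations, matching the AET (duration ratio)
# and PSB (interval-by-interval) conventions. The observer records below are
# hypothetical placeholders, not SSBD data.

def aet_agreement(seconds_obs1: float, seconds_obs2: float) -> float:
    """Duration agreement: smaller recorded time / larger recorded time x 100."""
    smaller, larger = sorted([seconds_obs1, seconds_obs2])
    return smaller / larger * 100


def psb_agreement(intervals_obs1, intervals_obs2) -> float:
    """Interval agreement: intervals with exact agreement / total intervals x 100."""
    matches = sum(a == b for a, b in zip(intervals_obs1, intervals_obs2))
    return matches / len(intervals_obs1) * 100


print(aet_agreement(540.0, 571.0))                        # ~94.6% agreement
print(psb_agreement(["SE", "P", "A", "PLP", "SE", "NC"],
                    ["SE", "P", "A", "SE", "SE", "NC"]))  # ~83.3% agreement
```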
A formal replication of both the SSBD's implementation and the results of its initial field testing, as reported by Walker, Severson, Stiller, Williams, Haring, Shinn, and Todis (1988), was conducted by Nicholson (1988) during the 1987–88 school year and is also reported in Walker, Severson, Nicholson, Kehle, Jenson, and Clark (1994). This systematic replication was supported by a grant from the Utah Office of Special Education. The replication effort spanned the entire 1987–1988 school year and was conducted within the Jordan School District, a suburban district serving the Salt Lake City, Utah, area. Though the full range of SES levels was represented in this district, it served a primarily middle-class population. Three elementary schools within the district participated in the SSBD replication study.

Participants involved in the SSBD Stage 1 screening procedures were 1,468 students and their respective teachers (n = 58) in grades 1–5 within these three elementary schools. At SSBD screening Stage 2, participants consisted of 475 students in grades 1–5 who were selected from Stage 1 based on their teachers' rankings. Participants observed in classroom and playground settings with the SIMS Behavior Observation Codes were 225 students: the top-ranked externalizer, the top-ranked internalizer, and two nonranked students selected from each participating classroom. A total of 900 observations of a minimum 12 minutes' duration were recorded on these participants in classroom and playground settings on two occasions each. Classroom observations were recorded during independent seatwork periods whenever possible, and playground observations were recorded during normal recess periods. Observers were rigorously trained and carefully monitored in this study. Reliability checks were conducted on 16 of the classroom observation sessions, and interobserver agreement averaged 95%. Similarly, interobserver agreement was calculated for 49 of the playground observation sessions and averaged 88%.

Of the 173 students who appeared in the highest three ranks on the Stage 1 externalizing rank-order dimension, 82% were males and 18% were females. In contrast, 44% of the top-ranked internalizers were males and 56% were females. These results are very similar to the proportions identified in the Walker et al. (1988) initial field test (see SSBD Trial Testing above). Similarly, interrelationships among the Stage 2 measures and SIMS Behavior Observation Codes variables also closely replicated the initial trial test results. Correlations among the SSBD Stage 2 measures ranged from .61 to .77, and intercorrelations among SIMS Behavior Observation Codes variables ranged from .13 to .62. As in the Walker et al. (1988) trial test, correlations between Stage 2 and SIMS Behavior Observation Codes variables were low and ranged from −.21 to .17. In addition, coefficient alphas for the Stage 2 Adaptive and Maladaptive Behavior Rating Scales were .94 and .90, respectively, in this replication. The Stage 2 measures and SIMS Behavior Observation Codes variables were highly sensitive in discriminating behavioral differences between high-ranked externalizers, high-ranked internalizers, and nonranked control participants.
A discriminant function analysis indicated that the SSBD Stage 2 measures and SIMS Behavior Observation Codes variables correctly classified 84% of the three participant groups overall. As in the Walker et al. (1988) study, the classification rates were highest for nonranked students, followed by externalizers and then internalizers. Externalizers exhibited less adaptive behavior, more maladaptive behavior, and more critical events than either internalizers or nonranked students. They also spent less time academically engaged and produced fewer positive interactions than internalizers and nonranked students. Internalizers exhibited less adaptive behavior, more maladaptive behavior, and more critical events than nonranked students. They also spent a lower percentage of observed time academically engaged than nonranked students. Though fewer between-participant differences were found on the SIMS Behavior Observation Codes PSB code categories in the Nicholson (1988) study, these results overall closely replicated those reported by Walker et al. (1988).

Resource teachers, psychologists, and general education teachers were surveyed to assess their general satisfaction with the SSBD procedure and to compare its efficacy with traditional procedures. Resource teachers and psychologists completed a 13-item survey, and general education teachers completed an 11-item survey. Results indicated that resource teachers and psychologists were much more favorable about their experiences with the SSBD than were general education teachers. Seventy-five percent of the resource teachers and psychologists would recommend the SSBD for use by other school faculties, and a majority of these respondents (n = 8) responded to each survey item favorably. In contrast, only 33% of the general education teachers sampled (n = 51) would recommend the system to other school faculties. However, when the survey items were analyzed by individual school faculties (n = 3), it was apparent that the faculty of one school was very negative in its responses to the survey while the other two faculties were considerably more positive. In this school, the percentage of faculty responding favorably ranged from 12–72% across the survey items and averaged 32%. In the second school, in contrast, these percentages ranged from 53–87% and averaged 67%. In the third school, these figures ranged from 42–90% and averaged 64%. Thus, the results from the most negative school may have been substantially biased or influenced by variables over which the field-test personnel had little control or knowledge.

The results of this replication study overlapped considerably with findings from the initial trial study conducted by Walker et al. (1988), which involved 18 teachers in one school. The Nicholson (1988) replication involved 58 teachers in 3 elementary schools located in another state. These findings substantially extended the results of the original trial test of the SSBD as reported by Walker et al. (1988). The consumer satisfaction surveys in the Nicholson (1988) study are an important contribution to the comprehensive evaluation of alternative approaches, such as the SSBD, to existing practices.
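The classification analyses reported for the trial test and this replication can be sketched with standard tools. The example below fits a linear discriminant function to placeholder data and reports a confusion matrix and an unadjusted correct-classification rate; the group sizes, score columns, and values are hypothetical, not SSBD data, and the resubstitution rate (like the rates reported above) is not adjusted for the prior probabilities of group membership.

```python
# Linear discriminant function predicting Stage 1 group membership from
# Stage 2 and observation scores; all data below are randomly generated
# placeholders, not SSBD data.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(0)
n_ext, n_int, n_ctrl = 16, 15, 33

def simulate(n, means):
    """Group-shifted noise for columns: Adaptive, Maladaptive, AET, Alone."""
    return rng.normal(loc=means, scale=5.0, size=(n, len(means)))

X = np.vstack([
    simulate(n_ext, [36, 30, 54, 14]),    # externalizers
    simulate(n_int, [44, 18, 68, 21]),    # internalizers
    simulate(n_ctrl, [55, 14, 72, 6]),    # controls
])
y = np.array(["ext"] * n_ext + ["int"] * n_int + ["ctrl"] * n_ctrl)

lda = LinearDiscriminantAnalysis().fit(X, y)
predicted = lda.predict(X)

print(confusion_matrix(y, predicted, labels=["ext", "int", "ctrl"]))
print("correct classification rate:", round((predicted == y).mean(), 4))
```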
Validation Studies of the SSBD

An extensive series of validation studies has been conducted on the SSBD to date, examining the psychometric properties of Stage 1 and Stage 2 instruments as well as the SIMS Behavior Observation Codes and School Archival Records Search. Results from these studies and others described in this section relate to the following types of reliability: test-retest, internal consistency, and interrater reliability. Additionally, this section describes findings that provide empirical support for each of the following SSBD validity types: item, factorial, concurrent, discriminant, criterion-related, predictive, and construct.

Reliability

Test-Retest

Walker, Severson, Todis, Block-Pedego, Williams, Haring, and Barckley (1990) investigated the test-retest stability of the SSBD Stage 1 and 2 measures over a 1-month period. Forty teachers of elementary-age students completed the SSBD Stage 1 and 2 procedures on two occasions separated by 31 days. The mean test-retest rank-order correlations (rhos) on the externalizing and internalizing behavioral profiles (n = 10 students each) were .79 and .72, respectively. Individual teacher rank-order correlations ranged from −.16 to .96 on the externalizing dimension and from −.07 to .92 on the internalizing dimension. Eighty-eight percent of the 40 participating teachers had test-retest rhos greater than .45. In a reanalysis of these data, the authors excluded two teachers from the sample whose rhos were negative (−.16 and −.07) and treated them as outliers. The average externalizing rho improved to .88 and the average internalizing rho improved to .74 in this reanalysis.

Pearson correlations were computed for the Stage 2 measures from Time 1 to Time 2. For the Critical Events Index, the resulting r was .81; for the Combined Frequency Index Adaptive Behavior Scale, the r was .90; and for the Maladaptive Behavior Scale, the r was .87. These correlations were all statistically significant at (p < .01). Overall, these results suggest that teachers are capable of making relatively stable judgments regarding child behavioral characteristics using the Stage 1 and 2 screening instruments.

Internal Consistency

Walker, Severson, Stiller, Williams, Haring, Shinn, and Todis (1988) calculated coefficient alpha to estimate the internal consistency of the Combined Frequency Index Adaptive and Maladaptive Behavior Scales. Using a sample of 18 teachers, each of whom rated 8 students in their classes on two occasions (i.e., the three highest ranked externalizers and internalizers and two nonranked comparison students), coefficient alpha was derived for these two scales at two time points separated by one month. For the Adaptive Behavior Scale, alphas were .85 and .88 across the two rating occasions. For the Maladaptive Behavior Scale, these coefficients were .82 and .87. Coefficient alpha was also calculated for the SSBD national standardization sample of 4,463 cases on the Stage 2 instruments. The resulting alpha coefficients were .94 for the Adaptive Behavior Scale and .92 for the Maladaptive Behavior Scale. The average item intercorrelations for these two scales were .59 and .49, respectively.
Coefficient alpha was not calculated for the Critical Events Index because of the divergent behavioral content (externalizing and internalizing) sampled by the items comprising this instrument.

Interrater

The authors extensively investigated the interrater agreement of the SSBD Stage 1 ranking procedures in the process of developing and refining the externalizing and internalizing behavioral profiles (results of these activities were described earlier under "Instrument Development Procedures"). Interrater agreement levels have also been established for the Stage 2 measures. Interrater agreement was the primary criterion used to develop, evaluate, and revise the SIMS Behavior Observation Codes. The Academic Engaged Time (AET) code uses a one-paragraph definition and a stopwatch duration recording procedure to estimate the proportion of time spent academically engaged (see SSBD Observation Manual). Interrater agreement indices were calculated by dividing the smaller amount of stopwatch-recorded time from one observer by the larger amount recorded by the second observer and multiplying by 100. These ratios have consistently ranged between 90–100% and have averaged approximately 95% in SSBD studies. Similarly, interrater agreement ratios have been calculated for the partial-interval Peer Social Behavior (PSB) code (see SSBD Observation Manual). However, since the PSB is a five-category code with a 10-second recording interval, interrater agreement among pairs of observers was determined by dividing the number of recording intervals on which there was complete agreement by the total number of intervals recorded and multiplying by 100. Interrater agreement ratios for the PSB code have consistently averaged 85% and have generally ranged between 80 and 90%. (Note: The PSB was originally a six-category code; however, the categories of social engagement and social involvement were combined into one code category, social engagement, because of their overlapping content and failure to discriminate among participant groups.)

Validity

The following types of validity have been estimated to date on the SSBD system: item, factorial, concurrent, discriminant, criterion-related, predictive, and construct. Evidence in support of each of these validity types is described in this section.

Item Validity

Item validity was estimated on the Stage 2 Adaptive and Maladaptive Behavior Scales by calculating item-total correlations using the SSBD standardization sample (n = 4,463). Table 12 contains corrected item-total correlations for the Adaptive and Maladaptive Behavior Scale items. These correlations ranged from .64 to .81 for the Adaptive Behavior Scale and from .32 to .83 for the Maladaptive Behavior Scale. Mean interitem correlations were .59 for the Adaptive Scale and .49 for the Maladaptive Scale. All items in both scales met the minimum criterion of .30 and above for acceptable item-total correlations.
Table 12 Item-Total Correlations for the Stage 2 Adaptive and Maladaptive Rating Scales (n = 4,463) Adaptive Rating Scales Maladaptive Rating Scales Item Corrected Item-Total Correlation Item Corrected Item-Total Correlation A1 .79 M1 .79 A2 .78 M2 .34 A3 .75 M3 .73 A4 .72 M4 .66 A5 .64 M5 .83 A6 .80 M6 .75 A7 .72 M7 .78 A8 .77 M8 .76 A9 .77 M9 .32 A10 .68 M10 .76 A11 .81 M11 .68 A12 .73 Alpha = .94 Mean interitem correlation = .59 Alpha = .92 Mean interitem correlation = .49 Technical Manual | 33 SYSTEMATIC SCREENING FOR BEHAVIOR DISORDERS (SSBD) Factorial Analysis A Principal Components Factor Analysis with a Varimax rotation, using SPSS-X procedures, was used to investigate the factor structure of the SSBD Stage 2 Adaptive and Maladaptive Behavior Scales. The items in these two scales were factor analyzed in tandem rather than separately within the two scales. This procedure, conducted on the SSBD national standardization sample, yielded seven factors with eigenvalues greater than one that collectively accounted for 79% of the variance. Two factors were then specified for extraction in a second-order analysis. The results of this analysis, with corresponding item loadings on each of the two factors, are presented in Table 13. It was expected that these two rating scales would collectively provide measures of the two primary forms of school adjustment required of all students in school settings, i.e., peer related and teacher related (Walker, Ramsey, & Gresham, 2004). The two factors that emerged from the factor analysis confirmed this hypothesis. Factor One was very dominant in the overall structure and accounted for 52% of the total variance; in contrast, Factor Two accounted for only 9%. The corresponding eigenvalues for these two factors were 12.11 and 2.09, respectively. Factor One consisted of scale items that define school adjustment according to adult expectations while the content of factor two seemed to focus primarily on peer relations. Table 13 lists the item factor loadings on these two factors across the Adaptive and Maladaptive Behavior Scale items. (See the SSBD Instruments and Forms Packet for descriptions of these items.) Factor scores were calculated for these two factors based on a sample of 1,337 externalizers, 1,310 internalizers, and 862 nonidentified comparison students drawn from the standardization sample of 4,463 cases. For Factor One (teacher related), these scores for externalizers, internalizers and comparison students were, respectively, .89, −.67, and −.36. For Factor Two (peer related), these scores were −.27, −.37, and .99, respectively. 
Table 13  Adaptive and Maladaptive Behavior Rating Scale Factor Structure and Item Loadings (n = 4,463)

Rotated factor matrix (Item: Factor 1, Teacher Related; Factor 2, Peer Related):

M5: .84, −.31
M7: .82, −.28
M8: .82, −.18
M1: .79, −.34
M6: .79, −.19
M3: .75, −.25
M10: .75, −.29
A1: −.70, .51
A2: −.63, .56
A11: −.63, .59
M11: .61, −.34
M9: .37, −.05
A12: −.16, .83
A8: −.24, .81
A10: −.14, .78
A6: −.43, .72
A3: −.32, .71
A9: −.38, .70
A4: −.33, .69
A7: −.33, .68
M2: .08, −.57
M4: .49, −.52
A5: −.48, .50

Factor scores, M (SD), by group:

Externalizer: Teacher Related .89 (.89); Peer Related −.27 (.77)
Internalizer: Teacher Related −.67 (.63); Peer Related −.37 (1.00)
Nonranked: Teacher Related −.36 (.36); Peer Related .99 (.54)

Concurrent Validity

The concurrent validity of the SSBD Stage 2 measures was estimated as part of a study of the SSBD system's use within resource room settings (Walker, Block-Pedego, Severson, Barckley, and Todis, 1989). Teacher rating and direct observational measures were completed on a sample of 56 resource room students in the elementary age range. Six teachers and their resource room students participated in this study. The SSBD Stage 2 measures were correlated with the Walker-McConnell Scale of Social Competence and School Adjustment (Walker & McConnell, 1988) and with direct observation measures recorded by the Classroom Adjustment Code (CAC) (Walker, Block-Pedego, McConnell, & Clarke, 1983). The Walker-McConnell scale is designed for use by teachers in the K–12 grade range in rating students' social skills. The elementary scale consists of 43 items and has three subscales. Its national standardization sample contains 1,812 cases. Extensive studies of the scale's validity and reliability have been conducted by the authors and are reported in the manual for the scale.

Correlations between total score on the Walker-McConnell scale and total scores on the Critical Events Index and the Adaptive and Maladaptive Rating Scales were −.57 (p < .001), .79 (p < .001), and −.44 (p < .001), respectively. Correlations of this magnitude provide partial support for the concurrent validity of the SSBD Stage 2 measures with a well-constructed and validated measure of teacher-rated social skills. Similarly, correlations were computed between the three Stage 2 measures and the CAC child behavior code categories of On Task and Unacceptable. For the Critical Events Index, the correlations with On Task and Unacceptable were −.45 (p < .01) and .15 (p < .05), respectively. These same correlations for the Adaptive Behavior Scale were .45 (p < .01) and −.16 (p < .05); for the Maladaptive Behavior Scale, they were −.37 (p < .03) and .26 (p < .05). Though they were in the low to moderate range of magnitude, these correlations provide evidence that teacher ratings of students' classroom behavior were related to the students' actual observed behavior as recorded by independent, professionally trained observers.

Discriminant Validity

A number of studies have been conducted by the authors and their colleagues investigating the discriminant validity of the SSBD system and its component measures. These studies have ranged from assessing the performance of clinical samples on selected SSBD measures to discriminant function analyses of the classification efficiency of the Stage 2 measures and SIMS Behavior Observation Codes.
The studies and results reported herein under discriminant validity are extensive in documenting the SSBD procedure’s ability to identify externalizers and internalizers with problematic behavioral profiles and to separate them accurately and reliably from students with well-adjusted school behavior patterns. In one sense, this is the most important form of SSBD validity because the procedure and its component instruments were designed to identify and discriminate externalizing and internalizing students from pools of students who do not manifest such behavioral tendencies. The studies described below provide substantial evidence in support of the SSBD’s ability to accomplish this goal. Walker, Severson, Todis, Block-Pedego, Williams, Haring, and Barckley (1990) conducted validation and replication studies on the SSBD system in Oregon and Washington field sites. In the Oregon site, 170 teachers in grades 1–5 completed the Stage 1 and 2 measures. Stage 2 measures were completed by each participating teacher on the three top-ranked externalizing students, the top three internalizing students, and two nonranked comparison students. ANOVAs were computed to test for participant group differences on the Stage 2 measures. The resulting F ratios were as follows: Critical Events Index, F (2,853) = 163.62 (p < .001); Adaptive Behavior Scale, F (2,850) = 500.51 (p < .001); and Maladaptive Behavior Scale, F (2,821) = 596.97 (p < .001). The corresponding Omega squared coefficients for these one way ANOVAs were .28, .54, and .59, respectively. Post hoc Scheffe tests indicated that all possible pairs of mean differences among the three participant groups were significant at (p < .01). In a replication of these findings, 40 regular classroom teachers in the Washington site completed SSBD Stages 1 and 2, resulting in ratings on 270 high-ranked externalizers, high-ranked internalizers, and nonranked comparison students who did not appear on either rank-ordering list of ten students each per classroom. ANOVAs were computed separately for the Critical Events Index, the Adaptive Behavior Scale, and the Maladaptive Behavior Scale to test for between-participant differences on these measures. As in the Oregon site, the Stage 2 measures proved to be highly sensitive in discriminating these three participant groups. Technical Manual | 37 SYSTEMATIC SCREENING FOR BEHAVIOR DISORDERS (SSBD) The F ratios for the Critical Events Index, the Adaptive Behavior Scale, and the Maladaptive Behavior Scale were F (2,267) = 77.97 (p < .001); F (2,267) = 152.00 (p < .001); and F (2,267) = 214.93 (p < .001), respectively. Omega squared coefficients for these analyses were .34, .47, and .56. Post hoc Scheffe tests indicated that all mean differences for the three participant groups exceeded chance expectations (p < .05) on each of the Stage 2 instruments. A sample of four students was selected from each of the 97 participating classrooms in the Oregon site where the SIMS Behavior Observation Codeswas used. These were the highest ranked externalizer, the highest ranked internalizer, and two unranked comparison students (a boy and a girl). These participant groups were observed on two occasions each in academic and playground settings using the AET and PSB codes, respectively. Comparison students averaged 77% of observed time academically engaged; the corresponding figures for the internalizing and externalizing participants were 73% and 65%. 
The mean differences between comparison and externalizing students and between comparison and internalizing students were significant at (p < .01) and (p < .05); the mean difference between externalizers and internalizers was not statistically significant. Table 14 contains means and standard deviations for the three participant groups on individual PSB code categories and on variables derived from combining selected code categories. Table 14 PSB Code Category and Code Category Combination Means, Standard Deviations, and Significance Tests by Participant Group (N = 336) SIMS Behavior Observation Codes: PSB Variables Externalizers (n = 73) Internalizers (n = 76) Comparison (n = 152) Omega Squared Mean (SD) Mean (SD) Mean (SD) (SD) 35.1 (15.3) 0.03 Socially Engaged (SE) 31.1 (16.8) Participation (P) 1 27.4 2tab (13.1) 22.6 (28.7) 7.4 (16.1) 14.2 (22.6) 0.05 Parallel Play (PLP) 5.8 (7.5) 10.72,3 (10.6) 4.9 (7.5) 0.07 Alone (A) 6.1 (8.3) 8.6 (10.0) 3.5 (5.5) 0.06 No Codeable Response Total Positive Behavior (+) Total Negative Behavior 1 1 2 2 1.5 (1.7) 1.7 (3.6) 1.3 (2.0) — 80.51 (17.2) 77.22 (18.2) 88.6 (11.5) 0.10 6.01,3 (7.9) 1.8 (4.2) 1.9 (5.6) 0.07 1 = Externalizers vs. Comparison Students (p < 0.05) 2 = Internalizers vs. Comparison Students (p < 0.05) 3 = Externalizers vs. Internalizers (p < 0.05) 38 | Phase 2: Trial Testing, Field Testing, and Validation of SSBD Instruments Statistically significant differences were obtained between one or more pairs of student participant groups on all but one of these variables (i.e., No Codeable Response). Thus, both Stage 2 measures and SIMS Behavior Observation Codes variables proved to be highly sensitive to behavioral differences among teacher-nominated externalizing, internalizing, and nonranked comparison students, as reflected in both teacher ratings and direct observations recorded in natural settings by independent, professionally trained observers. Walker, Severson, Todis, Block-Pedego, Williams, Haring, and Barckley (1990) investigated two forms of discriminant validity for the SSBD. These were (a) the ability of the SSBD Stage 2 measures and SIMS Behavior Observation Codes variables to correctly classify the group membership assignments of study participants by teachers in screening Stage 1, and (b) the existence of statistically significant differences on Stage 2 measures and SIMS Behavior Observation Codes variables for first- vs. second-ranked students identified by their teachers in SSBD Stage 1. Using a discriminant function analysis procedure, these authors found that the SSBD Stage 2 measures and SIMS Behavior Observation Codes variables correctly classified 84.69% of the three participant groups selected by their teachers in screening Stage 1. In this same analysis, 142 of 150 (95%) nonranked comparison students were correctly classified, with four misclassified as externalizers and four misclassified as internalizers. Similarly, 56 of 69 (81%) externalizers were correctly classified, with 4 misclassified as nonranked comparison students and 9 misclassified as internalizers. Finally, 51 of 75 (68%) internalizers were correctly classified, with 13 misclassified as comparison students and 11 misclassified as externalizers. Although this level of classification efficiency far exceeds chance levels, the SSBD measures were least successful in correctly classifying internalizers whose behavioral characteristics appear to have less salience for adult raters and often more closely overlap with those of nonranked comparison students. 
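The omega squared coefficients reported for the one-way ANOVAs in this section estimate the proportion of variance in each measure accounted for by group membership. A minimal sketch of that computation, using hypothetical group score vectors rather than SSBD data, follows.

```python
# Omega squared effect size for a one-way ANOVA across three groups.
# The group score vectors are hypothetical placeholders, not SSBD data.
import numpy as np

groups = [
    np.array([54.0, 48.0, 60.0, 66.0, 51.0]),   # externalizers (e.g., AET %)
    np.array([70.0, 64.0, 75.0, 69.0, 72.0]),   # internalizers
    np.array([74.0, 78.0, 70.0, 80.0, 76.0]),   # comparison students
]

k = len(groups)
n_total = sum(len(g) for g in groups)
grand_mean = np.concatenate(groups).mean()

ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
ms_within = ss_within / (n_total - k)

omega_sq = (ss_between - (k - 1) * ms_within) / (ss_between + ss_within + ms_within)
print(f"omega squared = {omega_sq:.2f}")
```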
Independent t tests were used to test for significant differences between Stage 1 first- and second-ranked students on the Stage 2 measures and SIMS Behavior Observation Codes variables. For externalizers, all three Stage 2 measures discriminated at (p < .05) while the SIMS Behavior Observation Codes variables of Parallel Play and Total Positive Behavior discriminated at this level for first- and second-ranked students. All mean differences favored the highest ranked externalizing students as predicted. For internalizing participants, the corresponding discriminating Technical Manual | 39 SYSTEMATIC SCREENING FOR BEHAVIOR DISORDERS (SSBD) variables were the Critical Events Index, the Adaptive Behavior Scale, and the SIMS Behavior Observation Codes variable of Participation. As with externalizers, these differences favored the highest ranked internalizers. These results provide further empirical evidence for the sensitivity of teacher ranking judgments when using the Stage 1 screening procedures. Todis, Severson, and Walker (1990) investigated items on the Critical Events Index that significantly discriminated between externalizers and internalizers with scores on the SIMS Behavior Observation Codes that indicate risk for externalizing and internalizing disorder. Using two separate samples drawn from SSBD field test sites, a total of 41 participants (27 externalizers, 14 internalizers) enrolled in grades 1–5 from the two sites met these criteria. Nine of the 33 items of the Critical Events Index significantly discriminated between externalizers and internalizers. Table 15 contains these items along with the proportion of the two participant samples that had the item(s) checked as present by their teachers along with corresponding significance levels. Table 15 Comparison of Frequency of Items Checked on the Critical Events Index for Externalizing and Internalizing Elementary Students Who Met Risk Criteria on the SSBD Item Externalizers (n = 27) Internalizers (n = 14) p Value (2-tailed Fisher’s Exact Test) Ignores teacher reprimands 96.2% 7.7% <.01 Is physically aggressive 73.1% 0% <.01 Damages other’s property 50.0% 0% <.01 Steals 38.5% 0% .02 Uses obscene language, swears 34.6% 0% .02 Has tantrums 30.8% 0% .04 Makes lewd or obscene gestures 30.8% 0% .04 Demonstrates obsessive/ compulsive behavior 38.5% 7.7% .06 7.7% 84.6% <.01 57.7% 30.8% <.01 Exhibits painful shyness Is teased, neglected by peers Item 8 (“demonstrates obsessive/compulsive behavior”) in Table 15 approached statistical significance at (p < .06). Eight of these 10 items were in the predicted directions for externalizers and internalizers in terms of expected or anticipated prevalence rates. However, on two items (“is teased and/or neglected by peers” and “demonstrates 40 | Phase 2: Trial Testing, Field Testing, and Validation of SSBD Instruments obsessive-compulsive behavior”), teachers assigned higher rates to externalizers than to internalizers. Both of these items were originally judged by the authors to be more characteristic of internalizers than externalizers. In prior research, the Critical Events Index has proved to be highly sensitive in discriminating externalizers and internalizers from nonranked comparison students. The results of the above study indicate that nearly a third of the items discriminated externalizers from internalizers as well using the combined sample of 41 participants from two sites. 
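The item-level comparisons in Table 15 are two-tailed Fisher's exact tests on 2 × 2 tables of item presence by group. The sketch below shows that computation for a single Critical Events Index item, using hypothetical counts rather than SSBD data.

```python
# Two-tailed Fisher's exact test on a 2 x 2 table of item presence by group.
# The counts are hypothetical placeholders, not SSBD data.
from scipy.stats import fisher_exact

# Rows: item checked / not checked; columns: externalizers, internalizers
table = [[20, 1],
         [7, 13]]

odds_ratio, p_value = fisher_exact(table, alternative="two-sided")
print(f"two-tailed Fisher's exact p = {p_value:.4f}")
```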
The SSBD Stage 2 measures were completed on 106 participants by teachers within the North Idaho Children’s Home, a residential facility serving severely emotionally disturbed and/or abused children in the K–12 grade range. This facility maintains two residential programs and two day treatment programs. Elementary and middle school age students were included in the sample rated by their teachers for this study. A total of 52 participants were rated from the regular residential population, along with a total of 17 participants from the secure treatment residential program, which serves severely involved students (e.g., homicidal, suicidal, depressed and so forth). The nonresidential portion of the sample comprised 20 day treatment students served by the residential facility and 17 students assigned to a community-based educational program operated by the facility staff on the campus of a cooperating college. Table 16 contains means and standard deviations for the four participant groups on the three Stage 2 measures. Table 16 Means, Standard Deviations, and Significance Tests for Four Participant Groups of North Idaho Children’s Home Residents (N = 106) PARTICIPANT GROUP Day Treatment Secure Treatment (n = 20) (n = 17) Residential (n = 52) Community (n = 17) M SD M SD M SD M SD Critical Events Index 5.50 3.26 9.11 3.93 5.84 4.13 6.47 4.16 Adaptive Behavior Rating Scale 3.26 0.61 3.09 0.73 3.40 0.60 3.27 0.80 Maladaptive Behavior Rating Scale 2.73 0.89 2.90 0.65 2.53 0.77 2.64 0.87 Technical Manual | 41 SYSTEMATIC SCREENING FOR BEHAVIOR DISORDERS (SSBD) For the Critical Events Index, the entries in Table 16 represent the average number of items checked as present by the teacher; for the Adaptive and Maladaptive Behavior Scales, the entries are the average item scores, on a 5-point Likert rating scale of frequency, as assigned by teachers. Separate one-way ANOVAs conducted for these three measures indicated statistically significant differences among the four groups on the Critical Events Index (3,102) = 3.22, p < .02), but no significant differences on either the Adaptive or Maladaptive Behavior Scale. A post hoc Scheffé test indicated that the means for the secure treatment (n = 17) and regular residential (n = 52) students were significantly different (p < .05); none of the other mean differences approached significance. Chi-square analyses were conducted for the four groups on each of the 33 items comprising the Critical Events Index to identify discriminating items. Table 17 contains the results of these analyses along with average percentages of each participant group having that item checked by their teachers as present. Seven of the 33 items significantly discriminated the four groups at (p < .05). Students in the secure treatment group had the highest prevalence rates of any of the four groups on five of the seven discriminating items. Their profiles across the 33 items of the Critical Events Index are extremely problematic and indicate very severe levels of behavioral pathology. In addition, profiles of the other three participant groups were also highly indicative of serious behavioral adjustment problems in comparison with nonranked students of the same age. The sensitivity of the Critical Events Index in profiling the greater severity of the secure treatment participant group and in discriminating the other three groups from nonranked students provides additional evidence in support of its discriminant validity. 
42 | Phase 2: Trial Testing, Field Testing, and Validation of SSBD Instruments Table 17 Chi-square Analysis of Critical Events Items for Four North Idaho Children’s Home Participant Groups CRITICAL EVENT ITEM 1. Steals 2. Sets Fire 3. Vomits 4. Tantrums 5. Assaults Adult* 6. Painful Shyness 7. Weight Loss/Gain 8. Sad Affect* 9. Physically Aggressive 10. Damages Property 11. Obsessive-Compulsive 12. Nightmares 13. Sexual Behaviors 14. Self-Abusive 15. Seriously Injure 16. Suddenly Cries 17. Severe Headaches* 18. Suicide Thoughts 19. Thoughts Disorders* 20. Ignores Warnings 21. Lewd Gestures 22. Physical Abuse 23. Drug Abuse 24. Sexually Abused 25. Obscene Language 26. Cruelty to Animals 27. Teased by Peers* 28. Restricted Activity* 29. Enuretic 30. Encopretic 31. Sexually Molests Children 32. Hallucinations 33. Lacks Interests* PERCENTAGE OF PARTICIPANTS HAVING ITEM CHECKED Day (n = 20) 70 — 5 60 — 15 10 30 30 30 20 15 — 5 — 20 10 20 — 45 20 5 — 15 45 — 55 5 5 5 5 — 5 Secure (n = 17) 18 — — 52.9 29.4 41.1 17.6 76.4 35.2 29.4 58.8 11.7 17.6 17.6 — 32.5 5.8 5.8 58.8 64.7 47.0 11.7 5.8 17.6 64.7 5.8 70.5 41.1 — 11.7 17.6 5.8 35.2 Residential (n = 52) 25 — 7.6 53.8 53.8 15.3 1.9 61.5 21.1 21.1 42.3 26.9 15.3 11.5 3.8 13.4 34.6 13.4 23.0 40.3 15.3 — 13.4 21.1 40.3 1.9 21.1 5.7 9.6 3.8 5.7 5.7 1.9 Community (n = 17) 23.5 5.8 5.8 64.7 64.7 17.6 5.8 35.2 23.5 35.2 35.2 76.4 5.8 29.4 11.7 35.2 23.5 5.8 23.5 47.0 23.5 11.7 23.5 — 58.8 — 35.2 11.7 — — — 5.8 — *Significance at p < 0.05 Technical Manual | 43 SYSTEMATIC SCREENING FOR BEHAVIOR DISORDERS (SSBD) As part of a larger study of the social competence of regular students enrolled in second- and fourth-grade classrooms, Hops, Lewin, Walker, and Severson (1990) incorporated the SSBD Stage 1 measures as part of their overall data collection procedures. A total of 47 participants were participants in this part of their overall study, with 26 participants enrolled in grade 2 and 21 enrolled in grade 4. Of the 26 second graders, 10 were highly ranked by their teachers as externalizers, 9 were highly ranked as internalizers, and 7 did not appear on either rank-order list using the SSBD Stage 1 procedures. For the 21 fourth graders, there were 7 each externalizers, internalizers, and nonranked students. The second- and fourth-grade participant groups were combined for purposes of statistical analysis. These authors recorded a broad range of social competence and academic skill measures on all students, including the participants described above, who were enrolled in the participating classes and for whom prior parental consent was obtained. These measures included direct observations recorded on child and/or teacher behavior in classroom and playground settings, sociometric assessment procedures, and teacher ratings of the students’ status on such dimensions as aggression, social withdrawal, and academic competence. Comparative analyses were conducted of the SSBD participant groups’ relative status on these measures. Separate one-way ANOVAs were conducted for each of these study variables, with post hoc tests conducted for variables yielding significant F ratios. Results of these analyses are presented in Table 18. Table 18 contains means, standard deviations, and results of corresponding post hoc test results for each statistically significant study variable. A relatively large number of variables in the Hops et al. (1990) study discriminated the three participant groups. 
As a rule, mean levels and corresponding discriminations were in expected directions (e.g., externalizers were rated as significantly more aggressive than internalizers or nonranked students, internalizers were rated as significantly more socially withdrawn than either of these participant groups, both groups had lower average likability ratings assigned by their teachers, externalizers and internalizers had lower peer preference scores than nonranked students, externalizers had much higher observed rates of teacher disapproval in reading than either internalizers or nonranked students and so forth). However, it should be noted that the codes used by Hops et al. (1990) for recording the participants’ playground social behavior were not sensitive in discriminating the three groups. 44 | Phase 2: Trial Testing, Field Testing, and Validation of SSBD Instruments Table 18 Means, Standard Deviations, and Significance Tests for Fourth-Grade Externalizing, Internalizing, and Nonranked Students (N = 47) Externalizers Internalizers Nonranked M SD M SD M SD TEACHER RATINGS OF SOCIAL SKILLS Subscale 1 46.001 11.31 56.68 16.19 66.35 12.26 Subscale 2 1 58.52 13.95 53.31 15.12 70.78 7.99 Subscale 3 1 31.76 8.16 36.75 10.61 45.64 5.12 2 TEACHER RATINGS OF ACADEMIC SKILLS Reading 5.111 2.34 5.182 2.31 7.71 1.20 Math 5.35 2.26 6.00 1.78 7.85 1.40 TEACHER RATINGS OF BEHAVIORAL ATTRIBUTES Likability 0.701 0.77 0.872 1.20 2.14 1.40 Aggression 7.58 4.01 1.68 2.54 1.35 2.37 Withdrawal 1.29 Popularity 7.111 Peer Preference (Positive choices minus negative choices) Social Impact (Positive choices plus negative choices) 1,3 1.79 2 3.31 2.96 0.42 0.75 6.43 6.682 5.58 13.71 5.95 -0.501 1.13 -0.432 0.96 0.58 0.80 0.27 0.99 0.10 0.67 0.48 0.72 3 SOCIOMETRIC STATUS OBSERVED TEACHER BEHAVIOR Teacher Disapproval (Reading) 0.061,3 0.08 0.01 0.04 0.01 0.04 Teacher Disapproval (Math) 0.01 0.06 0.02 0.05 0.02 0.06 OBSERVED STUDENT BEHAVIOR Classroom On Task (Reading) 75.361 24.91 86.16 10.87 84.01 27.73 On Task (Math) 86.22 9.80 85.92 11.63 90.96 7.69 15.66 23.11 5.85 11.23 20.84 35.48 % Negative Social Behavior 0.57 1.53 0.10 0.41 0.00 0.00 % Time Alone 4.45 5.36 7.68 9.10 8.76 17.32 Playground Sociability Index (Initiations to peers/ Initiations from peers) 1 = Externalizers vs. Nonranked Students (p < .05) 2 = Internalizers vs. Nonranked Students (p < .05) 3 = Externalizers vs. Internalizers (p < .05) Technical Manual | 45 SYSTEMATIC SCREENING FOR BEHAVIOR DISORDERS (SSBD) Block-Pedego, Walker, Severson, Todis, and Barckley (1989) conducted a study in which the Alone category of the SIMS Behavior Observation Codes was used as a selection variable for identifying subgroups of students identified as externalizers and internalizers in Stage 1. Those students who had very low rates of social contact with peers in free-play settings were selected for inclusion in the study. A pool of externalizers (n = 22) and internalizers (n = 30) represented the highest 25% of scores on the Alone variable of the SIMS Behavior Observation Codes. A random sample of 52 participants was selected from the pool of nonidentified students who served as controls in difference analyses conducted on SSBD Stage 2 measures and SIMS Behavior Observation Codes variables in order to identify behavioral correlates of large amounts of time spent alone in free-play settings. The Critical Events Index average scores in this study were 4.50 (SD = 2.90), 2.30 (SD = 2.08) and .08 (SD = .27) for externalizers, internalizers, and nonranked students, respectively. 
Means for externalizers and internalizers were significantly different from the mean for nonranked students at (p < .05). The Adaptive Behavior Scale average scale scores for these groups were 31.53 (SD = 6.89), 42.80 (SD = 8.75), and 55.55 (SD = 3.26); similarly, Maladaptive Behavior Scale average scale scores were 35.18 (SD = 5.90), 19.00 (SD = 4.59), and 13.76 (SD = 3.00). The mean scores for externalizers and internalizers were significantly different from control participant means for both scales at (p < .05). Item-by-item comparisons were also conducted for the three participant groups on the Stage 2 rating scales. Means and standard deviations for each of these item comparisons are presented in Table 19. The results of post hoc tests designating statistical significance at (p < .05) are also presented in Table 19 on an item-by-item basis for the two Combined Frequency Index scales. Inspection of this table indicates that all of the rating scale items discriminated externalizers from nonranked participants; these same items were substantially less powerful in discriminating internalizers and nonranked participants. Nearly all these items discriminated externalizers from internalizers. Separate t tests were conducted for externalizers vs. nonranked participants and for internalizers vs. nonranked participants on each of the SIMS Behavior Observation Codes PSB categories. For externalizers vs. 46 | Phase 2: Trial Testing, Field Testing, and Validation of SSBD Instruments Table 19 Means, Standard Deviations, and Significance Tests for Participant Groups on the Adaptive and Maladaptive Rating Scale Items (N = 104) ADAPTIVE BEHAVIOR RATING SCALE A1. Follows established classroom rules A2. Is considerate of the feelings of others A3. Produces work of acceptable quality given her/his skill level A4. Gains peers’ attention in an appropriate manner A5. Expresses anger appropriately (i.e., reacts to situations without being violent or destructive) A6. Cooperates with peers in group activities or situations A7. Makes assistance needs known in an appropriate manner (e.g., asks to go to the bathroom, raises hand when finished with work, asks for help with work, etc.) A8. Is socially perceptive (i.e., "reads" social situations accurately) A9. Does seatwork assignments as directed A10. Compliments peers regarding their behavior or personal attributes (e.g., appearance, special skills, etc.) A11. Complies with teacher requests and commands A12. Initiates positive social interactions with peers MALADAPTIVE BEHAVIOR RATING SCALE M1. Requires punishment (or threat of same) before she/ he terminates an inappropriate activity or behavior M2. Refuses to participate in games and activities with other children at recess M3. Behaves inappropriately in class when corrected (e.g., shouts back, defies the teacher, etc.) M4. Responds inappropriately when other children try to interact socially with her/him M5. Child tests or challenges teacher-imposed limits (e.g., classroom rules) M6. Uses coercive tactics to force the submission of peers (e.g., manipulates, threatens, etc.) M7. Creates a disturbance during class activities (e.g., is excessively noisy, bothers other students, out of seat, etc.) M8. Manipulates other children and/or situations to get his/her own way M9. Is overly affectionate with others (peers and adults), e.g., touching, hugging, kissing, hanging on, etc. M10. Is excessively demanding (e.g., requires or demands too much individual attention) M11. 
For externalizers vs. nonranked participants, the following codes and variables, derived from code category combinations, discriminated at p < .05: Parallel Play, Alone, Positive Interaction, and Negative Interaction. For internalizers vs. nonranked participants, the discriminating code categories were Parallel Play, Alone, and Positive Interaction. Behavioral profiles on each of these code categories favored the nonranked participants.

Walker, Severson, Todis, Block-Pedego, Williams, Haring, and Barckley (1990) examined the school records of 85 high-ranked externalizers, 86 high-ranked internalizers, and 161 nonranked comparison students from their larger sample of 856 participants in grades 1–5. Their school records were systematically coded using the School Archival Records Search (SARS) procedure (Walker, Block-Pedego, Todis, & Severson, 1991). The SARS makes it possible to systematically code 11 archival school record variables commonly found in school folders (e.g., referrals, achievement, attendance, discipline contacts, and so forth). Table 20 contains profiles of the three participant groups on each of these SARS variables.

Table 20
SARS Record Search Profiles for Externalizing, Internalizing, and Nonranked Control Students (N = 332)

SARS Variables                                  Externalizers (n = 85)     Internalizers (n = 86)     Nonranked Control (n = 161)
Achievement Test                                M = 38.35, SD = 27.58      M = 43.61, SD = 28.73      M = 68.56, SD = 23.74
                                                  YES    NO                  YES    NO                  YES    NO
Behavioral Referrals Within School                 18    67                    5    81                    0   161
Academic Referrals Within School                   25    60                   28    58                    9   152
Current IEP                                        18    67                   22    64                    8   153
Placement in Non-Regular Classroom                 17    68                   22    64                   18   143
Chapter I Services                                 30    55                   33    53                   18   143
Referrals Out of School                             9    76                    3    83                    1   160
Negative Narrative Comments                        46    39                   35    51                   17   144
Disciplinary Contacts with Principal               37    48                    5    81                    5   156
Speech and Language Referrals Within School        10    75                   24    62                   13   148

Note: p < .01 for all SARS variables on all chi-square group comparisons.

Chi-square analyses indicated that all of the SARS variables discriminated the three participant groups at p < .01. The Yes column shows the number of students who met SARS at-risk criteria for each variable; the No column shows those who did not.
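To illustrate the form of these comparisons, the sketch below recomputes a chi-square test for one Table 20 variable from the reported Yes/No counts. It is an illustrative recomputation using SciPy, not the original analysis code.

```python
# Illustrative recomputation (not the original analysis): a 3 x 2 chi-square
# test on one Table 20 variable, using the Yes/No counts reported for
# Disciplinary Contacts with Principal.
from scipy.stats import chi2_contingency

# Rows: externalizers, internalizers, nonranked controls; columns: Yes, No.
disciplinary_contacts = [
    [37, 48],   # externalizers (n = 85)
    [5, 81],    # internalizers (n = 86)
    [5, 156],   # nonranked controls (n = 161)
]

chi2, p_value, dof, expected = chi2_contingency(disciplinary_contacts)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p_value:.4f}")
```

With these counts, the test is significant well beyond the p < .01 level, consistent with the result reported above.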
Inspection of Table 20 indicates that the SARS behavioral profiles for externalizers and internalizers were highly problematic and document extensive adjustment problems in the school setting. Their profiles were dramatically different from those of the nonranked comparison students. Although both groups appear to be considerably at risk for school failure, the profiles for externalizers were somewhat more problematic than the internalizers' profiles. Thus, teacher-identified participant groups in SSBD Stage 1 proved to have school records profiles indicating very serious adjustment problems, a finding that provides strong support for the utility of teacher judgment in the screening and identification of behaviorally at-risk students.

Finally, as part of a larger study that investigated the construct validity of the overall SSBD procedure, Walker, Block-Pedego, Severson, Todis, Williams, Barckley, and Haring (1991) examined the social competence of externalizing and internalizing participant groups. Participants for this study were participating teachers (grades 1–5) and students (n = 280) in their classrooms in a single elementary school located in the Peninsula School District of Washington State. Two teachers from four of the grade levels and three teachers from one grade level were study participants. The SSBD Stage 1 and 2 measures and SIMS Behavior Observation Codes variables were recorded for students participating in the study. In addition, sociometric procedures, teacher ratings of social skills, and SARS archival school record searches were also completed for these participants.

These authors used a sociometric assessment procedure, in which students indicated their three most preferred work and play choices, to identify participants for the study. Students who received one or fewer work and play choices in this assessment procedure were selected for study inclusion. Results of this procedure indicated that 22% of the externalizing participants (n = 24), 25% of the internalizing participants (n = 27), and 5% of nonranked comparison participants (n = 3) met the criterion for being socially isolated. For comparative purposes, isolate and nonisolate groups of externalizers and internalizers were constructed by the authors. Isolate participants were those ranked by their teachers as having either externalizing or internalizing behavior patterns and as receiving one or fewer work and play choices from their peers. Comparison students (n = 57) consisted of those who did not appear on either the externalizing or internalizing rank-order teacher lists and who had more than one work and play choice from peers.

The isolate and nonisolate participant groups were compared on teacher ratings of social skills, the SSBD Stage 2 measures, and the SIMS Behavior Observation Codes PSB variables. Table 21 contains means, standard deviations, and results of statistical tests of participant group differences on each of these variables. This table also contains means and standard deviations for the nonranked comparison students to provide a normative standard for evaluating the externalizing and internalizing participants' behavioral levels. Substantial mean differences were identified between the isolate and nonisolate externalizing and internalizing participant groups on each of the three major classes of dependent measures.
Total scale score and all three subscales of the Walker-McConnell scale discriminated isolates from nonisolates on both the externalizing and internalizing dimensions. Similarly, all three Stage 2 instruments significantly discriminated these participant groups. Isolates had less favorable behavioral profiles on these two classes of measures than nonisolates in every case. Four SIMS Behavior Observation Codes PSB categories discriminated isolates from nonisolates: the No Code category discriminated isolate from nonisolate externalizers, and the categories of Parallel Play, Alone, and Total Positive Behavior discriminated isolate from nonisolate internalizers. As with the other study measures, the profiles for nonisolates were more favorable than those for isolates on each of these variables. Overall, these results demonstrate the sensitivity of teacher judgment to differences in child behavior as expressed through the SSBD instruments.

Table 21 (Means, Standard Deviations, and Significance Tests for Isolate and Nonisolate Participants on Teacher Social Skills Ratings and SSBD Stage 2 and SIMS Observation Codes Measures) reports, for normal controls and for isolate and nonisolate externalizers and internalizers, means and standard deviations on the teacher social skills ratings (Subscale 1: Teacher Preferred; Subscale 2: Peer Preferred; Subscale 3: School Adjustment; Total Scale Score), the SSBD Stage 2 measures (Critical Events Index, Adaptive Behavior Rating Scale, Maladaptive Behavior Rating Scale), and the SIMS observation coding measures (Parallel Play, Alone, No Code, Total Positive), along with F tests and p values for the isolate vs. nonisolate comparisons within each dimension.

Table 22
Correlations Between Year One and Year Two Follow-up Scores on SSBD Stage 2 Measures for Combined Externalizing and Internalizing Groups (N = 155)

                              Critical Events Index    Adaptive Rating Scale    Maladaptive Rating Scale
Critical Events Index          .32 (p < .01)           −.41 (p < .01)            .36 (p < .02)
Adaptive Behavior Scale       −.26 (p < .02)            .45 (p < .01)           −.39 (p < .01)
Maladaptive Behavior Scale     .34 (p < .01)           −.55 (p < .01)            .70 (p < .01)

Inspection of Table 22 indicates that the correlations for the Stage 2 measures across the 1-year follow-up period ranged from .32 (Critical Events Index) to .70 (Maladaptive Behavior Scale). Correlations among these measures at each time point were in the low to moderate range.
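Stability coefficients of the kind shown in Table 22 are ordinary Pearson correlations between Year 1 and Year 2 scores. The sketch below illustrates the computation with hypothetical scores; it is not drawn from the SSBD follow-up sample.

```python
# Hypothetical illustration of a Year 1 to Year 2 stability coefficient for a
# Stage 2 measure (e.g., Maladaptive Behavior Scale ratings); the scores here
# are invented and do not come from the SSBD follow-up data.
from scipy.stats import pearsonr

year1_maladaptive = [35, 28, 41, 22, 30, 38, 19, 27, 33, 45]
year2_maladaptive = [33, 25, 44, 20, 34, 36, 21, 24, 30, 47]

r, p_value = pearsonr(year1_maladaptive, year2_maladaptive)
print(f"test-retest r = {r:.2f}, p = {p_value:.3f}")
```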
The SIMS Behavior Observation Codes variables, recorded in the follow-up year (1987–88), were entered into a discriminant function analysis in order to test their accuracy in classifying the participants' group membership in the previous year (1986–87). A total of 10 observation code categories and combinations were used as predictors in this analysis. Overall, these variables correctly classified 53% of participants in the three student groups, with 44%, 43%, and 60% of externalizers, internalizers, and nonranked students, respectively, correctly classified. Table 23 lists the predictor variables for each of the two discriminant functions in the analysis, arranged by order of magnitude according to the size of their correlations within each function.

Table 23
SIMS Behavior Observation Codes Predictor Variables for Discriminant Analysis Classifying Previous Year's Participant Group Status (N = 155)

PSB Variables              Function 1    Function 2
No Code                        .59           .21
Total Positive                −.59           .12
Academic Engaged Time         −.56           .02
Alone                          .51           .23
Positive Interaction          −.44           .27
Total Negative                −.34           .80
Negative Interaction           .20           .75
Social Engagement             −.28           .49
Parallel Play                  .19          −.24
Participation                  .11          −.20

These results indicate that the follow-up observation data classified the participants' group status at Rating Time 1 above chance expectation levels. However, the classification efficiency of these variables was only in the moderate range. Overall, these results suggest relatively modest levels of predictive validity for the SSBD measures as recorded over a 1-year follow-up period. The fact that 69% and 52% of externalizers and internalizers, respectively, appeared in their teachers' top three Stage 1 rankings in Year 2 suggests moderate behavioral stability on the part of these participants and considerable teacher sensitivity to their behavioral characteristics. The stability of the participants' maladaptive behavior, as rated by teachers, was substantially higher than the stability of either their adaptive behavior or their status on the Critical Events Index.

Construct Validity

This type of validity refers to an instrument's ability to measure a particular construct and is usually established through indirect evidence and inference (Salvia & Ysseldyke, 1988). Two examples of the SSBD's construct validity have been demonstrated. In the Walker, Severson, Todis, Block-Pedego, Williams, Haring, and Barckley (1990) study, the authors identified 40 elementary classrooms (grades 1–5) in a cooperating school district in the State of Washington in which a total of 54 certified SED students had been previously mainstreamed. The participating teachers were not informed about these students or about their relationship to the proposed study. These teachers completed SSBD Stages 1 and 2 for their classrooms. SARS record searches were also completed on eight students from each classroom (three externalizers, three internalizers, and two nonranked students). The authors were especially interested in whether the 54 SED students would appear among their teachers' top three ranks on either the externalizing or internalizing dimension in SSBD Stage 1. Teachers assigned all 54 SED students to either the externalizing (n = 45) or internalizing (n = 9) rank order lists.
In terms of their actual rank orders, 39 of the 45 externalizers were ranked by their teachers among the top three students on the externalizing list of 10 ranked students, and all 9 internalizers appeared among the top three ranks. These results provide support for the sensitivity of the SSBD in measuring the construct of school-related behavior disorders.

Severson, Walker, Barckley, Block-Pedego, Todis, and Rankin (1989) investigated the construct validity of the SSBD procedure using participants enrolled in grades 1, 3, and 5 of a single elementary school in Washington State. Data were collected on all students (N = 76) in the three classrooms using the following instruments:

•• Sociometric assessments (work and play)
•• Walker-McConnell Scale of Social Competence and School Adjustment
•• SIMS Behavior Observation Codes for Academic Engaged Time (AET)
•• SIMS Behavior Observation Codes for Peer Social Behavior (PSB)
•• School Archival Records Search (SARS)
•• SSBD Critical Events Scale
•• SSBD Adaptive and Maladaptive Behavior Scales

The focus of the analyses conducted on these variables was to confirm the degree of problematic adjustment status of high-ranked externalizers and internalizers in SSBD Stage 1, as indicated by multiple indicators of school adjustment (e.g., sociometrics, direct observations, school records, social skills ratings). These school adjustment indicators provided a large number of specific measures that could register confirmatory evidence of school adjustment problems. Two subsets of five variables each were chosen from this array for the internalizing and externalizing participant groups on both theoretical and empirical grounds. That is, five variables were selected for each behavioral dimension that fit a theoretical model of factors likely to contribute to the internalizing vs. externalizing dimensions and that made a unique contribution to total variance based on their partial correlations within each dimension. Five variables were judged to be an appropriate number for a sample size of 76.

The five-variable set for externalizers consisted of:

•• School adjustment subscale of the Walker-McConnell Social Skills Scale
•• Sociometric work and play choices
•• Amount of negative social interaction, based on SIMS Behavior Observation Codes
•• Amount of total positive social behavior, based on SIMS Behavior Observation Codes
•• SIMS Behavior Observation Codes PSB No Code category

The variable set for internalizers was as follows:

•• Peer subscale of the Walker-McConnell Social Skills Scale
•• Sociometric work and play choices
•• Referral for speech and language (SARS)
•• Amount of time spent alone on the playground, based on SIMS Behavior Observation Codes
•• Amount of time spent in parallel play at recess, based on SIMS Behavior Observation Codes

These variables were then submitted to a discriminant function analysis using a direct entry method, in which the groups to be discriminated were the top three ranked students (teacher ranks 1–3) vs. the bottom seven (teacher ranks 4–10) on both the externalizing and internalizing dimensions. These analyses created two discriminant functions, which were linear combinations of weighted standard scores, thus making it possible to aggregate these sources of data into an overall score for each participant.
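To make this aggregation step concrete, the sketch below shows one way a weighted linear composite of standardized scores can be produced with a discriminant function. It uses scikit-learn's LinearDiscriminantAnalysis on made-up values and is only a schematic stand-in for the original analysis, not a reproduction of it.

```python
# Schematic stand-in for the aggregation described above: standardize the five
# selected variables, fit a discriminant function separating top-three vs.
# bottom-seven ranked students, and treat the resulting weighted composite as a
# single "risk index" score per participant. All values are hypothetical.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 5))          # 10 ranked students x 5 selected variables
ranks = np.arange(1, 11)              # teacher ranks 1-10
groups = (ranks <= 3).astype(int)     # 1 = top three ranks, 0 = ranks 4-10

z_scores = StandardScaler().fit_transform(X)
lda = LinearDiscriminantAnalysis(n_components=1).fit(z_scores, groups)
risk_index = lda.transform(z_scores).ravel()   # weighted composite per student

print(np.round(risk_index, 2))
```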
This aggregated score served as the criterion variable for establishing the "behavioral at-risk status" of each participant. The students who appeared on the externalizer (n = 10) and internalizer (n = 10) teacher-generated lists were each rank ordered on this "risk index" according to the composite scores they received based on the five variables identified for externalizers or internalizers. The two discriminant function analyses determined how well SSBD Stage 1 teacher rankings classified participants as being in the top three ranks vs. the bottom seven ranks on this index for the externalizing and internalizing dimensions. The Spearman rank order correlation between the problem behavior index and the SSBD Stage 1 rankings was rho = .71 (p < .001) for externalizers and rho = .76 (p < .001) for internalizers. The results of the discriminant function analyses indicated that Stage 1 teacher rankings correctly classified 73% of externalizers as being in the top three ranks vs. ranks 4–10 on the index; similarly, Stage 1 teacher rankings correctly classified 82% of the internalizers (ranks 1–3 vs. 4–10) on this index. These classification rates are well above chance expectations. The resulting Wilks' lambdas of .54 and .50 for these analyses were both significant at p < .05.

In the authors' view, these results provide confirmatory evidence, of a largely external nature, that teacher rankings on the externalizing and internalizing dimensions, as expressed through the structured format of the SSBD procedure, are predictive of school adjustment problems as indicated by a series of multiple measures of behavioral status. The correlations between teacher-generated Stage 1 rankings and rankings of the same participants on the externally generated behavioral deviance index are in the moderate to high range of magnitude and provide evidence in support of the SSBD's construct validity.

Social Validity

The SSBD procedure is broadly perceived by both experts and practitioners in education as representing an innovative, best-practices approach to the task of behavioral screening and to identifying appropriate candidates for further evaluation and access to supports, services, and intervention(s). A large number of educators and psychologists in higher education, state education department offices, and school district settings have endorsed the SSBD as an effective response to the need for systematic universal screening and identification and for the development of higher quality referrals (i.e., students with behavioral problems/disorders whose school adjustment is seriously impaired). On the basis of documented need and the strength of the research data presented, and following a rigorous peer review, the SSBD received program validation in the form of approval by the U.S. Department of Education's Program Effectiveness Panel in February 1990. This panel of experts in the field of evaluation viewed the SSBD as effective in meeting its stated goals, as adoptable and transportable to other sites, and as a program that fills a critical need in education. PEP validation made the SSBD eligible for access to funds supporting its national dissemination and adoption. Since its publication, the SSBD has been positively reviewed by a number of professionals concerned with the EBD student population.
Phase 3: Extensions of SSBD Instruments

The SSBD was originally validated for use with students in Grades 1–6. Since its release, practitioners and researchers have explored the relevance of the SSBD to contexts outside of Grades 1–6. These extensions characterize Phase 3 of SSBD validation, which is ongoing and seeks to validate use of the SSBD with students in other grades, including preschool/kindergarten and middle school/junior high students.

The Early Screening Project: Using the SSBD with Preschool and Kindergarten Students

Since the introduction of the SSBD, there has been considerable interest among other researchers in using the system with preschool children, and preliminary investigations confirmed that the SSBD could be successfully adapted for preschool use.

History and Development of ESP

The primary adaptation of the SSBD procedure was to alter the SIMS Behavior Observation Codes procedures. Eisert, Walker, Severson, and Block-Pedego (1989) found that the Peer Social Behavior observations were able to discriminate reliably among preschool groups of externalizers, internalizers, and control children. Sinclair, Del'Homme, and Gonzalez (1993) also reported a pilot study using the SSBD with preschool children. Sinclair et al. used the SSBD intact except that (1) in Stage 1, the teachers were asked to nominate and rank seven externalizers and seven internalizers (out of classes of 15), (2) the direct observation of Academic Engaged Time was eliminated, and (3) the direct observation of Peer Social Behavior during free play in the classroom and on the playground was doubled to four 10-minute sessions. The three top-ranked externalizers and internalizers were followed up with the SSBD Stage 2 rating scales and the SIMS Behavior Observation Codes. While their results were encouraging, Sinclair et al. found that changes were needed to make the SSBD more appropriate for the preschool population. For example, the cutoff criteria for defining problem children needed adjustment to take into account the developmental status of younger vs. older children (e.g., younger children engage in more parallel and solitary forms of play than do older students).

In 1990, Edward Feil, working with the SSBD authors Walker and Severson, began to modify the SSBD to make it appropriate for younger children. From 1990 to 1994, this research was supported in part through grants from (1) the U.S. Department of Education, Office of Special Education and Rehabilitative Services, Research in Education of the Handicapped Program: Student-Initiated and Field-Initiated Research, and (2) the U.S. Department of Health and Human Services, Administration for Children and Families: Head Start Research Fellows Program. This work resulted in publication of Feil's dissertation research (Feil, 1994; Feil & Becker, 1993). The modified screener became known as the Early Screening Project (ESP).

In revising the original elementary-based SSBD for use with preschoolers, the authors found it necessary to consider changing some of the SSBD procedures. Because most preschool children will exhibit some problem behaviors at one time or another (Campbell, 1990; Paget, 1990), the frequency and intensity of the behaviors were most likely the important discriminative features.
The Stage 2 SSBD behavior checklist measures were substantially modified to make them appropriate for rating preschool-level children. Approximately half of the occurrence/nonoccurrence items on the Critical Events Index were changed to a 5-point Likert scale to allow better reporting of the frequency and/or intensity of behavior problems. Items regarding academics were omitted because they do not apply to preschool activities, and wordings were changed to make the items more appropriate for preschool children. Items that specifically referred to aggressive acting-out behavior were combined into a new scale titled the Aggressive Behavior Scale. In all, nine SSBD occurrence/nonoccurrence items were converted to frequency ratings and used for externalizers only. The Critical Events Index for preschool and kindergarten contains 16 occurrence/nonoccurrence items, and the Aggressive Behavior Scale consists of nine 5-point Likert response items that are sensitive to both frequency and intensity dimensions.

In order to better distinguish children with internalizing behavior problems (who are generally more difficult to identify accurately), the authors used the Social Interaction Rating Scale (Hops, Walker, & Greenwood, 1988) for children who were highly ranked as internalizers. This scale uses eight 7-point Likert-type behavioral items that (1) correlated with observational measures of social interaction and (2) discriminated between appropriate referrals and nonranked peers. A score of 28 or less successfully discriminated between referred children and their typical nonreferred classmates, with 90% correct classification.

This promising adaptation of the SSBD procedure for use in preschools led to additional validation studies and to the inclusion of preschool and kindergarten students in the screening procedures outlined in the 2nd edition of the SSBD. These validation studies support the reliability and validity of preschool and kindergarten applications of the SSBD and are described in more detail below.

Validation Studies: Reliability

Interrater Reliability

Pearson correlations and Kappa coefficients between raters (i.e., teacher/assistant teacher pairs) were computed for Stage 1, Stage 2, and the concurrent measures (i.e., the Preschool Behavior Questionnaire and Conners Teacher Rating Scale, when applicable) to obtain interrater reliability coefficients. For Stage 1, a cross-tabulation table was constructed, considering only whether a child was nominated to be among the three highest ranked externalizers and internalizers by the teacher and assistant teacher. Kappa coefficients computed between the teachers and assistant teachers ranged from .42 to .70. These coefficients show that Stage 1 has adequate reliability for screening purposes. In Stage 2, comparing the teachers' and assistant teachers' scale scores resulted in highly significant reliability coefficients ranging from .48 to .79, with a median coefficient of .71. These coefficients are equal to those of the Preschool Behavior Questionnaire (Behar & Stringfield, 1974) and the Conners Teacher Rating Scale (Conners, 1989), two published measures used for the identification of preschool behavior problems (e.g., Attention Deficit Hyperactivity Disorder and Oppositional Defiant Disorder).

The observational interrater reliability coefficients were calculated from a random sample of 20% of the observations. Interrater reliability was derived by dividing the smaller score by the larger score.
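The Stage 1 agreement coefficients described above are Cohen's kappa statistics computed on nomination decisions. The sketch below shows one conventional way to compute such a coefficient with scikit-learn; the nomination vectors are hypothetical and do not come from the ESP data.

```python
# Hypothetical illustration of Stage 1 interrater agreement: for each child,
# 1 = nominated among the three highest-ranked externalizers, 0 = not nominated.
# The nomination vectors are invented for illustration only.
from sklearn.metrics import cohen_kappa_score

teacher_nominations   = [1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0]
assistant_nominations = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0]

kappa = cohen_kappa_score(teacher_nominations, assistant_nominations)
print(f"kappa = {kappa:.2f}")
```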
In two research studies, this smaller-over-larger ratio provided a proportion indicator of rater differences, weighted for length of observation, and resulted in coefficients of .87 and .88, which is within acceptable limits for a screening device of this type (Salvia & Ysseldyke, 1988).

Test-Retest Reliability

For test-retest reliability, teachers and assistant teachers were asked to rank order and rate the children again in the spring, after a 6-month interim period. In Stage 1, considering only whether a child was nominated by the teacher and assistant teacher to be in the three highest ranked externalizer and internalizer groups, a cross-tabulation table was constructed to examine stability over time. Kappa coefficients were computed between the teachers and assistant teachers and resulted in coefficients of .59 for externalizers and .25 for internalizers. These coefficients show a drop, but this is to be expected with 6 months between data collection periods. Classroom fall and spring scores on the Critical Events, Adaptive, and Maladaptive scales were compared and resulted in highly significant correlations ranging between .75 and .91, with a median correlation of .77. Correlations of classroom fall and spring scores on the two concurrent measures in this study (i.e., the Preschool Behavior Questionnaire and Conners scales) resulted in highly significant coefficients ranging between .61 and .79, with a median correlation of .72.

One study assessed the ESP measures, as compared to a concurrent measure (i.e., the Conners scale), over a 1-year test-retest reliability period. Pearson correlation coefficients for the ESP Stage 2 measures were generally greater than the Conners stability coefficients. With the exception of the Critical Events Scale, the correlation coefficients of the Stage 2 measures were highly significant (p < .001). Although the attrition rate was high (from 121 to 26 subjects) and therefore makes these results inconclusive, the representativeness of the study's participants and the strong validity coefficients of the Stage 2 measures are encouraging after a time span of more than 1 year (November 1991–February 1993). These results are above expectations for coefficients over a 1-year time span (Elliot, Busse, & Gresham, 1993).

Consistency Across Measures

Consistency across measures was examined by comparing the standard T-scores (M = 50, SD = 10) of the children ranked highest on the Stage 1 externalizer and internalizer dimensions and children ranked as average (nonranked), who served as a control comparison, across the ESP and concurrent measures. As shown in Figure 1, these groups were discriminated on all measures used and were most clearly differentiated on the Aggressive, Adaptive, and Maladaptive Behavior scales. The externalizer and internalizer groups had relatively equivalent scores on the Critical Events Index and the Social Behavior Observation.

Figure 1. Means of Children Ranked Highest on the Externalizing Dimension, the Internalizing Dimension, and Nonranked Peers on T-Scores of ESP Measures (Critical Events, Aggressive Behavior, Adaptive Behavior, Maladaptive Behavior, and Negative/Nonsocial Behavior Observation).

Validation Studies: Validity

Content Validity

Content validity is the degree to which a measure is representative of the domain of interest (Elliot et al., 1993).
In this case, content validity refers to the externalizing and internalizing behavioral dimensions. Content validity was inferred from three data sources: empirical findings from past studies, the judgments of a panel of experts, and preschool teacher feedback. In the formulation phase of this research (from October 1990 to June 1991), all of the above sources were consulted. The literature search was completed in Fall 1991 and is reflected in the item selection and adaptations of the ESP instruments. A draft of the ESP was presented to a panel of experts during Fall 1990. The few changes suggested were minor and were implemented before any data were collected. A pilot study was conducted in Spring 1991 in one preschool classroom of nine children and two teachers. After completing the Stage 2 behavior questionnaires, these teachers did not object to any of the items on the ESP.

Concurrent Validity

The concurrent validity of the ESP measures was examined through correlations with the Behar and Conners scales. These data showed very good overall concurrent validity, with significant correlations ranging from .19 to .95 and a median and mode of .69 and .80, respectively. The Aggressive, Adaptive, and Maladaptive Behavior Scales also showed substantial concurrent validity. Consistent with past findings, the observational data had lower correlations than the teacher rating scale data. All of the ESP scales were significantly related to at least two of the three concurrent scales. Further concurrent validity of the ESP was examined by comparing the Stage 2 behavior questionnaire with SIMS Behavior Observation Codes variables using Pearson correlations. Most of the correlation coefficients were significant, ranging from .23 to .35. Because these data come from different sources (i.e., observational measures vs. teacher ratings), low correlations for these measures are expected and typical (Elliot, Busse, & Gresham, 1993; Schaughency & Rothlind, 1991).

Discriminative Validity

Discriminant function analysis using the general linear model estimates the accuracy of a set of dependent measures in predicting a priori groupings. The a priori groups were defined by teacher recommendation of Behavior Disorders (BD) eligibility status (i.e., whether the teacher listed the child for further evaluation for BD status), and the dependent measures were the ESP scores. A discriminant analysis provides a measure of the accuracy of the ESP in terms of specificity and sensitivity coefficients. Specificity and sensitivity are important criteria when choosing an assessment method (Elliot et al., 1993). Sensitivity is the percentage of true positives, and specificity is the percentage of true negatives (Schaughency & Rothlind, 1991). The discriminant classification resulted in sensitivity and specificity rates ranging from 62% to 100% and 94% to 100%, respectively. This shows that the ESP has a low false diagnosis rate. An overall MANOVA test of the group means for the ESP measures on the combined samples found a highly significant difference, F(7, 203) = 24.67, p < .001, between those students identified by teachers and those who were not identified. The discriminant function and MANOVA results indicate that the ESP is an accurate measure for predicting BD behaviors in preschoolers. The discriminant function results also show that the ESP has a very low chance of overidentifying children with behavior problems.
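Sensitivity and specificity of the kind reported above can be computed directly from a 2 x 2 classification table. The sketch below does this for a hypothetical set of screening decisions and teacher-identified statuses; it is an illustration only and is not drawn from the ESP data.

```python
# Hypothetical illustration of sensitivity (true-positive rate) and specificity
# (true-negative rate) for a screening decision against teacher-identified
# status; the label vectors are invented for illustration only.
from sklearn.metrics import confusion_matrix

teacher_identified = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
screen_positive    = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]

tn, fp, fn, tp = confusion_matrix(teacher_identified, screen_positive).ravel()
sensitivity = tp / (tp + fn)   # proportion of identified children screened positive
specificity = tn / (tn + fp)   # proportion of non-identified children screened negative
print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```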
Usually it would be desirable for a screening instrument to slightly overidentify potentially at-risk children, because later assessment can separate the false positives from the true positives. However, since the issue of labeling young children with behavior disorders can be fraught with concerns about stigmatization, the ESP's small chance of producing a false-positive outcome is an asset. That is, practitioners can be confident that a child who is identified with the ESP is actually different from his/her peers.

Treatment Utility

Treatment utility is the degree to which assessment activities are shown to contribute to beneficial intervention outcomes (Hayes, Nelson, & Jarrett, 1987). To assess the ESP's utility for intervention, it was used as part of an intervention study of the First Step to Success program. First Step is a school enhancement program for kindergartners and their parents that targets three areas that are very important for every child's school success: (1) getting along with teachers, (2) getting along with peers, and (3) doing schoolwork. Trained consultants provided a behavior intervention plan for 25 at-risk children identified by the ESP as having problematic aggressive and externalizing behaviors. As shown in Figure 2, teacher ratings of the children's behavior improved after the First Step intervention. Adaptive Behavior scores and percent of Academic Engagement increased, while Maladaptive Behavior and Child Behavior Checklist (Achenbach & Edelbrock, 1986) scores decreased from preintervention to postintervention. These results indicate that the ESP can be used as an effective monitor of intervention effects as well as a streamlined identification procedure.

Figure 2. First Step Intervention Results on the SSBD (preintervention and postintervention levels of Adaptive Behavior, Maladaptive Behavior, Child Behavior Checklist, and Academic Engagement).

Summary of ESP Technical Adequacy

This line of ESP research consists of a series of studies designed to evaluate the psychometric properties of the ESP. The results from these studies show that the ESP can be used with diverse groups of preschool children, that the results can be interpreted with confidence, and that the instruments meet criteria for acceptable technical adequacy. As noted earlier, the correlations between Stage 2 teacher measures and SIMS Behavior Observation Codes variables were low, but this is an expected outcome (Cairns & Green, 1979; Schaughency & Rothlind, 1991). Observational measures record child behavior directly, with less bias and filtering of information. However, observational measures are very sensitive to ecological variables, such as situation-dependent interactions and physical settings. Both ratings and observational measures are important for developing an effective understanding of the child within the preschool context. Ratings appear to be more effective predictors of individual differences, and observations appear to be more effective in the analysis of interactional regulation and development (Cairns & Green, 1979). Both kinds of ESP data and analyses are important to understanding behavior problems and their socially dependent basis. In sum, the ESP has excellent psychometric characteristics and procedures that justify its use for its intended purposes. The ESP meets current standards for special education best practices in student decision making.
The ESP also conforms to developmental standards for procedural integrity among preschool-age populations (Bredekamp, 1987). The ESP assesses preschool-age children's social and emotional behavior with multimethod techniques and with an emphasis on teacher judgments (Stages 1 and 2). Developmental differences between preschool and school-age children have been accounted for in developing the ESP. Finally, the technical adequacy of the ESP rating scales demonstrates that teachers have a wealth of normative information regarding children's development and competencies across differing domains and situations. The ESP procedures take advantage of teachers' extensive normative knowledge base using cost-effective and systematic screening procedures. In addition to normative teacher ratings, the ESP includes direct observations of the child's behavior in the context of peer interactions. The information gained in these assessments can be used to plan interventions, identify children with special needs, communicate with parents, and evaluate program effectiveness.

Using the SSBD With Students in Grades 7–9

The SSBD has also been used successfully within the middle and junior high school context. While less research attention has focused on this population to date, emerging findings suggest that the SSBD can be an effective tool for identifying students at risk for externalizing and internalizing disorders during the middle school years. The work of Caldarella and colleagues has successfully replicated the SSBD procedure for use at the middle and junior high school levels (see Caldarella, Young, Richardson, Young, & Young, 2008; Richardson, Caldarella, Young, Young, & Young, 2009). These investigators used the standard SSBD implementation guidelines and procedures in this process. In Caldarella et al. (2008), the SSBD was administered to adolescents in middle and junior high schools in Utah. The SSBD was implemented within two suburban secondary schools that enrolled a total of 2,173 students. The ASEBA Teacher Report Form and Social Skills Rating System were also administered to students nominated in Stage 1 in order to compare the SSBD with other established measures of student behavior. Findings from this study provide support for:

•• Concurrent validity of the SSBD at Stage 1: The number of office discipline referrals and the grade point averages of students nominated during Stage 1 differed significantly from the ODRs and GPAs of students not identified.

•• Internal consistency and interrater reliability at Stage 2: Internal consistency estimates for the Critical Events Index, Adaptive Behavior Scale, and Maladaptive Behavior Scale were in the adequate range (.54–.90); a computational sketch of this type of estimate follows this list. Additionally, interrater reliability correlations indicated adequate agreement between teachers on ratings (.58–.60).

•• Convergent validity at Stage 2: Students identified as at risk on the SSBD were also identified as having higher levels of externalizing or internalizing behavior on the other rating scales. Significant correlations (p < .05) between SSBD Stage 2 measures and the ASEBA Teacher Report Form, Social Skills Rating System, number of office discipline referrals, and grade point average provide support for the convergent validity of the Stage 2 measures.
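As referenced in the internal consistency bullet above, one conventional internal consistency index is Cronbach's alpha. The sketch below computes it for a hypothetical students-by-items matrix of ratings; the helper function and data are ours, not part of the SSBD materials.

```python
# Hypothetical illustration of an internal consistency (Cronbach's alpha)
# estimate for a behavior rating scale; rows are students, columns are items.
# The ratings below are invented for illustration only.
import numpy as np

def cronbach_alpha(item_scores: np.ndarray) -> float:
    """Cronbach's alpha for a students-by-items matrix of ratings."""
    n_items = item_scores.shape[1]
    item_variances = item_scores.var(axis=0, ddof=1).sum()
    total_variance = item_scores.sum(axis=1).var(ddof=1)
    return (n_items / (n_items - 1)) * (1 - item_variances / total_variance)

ratings = np.array([
    [1, 2, 1, 1, 2],
    [3, 3, 4, 3, 3],
    [2, 2, 2, 1, 2],
    [5, 4, 5, 4, 5],
    [2, 3, 2, 2, 3],
    [4, 4, 3, 4, 4],
])
print(f"alpha = {cronbach_alpha(ratings):.2f}")
```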
More research should be conducted to assess the reliability and validity of using the SSBD with middle school students, but these initial findings provide support for extending screening with the SSBD to older students. Caldarella and colleagues concluded that, based on their SSBD research in these contexts, the screening procedure works effectively with middle school students and teachers without the need for structural adaptation or redesign. As with the development of the Early Screening Project (ESP), a preschool adaptation of the SSBD, these research outcomes substantially extend the SSBD's applicability across the age-grade range from preschool through middle school (Feil, Walker, Severson, & Ball, 2000; Lane et al., 2012; Walker, Severson, & Feil, 1995).

UPDATE ON SSBD RESEARCH AND OUTCOMES

Research Conducted by Other Professionals

Since its publication, excellent research has been conducted on the SSBD by numerous professionals. By far, the most prolific researchers in this regard have been Kathleen Lane and her colleagues. Lane et al. have also provided a superb review of the empirical and practical knowledge bases on the SSBD system (see Lane, Menzies, Oakes, & Kalberg, 2012, Chapter Two). They have conducted the most comprehensive research to date on the SSBD and have used the instrument (1) as a criterion measure for evaluating other screening approaches (Lane, Kalberg, Lambert, Crnobori, & Bruhn, 2010), (2) as an efficacious screening approach in its own right (Lane, Oakes, & Menzies, 2010), and (3) as a system for enhancing and evaluating the impact of behavioral-academic interventions (Lane et al., 2012). Additional applications of the SSBD by Lane and her colleagues include monitoring at-risk students' behavior within and across school years, preventing negative behavioral and academic outcomes, and providing a basis for determining movement across tiers within three-tiered PBIS approaches. Lane concluded her review by noting that the SSBD is a cost-effective screening approach with excellent psychometrics for its constituent measures and is the only commercially available screening system that identifies both externalizers and internalizers. She notes that the SSBD is regarded by many professionals as the gold standard of universal behavioral screening (Lane et al., 2012).

SSBD research has also been conducted by Cheney and his associates at the University of Washington. Their research has focused primarily on elementary-age students (see Cheney & Breen, 2008; Cheney, Flower, & Templeton, 2008; Walker, Cheney, Stage, & Blum, 2005) within studies of Positive Behavior Support approaches and contexts. Other SSBD applications have been reported by Epstein and Cullinan (1998), who found positive convergent validity between the SSBD and the Scale for Assessing Emotional Disturbance. Epstein and Sharma (1998) found similar relationships between the SSBD and the Behavioral and Emotional Rating Scale. Further, Epstein, Nordness, Nelson, and Hertzog (2002) reported moderate convergent validity between the SSBD and the BERS measure of behavioral and emotional functioning. Lane et al. (2012) have empirically documented strong relationships between the SSBD and the Drummond Student Risk Screening Scale (Drummond, 1994).
Finally, Kamps and colleagues have used the SSBD effectively as a screening and evaluation measure in a series of prevention and intervention studies with students having, or at risk for, EBD (see Kamps, Kravits, Rauch, Kamps, & Chung, 2000; Kamps, Kravits, Stolze, & Swaggart, 1999). The extensive use of the SSBD in research and related applications by other professionals, and the positive empirical and practical outcomes associated with its use, are gratifying (Severson, Walker, Hope-Doolittle, Kratochwill, & Gresham, 2007). These results indicate that the SSBD continues to be a valid and reliable tool in addressing the needs of behaviorally at-risk students and the school staffs that serve them. Next, recent research findings reported by the SSBD authors and their colleagues are described, resulting from the SSBD's use in two large-scale randomized controlled trials of the First Step to Success program (Walker et al., 1997).

Research Conducted by the SSBD Authors and Colleagues

Walker and his colleagues conducted two randomized controlled trials (RCTs) of the First Step early intervention program in which the SSBD procedure was used as a screening device and also as a measure of intervention outcomes across preintervention, postintervention, and 1-year follow-up assessments. The first RCT was a 4-year efficacy trial of First Step to Success (Walker et al., 2009) conducted in the Albuquerque, NM, school district, in which 72% of students were students of color and 70% were eligible for free and reduced-price lunch. The Albuquerque district ranks as the 17th largest in the United States. The second RCT was a national effectiveness trial of First Step involving six participating sites in Oregon, California, Illinois, West Virginia, and Florida. A total of 200 primary-grade participants in the efficacy trial were evenly divided between intervention and usual care conditions. In the national effectiveness trial, a total of 286 students were enrolled across the six sites, with 142 students in the intervention group and 146 in the comparison, usual care condition. The efficacy trial is reported in Walker, Seeley, Small, Severson, Graham, Feil, Serna, Golly, and Forness (2009); the effectiveness trial is reported in Sumi, Woodbridge, Javitz, Thornton, Wagner, Rouspil, Yu, Seeley, Walker, Golly, Small, Feil, and Severson (2012).

The SSBD proved to be a reliable and accurate instrument for the broad-based screening of general education classrooms in these two large-scale studies. The SSBD Stage 2 measures and the Academic Engaged Time (AET) code of the SIMS Behavior Observation Codes proved to be sensitive in discriminating among externalizing and comparison students at baseline and in documenting gains produced by the First Step intervention in these two studies. In terms of effect sizes, the efficacy trial reflected these gains: the figures were .82 for Adaptive Behavior ratings and .87 for Maladaptive Behavior ratings by participating teachers, and the effect size for observer-recorded AET was .44. For the effectiveness RCT, the comparable figures were, respectively, .42, .67, and .35. The relative sensitivity of these SSBD measures provides support for their general use as outcome measures for short-term interventions in school settings.
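Effect sizes of the kind reported for the two trials are typically expressed as standardized mean differences. The sketch below computes a Cohen's d for hypothetical intervention and comparison scores; it is an illustration of the statistic, not a reanalysis of the trial data.

```python
# Hypothetical illustration of a standardized mean-difference (Cohen's d)
# effect size for intervention vs. usual-care groups on a teacher rating;
# the scores below are invented for illustration only.
import numpy as np

intervention = np.array([48, 52, 55, 61, 47, 58, 50, 63, 54, 57])
usual_care   = np.array([41, 44, 50, 39, 46, 43, 48, 40, 45, 42])

n1, n2 = len(intervention), len(usual_care)
pooled_sd = np.sqrt(((n1 - 1) * intervention.var(ddof=1) +
                     (n2 - 1) * usual_care.var(ddof=1)) / (n1 + n2 - 2))
cohens_d = (intervention.mean() - usual_care.mean()) / pooled_sd
print(f"Cohen's d = {cohens_d:.2f}")
```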
These two studies used randomized controlled designs and so provided an opportunity to further establish the concurrent and convergent validity of the SSBD Stage 2 screening measures. Table 24 shows correlations between the SSBD Stage 2 measures and the Achenbach TRF scales of Conduct Disorder, Oppositional Defiant Disorder, and Aggression, and the Gresham and Elliott SSRS social skills, problem behavior, and academic competence scales. Correlations between the SSBD measures and parent ratings on the social skills and problem behavior scales of the SSRS are also shown.

Table 24
Correlations Between SSBD Stage 2 Measures and Achenbach TRF Scales and SSRS Scales

Achenbach TRF Scales
        Conduct Disorder    Oppositional Defiant Disorder    Aggression
CEI           .53                      .54                      .62
ABI          −.44                     −.40                     −.43
MBI           .57                      .68                      .69

SSRS Scales (teacher ratings)
        Social Skills    Problem Behavior    Academic Competence
CEI          −.45               .59                 −.23
ABI           .53              −.33                  .20
MBI          −.37               .62                 −.21

SSRS Parent Ratings
        Social Skills Scale    Problem Behavior Scale
CEI             −.13                    .30
ABI              .14                   −.30
MBI             −.09                    .29

CEI = Critical Events Index; ABI = Adaptive Behavior Index; MBI = Maladaptive Behavior Index

Seeley, Small, Walker, Feil, Severson, Golly, and Forness (2009) analyzed the Albuquerque RCT database for the First Step program's sensitivity to students with ADHD. This analysis allowed calculation of correlations between the Conners ADHD Scale (Conners, 1997) and the SSBD Stage 2 measures. Correlations were computed for the teacher-reported total score, inattentive score, and hyperactive score. For the CEI, these correlations were .31, .32, and .36; for the ABI, −.37, −.42, and −.50; and for the MBI, .47, .43, and .34. For the most part, these validity coefficients were in the moderate range, with the exception of the correlations between SSBD measures and parent ratings of social skills and problem behavior on the Gresham and Elliott scale. They provide further confirmation that teachers and parents view the behavior of the same children quite differently and also that children may well behave differently across home and school settings.

Caldarella and his colleagues (Caldarella et al., 2008) conducted an analysis of the CEI in which they identified externalizing items and internalizing items by evaluating which of them discriminated significantly between Stage 1 externalizing- and internalizing-identified students, using SSBD data from middle and junior high school students. In a similar fashion, Small (2014), as part of an analysis of the SSBD's psychometrics, identified lists of externalizing, internalizing, and other items using a combination of content analysis by experts followed by confirmation through factor analyses. The sample used for this analysis consisted of 2,237 cases of primary grade-level students distributed among three separate studies (n = 723, 1,098, and 416, respectively) in which the SSBD served as a universal screener within investigations of the First Step program's efficacy and effectiveness. The externalizing items identified are listed by their item numbers as follows: 1, 2, 4, 5, 9, 10, 15, 20, 21, 23, 25, and 26. The identified internalizing items are 3, 6, 7, 8, 11, 12, 14, 16, 17, 18, 27, and 33. The other items are 13, 19, 22, 24, 28, 29, 30, 31, and 32.
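To show how this item classification might be used descriptively, the sketch below tallies hypothetical CEI item endorsements into externalizing, internalizing, and other counts. The endorsement set and the tally function are ours; they are not part of the published SSBD scoring procedure.

```python
# Hypothetical use of the Small (2014) item classification: count endorsed
# Critical Events Index items by category for one student. The endorsement
# set below is invented, and this tally is descriptive only; it does not
# replace the published CEI scoring rules.
EXTERNALIZING_ITEMS = {1, 2, 4, 5, 9, 10, 15, 20, 21, 23, 25, 26}
INTERNALIZING_ITEMS = {3, 6, 7, 8, 11, 12, 14, 16, 17, 18, 27, 33}
OTHER_ITEMS = {13, 19, 22, 24, 28, 29, 30, 31, 32}

endorsed_items = {2, 5, 9, 13, 20, 25}   # hypothetical teacher endorsements

counts = {
    "externalizing": len(endorsed_items & EXTERNALIZING_ITEMS),
    "internalizing": len(endorsed_items & INTERNALIZING_ITEMS),
    "other": len(endorsed_items & OTHER_ITEMS),
}
print(counts)
```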
Although these analyses do not change the way the CEI is scored for decision-making purposes in screening, having this item classification available may be helpful to professionals for diagnostic purposes and for identifying strategies and interventions for use with externalizing and internalizing students. Overall, the research outcomes above (a) clearly establish the sensitivity of the Stage 2 Adaptive and Maladaptive Behavior Scales, as well as the AET code, in documenting intervention effects produced by a well-implemented behavioral intervention, and (b) indicate that the CEI, ABI, and MBI are moderately related to three well-known and highly respected scales for assessing social competence, ADHD symptoms, and behavioral pathology, respectively, among general education students. These results extend the applicability of these SSBD measures well beyond universal screening purposes.

CONCLUSION

This Technical Manual has described the research and development process the SSBD authors and their colleagues conducted to establish the psychometric integrity, efficacy, and social validity of the SSBD procedure and the instruments that make up each of its screening stages. The resulting outcomes of this 5-year development and testing process are impressive in establishing the SSBD's accuracy, validity, and reliability. There appears to be little doubt that students who meet risk criteria on the SSBD have very serious behavioral adjustment problems of either an externalizing or internalizing nature. As such, the quality and accuracy of teacher referrals are likely to be significantly enhanced when using the SSBD or multi-gating systems like it (see Kettler, Glover, Albers, & Feeney-Kettler, 2014). In addition, all students enrolled in general education classrooms will have the opportunity, through the SSBD's systematic application, to be screened on a regular basis and to access needed services. The SSBD meets best-practice standards in the areas of proactive, universal screening and in providing information for use in designing interventions and matching students' problems with available programs, placements, and/or intervention procedures.

The accompanying SSBD Administrator's Guide provides instructions and guidelines for applying the SSBD system with high levels of implementation fidelity. This guide is designed to provide a roadmap for the SSBD coordinator to use in setting up, implementing, and troubleshooting the SSBD's application.

REFERENCES

Achenbach, T. M. (1978). The child behavior profile: I. Boys aged 6–11. Journal of Consulting and Clinical Psychology, 46, 478–488.

Achenbach, T., & Edelbrock, C. (1979). The child behavior profile: II. Journal of Consulting and Clinical Psychology, 47, 223–233.

Block-Pedego, A., Walker, H. M., Severson, H. H., Todis, B., & Barckley, M. (1989). Behavioral correlates of social isolation occurring within free play settings (Technical Report). Eugene, OR: Oregon Research Institute.

Caldarella, P., Young, E., Richardson, M., Young, B., & Young, K. (2008). Validation of the Systematic Screening for Behavior Disorders in middle and junior high school. Journal of Emotional and Behavioral Disorders, 16, 105–117.

Cheney, D., Breen, K., & Rose, J. (2008). Universal school-wide screening to identify students for Tier 2/Tier 3 interventions. National Forum for Implementation of School-wide Positive Behavior Supports. Chicago, IL.
Cheney, D., Flower, A., & Templeton, T. (2008). Applying response to intervention metrics in the social domain for students at risk of developing emotional and behavioral disorders. Journal of Special Education, 42, 108–126.

Eisert, D., Walker, H. M., Severson, H. H., & Block-Pedego, A. E. (1989). Patterns of social-behavioral competence in behavior-disordered preschoolers. Early Child Development and Care, 41, 139–152.

Epstein, M., & Cullinan, D. (1998). Manual for the Scale for Assessing Emotional Disturbance (SAED). Austin, TX: Pro-Ed.

Epstein, M., Nordness, P., Nelson, R., & Hertzog, M. (2002). Convergent validity of the Behavioral and Emotional Rating Scale with primary grade-level students. Topics in Early Childhood Special Education, 22, 114–122.

Epstein, M., & Sharma, J. (1998). Manual for the Behavioral and Emotional Rating Scale (BERS). Austin, TX: Pro-Ed.

Feil, E., Walker, H., Severson, H., & Ball, A. (2000). Proactive screening for emotional/behavioral concerns in Head Start preschools: Promising practices and challenges in applied research. Behavioral Disorders, 26, 13–25.

Feil, E., Walker, H., Severson, H., Golly, A., Seeley, J., & Small, J. (2009). Using positive behavior support procedures in Head Start classrooms to improve school readiness: A group training and behavioral coaching model. NHSA Dialog, 12, 88–103.

Hersh, R., & Walker, H. M. (1983). Great expectations: Making schools effective for all children. Policy Studies Review [Special issue], 2(1), 147–188.

Hops, H., Lewin, L., Walker, H. M., & Severson, H. H. (1990). Social competence correlates of externalizing, internalizing and normal behavior patterns among elementary aged students (Technical Report). Eugene, OR: Oregon Research Institute.

Kamps, D., Kravits, T., Rauch, J., Kamps, J., & Chung, N. (2000). A prevention program for students with or at risk for ED: Moderating effects of variation in treatment and classroom structure. Journal of Emotional and Behavioral Disorders, 8, 141–154.

Kamps, D., Kravits, T., Stolze, J., & Swaggart, B. (1999). Prevention strategies for at-risk students and students with EBD in urban elementary schools. Journal of Emotional and Behavioral Disorders, 7, 178–188.

Kettler, R., Glover, T., Albers, C., & Feeney-Kettler, K. (2014). Universal screening in educational settings. Washington, DC: American Psychological Association.

Lane, K., Kalberg, J., Lambert, W., Crnobori, M., & Bruhn, A. (2010). A comparison of systematic screening tools for emotional and behavioral disorders: A replication. Journal of Emotional and Behavioral Disorders, 18, 100–112.

Lane, K., Menzies, H. M., Oakes, W. P., & Kalberg, J. R. (2012). Systematic screenings of behavior to support instruction: From preschool to high school. New York, NY: Guilford.

Lane, K. L., Menzies, H. M., Oakes, W. P., Lambert, W., Cox, M., & Hawkins, K. (2012). A validation of the Student Risk Screening Scale for internalizing and externalizing behaviors: Patterns in rural and urban elementary schools. Behavioral Disorders, 37, 244–270.

Lane, K., Oakes, W., & Menzies, H. (2010). Systematic screenings to prevent the development of learning and behavior problems: Considerations for practitioners, researchers, and policy makers. Journal of Disability Policy Studies, 21, 160–172.

Leff, S., & Lakin, R. (2005). Playground-based observation systems: A review and implications for practitioners and researchers. School Psychology Review, 34(4), 475–489.
Evaluation of a three-stage, multiple-gating, standardized procedure for the screening and identification of elementary school students at risk for behavior disorders. Unpublished doctoral dissertation, University of Utah, Salt Lake City.

Reynolds, W. M. (1984). Depression in children and adolescents: Phenomenology, evaluation and treatment. School Psychology Review, 13(2), 171–182.

Richardson, M., Caldarella, P., Young, B., Young, E., & Young, K. (2009). Further validation of the Systematic Screening for Behavior Disorders in middle and junior high school. Psychology in the Schools, 46, 605–615.

Ross, A. O. (1980). Psychological disorders of childhood. New York: McGraw-Hill.

Salvia, J., & Ysseldyke, J. (1988). Assessment in special and remedial education. Palo Alto: Houghton Mifflin.

Seeley, J., Small, J., Walker, H., Feil, E., Severson, H., Golly, A., & Forness, S. (2009). Efficacy of the First Step to Success intervention for students with attention-deficit/hyperactivity disorder. School Mental Health, 1, 37–48.

Severson, H. H., Walker, H. M., Barckley, M., Block-Pedego, A. E., Todis, B., & Rankin, R. (1989). Confirmation of the accuracy of teacher rankings on internalizing and externalizing behavior profiles: Mostly hits and a few misses (Technical Report). Eugene, OR: Oregon Research Institute.

Severson, H. H., Walker, H. M., Hope-Doolittle, J., Kratochwill, T., & Gresham, F. M. (2007). Proactive, early screening to detect behaviorally at-risk students: Issues, approaches, emerging innovations, and professional practices. Journal of School Psychology, 45, 193–223.

Shinn, M. R., Ramsey, E., Walker, H. M., Stieber, S., & O'Neill, R. E. (1987). Antisocial behavior in school settings: Initial differences in an at risk and normal population. The Journal of Special Education, 21(2), 69–84.

Small, J. (2014). Psychometric analysis of the Systematic Screening for Behavior Disorders (SSBD) (Technical Report). Eugene, OR: Oregon Research Institute.

Sumi, W., Woodbridge, M., Javitz, H., Thornton, S., Wagner, M., Rouspil, K., Yu, J., Seeley, J., Walker, H., Golly, A., Small, J., Feil, E., & Severson, H. (2012). Journal of Emotional and Behavioral Disorders, 21(1), 66–78.

Tsai, S., & Cheney, D. (2012). The impact of the adult-child relationship on school adjustment for children at risk of serious behavior problems. Journal of Emotional and Behavioral Disorders, 20(2), 105–114.

Todis, B., Severson, H. H., & Walker, H. M. (1990). The critical events scale: Behavioral profiles of students with externalizing and internalizing behavior disorders. Behavioral Disorders, 15(2), 75–86.

Volpe, R., DiPerna, J., Hintze, J., & Shapiro, E. (2005). Observing students in classroom settings: A review of seven coding schemes. School Psychology Review, 34(4), 454–474.

Walker, H. M. (1986). The assessments for integration into mainstream settings (AIMS) assessment system: Rationale, instruments, procedures, and outcomes. Journal of Clinical Child Psychology, 15(1), 55–65.

Walker, H. M., Block-Pedego, A. E., Severson, H. H., Barckley, M., & Todis, B. J. (1989). SSBD profiles of resource room students in restrictive and less restrictive classroom settings (Technical Report). Eugene, OR: Oregon Research Institute.

Walker, H. M., Block-Pedego, A., Severson, H., Todis, B., Williams, G., Barckley, M., & Haring, N. (1991). Behavioral profiles of sociometrically isolated students at risk for externalizing and internalizing behavioral disorders (Technical Report). Eugene, OR:
Oregon Research Institute.

Walker, H. M., Block-Pedego, A., Todis, B., & Severson, H. H. (1991). The school archival records search (SARS). Longmont, CO: Sopris West.

Walker, H. M., Cheney, D., Stage, S., & Blum, C. (2005). School-wide screening and positive behavior supports: Identifying and supporting students at-risk for school failure. Journal of Positive Behavior Interventions, 7, 194–204.

Walker, H. M., & McConnell, S. R. (1988). The Walker-McConnell scale of social competence and school adjustment. Chico, CA: Duerr Evaluation Associates.

Walker, H. M., Ramsey, E., & Gresham, F. M. (2004). Antisocial behavior in school. Belmont, CA: Wadsworth.

Walker, H. M., & Rankin, R. (1983). Assessing the behavioral expectations and demands of less restrictive settings. School Psychology Review, 12, 274–284.

Walker, H. M., Reavis, H. K., Rhode, G., & Jenson, W. R. (1985). A conceptual model for delivery of behavioral services to behavior disordered children in educational settings. In P. Bornstein & A. Kazdin (Eds.), Handbook of clinical behavioral services with children. Homewood, IL: Richard D. Irwin.

Walker, H. M., Seeley, J., Small, J., Severson, H., Graham, B., Feil, E., Serna, L., Golly, A., & Forness, S. (2009). A randomized controlled trial of the First Step to Success early intervention: Demonstration of program efficacy outcomes in a diverse, urban school district. Journal of Emotional and Behavioral Disorders. DOI: 10.1177/1063426609341645

Walker, H. M., Severson, H., & Feil, E. (1995). The Early Screening Project (ESP). Eugene, OR: Deschutes Research.

Walker, H. M., Severson, H., Nicholson, F., Kehle, T., Jenson, W. R., & Clark, E. (1994). Replication of the Systematic Screening for Behavior Disorders (SSBD) procedure for the identification of at-risk children. Journal of Emotional and Behavioral Disorders, 2(2), 66–77.

Walker, H. M., Severson, H. H., Stiller, B., Williams, G., Haring, N. G., Shinn, M. R., & Todis, B. (1988). Systematic screening of students in the elementary age range at risk for behavior disorders: Development and trial testing of a multiple gating model. Remedial and Special Education, 9(3), 8–14.

Walker, H. M., Severson, H. H., Todis, B., Block-Pedego, A., Williams, G., Haring, N., & Barckley, M. (1990). Systematic screening for behavior disorders (SSBD): Further validation, replication, and normative data. Remedial and Special Education, 11(2), 32–46.

Walker, H. M., Stiller, B., Golly, A., Kavanagh, K., Severson, H., & Feil, E. (1997). First Step to Success: Helping young children overcome antisocial behavior (an early intervention program for grades K–3). Longmont, CO: Sopris West.

APPENDIX A

Normative Comparisons: SSBD Original Norms and Updated Supplemental Normative Databases

SSBD procedures and instruments are completely standardized and self-contained. Normative levels on the SSBD have been established to assist in the following tasks:

• Facilitate decision making in moving from one screening stage to another
• Serve as tools in determining eligibility in relation to generalized normative criteria

The original national standardization sample for the SSBD Stage 2 consists of 4,463 cases, and the SIMS Behavior Observation Codes have a national normative sample of 1,219 cases. It should be noted that one rarely encounters an observation code in the professional literature that has been nationally normed. The sketch below illustrates, in general terms, how normative cutoffs of this kind can inform these screening decisions.
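This sketch is offered only as an illustration: the cutoff values, the field names, and the rule used to combine the three Stage 2 indices are hypothetical placeholders rather than the published SSBD criteria, which should always be taken from the SSBD scoring materials themselves.

```python
# Illustrative sketch only: the cutoff values and the combination rule below
# are hypothetical placeholders, not the published SSBD Stage 2 criteria.

from dataclasses import dataclass

@dataclass
class Stage2Profile:
    cei: int   # Critical Events Index (higher = more problematic)
    abi: int   # Adaptive Behavior Index (lower = more problematic)
    mbi: int   # Maladaptive Behavior Index (higher = more problematic)

# Hypothetical cutoffs for illustration; in actual use, substitute the
# published normative cutoffs or locally derived practice norms.
EXAMPLE_CUTOFFS = {"cei": 5, "abi": 32, "mbi": 35}

def exceeds_stage2_criteria(profile: Stage2Profile, cutoffs=EXAMPLE_CUTOFFS) -> bool:
    """Return True if the profile exceeds the illustrative risk cutoffs.

    CEI and MBI flag risk when scores are at or above their cutoffs; ABI
    flags risk when the score falls at or below its cutoff. The OR-based
    combination used here is an assumption made only for illustration.
    """
    return (
        profile.cei >= cutoffs["cei"]
        or (profile.abi <= cutoffs["abi"] and profile.mbi >= cutoffs["mbi"])
    )

# Example: a teacher-ranked student with an elevated CEI score would be
# referred on to Stage 3 observation under these illustrative rules.
student = Stage2Profile(cei=7, abi=29, mbi=38)
print(exceeds_stage2_criteria(student))  # True
```

In practice, the same comparison is repeated for each teacher-ranked student, and, as discussed below, local practice norms can supplement the national cutoff values whenever they are available.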
Details on the original SSBD normative sample were described earlier. Here we discuss new supplemental normative databases that have been developed over the past decade through research and practice uses of the SSBD. It is recognized that students passing both SSBD screening Stages 1 and 2 may exceed normative levels and expectations for the referring classroom setting but may not exceed normative cutoff scores derived from the SSBD national standardization sample. This possible outcome highlights the importance, whenever possible, of using normative criteria that are external to the behavioral norms and expectations of specific school settings and local districts in screening referred students who may have moderate to severe behavior disorders. However, in this context, it is also important to demonstrate that the referred student's behavioral profile is substantially divergent from those of nonreferred, same-sex peers within the referring classroom setting. Ideally, local practice norms on the SSBD should be used to inform and supplement decision making whenever they exist.

Supplemental SSBD Practice and Research-Generated Norms

Since its publication, the SSBD has been used extensively by single schools and districts. As noted, it has also been a popular research tool and has been the focus of considerable research attention in the context of universal, proactive screening, while frequently serving as a validation criterion or standard against which other screeners are judged (Lane, Menzies, Oakes, & Kalberg, 2012). Relatively large databases exist for SSBD screenings as a result of this extensive usage. During the 2012–13 school year, the authors were able to assemble an extensive supplemental normative database on the SSBD Stage 2 instruments consisting of 6,743 cases. These cases and resulting student profiles were generated through research conducted on the SSBD by other investigators and through use of the SSBD in ongoing school district practices involving regular universal screening of grade 1–6 general education student populations for behavior problems over the past decade. They represent the following regions of the United States: Northwest, Mountain West, Southwest, Midwest, Southeast, South, and East. The supplemental SSBD normative data from across these regions and cities provide an important context for evaluating the relevance of the SSBD's original normative sample for current decision making about behaviorally at-risk students.

Table 25 contrasts the original SSBD normative levels on the Stage 2 instruments (Critical Events Index, Adaptive Behavior Index, and Maladaptive Behavior Index) with normative levels drawn from a series of research studies in which the SSBD Stage 1 and 2 instruments were administered. Participating teachers and schools were asked simply to complete Stage 1 and 2 SSBD screening without making any decisions about qualifying scores or cut points for individual students or selecting any of them from the larger student pool based on their Stage 2 score profiles. The original norming of the SSBD Stage 2 instruments followed this instrument administration and data collection procedure exactly.
As can be seen in Table 25, the normative levels for externalizing and internalizing students who were nominated through screening Stage 1, when averaged for each of the Stage 2 instruments, were quite similar to the original SSBD normative levels and to each other, even though they represented different regions of the United States and many years between screening occasions. Thus, these samples contained an undifferentiated mix of students, some of whom would and some of whom would not meet SSBD Stage 2 risk criteria.

Table 25. Similarities in SSBD Score Profiles for Original Normative and Research-Based Samples

SSBD Original Norms (N = 4,463)
       M Externalizing   M Internalizing   M Nonranked
CEI    3.40              2.03              .12
ABI    35.10             44.42             55.29
MBI    29.45             17.20             13.37
From Walker, H. M., & Severson, H. H. (1990). Systematic screening for behavior disorders (SSBD). Longmont, CO: Sopris West.

Study 1: NW Region (N = 454)
       M Externalizing   M Internalizing   M Nonranked
CEI    1.72              1.57              —
ABI    36.38             44.50             —
MBI    29.61             18.16             —
From Walker, H. M., Severson, H., Stiller, B., Williams, G., Haring, N., Shinn, M., & Todis, B. (1988). Systematic screening of pupils in the elementary age range at risk for behavior disorders: Development and trial testing of a multiple gating model. Remedial and Special Education, 9(3), 8–14.

Study 2 (N = 856)
       M Externalizing   M Internalizing   M Nonranked
CEI    3.02              1.96              .11
ABI    36.52             44.78             55.43
MBI    30.65             18.62             13.52
From Walker, H. M., Severson, H. H., Todis, B. J., Block-Pedego, A. E., Williams, G. J., Haring, N. G., & Barckley, M. (1990). Systematic screening for behavior disorders (SSBD): Further validation, replication and normative data. Remedial and Special Education, 11(2), 32–46.

Study 3 (N = 1,468)
       M Externalizing   M Internalizing   M Nonranked
CEI    3.20              1.80              .10
ABI    36.10             44.70             55.90
MBI    28.40             15.90             13.30
From Walker, H. M., Severson, H. H., Nicholson, F., Kehle, T., Jenson, W. R., & Clark, E. (1994). Replication of the Systematic Screening for Behavior Disorders (SSBD) procedure for the identification of at-risk children. Journal of Emotional and Behavioral Disorders, 2(2), 66–77.

Utah Sample (N = 2,188), Positive Behavior Support Initiative, Brigham Young University
       M Externalizing   M Internalizing   M Nonranked
CEI    3.51              2.91              —
ABI    36.30             41.91             —
MBI    28.57             18.73             —
From Caldarella, P., Young, E. L., Richardson, M. J., Young, B. J., & Young, K. R. (2008). Validation of the Systematic Screening for Behavioral Disorders in middle and junior high school. Journal of Emotional and Behavioral Disorders, 16(2), 105–117.

CEI = Critical Events Index; ABI = Adaptive Behavior Index; MBI = Maladaptive Behavior Index

The close correspondence between the student profiles in Table 25 across these divergent research samples, and their similarity to the original SSBD norms, speaks to the stability of externalizing, internalizing, and normative or representative behavior patterns across both years and regional sites. Table 26 contrasts the original SSBD normative profiles with a series of samples drawn from different U.S. regions in which participants were required to meet Stage 2 risk criteria. As can be seen in that table, SSBD Stage 2 profiles are significantly more problematic, as expected. Again, Stage 2 scores are highly consistent across samples and regions.
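These supplemental comparisons, like the local practice norms recommended earlier in this appendix, rest on ordinary descriptive statistics computed separately for each ranked group. The sketch below is illustrative only: it assumes the pandas library is available and uses hypothetical column names and made-up example records, but it shows the general form of how a district could summarize its own Stage 2 screening records to produce comparable local summaries.

```python
# Illustrative sketch: building local supplemental summaries (means, SDs, etc.)
# for the Stage 2 measures from a district's own screening records.
# The column names and the example records below are hypothetical.
import pandas as pd

records = pd.DataFrame(
    [
        # group (teacher ranking list), CEI, ABI, MBI
        ("externalizing", 7, 30, 38),
        ("externalizing", 5, 33, 35),
        ("internalizing", 4, 37, 24),
        ("internalizing", 3, 40, 21),
        ("nonranked", 0, 55, 13),
        ("nonranked", 1, 52, 15),
    ],
    columns=["group", "CEI", "ABI", "MBI"],
)

# Descriptive statistics by group, analogous to the supplemental norm tables
# in this appendix (N, minimum, maximum, sum, mean, SD).
summary = records.groupby("group")[["CEI", "ABI", "MBI"]].agg(
    ["count", "min", "max", "sum", "mean", "std"]
)
print(summary.round(2))
```

In practice, the records would come from the district's own screening files, and the resulting group summaries could then be compared with the national and supplemental values reported in Tables 25 through 29.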
In a large SSBD Midwest sample of students (n = 1,970), those who met Stage 2 risk criteria based on their CEI scores had much higher scores and more problematic profiles across the Stage 2 instruments than those who did not meet these criteria. Table 27 displays the Stage 2 instrument profiles for these two groups of students. Similarly, in a replication of these results within two randomized controlled trials implemented to investigate the efficacy as well as the effectiveness of the First Step to Success early intervention program, the SSBD was used to screen grade 1–3 students for inclusion in these studies (Sumi, Woodbridge, Javitz, Thornton, Wagner, Rouspil, & Severson, 2013; Walker, Seeley, Small, Severson, Golly, & Feil, 2009). These studies targeted students with externalizing problems and disorders, so screening for internalizers was not conducted. Participating students in these two trials totaled more than 500 and represented six sites across the United States. As can be seen in Table 28, the externalizing students who met SSBD Stage 2 exit criteria had substantially more problematic profiles on the Stage 2 measures than those who did not. Note: For the First Step trials, 96% of the included students met Stage 2 criteria based on their CEI profiles only.

Table 26. Original SSBD Norms vs. Supplemental Practice and Research Samples

SSBD Original Norms (U.S. Region; N = 4,463)
                        Externalizers M   Internalizers M
Critical Events Index   3.4               2.03
Adaptive                35.91             44.42
Maladaptive             29.45             17.20

Lane et al. Samples (Southeast Region; N = 1,195)
  Sample 1 (n = 180)
Critical Events Index   6.96              4.96
Adaptive                28.74             35.94
Maladaptive             38.03             23.86
  Sample 2 (n = 254)
Critical Events Index   8.17              5.57
Adaptive                28.10             34.71
Maladaptive             38.03             24.28
  Sample 3 (n = 761)
Critical Events Index   6.16              4.37
Adaptive                29.18             36.34
Maladaptive             36.76             22.44

Stage-Cheney Sample (Northwest Region; N = 432)
Critical Events Index   6.41              4.56
Adaptive                28.36             34.24
Maladaptive             37.86             26.35

Naquin Sample (Southern Region; N = 800+)
Critical Events Index   6.43              4.97
Adaptive                27.45             33.97
Maladaptive             37.56             23.76

Eber-Rose et al. Sample (Midwestern Region; N = 1,970)
Critical Events Index   6.01              4.04
Adaptive                30.87             37.76
Maladaptive             36.60             22.83

CEI = Critical Events Index; ABI = Adaptive Behavior Index; MBI = Maladaptive Behavior Index
Table 27. Profiles for Externalizing and Internalizing Students Meeting vs. Not Meeting SSBD Stage 2 Risk Criteria

Met Stage 2 Risk Criteria, Externalizer: Descriptive Statistics
                             N     Minimum   Maximum   Sum    Mean    SD
Critical Events Index        336   0         15        2020   6.01    2.17
Adaptive Behavior Score      200   12        48        6174   30.87   6.76
Maladaptive Behavior Score   199   0         52        7283   36.60   7.23
Valid N (listwise)           199

Met Stage 2 Risk Criteria, Internalizer: Descriptive Statistics
                             N     Minimum   Maximum   Sum    Mean    SD
Critical Events Index        368   1         13        1487   4.04    1.84
Adaptive Behavior Score      259   16        60        9781   37.76   8.89
Maladaptive Behavior Score   256   11        56        5845   22.83   7.92
Valid N (listwise)           256

Did Not Meet Stage 2 Risk Criteria, Externalizer: Descriptive Statistics
                             N     Minimum   Maximum   Sum     Mean    SD
Critical Events Index        813   0         4         1332    1.64    1.28
Adaptive Behavior Score      813   0         60        32264   39.69   8.36
Maladaptive Behavior Score   812   0         51        19495   24.01   7.61
Valid N (listwise)           812

Did Not Meet Stage 2 Risk Criteria, Internalizer: Descriptive Statistics
                             N     Minimum   Maximum   Sum     Mean    SD
Critical Events Index        453   1         3         655     1.45    0.61
Adaptive Behavior Score      453   0         60        20924   46.19   10.03
Maladaptive Behavior Score   453   0         55        6285    13.87   4.60
Valid N (listwise)           453

Table 28. Descriptive Statistics for SSBD Stage 2 Measures

Albuquerque Efficacy Trial
          Total Sample (N = 723)     Met Stage 2 Criteria (n = 371)   Didn't Meet Stage 2 Criteria (n = 352)
Measure   M (SD)       Min   Max     M (SD)       Min   Max           M (SD)       Min   Max
CEI       4.9 (3.6)    0     17      7.7 (2.7)    1     17            2.1 (2.7)    0     4
ABI       34.5 (7.7)   13    60      30.5 (5.6)   13    48            38.7 (7.3)   18    60
MBI       31.8 (8.5)   11    51      36.7 (6.5)   16    51            26.6 (7.3)   11    46

National Effectiveness Trial
          Total Sample (N = 1,084)   Met Stage 2 Criteria (n = 660)   Didn't Meet Stage 2 Criteria (n = 424)
Measure   M (SD)       Min   Max     M (SD)       Min   Max           M (SD)       Min   Max
CEI       6.0 (3.3)    2     20      7.9 (2.9)    4     20            3.0 (0.8)    2     4
ABI       36.7 (8.1)   13    60      33.6 (6.8)   13    57            41.6 (7.5)   25    60
MBI       28.7 (8.7)   11    50      32.4 (7.7)   11    50            22.9 (6.8)   11    47

CEI = Critical Events Index; ABI = Adaptive Behavior Index; MBI = Maladaptive Behavior Index

Similar findings were demonstrated in these other trials: as Table 28 indicates, profiles of externalizing and internalizing students meeting Stage 2 criteria were more problematic than profiles of students who did not meet risk criteria.

Table 29 presents average score levels for students exceeding Stage 2 risk criteria at three additional supplemental normative sites, developed in research conducted by Kathleen Lane of the University of Kansas and her colleagues in the Southeast United States, wherein students had to exceed Stage 2 cutoff points to be included in a research study or to be considered for further screening (Lane, Kalberg, Lambert, Crnobori, & Bruhn, 2010; Lane, Menzies, Oakes, Lambert, Cox, & Hawkins, 2012). As can be seen in this table, the average score levels for students in these three samples, where Stage 2 risk criteria had to be met, were significantly higher and their SSBD profiles more problematic than those of students who typically did not meet Stage 2 risk criteria.
Table 29. Lane et al. Supplemental Norms from Research Conducted in the U.S. Southeast

Sample 1 (N = 761)
       Controls               Externalizers          Internalizers
       n     M       SD       n    M       SD        n    M       SD
CEI    614   1.26    1.29     77   6.16    2.44      70   4.37    2.14
ABI    614   46.24   9.48     77   29.18   5.71      70   36.34   7.31
MBI    614   18.46   7.94     77   36.76   6.64      70   22.44   5.83
From Lane, K. L., Kalberg, J. R., Lambert, E. W., Crnobori, M., & Bruhn, A. L. (2010). A comparison of systematic screening tools for emotional and behavioral disorders. Journal of Emotional and Behavioral Disorders, 18, 100–112.

Sample 2 (N = 180)
       Controls               Externalizers          Internalizers
       n     M       SD       n    M       SD        n    M       SD
CEI    70    1.57    1.11     59   6.96    2.91      51   4.96    1.96
ABI    70    44.42   9.36     59   28.7    6.61      51   35.94   8.16
MBI    70    20.32   9.31     59   38.03   6.90      51   23.86   6.94
From Lane, K. L., Menzies, H. M., Oakes, W. P., Lambert, W., Cox, M., & Hawkins, K. (2012). A validation of the Student Risk Screening Scale for internalizing and externalizing behaviors: Patterns in rural and urban elementary schools. Behavioral Disorders, 37, 244–270.

Sample 3 (N = 253)
       Controls               Externalizers          Internalizers
       n     M       SD       n    M       SD        n    M       SD
CEI    181   1.48    1.31     51   8.17    4.85      21   5.71    3.97
ABI    181   43.15   9.85     50   28.10   6.94      21   34.71   8.74
MBI    181   18.76   8.39     50   37.32   6.82      21   24.28   5.71
From Lane, K. L., Oakes, W. P., Harris, P. J., Menzies, H. M., Cox, M., & Lambert, W. (2012). Initial evidence for the reliability and validity of the Student Risk Screening Scale for internalizing and externalizing behavior at the elementary level. Behavioral Disorders, 37, 99–122.

CEI = Critical Events Index; ABI = Adaptive Behavior Index; MBI = Maladaptive Behavior Index

Similar SSBD databases, drawn from the South (N = 850) and Northwest (N = 429) regions and developed under these same conditions, closely matched these score levels. For example, the CEI average scores for externalizing students in these two regional samples were 6.43 and 6.41, respectively. For the Adaptive Behavior Index, the corresponding averages were quite low at 27.45 and 28.36, and the Maladaptive Behavior Index averages were 37.56 and 37.86. Similar results were obtained for internalizing students across these two sites. The CEI average scores for internalizers were elevated at 4.97 and 4.56, respectively. The Adaptive Behavior Index score averages for internalizers across the two sites were 33.97 and 34.24, while their Maladaptive Behavior Index scores were 23.76 and 26.35 for the South and Northwest sites, respectively.

We have also examined a portion of this supplemental normative database for the influence of ethnicity on the status of student behavior disorders (i.e., externalizing, internalizing). We analyzed the Midwest database of 1,970 cases for this purpose. Table 30 (Parts A and B) provides average Stage 2 instrument scores for five ethnic groups: White, African American, Hispanic/Latino, Asian, and Multi-Ethnic. Examination of this table shows remarkable similarities in Stage 2 score profiles across these ethnic groups.

Table 30. Meeting/Not Meeting SSBD Stage 2 Risk Criteria by Ethnicity and Externalizing vs. Internalizing Status

Part A: Students Meeting Stage 2 Risk Criteria
       White             African-American   Hispanic/Latino    Asian              Multi-Ethnic
       Ext. M   Int. M   Ext. M   Int. M    Ext. M   Int. M    Ext. M   Int. M    Ext. M   Int. M
CEI    6.63     4.13     5.72     4.25      6.01     3.83      6.33     3.82      6.40     4.00
ABI    30.50    38.31    31.08    35.60     30.91    38.29     31.00    37.67     33.78    39.33
MBI    36.12    23.78    36.90    24.44     35.68    20.70     38.00    21.67     36.56    21.75

Part B: Students Not Meeting Stage 2 Risk Criteria
       White             African-American   Hispanic/Latino    Asian              Multi-Ethnic
       Ext. M   Int. M   Ext. M   Int. M    Ext. M   Int. M    Ext. M   Int. M    Ext. M   Int. M
CEI    1.54     1.45     1.71     1.49      1.71     1.46      1.10     1.28      1.67     1.55
ABI    41.19    47.36    38.41    46.05     38.72    44.57     42.90    47.64     38.62    44.91
MBI    23.49    14.74    26.30    14.37     23.30    13.29     20.70    13.84     24.21    14.27

CEI = Critical Events Index; ABI = Adaptive Behavior Index; MBI = Maladaptive Behavior Index; Ext. M = Externalizing mean; Int. M = Internalizing mean
Those students who met Stage 2 risk criteria show very similar patterns in their externalizing and internalizing profiles regardless of their ethnicity, and they also reflect known differences between the severity of externalizing and internalizing profiles on the CEI, ABI, and MBI. Thus, students who are screened via Stages 1 and 2 of the SSBD and meet risk criteria for both look very much alike in the content of their behavioral profiles as well as in their severity levels. These results are encouraging and indicate that separate norms do not need to be created to accommodate ethnicity in SSBD screening and decision making.

APPENDIX B

SSBD Bibliography and Resource List

A bibliography of information resources regarding the SSBD is provided below for those seeking greater detail about the screening system and its use by others. This material describes SSBD applications along with empirical outcomes and provides commentaries and perceptions of the system by both users and researchers. The bibliography is organized into the following categories: books, chapters, journal articles, and websites. The vast majority of this information has been contributed by other professionals. The websites listed are those that have emerged through Google searches for SSBD-related PowerPoint presentations. Rather than reproduce those PowerPoints here, we refer you to the websites in which information about the SSBD is contained.

Books

Crone, D. A., Horner, R. H., & Hawken, L. S. (2004). Responding to problem behavior in schools: The Behavior Education Program. New York: Guilford.

Kauffman, J. M. (2001). Characteristics of emotional and behavioral disorders of children and youth (7th ed.). Columbus, OH: Merrill.

Kauffman, J. M., & Brigham, F. J. (2009). Working with troubled children. Verona, WI: Full Court Press.

Kettler, R., Glover, T., Albers, C., & Feeney-Kettler, K. (Eds.). (2014). Universal screening in educational settings: Identification, implementation, and interpretation. Washington, DC: Division 16 Practitioners' Series of the American Psychological Association.

Lane, K. L., & Beebe-Frankenberger, M. E. (2004). School-based interventions: The tools you need to succeed. Boston: Allyn & Bacon.

Lane, K. L., Kalberg, J. R., & Menzies, H. M. (2009). Developing schoolwide programs to prevent and manage problem behaviors: A step-by-step approach. New York: Guilford.

Lane, K. L., Menzies, H. M., Bruhn, A. L., & Crnobori, M. (2011). Managing challenging behaviors in schools: Research-based strategies that work. New York: Guilford Press.

Lane, K. L., Oakes, W. P., & Cox, M. (2011). Functional assessment-based interventions: A university-district partnership to promote learning and success. Manuscript submitted for publication.

Lane, K. L., Menzies, H. M., Oakes, W. P., & Kalberg, J. R. (2012). Systematic screenings of behavior to support instruction: From preschool to high school. New York: Guilford.

Lane, K. L., Robertson, E. J., & Wehby, J. H. (2002). Primary Intervention Rating Scale. Unpublished rating scale.

Nelson, J. R., Cooper, P., & Gonzalez, J. (2004). Stepping Stones to Literacy. Frederick, CO: Cambium Learning Group.

Pashler, H., Bain, P. M., Bottge, B. A., Graesser, A., Koedinger, K., McDaniel, M., et al. (2007). Organizing instruction and study to improve student learning: A practice guide (NCER 2007-2004). Washington, DC: National Center for Education Research, Institute of Education Sciences, U.S.
Department of Education. Retrieved from http://ies.ed.gov/ncee/wwc/pdf/practiceguides/20072004.pdf

Walker, H. M., Ramsey, E., & Gresham, F. M. (2004). Antisocial behavior in school: Evidence-based practices (2nd ed.). Belmont, CA: Wadsworth.

Walker, H. M., Stiller, B., Golly, A., Kavanagh, K., Severson, H. H., & Feil, E. (1997). First Step to Success: Helping young children overcome antisocial behavior. Longmont, CO: Sopris West.

Chapters

Morris, R. J., Shah, K., & Morris, Y. P. (2002). Internalizing behavior disorders. In K. L. Lane, F. M. Gresham, & T. E. O'Shaughnessy (Eds.), Interventions for children with or at risk for emotional and behavioral disorders (pp. 223–241). Boston: Allyn & Bacon.

Severson, H., & Walker, H. (2002). Proactive approaches for identifying children at risk for sociobehavioral problems. In K. Lane, F. M. Gresham, & T. E. O'Shaughnessy (Eds.), Interventions for children with or at risk for emotional and behavioral disorders (pp. 33–54). Boston: Allyn & Bacon.

Walker, H. M., Severson, H. H., Seeley, J. R., & Feil, E. G. (in press). Multiple gating approaches to the universal screening of students with school-related behavior disorders. In R. Kettler, R. Glover, C. Albers, & K. Feeney-Kettler (Eds.), Universal screening in educational settings: Identification, implementation, and interpretation. Washington, DC: Division 16 Practitioners' Series of the American Psychological Association.

Journal Articles

Caldarella, P., Young, E. L., Richardson, M. J., Young, B. J., & Young, K. R. (2008). Validation of the Systematic Screening for Behavioral Disorders in middle and junior high school. Journal of Emotional and Behavioral Disorders, 16(2), 105–117. DOI: 10.1177/106342660731312

Cheney, D., Flower, A., & Templeton, T. (2008). Applying response to intervention metrics in the social domain for students at risk of developing emotional or behavioral disorders. Journal of Special Education, 42, 108–126.

Epstein, M. H., Nordness, P. D., Cullinan, D., & Hertzog, M. (2002). Scale for Assessing Emotional Disturbance: Long-term test-retest reliability and convergent validity with kindergarten and first-grade students. Remedial and Special Education, 23, 141–148.

Epstein, M. H., Nordness, P. D., Nelson, J. R., & Hertzog, M. (2002). Convergent validity of the Behavioral and Emotional Rating Scale with primary grade-level students. Topics in Early Childhood Special Education, 22, 114–121.

Gresham, F. M., Lane, K. L., & Lambros, K. M. (2000). Comorbidity of conduct problems and ADHD: Identification of "fledgling psychopaths." Journal of Emotional and Behavioral Disorders, 8, 83–93.

Hasselbring, T. S., & Goin, L. I. (1999). Read 180 [Computer software]. New York: Scholastic.

Hastings, R. P. (2003). Brief report: Behavioral adjustment of siblings of children with autism. Journal of Autism and Developmental Disorders, 33, 99–105.

Kalberg, J. R., Lane, K. L., & Menzies, H. M. (2010). Using systematic screening procedures to identify students who are nonresponsive to primary prevention efforts: Integrating academic and behavioral measures. Education and Treatment of Children, 33, 561–584.

Kamps, D., Kravits, T., Rauch, J., Kamps, J. L., & Chung, N. (2000). A prevention program for students with or at risk for ED: Moderating effects of variation in treatment and classroom structure. Journal of Emotional and Behavioral Disorders, 8, 141–154.
Kamps, D., Kravits, T., Stolze, J., & Swaggart, B. (1999). Prevention strategies for at-risk students and students with EBD in urban elementary schools. Journal of Emotional and Behavioral Disorders, 7, 178–188.

Kazdin, A. E. (1977). Assessing the clinical or applied importance of behavior change through social validation. Behavior Modification, 1, 427–452.

Lane, K. L. (1999). Young students at risk for antisocial behavior: The utility of academic and social skills intervention. Journal of Emotional and Behavioral Disorders, 7, 211–223.

Lane, K. L. (2003). Identifying young students at risk for antisocial behavior: The utility of "teachers as tests." Behavioral Disorders, 28, 360–389.

Lane, K. L. (2007). Identifying and supporting students at risk for emotional and behavioral disorders with multi-level models: Data-driven approaches to conducting secondary interventions with academic emphasis. Education and Treatment of Children, 30, 135–164.

Lane, K. L., Eisner, S. L., Kretzer, J. M., Bruhn, A. L., Crnobori, M. E., Funke, L. M., et al. (2009). Outcomes of functional assessment-based interventions for students with and at risk for emotional and behavioral disorders in a job-share setting. Education and Treatment of Children, 32, 573–604.

Lane, K. L., Harris, K., Graham, S., Weisenbach, J., Brindle, M., & Morphy, P. (2008). The effects of self-regulated strategy development on the writing performance of second grade students with behavioral and writing difficulties. Journal of Special Education, 41, 234–253.

Lane, K. L., Kalberg, J. R., Bruhn, A. L., Mahoney, M. E., & Driscoll, S. A. (2008). Primary prevention programs at the elementary level: Issues of treatment integrity, systematic screening, and reinforcement. Education and Treatment of Children, 31, 465–494.

Lane, K. L., Kalberg, J. R., Lambert, W., Crnobori, M., & Bruhn, A. (2010). A comparison of systematic screening tools for emotional and behavioral disorders: A replication. Journal of Emotional and Behavioral Disorders, 18, 100–112.

Lane, K. L., Kalberg, J. R., Menzies, H., Bruhn, A., Eisner, S., & Crnobori, M. (2011). Using systematic screening data to assess risk and identify students for targeted supports: Illustrations across the K-12 continuum. Remedial and Special Education, 32, 39–54.

Lane, K. L., Kalberg, J. R., Parks, R. J., & Carter, E. W. (2008). Student Risk Screening Scale: Initial evidence for score reliability and validity at the high school level. Journal of Emotional and Behavioral Disorders, 16, 178–190.

Lane, K. L., Little, M. A., Casey, A. M., Lambert, W., Wehby, J. H., Weisenbach, J. L., et al. (2009). A comparison of systematic screening tools for emotional and behavioral disorders: How do they compare? Journal of Emotional and Behavioral Disorders, 17, 93–105.

Lane, K. L., Little, M. A., Redding-Rhodes, J. R., Phillips, A., & Welsh, M. T. (2007). Outcomes of a teacher-led reading intervention for elementary students at-risk for behavioral disorders. Exceptional Children, 74, 47–70.

Lane, K. L., Mahdavi, J. N., & Borthwick-Duffy, S. A. (2003). Teacher perceptions of the prereferral intervention process: A call for assistance with school-based interventions. Preventing School Failure, 47, 148–155.

Lane, K. L., Oakes, W. P., Ennis, R. P., Cox, M. L., Schatschneider, C., & Lambert, W. (in press). Additional evidence for the reliability and validity of the Student Risk Screening Scale at the high school level: A replication and extension.
Journal of Emotional and Behavioral Disorders.

Lane, K. L., Oakes, W. P., & Menzies, H. M. (2010). Systematic screenings to prevent the development of learning and behavior problems: Considerations for practitioners, researchers, and policy makers. Journal of Disabilities Policy Studies, 21, 160–172.

Lane, K. L., Parks, R. J., Kalberg, J. R., & Carter, E. W. (2007). Systematic screening at the middle school level: Score reliability and validity of the Student Risk Screening Scale. Journal of Emotional and Behavioral Disorders, 15, 209–222.

Todis, B., Severson, H. H., & Walker, H. M. (1990). The Critical Events Scale: Behavioral profiles of students with externalizing and internalizing behavior disorders. Behavioral Disorders, 15, 75–86.

Walker, B., Cheney, D., Stage, S., & Blum, C. (2005). School-wide screening and positive behavior supports: Identifying and supporting students at risk for school failure. Journal of Positive Behavior Interventions, 7, 194–204.

Walker, H. M., Golly, A., McLane, J. Z., & Kimmich, M. (2005). The Oregon First Step to Success replication initiative: Statewide results of an evaluation of the program's impact. Journal of Emotional and Behavioral Disorders, 13(3), 163–172.

Walker, H. M., Kavanagh, K., Stiller, B., Golly, A., Severson, H. H., & Feil, E. G. (1998). First Step to Success: An early intervention approach for preventing school antisocial behavior. Journal of Emotional and Behavioral Disorders, 6, 66–80.

Walker, H. M., Severson, H., Nicholson, F., Kehle, T., Jenson, W. R., & Clark, E. (1994). Replication of the Systematic Screening for Behavior Disorders (SSBD) procedure for the identification of at-risk children. Journal of Emotional and Behavioral Disorders, 2, 66–77.

Walker, H. M., Severson, H., Stiller, B., Williams, G., Haring, N., Shinn, M., et al. (1988). Systematic screening of pupils in the elementary age range at risk for behavior disorders: Development and trial testing of a multiple gating model. Remedial and Special Education, 9, 8–20.

Walker, H. M., Severson, H., Todis, B. J., Block-Pedego, A. E., Williams, G. J., Haring, N. G., et al. (1990). Systematic Screening for Behavior Disorders (SSBD): Further validation, replication, and normative data. RASE: Remedial and Special Education, 11, 32–46.

Webster-Stratton, C. (2000). The Incredible Years training series. Washington, DC: Office of Juvenile Justice and Delinquency Prevention, Juvenile Justice Bulletin.

Wolf, M. M. (1978). Social validity: The case for subjective measurement or how applied behavior analysis is finding its heart. Journal of Applied Behavior Analysis, 11, 203–214.
Websites

Association for Positive Behavior Support
http://www.apbs.org

Delaware PBS Project
http://www.pbisnetwork.org/wp-content/uploads/2011/04/ScreeningNWPBISMay2011.ppt

The Illinois PBIS Network
http://www.pbisillinois.org/home
http://www.pbisillinois.org/curriculum/universalscreening/scoring-tools

Maine Department of Education: Screening & Progress Monitoring
http://www.maine.gov/doe/rti/screening.html
http://www.mepbis.org/docs/wcc-03-02-10-systematic-screening-ofbehavior.pdf

New Hampshire Center for Effective Behavioral Interventions and Supports (NH CEBIS/PBIS)
http://www.nhcebis.seresc.net
http://www.nhcebis.seresc.net/universal_ssbd

Northwest PBIS Network
http://www.pbisnetwork.org/

Positive Behavior Support Initiative, Brigham Young University
http://education.byu.edu/pbsi

Southeastern Regional Education Service Center (SERESC)
http://www.seresc.net