Content Validity & Reliability Procedure

CAEP Criteria for Evaluation of EPP-Created Assessments and Surveys
Before you proceed with the content validity and reliability procedures, please view the CAEP Criteria for Evaluation of EPP-Created Assessments and Surveys.


The College of Education and Professional Development (COEPD) at Marshall University has established a content validity procedure for all Educator Preparation Provider (EPP)-created assessments and surveys, including key assessments, performance tasks, clinical evaluations, and national board-certified exams.  The EPP adopted the procedure for evaluating its assessments in Spring 2022.  The content validity and reliability procedures are used by both initial- and advanced-level programs.  The procedures follow the guidelines outlined in the CAEP Evaluation Framework for EPP-Created Assessments to design, pilot, and judge the adequacy of assessments created by the EPP.

The purpose of the content validity procedure is to provide guidance for collecting evidence and to document the adequate technical quality of assessment instruments and rubrics used to evaluate candidates in the COEPD.

CAEP Defined Assessments

CAEP uses the term “assessments” to cover content tests, observations, projects or assignments, and surveys – all of which are used with candidates.  Surveys are often used to gather evidence on candidate preparation and candidate perceptions about their readiness to teach.  Surveys are also helpful to measure the satisfaction of graduates or employers with preparation and the perceptions of clinical faculty about the preparedness of EPP completers.

Assessments and rubrics are used by faculty to evaluate candidates and provide them with feedback on their performance.  Assessments and rubrics should address relevant and meaningful candidate knowledge, performance, and dispositions aligned with CAEP standards.  An EPP uses the assessments that make up the evidence offered in accreditation self-study reports to examine candidates consistently at various points from admission through completion.  These are assessments that all candidates are expected to complete as they pass from one stage of preparation to the next, or that are used to monitor candidates’ developing proficiencies during one or more stages of preparation.

EPP-Defined Assessment

The definition of assessment adopted by the EPP includes three significant processes: data collection from a comprehensive and integrated set of assessments, analysis of data for forming judgments, and use of analysis in making decisions.  Based on these three processes, assessment is operationally defined as a process in which data/information is collected, summarized, and analyzed as a basis for forming judgments.  Judgments then form the basis for making decisions regarding continuous improvement in our programs.

EPP Five Year Review Cycle

The EPP established a consistent process to review all EPP-created assessments/rubrics on a five-year cycle when possible.

Content Validity Procedure

The COEPD will assess the validity of an assessment in two ways.  First, the EPP will calculate Lawshe’s Content Validity Ratio (Step 2: CVR) for each item of an assessment.  Second, the EPP will assess construct validity (Step 3: Construct Validity) by soliciting feedback from external content experts about the assessment’s constructs and their related items.

The EPP faculty will use the following procedure to ensure the validity of EPP-Created Assessments:

Whether you’re creating a new assessment or reviewing a current one, begin by convening a small working group to create or review the assessment.  Ideally, the group should consist of 5-7 individuals, including faculty and at least one external content expert.  The working group will do the following:

  1. Identify Performance Domains: A performance domain (domain) can often be thought of as your standard (e.g., Commitment to Students).  It’s okay to use your standard as your domain.
  2. Provide an operational definition for your domain (e.g., Commitment to Students: the creation of a learning environment and community to promote successful teaching and learning).
  3. Compile Initial Rubric Items:

Q-Methodology is a card-sort technique designed to study subjectivity (views, opinions, beliefs, values, etc.) and is used here to identify the essential components of an assessment.  For most of our purposes, these essential components are the items shown in the picture below.  If the Q-sort is to be conducted electronically, Qualtrics’s Pick, Group, and Rank question type is a valuable tool.

  1. Identify a Q-Sort Group that includes a mixture of COEPD faculty members, students, and external experts (classroom teachers, supervisors, etc.) to identify overarching constructs, the constructs’ operational definitions, and assessment items.
  2. Once the overarching constructs and items have been identified, identify a panel of experts to whom they will be distributed.
  3. The panel of experts should include content experts from outside the college, with no fewer than 15 people per panel.  Minimum credentials for each expert should be established by consensus of program faculty, and those credentials should bear up to reasonable external scrutiny.
  4. Use Qualtrics to distribute a short survey to the panel of experts that includes your overarching constructs, their operational definitions, and the assessment items.  Ask your panel to drag each assessment item into one of three categories:
    1. Essential: The assessment item is essential to the overarching construct.
    2. Useful but Not Essential: The assessment item is useful, but not essential to the overarching construct.
    3. Not Necessary: The assessment item is not necessary for the overarching construct.

Virtual Q-Sort Screenshot
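Once the electronic Q-sort closes, the exported responses can be tallied per item before any CVR arithmetic begins.  A minimal Python sketch (item names and panelist responses below are hypothetical, not from an actual COEPD Q-sort):

```python
from collections import Counter

# Hypothetical exported Q-sort responses: item -> one category choice per panelist
responses = {
    "Creates a positive learning environment": [
        "Essential", "Essential", "Useful but Not Essential",
        "Essential", "Not Necessary", "Essential",
    ],
}

for item, votes in responses.items():
    tally = Counter(votes)
    print(f"{item}: {dict(tally)}")
    # -> Creates a positive learning environment:
    #    {'Essential': 4, 'Useful but Not Essential': 1, 'Not Necessary': 1}
```

The "Essential" count for each item feeds directly into the CVR formula described below.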

Once you receive feedback from the expert panel, use Lawshe’s Content Validity Ratio (CVR) to calculate a proportional level of agreement.  Lawshe’s CVR is calculated for each assessment item using the formula (ne – N/2)/(N/2), where ne represents the number of panelists who rated the item as essential and N represents the total number of panel members.

    1. Use the CVR Chart to identify the number of panelists and the corresponding CVR critical value.
    2. Compare each item’s calculated ratio with the critical value from the CVR Chart to determine whether the item meets the minimum CVR required, given the number of panelists.

Example of Using Excel to Find CVR
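The same CVR arithmetic can be scripted if Excel is not convenient.  The sketch below is a hypothetical Python illustration; the critical value shown is the commonly tabled Lawshe value for 15 panelists, so confirm it against the CVR Chart:

```python
def content_validity_ratio(n_essential: int, n_panelists: int) -> float:
    """Lawshe's CVR: (ne - N/2) / (N/2)."""
    half = n_panelists / 2
    return (n_essential - half) / half

# Example: 12 of 15 panelists rated an item "Essential"
cvr = content_validity_ratio(12, 15)
print(round(cvr, 2))   # 0.6
print(cvr >= 0.49)     # True -> meets the tabled critical value for N = 15
```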

Once Lawshe’s CVR has been calculated, retain only the assessment items that meet the CVR critical value.  You can now create your assessment rubric.  Using the same expert panel as before, create an Assessment Packet for the Panel of Experts.  The packet should include:

    1. A letter explaining the purpose of the study, the reason the expert was selected, a description of the measure and its scoring, and an explanation of the response form.
    2. A copy of the assessment instructions distributed to candidates.
    3. A copy of the rubric distributed to candidates and used to evaluate the assessment.
    4. The response form aligned with the assessment/rubric for the panel members to rate each item.

The Response form can be found in the COEPD Resources Team > Assessment > Content Validity.

The Response Form for each EPP-Created Assessment should be completed by the panel members from Procedure #3, asking them to rate the items that appear on the rubric.  Program faculty should work collaboratively to develop the response form required for each rubric officially used to evaluate candidate performance.

    1. The overarching construct that the item purports to measure should be identified and operationally defined for each item.
    2. The item should be written as it appears on the assessment.
    3. Experts should rate the item’s level of representativeness in measuring the aligned overarching construct on a scale of 1-4, with 4 being the most representative. Space should be provided for experts to comment on the item or suggest revisions.
    4. Experts should rate the importance of the item in measuring the aligned overarching construct on a scale of 1-4, with 4 being the most important. Space should be provided for experts to comment on the item or suggest revisions.
    5. Experts should rate the item’s level of clarity on a scale of 1-4, with 4 being the most clear and 1 being not clear. Space should be provided for experts to comment on the item or suggest revisions.

A Content Validity Index (CVI) will be calculated to ensure items are considered acceptable by the panel of experts. A CVI of .80 or higher is deemed acceptable.

CVI = (number of experts who rated the item as 3 or 4) / (total number of experts)
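As a hypothetical illustration, the CVI for a single item can be computed directly from the experts’ 1-4 ratings (the ratings below are invented for the example):

```python
def content_validity_index(ratings) -> float:
    """CVI for one item: proportion of experts rating it 3 or 4 on a 1-4 scale."""
    return sum(1 for r in ratings if r >= 3) / len(ratings)

ratings = [4, 3, 4, 2, 4, 3, 4, 4, 3, 4]  # ten hypothetical expert ratings
cvi = content_validity_index(ratings)
print(cvi)          # 0.9
print(cvi >= 0.80)  # True -> item is acceptable
```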

A facilitator prepares materials for scorers to begin calibrating the assessment rubric. Materials include the assessment instructions, the grading rubric, and a student artifact.

Using the rubric, scorers read the assessment instructions, view the student artifact, and score the artifact using the assessment rubric. Scorers should note the words and phrases in the performance descriptors that best describe the quality of the work.

One at a time, scorers share their scores for each rubric category while a recorder completes a group score sheet. Scorers should not explain their scores at this point. Once all scores are shared and recorded, the scorers discuss differences in the scores, where the differences occurred, and why scorers may have evaluated the artifact differently.  Scorers justify their evaluations by pointing to specific language in the rubric and evidence in the student artifact. The group discusses each piece of student work and resolves issues that may arise from the rubric language or the evidence in the student artifact, until scorer consensus is reached.

During the calibration process, the facilitator uses the group score sheet to check inter-rater reliability (IRR).  IRR is the extent to which two or more raters agree; it addresses how consistently a rating system is applied.
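The procedure does not prescribe a particular IRR statistic.  One simple starting point is pairwise percent agreement, sketched below with hypothetical calibration scores; more formal statistics such as Cohen’s kappa additionally correct for chance agreement:

```python
from itertools import combinations

def percent_agreement(scores_by_rater) -> float:
    """Pairwise percent agreement: fraction of (rater pair, item) comparisons
    in which both raters assigned the same rubric score."""
    agree = total = 0
    for a, b in combinations(scores_by_rater, 2):
        for x, y in zip(a, b):
            total += 1
            agree += (x == y)
    return agree / total

# Hypothetical rubric scores for four items from three calibration scorers
rater_a = [3, 4, 2, 3]
rater_b = [3, 4, 3, 3]
rater_c = [3, 3, 2, 3]
print(round(percent_agreement([rater_a, rater_b, rater_c]), 2))  # 0.67
```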