Trustworthy AI in Higher Education: Building Practices People Can Trust

Share

Artificial intelligence has already entered higher education. It did not wait for every institution to finish a policy, form a committee, or agree on definitions. It is showing up in software colleges already own, in productivity tools employees use every day, in analytics platforms, in advising systems, in student-facing services, and in the quiet corners of administrative work.

That reality changes the question.

The useful question is no longer whether AI belongs in higher education. It is already here. The better question is whether our institutions are prepared to use it in ways that are ethical, transparent, responsible, and trustworthy.

That distinction matters. AI can be useful and still damage trust. It can save time and still create confusion. It can identify patterns and still reinforce inequities. It can help people work faster while also making it harder to explain how decisions were made. In higher education, where decisions affect students, employees, access, opportunity, financial aid, academic progress, and institutional planning, speed alone is not enough.

Trustworthy AI is not about slowing everything down. It is about knowing which uses can move quickly, which uses deserve more review, and which decisions should never be handed over to a tool without human judgment.

AI is entering through more than one door

One of the hardest parts of AI governance is that AI is not arriving in one neat package.

Some uses are formal and visible. A college buys a chatbot. A university implements predictive analytics. A vendor adds AI features to a student success platform. An administrative office purchases a tool that summarizes documents, classifies tickets, or generates reports. These uses are easier to identify because they usually involve procurement, implementation, contracts, and system ownership.

But many uses are informal. An employee uses generative AI to draft an email. Someone pastes survey comments into a tool to identify themes. A staff member asks AI to summarize meeting notes, clean a spreadsheet, write code, create a report narrative, or explain a policy. These uses may be helpful, but they can also create privacy, accuracy, equity, and documentation problems if the institution has not set expectations.

That is the governance challenge. AI is not entering higher education as a single system. It is spreading through daily work. If institutions only govern the obvious tools, they will miss the everyday practices that shape real decisions.

Trustworthy AI is not “one” thing

When people hear “trustworthy AI,” they often think first about fairness. Fairness matters, but it is only one part of the picture.

A trustworthy AI practice has to be valid and reliable. The tool needs to work for the purpose it is being used for, not just look impressive in a vendor demonstration. It has to be safe and secure. Institutions need to understand what data is being used, where that data goes, how long it is retained, and whether sensitive information is protected.

It also has to be explainable enough that people understand the role AI is playing. That does not mean every user needs to see the source code or understand every technical detail. It means people should understand the purpose of the tool, the data involved, the limits of the output, and the human review process.

Most importantly, trustworthy AI has to be accountable. Someone must own the decision.

In higher education, we should be careful with language that says “the model decided” or “the system flagged the student” as if the institution had no role. Institutions decide. People decide. AI may inform a decision, suggest a pattern, generate a draft, or recommend an action, but responsibility remains with the humans and the institution using it.

Trustworthy AI is not a label. It is a set of institutional habits.

Habits

Habit 1: Classify AI by decision risk

Not every AI use deserves the same level of review.

A staff member using AI to brainstorm ideas for a meeting agenda is not the same as a student success system assigning risk scores that influence advising outreach. A tool that summarizes public information is not the same as one that affects admissions, financial aid, employment, discipline, academic standing, or access to services.

Without risk classification, institutions often fall into one of two traps.

The first trap is treating every AI use as dangerous. That freezes useful work and drives experimentation underground. The second trap is treating every AI use as harmless. That allows consequential decisions to move forward without enough review.

A risk-tier model gives institutions a practical middle ground.

Low-risk uses might include brainstorming, formatting, drafting internal language, summarizing public information, or improving personal productivity when no sensitive data is involved. These uses still need guidance, especially around data privacy and validation, but they may not require a formal review process.

Moderate-risk uses may shape internal operations, communications, service prioritization, or analysis that informs decisions. These uses need more structure. They should include documentation, data review, validation, and clear human oversight.

High-risk uses are different. If AI influences admissions, financial aid, discipline, employment, academic standing, student support prioritization, or access to services, the institution needs a stronger review process. It should understand what data is used, how the model performs across groups, what alternatives exist, how humans review the output, and how affected people can challenge or correct information.

The goal is not to make innovation harder. The goal is to make the level of review match the level of consequence.

Habit 2: Keep a campus AI inventory

You cannot govern what you cannot see.

A campus AI inventory is one of the simplest and most useful first steps an institution can take. It does not have to begin as a complex enterprise system. A spreadsheet can work if it has an owner, a review cycle, and the right fields.

At minimum, an AI inventory should capture the tool name, vendor, institutional owner, purpose, data used, affected population, risk tier, decision role, human review process, review date, and current status.

This inventory does more than list tools. It reveals patterns.

An institution may discover that multiple offices are using different AI tools for similar tasks. It may find that vendors have quietly added AI features to existing products. It may learn that employees are using public tools for sensitive work because no approved internal option exists.

Those discoveries should not be treated as failures. They are the map. Once the institution can see the landscape, it can govern with its eyes open.

Habit 3: Document the decision process

AI can produce outputs quickly, but institutions still need to explain decisions slowly enough to be understood.

Documentation is not bureaucracy for its own sake. It is how institutional judgment becomes visible.

For any meaningful AI use, the institution should be able to answer basic questions:

What problem are we trying to solve? What data is being used? What does the AI system produce? Is the output a recommendation, a score, a draft, a summary, or an automated action? Who reviews it? Who can override it? What happens if the output is wrong? How will we know whether the tool is helping or harming?

Not every use case needs a 40-page report. But there should be a durable record. People change jobs. Vendors change products. Models change behavior. Without documentation, institutions create memory gaps, and memory gaps are where accountability tends to disappear.

A simple AI decision record can be enough for many uses. It should describe the purpose, decision authority, data sources, review steps, known limitations, equity checks, and approval date.

Bad documentation says, “The system flagged the student.”

Better documentation says, “The model suggested outreach, and a staff member reviewed the student’s context before any action was taken.”

That difference matters.

Habit 4: Test for bias and uneven impact

Bias is often discussed as if it only appears when someone intentionally builds a biased model. In practice, bias is usually more ordinary and more stubborn.

Bias can enter through historical data. If past decisions reflected unequal access, unequal opportunity, or inconsistent processes, the model may learn those patterns. Bias can enter through missing data. If one student group is less likely to have complete records, the model may perform differently for that group. Bias can enter through proxy variables. A variable that appears neutral may still stand in for income, geography, race, first-generation status, age, or other characteristics.

Bias can also enter through the intervention itself.

Consider a student success model that flags students for outreach. That sounds helpful. But what if the outreach message feels discouraging? What if some students are flagged because they work full time, but the intervention assumes traditional student availability? What if the model is less accurate for transfer students, adult learners, rural students, or students with incomplete records?

Testing should include both technical and human review. Institutions should examine model performance across groups where legally and ethically appropriate. They should look at false positives and false negatives, not just overall accuracy. They should ask whether some populations are over-flagged, under-supported, or affected differently by the process.

They should also ask how the tool changes human behavior. A risk score should not become a scarlet letter in an advising record. A dashboard should not train staff to see a student as a problem to manage instead of a person to support.

Bias testing is not a one-time blessing. It is ongoing monitoring. A model that performs acceptably one year may drift as student behavior, institutional practices, or external conditions change.

Use Cases

Enrollment analytics: prediction is not judgment

Enrollment analytics is one area where AI can offer genuine value. Admissions and enrollment teams work with large applicant pools, limited time, changing demographics, and increasing pressure to communicate effectively. Predictive tools can help identify patterns, prioritize outreach, and support planning.

But the distinction between support and substitution matters.

A model may predict that one applicant is more likely to enroll than another. That prediction should not become a judgment about the student’s worth. It should not quietly reduce outreach to students who may need more support. It should not reinforce historical recruitment patterns simply because those patterns were efficient in the past.

A trustworthy enrollment analytics process starts by defining the role of the model. Is it being used to prioritize communication? Identify students who may need financial aid follow-up? Support territory planning? Shape scholarship strategy? Each purpose carries a different level of risk.

The institution also needs to examine the data. Are the variables appropriate? Are any variables serving as proxies that could create uneven impact? Are rural students, low-income students, first-generation students, transfer students, or adult learners affected differently?

Most importantly, staff need to understand what the score means and what it does not mean. A prediction is not destiny. It is a signal. If people are trained to use AI as a sorting machine, they will get sorting-machine outcomes. If they are trained to use it as one input among many, it can help focus attention without replacing judgment.

Student success alerts: support, not stigma

Student success alerts may be the most sensitive use case because the work is so close to the student experience. Early alerts, risk scores, predictive retention models, advising flags, and nudges can help institutions act sooner. That matters. Waiting until a student has already disappeared is not a student success strategy.

But language and design matter.

A risk score can become a label. An alert can become a stigma. A dashboard can train staff to see a student as a problem rather than a person.

A trustworthy design treats AI as an early signal, not a diagnosis. The system might suggest that, based on available patterns, a student may benefit from outreach. It should not declare that a student is unlikely to succeed.

That difference is not cosmetic. It changes the intervention.

Institutions also need to think beyond the alert. If AI identifies a student as needing support, does the institution have the capacity to respond? Are students routed to the right office? Is outreach documented? Is anyone checking whether the intervention helped? Can students correct inaccurate information or provide context the model cannot see?

Responsible AI should not only identify students at risk. It should help institutions ask why the risk exists in the first place.

Administrative analytics: efficiency still needs judgment

Administrative analytics may feel lower risk than student-facing tools, but it still affects people.

AI can summarize survey comments, generate budget narratives, identify operational patterns, draft reports, classify service tickets, support policy searches, or help leaders interpret complex data. These uses can save time and improve consistency.

The risk is that AI-generated summaries often sound more confident than they are. A polished paragraph can make shaky assumptions look settled. A summary of employee feedback can miss minority viewpoints. A generated analysis can flatten context, especially when the input data is messy or incomplete.

For administrative analytics, trustworthy practice means validation and traceability. Can the summary be traced back to underlying evidence? Did a human review the output? Is the institution clear when something is AI-assisted? Is confidential information protected? Are leaders avoiding the temptation to use AI as an authority cloak for decisions that still require human judgment?

AI can help people move faster through the weeds. It should not quietly choose the garden design.

Governance is a team sport

Responsible AI cannot live in one office.

Executive leadership sets institutional priorities and appetite for risk. Data governance supports definitions, quality, access, lineage, and stewardship. IT and information security focus on systems, integrations, privacy, and cybersecurity. Legal and compliance help interpret regulatory, contractual, civil rights, FERPA, records, and procurement obligations.

Academic leaders bring the teaching, learning, and faculty governance perspective. Student affairs and enrollment leaders understand how tools affect the student experience. Institutional research and analytics teams can help with measurement, validation, bias testing, and interpretation. Faculty, staff, and students need voice because they are often closest to the effects.

That does not mean everyone reviews every low-risk use. That would grind the process into dust.

It does mean higher-risk uses need a cross-functional review path. AI decisions often sit at the intersection of technology, data, policy, student experience, and institutional mission. No single office sees the whole elephant.

A lightweight review workflow is enough to start

AI governance does not have to begin as a massive bureaucracy.

A practical workflow can start with intake. Someone proposes a use case and answers basic questions: What tool is being used? What is the purpose? What data is involved? Who is affected? What decision or action could change because of the output?

Next, classify the risk. Low-risk uses may receive standard guidance and move forward. Moderate-risk uses may need documentation, data review, and an identified human owner. High-risk uses may require formal review, a pilot, bias assessment, privacy review, transparency language, and executive approval.

Then pilot before scaling. A pilot should define both the success measure and the stop condition. What would convince us the tool is helping? What would cause us to pause, redesign, or retire it?

After launch, monitor. AI governance is not a ribbon-cutting ceremony. It is a maintenance plan.

Transparency is a design choice

Transparency does not mean explaining every technical detail to every person. What people need is a plain-language explanation of purpose, data, decision authority, human review, and correction paths.

For example:

We use an analytics tool to help identify students who may benefit from additional outreach. The tool does not make decisions about academic standing, financial aid, or eligibility. A staff member reviews information before taking action. Students may contact our office to ask questions or correct information.

That kind of explanation builds trust because it reduces mystery. It also disciplines the institution. If we cannot explain an AI use in plain language, that is a warning sign. Either the use case is not clear enough, or the decision process is not mature enough.

Transparency should be built into student-facing tools, employee-facing tools, vendor contracts, internal documentation, and training. It is not decoration added after the fact.

Procurement is a governance checkpoint

Procurement is one of the strongest leverage points for responsible AI.

Once a tool is purchased, integrated, and normalized, it becomes much harder to ask basic questions. Institutions should ask those questions before the contract is signed or before AI features are activated.

What AI features are included? Can they be turned off? What data is used? Is institutional data used to train the vendor’s model? Where is data stored? How long are prompts, outputs, and logs retained? What audit logs are available? What documentation exists about model performance and limitations? How are updates communicated? What happens if the output causes harm or is materially wrong?

Institutions should also ask about accessibility, explainability, security, subcontractors, retention, deletion, breach terms, and human override options.

The goal is not to interrogate vendors for sport. The goal is to avoid buying a mystery box and discovering later that it contains institutional risk with a glossy interface.

AI literacy should train judgment, not panic

AI literacy cannot be one-size-fits-all.

Faculty need support around course policies, assessment design, academic integrity, student guidance, and responsible use in teaching and learning. Staff need support around privacy, prompt hygiene, validation, documentation, and escalation. Students need support around disclosure norms, verification, career skills, academic integrity, and the limits of AI tools. Leaders need support around governance, procurement, risk, communication, and accountability. Technical and analytics teams need deeper preparation around model evaluation, bias testing, monitoring, and secure implementation.

A useful AI literacy program should make several things clear.

Do not put sensitive institutional data into unapproved tools. Do not treat AI output as automatically correct. Do not hide material AI use in decision processes. Do not use AI to make high-impact decisions without review. Do use AI where it can reduce friction, improve service, support analysis, or help people work more effectively within appropriate guardrails.

Tone matters. If guidance sounds like fear, people will work around it. If guidance sounds like permission without boundaries, people will overreach.

The better message is simple: use judgment, protect people, document the work, and ask for review when the decision matters.

A 90-day roadmap can get institutions moving

Institutions do not need perfect AI governance before taking action. They need visible ownership, visible risk, and visible learning.

In the first 30 days, name an AI sponsor or coordinating group, form a small cross-functional working team, start an AI inventory, and publish interim guidance. The guidance does not need to answer every future question. It should tell people what is allowed, what is not allowed, and when they need review.

In days 31 to 60, adopt risk tiers, create a simple intake form, identify current AI-enabled tools, and review the highest-risk known uses. This is also the right time to create an AI decision record template so documentation becomes routine.

In days 61 to 90, pilot the review workflow, choose one or two use cases for deeper evaluation, develop role-based training, publish plain-language transparency language, and define monitoring measures.

By the end of 90 days, an institution may not have a mature AI governance program. But it can have the bones of a functional one.

That is the point: not perfection, but motion with guardrails.

What institutions measure will shape what AI becomes

After launch, institutions should measure more than speed.

Speed matters, but speed alone is not enough. A tool can save time and still create confusion, inequity, or bad decisions faster than before.

A balanced measurement approach should include benefit, risk, equity, and process.

Benefit might include time saved, faster responses, better targeting, improved service, or improved completion of a process. Risk might include error rates, privacy incidents, inappropriate use, overreliance, complaints, or appeals. Equity might include whether false positives, false negatives, outreach, services, or outcomes differ across student or employee groups. Process might include whether reviews are completed, documentation is current, training is finished, and users understand their responsibilities.

The main question is not simply whether the AI worked.

The better question is whether AI improved the decision or service in a way that was accurate, fair, explainable, and consistent with institutional values.

Responsible AI is not anti-innovation

The most common AI governance failures are predictable.

Policy without practice. Innovation without ownership. Automation bias. Hidden data exposure. Equity as an afterthought. Vendor trust by default. Pilots that become permanent by inertia. Transparency only after harm occurs.

Because these risks are predictable, they can be designed against.

Institutions can inventory AI uses. They can classify risk. They can document decision processes. They can protect sensitive data. They can define human oversight. They can test for uneven impact. They can explain AI use in plain language. They can ask better procurement questions. They can train people by role. They can monitor after launch. They can retire or redesign tools that do not earn trust.

None of this requires higher education to become anti-AI. In fact, responsible governance can make useful AI adoption easier. When people know the rules of the road, they can move. When there are no rules, cautious people stop moving and reckless people speed through the fog.

Trustworthy AI is not a pause button. It is not a policy binder that sleeps on a shelf. It is a working model for using powerful tools in places where decisions matter.

AI will keep changing. The institutional responsibility will not: protect people, explain decisions, measure impact, and keep human judgment where institutional judgment belongs.

Here’s to using AI effectively and efficiently, with data.

Brian M. Morgan
Chief Data Officer, Marshall University