Bobby Zhao
Product Designer

Criticalchat

Criticalchat

CriticalChat is an HCI case study I did at the Harvard Graduate School of Design. It investigates current gen-AI tools' limitations in facilitating user critical thinking processes and explores design features to address the cognitive atrophy as an emerging phenomenon from prolonged AI dependency.

CriticalChat is an HCI case study I did at the Harvard Graduate School of Design. It investigates current gen-AI tools' limitations in facilitating user critical thinking processes and explores design features to address the cognitive atrophy as an emerging phenomenon from prolonged AI dependency.

Responsibilities

Responsibilities

Design Research,User Interview, Ideation, UIUX

Design Research,

User Interview,

Ideation, UIUX

Design Research,

User Interview, Ideation, UIUX

Background

Most AI platforms function as single-turn input-output systems, providing responses based on direct prompts without facilitating iterative or divergent thinking. This limitation curtails the development of critical thinking skills by reinforcing surface-level understanding or confirmation bias

The WHO reports 4.5 billion people lack access to essential health services.

Confirmation Bias

Mental Model Stagnation

Speed over Quality

Problem
Urgency

Cognitive Atrophy

The WHO reports 4.5 billion people lack access to essential health services.

Studies highlight that many gen-AI users engage in cognitive offloading, an unconscious act of delegating cognitive tasks to external tools to reduce mental effort, due to intellectual complacency and product convenience.

The WHO reports 4.5 billion people lack access to essential health services.

In EEG testing, frequent gen-AI users also showed much lower neural engagement in memory and creativity regions. They also became increasingly less diligent and defaulted to "cut-and-paste" behavior.

The WHO reports 4.5 billion people lack access to essential health services.

Data Bias

In political context, it is absolutely crucial to recognize that LLM is trained on data manufactured by academic and journalistic institutions with inherent cultural/geopolitical bias, and the nuances in many complex issues won't be presented to most users unless explicitly prompted.


On national and cultural levels, when training data and ideological guardrails are aligned with a singular, hard-line truth, gen-AI risks becoming a tool for manufacturing consent.

The WHO reports 4.5 billion people lack access to essential health services.

On Critical
Thinking

What Define Critical Thinking?

The WHO reports 4.5 billion people lack access to essential health services.

Studies highlight that many gen-AI users engage in cognitive offloading, an unconscious act of delegating cognitive tasks to external tools to reduce mental effort, due to intellectual complacency and convenience.

The WHO reports 4.5 billion people lack access to essential health services.

Pascarella & Terenzini Measures

The WHO reports 4.5 billion people lack access to essential health services.

Reasoning

Reasoning

Detecting misuse of defintions

Detecting misuse of defintions

Argument Analysis

Argument Analysis

Making strong arguments

Making strong arguments

Thinking as Hypothesis Testing

Thinking as Hypothesis Testing

Check for sample size when a generalization is made

Check for sample size when a generalization is made

Likelihood & Uncertainty Analysis

Likelihood & Uncertainty Analysis

Predict the probability of an event occurrence

Predict the probability of an event occurrence

Decision Making & Problem Solving

Decision Making & Problem Solving

Actively seeking analogies

Actively seeking analogies

Gerlich, Michael. 2025. "AI Tools in Society: Impacts on Cognitive Offloading and the Future of Critical Thinking" Societies 15, no. 1: 6. https://doi.org/10.3390/soc15010006

the Delphi Report

The WHO reports 4.5 billion people lack access to essential health services.

Interpretation

Interpretation

Understanding experiences/data/rules

Understanding experiences/data/rules

Analysis

Analysis

Identify inferential relationships among concepts

Identify inferential relationships among concepts

Evaluation

Evaluation

Assessing statement/representation credibility

Assessing statement/representation credibility

Inference

Inference

Acquire additional elements needed to draw conclusions

Acquire additional elements needed to draw conclusions

Self-Regulation

Self-Regulation

Self-consciously monitoring one's cognitive activities and results deduced

Self-consciously monitoring one's cognitive activities and results deduced

Facione, Peter. (1989). Critical Thinking: A Statement of Expert Consensus for Purposes of Educational Assessment and Instruction. Research Findings and Recommendations. 315.

What limits cognitive engagement in current gen-AI interactions?

What limits cognitive engagement in current gen-AI interactions?

Research Method

Research Method

Explorative Research

Literature Review

Data Analysis

Behavioral Analysis


Evaluative Research

Comparative Testing

Assessment Testing

EEG + Eyetracking

Use Case Selection

Use Case Selection

I narrowed down the use case focus to tasks within "seeking information", where users form mental models for complex questions that rarely have one right answer.

User
Behaviors

User
Behaviors

Selective Bias

Users tend to focus on results that expand on their narratives while ignoring the subtle callouts of assumptions and biases.



Lost in Iterations

The existing text-heavy format does not offer visual hierarchy, which result in users frequently losing track of key information.

UI Limitation

UI Limitation

Callouts Too Subtle

Little Information Hierarchy

Monotonous Text Format

Feature Proposal

Assumption Callouts

Assumption Callouts

Detecting misuse of definitions

Detecting misuse of definitions

Context Elaboration

Context Elaboration

Making strong arguments


Making strong arguments


Alternative Arguments

Alternative Arguments

Check for sample size when a generalization is made

Check for sample size when a generalization is made

Rapid Prototyping

Rapid Prototyping

+

Given the limited timeframe for this study, it was imperative to prototype and test fast, so that I may iterate on the design. I used cursor and OpenAI API to create the first version of the critical chatbot to test the features.

Task Design

Task Design

A group of test subject is selected (n=15) to research and write on an unfamiliar subject in 2 attempts, one with the default gen-AI setup, the other with the critical chatbot prototype with critical-thinking features.

Scenario


You're writing a short opinion post on LinkedIn for Earth Day. You want to comment on nuclear energy as a sustainable energy source.


You're turning to this AI tool for a quick answer or position summary to help shape your post.

Insights from Testing

Insights from Testing

EEG Test

EEG Test

EEG measured electrical activity of the user's brain during their task performance. The results highlighted an increase in the Beta waves when users interacted with the critical thinking features among many users. As beta brain waves measure active cognitive process, this validates my hypothesis that design intervention can indeed improve cognitive engagement during gen-AI use.

Thematic Insights

Thematic Insights

Multiple users spontaneously called the assumption feature the most helpful:

  • Helped them surface implicit biases

  • Clarified definitions of key terms

Users criticized standard AI responses for being:

  • Surface-level summaries (unless prompted additionally)

  • Not surprising or challenging

Many users felt the custom interface had too much reading:

  • Too much text causes cognitive overload, which induces the skimming behavior that overlook key information

Attitudinal Survey

Attitudinal Survey

Users are not always engaged with the assumption callout feature despite its reported value for some users.

Most users reported high critical thinking engagement with AI's answer with custom chatbot's features.

Offers a high-level overview of caregiving, equipping them with guiding and reflective questions around 5 key areas

Feature Refinement

Feature Refinement

Final Design

Final Design

Prominent Assumption Callout

  • Helped them surface implicit biases

  • Clarified definitions of key terms

Alt Perspective & Diagram Toggle

  • Quick Toggle between alternative perspective, context expansion and original answer allows comprehensive view on the subject inquired

  • Diagram provides an intuitive cognitive break when text causes visual fatigue

Relevant Exploration

  • Suggested follow-ups enable users to dive deeper into the subject