Local AI for Therapists: Session Notes, Treatment Plans, and Client Privacy Without the Cloud


I practice IFS (Internal Family Systems) and I’ve been teaching T’ai Chi for years. I spend a lot of time around therapists, bodyworkers, and healers. And I keep hearing the same thing: they’re drowning in documentation and desperate for AI to help, but terrified of sending client data to the cloud.

The ones who are already using ChatGPT for notes know they’re in a gray area. The ones who aren’t feel like they’re falling behind.

There’s a third option that nobody in the therapy world is talking about yet. You can run AI models entirely on your own computer. No internet connection needed once the model is downloaded. No data leaves your office. Your clients’ most vulnerable disclosures stay on your hardware, under your control, which makes safeguarding them under HIPAA far simpler.

The documentation problem therapists don’t talk about publicly

If you’re a therapist, you know the numbers even if you haven’t counted. Commonly cited estimates put documentation at 30-40% of working hours. Session notes, treatment plans, progress summaries for insurance, letters of medical necessity, referral letters, court-ordered documentation. Every client, every session, every week.

A therapist seeing 25 clients per week can easily spend 10-15 hours on paperwork alone. That’s time not spent with clients, not spent on training, and not doing much for the burnout problem the field already has.

AI is good at this kind of work. You give it bullet points from a session, and it expands them into proper clinical language in the right format. A SOAP note that takes 15 minutes to write takes 2 to 3 minutes to review and edit when AI drafts it. The math is obvious.
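For a concrete number, the arithmetic can be sketched in a few lines of shell. The function name is made up for illustration, and the inputs are the figures used in this article, so substitute your own caseload and timings.

```shell
# Minutes saved per week = sessions x (minutes to write - minutes to review).
# Illustrative only: the inputs are rough estimates, not measurements.
minutes_saved_per_week() {
  # usage: minutes_saved_per_week SESSIONS WRITE_MINUTES REVIEW_MINUTES
  echo $(( $1 * ($2 - $3) ))
}

# Example: 25 sessions a week, 15 minutes to write, 2 minutes to review
#   minutes_saved_per_week 25 15 2    # prints 325 (roughly 5.4 hours)
```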

The problem is where the AI runs.

Why cloud AI is a HIPAA liability for therapists

When you type session notes into ChatGPT, that data travels over the internet to OpenAI’s servers. It gets processed, and depending on your plan and settings, it may be stored and used for model training. Even with enterprise plans and Business Associate Agreements, you’re trusting a third party with Protected Health Information.

The HIPAA Privacy Rule is clear: PHI is any individually identifiable health information. Client names, diagnoses, session content, treatment details. All PHI. And it must be safeguarded against unauthorized access.

Here’s what many therapists don’t realize about BAAs with AI companies: a BAA is a contract, and its liability terms vary by vendor. Some limit or exclude the vendor’s liability for what happens to data during processing or model training. You sign a BAA, you feel covered, but the fine print may leave you holding the risk if your client’s trauma history ends up in a training dataset. Your ethics board and your malpractice insurer will not find this reassuring.

Local AI eliminates this entire risk category. When the model runs on your laptop, client data never leaves your machine. There’s no third-party server to breach. There’s no training pipeline to worry about. The data exists on your encrypted hard drive and nowhere else.

This doesn’t automatically make you fully HIPAA compliant. You still need encryption at rest, access controls, and proper data handling. But it removes the single biggest vulnerability: transmitting PHI to a third party.

What local AI can do for your practice

I want to be specific here because “AI can help with documentation” is vague. Here’s what actually works.

Session note drafting

This is where you’ll save the most time. After a session, you type or dictate bullet points into the AI interface. The model expands them into a properly formatted clinical note in whatever style your practice uses: SOAP, DAP, BIRP, narrative.

A 9B-parameter model handles this well. You’re not asking it to understand psychodynamics. You’re asking it to turn “client explored relationship with critical inner voice, connected it to father’s expectations, some emotional release” into clinical language with the right headers and structure.

Treatment plans

Input the diagnosis, presenting problems, and your treatment approach. The model generates a structured plan with measurable goals, interventions, and timelines. You review and adjust. This turns a 30-minute task into a 5-minute review.

Progress summaries for insurance

Feed the model 3-4 months of session notes and ask for a progress summary with clinical language appropriate for insurance review. It pulls out themes, tracks goal progress, and formats it properly. Insurance companies want specific, structured documentation, and this is exactly what models are good at producing.
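Whether several months of notes fit in a single request depends on the model's context window (the num_ctx setting in Ollama). Here is a rough pre-check, assuming about 4 bytes per token. That figure is a common rule of thumb for English text, not an exact tokenizer count. The function name and the 1500-token reserve are my own assumptions, and 8192 matches the num_ctx value used in the Modelfile later in this article.

```shell
# Rough check: will a notes file fit in the model's context window?
# Assumption: ~4 bytes per token. Real counts vary by model and tokenizer,
# so treat a "fits" result as approximate and leave headroom.
fits_context() {
  # usage: fits_context FILE [NUM_CTX] [RESERVE]
  file=$1
  num_ctx=${2:-8192}     # context window in tokens
  reserve=${3:-1500}     # tokens kept free for instructions and the reply
  bytes=$(wc -c < "$file")
  est_tokens=$(( bytes / 4 ))
  budget=$(( num_ctx - reserve ))
  echo "estimated tokens: $est_tokens (budget: $budget)"
  [ "$est_tokens" -le "$budget" ]
}

# Usage:
#   fits_context notes-q1.txt || echo "Too long: summarize month by month"
```

If the file is too long, summarizing one month at a time and then summarizing the summaries is a simple workaround.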

Clinical letters

Letters of medical necessity, referral letters, coordination of care letters, court-ordered documentation. These follow predictable formats and therapists write dozens of them. The model drafts, you review and sign.

IFS-specific: parts mapping and tracking

This is where I get excited, because I’ve actually done this. IFS therapy works with a structured framework: protectors, exiles, Self-energy, unburdening. The client’s internal system has named parts with specific roles, and those relationships shift across sessions.

A model with the right system prompt can help you maintain a parts map. You describe what came up in session: “The Critic part showed up strongly when we approached the exile holding the abandonment wound from age 7. Some unburdening occurred.” The model organizes this into a structured record, tracks which parts have been contacted, which exiles have been accessed, and where the work stands.

Over months of therapy, this becomes a living document that’s actually useful for tracking the client’s process. Try doing that with a template in your EHR.
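If you want the parts map to live outside the chat history as well, a plain tab-separated log is enough to start. This is a minimal sketch: the column layout, file name, and function names are hypothetical, not an IFS or EHR standard. Keep the file on your encrypted drive and use a client code rather than a name.

```shell
# One row per observation: date <TAB> part <TAB> type <TAB> status
parts_log_add() {
  # usage: parts_log_add FILE DATE PART TYPE STATUS
  printf '%s\t%s\t%s\t%s\n' "$2" "$3" "$4" "$5" >> "$1"
}

# Print the most recent row for each part, sorted by part name.
# Rows are assumed to be appended in date order, so the last row
# seen for a part is treated as its current status.
parts_latest() {
  awk -F '\t' '{ latest[$2] = $0 } END { for (p in latest) print latest[p] }' "$1" \
    | sort -t "$(printf '\t')" -k2,2
}

# Usage:
#   parts_log_add client-AB01.tsv 2025-01-06 "The Critic" protector "active near exile"
#   parts_log_add client-AB01.tsv 2025-01-13 "The Critic" protector "stepped back"
#   parts_latest client-AB01.tsv
```

You can paste the output of parts_latest into the model at the start of a documentation session so it has the current state of the system.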

Psychoeducation materials

Generate client handouts on grounding techniques, an intro to parts work, cognitive distortion worksheets, mindfulness exercises. Customized to your client’s situation, not generic downloads from a therapy website.

Supervision preparation

Before consultation groups, feed the model your notes on a case and ask for a concise summary with specific questions for the group. Saves the 20 minutes of prep you usually do the morning of.

What local AI cannot and should not do

I want to be equally direct about the limits.

AI cannot diagnose clients. Diagnosis requires clinical judgment, the therapeutic relationship, and training that a language model doesn’t have. Don’t ask it to. Don’t let it suggest diagnoses. Use it for documentation of the diagnosis you’ve already made.

AI doesn’t understand therapeutic alliance, transference, countertransference, or the felt sense of what’s happening in the room. It can’t read the pause after a question. It doesn’t notice when the client’s body tenses. The relational piece is yours. The paperwork piece is where AI helps.

Don’t use it to process real-time session transcripts. Most clients don’t consent to recording, and the ethical issues with AI-analyzed session recordings are unresolved. This tool is for after-session documentation.

And AI is not a chatbot for your clients. Don’t point clients at a local model for therapeutic support between sessions. That’s a liability and clinical ethics problem that no amount of technology fixes.

Recommended setup

The easiest path: Ollama + Open WebUI on your current laptop

If you already have a reasonably modern laptop with 16GB of RAM, the setup costs nothing.

# Install Ollama (Mac/Linux)
curl -fsSL https://ollama.com/install.sh | sh

# Pull the model (6.6GB download)
ollama pull qwen3.5:9b

# Install Open WebUI (gives you a clean chat interface; requires Docker)
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui ghcr.io/open-webui/open-webui:main

Open http://localhost:3000 in your browser. You now have a private AI assistant that looks and feels like ChatGPT but runs entirely on your hardware.
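If the page loads but no models appear, it helps to confirm that Ollama itself is answering. A small check, assuming Ollama is on its default port 11434 (the function name is my own):

```shell
# Ask Ollama's local API for its model list; succeed only if it answers.
# /api/tags is the endpoint that lists the models you have pulled.
ollama_is_up() {
  url=${1:-http://localhost:11434}
  curl -fsS --max-time 3 "$url/api/tags" >/dev/null 2>&1
}

# Usage:
#   ollama_is_up && echo "Ollama is running" || echo "Ollama is not responding"
#   ollama list     # the model you pulled should appear in this list
```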

For setup troubleshooting, we have a dedicated guide.

Which hardware and model

Your Situation	Hardware	Model	Cost
Solo practitioner, existing laptop	Any laptop with 16GB RAM	Qwen 3.5 9B	$0
Want better quality	Mac with Apple Silicon (M1+)	Qwen 3.5 9B at Q8	$0 (existing Mac)
Group practice or heavy use	Mac Mini M4 (24GB) or Beelink mini PC	Qwen 3.5 27B	$600-900
Best clinical language quality	24GB+ GPU or M4 Pro Mac	Qwen 3.5 27B at Q6+	$700-2,000

Apple Silicon Macs work well for therapy practice. They’re silent (no fan noise during sessions if the model is running in the background), they have good unified memory for running models, and they’re the hardware many therapists already own.

If you want to keep AI completely separate from your session laptop, a headless mini PC under your desk running Ollama is a clean solution. You access it through your browser on any device in your office. See our VRAM guide for detailed model-to-hardware matching, and our Mac LLM guide for Apple-specific setup.

The IFS-aware model setup

This is the part that took me some experimenting to get right. Create a file called Modelfile-ifs:

FROM qwen3.5:9b
PARAMETER num_ctx 8192

SYSTEM """You are a clinical documentation assistant for a therapist
practicing Internal Family Systems (IFS) therapy. You understand IFS
terminology and framework: protectors (managers and firefighters),
exiles, Self-energy, unburdening, direct access, in-sight, unblending.

When drafting session notes:
- Use proper clinical language appropriate for insurance documentation
- Reference parts by the names the client uses (e.g., "The Critic,"
  "Little One")
- Track Self-leadership indicators
- Note unburdening progress when applicable
- Format in DAP (Data, Assessment, Plan) structure unless told otherwise

When maintaining parts maps:
- Organize by part type (protector/exile/Self)
- Track each part's role, origin, and current relationship to Self
- Note changes across sessions

You do not diagnose. You do not interpret. You document what the
clinician reports. Keep language neutral and clinically appropriate."""

ollama create ifs-notes -f Modelfile-ifs
ollama run ifs-notes

You now have a model that speaks IFS. The same approach works for CBT, DBT, psychodynamic, somatic, or any modality with structured frameworks. Change the system prompt to match your orientation.
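As a starting point for another modality, a small shell function can write the Modelfile for you. The prompt wording here is a deliberately generic placeholder of mine, not a tested clinical prompt, and the function and file names are hypothetical. Expect to rewrite the prompt with your framework's own vocabulary.

```shell
# Write a Modelfile for a given modality by dropping its name into a
# generic documentation prompt. Base model and num_ctx match the IFS setup.
write_modelfile() {
  # usage: write_modelfile "MODALITY NAME" OUTPUT_FILE
  modality=$1
  out=$2
  cat > "$out" <<EOF
FROM qwen3.5:9b
PARAMETER num_ctx 8192

SYSTEM """You are a clinical documentation assistant for a therapist
practicing ${modality}. Use that framework's standard terminology.
You do not diagnose. You do not interpret. You document what the
clinician reports. Keep language neutral and clinically appropriate."""
EOF
}

# Usage:
#   write_modelfile "Dialectical Behavior Therapy (DBT)" Modelfile-dbt
#   ollama create dbt-notes -f Modelfile-dbt
```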

Prompt templates for daily use

In Open WebUI, you can save prompt templates. Here’s the one I use for DAP notes:

Write a DAP note based on the following session bullet points.
Use clinical language appropriate for insurance documentation.
Include IFS terminology where relevant.

Session date: [DATE]
Client identifier: [INITIALS OR CODE]
Session number: [#]

Session bullets:
[PASTE YOUR NOTES HERE]

The model turns your shorthand into a structured note in about 10 seconds. You read it, fix anything the model got wrong or oversimplified, and paste it into your EHR. What used to take 15 minutes takes 3.
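If you prefer the terminal to the browser, the same template can be filled in from a script. This is a sketch under a few assumptions: the function names are hypothetical, the bullets live in a plain text file, and ifs-notes is the model created earlier.

```shell
# Print the DAP template with the session details and bullets filled in.
dap_prompt() {
  # usage: dap_prompt DATE CLIENT_CODE SESSION_NUMBER BULLETS_FILE
  cat <<EOF
Write a DAP note based on the following session bullet points.
Use clinical language appropriate for insurance documentation.
Include IFS terminology where relevant.

Session date: $1
Client identifier: $2
Session number: $3

Session bullets:
$(cat "$4")
EOF
}

# Send the filled-in prompt to the local model; the draft prints to the terminal.
draft_dap_note() {
  ollama run ifs-notes "$(dap_prompt "$@")"
}

# Usage:
#   draft_dap_note 2025-01-06 AB-01 12 bullets.txt
```

The draft still needs the same review before it goes into your EHR.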

A complete walkthrough

Let me make this concrete. Here’s an actual workflow.

You finish a session with a client working through anxiety. In IFS terms, a protective part is using worry to prevent an exile’s pain from surfacing. During the session, the client identified the worry as a protector, accessed the exile’s fear of abandonment from age 7, and experienced some unburdening. You assigned Self-energy journaling as homework.

You open Open WebUI, select your “Session Note — IFS” template, and type:

Client identified worry as a protector. Accessed exile’s fear of abandonment from age 7. Some unburdening occurred. Firefighter (binge eating) did not activate today. Assigned Self-energy journaling homework. Client reported feeling lighter at end of session.

The model returns something like:

DATA: Client engaged in parts work focused on a protector part identified as “The Worrier,” whose function is to prevent contact with exiled material. Through direct access, client connected with an exile carrying a fear of abandonment originating from approximately age 7, linked to [early relational disruption]. Partial unburdening was facilitated during the session. The firefighter part associated with binge eating behaviors was monitored but did not activate during the session. Client reported a subjective sense of relief and lightness at session close.
ASSESSMENT: Client demonstrates increasing capacity for Self-leadership, evidenced by ability to approach exile material with curiosity rather than overwhelm. The protector part showed willingness to step back during exile work, suggesting growing trust in the therapeutic process. Unburdening progress is consistent with treatment goals.
PLAN: Continue parts work targeting the abandonment exile in next session. Monitor firefighter activation between sessions. Client to practice Self-energy journaling (15 min daily) to maintain connection with unburdened material. Next session: [DATE].

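You review it, adjust the language to match exactly what happened, and copy it into your EHR. Pay attention to details the model supplied on its own: in this draft, the part name “The Worrier,” the “direct access” technique, and the “15 min daily” journaling target were not in the bullets, and the bracketed placeholder needs to be filled in or deleted. Three minutes instead of fifteen. And none of that text ever left your laptop.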

Security hardening for your therapy practice

Running AI locally is the foundation, but there are a few more steps to make this genuinely secure.

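Full disk encryption. Enable FileVault on Mac, BitLocker (or Device Encryption) on Windows, or LUKS on Linux, which is easiest to set up when you install the system. This protects client data if your laptop is stolen. On Mac, it’s a single toggle in System Settings > Privacy & Security.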
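To confirm encryption is on, a quick check from the terminal can look like the sketch below. It relies on fdesetup on macOS and lsblk on Linux, and the function names are my own. On Linux it only tells you that some LUKS volume exists, not that your home folder is on it, so treat it as a sanity check.

```shell
# True if the text printed by `fdesetup status` says FileVault is on.
filevault_is_on() {
  case "$1" in
    *"FileVault is On"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Report disk encryption status for the current operating system.
check_disk_encryption() {
  case "$(uname -s)" in
    Darwin)
      if filevault_is_on "$(fdesetup status)"; then echo "FileVault: on"
      else echo "FileVault: OFF"; fi ;;
    Linux)
      # lsblk lists crypto_LUKS as the filesystem type of LUKS partitions
      if lsblk -o FSTYPE | grep -q crypto_LUKS; then echo "LUKS volume found"
      else echo "No LUKS volume found"; fi ;;
    *) echo "Unsupported OS: check encryption settings manually" ;;
  esac
}

# Usage:
#   check_disk_encryption
```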

Separate user account. Create a dedicated user account on your machine for clinical AI work. This isolates it from personal browsing, downloads, and any cloud-synced folders.

Disable cloud sync on your AI data directory. This is the one people miss. If iCloud, Dropbox, or OneDrive is syncing your home folder, anything stored there, including your Ollama data, exported conversations, and draft notes, could be uploading to the cloud silently. Turn it off for ~/.ollama/, the Open WebUI data volume, and any folder where you save drafts.
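A quick way to audit this is to test whether a path sits inside a folder that sync clients commonly manage. The folder list below is an assumption based on common default locations and is not exhaustive. In particular, it will not detect macOS's option to sync Desktop and Documents through iCloud, so check that setting by hand.

```shell
# True if the given path is inside a commonly synced folder.
# The locations are typical defaults; your sync client may use another.
in_synced_folder() {
  case "$1" in
    "$HOME/Library/Mobile Documents"/*) return 0 ;;  # iCloud Drive (macOS)
    "$HOME/Library/CloudStorage"/*)     return 0 ;;  # cloud providers on recent macOS
    "$HOME/Dropbox"/*)                  return 0 ;;
    "$HOME/OneDrive"/*)                 return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   in_synced_folder "$HOME/.ollama/models" && echo "WARNING: this folder syncs"
#   docker volume inspect open-webui --format '{{ .Mountpoint }}'
#   (the second command prints where Docker keeps the Open WebUI data)
```

On Docker Desktop the reported mount point is inside Docker's own virtual machine, not a folder you can browse directly.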

Local encrypted backups. Back up your clinical AI data to an encrypted external drive. Not to iCloud. Not to Google Drive. A physical drive in your office that you control.
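One way to do this for the Open WebUI conversation history is to stream the Docker volume straight into an encrypted archive, so that no unencrypted copy is ever written to the drive. This sketch assumes the volume is named open-webui as in the docker run command earlier, that gpg is installed, and that the small alpine image is available (Docker downloads it once). The function names and the destination path are placeholders.

```shell
# Build a dated file name for the encrypted archive.
backup_name() {
  echo "open-webui-$1.tar.gz.gpg"
}

# Archive the volume read-only and encrypt it with a passphrase.
# gpg will prompt for the passphrase; store it somewhere safe.
backup_open_webui() {
  # usage: backup_open_webui /path/to/external/drive
  dest_dir=$1
  docker run --rm -v open-webui:/data:ro alpine tar czf - -C /data . \
    | gpg --symmetric --cipher-algo AES256 \
          -o "$dest_dir/$(backup_name "$(date +%F)")"
}

# Usage:
#   backup_open_webui /Volumes/ClinicalBackup
```

Test a restore at least once. A backup you have never decrypted is only a guess.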

Group practice? If multiple clinicians need to access the same AI setup, run Ollama and Open WebUI in Docker with authentication enabled. Each clinician gets their own login, and conversation histories are separated. See our Open WebUI setup guide for the walkthrough.

Start here, expand later

The therapists I know who are using local AI started with one thing: session notes. They got the setup working, used it for a week, and realized they were saving 5-10 hours per week on documentation alone. Then they added treatment plans. Then insurance letters. Then parts maps.

Don’t try to automate everything on day one. Install Ollama, pull Qwen 3.5 9B, create a system prompt for your modality, and draft notes after your next session. See how the output looks. Adjust the prompt. Build from there.

The IFS community is a natural fit for this. Parts work is structured, the language is specific, and the documentation maps well to what AI models are good at producing. If you’re an IFS practitioner and you’re not using this yet, you’re doing more paperwork than you need to be.

The privacy argument isn’t theoretical. It’s the difference between a tool your ethics board would approve and a HIPAA violation waiting to happen. Local AI gives you the time savings without handing PHI to a third party. For the full privacy deep-dive, see our local AI privacy guide.