before we go live with any AI support setup, I run through the same checklist.
not a formal list or anything, just stuff I actually check now because I've seen what happens when you don't. here's what's on it:
test it as your worst customer, not your best one. vague question, typo in it, missing half the context. if it handles that you're probably fine. most people only test the clean version and then wonder why it breaks on real customers
ask it something that's not in the docs. not to be mean to the AI, just to see what it does when it doesn't know. does it say it doesn't know or does it just... make something up confidently? very different outcomes
go through every case you've had to escalate manually before. if a human has had to step in for it, the AI will hit it eventually. better to find out in testing than from a pissed off customer
make the "talk to a human" option obvious. not in a footer somewhere. actually there. especially for anything touching money or cancellations
read the first 20 answers out loud. sounds dumb but you catch things this way that you miss reading. if anything sounds slightly off it needs fixing before customers hear it
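if you'd rather script the first two checks than type them by hand, here's a rough sketch. everything in it is made up for illustration: `ask` stands in for however you call your own bot (question string in, answer string out), and the phrases in `ADMIT_PHRASES` are guesses you'd swap for whatever your bot actually says when it doesn't know. it won't replace reading the answers yourself, it just gets the messy questions in front of the bot faster.

```python
# hypothetical phrases; replace with your bot's real "I don't know" wording
ADMIT_PHRASES = ("i don't know", "i'm not sure", "talk to a human")


def messy_variants(question):
    """Turn one clean test question into three worst-customer versions."""
    words = question.split()
    if not words:
        return []
    # typo: swap the first two letters of the longest word
    i = max(range(len(words)), key=lambda k: len(words[k]))
    w = words[i]
    typo_word = w[1] + w[0] + w[2:] if len(w) > 1 else w
    typo = " ".join(words[:i] + [typo_word] + words[i + 1:])
    # missing context: only the first half of the message
    half = " ".join(words[: max(1, len(words) // 2)])
    # vague: no capitals, no closing punctuation
    vague = question.lower().rstrip("?.!")
    return [typo, half, vague]


def admits_not_knowing(answer):
    """True if the answer contains one of the admit phrases."""
    text = answer.lower()
    return any(phrase in text for phrase in ADMIT_PHRASES)


def run_checks(ask, clean_questions, off_doc_questions):
    """Print answers to messy variants; return off-doc questions the bot
    answered without admitting it didn't know."""
    for q in clean_questions:
        for variant in messy_variants(q):
            print(f"Q: {variant}\nA: {ask(variant)}\n")
    return [q for q in off_doc_questions if not admits_not_knowing(ask(q))]
```

anything `run_checks` returns is a question the bot answered confidently even though it's not in your docs, so those are the ones to read first. the phrase matching is crude on purpose, so treat a clean result as "probably fine", not proof.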
most of the issues I end up seeing in tickets could've been caught in like 30 mins of this before launch. anyway hope it helps someone