Improved voice input with locale handling, voice calls, and Base64 image validation