How to build the best possible UI for AI products
The world is changing, and technical capabilities with it, but people's habits and behavior remain more or less the same. Language is the most natural form of communication for us, which is why ChatGPT and similar products gained hundreds of millions of users within their first months.
However, I don't believe chat is the limit of what these interfaces can do. Today, I want to discuss interesting design practices for intelligent systems built on LLMs (large language models).
Scale and Detail
Information is hierarchical, but interfaces are static. Imagine a text with a zoom slider. At full zoom, you read the entire article; zoom out one step, and each paragraph is summarized in a single sentence. Zoom out further, and each section collapses to one line. Zoom out to the maximum, and you see a super-brief summary of the whole piece.
Now, apply this idea to other interfaces: working with tables, searching for information, creating presentations, analyzing ad campaigns. AI serves as a universal summarizer, taking you to the level of detail and abstraction needed for your current task.
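Here is a minimal sketch of such a "semantic zoom" control, assuming you already have some LLM-backed `summarize(text, instruction)` function; the slider level only decides which condensation instruction is sent to the model.

```typescript
type ZoomLevel = "full" | "paragraph" | "section" | "headline";
type Summarize = (text: string, instruction: string) => Promise<string>;

// One condensation instruction per zoom level (illustrative wording).
const instructions: Record<Exclude<ZoomLevel, "full">, string> = {
  paragraph: "Rewrite the text, condensing each paragraph into one sentence.",
  section: "Rewrite the text, condensing each section into one line.",
  headline: "Condense the whole document into one brief summary.",
};

async function renderAtZoom(
  doc: string,
  level: ZoomLevel,
  summarize: Summarize, // hypothetical LLM wrapper, injected by the caller
): Promise<string> {
  if (level === "full") return doc; // full detail needs no model call
  return summarize(doc, instructions[level]);
}
```

The interesting design choice is that the UI never stores the summaries as separate documents: every zoom level is just a different rendering of the same underlying text.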
Multimodality and Adaptability
Input and output modalities shouldn't compete. Text isn't inherently more convenient than voice, or vice versa; the interface should switch to whichever modality is most convenient at any given moment. I like talking through my GPT ideas during walks but prefer typing at a desk or in a cafe.
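A rough sketch of what that switching could look like, picking the output modality from a few context signals (the signal names and thresholds here are illustrative assumptions, not a real API):

```typescript
type Modality = "voice" | "text";

interface Context {
  isMoving: boolean;     // e.g. walking, inferred from motion sensors
  headphonesOn: boolean;
  screenActive: boolean; // user is in front of a screen with a keyboard
}

function pickOutputModality(ctx: Context): Modality {
  if (ctx.isMoving && ctx.headphonesOn) return "voice"; // hands and eyes are busy
  if (ctx.screenActive) return "text";                  // typing is cheap at a desk
  return ctx.headphonesOn ? "voice" : "text";
}
```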
Gestures
Gestures are a significant part of the success of smartphones and many other Apple products. Whether it's a touchscreen tap, a hand movement in VR, or even a change in your walking pace, LLMs can interpret such actions and trigger the corresponding agent or function, as in the sketch below.
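One way to wire this up is to describe the gesture in plain language and let a function-calling model pick the tool. In this sketch, `askModelToChooseTool` stands in for that LLM request and is hypothetical; the gesture names and tools are illustrative.

```typescript
interface GestureEvent {
  kind: "pinch" | "swipe-left" | "pace-change";
  magnitude: number;
}

// Candidate actions the model can choose from.
const tools = [
  { name: "zoomOut", description: "Show a more summarized view of the content" },
  { name: "dismiss", description: "Dismiss the current card or suggestion" },
  { name: "pauseBriefing", description: "Pause the spoken briefing" },
];

type AskModel = (
  gestureDescription: string,
  tools: { name: string; description: string }[],
) => Promise<string>;

async function routeGesture(
  event: GestureEvent,
  askModelToChooseTool: AskModel,
): Promise<string> {
  const description = `User performed gesture "${event.kind}" with magnitude ${event.magnitude}.`;
  // The model returns the name of the tool it considers the best match.
  return askModelToChooseTool(description, tools);
}
```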
Emotions
People communicate through emotions as much as through semantics. LLMs are excellent at detecting emotions from text (better than most people, in my informal tests). Models that read emotions from facial expressions or voice already exist. Adapting actions and tone to the user's emotional state will become standard in AI system design.
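A minimal sketch of that adaptation, assuming an LLM-backed emotion classifier and a plain chat call; `classifyEmotion` and `chat` are hypothetical wrappers, and the tone instructions are only examples.

```typescript
type Emotion = "frustrated" | "anxious" | "neutral" | "enthusiastic";

type ClassifyEmotion = (message: string) => Promise<Emotion>;
type Chat = (systemInstruction: string, message: string) => Promise<string>;

// Tone guidance injected as a system-style instruction per detected emotion.
const toneFor: Record<Emotion, string> = {
  frustrated: "Be brief, concrete, and apologetic; skip pleasantries.",
  anxious: "Be calm and reassuring; explain the next step clearly.",
  neutral: "Answer plainly and helpfully.",
  enthusiastic: "Match the user's energy; offer ideas to build on.",
};

async function reply(
  message: string,
  classifyEmotion: ClassifyEmotion,
  chat: Chat,
): Promise<string> {
  const emotion = await classifyEmotion(message);
  return chat(toneFor[emotion], message);
}
```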