Gen AI and the Next Leap in Human-Computer Interaction

Written by Vaibs Kumar, SVP Technology, IFS
The pace of innovation in Gen AI is relentless: Copilots are a thing of the past, and everyone is now developing agents. The fear that developers won’t be needed by 2025 has died; if anything, Gen AI has helped developer efficiency the most. A key reason is that programming languages have more defined and stricter rules than human language. If you peek into a large language model, you see that when you train a behemoth neural network (e.g. over 3 billion parameters) on a massive corpus of text (say 3 billion articles), it can produce human-like language.
This shows that human language has patterns, inferred by these models, that aren’t yet codified into rules of language. Then you build fail-safes into the model so it doesn’t produce grammatically correct language like “the cat flies to Mars”, which doesn’t mean anything, or worse yet, language that has a terrible meaning.
Approaches like Retrieval Augmented Generation can greatly reduce the hallucinatory nature of Gen AI by grounding it in indexes of accurate information or links (like search engines do). Here, Gen AI is used as the natural language interface: it matches the question with the underlying information in the index and then provides a contextualized natural language response to the user.
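To make the retrieval idea concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than a real product's API: the three-entry `DOCUMENTS` index, the word-overlap `score` function (a stand-in for a proper search or vector index), and the helper names `retrieve` and `build_prompt`. The call to an actual language model is deliberately left out; the sketch only shows how a question is matched against an index and how the retrieved text is packed into the prompt the model would answer from.

```python
import re
from collections import Counter

# Hypothetical in-memory "index" of trusted information.
# A real system would use a search engine or vector store instead.
DOCUMENTS = {
    "returns": "Items can be returned within 30 days of delivery with a receipt.",
    "shipping": "Standard shipping takes 3 to 5 business days.",
    "warranty": "Hardware is covered by a two year limited warranty.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(question, passage):
    """Count the words shared by the question and the passage (toy relevance score)."""
    q = Counter(tokenize(question))
    p = Counter(tokenize(passage))
    return sum(min(q[word], p[word]) for word in q)


def retrieve(question, k=1):
    """Return the ids of the k passages that best match the question."""
    ranked = sorted(
        DOCUMENTS,
        key=lambda doc_id: score(question, DOCUMENTS[doc_id]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, k=1):
    """Combine the retrieved passages and the question into one grounded prompt."""
    context = "\n".join(f"[{doc_id}] {DOCUMENTS[doc_id]}" for doc_id in retrieve(question, k))
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say you do not know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    # The printed prompt is what would be sent to a language model of your choice.
    print(build_prompt("How long does shipping take?"))
```

The design point is that the model is asked to phrase an answer from text that was looked up, not to recall facts from its own training, which is what keeps the response tied to the indexed information.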
That has me thinking: isn’t that sufficient? Maybe Gen AI doesn’t need to be over-hyped as the sentient technology that will eliminate most human jobs, because Gen AI is indeed the most effective natural language interface ever. Speech-to-text and text-to-speech have advanced significantly in recent years, but never before have we been able to have such powerful natural language interactions with computer software.
When we moved from character-based systems to graphical user interfaces in the 1990s, we took a massive experiential leap that completely changed our interaction pattern. Touch brought about the advent of tablet computing, which eventually led to the iPhone. To me, Gen AI is the next major experiential leap. I think that with more computing power in much smaller client devices (e.g. phones, watches), multi-modality across speech, text, image and video, and of course Gen AI’s ability to accurately understand and respond in natural language, we are paving the way for computer software that we will interact with in a completely different manner in the future.
I’m not saying a GUI will not be needed, but it has the potential to be radically different. There is a lot more innovation to come, including the next generation of software that natively incorporates the ability to interact with it in natural language. Then the hardware, where there have been some recent failed attempts, will adapt, and I foresee some very interesting interaction patterns evolving.
Most importantly, humanity needs to adapt to using software through natural language. Believe it or not, over the last 50 years we have been conditioned to use computers in a certain way, and changing that will take some time.