People who grew up playing Zork and other Infocom games know that chat interfaces don't have to be perfect to be magical. Conversational interfaces are awesome, and it's amazing to see them develop.
The trouble is that a lot of the hype treats the chatbot not as an interface but as a universal knower and doer: a kind of silver bullet for automation in general, as well as a creator of code, text, images, movies, etc.
When the chatbot is an interface, the skill and effort remain with the user. They are using a conversational interface to make something worth making or to find something worth knowing, and that skill and effort is what makes it valuable.
When the idea is that you can offload the knowing and doing to the AI, that the chatbot is not an interface but a skilled servant, it's presumed that the user isn't adding any skill or effort of their own. This is where things go off the rails.
Feasibility comes into question: can "AI" (LLMs, etc.) really know and do things without human skill and knowledge? Early results don't look good. So do economic questions: if something can be produced with no skill and effort, is it worth anything?
One outcome is that a small set of companies prevents competition by hoarding capital, hardware, and network resources, blocking new entrants and stifling innovation. Let's call that the #enshittification path.
The other is that companies focus on conversational interfaces: not on the agency of the chatbot, but on maximizing the agency of the people using it, whether they are shopping, creating software, writing, or whatever.
While the #enshittification path depends on centralization and hoarding to create an all-knowing oracle, the #dialogic path depends on diffusion, openness, and human-centred design to create a powerful, expressive lever for human agency.
Dmytri
in reply to Dmytri • • •Well, technically the answer is no, not unless supply can be constrained.
When the dust settles, the AI wars will be settled in one of two ways.
jeremiah
in reply to Dmytri • • •With the dialogic path, you've effectively captured something that I've been thinking about. I regularly find myself revisiting the capabilities of these new AI systems, and I always land on a mixture of disappointed and impressed, with the disappointment outweighing the rest.
But, for example, the chat interface to OCR is pretty useful. I wanted a serial number off of an access point the other day. I snapped a picture and sent it to my desktop.
I looked at it, and I had the "I wonder" thought, and I asked for the serial number AND MAC address. The system faithfully ignored the other information and presented what I asked for. I double-checked, and everything was accurate. That was legitimately cool.
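The double-check described above can also be done deterministically. Here is a minimal Python sketch, assuming the OCR text is already available as a string. The label contents, the `S/N:` and `MAC:` conventions, and the helper names `extract_fields` and `matches_label` are all made up for illustration and do not come from any particular tool.

```python
import re
from typing import Optional

# Hypothetical OCR output from a device label; every value here is invented.
LABEL_TEXT = """\
Model: AP-1234
S/N: ABC123456789
MAC: 00:1A:2B:3C:4D:5E
FCC ID: XYZ-0000
"""

# Six pairs of hex digits separated by ':' or '-'.
MAC_RE = re.compile(r"\b[0-9A-Fa-f]{2}(?:[:-][0-9A-Fa-f]{2}){5}\b")
# Assumed label convention: the serial follows "S/N:" or "Serial:".
SERIAL_RE = re.compile(r"(?:S/N|Serial)\s*:\s*([A-Z0-9]{6,})")


def _norm(value: str) -> str:
    """Normalize case and MAC separators so equivalent spellings compare equal."""
    return value.strip().upper().replace("-", ":")


def extract_fields(text: str) -> dict[str, Optional[str]]:
    """Pull the MAC address and serial number out of raw label text."""
    mac = MAC_RE.search(text)
    serial = SERIAL_RE.search(text)
    return {
        "mac": _norm(mac.group(0)) if mac else None,
        "serial": _norm(serial.group(1)) if serial else None,
    }


def matches_label(reported: dict[str, str], text: str) -> bool:
    """True only if every value the model reported is also found by the regexes."""
    extracted = extract_fields(text)
    return all(
        extracted.get(key) is not None and extracted[key] == _norm(value)
        for key, value in reported.items()
    )


if __name__ == "__main__":
    # Pretend the chat interface answered with these two values.
    answer = {"serial": "ABC123456789", "mac": "00-1a-2b-3c-4d-5e"}
    print(extract_fields(LABEL_TEXT))
    print(matches_label(answer, LABEL_TEXT))
```

The point of the sketch is the division of labour: the model does the fuzzy reading, and a small, inspectable function confirms that what it reported really appears in the source text.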
What I don't like is that I don't have a lot of control over, or insight into, these things. Ideally, I would be able to evaluate and plug and play these with a combination of models and deterministic plugins, starting with a canvas and building my ideal user agent from the ground up. I don't want this to be folded into a web browser because I think having them as discrete tools is useful and adds a layer of cognitive protection.
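One way to picture the "models plus deterministic plugins" idea is a tiny user agent that accepts any model object and runs its output through a chain of plain functions, logging each step for insight. This is a sketch under assumptions, not a description of an existing application: the `Model` interface, `UserAgent`, `EchoModel`, and both plugins are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Protocol


class Model(Protocol):
    """Hypothetical interface: anything that turns a prompt into text."""

    def complete(self, prompt: str) -> str: ...


# A deterministic plugin takes text and returns (ok, text).
# The same input always produces the same result.
Plugin = Callable[[str], tuple[bool, str]]


@dataclass
class UserAgent:
    model: Model                                         # swappable model
    plugins: list[Plugin] = field(default_factory=list)  # run in order
    log: list[str] = field(default_factory=list)         # record of every step

    def ask(self, prompt: str) -> str:
        text = self.model.complete(prompt)
        self.log.append(f"model -> {text!r}")
        for plugin in self.plugins:
            ok, text = plugin(text)
            self.log.append(f"{plugin.__name__}: ok={ok} -> {text!r}")
            if not ok:
                raise ValueError(f"{plugin.__name__} rejected the output")
        return text


class EchoModel:
    """Stand-in for a real model so the sketch runs offline."""

    def complete(self, prompt: str) -> str:
        return f"  {prompt.upper()}  "


def strip_whitespace(text: str) -> tuple[bool, str]:
    """Transforming plugin: trims surrounding whitespace."""
    return True, text.strip()


def non_empty(text: str) -> tuple[bool, str]:
    """Checking plugin: rejects empty output."""
    return bool(text), text


if __name__ == "__main__":
    agent = UserAgent(model=EchoModel(), plugins=[strip_whitespace, non_empty])
    print(agent.ask("serial number"))
    print("\n".join(agent.log))
```

Because the model sits behind a small interface and the plugins are ordinary functions, either side can be swapped or inspected on its own, which is roughly the control and insight asked for above.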
I have found a few desktop applications that sort of gesture towards this, but I'm not seeing anything that feels good in the way my IDE or web browser feel good.
Admittedly, knowing how these systems currently operate sort of saps away any good feeling one might have.
Dmytri
in reply to jeremiah • • •