<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>AI Philosophy on Code Plato</title><link>https://CodePlato3721.github.io/categories/ai-philosophy/</link><description>Recent content in AI Philosophy on Code Plato</description><generator>Hugo -- gohugo.io</generator><language>en</language><lastBuildDate>Wed, 06 May 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://CodePlato3721.github.io/categories/ai-philosophy/index.xml" rel="self" type="application/rss+xml"/><item><title>Web4.0 Is Coming</title><link>https://CodePlato3721.github.io/post/web4-is-coming/</link><pubDate>Wed, 06 May 2026 00:00:00 +0000</pubDate><guid>https://CodePlato3721.github.io/post/web4-is-coming/</guid><description>&lt;img src="https://pub-deacd49348914a49b1254b01f351ef0d.r2.dev/2026/05/web4-is-coming/en/banner.png" alt="Featured image of post Web4.0 Is Coming" /&gt;
 &lt;blockquote&gt;
 &lt;p&gt;AI isn&amp;rsquo;t just a tool upgrade — it&amp;rsquo;s a new computing platform revolution.&lt;/p&gt;

 &lt;/blockquote&gt;
&lt;hr&gt;
&lt;h1 id="part-1-the-cracks-are-already-showing"&gt;Part 1: The Cracks Are Already Showing
&lt;/h1&gt;&lt;p&gt;I&amp;rsquo;ve been job hunting recently, and I noticed something interesting: genuine &amp;ldquo;LLM integration developer&amp;rdquo; roles are still surprisingly rare. What&amp;rsquo;s more interesting is that even when companies do post them, most require:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;AI Agent experience&lt;/li&gt;
&lt;li&gt;LLM project experience&lt;/li&gt;
&lt;li&gt;RAG experience&lt;/li&gt;
&lt;li&gt;AI Workflow experience&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Here&amp;rsquo;s the problem: LLM development has only exploded in the past few years. How many engineers actually have complete AI development experience? Many engineers only started transitioning into LLM development a few months ago.&lt;/p&gt;
&lt;p&gt;If you keep the bar this rigid and can&amp;rsquo;t hire anyone, those people will get picked up by other companies. In another year or two, you might not be able to hire them at all, even if you want to.&lt;/p&gt;
&lt;p&gt;(So if I&amp;rsquo;m job hunting right now — you could hire me today. Just don&amp;rsquo;t make me do LeetCode.)&lt;/p&gt;
&lt;p&gt;But the really interesting part isn&amp;rsquo;t the hiring market. It&amp;rsquo;s that most companies, even now, have no idea how to make money with AI. The people who are actually using LLMs to build things are indie developers, small teams, hackers, and solo founders. They don&amp;rsquo;t even know if it will be profitable — but they&amp;rsquo;re running experiments anyway, because &amp;ldquo;this thing is just too cool.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;That hacker intuition is hard to explain with traditional business logic. Most great tech revolutions didn&amp;rsquo;t start with a clear business model. They started because a group of people thought something was fascinating.&lt;/p&gt;
&lt;p&gt;That&amp;rsquo;s how the internet started. Personal computers. Smartphones. And now AI.&lt;/p&gt;
&lt;p&gt;The real danger is that many large companies are still sitting comfortably in their existing lanes, asking:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Can AI make money?&lt;/li&gt;
&lt;li&gt;How do we calculate AI ROI?&lt;/li&gt;
&lt;li&gt;Will AI disrupt our current business?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;But the question they should actually be asking is:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;&amp;ldquo;Will our company still exist in ten years?&amp;rdquo;&lt;/p&gt;

 &lt;/blockquote&gt;
&lt;p&gt;Because history has already answered this. Kodak didn&amp;rsquo;t die because its technology was weak. Nokia didn&amp;rsquo;t die because its engineers weren&amp;rsquo;t good enough. They died because when a new computing platform arrived, they were still living in the old era.&lt;/p&gt;
&lt;p&gt;And right now, the cracks are already showing.&lt;/p&gt;
&lt;p&gt;The way I see it, it&amp;rsquo;s as if Niagara Falls were being held back by a thin mud wall — and that wall has started to crack.&lt;/p&gt;
&lt;p&gt;Today, 90% of internet companies are already standing at the edge of a cliff. They just haven&amp;rsquo;t realized it yet. Don&amp;rsquo;t believe me? Let&amp;rsquo;s run a social experiment starting now:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Build an AI Skill for Jira&lt;/li&gt;
&lt;li&gt;Build an AI Skill for productivity tools&lt;/li&gt;
&lt;li&gt;Build AI-native versions of various Web2.0 apps&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Watch what happens.&lt;/p&gt;
&lt;hr&gt;
&lt;h1 id="part-2-the-web40-architecture"&gt;Part 2: The Web4.0 Architecture
&lt;/h1&gt;&lt;p&gt;&amp;ldquo;Web3.0&amp;rdquo; is a term that&amp;rsquo;s been talked to death. Why? Because it never produced a computing paradigm genuinely capable of restructuring Web2.0.&lt;/p&gt;
&lt;p&gt;But AI is different.&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;m calling this wave Web4.0, because AI is starting to deeply penetrate software itself. It&amp;rsquo;s no longer just a search bar, a chatbot, or an assistant tool — it&amp;rsquo;s gradually becoming part of the operating logic of software.&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;d even argue this will be the fourth industrial revolution, because for the first time, machines are beginning to &lt;em&gt;participate in producing software themselves&lt;/em&gt;.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="1-the-software-interface"&gt;1. The Software Interface
&lt;/h2&gt;&lt;p&gt;&lt;img alt="Software Interface" class="gallery-image" data-flex-basis="351px" data-flex-grow="146" height="447" loading="lazy" sizes="(max-width: 767px) calc(100vw - 30px), (max-width: 1023px) 700px, (max-width: 1279px) 950px, 1232px" src="https://pub-deacd49348914a49b1254b01f351ef0d.r2.dev/2026/05/web4-is-coming/02_ui.png" width="654"&gt;&lt;/p&gt;
&lt;p&gt;The software interface of Web4.0 will look very different from today&amp;rsquo;s — but not completely unfamiliar.&lt;/p&gt;
&lt;p&gt;Future software will most likely split into: &lt;strong&gt;software on the left, AI on the right&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The left side will still be traditional GUI:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Task lists&lt;/li&gt;
&lt;li&gt;Tables&lt;/li&gt;
&lt;li&gt;Charts&lt;/li&gt;
&lt;li&gt;Dashboards&lt;/li&gt;
&lt;li&gt;Status bars&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Humans still need to &lt;em&gt;see&lt;/em&gt; state, so GUI isn&amp;rsquo;t going away.&lt;/p&gt;
&lt;p&gt;But the right side will become an AI operation layer. Users won&amp;rsquo;t primarily interact through buttons anymore — they&amp;rsquo;ll accomplish most tasks through natural language, conversation, and intent.&lt;/p&gt;
&lt;p&gt;For example:&lt;/p&gt;
&lt;p&gt;&lt;em&gt;&amp;ldquo;Move this issue to next week and notify the relevant team members.&amp;rdquo;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;AI will:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Update the issue&lt;/li&gt;
&lt;li&gt;Change the status&lt;/li&gt;
&lt;li&gt;Send notifications&lt;/li&gt;
&lt;li&gt;Adjust the timeline&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The left-side GUI&amp;rsquo;s role shifts to: &lt;em&gt;showing the current state of the system&lt;/em&gt;. Users can even watch AI operate within the system and step in manually when needed.&lt;/p&gt;
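&lt;p&gt;As a minimal sketch of that intent-to-action mapping (the tool names and the keyword matching here are hypothetical stand-ins for real LLM function calling, not any product&amp;rsquo;s API):&lt;/p&gt;

```python
# Hypothetical mapping from a natural-language intent to left-side operations.
# A real system would use an LLM with tool calling; keyword rules stand in
# for the model so the sketch stays self-contained.

def plan_actions(intent: str) -> list[str]:
    """Return the ordered tool calls implied by a user intent."""
    actions = []
    text = intent.lower()
    if "move" in text or "reschedule" in text:
        actions += ["update_issue", "adjust_timeline"]
    if "notify" in text:
        actions.append("send_notifications")
    return actions

plan = plan_actions("Move this issue to next week and notify the relevant team members.")
# The GUI re-renders system state after each action, so humans can supervise.
```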
&lt;p&gt;Software will shift from:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;&amp;ldquo;Humans operate software&amp;rdquo;&lt;/p&gt;

 &lt;/blockquote&gt;
&lt;p&gt;to:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;&amp;ldquo;AI operates software. Humans supervise AI.&amp;rdquo;&lt;/p&gt;

 &lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 id="2-system-architecture"&gt;2. System Architecture
&lt;/h2&gt;&lt;p&gt;&lt;img alt="System Architecture" class="gallery-image" data-flex-basis="360px" data-flex-grow="150" height="512" loading="lazy" sizes="(max-width: 767px) calc(100vw - 30px), (max-width: 1023px) 700px, (max-width: 1279px) 950px, 1232px" src="https://pub-deacd49348914a49b1254b01f351ef0d.r2.dev/2026/05/web4-is-coming/03_sys.png" width="768"&gt;&lt;/p&gt;
&lt;p&gt;The core shift in Web4.0 is that every frontend will eventually connect to an AI engine.&lt;/p&gt;
&lt;p&gt;Whether it&amp;rsquo;s:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;App&lt;/li&gt;
&lt;li&gt;Web&lt;/li&gt;
&lt;li&gt;Desktop&lt;/li&gt;
&lt;li&gt;Skill&lt;/li&gt;
&lt;li&gt;Agent&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Everything will plug into:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;SLM + RAG&lt;/p&gt;

 &lt;/blockquote&gt;
&lt;p&gt;Many people assume the future will be dominated by ever-larger models, but I don&amp;rsquo;t think so. LLMs are too expensive, enterprise-sensitive data can&amp;rsquo;t leave the building, and no serious company wants its core technology dependent on someone else&amp;rsquo;s API. A truly mature company will never build its core business permanently on external infrastructure.&lt;/p&gt;
&lt;p&gt;So Web4.0 will inevitably move toward:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;Each company&amp;rsquo;s own SLM (Small Language Model) + proprietary RAG.&lt;/p&gt;

 &lt;/blockquote&gt;
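&lt;p&gt;A toy sketch of this pattern, assuming a self-hosted small model (stubbed here as &lt;code&gt;generate&lt;/code&gt;) and a naive keyword retriever over internal documents; nothing leaves the building:&lt;/p&gt;

```python
# Toy sketch of "own SLM + proprietary RAG": retrieve internal documents,
# then feed them to a locally hosted model. `generate` is a stub standing
# in for an on-prem small language model.

def retrieve(query: str, docs: dict[str, str], k: int = 2) -> list[str]:
    """Rank internal docs by naive keyword overlap with the query."""
    words = set(query.lower().split())
    scored = sorted(docs, key=lambda d: -len(words & set(docs[d].lower().split())))
    return [docs[d] for d in scored[:k]]

def generate(prompt: str) -> str:
    # Placeholder for a local SLM call; enterprise data stays in-house.
    return f"[SLM answer grounded in {prompt.count('DOC:')} retrieved docs]"

docs = {
    "hr": "vacation policy grants 20 days per year",
    "it": "reset your password via the internal portal",
}
context = retrieve("how many vacation days per year", docs)
answer = generate("".join(f"DOC: {c}\n" for c in context) + "Q: vacation days?")
```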
&lt;p&gt;LLMs will be more like early exploration tools, general reasoning engines, and product validation platforms. Mature products will eventually own their own:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;AI Engine&lt;/li&gt;
&lt;li&gt;Memory&lt;/li&gt;
&lt;li&gt;Knowledge Base&lt;/li&gt;
&lt;li&gt;Workflow System&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The competitive moat for companies will gradually shift away from:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Frontend pages&lt;/li&gt;
&lt;li&gt;CRUD systems&lt;/li&gt;
&lt;li&gt;Database design&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;And toward:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;RAG architecture&lt;/li&gt;
&lt;li&gt;Workflow orchestration&lt;/li&gt;
&lt;li&gt;Enterprise knowledge organization&lt;/li&gt;
&lt;li&gt;Agent collaboration systems&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h2 id="3-the-product-lifecycle"&gt;3. The Product Lifecycle
&lt;/h2&gt;&lt;p&gt;&lt;img alt="Product Lifecycle" class="gallery-image" data-flex-basis="360px" data-flex-grow="150" height="512" loading="lazy" sizes="(max-width: 767px) calc(100vw - 30px), (max-width: 1023px) 700px, (max-width: 1279px) 950px, 1232px" src="https://pub-deacd49348914a49b1254b01f351ef0d.r2.dev/2026/05/web4-is-coming/04_lifecycle.png" width="768"&gt;&lt;/p&gt;
&lt;p&gt;The lifecycle of Web4.0 products will also change.&lt;/p&gt;
&lt;p&gt;In the early stage, most teams will go straight to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;OpenAI&lt;/li&gt;
&lt;li&gt;Claude&lt;/li&gt;
&lt;li&gt;Gemini&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Combined with:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;MCP&lt;/li&gt;
&lt;li&gt;RAG&lt;/li&gt;
&lt;li&gt;Workflow&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;To ship fast — because the cost of experimentation is low, and the product can &amp;ldquo;come alive&amp;rdquo; from day one.&lt;/p&gt;
&lt;p&gt;This is completely different from before. Products used to require massive amounts of custom logic before they were usable. Now AI already ships with enormous general-purpose capability.&lt;/p&gt;
&lt;p&gt;But at the mature stage, companies will gradually migrate to:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;SLM + proprietary RAG&lt;/p&gt;

 &lt;/blockquote&gt;
&lt;p&gt;The reasons are practical:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Reduce costs&lt;/li&gt;
&lt;li&gt;Control data&lt;/li&gt;
&lt;li&gt;Reduce API dependency&lt;/li&gt;
&lt;li&gt;Ensure stability&lt;/li&gt;
&lt;li&gt;Establish technical sovereignty&lt;/li&gt;
&lt;/ul&gt;
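&lt;p&gt;One way to keep that migration cheap is to hide the model behind a single interface from day one, so an external API can later be swapped for an in-house SLM without rewriting callers. A sketch, with both backends stubbed (no real provider API is called):&lt;/p&gt;

```python
# Sketch of keeping the model behind one interface so a product can move
# from an external LLM API to an in-house SLM without rewriting callers.
# Both backends are stubs.

from typing import Protocol

class ModelBackend(Protocol):
    def complete(self, prompt: str) -> str: ...

class ExternalLLM:
    def complete(self, prompt: str) -> str:
        return f"external:{prompt}"   # would call a vendor API

class InHouseSLM:
    def complete(self, prompt: str) -> str:
        return f"inhouse:{prompt}"    # would call a local fine-tuned model

class Product:
    def __init__(self, backend: ModelBackend):
        self.backend = backend
    def ask(self, q: str) -> str:
        return self.backend.complete(q)

app = Product(ExternalLLM())      # early stage: ship fast on an API
early = app.ask("summarize")
app.backend = InHouseSLM()        # mature stage: migrate to owned infra
mature = app.ask("summarize")
```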
&lt;p&gt;So the typical Web4.0 product evolution path will likely look like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;LLM API
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; ↓
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;RAG
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; ↓
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;Workflow
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; ↓
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;SLM
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; ↓
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;Enterprise AI Engine
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;hr&gt;
&lt;h2 id="4-customer-support"&gt;4. Customer Support
&lt;/h2&gt;&lt;p&gt;Customer service may be one of the first industries to be fully restructured.&lt;/p&gt;
&lt;p&gt;But this time, it&amp;rsquo;s real AI support — not the &amp;ldquo;fake AI that makes everyone want to throw their phone&amp;rdquo; from before.&lt;/p&gt;
&lt;p&gt;Old AI customer service:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Couldn&amp;rsquo;t follow context&lt;/li&gt;
&lt;li&gt;Couldn&amp;rsquo;t hold a continuous conversation&lt;/li&gt;
&lt;li&gt;Couldn&amp;rsquo;t read emotions&lt;/li&gt;
&lt;li&gt;Only matched keywords&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So users always ended up demanding a human.&lt;/p&gt;
&lt;p&gt;Web4.0 AI support is different. It will genuinely understand:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Context&lt;/li&gt;
&lt;li&gt;Conversation history&lt;/li&gt;
&lt;li&gt;User sentiment&lt;/li&gt;
&lt;li&gt;User behavior&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It can even detect:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;&amp;ldquo;This user is getting frustrated.&amp;rdquo;&lt;/p&gt;

 &lt;/blockquote&gt;
&lt;p&gt;And proactively say:&lt;/p&gt;
&lt;p&gt;&lt;em&gt;&amp;ldquo;Let me connect you with a human agent.&amp;rdquo;&lt;/em&gt;&lt;/p&gt;
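&lt;p&gt;A toy version of that escalation logic (a real system would run model-based sentiment over the whole conversation; a keyword score keeps the sketch self-contained):&lt;/p&gt;

```python
# Toy frustration detector for the escalation behavior described above.
# The word list and threshold are illustrative assumptions.

FRUSTRATION_WORDS = {"useless", "again", "angry", "ridiculous", "human"}

def frustration_score(messages: list[str]) -> int:
    return sum(w in m.lower() for m in messages for w in FRUSTRATION_WORDS)

def support_reply(messages: list[str], threshold: int = 2) -> str:
    if frustration_score(messages) >= threshold:
        return "Let me connect you with a human agent."
    return "Happy to help -- could you share more details?"

calm = support_reply(["My invoice looks wrong"])
escalated = support_reply(["This is ridiculous", "I asked this again already"])
```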
&lt;p&gt;Most companies&amp;rsquo; support operations will become fully AI-manageable. The scenarios that still require humans will shrink to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;High-stakes decisions&lt;/li&gt;
&lt;li&gt;Emotional de-escalation&lt;/li&gt;
&lt;li&gt;Edge case handling&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Another industry, restructured.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="5-version-iteration"&gt;5. Version Iteration
&lt;/h2&gt;&lt;p&gt;This is a more radical idea, but I think it&amp;rsquo;s cool — and the kind of thing that could go viral.&lt;/p&gt;
&lt;p&gt;It&amp;rsquo;s this:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;&amp;ldquo;What goes into the next version is decided by user vote.&amp;rdquo;&lt;/p&gt;

 &lt;/blockquote&gt;
&lt;p&gt;AI will:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Analyze user behavior&lt;/li&gt;
&lt;li&gt;Summarize user needs&lt;/li&gt;
&lt;li&gt;Auto-generate candidate features&lt;/li&gt;
&lt;li&gt;Let users vote&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;And eventually, AI will auto-implement some of those features too.&lt;/p&gt;
&lt;p&gt;The old software development flow:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;Product Manager
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; ↓
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;Requirements
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; ↓
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;Engineering
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;In the Web4.0 era, it may gradually become:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;Users
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; ↓
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;AI Analysis
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; ↓
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;AI Implementation
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; ↓
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;User Feedback
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Software will enter:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;&amp;ldquo;The era of high-velocity self-evolution.&amp;rdquo;&lt;/p&gt;

 &lt;/blockquote&gt;
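&lt;p&gt;The voting step in that loop could be sketched as simply as this (candidate names are made up for illustration):&lt;/p&gt;

```python
# Minimal sketch of the vote-driven loop: AI-generated candidate features
# are put to a user vote, and the winners enter the next release.

from collections import Counter

def next_release(candidates: list[str], votes: list[str], slots: int = 2) -> list[str]:
    """Pick the top-voted candidate features for the next version."""
    tally = Counter(v for v in votes if v in candidates)
    return [feature for feature, _ in tally.most_common(slots)]

candidates = ["dark mode", "offline sync", "ai summaries"]
votes = ["dark mode", "ai summaries", "dark mode", "offline sync", "dark mode"]
release = next_release(candidates, votes)
```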
&lt;hr&gt;
&lt;h1 id="part-3-web40-is-not-an-upgrade--its-a-replacement"&gt;Part 3: Web4.0 Is Not an Upgrade — It&amp;rsquo;s a Replacement
&lt;/h1&gt;&lt;p&gt;Many companies still think of AI as a plugin, a feature, a chat window, a productivity tool.&lt;/p&gt;
&lt;p&gt;But what AI is actually changing is the entire software architecture.&lt;/p&gt;
&lt;p&gt;Web4.0 is not &amp;ldquo;Web2.0 + AI.&amp;rdquo; It&amp;rsquo;s a new computing platform — just like:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;PCs replaced mainframes&lt;/li&gt;
&lt;li&gt;Smartphones replaced parts of the PC&lt;/li&gt;
&lt;li&gt;Cloud computing restructured enterprise systems&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;AI will redefine:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Software&lt;/li&gt;
&lt;li&gt;Workflows&lt;/li&gt;
&lt;li&gt;Organizational structures&lt;/li&gt;
&lt;li&gt;Development models&lt;/li&gt;
&lt;li&gt;User interaction&lt;/li&gt;
&lt;li&gt;Enterprise architecture&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Most companies think they&amp;rsquo;re just waiting for AI to mature.&lt;/p&gt;
&lt;p&gt;But actually:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;AI is waiting to replace them.&lt;/p&gt;

 &lt;/blockquote&gt;
&lt;p&gt;We may be standing at the single biggest technological inflection point since the invention of the computer. And many companies are already at the edge of the cliff — they just haven&amp;rsquo;t looked down yet.&lt;/p&gt;</description></item><item><title>LLM-Based AI Agent Architecture: A New Kind of Personal Computer on Your Device</title><link>https://CodePlato3721.github.io/post/llm-agent-architecture-new-kind-of-personal-computer/</link><pubDate>Tue, 05 May 2026 00:00:00 +0000</pubDate><guid>https://CodePlato3721.github.io/post/llm-agent-architecture-new-kind-of-personal-computer/</guid><description>&lt;img src="https://pub-deacd49348914a49b1254b01f351ef0d.r2.dev/2026/05/llm-agent-architecture-a-new-kind-of-personal-computer/en/banner.png" alt="Featured image of post LLM-Based AI Agent Architecture: A New Kind of Personal Computer on Your Device" /&gt;&lt;h1 id="llm-based-ai-agent-architecture-a-new-kind-of-personal-computer-on-your-device"&gt;LLM-Based AI Agent Architecture: A New Kind of Personal Computer on Your Device
&lt;/h1&gt;&lt;p&gt;For a long time, we&amp;rsquo;ve thought of AI as a &amp;ldquo;chatbot.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;But if you step back and look from a systems architecture perspective, you&amp;rsquo;ll find that a truly mature AI agent looks more like a new kind of personal computer — one that lives on your device.&lt;/p&gt;
&lt;p&gt;It has:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A compute core&lt;/li&gt;
&lt;li&gt;Memory&lt;/li&gt;
&lt;li&gt;A file system&lt;/li&gt;
&lt;li&gt;A software system&lt;/li&gt;
&lt;li&gt;Input/output devices&lt;/li&gt;
&lt;li&gt;Long-term storage&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The difference is:&lt;/p&gt;
&lt;p&gt;Its core isn&amp;rsquo;t a traditional CPU. It&amp;rsquo;s an LLM.&lt;/p&gt;
&lt;hr&gt;
&lt;h1 id="part-1-the-llm-engine--a-cpu-without-memory"&gt;Part 1: The LLM Engine — A &amp;ldquo;CPU&amp;rdquo; Without Memory
&lt;/h1&gt;&lt;p&gt;The LLM itself has no long-term memory.&lt;/p&gt;
&lt;p&gt;It&amp;rsquo;s more like an inference engine:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Receives input&lt;/li&gt;
&lt;li&gt;Reads context&lt;/li&gt;
&lt;li&gt;Performs reasoning&lt;/li&gt;
&lt;li&gt;Produces output&lt;/li&gt;
&lt;li&gt;Then &amp;ldquo;forgets&amp;rdquo;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;It cannot natively remember things that happened in the past.&lt;/p&gt;
&lt;p&gt;Therefore:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;The LLM itself is more like a CPU than a complete agent.&lt;/p&gt;

 &lt;/blockquote&gt;
&lt;p&gt;It only handles computation.&lt;/p&gt;
&lt;p&gt;What makes AI &amp;ldquo;seem like it knows you&amp;rdquo; is the context provided externally.&lt;/p&gt;
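&lt;p&gt;The five-step loop can be made concrete: every call receives input plus externally supplied context, produces output, and retains nothing. A sketch with the engine stubbed as a plain function:&lt;/p&gt;

```python
# Sketch of the stateless "CPU" loop: the engine keeps no state between
# calls, so anything it should "remember" must arrive as context each time.

def llm_step(user_input: str, context: list[str]) -> str:
    """One inference pass: read context + input, produce output, retain nothing."""
    known = ", ".join(context) if context else "nothing"
    return f"Given that I know {known}: response to {user_input!r}"

context: list[str] = []
first = llm_step("my name is Ada", context)     # the engine sees no history
context.append("user is named Ada")             # memory lives OUTSIDE the engine
second = llm_step("what is my name?", context)  # context makes it "remember"
```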
&lt;p&gt;&lt;img alt="LLM CPU" class="gallery-image" data-flex-basis="276px" data-flex-grow="115" height="325" loading="lazy" sizes="(max-width: 767px) calc(100vw - 30px), (max-width: 1023px) 700px, (max-width: 1279px) 950px, 1232px" src="https://pub-deacd49348914a49b1254b01f351ef0d.r2.dev/2026/05/llm-agent-architecture-a-new-kind-of-personal-computer/en/01_llm_cpu.png" width="375"&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h1 id="part-2-context--the-ai-agents-memory"&gt;Part 2: Context — The AI Agent&amp;rsquo;s Memory
&lt;/h1&gt;&lt;p&gt;If the LLM is the CPU,&lt;br&gt;
then Context is the AI&amp;rsquo;s memory.&lt;/p&gt;
&lt;p&gt;And this memory should be split into two layers.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="1-global-context"&gt;1. Global Context
&lt;/h2&gt;&lt;p&gt;This layer belongs to the entire agent.&lt;/p&gt;
&lt;p&gt;It records:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;User preferences&lt;/li&gt;
&lt;li&gt;Long-term goals&lt;/li&gt;
&lt;li&gt;Habitual behaviors&lt;/li&gt;
&lt;li&gt;Persona settings&lt;/li&gt;
&lt;li&gt;Persistent rules&lt;/li&gt;
&lt;li&gt;Historical knowledge&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&amp;ldquo;User prefers Markdown&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&amp;ldquo;User is learning AI Agents&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&amp;ldquo;User habitually writes in Chinese&amp;rdquo;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This information shapes agent behavior over time.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="2-session-context"&gt;2. Session Context
&lt;/h2&gt;&lt;p&gt;This layer belongs only to the current conversation.&lt;/p&gt;
&lt;p&gt;For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The current topic under discussion&lt;/li&gt;
&lt;li&gt;The current article structure&lt;/li&gt;
&lt;li&gt;The most recent rounds of dialogue&lt;/li&gt;
&lt;li&gt;Temporary reasoning results&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It&amp;rsquo;s more like temporary memory during program execution.&lt;/p&gt;
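&lt;p&gt;The two layers can be sketched as a small data structure: a global layer that survives across sessions and a session layer that is cleared when the conversation ends (field names here are illustrative):&lt;/p&gt;

```python
# Sketch of the two context layers: a long-lived global layer and a
# per-conversation session layer that is discarded at session end.

from dataclasses import dataclass, field

@dataclass
class AgentContext:
    global_ctx: dict[str, str] = field(default_factory=dict)  # long-lived
    session_ctx: list[str] = field(default_factory=list)      # per-conversation

    def build_prompt(self, user_input: str) -> str:
        prefs = "; ".join(f"{k}: {v}" for k, v in self.global_ctx.items())
        history = "\n".join(self.session_ctx)
        return f"[profile] {prefs}\n[history]\n{history}\n[input] {user_input}"

    def end_session(self) -> None:
        self.session_ctx.clear()  # the global layer survives

ctx = AgentContext(global_ctx={"format": "Markdown", "language": "Chinese"})
ctx.session_ctx.append("user: draft an outline")
prompt = ctx.build_prompt("expand section 2")
ctx.end_session()
```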
&lt;hr&gt;
&lt;h2 id="the-context-window-is-essentially-a-memory-limit"&gt;The Context Window Is Essentially a &amp;ldquo;Memory Limit&amp;rdquo;
&lt;/h2&gt;&lt;p&gt;An LLM&amp;rsquo;s Context Window isn&amp;rsquo;t unlimited.&lt;/p&gt;
&lt;p&gt;This means:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;History can&amp;rsquo;t accumulate indefinitely&lt;/li&gt;
&lt;li&gt;Information gets more expensive as the window fills&lt;/li&gt;
&lt;li&gt;Past the limit, content must be compressed&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Therefore:&lt;/p&gt;
&lt;p&gt;An agent must manage memory like an operating system:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Compress history&lt;/li&gt;
&lt;li&gt;Summarize&lt;/li&gt;
&lt;li&gt;Clear low-priority information&lt;/li&gt;
&lt;li&gt;Transfer long-term data&lt;/li&gt;
&lt;li&gt;Dynamically load needed data&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Therefore:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;The Context Window is essentially the AI&amp;rsquo;s memory capacity.&lt;/p&gt;

 &lt;/blockquote&gt;
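&lt;p&gt;That OS-style management can be sketched as a trimming pass: when history exceeds the budget, the oldest turns are folded into a one-line summary rather than silently dropped (the summary string is a placeholder for a real summarization call):&lt;/p&gt;

```python
# Sketch of memory management for a bounded context window: keep the
# newest turns, compress everything older into a summary line.

def compact(history: list[str], budget: int) -> list[str]:
    """Fit history into `budget` slots (assumes budget >= 2)."""
    if len(history) <= budget:
        return history
    evicted = history[: len(history) - (budget - 1)]
    kept = history[-(budget - 1):]
    summary = f"[summary of {len(evicted)} earlier turns]"  # stand-in for real summarization
    return [summary] + kept

history = [f"turn {i}" for i in range(10)]
window = compact(history, budget=4)
```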
&lt;p&gt;&lt;img alt="Context Memory" class="gallery-image" data-flex-basis="262px" data-flex-grow="109" height="365" loading="lazy" sizes="(max-width: 767px) calc(100vw - 30px), (max-width: 1023px) 700px, (max-width: 1279px) 950px, 1232px" src="https://pub-deacd49348914a49b1254b01f351ef0d.r2.dev/2026/05/llm-agent-architecture-a-new-kind-of-personal-computer/en/02_context_memory.png" width="399"&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h1 id="part-3-markdown-files--the-agents-hard-drive"&gt;Part 3: Markdown Files — The Agent&amp;rsquo;s Hard Drive
&lt;/h1&gt;&lt;p&gt;Long-term data shouldn&amp;rsquo;t stay in the context window.&lt;/p&gt;
&lt;p&gt;Otherwise:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Costs keep rising&lt;/li&gt;
&lt;li&gt;Inference slows down&lt;/li&gt;
&lt;li&gt;The context balloons rapidly&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Therefore:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;Long-term memory should live in a file system.&lt;/p&gt;

 &lt;/blockquote&gt;
&lt;p&gt;And one very natural form is Markdown files.&lt;/p&gt;
&lt;p&gt;For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Notes&lt;/li&gt;
&lt;li&gt;Project materials&lt;/li&gt;
&lt;li&gt;Journals&lt;/li&gt;
&lt;li&gt;World-building&lt;/li&gt;
&lt;li&gt;User profiles&lt;/li&gt;
&lt;li&gt;Writing material&lt;/li&gt;
&lt;li&gt;Long-term knowledge bases&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;All of these can be stored as Markdown.&lt;/p&gt;
&lt;p&gt;This means:&lt;/p&gt;
&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Traditional Computer&lt;/th&gt;
 &lt;th&gt;AI Agent&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;Hard Drive&lt;/td&gt;
 &lt;td&gt;Markdown File System&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Markdown has one enormous advantage:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;It can be read directly by humans and AI alike.&lt;/p&gt;

 &lt;/blockquote&gt;
&lt;p&gt;Therefore:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Humans can edit it&lt;/li&gt;
&lt;li&gt;AI can process it&lt;/li&gt;
&lt;li&gt;Git can version-control it&lt;/li&gt;
&lt;li&gt;Files can sync&lt;/li&gt;
&lt;li&gt;It persists even without AI&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This creates something like:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;&amp;ldquo;A shared knowledge space between humans and AI.&amp;rdquo;&lt;/p&gt;

 &lt;/blockquote&gt;
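&lt;p&gt;A minimal sketch of that shared space: long-term notes live as plain Markdown files that the agent writes and loads back on demand, and that a human (or Git) can touch at any time. File names here are examples:&lt;/p&gt;

```python
# Sketch of Markdown as the agent's "hard drive": long-term memory as
# plain files readable by both the agent and the human.

from pathlib import Path
import tempfile

def save_note(root: Path, name: str, body: str) -> Path:
    path = root / f"{name}.md"
    path.write_text(f"# {name}\n\n{body}\n", encoding="utf-8")
    return path

def load_notes(root: Path) -> dict[str, str]:
    return {p.stem: p.read_text(encoding="utf-8") for p in root.glob("*.md")}

root = Path(tempfile.mkdtemp())
save_note(root, "user-profile", "Prefers Markdown. Learning AI Agents.")
notes = load_notes(root)  # what the agent loads back into context on demand
```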
&lt;p&gt;&lt;img alt="Markdown Storage" class="gallery-image" data-flex-basis="298px" data-flex-grow="124" height="353" loading="lazy" sizes="(max-width: 767px) calc(100vw - 30px), (max-width: 1023px) 700px, (max-width: 1279px) 950px, 1232px" src="https://pub-deacd49348914a49b1254b01f351ef0d.r2.dev/2026/05/llm-agent-architecture-a-new-kind-of-personal-computer/en/03_markdown_storage.png" width="439"&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h1 id="part-4-skills--software-installed-on-ai"&gt;Part 4: Skills — Software Installed on AI
&lt;/h1&gt;&lt;p&gt;Future AI agents won&amp;rsquo;t only have &amp;ldquo;knowledge.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;They&amp;rsquo;ll also have &amp;ldquo;skills.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Writing Skill&lt;/li&gt;
&lt;li&gt;Programming Skill&lt;/li&gt;
&lt;li&gt;Video Editing Skill&lt;/li&gt;
&lt;li&gt;Data Analysis Skill&lt;/li&gt;
&lt;li&gt;Project Management Skill&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These Skills might be composed of:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Prompts&lt;/li&gt;
&lt;li&gt;Workflows&lt;/li&gt;
&lt;li&gt;Python code&lt;/li&gt;
&lt;li&gt;MCP configurations&lt;/li&gt;
&lt;li&gt;Tool invocation rules&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;They are like:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;Software installed on the AI.&lt;/p&gt;

 &lt;/blockquote&gt;
&lt;p&gt;Therefore:&lt;/p&gt;
&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Traditional Computer&lt;/th&gt;
 &lt;th&gt;AI Agent&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;Software / App&lt;/td&gt;
 &lt;td&gt;Skill&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Skills can be:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Installed&lt;/li&gt;
&lt;li&gt;Uninstalled&lt;/li&gt;
&lt;li&gt;Updated&lt;/li&gt;
&lt;li&gt;Shared&lt;/li&gt;
&lt;li&gt;Combined&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In the future there may even be:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Skill Stores&lt;/li&gt;
&lt;li&gt;Skill Marketplaces&lt;/li&gt;
&lt;li&gt;Open-source Skill communities&lt;/li&gt;
&lt;/ul&gt;
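&lt;p&gt;The install/uninstall lifecycle can be sketched as a tiny registry: each Skill bundles its prompt with an executable part, and the agent manages them like apps (the Skill fields and names are illustrative):&lt;/p&gt;

```python
# Sketch of Skills as installable software: a Skill bundles a prompt and
# a callable; the agent installs, invokes, and uninstalls them.

from dataclasses import dataclass
from typing import Callable

@dataclass
class Skill:
    name: str
    prompt: str                    # instructions shipped with the skill
    run: Callable[[str], str]      # executable part (workflow / code)

class Agent:
    def __init__(self) -> None:
        self.skills: dict[str, Skill] = {}
    def install(self, skill: Skill) -> None:
        self.skills[skill.name] = skill
    def uninstall(self, name: str) -> None:
        self.skills.pop(name, None)
    def use(self, name: str, task: str) -> str:
        return self.skills[name].run(task)

agent = Agent()
agent.install(Skill("writing", "Write clearly.", lambda t: f"draft: {t}"))
draft = agent.use("writing", "intro paragraph")
agent.uninstall("writing")
```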
&lt;p&gt;&lt;img alt="Skill Software" class="gallery-image" data-flex-basis="280px" data-flex-grow="116" height="330" loading="lazy" sizes="(max-width: 767px) calc(100vw - 30px), (max-width: 1023px) 700px, (max-width: 1279px) 950px, 1232px" src="https://pub-deacd49348914a49b1254b01f351ef0d.r2.dev/2026/05/llm-agent-architecture-a-new-kind-of-personal-computer/en/04_skill_software.png" width="385"&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h1 id="part-5-inputoutput--more-than-just-text"&gt;Part 5: Input/Output — More Than Just Text
&lt;/h1&gt;&lt;p&gt;One of the biggest misconceptions about traditional chatbots is that people think AI only communicates through text.&lt;/p&gt;
&lt;p&gt;In reality, future AI agents will have a complete multimodal I/O system.&lt;/p&gt;
&lt;h2 id="input"&gt;Input
&lt;/h2&gt;&lt;p&gt;AI can read:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Text&lt;/li&gt;
&lt;li&gt;Voice&lt;/li&gt;
&lt;li&gt;Images&lt;/li&gt;
&lt;li&gt;Video&lt;/li&gt;
&lt;li&gt;Camera feeds&lt;/li&gt;
&lt;li&gt;Files&lt;/li&gt;
&lt;li&gt;Screen content&lt;/li&gt;
&lt;li&gt;Device state&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="output"&gt;Output
&lt;/h2&gt;&lt;p&gt;AI can generate:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Text&lt;/li&gt;
&lt;li&gt;Voice&lt;/li&gt;
&lt;li&gt;Images&lt;/li&gt;
&lt;li&gt;Video&lt;/li&gt;
&lt;li&gt;Automated actions&lt;/li&gt;
&lt;li&gt;Control commands&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Therefore:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;An AI agent is fundamentally a new interaction layer.&lt;/p&gt;

 &lt;/blockquote&gt;
&lt;p&gt;&lt;img alt="Multimodal IO" class="gallery-image" data-flex-basis="303px" data-flex-grow="126" height="330" loading="lazy" sizes="(max-width: 767px) calc(100vw - 30px), (max-width: 1023px) 700px, (max-width: 1279px) 950px, 1232px" src="https://pub-deacd49348914a49b1254b01f351ef0d.r2.dev/2026/05/llm-agent-architecture-a-new-kind-of-personal-computer/en/05_multimodal_io.png" width="417"&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h1 id="the-complete-system-a-von-neumann-style-ai-computer"&gt;The Complete System: A &amp;ldquo;Von Neumann-style&amp;rdquo; AI Computer
&lt;/h1&gt;&lt;p&gt;When you put the whole architecture together:&lt;/p&gt;
&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Traditional Computer&lt;/th&gt;
 &lt;th&gt;AI Agent&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;CPU&lt;/td&gt;
 &lt;td&gt;LLM Engine&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Memory&lt;/td&gt;
 &lt;td&gt;Context&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Hard Drive&lt;/td&gt;
 &lt;td&gt;Markdown File System&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Software&lt;/td&gt;
 &lt;td&gt;Skill&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Input Device&lt;/td&gt;
 &lt;td&gt;Multimodal Input&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Output Device&lt;/td&gt;
 &lt;td&gt;Multimodal Output&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;You&amp;rsquo;ll find:&lt;/p&gt;
&lt;p&gt;It increasingly resembles a real computer.&lt;/p&gt;
&lt;p&gt;Except:&lt;/p&gt;
&lt;p&gt;This computer isn&amp;rsquo;t built around a GUI.&lt;/p&gt;
&lt;p&gt;It&amp;rsquo;s built around:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;&amp;ldquo;Language comprehension and reasoning.&amp;rdquo;&lt;/p&gt;

 &lt;/blockquote&gt;
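&lt;p&gt;Wiring the rows of the table together gives a toy &amp;ldquo;AI computer&amp;rdquo; in a few lines; the engine is a stub function standing in for a real LLM:&lt;/p&gt;

```python
# Putting the table together: a tiny "AI computer" whose parts mirror the
# rows above. The engine is a stub standing in for a real LLM.

class AIComputer:
    def __init__(self, engine):
        self.engine = engine   # CPU        -> LLM engine
        self.context = []      # memory     -> context window
        self.disk = {}         # hard drive -> markdown files
        self.skills = {}       # software   -> skills

    def run(self, user_input: str) -> str:
        prompt = "\n".join(self.context + [user_input])  # load memory
        output = self.engine(prompt)                     # reason
        self.context.append(user_input)                  # update memory
        return output                                    # output (text here)

pc = AIComputer(engine=lambda p: f"reasoned over {len(p.splitlines())} lines")
first = pc.run("hello")
second = pc.run("continue")
```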
&lt;p&gt;&lt;img alt="AI Computer Architecture" class="gallery-image" data-flex-basis="270px" data-flex-grow="112" height="355" loading="lazy" sizes="(max-width: 767px) calc(100vw - 30px), (max-width: 1023px) 700px, (max-width: 1279px) 950px, 1232px" src="https://pub-deacd49348914a49b1254b01f351ef0d.r2.dev/2026/05/llm-agent-architecture-a-new-kind-of-personal-computer/en/06_ai_computer_architecture.png" width="400"&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h1 id="the-operating-system-a-personal-ai-os"&gt;The Operating System: A Personal AI OS
&lt;/h1&gt;&lt;p&gt;In the future, every person&amp;rsquo;s device may host a persistent AI Agent.&lt;/p&gt;
&lt;p&gt;One that:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Understands you&lt;/li&gt;
&lt;li&gt;Remembers you&lt;/li&gt;
&lt;li&gt;Helps you work&lt;/li&gt;
&lt;li&gt;Manages your knowledge&lt;/li&gt;
&lt;li&gt;Schedules your Skills&lt;/li&gt;
&lt;li&gt;Operates your devices&lt;/li&gt;
&lt;li&gt;Grows alongside you over time&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;At that point:&lt;/p&gt;
&lt;p&gt;What we use might no longer just be:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Windows&lt;/li&gt;
&lt;li&gt;macOS&lt;/li&gt;
&lt;li&gt;Android&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;But rather:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;A new kind of personal AI operating system, with LLM at its core.&lt;/p&gt;

 &lt;/blockquote&gt;
&lt;p&gt;And the chat box we use today may only be the earliest prototype of this new era.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Personal AI OS" class="gallery-image" data-flex-basis="265px" data-flex-grow="110" height="366" loading="lazy" sizes="(max-width: 767px) calc(100vw - 30px), (max-width: 1023px) 700px, (max-width: 1279px) 950px, 1232px" src="https://pub-deacd49348914a49b1254b01f351ef0d.r2.dev/2026/05/llm-agent-architecture-a-new-kind-of-personal-computer/en/07_personal_ai_os.png" width="405"&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h1 id="references"&gt;References
&lt;/h1&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Packer, Charles et al.&lt;br&gt;
&lt;strong&gt;MemGPT: Towards LLMs as Operating Systems&lt;/strong&gt;&lt;br&gt;
arXiv:2310.08560&lt;br&gt;
&lt;a class="link" href="https://arxiv.org/abs/2310.08560" target="_blank" rel="noopener"
 &gt;https://arxiv.org/abs/2310.08560&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Ge, Yingqiang et al.&lt;br&gt;
&lt;strong&gt;LLM as OS, Agents as Apps: Envisioning AIOS, Agents and the AIOS-Agent Ecosystem&lt;/strong&gt;&lt;br&gt;
arXiv:2312.03815&lt;br&gt;
&lt;a class="link" href="https://arxiv.org/abs/2312.03815" target="_blank" rel="noopener"
 &gt;https://arxiv.org/abs/2312.03815&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item></channel></rss>