<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Agent on Code Plato</title><link>https://CodePlato3721.github.io/tags/agent/</link><description>Recent content in Agent on Code Plato</description><generator>Hugo -- gohugo.io</generator><language>en</language><lastBuildDate>Tue, 05 May 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://CodePlato3721.github.io/tags/agent/index.xml" rel="self" type="application/rss+xml"/><item><title>LLM-Based AI Agent Architecture: A New Kind of Personal Computer on Your Device</title><link>https://CodePlato3721.github.io/post/llm-agent-architecture-new-kind-of-personal-computer/</link><pubDate>Tue, 05 May 2026 00:00:00 +0000</pubDate><guid>https://CodePlato3721.github.io/post/llm-agent-architecture-new-kind-of-personal-computer/</guid><description>&lt;img src="https://pub-deacd49348914a49b1254b01f351ef0d.r2.dev/2026/05/llm-agent-architecture-a-new-kind-of-personal-computer/en/banner.png" alt="Featured image of post LLM-Based AI Agent Architecture: A New Kind of Personal Computer on Your Device" /&gt;&lt;h1 id="llm-based-ai-agent-architecture-a-new-kind-of-personal-computer-on-your-device"&gt;LLM-Based AI Agent Architecture: A New Kind of Personal Computer on Your Device
&lt;/h1&gt;&lt;p&gt;For a long time, we&amp;rsquo;ve thought of AI as a &amp;ldquo;chatbot.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;But if you step back and look from a systems architecture perspective, you&amp;rsquo;ll find that a truly mature AI agent looks more like a new kind of personal computer — one that lives on your device.&lt;/p&gt;
&lt;p&gt;It has:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A compute core&lt;/li&gt;
&lt;li&gt;Memory&lt;/li&gt;
&lt;li&gt;A file system&lt;/li&gt;
&lt;li&gt;A software system&lt;/li&gt;
&lt;li&gt;Input/output devices&lt;/li&gt;
&lt;li&gt;Long-term storage&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The difference is:&lt;/p&gt;
&lt;p&gt;Its core isn&amp;rsquo;t a traditional CPU. It&amp;rsquo;s an LLM.&lt;/p&gt;
&lt;hr&gt;
&lt;h1 id="part-1-the-llm-engine--a-cpu-without-memory"&gt;Part 1: The LLM Engine — A &amp;ldquo;CPU&amp;rdquo; Without Memory
&lt;/h1&gt;&lt;p&gt;The LLM itself has no long-term memory.&lt;/p&gt;
&lt;p&gt;It&amp;rsquo;s more like an inference engine:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Receives input&lt;/li&gt;
&lt;li&gt;Reads context&lt;/li&gt;
&lt;li&gt;Performs reasoning&lt;/li&gt;
&lt;li&gt;Produces output&lt;/li&gt;
&lt;li&gt;Then &amp;ldquo;forgets&amp;rdquo;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;It cannot natively remember anything from earlier calls. Whatever is missing from the current input does not exist for it.&lt;/p&gt;
&lt;p&gt;Therefore:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;The LLM itself is more like a CPU than a complete agent.&lt;/p&gt;

 &lt;/blockquote&gt;
&lt;p&gt;It only handles computation.&lt;/p&gt;
&lt;p&gt;What makes AI &amp;ldquo;seem like it knows you&amp;rdquo; is the context provided externally.&lt;/p&gt;
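&lt;p&gt;A minimal Python sketch of this idea, under stated assumptions: &lt;code&gt;fake_llm&lt;/code&gt; is a hypothetical stand-in for a real model call, and &lt;code&gt;run_turn&lt;/code&gt; is an illustrative helper, not part of any library. It shows that the engine keeps no state between calls, so anything it appears to know has to be passed in as context every time.&lt;/p&gt;

```python
from typing import Callable


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call: a pure function of its input."""
    if "prefers Markdown" in prompt:
        return "Here is your answer as a Markdown list."
    return "Here is your answer."


def run_turn(engine: Callable[[str], str], context: list[str], user_input: str) -> str:
    """Build the full prompt from external context on every call."""
    prompt = "\n".join(context + [user_input])
    return engine(prompt)


# Same engine, same question; only the externally supplied context differs.
without_context = run_turn(fake_llm, [], "Summarize my notes.")
with_context = run_turn(fake_llm, ["User prefers Markdown"], "Summarize my notes.")
```

&lt;p&gt;The two calls differ only in the context list, which is the point of the CPU analogy: the engine computes, and everything it should remember arrives from outside.&lt;/p&gt;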
&lt;p&gt;&lt;img alt="LLM CPU" class="gallery-image" data-flex-basis="276px" data-flex-grow="115" height="325" loading="lazy" sizes="(max-width: 767px) calc(100vw - 30px), (max-width: 1023px) 700px, (max-width: 1279px) 950px, 1232px" src="https://pub-deacd49348914a49b1254b01f351ef0d.r2.dev/2026/05/llm-agent-architecture-a-new-kind-of-personal-computer/en/01_llm_cpu.png" width="375"&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h1 id="part-2-context--the-ai-agents-memory"&gt;Part 2: Context — The AI Agent&amp;rsquo;s Memory
&lt;/h1&gt;&lt;p&gt;If the LLM is the CPU,&lt;br&gt;
then Context is the AI&amp;rsquo;s memory.&lt;/p&gt;
&lt;p&gt;And this memory should be split into two layers.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="1-global-context"&gt;1. Global Context
&lt;/h2&gt;&lt;p&gt;This layer belongs to the entire agent.&lt;/p&gt;
&lt;p&gt;It records:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;User preferences&lt;/li&gt;
&lt;li&gt;Long-term goals&lt;/li&gt;
&lt;li&gt;Habitual behaviors&lt;/li&gt;
&lt;li&gt;Persona settings&lt;/li&gt;
&lt;li&gt;Persistent rules&lt;/li&gt;
&lt;li&gt;Historical knowledge&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&amp;ldquo;User prefers Markdown&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&amp;ldquo;User is learning AI Agents&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&amp;ldquo;User habitually writes in Chinese&amp;rdquo;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This information shapes agent behavior over time.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="2-session-context"&gt;2. Session Context
&lt;/h2&gt;&lt;p&gt;This layer belongs only to the current conversation.&lt;/p&gt;
&lt;p&gt;For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The current topic under discussion&lt;/li&gt;
&lt;li&gt;The current article structure&lt;/li&gt;
&lt;li&gt;The most recent rounds of dialogue&lt;/li&gt;
&lt;li&gt;Temporary reasoning results&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It behaves like a program&amp;rsquo;s working memory: it exists while the conversation runs and is discarded afterward.&lt;/p&gt;
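&lt;p&gt;The two layers could be modeled roughly as follows. This is only a sketch: the class names &lt;code&gt;GlobalContext&lt;/code&gt; and &lt;code&gt;SessionContext&lt;/code&gt;, the &lt;code&gt;build_prompt&lt;/code&gt; helper, and the bracketed section labels are all assumptions made for illustration.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class GlobalContext:
    """Long-lived facts that apply to every conversation."""
    facts: list[str] = field(default_factory=list)


@dataclass
class SessionContext:
    """Short-lived state that is discarded when the conversation ends."""
    turns: list[str] = field(default_factory=list)


def build_prompt(global_ctx: GlobalContext, session: SessionContext, user_input: str) -> str:
    """Place global facts first, then the current session, then the new input."""
    lines = ["[global]"] + global_ctx.facts
    lines += ["[session]"] + session.turns
    lines += ["[input]", user_input]
    return "\n".join(lines)


global_ctx = GlobalContext(facts=["User prefers Markdown"])
first = SessionContext(turns=["Topic: agent architecture"])
second = SessionContext()  # a new conversation starts with no session state

prompt_a = build_prompt(global_ctx, first, "Draft an outline.")
prompt_b = build_prompt(global_ctx, second, "Translate this paragraph.")
```

&lt;p&gt;Both prompts carry the global fact, while the topic from the first session does not leak into the second.&lt;/p&gt;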
&lt;hr&gt;
&lt;h2 id="the-context-window-is-essentially-a-memory-limit"&gt;The Context Window Is Essentially a &amp;ldquo;Memory Limit&amp;rdquo;
&lt;/h2&gt;&lt;p&gt;An LLM&amp;rsquo;s Context Window isn&amp;rsquo;t unlimited.&lt;/p&gt;
&lt;p&gt;This means:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;History can&amp;rsquo;t accumulate indefinitely&lt;/li&gt;
&lt;li&gt;Every token kept in the window adds cost to each later call&lt;/li&gt;
&lt;li&gt;Past the limit, content must be compressed&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Therefore:&lt;/p&gt;
&lt;p&gt;An agent must manage memory like an operating system:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Compress history&lt;/li&gt;
&lt;li&gt;Summarize&lt;/li&gt;
&lt;li&gt;Clear low-priority information&lt;/li&gt;
&lt;li&gt;Transfer long-term data&lt;/li&gt;
&lt;li&gt;Dynamically load needed data&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In other words:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;The Context Window is essentially the AI&amp;rsquo;s memory capacity.&lt;/p&gt;

 &lt;/blockquote&gt;
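&lt;p&gt;One way to picture the compress-and-summarize steps listed above is a small compaction routine. The sketch below makes loud assumptions: &lt;code&gt;count_tokens&lt;/code&gt; counts whitespace-separated words instead of real tokens, and &lt;code&gt;summarize&lt;/code&gt; is a hypothetical placeholder for a call that would ask the LLM to write a summary.&lt;/p&gt;

```python
def count_tokens(text: str) -> int:
    """Crude assumption: one token per whitespace-separated word."""
    return len(text.split())


def summarize(messages: list[str]) -> str:
    """Hypothetical summarizer; a real agent would ask the LLM to do this."""
    return f"[summary of {len(messages)} earlier messages]"


def compact(history: list[str], budget: int) -> list[str]:
    """Keep the newest messages that fit the budget; summarize the rest.

    The summary line itself is not counted against the budget in this sketch.
    """
    kept: list[str] = []
    used = 0
    for message in reversed(history):
        cost = count_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()
    dropped = history[: len(history) - len(kept)]
    return ([summarize(dropped)] if dropped else []) + kept


history = ["one two three", "four five", "six seven eight nine", "ten"]
compacted = compact(history, budget=5)
```

&lt;p&gt;With a budget of five words, the two newest messages survive intact and the two oldest collapse into a single summary line. A fuller design would also move the dropped messages to long-term storage instead of discarding them.&lt;/p&gt;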
&lt;p&gt;&lt;img alt="Context Memory" class="gallery-image" data-flex-basis="262px" data-flex-grow="109" height="365" loading="lazy" sizes="(max-width: 767px) calc(100vw - 30px), (max-width: 1023px) 700px, (max-width: 1279px) 950px, 1232px" src="https://pub-deacd49348914a49b1254b01f351ef0d.r2.dev/2026/05/llm-agent-architecture-a-new-kind-of-personal-computer/en/02_context_memory.png" width="399"&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h1 id="part-3-markdown-files--the-agents-hard-drive"&gt;Part 3: Markdown Files — The Agent&amp;rsquo;s Hard Drive
&lt;/h1&gt;&lt;p&gt;Long-term data shouldn&amp;rsquo;t stay in the context window.&lt;/p&gt;
&lt;p&gt;Otherwise:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Costs keep rising&lt;/li&gt;
&lt;li&gt;Inference slows down&lt;/li&gt;
&lt;li&gt;The context balloons rapidly&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Therefore:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;Long-term memory should live in a file system.&lt;/p&gt;

 &lt;/blockquote&gt;
&lt;p&gt;And one very natural form is Markdown files.&lt;/p&gt;
&lt;p&gt;For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Notes&lt;/li&gt;
&lt;li&gt;Project materials&lt;/li&gt;
&lt;li&gt;Journals&lt;/li&gt;
&lt;li&gt;World-building&lt;/li&gt;
&lt;li&gt;User profiles&lt;/li&gt;
&lt;li&gt;Writing material&lt;/li&gt;
&lt;li&gt;Long-term knowledge bases&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;All of these can be stored as Markdown.&lt;/p&gt;
&lt;p&gt;This means:&lt;/p&gt;
&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Traditional Computer&lt;/th&gt;
 &lt;th&gt;AI Agent&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;Hard Drive&lt;/td&gt;
 &lt;td&gt;Markdown File System&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Markdown has one enormous advantage:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;It can be read directly by both humans and AI.&lt;/p&gt;

 &lt;/blockquote&gt;
&lt;p&gt;Therefore:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Humans can edit it&lt;/li&gt;
&lt;li&gt;AI can process it&lt;/li&gt;
&lt;li&gt;Git can version-control it&lt;/li&gt;
&lt;li&gt;Files can sync&lt;/li&gt;
&lt;li&gt;It persists even without AI&lt;/li&gt;
&lt;/ul&gt;
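&lt;p&gt;As a rough illustration of the points above, the following sketch stores a user profile as a Markdown file, lets a "human" append a line, and reads it back. The file name, the heading-plus-bullets layout, and the helpers &lt;code&gt;save_note&lt;/code&gt; and &lt;code&gt;load_items&lt;/code&gt; are assumptions chosen for the example; only the Python standard library is used.&lt;/p&gt;

```python
import tempfile
from pathlib import Path


def save_note(root: Path, name: str, title: str, items: list[str]) -> Path:
    """Write one memory file as plain Markdown: a heading plus a bullet list."""
    path = root / f"{name}.md"
    body = "\n".join(f"- {item}" for item in items)
    path.write_text(f"# {title}\n\n{body}\n", encoding="utf-8")
    return path


def load_items(path: Path) -> list[str]:
    """Read the bullet items back; a human may have edited the file meanwhile."""
    lines = path.read_text(encoding="utf-8").splitlines()
    return [line[2:] for line in lines if line.startswith("- ")]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    profile = save_note(root, "user_profile", "User Profile",
                        ["Prefers Markdown", "Is learning AI Agents"])
    # Simulate a human appending a line in a text editor.
    with profile.open("a", encoding="utf-8") as handle:
        handle.write("- Habitually writes in Chinese\n")
    items = load_items(profile)
```

&lt;p&gt;Because the file is ordinary text, the same directory could be placed under Git or a sync tool without any agent-specific tooling.&lt;/p&gt;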
&lt;p&gt;This creates something like:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;&amp;ldquo;A shared knowledge space between humans and AI.&amp;rdquo;&lt;/p&gt;

 &lt;/blockquote&gt;
&lt;p&gt;&lt;img alt="Markdown Storage" class="gallery-image" data-flex-basis="298px" data-flex-grow="124" height="353" loading="lazy" sizes="(max-width: 767px) calc(100vw - 30px), (max-width: 1023px) 700px, (max-width: 1279px) 950px, 1232px" src="https://pub-deacd49348914a49b1254b01f351ef0d.r2.dev/2026/05/llm-agent-architecture-a-new-kind-of-personal-computer/en/03_markdown_storage.png" width="439"&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h1 id="part-4-skills--software-installed-on-ai"&gt;Part 4: Skills — Software Installed on AI
&lt;/h1&gt;&lt;p&gt;Future AI agents won&amp;rsquo;t only have &amp;ldquo;knowledge.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;They&amp;rsquo;ll also have &amp;ldquo;skills.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Writing Skill&lt;/li&gt;
&lt;li&gt;Programming Skill&lt;/li&gt;
&lt;li&gt;Video Editing Skill&lt;/li&gt;
&lt;li&gt;Data Analysis Skill&lt;/li&gt;
&lt;li&gt;Project Management Skill&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These Skills might be composed of:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Prompts&lt;/li&gt;
&lt;li&gt;Workflows&lt;/li&gt;
&lt;li&gt;Python code&lt;/li&gt;
&lt;li&gt;MCP configurations&lt;/li&gt;
&lt;li&gt;Tool invocation rules&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;They are like:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;Software installed on the AI.&lt;/p&gt;

 &lt;/blockquote&gt;
&lt;p&gt;Therefore:&lt;/p&gt;
&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Traditional Computer&lt;/th&gt;
 &lt;th&gt;AI Agent&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;Software / App&lt;/td&gt;
 &lt;td&gt;Skill&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Skills can be:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Installed&lt;/li&gt;
&lt;li&gt;Uninstalled&lt;/li&gt;
&lt;li&gt;Updated&lt;/li&gt;
&lt;li&gt;Shared&lt;/li&gt;
&lt;li&gt;Combined&lt;/li&gt;
&lt;/ul&gt;
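&lt;p&gt;The install, uninstall, and update operations can be sketched as a small registry. Everything here is hypothetical: the &lt;code&gt;Skill&lt;/code&gt; fields, the &lt;code&gt;SkillRegistry&lt;/code&gt; class, and the lambda tools are placeholders, not the format of any existing skill system.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Skill:
    """Hypothetical skill bundle: a name, a version, a prompt, and a tool."""
    name: str
    version: str
    prompt: str
    run: Callable[[str], str]


class SkillRegistry:
    """Tracks which skills are currently installed on the agent."""

    def __init__(self) -> None:
        self._skills: dict[str, Skill] = {}

    def install(self, skill: Skill) -> None:
        # Installing an existing name replaces it, which doubles as an update.
        self._skills[skill.name] = skill

    def uninstall(self, name: str) -> None:
        self._skills.pop(name, None)

    def installed(self) -> list[str]:
        return sorted(self._skills)

    def run(self, name: str, task: str) -> str:
        return self._skills[name].run(task)


registry = SkillRegistry()
registry.install(Skill("writing", "1.0", "You are an editor.", lambda t: f"draft: {t}"))
registry.install(Skill("analysis", "1.0", "You are an analyst.", lambda t: f"report: {t}"))
registry.install(Skill("writing", "1.1", "You are an editor.", lambda t: f"draft v1.1: {t}"))
registry.uninstall("analysis")
result = registry.run("writing", "blog post")
```

&lt;p&gt;After these calls only the updated writing skill remains installed. Sharing a skill would then amount to distributing the bundle itself.&lt;/p&gt;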
&lt;p&gt;In the future there may even be:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Skill Stores&lt;/li&gt;
&lt;li&gt;Skill Marketplaces&lt;/li&gt;
&lt;li&gt;Open-source Skill communities&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img alt="Skill Software" class="gallery-image" data-flex-basis="280px" data-flex-grow="116" height="330" loading="lazy" sizes="(max-width: 767px) calc(100vw - 30px), (max-width: 1023px) 700px, (max-width: 1279px) 950px, 1232px" src="https://pub-deacd49348914a49b1254b01f351ef0d.r2.dev/2026/05/llm-agent-architecture-a-new-kind-of-personal-computer/en/04_skill_software.png" width="385"&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h1 id="part-5-inputoutput--more-than-just-text"&gt;Part 5: Input/Output — More Than Just Text
&lt;/h1&gt;&lt;p&gt;One of the biggest misconceptions inherited from traditional chatbots is that AI communicates only through text.&lt;/p&gt;
&lt;p&gt;In reality, future AI agents will have a complete multimodal I/O system.&lt;/p&gt;
&lt;h2 id="input"&gt;Input
&lt;/h2&gt;&lt;p&gt;AI can read:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Text&lt;/li&gt;
&lt;li&gt;Voice&lt;/li&gt;
&lt;li&gt;Images&lt;/li&gt;
&lt;li&gt;Video&lt;/li&gt;
&lt;li&gt;Camera feeds&lt;/li&gt;
&lt;li&gt;Files&lt;/li&gt;
&lt;li&gt;Screen content&lt;/li&gt;
&lt;li&gt;Device state&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="output"&gt;Output
&lt;/h2&gt;&lt;p&gt;AI can generate:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Text&lt;/li&gt;
&lt;li&gt;Voice&lt;/li&gt;
&lt;li&gt;Images&lt;/li&gt;
&lt;li&gt;Video&lt;/li&gt;
&lt;li&gt;Automated actions&lt;/li&gt;
&lt;li&gt;Control commands&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Therefore:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;An AI agent is fundamentally a new interaction layer.&lt;/p&gt;

 &lt;/blockquote&gt;
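&lt;p&gt;A sketch of what such an interaction layer might look like in code: every input, whatever its modality, is wrapped in a common event type and routed to an output action. The names &lt;code&gt;Modality&lt;/code&gt;, &lt;code&gt;InputEvent&lt;/code&gt;, &lt;code&gt;Action&lt;/code&gt;, and &lt;code&gt;handle&lt;/code&gt; are assumptions, and the routing rules are placeholders for real model calls.&lt;/p&gt;

```python
from dataclasses import dataclass
from enum import Enum


class Modality(Enum):
    TEXT = "text"
    VOICE = "voice"
    IMAGE = "image"


@dataclass
class InputEvent:
    modality: Modality
    payload: bytes


@dataclass
class Action:
    kind: str      # for example "show", "speak", or "command"
    content: str


def handle(event: InputEvent) -> Action:
    """Route each input modality to an output action; the rules are placeholders."""
    if event.modality is Modality.TEXT:
        return Action("show", event.payload.decode("utf-8"))
    if event.modality is Modality.VOICE:
        return Action("speak", f"heard {len(event.payload)} bytes of audio")
    return Action("command", f"inspect {event.modality.value}")


text_action = handle(InputEvent(Modality.TEXT, b"hello"))
voice_action = handle(InputEvent(Modality.VOICE, b"\x00\x01\x02"))
image_action = handle(InputEvent(Modality.IMAGE, b""))
```

&lt;p&gt;Adding a new input device then means adding a modality and a routing rule, while the rest of the agent stays unchanged.&lt;/p&gt;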
&lt;p&gt;&lt;img alt="Multimodal IO" class="gallery-image" data-flex-basis="303px" data-flex-grow="126" height="330" loading="lazy" sizes="(max-width: 767px) calc(100vw - 30px), (max-width: 1023px) 700px, (max-width: 1279px) 950px, 1232px" src="https://pub-deacd49348914a49b1254b01f351ef0d.r2.dev/2026/05/llm-agent-architecture-a-new-kind-of-personal-computer/en/05_multimodal_io.png" width="417"&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h1 id="the-complete-system-a-von-neumann-style-ai-computer"&gt;The Complete System: A &amp;ldquo;Von Neumann-style&amp;rdquo; AI Computer
&lt;/h1&gt;&lt;p&gt;When you put the whole architecture together:&lt;/p&gt;
&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Traditional Computer&lt;/th&gt;
 &lt;th&gt;AI Agent&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;CPU&lt;/td&gt;
 &lt;td&gt;LLM Engine&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Memory&lt;/td&gt;
 &lt;td&gt;Context&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Hard Drive&lt;/td&gt;
 &lt;td&gt;Markdown File System&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Software&lt;/td&gt;
 &lt;td&gt;Skill&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Input Device&lt;/td&gt;
 &lt;td&gt;Multimodal Input&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Output Device&lt;/td&gt;
 &lt;td&gt;Multimodal Output&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;You&amp;rsquo;ll find:&lt;/p&gt;
&lt;p&gt;It increasingly resembles a real computer.&lt;/p&gt;
&lt;p&gt;Except:&lt;/p&gt;
&lt;p&gt;This computer isn&amp;rsquo;t built around a GUI.&lt;/p&gt;
&lt;p&gt;It&amp;rsquo;s built around:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;&amp;ldquo;Language comprehension and reasoning.&amp;rdquo;&lt;/p&gt;

 &lt;/blockquote&gt;
&lt;p&gt;&lt;img alt="AI Computer Architecture" class="gallery-image" data-flex-basis="270px" data-flex-grow="112" height="355" loading="lazy" sizes="(max-width: 767px) calc(100vw - 30px), (max-width: 1023px) 700px, (max-width: 1279px) 950px, 1232px" src="https://pub-deacd49348914a49b1254b01f351ef0d.r2.dev/2026/05/llm-agent-architecture-a-new-kind-of-personal-computer/en/06_ai_computer_architecture.png" width="400"&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h1 id="the-operating-system-a-personal-ai-os"&gt;The Operating System: A Personal AI OS
&lt;/h1&gt;&lt;p&gt;In the future, every person&amp;rsquo;s device may host a persistent AI Agent.&lt;/p&gt;
&lt;p&gt;One that:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Understands you&lt;/li&gt;
&lt;li&gt;Remembers you&lt;/li&gt;
&lt;li&gt;Helps you work&lt;/li&gt;
&lt;li&gt;Manages your knowledge&lt;/li&gt;
&lt;li&gt;Schedules your Skills&lt;/li&gt;
&lt;li&gt;Operates your devices&lt;/li&gt;
&lt;li&gt;Grows alongside you over time&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;At that point:&lt;/p&gt;
&lt;p&gt;What we use might no longer just be:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Windows&lt;/li&gt;
&lt;li&gt;macOS&lt;/li&gt;
&lt;li&gt;Android&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;But rather:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;A new kind of personal AI operating system, with an LLM at its core.&lt;/p&gt;

 &lt;/blockquote&gt;
&lt;p&gt;And the chat box we use today may only be the earliest prototype of this new era.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Personal AI OS" class="gallery-image" data-flex-basis="265px" data-flex-grow="110" height="366" loading="lazy" sizes="(max-width: 767px) calc(100vw - 30px), (max-width: 1023px) 700px, (max-width: 1279px) 950px, 1232px" src="https://pub-deacd49348914a49b1254b01f351ef0d.r2.dev/2026/05/llm-agent-architecture-a-new-kind-of-personal-computer/en/07_personal_ai_os.png" width="405"&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h1 id="references"&gt;References
&lt;/h1&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Packer, Charles et al.&lt;br&gt;
&lt;strong&gt;MemGPT: Towards LLMs as Operating Systems&lt;/strong&gt;&lt;br&gt;
arXiv:2310.08560&lt;br&gt;
&lt;a class="link" href="https://arxiv.org/abs/2310.08560" target="_blank" rel="noopener"
 &gt;https://arxiv.org/abs/2310.08560&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Ge, Yingqiang et al.&lt;br&gt;
&lt;strong&gt;LLM as OS, Agents as Apps: Envisioning AIOS, Agents and the AIOS-Agent Ecosystem&lt;/strong&gt;&lt;br&gt;
arXiv:2312.03815&lt;br&gt;
&lt;a class="link" href="https://arxiv.org/abs/2312.03815" target="_blank" rel="noopener"
 &gt;https://arxiv.org/abs/2312.03815&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item></channel></rss>