Large Action Models, LAMs
What is a large action model (LAM)?
A LAM performs actions through agents, based on instructions given by the user.
It can understand and execute complex tasks by translating human intentions into actions.
→ A LAM is responsible for understanding the user and carrying out actions.
They are developed to enhance automation by allowing systems to take autonomous actions based on more generalized instructions, potentially improving efficiency in fields like robotics, enterprise solutions, and logistics.
→ Letting systems act autonomously on generalized instructions increases automation and improves efficiency within a field.
Large Action Model Agent:
Actions depend on the current state of the environment and on the given conditions or knowledge -> the agent adapts to changes and interacts according to the environment and specific conditions or knowledge
- Perception: A LAM receives input as voice, text, or visuals, accompanied by a task request -> receives input
- Brain: This could be the neuro-symbolic AI of the Large Action Model, which includes capabilities to plan, reason, memorize, and learn or retrieve knowledge -> the ability to plan, reason, remember, and learn or retrieve knowledge
- Agent: This is how the large action model takes action, as a user interface or a device. It analyzes the given input task using the brain and then takes action -> the way actions are executed
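
The three components above (Perception, Brain, Agent) can be sketched as a small perceive-plan-act loop. This is only an illustrative toy, not how any real LAM is implemented: the class names `Observation`, `Brain`, and `Agent`, the `logged_in` state flag, and the rule-based planner are all assumptions made up for this example. What it tries to show is that the same request leads to different actions depending on the current state of the environment.

```python
from dataclasses import dataclass, field


@dataclass
class Observation:
    """Perception: the input plus the task request that accompanies it."""
    modality: str  # "text", "voice", or "visual"
    content: str
    task: str


@dataclass
class Brain:
    """Plans steps for a task and remembers the tasks it has seen."""
    memory: list = field(default_factory=list)

    def plan(self, obs: Observation, state: dict) -> list:
        # Hypothetical rule-based planner; a real LAM would use a learned model.
        steps = []
        if not state.get("logged_in", False):
            steps.append("log_in")  # precondition depends on environment state
        steps.append(f"do:{obs.task}")
        self.memory.append(obs.task)
        return steps


class Agent:
    """Executes the steps planned by the brain against the environment."""

    def __init__(self, brain: Brain):
        self.brain = brain

    def act(self, obs: Observation, state: dict) -> list:
        executed = []
        for step in self.brain.plan(obs, state):
            if step == "log_in":
                state["logged_in"] = True  # the action changes the environment
            executed.append(step)
        return executed


state = {"logged_in": False}
agent = Agent(Brain())
obs = Observation(modality="text", content="Book a table for two", task="book_table")

first = agent.act(obs, state)   # environment not ready: a login step is planned first
second = agent.act(obs, state)  # same request, new state: the login step is skipped
print(first)
print(second)
```

The second call produces a shorter plan because the first call already changed the environment, which is the "actions depend on the current state" idea in miniature.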

Notice!
LAMs do not change the fact that decision-makers need to tread carefully, cutting through promising descriptions and focusing on facts.
-> Even with LAMs, decision-makers still need to be cautious.
An LLM agent consists of several important elements:
- Prompts
- Memory
- Knowledge
- Planning
- Tools
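
One way to picture how the five elements listed above fit together is a single container object. The sketch below is an assumption-laden toy: `LLMAgent`, its field names, and the keyword-matching `plan` method are hypothetical and stand in for what would be a language model call in practice.

```python
from dataclasses import dataclass, field


@dataclass
class LLMAgent:
    prompt: str                                    # Prompts: instructions for the model
    memory: list = field(default_factory=list)     # Memory: past requests
    knowledge: dict = field(default_factory=dict)  # Knowledge: facts to look up (unused by this toy planner)
    tools: dict = field(default_factory=dict)      # Tools: name -> callable

    def plan(self, request: str) -> list:
        # Planning (hypothetical): select every tool whose name appears in the request.
        return [name for name in self.tools if name in request]

    def run(self, request: str) -> list:
        # Call each planned tool on the request, then record the request in memory.
        results = [self.tools[name](request) for name in self.plan(request)]
        self.memory.append(request)
        return results


agent = LLMAgent(
    prompt="You are a helpful assistant.",
    knowledge={"office_hours": "9-17"},
    tools={"count_words": lambda text: len(text.split())},
)
out = agent.run("count_words in this sentence")
print(out)
```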
While still in development, LAMs aim to surpass current AI capabilities, making them particularly significant for enterprise applications.
→ LAMs are still under development; the goal is to go beyond the text-only needs that current AI serves.
| Feature | Large Language Models (LLMs) | Large Action Models (LAMs) |
|---|---|---|
| What it can do | Language generation | Task execution and completion |
| Input | Textual data | Text, images, instructions, etc. |
| Output | Textual data | Actions, Text |
| Training Data | Large text corpora | Text, code, images, actions |
| Application Areas | Content creation, translation, chatbots | Automation, decision-making, complex interactions |
| Strengths | Language understanding, text generation | Reasoning, planning, decision-making, real-time interaction |
| Weaknesses | Limited reasoning, lack of action capabilities | Still under development, ethical concerns |
Reference