
ChatGPT 5: Are we Closer to AGI?


Introduction

The release of ChatGPT 5 marks a watershed moment in the evolution of large language models. With over 700 million weekly users and integration into products like Microsoft Copilot, GPT-5 has been touted as “a significant step” toward artificial general intelligence (AGI) (Milmo, 2025). Yet debate persists over whether its enhancements represent genuine strides toward a system capable of human-level reasoning across any domain, or merely incremental advances on narrow tasks. This post examines the journey from early GPT iterations to GPT-5, considers how AGI is defined, and explores how specialized AI hardware, led by startups such as Etched with its Sohu ASIC, could accelerate or constrain progress toward that elusive goal.


The Evolution of GPT Models

Since the original GPT launch in 2018, OpenAI’s models have grown in scale and capability. GPT-1 demonstrated unsupervised pretraining on a general text corpus, GPT-2 expanded to 1.5 billion parameters, and GPT-3 exploded to 175 billion parameters, showcasing zero-shot and few-shot learning abilities. GPT-3.5 refined chat interactions, and GPT-4 introduced multimodal inputs. GPT-4o and GPT-4.5 added “chain-of-thought” reasoning, while GPT-5 unifies these lines into a single model that claims to integrate reasoning, “vibe coding,” and agentic functions without requiring manual mode selection (Zeff, 2025).

Defining Artificial General Intelligence

AGI refers to a system that can understand, learn, and apply knowledge across any intellectual task that a human can perform. Key attributes include autonomous continuous learning, broad domain transfer, and goal-driven reasoning. OpenAI’s own definition frames AGI as “a highly autonomous system that outperforms humans at most economically valuable work” (Milmo, 2025). Critics emphasize continuous self-improvement and real-world adaptability—traits still missing from GPT-5, which requires retraining to acquire new skills rather than online learning (Griffiths & Varanasi, 2025).

Capabilities and Limitations of ChatGPT 5

Reasoning and Multimodality
GPT-5 demonstrates improved chain-of-thought reasoning, surpassing GPT-4 on benchmarks spanning mathematics, logic puzzles, and abstraction. It processes text, voice, and images in a unified pipeline, enabling applications like on-the-fly document analysis and voice-guided tutoring (Strickland, 2025).

Vibe Coding
A standout feature, “vibe coding,” allows users to describe desired software in natural language and receive complete, compilable code within seconds. On the SWE-bench coding benchmark, GPT-5 achieved a 74.9% first-attempt success rate, edging out Anthropic’s Claude Opus 4.1 (74.5%) and Google DeepMind’s Gemini 2.5 Pro (59.6%) (Zeff, 2025).
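For readers who want to see the workflow rather than the benchmark numbers, here is a minimal sketch of a “vibe coding” request using the OpenAI Python SDK. The model identifier and prompt are placeholders for illustration; this is not OpenAI's documented GPT-5 interface.

```python
# Minimal "vibe coding" sketch with the OpenAI Python SDK (v1.x).
# Assumptions: OPENAI_API_KEY is set, and "gpt-5" is a placeholder model name;
# substitute whatever identifier your account exposes.
from openai import OpenAI

client = OpenAI()

spec = (
    "Write a Python function dedupe(items) that removes duplicates from a list "
    "while preserving order. Include type hints and a short docstring."
)

response = client.chat.completions.create(
    model="gpt-5",  # placeholder identifier
    messages=[
        {"role": "system", "content": "Return only runnable Python code."},
        {"role": "user", "content": spec},
    ],
)

print(response.choices[0].message.content)
```

The entire interaction is a natural-language specification in, source code out, which is why SWE-bench-style first-attempt success rates have become the headline metric.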

Agentic Tasks
GPT-5 autonomously selects and orchestrates external tools—calendars, email, or APIs—to fulfill complex requests. This “agentic AI” paradigm signals movement beyond static chat, illustrating a new class of assistants capable of executing multi-step workflows (Zeff, 2025).
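Conceptually, the agentic pattern is a loop: the model picks a tool, the application executes it, and the observation is fed back until the request is satisfied. The sketch below stubs the model’s decisions with a hard-coded plan and toy tools, so the tool names and behavior are illustrative assumptions rather than GPT-5’s actual tool interface.

```python
# Minimal sketch of an agentic tool-selection loop. The "model" here is stubbed
# with a hard-coded plan; in a real system each step would come from an LLM
# tool-calling response. Tool names and behavior are illustrative only.
from datetime import date

def check_calendar(day: str) -> str:
    """Toy tool: pretend to look up free slots for a given day."""
    return f"{day}: 10:00-11:00 and 15:00-16:00 are free."

def send_email(to: str, body: str) -> str:
    """Toy tool: pretend to send an email and report success."""
    return f"Email sent to {to}: {body!r}"

TOOLS = {"check_calendar": check_calendar, "send_email": send_email}

# A hard-coded stand-in for the model's step-by-step tool choices.
plan = [
    ("check_calendar", {"day": date.today().isoformat()}),
    ("send_email", {"to": "alice@example.com", "body": "Can we meet at 10:00?"}),
]

observations = []
for tool_name, args in plan:
    result = TOOLS[tool_name](**args)   # execute the chosen tool
    observations.append(result)         # feed the observation back to the model
    print(result)
```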

Limitations
Despite these advances, GPT-5 is not yet AGI. It lacks continuous learning in deployment, requiring offline retraining for new knowledge. Hallucination rates, though reduced to 1.6% on the HealthBench Hard Hallucinations test, still impede reliability in high-stakes domains (Zeff, 2025). Ethical and safety guardrails have improved via “safe completions,” but adversarial jailbreaks remain a concern (Strickland, 2025).

According to Matt O’Brien of AP News (O’Brien, 2025), GPT-5 resets OpenAI’s flagship technology architecture, preparing the ground for future innovations. Yet Sam Altman has conceded that GPT-5 is still missing “many things quite important” to AGI, notably the ability to learn continuously on its own (Milmo, 2025).

Strategic Moves in the AI Hardware Landscape

AI models of GPT-5’s scale demand unprecedented compute power. Traditional GPUs from Nvidia remain dominant, but the market is rapidly diversifying with startups offering specialized accelerators. Graphcore and Cerebras target general-purpose AI workloads, while niche players are betting on transformer-only ASICs. This shift toward specialization reflects the increasing costs of training and inference at scale (Medium, 2024).

Recently, BitsWithBrains (Editorial team, 2024) reported that Etched.ai’s Sohu chip promises 20× faster inference than Nvidia H100 GPUs by hard-wiring transformer matrix multiplications, achieving 90% FLOP utilization versus 30–40% on general-purpose hardware.
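The utilization gap alone accounts for much of the claimed advantage. The back-of-envelope sketch below multiplies peak compute by utilization to get effective throughput; the peak TFLOP/s figures are illustrative placeholders, not vendor specifications.

```python
# Back-of-envelope: effective throughput = peak FLOP/s x utilization.
# Peak figures are illustrative placeholders, not datasheet values.
def effective_tflops(peak_tflops: float, utilization: float) -> float:
    return peak_tflops * utilization

gpu_effective  = effective_tflops(peak_tflops=1000.0, utilization=0.35)  # general-purpose GPU
asic_effective = effective_tflops(peak_tflops=7000.0, utilization=0.90)  # transformer ASIC

print(f"GPU effective:  {gpu_effective:,.0f} TFLOP/s")
print(f"ASIC effective: {asic_effective:,.0f} TFLOP/s")
print(f"Speedup: {asic_effective / gpu_effective:.1f}x")
```

With these placeholder numbers the ratio works out to roughly 18x, in the same ballpark as the 20x figure reported above.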

Etched and the Sohu ASIC

Genesis and Funding
Founded in 2022, Etched secured $120 million to develop Sohu, its transformer-specific ASIC (Wassim, 2024). This investment reflects confidence in a hyper-specialized strategy aimed at reducing AI infrastructure costs and energy consumption.

Technical Superiority
Sohu integrates 144 GB of HBM3 memory per chip, enabling large batch sizes without performance degradation—critical for services like ChatGPT and Google Gemini that handle thousands of concurrent requests (Wassim, 2024). An 8× Sohu server is claimed to replace 160 Nvidia H100 GPUs, shrinking hardware footprint and operational overhead.
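Memory capacity maps to concurrency because every in-flight request holds a key-value cache that grows with sequence length. The sketch below estimates concurrent requests per chip under stated assumptions (a Llama-2-70B-like shape with grouped-query attention, 8-bit weights and KV cache, 4,096-token contexts); all figures are rough assumptions for illustration, not Etched specifications.

```python
# Rough KV-cache sizing sketch: how much HBM each concurrent request consumes.
# Model shape is Llama-2-70B-like (80 layers, 8 KV heads via GQA, head dim 128);
# 8-bit weights and KV cache are assumed. All numbers are approximations.
layers, kv_heads, head_dim = 80, 8, 128
bytes_per_elem = 1            # int8 KV cache
seq_len = 4096                # tokens held per request

kv_bytes_per_token = 2 * layers * kv_heads * head_dim * bytes_per_elem  # K and V
kv_bytes_per_request = kv_bytes_per_token * seq_len

hbm_bytes = 144e9             # 144 GB HBM3 per chip (claimed)
weight_bytes = 70e9           # ~70B parameters at 1 byte each

concurrent_requests = (hbm_bytes - weight_bytes) / kv_bytes_per_request
print(f"KV cache per request: {kv_bytes_per_request / 1e9:.2f} GB")
print(f"Approx. concurrent requests per chip: {concurrent_requests:.0f}")
```

Under these assumptions a single chip could hold on the order of a hundred long-context requests at once, which is why large on-package memory matters as much as raw compute for serving workloads.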

Strategic Partnerships and Demonstrations
Etched partnered with TSMC to leverage its 4 nm process and dual-sourced HBM3E memory, ensuring production scalability and reliability (Wassim, 2024). The company showcased “Oasis,” a real-time interactive video generator built in collaboration with Decart, demonstrating a use case only economically feasible on Sohu hardware (Lyons, 2024). This three-step strategy—invent, demonstrate feasibility, and launch ASIC—exemplifies how Etched is creating demand for its specialized chip.

Market Potential and Risks
While Sohu’s efficiency is compelling, its transformer-only focus raises concerns about adaptability if AI architectures evolve beyond transformers. Early access programs and developer cloud services aim to onboard customers in sectors like streaming, gaming, and metaverse applications, but the technology remains unproven at hyperscale (Lyons, 2024).

Implications for AGI

Hardware acceleration reduces latency and cost barriers, enabling more frequent experimentation and real-time multimodal inference. If transformer-specialized chips like Sohu deliver on their promises, the accelerated feedback loops could hasten algorithmic breakthroughs. Yet AGI requires more than raw compute—it demands architectures capable of lifelong learning, causal reasoning, and autonomous goal formulation, areas where current hardware alone cannot suffice.

Policy and regulation will also shape the trajectory. Continuous online learning raises new safety and accountability challenges, potentially requiring hardware-level enforcements of policy constraints (Griffiths & Varanasi, 2025).

Challenges and Ethical Considerations

Safety and Hallucinations
Despite reduced hallucination rates, GPT-5 may still propagate misinformation in critical sectors like healthcare and finance. Ongoing hiring of forensic psychiatrists to study mental health impacts highlights the gravity of uncontrolled outputs (Strickland, 2025).

Data Privacy
Agentic functionalities that access personal calendars or emails necessitate robust permission and encryption frameworks. Misconfigurations could expose sensitive data in automated workflows.

Regulatory Scrutiny
OpenAI faces legal challenges tied to its nonprofit origins and its conversion to a for-profit structure, drawing oversight from state attorneys general. Specialized hardware firms may encounter export controls if their chips enable dual-use applications.

Environmental Impact
While Sohu claims energy efficiency gains, the overall environmental footprint of proliferating data centers and embedded AI systems remains substantial. Lifecycle analyses must account for chip manufacturing and e-waste.

Key Takeaways

  • GPT-5 Advances: Improved reasoning, coding (“vibe coding”), and agentic tasks push the model closer to human-level versatility (Zeff, 2025).
  • AGI Gap: True AGI demands continuous, autonomous learning—a feature GPT-5 still lacks (Milmo, 2025).
  • Hardware Specialization: Startups like Etched with Sohu ASICs offer 20× performance for transformer models, but their narrow focus poses adaptability risks (Editorial team, 2024; Wassim, 2024).
  • Strategic Demonstrations: Projects like Oasis illustrate how specialized hardware can create entirely new application markets (Lyons, 2024).
  • Ethical and Regulatory Hurdles: Safety, privacy, and environmental considerations will influence the pace of AGI development (Strickland, 2025; Griffiths & Varanasi, 2025).



AI Chip Wars: How Embedded Intelligence is Revolutionizing Semiconductor Innovation


The silent revolution transforming artificial intelligence isn't happening in software labs – it is occurring at the nanometer scale inside semiconductor fabrication plants. As global demand for AI compute explodes, traditional general-purpose chips are hitting physical limits, igniting a technological arms race where the future of AI innovation will be determined by how intelligence gets embedded directly into silicon. This high-stakes battle pits industry titans against daring startups in a contest that will reshape global tech power structures and determine who controls the infrastructure of our intelligent future.

Etched's Sohu chip: the world's first transformer-specific AI chip

But what exactly is embedded AI? Embedded AI literally means that we are building artificial intelligence directly into everyday devices and machines – like putting a tiny brain inside objects. Instead of needing to connect to the internet or a giant computer in the cloud, the device itself can see, hear, understand, and make smart decisions instantly using its own specialized chip. Think of a smart fridge that instantly recognizes spoiled food with its built-in camera, a factory robot that instantly spots defects without stopping, or your phone camera instantly adjusting settings for the perfect photo – all without waiting to "phone home" to a distant server. It turns ordinary objects into responsive, efficient, and private smart helpers.

The Compute Inferno: Fueling the AI Chip Revolution

Transformer models now routinely contain hundreds of billions – even trillions – of parameters, creating unprecedented computational demands:

  • Training frontier-scale models exceeds $1 billion in electricity and hardware costs (Uberti, 2024)
  • Inference costs over a model's lifetime typically run an order of magnitude higher than training expenses
  • Compute used to train the largest AI models has increased roughly 300,000x since 2012

This economic reality mirrors Bitcoin mining's evolution: early miners discovered that specialized ASICs delivered tenfold efficiency gains over flexible GPUs. We're now witnessing the same transformation in AI, where purpose-built silicon eliminates architectural overhead and slashes energy waste.

Architectural Evolution: From General-Purpose to Domain-Specific

Here are the key milestones in the field of embedded AI:

2006: CUDA Revolution

NVIDIA unlocks parallel processing in gaming GPUs, enabling early AI experiments

2016: Google TPU

First dedicated AI accelerator cuts inference latency by 10x for search ranking

2017: Apple Neural Engine

Brings on-device AI to mobile photography with dedicated silicon

Today's hyperscalers demand even sharper specialization: silicon optimized exclusively for transformer architectures – the "T" in ChatGPT – with all unnecessary components stripped away. This has ignited an explosion of domain-specific accelerators challenging NVIDIA's CUDA ecosystem dominance.

2025 Competitive Landscape: Titans vs. Disruptors

Incumbent Powerhouses

Company | Flagship Product | Key Innovation | Strategic Advantage
NVIDIA | Blackwell Ultra | Micro-tensor scaling, 4-bit FP4 support | Doubles model size at constant memory (NVIDIA, 2025)
AMD | Instinct MI300X | 192GB HBM3, 5TB/s bandwidth | Eliminates memory bottlenecks (AMD, 2025)
Intel | Gaudi-3 | Hybrid architecture | Price-performance targeting

Disruptive Startups

Cerebras

Wafer-scale engines that turn an entire silicon wafer into a single processor

Groq

Deterministic LPUs delivering 300 tokens/sec on Llama-2 70B (Groq, 2023)

Chinese Challengers

Huawei's Ascend 910B and Biren's BR100 targeting domestic autonomy despite export controls (Reuters, 2025)

Etched's Sohu: The Ultimate Transformer Machine

San Francisco startup Etched has made the industry's most audacious wager with its transformer-specific Sohu ASIC.

Here's a breakdown of Etched's Sohu chip capabilities in plain terms with real-world analogies:

⚡️ 1. Radical Specialization

Think of it as a master chef who only makes pizza.
Instead of a general-purpose chip (like NVIDIA's) that can run any AI task (chatbots, image recognition, etc.), Sohu is hardwired exclusively for transformer models (the "T" in ChatGPT). It can't run other AI types (like Siri's old voice recognition or Tesla's vision systems). This laser focus is its superpower.

🚀 2. Record Performance

Like replacing 160 horses with 1 rocket.
Sohu generates roughly 500,000 tokens per second when running a ChatGPT-sized model (Llama-3 70B). To match this, you’d need 160 high-end NVIDIA H100 GPUs ($3M+ worth of hardware) working together. It’s the difference between a bicycle and a fighter jet.
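Taking these figures at face value, the implied per-device throughput is easy to check; the sketch below simply rearranges the article’s own numbers and makes no independent performance claims.

```python
# Arithmetic check on the headline claim: one 8x Sohu server matching 160 H100s
# at ~500,000 tokens/s on Llama-3 70B. Figures are the article's, not measured.
sohu_server_tokens_per_s = 500_000
h100_count = 160
sohu_chips_per_server = 8

per_h100 = sohu_server_tokens_per_s / h100_count            # implied H100 throughput
per_sohu = sohu_server_tokens_per_s / sohu_chips_per_server  # implied Sohu throughput

print(f"Implied per-H100 throughput: {per_h100:,.0f} tokens/s")
print(f"Implied per-Sohu throughput: {per_sohu:,.0f} tokens/s")
print(f"Per-chip ratio: {per_sohu / per_h100:.0f}x")
```

The implied per-chip ratio is 20x, which is exactly the speedup Etched advertises.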

🔋 3. Unprecedented Efficiency

Your phone battery lasting 20 days instead of 1.
For complex AI tasks (like summarizing a 100-page document), Sohu uses 1/20th the electricity of NVIDIA’s top chip. If an NVIDIA server costs $10,000/month in power, Sohu would cost just $500 for the same work.

🔬 4. Advanced Manufacturing

Building circuits roughly 20,000x thinner than a human hair.
Sohu is made with TSMC’s 4 nm technology, one of the most advanced chipmaking processes available today. Smaller circuits = more power in less space (like fitting a supercomputer into a laptop).


⚙️ How It Achieves This (Simple Analogy):

Imagine a factory assembly line:

  • Old Way (GPUs): Workers (circuits) read instructions for every task ("Build a car? Okay, let me check the manual..."). Slow and energy-wasting.

  • Sohu’s Way: The factory is pre-built only for cars. Conveyor belts (silicon) are hardwired to bolt tires, install engines, etc. No instructions needed – everything flows instantly with zero wasted motion.

This eliminates:

  • "Scheduler Overhead": No manager shouting instructions.

  • "Thread Divergence": No workers waiting for tasks.

  • "Cache Aliasing": No parts delivered to the wrong station.

Result: Near-perfect efficiency – like a factory where 99% of energy goes directly into building cars.

Real-World Impact

  • For Companies: Cuts AI costs by 95% for chatbots/LLMs.

  • For Users: Enables real-time AI assistants that respond instantly (no "typing..." delay).

  • For the Planet: Slashes data center energy use dramatically.

The tradeoff? Sohu can’t adapt if AI tech moves beyond transformers. It’s a high-risk, high-reward bet on the future.

Strategic Execution & Ecosystem Development

Etched's path to market reveals sophisticated risk mitigation:

Partnerships

Collaboration with Rambus for integrated HBM controller and PHY stack accelerated development (Rambus, 2025)

Developer Strategy

"Developer Cloud" provides pre-silicon emulator access – mirroring NVIDIA's early CUDA playbook (AIM Research, 2024)

Funding & Valuation

$120 million Series A led by Positive Sum with participation from Peter Thiel and Stanley Druckenmiller (Reuters, 2024)

Despite these advantages, market forecasters place the probability of first-customer shipment within 12 months below 10% due to:

  • HBM3 memory supply constraints
  • TSMC 4 nm yield challenges
  • Potential U.S. export control changes (Kelly, 2025)

Business Model Innovation: The AI Throughput Economy

Etched's hybrid monetization strategy reflects industry transformation:

  • Hardware Sales: $50K-$100K per-card pricing for on-premise deployment
  • Throughput Cloud: $0.0001/token, minute-based billing for hosted inference

This "AI-as-utility" model shields customers from capital expenditure while creating recurring revenue streams. Sohu's deterministic pipeline particularly excels at real-time applications like multilingual voice agents where latency must stay below 200ms – workloads where GPUs struggle with queueing jitter.

Geopolitical Chessboard: The Silicon Curtain

The Five Nation Oligopoly

Advanced semiconductor manufacturing concentrates in just five countries controlling critical choke points:

Country | Dominance Area | Market Share
Taiwan | Advanced logic (TSMC) | 92% of sub-5 nm production
Netherlands | EUV lithography (ASML) | 100% of EUV systems
South Korea | Memory & foundry (Samsung) | 43% of DRAM market

China's Semiconductor Dilemma

Despite massive investments, China faces structural challenges:

  • Spends roughly as much on semiconductor imports as on oil imports
  • SMIC's 7nm process (N+2) remains 3-4 generations behind industry leaders
  • Huawei's Ascend 910B allegedly contains TSMC IP despite export controls (Woodruff, 2024)
  • Biren's $207M funding round and planned Hong Kong IPO show desperation for capital (Reuters, 2025)

Reshoring Initiatives

  • U.S. CHIPS Act: $52B in subsidies triggering $450B in private investment
  • Europe's Chips Act: €43B to double the EU's global market share
  • China's Big Fund: $50B+ for semiconductor self-sufficiency

Future Frontiers: Beyond Transformer Dominance

As architectural innovation accelerates, two competing visions emerge:

Vertical Integration Model

Cloud providers building proprietary AI factories:

  • NVIDIA's Blackwell reference platform partners with Cisco/Dell/HPE (NVIDIA Newsroom, 2025)
  • Amazon's Trainium and Inferentia chips anchor the AWS ecosystem
  • Google's TPU v5+ for Google Cloud services

Heterogeneous Ecosystem

Specialists leasing capacity to model developers:

  • Etched targeting lowest $/token for transformers
  • Groq planning 2M LPU shipments by 2026 (Business Insider, 2024)
  • Cerebras' wafer-scale for massive models

Next-Generation Technologies

  • Neuromorphic chips: Intel Loihi 2
  • Chiplet ecosystems: modular designs
  • Photonic computing: light-based processing
  • Quantum accelerators: algorithm-specific boosts

Conclusion: The Embedded Intelligence Revolution

The AI chip wars represent a fundamental transformation in computing's basic economics. As specialized architectures like Etched's Sohu demonstrate 20x efficiency gains, they force reconsideration of the "one architecture fits all" paradigm that has dominated for decades. This revolution extends beyond technical specifications into global power dynamics, where semiconductor leadership translates directly to economic and military advantage.

The coming years will determine whether transformer-specific ASICs become the new standard or face obsolescence from algorithmic shifts. What remains certain is that embedding intelligence directly into silicon marks a new chapter in computing – one where the boundaries between hardware and intelligence dissolve, creating unprecedented capabilities and complex geopolitical challenges. The nations and companies that master this integration will shape our technological future for decades to come.

Key Takeaways

  • Transformer specialization delivers 10-20x efficiency gains but carries architectural lock-in risks
  • Etched's Sohu represents extreme specialization with 500K tokens/sec performance replacing 160 GPUs
  • Geopolitics dictates semiconductor access with five nations controlling advanced manufacturing
  • China spends equivalent of oil imports on chips but remains 3-4 generations behind in process technology
  • Hybrid business models emerge combining hardware sales with throughput-based cloud services
  • Next-gen architectures are already developing including neuromorphic, photonic, and quantum-assisted chips

References

  1. AMD. (2025). Instinct MI300X accelerators: AI & HPC computing. Retrieved from: https://www.amd.com/en/partner/articles/instinct-mi300x-accelerating-ai-hpc.html
  2. Business Insider. (2024). Groq CEO Jonathan Ross reveals strategy to lead AI chip market. Retrieved from: https://www.businessinsider.com/jonathan-ross-groq-ai-power-list-2024
  3. Kelly, A. (2025). Will the Sohu AI chip ship to customers within a year? Manifold Markets. Retrieved from: https://manifold.markets/ahalekelly/will-the-sohu-ai-chip-ship-to-custo
  4. Morales, J. (2024). Sohu AI chip claimed to run models 20× faster and cheaper than Nvidia H100 GPUs. Tom's Hardware. Retrieved from: https://www.tomshardware.com/tech-industry/artificial-intelligence/sohu-ai-chip-claimed-to-run-models-20x-faster-and-cheaper-than-nvidia-h100-gpus
  5. NVIDIA. (2025). The engine behind AI factories: Blackwell architecture. Retrieved from: https://www.nvidia.com/en-us/data-center/technologies/blackwell-architecture/
  6. NVIDIA Newsroom. (2025). NVIDIA Blackwell Ultra AI Factory platform. Retrieved from: https://nvidianews.nvidia.com/news/nvidia-blackwell-ultra-ai-factory-platform-paves-way-for-age-of-ai-reasoning
  7. Reuters. (2024). AI startup Etched raises $120 million. Retrieved from: https://www.reuters.com/technology/artificial-intelligence/ai-startup-etched-raises-120-million-develop-specialized-chip-2024-06-25/
  8. Reuters. (2025). China AI chip firm Biren raises new funds. Retrieved from: https://www.reuters.com/world/china/china-ai-chip-firm-biren-raises-new-funds-plans-hong-kong-ipo-say-sources-2025-06-26/
  9. Uberti, G. (2024). Etched is making the biggest bet in AI. Etched Blog. Retrieved from: https://www.etched.com/announcing-etched
  10. Woodruff, M. (2024). Mystery surrounds discovery of TSMC tech inside Huawei AI chips. Wall Street Journal. Retrieved from: https://www.wsj.com/tech/mystery-surrounds-discovery-of-tsmc-tech-inside-huawei-ai-chips-7d922a01
  11. Rambus. (2025). From dorm room beginnings to a pioneer in the AI chip revolution. Retrieved from: https://www.rambus.com/blogs/from-dorm-room-beginnings-to-a-pioneer-in-the-ai-chip-revolution-how-etched-is-collaborating-with-rambus-to-achieve-their-vision/
  12. Deloitte. (2025). Global Semiconductor Industry Outlook. Retrieved from: https://www.deloitte.com/us/en/insights/industry/technology/technology-media-telecom-outlooks/semiconductor-industry-outlook.html
  13. TechInsights. (2025). AI Market Outlook 2025. Retrieved from: https://www.techinsights.com/blog/ai-market-outlook-2025-key-insights-and-trends
