OpenAI has announced the release of new AI reasoning models, 'o3' and 'o4-mini'. OpenAI calls o3 its 'most advanced reasoning model ever' and claims that it outperforms previous models in ...
Deploying AI inference at the edge—on smartphones, appliances, industrial devices, and vehicles—promises faster, more private, and more energy-efficient intelligence. Expedera’s packet-based NPU architecture ...
Testing LLM performance can be very complicated and time-consuming. There are also many variables, such as quantization, model conversion, and variation in input tokens, that can reduce a test’s ...
Inception, the company behind the first commercial diffusion large language models (dLLMs), today announced the launch of ...
The AI hardware landscape is evolving at breakneck speed, and memory technology is at the heart of this transformation. NVIDIA’s recent announcement of Rubin CPX, a new class of GPU purpose-built for ...
Yesterday, Microsoft made the software for its Maia 200 chip – its second-generation inference processor – available to developers. Microsoft AI chief Scott Guthrie called the Maia 200 “the most efficient ...
Roman Chernin is the CBO and cofounder of AI infrastructure company Nebius. His career spans over 20 years in the tech industry. Every major advance in AI begins with model training, but the ...
GIBO Holdings Ltd. (NASDAQ: GIBO) today announced a significant technological breakthrough in its proprietary AIGC (AI-Generated Content) multimodal engine, marking the transition into a ...
Positron AI, the leader in energy-efficient AI inference hardware, today announced an oversubscribed $230 million Series B financing at a post-money valuation exceeding $1 billion. This press release ...