staff@insideAI News
//
MLCommons has released the latest MLPerf Inference v5.0 benchmark results, highlighting the growing importance of generative AI in the machine learning landscape. The new benchmarks feature tests for large language models (LLMs) like Llama 3.1 405B and Llama 2 70B Interactive, designed to evaluate how well systems perform in real-world applications requiring agentic reasoning and low-latency responses. This shift reflects the industry's increasing focus on deploying generative AI and the need for hardware and software optimized for these demanding workloads.
The v5.0 results reveal significant performance improvements driven by advancements in both hardware and software. The median submitted score for Llama 2 70B has doubled compared to a year ago, and the best score is 3.3 times faster than Inference v4.0. These gains are attributed to innovations like support for lower-precision computation formats such as FP4, which allow for more efficient processing of large models. The MLPerf Inference benchmark suite evaluates machine learning performance in a way that is architecture-neutral, reproducible, and representative of real-world workloads.
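Lower-precision formats speed up inference by shrinking each weight to a few bits, trading numeric precision for memory bandwidth and throughput. As a rough illustration of the idea (a simplified symmetric 4-bit integer mapping, not the actual FP4 floating-point format, which has sign, exponent, and mantissa bits):

```python
# Illustrative 4-bit weight quantization: map floats onto 16 signed
# integer levels (-8..7). Real FP4 is a floating-point format, but the
# memory/accuracy trade-off it exploits is the same.
def quantize_4bit(weights):
    """Return 4-bit codes and the scale factor used to produce them."""
    scale = max(abs(w) for w in weights) / 7  # symmetric scale
    codes = [max(-8, min(7, round(w / scale))) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate float weights from the 4-bit codes."""
    return [c * scale for c in codes]

weights = [0.12, -0.5, 0.33, 0.07]
codes, scale = quantize_4bit(weights)
approx = dequantize(codes, scale)  # close to the originals, in 4 bits each
```

Each weight now occupies 4 bits instead of 16 or 32, which is the kind of saving that lets submitters fit and serve much larger models per accelerator.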
Nazy Fouladirad@AI Accelerator Institute
//
References: hiddenlayer.com, AI Accelerator Institute
As generative AI adoption rapidly increases, securing investments in these technologies has become a paramount concern for organizations. Companies are beginning to understand the critical need to validate and secure the underlying large language models (LLMs) that power their Gen AI products. Failing to address these security vulnerabilities can expose systems to exploitation by malicious actors, emphasizing the importance of proactive security measures.
Microsoft is addressing these concerns through innovations in Microsoft Purview, which offers a comprehensive set of solutions aimed at helping customers seamlessly secure and confidently activate data in the AI era. Complementing these efforts, Fiddler AI is focusing on building trust into AI systems through its AI Observability platform, which emphasizes explainability and transparency. Fiddler helps enterprise AI teams deliver responsible AI applications and ensures that people interacting with AI receive fair, safe, and trustworthy responses. This involves continuous monitoring, robust security measures, and strong governance practices to establish long-term responsible AI strategies across all products.
The emergence of agentic AI, which can plan, reason, and take autonomous action to achieve complex goals, further underscores the need for enhanced security measures. Agentic AI systems extend the capabilities of LLMs by adding memory, tool access, and task management, allowing them to operate more like intelligent agents than simple chatbots. Organizations must treat security and oversight as essential to safe deployment. Gartner research indicates that a significant portion of organizations plan to pursue agentic AI initiatives, making it crucial to address the security risks associated with these systems.
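The memory/tool/task loop that distinguishes an agent from a chatbot can be sketched in a few lines. This is a toy illustration under invented names (`fake_llm` stands in for a real model call, and the single `calculator` tool is hypothetical), not any vendor's implementation:

```python
# Hypothetical agentic loop: a planner with working memory and tool access.
def calculator(expression):
    """A 'tool' the agent can invoke (restricted eval for the toy example)."""
    return str(eval(expression, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def fake_llm(goal, memory):
    """Stand-in planner: call the tool once, then finish with its result.
    A real agent would let the model decide which tool to use next."""
    if not memory:
        return ("calculator", goal)
    return ("finish", memory[-1][1])

def run_agent(goal, max_steps=5):
    memory = []  # working memory of (tool, result) pairs
    for _ in range(max_steps):
        action, arg = fake_llm(goal, memory)
        if action == "finish":
            return arg
        memory.append((action, TOOLS[action](arg)))
    return None

print(run_agent("2 + 3 * 4"))  # prints "14"
```

The security concern in the passage above lives exactly in this loop: every tool the agent can call, and every item it stores in memory, is an attack surface that a plain chatbot does not have.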
Sean Michael@AI News | VentureBeat
//
References: AI News | VentureBeat, www.computerworld.com
Gartner, an analyst firm, released a report forecasting that global generative AI spending will reach $644 billion in 2025. This figure represents a 76.4% year-over-year increase from 2024. Despite high failure rates among early generative AI projects, organizations are still expected to invest heavily, with the lion's share of spending going towards services. GenAI services are projected to grow by 162% this year, following a 177% increase last year. According to Gartner Analyst John-David Lovelock, the shift from software to generative AI is becoming a "tidal wave of money."
The surge in spending is primarily driven by vendor investments in the technology. Hyperscalers are making massive capital expenditures on GPU infrastructure, and software vendors are rushing to deploy generative AI tools. Enterprises, however, are pulling back on in-house AI projects and increasingly opting for off-the-shelf solutions. "CIOs are no longer building generative AI tools, they're being sold technology," Lovelock stated, emphasizing that vendors are offering solutions that meet enterprise needs.
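A quick sanity check on Gartner's numbers: a 76.4% year-over-year rise to $644 billion in 2025 implies 2024 spending of roughly $365 billion.

```python
# Back-of-the-envelope check of the Gartner forecast cited above.
spend_2025 = 644.0   # $B, forecast for 2025
growth = 0.764       # 76.4% year-over-year increase
spend_2024 = spend_2025 / (1 + growth)
print(f"Implied 2024 GenAI spend: ${spend_2024:.0f}B")  # roughly $365B
```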
Andrew Liszewski@The Verge
//
Amazon has announced Alexa+, a new, LLM-powered version of its popular voice assistant. This upgraded version will cost $19.99 per month, but will be included at no extra cost for Amazon Prime subscribers. Alexa+ boasts enhanced AI agent capabilities, enabling users to perform tasks like booking Ubers, creating study plans, and sending texts via voice command. These new features are intended to provide a more seamless and natural conversational experience. Early access to Alexa+ will begin in late March 2025 for customers with eligible Echo Show devices in the United States.
Amazon emphasizes that Alexa+ utilizes a "model agnostic" system, drawing on Amazon Bedrock and employing various AI models, including Amazon Nova and those from Anthropic, to optimize performance. This approach allows Alexa+ to choose the best model for each task, leveraging specialized "experts" for orchestrating services. With seamless integration into tens of thousands of devices and services, including news sources like Time, Reuters, and the Associated Press, Alexa+ provides accurate and real-time information.
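The "model agnostic" routing described above boils down to classifying each request and dispatching it to the model best suited for it. A minimal sketch, with an invented task classifier and invented model labels (the real system runs on Amazon Bedrock and its actual routing logic is not public):

```python
# Hypothetical per-task model router, loosely mirroring the Alexa+
# description: pick a model per request rather than one model for all.
MODEL_REGISTRY = {
    "conversation": "amazon.nova",    # fast general chat (assumed label)
    "reasoning": "anthropic.claude",  # multi-step planning (assumed label)
}

def classify_task(request):
    """Toy classifier: route multi-step requests to the reasoning model."""
    return "reasoning" if "plan" in request.lower() else "conversation"

def route(request):
    return MODEL_REGISTRY[classify_task(request)]

print(route("Book me an Uber"))      # prints "amazon.nova"
print(route("Create a study plan"))  # prints "anthropic.claude"
```

The benefit of this design is that cheap, fast models handle the bulk of traffic while expensive reasoning models are reserved for the requests that need them.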
@techstrong.ai
//
References: techstrong.ai, the-decoder.com
IBM has decided to integrate elements of DeepSeek's AI models into its WatsonX platform, driven by a commitment to open-source innovation. This move aims to broaden WatsonX's ability to perform secure reasoning by leveraging "distilled versions" of DeepSeek-R1, allowing IBM to include what it calls the best open-source models from around the world. IBM also noted that big, expensive proprietary systems do not always win.
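The "distilled versions" mentioned above refer to knowledge distillation: training a small student model to match a large teacher's output distribution rather than the raw labels. A pure-Python sketch of the core loss term (illustrative, not DeepSeek's training recipe):

```python
# Minimal sketch of the knowledge-distillation loss: KL divergence between
# temperature-softened teacher and student output distributions.
import math

def softmax(logits, temperature=1.0):
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over softened distributions."""
    p = softmax(teacher_logits, temperature)  # teacher targets
    q = softmax(student_logits, temperature)  # student predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Zero when the student matches the teacher, positive otherwise.
matched = distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
mismatched = distillation_loss([2.0, 0.5, -1.0], [0.1, 0.1, 0.1])
```

Minimizing this loss is what lets a compact model inherit much of a frontier model's behavior at a fraction of the serving cost, which is what makes distilled R1 variants practical to host inside a platform like WatsonX.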
A Microsoft study conducted with Carnegie Mellon University reveals that GenAI tools are shifting knowledge workers from problem solvers to AI output verifiers. The study, surveying 319 knowledge workers, examined six categories of critical thinking and found that workers are now primarily verifying AI outputs and integrating AI-generated answers. This "cognitive offloading" could weaken independent problem-solving abilities, with researchers suggesting that companies should promote critical thinking skills among employees.
Nanette George@The Dataiku Blog
//
References: AWS Machine Learning Blog, learn.aisingapore.org
Dataiku has released its top five features for data scientists in 2024, highlighting its commitment to supporting data practitioners in their work. The features include enhanced integrations with Databricks, seamless cloud deployments, and multimodal AutoML. These enhancements aim to foster collaboration between teams, tools, and technologies, making data science more efficient and effective. Dataiku's focus is on building lasting relationships within the data science ecosystem.
Rich Data Co (RDC) is utilizing generative AI on Amazon Bedrock to transform credit decision-making. Their software-as-a-service solution provides banks and lenders with customer insights and AI-driven capabilities. RDC has developed data science and portfolio assistants that leverage generative AI to assist teams in developing AI models and gaining insights into loan portfolios. The data science assistant boosts team efficiency by answering technical queries, while the portfolio assistant facilitates natural language inquiries about loan portfolios.
@analyticsindiamag.com
//
References: Techmeme, Analytics India Magazine
Perplexity AI has launched Perplexity Assistant, an AI-powered agent for Android devices, which is capable of performing multi-app actions such as hailing a ride. This service is currently free for users and available in 15 languages. The company has also introduced Sonar, a new GenAI search API, which they are marketing as the most affordable option in the market.
Perplexity's Sonar API comes in two versions: Sonar and Sonar Pro. Sonar is the basic version, priced at $1 per million tokens for both input and output, while Sonar Pro is aimed at more complex tasks and costs $3 per million input tokens and $15 per million output tokens, offering more in-depth answers. Sonar Pro is reportedly the best performing model on 'factuality' in SimpleQA benchmarks. Companies like Zoom and Doximity have integrated the API, with Zoom using it for real-time, private searches within video calls, and Doximity using it to retrieve concise answers to medical questions for physicians.
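Using the per-million-token prices reported above, the two tiers compare like this for a typical query (the 2,000-in / 500-out token counts are an arbitrary example):

```python
# Per-query cost comparison for the two Sonar tiers (USD).
PRICES = {  # (input $/M tokens, output $/M tokens), from the figures above
    "sonar": (1.0, 1.0),
    "sonar-pro": (3.0, 15.0),
}

def query_cost(model, input_tokens, output_tokens):
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000

basic = query_cost("sonar", 2_000, 500)      # $0.0025 per query
pro = query_cost("sonar-pro", 2_000, 500)    # $0.0135 per query
```

The asymmetric output pricing on Sonar Pro means its cost advantage or penalty depends heavily on how verbose the answers are, not just on query volume.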