info@thehackernews.com (The Hacker News)
//
Google is integrating its Gemini Nano AI model into the Chrome browser to provide real-time scam protection for users. This enhancement focuses on identifying and blocking malicious websites and activities as they occur, addressing the challenge posed by scam sites that often exist for only a short period. The integration of Gemini Nano into Chrome's Enhanced Protection mode, available since 2020, allows for the analysis of website content to detect subtle signs of scams, such as misleading pop-ups or deceptive tactics.
When a user visits a potentially dangerous page, Chrome uses Gemini Nano to evaluate security signals and determine the intent of the site. This information is then sent to Safe Browsing for a final assessment. If the page is deemed likely to be a scam, Chrome will display a warning to the user, providing options to unsubscribe from notifications or view the blocked content while also allowing users to override the warning if they believe it's unnecessary. This system is designed to adapt to evolving scam tactics, offering a proactive defense against both known and newly emerging threats. The AI-powered scam detection system has already demonstrated its effectiveness, reportedly catching 20 times more scam-related pages than previous methods. Google also plans to extend this feature to Chrome on Android devices later this year, further expanding protection to mobile users. This initiative follows criticism regarding Gmail phishing scams that mimic law enforcement, highlighting Google's commitment to improving online security across its platforms and safeguarding users from fraudulent activities. Recommended read:
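To make the two-stage flow described above concrete, here is a toy sketch of the pattern: an on-device model scores local signals first, and a server-side service issues the final verdict. Everything in it is illustrative; the function names, threshold, and Safe Browsing call are assumptions, not Chrome internals.

```python
# Illustrative only: a toy two-stage scam check in the spirit of the flow
# described above. None of these functions correspond to real Chrome APIs.

def on_device_llm_score(page_text: str) -> float:
    """Stand-in for Gemini Nano scoring scam-like signals locally (0.0-1.0)."""
    scam_signals = ["your device is infected", "claim your prize", "act now"]
    hits = sum(signal in page_text.lower() for signal in scam_signals)
    return min(1.0, hits / len(scam_signals))

def safe_browsing_verdict(score: float, url: str) -> bool:
    """Stand-in for the server-side Safe Browsing check that issues the
    final verdict; in Chrome this step happens remotely."""
    return score >= 0.5  # hypothetical threshold

def check_page(url: str, page_text: str) -> str:
    score = on_device_llm_score(page_text)   # stage 1: local analysis
    if safe_browsing_verdict(score, url):    # stage 2: final verdict
        return "WARN: likely scam (user may override)"
    return "OK"

print(check_page("http://example.test", "Act now! Your device is infected!"))
```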
References :
@the-decoder.com
//
Elon Musk's AI firm, xAI, is facing criticism after its Grok chatbot began generating controversial responses related to "white genocide" in South Africa. The issue arose when users observed Grok, integrated into the X platform, unexpectedly introducing the topic into unrelated discussions. This sparked concerns about the potential for AI manipulation and the spread of biased or misleading claims. xAI has acknowledged the incident, attributing it to an unauthorized modification of Grok's system prompt, which guides the chatbot's responses.
xAI claims that the unauthorized modification directed Grok to provide specific responses on a political topic, violating the company's internal policies and core values. According to xAI, the code review process for prompt changes was circumvented, allowing the unauthorized modification to occur. The company is now implementing stricter review processes to prevent individual employees from making unauthorized changes, and it has set up a 24/7 monitoring team to respond more quickly when Grok produces questionable outputs. xAI also stated it would publicly publish Grok’s system prompts on GitHub. The incident has prompted concerns about the broader implications of AI bias and the challenges of ensuring unbiased content generation. Some have suggested that Musk himself might have influenced Grok's behavior, given his history of commenting on South African racial politics. While xAI denies any deliberate manipulation, the episode underscores the need for greater transparency and accountability in the development and deployment of AI systems. The company has launched an internal probe and implemented new security safeguards to prevent similar incidents. Recommended read:
References :
Kevin Okemwa@windowscentral.com
//
OpenAI has released GPT-4.1 and GPT-4.1 mini, enhancing coding capabilities within ChatGPT. According to OpenAI on Twitter, GPT-4.1 "excels at coding tasks & instruction following" and serves as a faster alternative to OpenAI o3 & o4-mini for everyday coding needs. GPT-4.1 mini replaces GPT-4o mini as the default for all ChatGPT users, including those on the free tier. The models are available via the “more models” dropdown selection in the top corner of the chat window within ChatGPT.
GPT-4.1 is now accessible to ChatGPT Plus, Pro, and Team users, with Enterprise and Education access expected in the coming weeks. While initially intended for use only by third-party developers via OpenAI's API, GPT-4.1 was added to ChatGPT following strong user feedback. OpenAI Chief Product Officer Kevin Weil said, "We built it for developers, so it's very good at coding and instruction following—give it a try!" These models support the standard context windows for ChatGPT and are optimized for enterprise-grade practicality. GPT-4.1 delivers improvements over GPT-4o on the SWE-bench Verified software engineering benchmark and Scale’s MultiChallenge benchmark. Safety remains a priority, with OpenAI reporting that GPT-4.1 performs at parity with GPT-4o across standard safety evaluations. Recommended read:
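Since the model also remains available through the API, a minimal sketch of calling it with OpenAI's official Python SDK looks like the following; the prompt is invented, and model availability depends on your account tier.

```python
# Minimal sketch: calling GPT-4.1 via OpenAI's Python SDK (pip install openai).
# Assumes OPENAI_API_KEY is set in the environment; availability may vary.
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4.1",  # or "gpt-4.1-mini" for the lighter default-tier model
    messages=[
        {"role": "user",
         "content": "Write a Python function that reverses a linked list."},
    ],
)
print(response.choices[0].message.content)
```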
References :
@Google DeepMind Blog
//
Google DeepMind has unveiled AlphaEvolve, an AI agent powered by Gemini, that is revolutionizing algorithm discovery and scientific optimization. This innovative system combines the creative problem-solving capabilities of large language models (LLMs) with automated evaluators to verify solutions and iteratively improve upon promising ideas. AlphaEvolve represents a significant leap in AI's ability to develop sophisticated algorithms for both scientific challenges and everyday computing problems, expanding upon previous work by evolving entire codebases rather than single functions.
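The propose-evaluate-refine loop at the heart of this design can be sketched generically. The skeleton below is not DeepMind's code: the LLM call and the evaluator are stubs, but the evolutionary structure (generate variants, score them automatically, keep the best) is the pattern the article describes.

```python
# A generic evolutionary-search skeleton in the spirit of AlphaEvolve:
# an LLM proposes program variants, an automated evaluator scores them,
# and the best candidate seeds the next generation. Purely illustrative.
import random

def llm_propose_variant(parent_program: str) -> str:
    """Stand-in for an LLM proposing a code mutation of the parent."""
    return parent_program + f"  # variant {random.randint(0, 9999)}"

def evaluate(program: str) -> float:
    """Stand-in for the automated evaluator (e.g., speed or accuracy)."""
    return random.random()

def evolve(seed_program: str, generations: int = 10, children: int = 4) -> str:
    population = [(evaluate(seed_program), seed_program)]
    for _ in range(generations):
        best_score, best_program = max(population)
        offspring = [llm_propose_variant(best_program) for _ in range(children)]
        population = [(best_score, best_program)]
        population += [(evaluate(child), child) for child in offspring]
    return max(population)[1]

print(evolve("def matmul(a, b): ..."))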
AlphaEvolve has already demonstrated its potential by breaking a 56-year-old mathematical record, discovering a more efficient matrix multiplication algorithm that had eluded human mathematicians. The system leverages an ensemble of state-of-the-art large language models, including Gemini Flash and Gemini Pro, to propose and refine algorithmic solutions as code. These programs are then evaluated using automated metrics, providing an objective assessment of accuracy and quality. This approach makes AlphaEvolve particularly valuable in domains where progress can be clearly and systematically measured, such as math and computer science. The impact of AlphaEvolve extends beyond theoretical breakthroughs, with algorithms discovered by the system already deployed across Google's computing ecosystem. Notably, AlphaEvolve has enhanced the efficiency of Google's data centers, chip design, and AI training processes, including the training of the large language models underlying AlphaEvolve itself. It has also optimized a matrix multiplication kernel used to train Gemini models and found new solutions to open mathematical problems. By optimizing Google’s massive cluster management system, Borg, AlphaEvolve recovers an average of 0.7% of Google’s worldwide computing resources continuously, which translates to substantial cost savings. Recommended read:
References :
Scott Webster@AndroidGuys
//
Google is expanding its Gemini AI assistant to a wider range of Android devices, moving beyond smartphones to include smartwatches, cars, TVs, and headsets. The tech giant aims to seamlessly integrate AI into users' daily routines, making it more accessible and convenient. This expansion promises a more user-friendly and productive experience across various aspects of daily life. The move aligns with Google's broader strategy to make AI ubiquitous, enhancing usability through conversational and hands-free features.
This integration, referred to as "Gemini Everywhere," seeks to enhance usability and productivity by making AI features more conversational and hands-free. For in-car experiences, Google is bringing Gemini AI to Android Auto and Google Built-in vehicles, promising smarter in-car experiences and hands-free task management for safer driving. Gemini's capabilities should allow for simpler task management and more personalized results across all these new platforms. The rollout of Gemini on these devices is expected later in 2025, first on Android Auto, then Google Built-in vehicles, and Google TV, although the specific models slated for updates remain unclear. Gemini on Wear OS and Android Auto will require a data connection, while Google Built-in vehicles will have limited offline support. The ultimate goal is to offer seamless AI assistance across multiple device types, enhancing both convenience and productivity for Android users. Recommended read:
References :
Matthias Bastian@THE DECODER
//
Microsoft has launched three new additions to its Phi series of compact language models: Phi-4-reasoning, Phi-4-reasoning-plus, and Phi-4-mini-reasoning. These models are designed to excel in complex reasoning tasks, including mathematical problem-solving, algorithmic planning, and coding, demonstrating that smaller AI models can achieve significant performance. The models are optimized to handle complex problems through structured reasoning and internal reflection, while also being efficient enough to run on lower-end hardware, including mobile devices, making advanced AI accessible on resource-limited devices.
Phi-4-reasoning, a 14-billion parameter model, was trained using supervised fine-tuning with reasoning paths from OpenAI's o3-mini. Phi-4-reasoning-plus enhances this with reinforcement learning and processes more tokens, leading to higher accuracy, although with increased computational cost. Notably, these models outperform larger systems, such as the 70B parameter DeepSeek-R1-Distill-Llama, and even surpass DeepSeek-R1 with 671 billion parameters on the AIME-2025 benchmark, a qualifier for the U.S. Mathematical Olympiad, highlighting the effectiveness of Microsoft's approach to efficient, high-performing AI. The Phi-4 reasoning models show strong results in programming, algorithmic problem-solving, and planning tasks, with improvements in logical reasoning positively impacting general capabilities such as following prompts and answering questions based on long-form content. Microsoft employed a data-centric training strategy, using structured reasoning outputs marked with special tokens to guide the model's intermediate reasoning steps. The open-weight models have been released with transparent training details and are hosted on Hugging Face, allowing for public access, fine-tuning, and use in various applications under a permissive MIT license. Recommended read:
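Because the weights are open on Hugging Face under MIT, loading one of the models follows the standard transformers pattern. A hedged sketch: the repository id below matches the announced naming but should be verified, and a 14B model needs a capable GPU or quantization to run.

```python
# Sketch: loading an open-weight Phi-4 reasoning model with transformers.
# The repo id is an assumption based on the naming in this piece; verify on
# Hugging Face before relying on it.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-4-reasoning"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "If 3x + 7 = 22, what is x? Think step by step."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```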
References :
@Dataconomy
//
Databricks has announced its acquisition of Neon, an open-source database startup specializing in serverless Postgres, in a deal reportedly valued at $1 billion. This strategic move is aimed at enhancing Databricks' AI infrastructure, specifically addressing the database bottleneck that often hampers the performance of AI agents. Neon's technology allows for the rapid creation and deployment of database instances, spinning up new databases in milliseconds, which is critical for the speed and scalability required by AI-driven applications. The integration of Neon's serverless Postgres architecture will enable Databricks to provide a more streamlined and efficient environment for building and running AI agents.
Databricks plans to incorporate Neon's scalable Postgres offering into its existing big data platform, eliminating the need to scale separate server and storage components in tandem when responding to AI workload spikes. This resolves a common issue in modern cloud architectures where users are forced to over-provision either compute or storage to meet the demands of the other. With Neon's serverless architecture, Databricks aims to provide instant provisioning, separation of compute and storage, and API-first management, enabling a more flexible and cost-effective solution for managing AI workloads. According to Databricks, Neon reports that 80% of its database instances are provisioned by software rather than humans. The acquisition of Neon is expected to give Databricks a competitive edge, particularly against competitors like Snowflake. While Snowflake currently lacks similar AI-driven database provisioning capabilities, Databricks' integration of Neon's technology positions it as a leader in the next generation of AI application building. The combination of Databricks' existing data intelligence platform with Neon's serverless Postgres database will allow for the programmatic provisioning of databases in response to the needs of AI agents, overcoming the limitations of traditional, manually provisioned databases. Recommended read:
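The "provisioned by software rather than humans" point is easiest to picture through Neon's HTTP API, where creating a database is a single authenticated request. The sketch below is an assumption-heavy illustration: the endpoint, payload shape, and response field follow Neon's public API docs as I understand them and should be verified.

```python
# Sketch: creating a Postgres instance programmatically, the pattern that
# lets software (e.g., an AI agent) provision databases on demand. The
# endpoint, payload, and response field are assumptions to verify against
# Neon's API documentation.
import os
import requests

API_KEY = os.environ["NEON_API_KEY"]
resp = requests.post(
    "https://console.neon.tech/api/v2/projects",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"project": {"name": "agent-scratch-db"}},
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["connection_uris"][0]["connection_uri"])  # assumed field
```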
References :
@cyble.com
//
The FBI has issued a public warning about a surge in phishing attacks targeting senior U.S. government officials and their contacts. These attacks utilize AI-generated audio deepfakes and malicious text messages, a technique known as smishing and vishing, to impersonate high-ranking officials. The aim is to deceive individuals into revealing sensitive information or granting unauthorized access to personal accounts, potentially compromising sensitive networks and enabling further exploitation.
The coordinated campaign, which has been active since April 2025, specifically targets current and former federal and state officials, as well as their contacts. Attackers establish rapport with targets using deepfake voices and text messages before attempting to gain access to accounts. Once an account is compromised, the same social engineering tactics can be turned on the victim's contacts to extract further information or funds. Hackers are increasingly using vishing and smishing for state-backed espionage campaigns and major ransomware attacks. The FBI advises extreme vigilance regarding suspicious communications. Recommendations include verifying sender details through official channels, scrutinizing message content for inconsistencies, and reporting any concerns to the relevant security officials or the FBI. Individuals are cautioned against assuming the authenticity of any message claiming to be from a senior U.S. official and are urged to be wary of requests to transition to a separate messaging platform. This rise in AI-driven impersonation scams underscores the need for heightened cybersecurity awareness and robust verification procedures. Recommended read:
References :
@www.marktechpost.com
//
OpenAI has introduced HealthBench, a new open-source benchmark designed to evaluate AI performance in realistic healthcare scenarios. Developed in collaboration with over 262 physicians, HealthBench uses 5,000 multi-turn conversations and over 48,000 rubric criteria to grade AI models across seven medical domains and 49 languages. The benchmark assesses AI responses based on communication quality, instruction following, accuracy, contextual understanding, and completeness, providing a comprehensive evaluation of AI capabilities in healthcare. OpenAI’s latest models, including o3 and GPT-4.1, have shown impressive results on this benchmark.
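At its core, rubric-based grading of this kind reduces to summing the points of the criteria a response satisfies and normalizing by the points available. The sketch below is a simplification, not OpenAI's grader (which uses model-based judging of each criterion); the criteria shown are invented examples.

```python
# Simplified rubric scoring in the spirit of HealthBench: each criterion
# carries points (negative for harmful behaviors), a judge decides which
# criteria a response meets, and the score is normalized to [0, 1].
criteria = [
    {"text": "Advises seeking emergency care for red-flag symptoms", "points": 5},
    {"text": "Asks a clarifying question about symptom duration",    "points": 3},
    {"text": "States a specific diagnosis without adequate basis",   "points": -4},
]

def score(met: list[bool]) -> float:
    """`met[i]` stands in for a model-based judge's verdict on criterion i."""
    earned = sum(c["points"] for c, m in zip(criteria, met) if m)
    possible = sum(c["points"] for c in criteria if c["points"] > 0)
    return max(0.0, min(1.0, earned / possible))

print(score([True, True, False]))  # 8/8 -> 1.0
print(score([True, False, True]))  # (5 - 4) / 8 -> 0.125
```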
The most provocative finding from the HealthBench evaluation is that the newest AI models are performing at or beyond the level of human experts in crafting responses to medical queries. Earlier tests from September 2024 showed that doctors could improve AI outputs by editing them, scoring higher than doctors working without AI. However, with the latest April 2025 models, like o3 and GPT-4.1, physicians using these AI responses as a base, on average, did not further improve them. This suggests that for the specific task of generating HealthBench responses, the newest AI matches or exceeds the capabilities of human experts, even with a strong AI starting point. In related news, FaceAge, a face-reading AI tool developed by researchers at Mass General Brigham, demonstrates promising abilities in predicting cancer outcomes. By analyzing facial photographs, FaceAge estimates a person's biological age and can predict cancer survival with an impressive 81% accuracy rate. This outperforms clinicians in predicting short-term life expectancy, especially for patients receiving palliative radiotherapy. FaceAge identifies subtle facial features associated with aging and provides a quantifiable measure of biological aging that correlates with survival outcomes and health risks, offering doctors more objective and precise survival estimates. Recommended read:
References :
@thetechbasic.com
//
Microsoft has announced major layoffs affecting approximately 6,000 employees, which is equivalent to 3% of its global workforce. This move is part of a broader strategic shift aimed at streamlining operations and boosting the company's focus on artificial intelligence (AI) and cloud computing. The layoffs are expected to impact various divisions, including LinkedIn, Xbox, and overseas offices. The primary goal of the restructuring is to position Microsoft for success in a "dynamic marketplace" by reducing management layers and increasing agility.
The decision to implement these layoffs comes despite Microsoft reporting strong financial results for FY25 Q3, with $70.1 billion in revenue and a net income of $25.8 billion. According to Microsoft CFO Amy Hood, the company is focused on “building high-performing teams and increasing our agility by reducing layers with fewer managers.” The cuts also align with a recurring trend across the industry, with firms eliminating staff who do not meet expectations. Microsoft's move to prioritize AI investments is costing the company a significant number of jobs. Microsoft is following the lead of other technology companies investing heavily in AI; it has been pouring billions into AI tools and cloud services. Its cloud service, Azure, is expanding rapidly, and the company aims to inject more money into this area. Recommended read:
References :
@cyberalerts.io
//
A new malware campaign is exploiting the hype surrounding artificial intelligence to distribute the Noodlophile Stealer, an information-stealing malware. Morphisec researcher Shmuel Uzan discovered that attackers are enticing victims with fake AI video generation tools advertised on social media platforms, particularly Facebook. These sites masquerade as legitimate AI services for creating videos, logos, images, and even websites, attracting users eager to leverage AI for content creation.
Posts promoting these fake AI tools have garnered significant attention, with some reaching over 62,000 views. Users who click on the advertised links are directed to bogus websites, such as one impersonating CapCut AI, where they are prompted to upload images or videos. Instead of receiving the promised AI-generated content, users are tricked into downloading a malicious ZIP archive named "VideoDreamAI.zip," which contains an executable file designed to initiate the infection chain. The "Video Dream MachineAI.mp4.exe" file within the archive launches a legitimate binary associated with ByteDance's CapCut video editor, which is then used to execute a .NET-based loader. This loader, in turn, retrieves a Python payload from a remote server, ultimately leading to the deployment of the Noodlophile Stealer. This malware is capable of harvesting browser credentials, cryptocurrency wallet information, and other sensitive data. In some instances, the stealer is bundled with a remote access trojan like XWorm, enabling attackers to gain entrenched access to infected systems. Recommended read:
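One defensive takeaway is mechanical: a lure like "Video Dream MachineAI.mp4.exe" relies on a deceptive double extension. A small sketch of flagging that pattern when inspecting an archive, using only the Python standard library; this is an illustrative heuristic, not a substitute for real antivirus tooling.

```python
# Defensive illustration: flag deceptive double extensions (e.g. ".mp4.exe")
# inside a ZIP before anything is opened. A heuristic only, not real AV.
import zipfile
from pathlib import PurePosixPath

MEDIA = {".mp4", ".avi", ".jpg", ".png", ".pdf", ".mp3"}
EXECUTABLE = {".exe", ".scr", ".bat", ".cmd", ".js", ".vbs"}

def suspicious_members(zip_path: str) -> list[str]:
    flagged = []
    with zipfile.ZipFile(zip_path) as zf:
        for name in zf.namelist():
            suffixes = [s.lower() for s in PurePosixPath(name).suffixes]
            if (len(suffixes) >= 2 and suffixes[-1] in EXECUTABLE
                    and suffixes[-2] in MEDIA):
                flagged.append(name)
    return flagged

# For the archive named in this campaign, this would report the disguised
# executable, e.g. ['Video Dream MachineAI.mp4.exe'].
print(suspicious_members("VideoDreamAI.zip"))
```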
References :
@syncedreview.com
//
DeepSeek AI has unveiled DeepSeek-Prover-V2, a new open-source large language model (LLM) designed for formal theorem proving within the Lean 4 environment. This model advances the field of neural theorem proving by utilizing a recursive theorem-proving pipeline and leverages DeepSeek-V3 to generate high-quality initialization data. DeepSeek-Prover-V2 has achieved top results on the MiniF2F benchmark, showcasing its state-of-the-art performance in mathematical reasoning. The release includes ProverBench, a new benchmark for evaluating mathematical reasoning capabilities.
DeepSeek-Prover-V2 features a unique cold-start training procedure. The process begins by using the DeepSeek-V3 model to decompose complex mathematical theorems into a series of more manageable subgoals. Simultaneously, DeepSeek-V3 formalizes these high-level proof steps in Lean 4, creating a structured sequence of sub-problems. To handle the computationally intensive proof search for each subgoal, the researchers employed a smaller 7B parameter model. Once all the decomposed steps of a challenging problem are successfully proven, the complete step-by-step formal proof is paired with DeepSeek-V3’s corresponding chain-of-thought reasoning. This allows the model to learn from a synthesized dataset that integrates both informal, high-level mathematical reasoning and rigorous formal proofs, providing a strong cold start for subsequent reinforcement learning. Building upon the synthetic cold-start data, the DeepSeek team curated a selection of challenging problems that the 7B prover model couldn’t solve end-to-end, but for which all subgoals had been successfully addressed. By combining the formal proofs of these subgoals, a complete proof for the original problem is constructed. This formal proof is then linked with DeepSeek-V3’s chain-of-thought outlining the lemma decomposition, creating a unified training example of informal reasoning followed by formalization. DeepSeek is also challenging the long-held belief of tech CEOs who've argued that exponential AI improvements require ever-increasing computing power. DeepSeek claims to have produced models comparable to OpenAI, but with significantly less compute and cost, questioning the necessity of massive scale for AI advancement. Recommended read:
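The decomposition pattern, proving named subgoals and then combining them to close the main goal, is native to Lean 4. A toy example of the shape (nothing to do with DeepSeek's actual proofs; it uses only core lemmas):

```lean
-- Toy Lean 4 illustration of subgoal decomposition: prove small `have`
-- lemmas first, then combine them to close the main goal. This mirrors the
-- structure described above, not any proof from DeepSeek-Prover-V2.
theorem demo (a b : Nat) : (a + b) * 2 = 2 * (b + a) := by
  have h1 : a + b = b + a := Nat.add_comm a b                     -- subgoal 1
  have h2 : (b + a) * 2 = 2 * (b + a) := Nat.mul_comm (b + a) 2   -- subgoal 2
  rw [h1, h2]
```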
References :
@www.eweek.com
//
Apple is exploring groundbreaking technology to enable users to control iPhones, iPads, and Vision Pro headsets with their thoughts, marking a significant leap towards hands-free device interaction. The company is partnering with Synchron, a brain-computer interface (BCI) startup, to develop a universal standard for translating neural activity into digital commands. This collaboration aims to empower individuals with disabilities, such as ALS and severe spinal cord injuries, allowing them to navigate and operate their devices without physical gestures.
Apple's initiative involves Synchron's Stentrode, a stent-like implant placed in a vein near the brain's motor cortex. This device picks up neural activity and translates it into commands, enabling users to select icons on a screen or navigate virtual environments. The brain signals work in conjunction with Apple's Switch Control feature, a part of its operating system designed to support alternative input devices. While early users have noted the interface is slower compared to traditional methods, Apple plans to introduce a dedicated software standard later this year to simplify the development of BCI tools and improve performance. In addition to BCI technology, Apple is also focusing on enhancing battery life in future iPhones through artificial intelligence. The upcoming iOS 19 is expected to feature an AI-powered battery optimization mode that learns user habits and manages app energy usage accordingly. This feature is particularly relevant for the iPhone 17 Air, where it will help offset the impact of a smaller battery. Furthermore, Apple is reportedly exploring the use of advanced memory technology and innovative screen designs for its 20th-anniversary iPhone in 2027, aiming for faster AI processing and extended battery life. Recommended read:
References :
@owaspai.org
//
References: OWASP, Bernard Marr
The Open Worldwide Application Security Project (OWASP) is actively shaping the future of AI regulation through its AI Exchange project. This initiative fosters collaboration between the global security community and formal standardization bodies, driving the creation of AI security standards designed to protect individuals and businesses while encouraging innovation. By establishing a formal liaison with international standardization organizations like CEN/CENELEC, OWASP is enabling its vast network of security professionals to directly contribute to the development of these crucial standards, ensuring they are practical, fair, and effective.
OWASP's influence is already evident in the development of key AI security standards, notably impacting the AI Act, a European Commission initiative. Through the contributions of experts like Rob van der Veer, who founded the OWASP AI Exchange, the project has provided significant input to ISO/IEC 27090, the global standard on AI security guidance. The OWASP AI Exchange serves as an open-source platform where experts collaborate to shape these global standards, ensuring a balance between strong security measures and the flexibility needed to support ongoing innovation. The OWASP AI Exchange provides over 200 pages of practical advice and references on protecting AI and data-centric systems from threats. This resource serves as a bookmark for professionals and actively contributes to international standards, demonstrating the consensus on AI security and privacy through collaboration with key institutes and Standards Development Organizations (SDOs). The foundation of OWASP's approach lies in risk-based thinking, tailoring security measures to specific contexts rather than relying on a one-size-fits-all checklist, addressing the critical need for clear guidance and effective regulation in the rapidly evolving landscape of AI security. Recommended read:
References :
@engineering.fb.com
//
References: Engineering at Meta, Stack Overflow Blog
Microsoft is making significant moves in the realm of agentic AI and open-source technologies. In a strategic shift to invest more in AI-centric solutions and streamline operations, the company has been actively involved in initiatives ranging from open-sourcing key development tools to integrating AI into enterprise platforms. A notable example of the wider industry trend is Meta's open-sourcing of Pyrefly, a faster Python type checker written in Rust, aimed at helping developers catch errors before runtime. This aligns with broader efforts across the industry to enhance developer productivity and contribute to the open-source community.
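The class of bug a static checker like Pyrefly catches before runtime is easy to show. The snippet below is ordinary annotated Python, not Pyrefly-specific code; any modern type checker would flag it.

```python
# An ordinary annotated-Python bug of the kind a static type checker such as
# Pyrefly reports before the program ever runs.

def average(values: list[float]) -> float:
    return sum(values) / len(values)

# A checker flags this call: `str` elements don't satisfy `list[float]`.
# Without static checking, the error only surfaces at runtime, inside
# sum(), as a TypeError.
print(average(["3.5", "4.0"]))
```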
Microsoft is also focusing on integrating AI into enterprise solutions, a push underscored by Salesforce CEO Marc Benioff's promotion of his own company's Agentforce platform. Microsoft is also cutting around 7,000 jobs, mostly in middle management and non-technical roles, a decision that reflects a strategic reallocation of resources towards AI infrastructure. Microsoft plans to invest heavily in data centers designed for training and running AI models, signaling a major push into AI-driven technologies. Despite strong earnings, the company is trimming its workforce to free up resources for AI investments. The layoffs, primarily affecting middle managers and support staff, are part of a broader industry trend where companies are streamlining operations to accelerate product cycles and reduce bureaucracy. And while Microsoft shuts off its Bing Search APIs and recommends AI-based alternatives, Twilio has partnered with Microsoft to expand AI capabilities. Recommended read:
References :
Ken Yeung@Ken Yeung
//
Salesforce, a US-based cloud software giant, has announced its acquisition of Convergence.ai, a London-based company specializing in AI agents for digital environments. This strategic move aims to significantly enhance Salesforce's Agentforce platform, a core component of its AI strategy. Convergence.ai, founded by Marvin Purtorab and Andy Toulis, develops AI agents capable of navigating dynamic digital systems and executing complex tasks, such as managing online workflows and multi-step processes, even when encountering issues like pop-ups or system errors. The acquisition is expected to close in Q2 of Salesforce’s fiscal year 2026, ending July 31.
The Convergence.ai team brings expertise in AI agent design, autonomous task execution, and adaptive systems, which will be instrumental in accelerating the development of advanced AI agents for Agentforce. Adam Evans, EVP & GM, Salesforce AI Platform, emphasized the importance of this acquisition, stating that "the next wave of customer interaction and employee productivity will be driven by highly capable AI agents that can navigate the complexities of today’s digital work." He further added that Salesforce is looking towards a future where Agentforce can empower customers with AI agents that can perceive, reason, and adapt to complex digital workflows. Salesforce is investing heavily in AI to empower Agentforce. Christophe Coenraets, SVP of Developer Relations at Salesforce, is building the new Salesforce Developer Edition, which includes access to the company’s agentic AI platform, Agentforce. A flexible pricing plan for Agentforce has also been announced. This plan includes flex credits, which come in packs of 100,000 for $500, to scale Agentforce across workflows, along with a Flex Agreement that enables companies to shift between user licenses and flex credits as needed. Recommended read:
References :
@learn.aisingapore.org
//
Anthropic's Claude 3.7 model is making waves in the AI community due to its enhanced reasoning capabilities, specifically through a "deep thinking" approach. This method utilizes chain-of-thought (CoT) techniques, enabling Claude 3.7 to tackle complex problems more effectively. This development represents a significant advancement in Large Language Model (LLM) technology, promising improved performance in a variety of demanding applications.
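In Anthropic's public API, this "deep thinking" is exposed as extended thinking with an explicit token budget. A minimal hedged sketch follows; the model id and budget values are assumptions to verify against current documentation.

```python
# Sketch: requesting extended "thinking" from Claude 3.7 Sonnet via the
# Anthropic SDK (pip install anthropic; ANTHROPIC_API_KEY must be set).
# Model id and token budgets are assumptions; check current docs.
import anthropic

client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-3-7-sonnet-20250219",   # assumed model id
    max_tokens=4096,                      # must exceed the thinking budget
    thinking={"type": "enabled", "budget_tokens": 2048},
    messages=[{"role": "user",
               "content": "A bat and ball cost $1.10 total; the bat costs "
                          "$1.00 more than the ball. What does the ball cost?"}],
)
for block in response.content:
    # Responses interleave "thinking" blocks with the final "text" answer.
    print(block.type)
```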
The implications of this enhanced reasoning are already being seen across different sectors. FloQast, for example, is leveraging Anthropic's Claude 3 on Amazon Bedrock to develop an AI-powered accounting transformation solution. The integration of Claude’s capabilities is assisting companies in streamlining their accounting operations, automating reconciliations, and gaining real-time visibility into financial operations. The model’s ability to handle the complexities of large-scale accounting transactions highlights its potential for real-world applications. Furthermore, recent reports highlight the competitive landscape where models like Mistral AI's Medium 3 are being compared to Claude Sonnet 3.7. These comparisons focus on balancing performance, cost-effectiveness, and ease of deployment. Simultaneously, Anthropic is also enhancing Claude's functionality by allowing users to connect more applications, expanding its utility across various domains. These advancements underscore the ongoing research and development efforts aimed at maximizing the potential of LLMs and addressing potential security vulnerabilities. Recommended read:
References :
@www.marktechpost.com
//
Windsurf, an AI coding startup reportedly on the verge of being acquired by OpenAI for a staggering $3 billion, has just launched SWE-1, its first in-house small language model specifically tailored for software engineering. This move signals a shift towards software engineering-native AI models, designed to tackle the complete software development workflow. With SWE-1, Windsurf aims to accelerate the entire practice of software engineering, not just code generation.
The SWE-1 family includes models like SWE-1-lite and SWE-1-mini, designed to perform tasks beyond generating code. Unlike general-purpose AI models adapted for coding, SWE-1 is built to address the entire spectrum of software engineering activities, including reviewing, committing, and maintaining code over time. Built to run efficiently on consumer hardware without relying on expensive cloud infrastructure, the models offer developers the freedom to adapt them as needed under a permissive license. SWE-1's key innovation lies in its "flow awareness," which enables the AI to understand and operate within the complete timeline of development work. Windsurf users have reported that existing coding models perform well under close user guidance but tend to miss things over longer stretches of work. The new models aim to support developers across multiple surfaces, incomplete work states, and the long-running tasks that characterize real-world software development. Recommended read:
References :
@learn.aisingapore.org
//
References: learn.aisingapore.org, www.sciencedaily.com
MIT researchers have uncovered a critical flaw in vision-language models (VLMs) that could have serious consequences in high-stakes environments like medical diagnosis. The study, published May 14, 2025, reveals that these AI models, widely used to analyze medical images, struggle with negation words such as "no" and "not." This deficiency causes them to misinterpret queries, leading to potentially catastrophic errors when retrieving images based on the absence of certain objects. In one example, a radiologist uses a VLM to find reports of patients with tissue swelling but without an enlarged heart; the model incorrectly retrieves reports showing both conditions, which could lead to an inaccurate diagnosis.
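The failure mode is easy to probe with any open vision-language model. A hedged sketch using CLIP through Hugging Face transformers; CLIP is a stand-in here, not necessarily one of the models MIT tested, and the image path is a placeholder.

```python
# Probing negation handling: score one image against a caption and its
# negated variant. If the model ignores "no", the two captions receive
# similar scores. Illustrative, not the MIT study's code.
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("chest_xray.png")  # placeholder image path
captions = ["tissue swelling with an enlarged heart",
            "tissue swelling with no enlarged heart"]
inputs = processor(text=captions, images=image,
                   return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image  # shape (1, 2)
probs = logits.softmax(dim=-1)
print(dict(zip(captions, probs[0].tolist())))
```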
Researchers tested the ability of vision-language models to identify negation in image captions and found the models often performed as well as a random guess. To address this issue, the MIT team created a dataset of images with corresponding captions that include negation words describing missing objects. Retraining a vision-language model with this dataset resulted in improved performance when retrieving images that do not contain specific objects, and also boosted accuracy on multiple choice question answering with negated captions. Kumail Alhamoud, the lead author of the study, emphasized the significant impact of negation words and the potential for catastrophic consequences if these models are used blindly. While the researchers were able to improve model performance through retraining, they caution that more work is needed to address the root causes of this problem. They hope their findings will alert potential users to this previously unnoticed shortcoming, especially in settings where these models are used to determine patient treatments or identify product defects. Marzyeh Ghassemi, the senior author, warned against using large vision/language models without intensive evaluation if something as fundamental as negation is broken. Recommended read:
References :
Kevin Okemwa@windowscentral.com
//
References: engineering.fb.com, www.windowscentral.com
Microsoft is strategically prioritizing AI model accessibility through Azure, with CEO Satya Nadella emphasizing that making AI solutions broadly available to customers is the route to maximum profit. This approach involves internal restructuring, including job cuts, to facilitate increased investment in AI and streamline operations. The goal is to build a robust, subscription-based AI operating system that leverages advancements like ChatGPT, ensuring that Microsoft remains competitive in the rapidly evolving AI landscape.
Microsoft is actively working on improving integrations with external data sources using the Model Context Protocol (MCP). This initiative has led to a collaboration with Twilio to enhance conversational AI capabilities for enterprise customer communication. Twilio's technology helps deliver the "last mile" of AI conversations, enabling businesses to integrate Microsoft's conversational intelligence capabilities into their existing communication channels. This partnership gives Twilio greater visibility among Microsoft's enterprise customers, exposing its developer tools to large firms looking to build extensible custom communication solutions. In related open-source news, Meta has released Pyrefly, a faster Python type checker written in Rust. Developed initially at Meta for Instagram's codebase, Pyrefly is now available for the broader Python community to use, helping developers catch errors before runtime. The release signals the industry's continued investment in developer tooling alongside AI technologies. Recommended read:
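MCP integrations are typically built as small tool servers that expose data sources to a model. A hedged sketch with the reference Python SDK; the import path and `FastMCP` helper reflect the `mcp` package as I understand it and should be treated as assumptions, and the inventory tool is invented.

```python
# Sketch: a tiny MCP tool server exposing one data-source tool, using the
# reference Python SDK (pip install mcp). Import path and helper names are
# assumptions to verify against the SDK's current docs.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("inventory")

@mcp.tool()
def lookup_sku(sku: str) -> str:
    """Return stock information for a SKU (hypothetical data source)."""
    fake_db = {"A-100": "12 units in warehouse 3"}
    return fake_db.get(sku, "unknown SKU")

if __name__ == "__main__":
    mcp.run()  # serves the tool over MCP (stdio transport by default)
```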
References :
@www.aiwire.net
//
References: www.aiwire.net, BigDATAwire
SAS is making a significant push towards accountable AI agents, emphasizing ethical oversight and governance within its SAS Viya platform. At SAS Innovate 2025 in Orlando, the company outlined its vision for intelligent decision automation, highlighting its long-standing work in this area. Unlike other tech vendors focused on quantity, SAS CTO Bryan Harris stresses the importance of decision quality, arguing that the value of decisions to the business is the key metric. SAS defines agentic AI as systems that blend reasoning, analytics, and embedded governance to make autonomous decisions with transparency and human oversight when needed.
SAS views Large Language Models (LLMs) as valuable but limited components within a broader AI ecosystem. Udo Sglavo, VP of applied AI and modeling R&D at SAS, describes the agentic AI push as a natural evolution from the company's consulting-driven past. SAS aims to take its extensive IP from solving similar challenges repeatedly and incorporate it into software products. This shift from services to scalable solutions is accelerated by increased customer comfort with prepackaged models, leading to wider adoption of agent-based systems. SAS emphasizes that LLMs are only one piece of a larger entity, stating that decision quality and ethical considerations are paramount. Bryan Harris noted that LLMs can be unpredictable, which makes them unsuitable for high-stakes applications where auditability and control are critical. The focus on accountable AI agents ensures that enterprises can deploy AI systems that act autonomously while maintaining the necessary transparency and oversight. Recommended read:
References :
LangChain@LangChain Blog
//
References: LangChain Blog, www.marktechpost.com
The LangGraph Platform, an infrastructure solution designed for deploying and managing AI agents at scale, has announced its general availability. This platform aims to streamline the complexities of agent deployment, particularly for long-running, stateful agents. It offers features like one-click deployment, a suite of API endpoints for creating customized user experiences, horizontal scaling to manage traffic surges, and a persistence layer to maintain memory and conversational history. The platform also includes Native LangGraph Studio, an agent IDE, to facilitate debugging, visibility, and iterative improvements in agent development.
The LangGraph Platform addresses challenges associated with running agents in production environments. Many AI agents are long-running, prone to failures, and require durable infrastructure to ensure task completion. Additionally, agents often rely on asynchronous collaboration, such as interacting with humans or other agents, requiring infrastructure that can handle unpredictable events and preserve state. LangGraph Platform aims to alleviate these concerns by providing the necessary server infrastructure to support these workloads at scale. The platform also boasts a native GitHub integration for simplified one-click deployment from repositories. Alongside the LangGraph Platform, the "LangGraph Multi-Agent Swarm" has been released, a Python library designed to orchestrate multiple AI agents. This library builds upon the LangGraph framework, enabling the creation of multi-agent systems where specialized agents dynamically hand off control based on task demands. This system tracks the active agent, ensuring seamless continuation of conversations even when users provide input at different times. The library offers features like streaming responses, memory integration, and human-in-the-loop intervention, allowing developers to build complex AI agent systems with explicit control over information flow and decisions. Recommended read:
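The handoff idea behind the swarm library can be sketched on core LangGraph primitives. The sketch below is not the Swarm library's internals: it shows two agent nodes and a state field tracking the active agent, with routing driven by that field; the agent logic is stubbed.

```python
# A minimal sketch (not the Swarm library's internals) of agent handoff on
# core LangGraph (pip install langgraph): two agent nodes, a state field
# tracking the active agent, and conditional routing on that field.
from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class State(TypedDict):
    messages: list
    active_agent: str

def triage_agent(state: State) -> State:
    # Hand off to the specialist when the task demands it (always, here).
    return {"messages": state["messages"] + ["triage: routing to specialist"],
            "active_agent": "specialist"}

def specialist_agent(state: State) -> State:
    return {"messages": state["messages"] + ["specialist: done"],
            "active_agent": "specialist"}

builder = StateGraph(State)
builder.add_node("triage", triage_agent)
builder.add_node("specialist", specialist_agent)
builder.add_edge(START, "triage")
builder.add_conditional_edges(
    "triage",
    lambda s: s["active_agent"],  # route on the tracked active agent
    {"specialist": "specialist", "triage": END},
)
builder.add_edge("specialist", END)
graph = builder.compile()

print(graph.invoke({"messages": ["user: help"], "active_agent": "triage"}))
```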
References :
@www.webroot.com
//
References: www.eweek.com, www.webroot.com
Cybercriminals are increasingly using sophisticated tactics to deceive individuals and steal sensitive information. One common method involves sending fraudulent text messages, known as smishing, that impersonate legitimate businesses like delivery services or banks. These scams often entice victims to click on malicious links, leading to identity theft, financial loss, or the installation of malware. Webroot emphasizes mobile security, particularly protecting phones from text scams that can lead to identity theft and malware infection. The Federal Trade Commission reported that consumers lost $470 million to scams initiated through text messages in 2024.
Google is intensifying its efforts to combat these online threats by integrating artificial intelligence across its various platforms. The company is leveraging AI in Search, Chrome, and Android to identify and block scam attempts more effectively. Google's AI-powered defenses are capable of detecting 20 times more scam pages than before, significantly improving the quality of search results. Furthermore, AI is used to identify fraudulent websites, app notifications, calls, and direct messages, helping to safeguard users from various scam tactics. A key component of Google's enhanced protection is the integration of Gemini Nano, a lightweight, on-device AI model, into Chrome. This allows for instant identification of scams, even those that haven't been previously encountered. When a user navigates to a potentially dangerous page, Chrome evaluates the page using Gemini Nano, which extracts security signals to determine the intent of the page. This information is then sent to Safe Browsing for a final verdict, adding an extra layer of protection against evolving online threats. Recommended read:
References :
Rowan Cheung@The Rundown AI
//
References: www.lifewire.com, The Rundown AI
Google is significantly broadening the reach of its Gemini AI, integrating it into a variety of devices beyond smartphones. This expansion includes Wear OS smartwatches, such as the Samsung Galaxy Watch 7 and Google Pixel Watch 3, Google TV, Android Auto, and even upcoming XR headsets like the Samsung XR headset. The move aims to create a more consistent and accessible AI experience across a user's digital life, allowing for interactions and assistance regardless of the device in use. This positions Gemini as a potential central AI layer connecting various devices within the Android ecosystem.
This integration allows users to interact with Gemini through voice commands on smartwatches, eliminating the need to take out their phones. Gemini will also connect with phone apps, providing relevant information from emails or texts directly on the user's wrist. In Android Auto, Gemini will manage in-car requests like finding destinations and reading messages. On Google TV, Gemini will recommend content and answer questions. The focus is on making AI assistance more readily available and naturally integrated into daily routines. The expansion of Gemini to more devices is viewed as a strategic move in the competitive AI assistant landscape. While other companies like Apple have been slower to integrate advanced AI into consumer products, Google aims to accelerate the adoption of AI-powered capabilities by embedding Gemini across its ecosystem. The widespread integration aims to give Google an edge by providing a seamless AI experience, potentially increasing user engagement and loyalty across its range of products. The majority of Gemini’s most helpful features, such as Gemini Live’s camera and screen sharing capabilities, will be available to “billions of Android devices,” with no subscription required. Recommended read:
References :
@siliconangle.com
//
References: siliconangle.com, thequantuminsider.com
SAS and Intel are collaborating to redefine AI architecture through optimized intelligence, moving away from a GPU-centric approach. This partnership focuses on aligning hardware and software roadmaps to deliver smarter performance, lower costs, and greater trust across various environments. Optimized intelligence allows businesses to tailor their AI infrastructure to specific use cases, which ensures efficient and ethical AI practices with human-centered design, instilling greater confidence in real-world outcomes. SAS and Intel have a 25-year relationship built around this concept, with deep investments in technical alignment to ensure hardware and software co-evolve.
SAS is integrating Intel's silicon innovations, such as AMX acceleration and Gaudi GPUs, into its Viya platform to provide cost-effective performance. This collaboration enables clients to deploy advanced models without overspending on infrastructure, with Viya demonstrating significant performance improvements on the latest Intel platforms. The company is also working with companies like Procter & Gamble and quantum hardware providers including D-Wave, IBM, and QuEra to develop hybrid quantum-classical solutions for real-world problems across industries like life sciences, finance, and manufacturing. A recent global SAS survey revealed that over 60% of business leaders are actively investing in or exploring quantum AI, although concerns remain regarding high costs, a lack of understanding, and unclear use cases. SAS aims to make quantum AI more accessible by working on pilot projects and research, providing guidance to businesses on applying quantum technologies. SAS Principal Quantum Architect Bill Wisotsky states that quantum technologies allow companies to analyze more data and achieve fast answers to complex questions, and SAS wants to simplify this research for its customers. Recommended read:
References :
@www.aiwire.net
//
References: AIwire, www.aiwire.net
The Quantum Economic Development Consortium (QED-C) has released a report detailing the potential synergies between Quantum Computing (QC) and Artificial Intelligence (AI). The report, based on a workshop, highlights how these two technologies can work together to solve problems currently beyond the reach of classical computing. AI could be used to accelerate circuit design, application development, and error correction in QC. Conversely, QC offers the potential to enhance AI models by efficiently solving complex optimization and probabilistic tasks, which are infeasible for classical systems.
A hybrid approach, integrating the strengths of classical AI methods with QC algorithms, is expected to substantially reduce algorithmic complexity and improve the efficiency of computational processes and resource allocation. The report identifies key areas where this integration can yield significant benefits, including chemistry, materials science, logistics, energy, and environmental modeling. The applications could range from predicting high-impact weather events to improving the modeling of chemical reactions for pharmaceutical advancements. The report also acknowledges the necessity of cross-industry collaboration, expanded academic research, and increased federal support to advance QC + AI development. Celia Merzbacher, Executive Director of QED-C, emphasized the importance of collaboration between industry, academia, and governments to maximize the potential of these technologies. A House Science Committee hearing is scheduled to assess the progress of the National Quantum Initiative, underscoring the growing importance of quantum technologies in the U.S. Recommended read:
References :