@zdnet.com
//
Meta CEO Mark Zuckerberg is spearheading a new initiative to develop artificial general intelligence (AGI), personally recruiting top AI researchers to form an elite team. The push aims to create AI systems capable of performing any intellectual task a human can, positioning Meta to compete directly with tech giants like Google and OpenAI. Zuckerberg's hands-on recruitment underscores the priority Meta is placing on the project and signals a significant shift as the company seeks to lead in the rapidly evolving AI landscape.
Disappointment with the performance of Meta's Llama 4 model compared to competitors like OpenAI's GPT-4 and Google's Gemini spurred Zuckerberg's increased focus on AGI. Internally, Llama 4 was considered inadequate in real-world use, lacking coherence and usability. Meta's metaverse investments have also not yielded the anticipated results, leading the company to redirect its focus and resources toward AI in a bid to recapture relevance and mindshare in the tech industry. With tens of billions already invested in infrastructure and foundational models, Meta is now fully committed to achieving AGI. To bolster those ambitions, Meta has invested a reported $14.3 billion in AI start-up Scale AI, acquiring a 49% stake; the deal prompted Google to end its $200 million partnership with Scale AI. Zuckerberg has also offered large salaries to poach AI talent, part of a broader strategy to build superintelligence and challenge the dominance of other AI leaders. Meta's aggressive pursuit of talent and strategic investments highlight its determination to become a frontrunner in the race to build AGI. Recommended read:
References :
@siliconangle.com
//
OpenAI is facing increased scrutiny over its data retention policies following a recent court order in the high-profile copyright lawsuit filed by The New York Times in 2023. The lawsuit alleges that OpenAI and Microsoft Corp. used millions of the Times' articles without permission to train their AI models, including ChatGPT, and that ChatGPT reproduced Times content verbatim without attribution. As a result, OpenAI has been ordered to retain all ChatGPT logs, including deleted conversations, indefinitely, to ensure that potentially relevant evidence is not destroyed. The move has sparked debate over user privacy and data security.
OpenAI COO Brad Lightcap announced that while users' deleted ChatGPT prompts and responses are typically erased after 30 days, that practice has been suspended to comply with the court order. The retention policy affects users of ChatGPT Free, Plus, and Pro, as well as users of OpenAI's application programming interface (API), but not those using the Enterprise or Edu editions or those with a Zero Data Retention agreement. The company asserts that the retained data will be stored separately in a secure system accessible only by a small, audited OpenAI legal and security team, solely to meet legal obligations. The court order was granted within one day of the Times' request, owing to concerns that users might delete chats if they were using ChatGPT to bypass paywalls. OpenAI CEO Sam Altman has voiced strong opposition, calling the order an "inappropriate request" and stating that OpenAI will appeal. He argues that AI interactions should receive privacy protections similar to conversations with a lawyer or doctor, suggesting the need for "AI privilege." The company has also expressed concern about its ability to comply with the European Union's General Data Protection Regulation (GDPR), which grants users the right to be forgotten. Altman pledged to fight any demand that compromises user privacy, which he calls a core principle, promising that the company will fight to protect customers' privacy at every step if the plaintiffs continue to push for access. Recommended read:
References :
Megan Crouse@eWEEK
//
OpenAI's ChatGPT is expanding its reach with new integrations, allowing users to connect directly to tools like Google Drive and Dropbox. This update allows ChatGPT to access and analyze data from these cloud storage services, enabling users to ask questions and receive summaries with cited sources. The platform is positioning itself as a user interface for data, offering one-click access to files, effectively streamlining the search process for information stored across various documents and spreadsheets. In addition to cloud connectors, ChatGPT has also introduced a "Record" feature for Team accounts that can record meetings, generate summaries, and offer action items.
These new features for ChatGPT come with data privacy considerations. OpenAI states that files accessed through the Google Drive or Dropbox connectors are not used to train its models for ChatGPT Team, Enterprise, and Education accounts, but concerns remain about data usage for free users and ChatGPT Plus subscribers. OpenAI confirms that audio recorded by the Record tool is deleted immediately after transcription, that transcripts are subject to workspace retention policies, and that content from Team, Enterprise, and Edu workspaces, including audio recordings and transcripts, is excluded from model training by default. Meanwhile, Reddit has filed a lawsuit against Anthropic, alleging the AI company scraped Reddit's data without permission to train its Claude AI models. Reddit accuses Anthropic of accessing its servers over 100,000 times after promising to stop scraping, and claims Anthropic intentionally trained on the personal data of Reddit users without their consent. Reddit has licensing deals with OpenAI and Google, but Anthropic has no such deal. Reddit seeks an injunction forcing Anthropic to stop using any Reddit data immediately, and is also asking the court to prohibit Anthropic from selling or licensing any product built with that data. Despite these controversies, Microsoft CEO Satya Nadella has stated that Microsoft profits from every ChatGPT usage, highlighting the success of its investment in OpenAI. Recommended read:
References :
Ashutosh Singh@The Tech Portal
//
Google has launched AI Edge Gallery, an open-source platform aimed at developers who want to deploy AI models directly on Android devices. This new platform allows for on-device AI execution using tools like LiteRT and MediaPipe, supporting models from Hugging Face. With future support for iOS planned, AI Edge Gallery emphasizes data privacy and low latency by eliminating the need for cloud connectivity, making it ideal for industries that require local processing of sensitive data.
The AI Edge Gallery app, released under the Apache 2.0 license and hosted on GitHub, is currently an experimental Alpha release. The app integrates Gemma 3 1B, a compact 529MB language model capable of processing up to 2,585 tokens per second on mobile GPUs, enabling tasks like text generation and image analysis in under a second. Through Google’s AI Edge platform, developers can leverage tools like MediaPipe and TensorFlow Lite to optimize model performance on mobile devices, and the company is actively seeking feedback from developers and users. AI Edge Gallery contains categories like ‘AI Chat’ and ‘Ask Image’ to guide users to relevant tools, as well as a ‘Prompt Lab’ for testing and refining prompts. On-device processing means complex AI tasks can be performed without transmitting data to external servers, reducing potential security risks and improving response times. While newer devices with high-performance chips run models smoothly, older phones may experience lag. Google also plans to launch the app on iOS soon.
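As a rough illustration of the on-device flow these tools implement, the sketch below runs a TensorFlow Lite model from Python. It is a desktop stand-in for what LiteRT and MediaPipe do on the phone, not code from the Gallery app itself, and "model.tflite" is a placeholder path.

```python
# Minimal sketch of local (no-cloud) inference with the TFLite interpreter.
# The model file is a placeholder; any .tflite model will do.
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Feed a dummy input matching the model's expected shape and dtype;
# everything runs locally, so no data leaves the machine.
x = np.zeros(inp["shape"], dtype=inp["dtype"])
interpreter.set_tensor(inp["index"], x)
interpreter.invoke()
print(interpreter.get_tensor(out["index"]).shape)
```

Recommended read: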
References :
Ashutosh Singh@The Tech Portal
//
Google has launched the 'AI Edge Gallery' app for Android, with plans to extend it to iOS soon. This innovative app enables users to run a variety of AI models locally on their devices, eliminating the need for an internet connection. The AI Edge Gallery integrates models from Hugging Face, a popular AI repository, allowing for on-device execution. This approach not only enhances privacy by keeping data on the device but also offers faster processing speeds and offline functionality, which is particularly useful in areas with limited connectivity.
The app uses Google’s AI Edge platform, which includes tools like MediaPipe and TensorFlow Lite, to optimize model performance on mobile devices. A key model utilized is Gemma 3 1B, a compact language model designed for mobile platforms that can process data rapidly. The AI Edge Gallery features an interface with categories like ‘AI Chat’ and ‘Ask Image’ to help users find the right tools, plus a ‘Prompt Lab’ for experimenting with and refining prompts. Google emphasizes that the AI Edge Gallery is currently an experimental Alpha release and is encouraging user feedback. The app is open-source under the Apache 2.0 license, allowing free use, including for commercial purposes, though performance varies with hardware: newer phones with advanced processors run models smoothly, while older devices might experience lag, particularly with larger models.

In related news, Google Cloud has introduced advancements to BigLake, its storage engine for building open data lakehouses on Google Cloud that are compatible with Apache Iceberg. The enhancements aim to remove the trade-off between open-format flexibility and high-performance, enterprise-grade storage management. They include open interoperability across analytical and transactional systems, with the BigLake metastore providing the foundation for accessing Cloud Storage and BigQuery data across multiple runtimes, including BigQuery, AlloyDB (preview), and open-source, Iceberg-compatible engines such as Spark and Flink; and new, high-performance Iceberg-native Cloud Storage, which simplifies lakehouse management with automatic table maintenance (including compaction and garbage collection) and integration with Google Cloud Storage management tools such as auto-class tiering and encryption.
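For concreteness, here is a minimal sketch of what creating an Iceberg-format BigLake table can look like using the google-cloud-bigquery Python client. The project, dataset, connection, and bucket names are hypothetical placeholders, and the exact DDL options should be verified against current BigQuery documentation.

```python
# Hedged sketch: create a BigLake table stored in Iceberg format on
# Cloud Storage, so Iceberg-compatible engines (Spark, Flink) can read it.
# All resource names below are illustrative placeholders.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

ddl = """
CREATE TABLE `my-project.lakehouse.events` (
  event_id STRING,
  ts TIMESTAMP
)
WITH CONNECTION `my-project.us.lake-connection`
OPTIONS (
  file_format = 'PARQUET',
  table_format = 'ICEBERG',
  storage_uri = 'gs://my-lakehouse-bucket/events'
);
"""

client.query(ddl).result()  # blocks until the DDL job completes
```

Recommended read: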
References :
@www.quantamagazine.org
//
References :
finance.yahoo.com, Quanta Magazine
Researchers are exploring innovative methods to enhance the performance of artificial intelligence language models by minimizing their reliance on direct language processing. This approach involves enabling models to operate more within mathematical or "latent" spaces, reducing the need for constant translation between numerical representations and human language. Studies suggest that processing information directly in these spaces can improve efficiency and reasoning capabilities, as language can sometimes constrain and diminish the information retained by the model. By sidestepping the traditional language-bound processes, AI systems may achieve better results by "thinking" independently of linguistic structures.
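A toy sketch of the contrast, assuming a recurrent cell as a stand-in for a transformer block (all names here are illustrative, not from the cited research): the first loop decodes every step to a discrete token and re-embeds it, while the second feeds the hidden state straight back, skipping the lossy vocabulary round-trip.

```python
import torch
import torch.nn as nn

vocab, d = 100, 32
embed = nn.Embedding(vocab, d)
cell = nn.GRUCell(d, d)          # stand-in for a transformer block
unembed = nn.Linear(d, vocab)

x = embed(torch.tensor([7]))     # initial prompt token
h = torch.zeros(1, d)

# Language-bound loop: project to the vocabulary, pick a token, re-embed.
for _ in range(4):
    h = cell(x, h)
    tok = unembed(h).argmax(-1)  # lossy: d floats collapse to 1 symbol
    x = embed(tok)

# Latent loop: keep "thinking" in the continuous space instead.
for _ in range(4):
    h = cell(x, h)
    x = h                        # full hidden state carried forward intact
```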
Meta has announced plans to resume training its AI models using publicly available content from European users. The move aims to improve the capabilities of Meta's AI systems by leveraging a vast dataset of user-generated information, and comes after a period of suspension prompted by data privacy concerns raised by activist groups. Meta emphasizes that the training will use public posts and comments shared by adult users within the European Union, as well as user interactions with Meta AI, such as questions and queries, to enhance model accuracy and overall performance.

Separately, a new method has been developed to efficiently safeguard sensitive data used in AI model training, reducing the traditional trade-off between privacy and accuracy. The framework maintains an AI model's performance while preventing attackers from extracting confidential information, such as medical images or financial records. By focusing on the stability of algorithms and using a metric called PAC Privacy, researchers have shown that almost any algorithm can be privatized without access to its internal workings, potentially making privacy more accessible and less computationally expensive in real-world applications.
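A minimal sketch of the PAC-Privacy-style recipe described above, under the assumption that the learner is a black box: re-run it on resampled data to measure how unstable its output is, then add Gaussian noise scaled to that instability. The `algorithm` and `noise_multiplier` names are illustrative stand-ins, not the researchers' actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def algorithm(data):
    # Black-box learner; here simply the per-coordinate mean.
    return data.mean(axis=0)

def pac_privatize(data, n_trials=100, noise_multiplier=1.0):
    n = len(data)
    # Estimate per-coordinate output spread over random half-samples:
    # stable algorithms show little spread and thus need little noise.
    outputs = np.stack([
        algorithm(data[rng.choice(n, n // 2, replace=False)])
        for _ in range(n_trials)
    ])
    spread = outputs.std(axis=0)
    # Release the true output plus noise proportional to that spread.
    return algorithm(data) + rng.normal(0.0, noise_multiplier * spread)

data = rng.normal(size=(1000, 5))
print(pac_privatize(data))
```

Recommended read: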
References :
@www.thecanadianpressnews.ca
//
Meta is resuming its AI training program using public content shared by adult users in the European Union. This decision follows earlier delays due to regulatory concerns and aims to improve the understanding of European cultures, languages, and history within Meta's AI models. The data utilized will include public posts and comments from platforms like Facebook and Instagram, helping the AI to better reflect the nuances and complexities of European communities. Meta believes this is crucial for developing AI that is not only available to Europeans but is specifically tailored for them.
Meta will begin notifying EU users this week through in-app notifications and email, explaining the types of data it plans to use and how that data will enhance AI functionality and the overall user experience. The notifications will include a direct link to an objection form, allowing users to easily opt out of having their data used for AI training, and Meta says it will honor all objection forms, both those previously received and any new submissions. This approach aims to balance AI development with individual privacy rights under the EU's stringent data privacy rules.

The move comes after Meta previously shelved its European AI rollout over privacy concerns about its AI tools, and the company still faces legal challenges related to the use of copyright-protected material in its large language model development. Meta maintains that access to EU user data is essential for localizing its AI tools, enabling them to understand everything from dialects and colloquialisms to hyper-local knowledge and unique cultural expressions like humor and sarcasm. Without this data, Meta argues, the region risks being left behind as AI models become more advanced and multi-modal. Recommended read:
References :
Ashutosh Singh@The Tech Portal
//
Apple is enhancing its AI capabilities, known as Apple Intelligence, by employing synthetic data and differential privacy to prioritize user privacy. The company aims to improve features like Personal Context and Onscreen Awareness, set to debut in the fall, without collecting or copying personal content from iPhones or Macs. By generating synthetic text and images that mimic user behavior, Apple can gather usage data and refine its AI models while adhering to its strict privacy policies.
Apple's approach involves creating artificial data that closely matches real user input, addressing the limitation that models trained solely on synthetic data may not accurately reflect actual user interactions. When users opt into Apple's Device Analytics program, the AI models compare these synthetic messages against a small sample of a user’s content stored locally on the device. The device identifies which synthetic message most closely matches its user sample and sends information about the selected match back to Apple; no actual user data leaves the device.

To further protect user privacy, Apple uses differential privacy techniques, adding randomized data to broader datasets to prevent individual identification. For example, when analyzing Genmoji prompts, Apple polls participating devices to determine the popularity of specific prompt fragments. Each device responds with a noisy signal, ensuring that only widely used terms become visible to Apple and that no individual response can be traced back to a user or device. Apple plans to extend these methods to other Apple Intelligence features, including Image Playground, Image Wand, Memories Creation, and Writing Tools, allowing it to improve models for longer-form text generation without collecting real user content.
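A minimal sketch of this style of local differential privacy, assuming a simple randomized-response scheme rather than Apple's actual mechanism: each participating device reports a noisy bit for whether it used a given prompt fragment, and the server can recover only the aggregate popularity, never any individual answer.

```python
import numpy as np

rng = np.random.default_rng(1)
eps = 1.0                                   # privacy budget (assumed)
p_truth = np.exp(eps) / (np.exp(eps) + 1)   # probability of answering honestly

def noisy_report(used_fragment: bool) -> int:
    # Report the truth with probability p_truth, otherwise flip it,
    # so no single report reveals what its device actually did.
    truth = int(used_fragment)
    return truth if rng.random() < p_truth else 1 - truth

# Simulate 100,000 devices, 30% of which actually used the fragment.
true_usage = rng.random(100_000) < 0.30
reports = np.array([noisy_report(u) for u in true_usage])

# De-bias the aggregate: E[report] = (1 - p) + f * (2p - 1) for true rate f.
estimate = (reports.mean() - (1 - p_truth)) / (2 * p_truth - 1)
print(f"estimated popularity: {estimate:.3f}")  # close to 0.30
```

Recommended read: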
References :
@www.thecanadianpressnews.ca
//
Meta Platforms, the parent company of Facebook and Instagram, has announced it will resume using publicly available content from European users to train its artificial intelligence models. This decision comes after a pause last year following privacy concerns raised by activists. Meta plans to use public posts, comments, and interactions with Meta AI from adult users in the European Union to enhance its generative AI models. The company says this data is crucial for developing AI that understands the nuances of European languages, dialects, colloquialisms, humor, and local knowledge.
Meta emphasizes that it will not use private messages or data from users under 18 for AI training. To address privacy concerns, Meta will notify EU users through in-app and email notifications that include a link to a form for objecting to the use of their data, and the company has committed to honoring all previously and newly submitted objection forms. Meta states that its AI is designed to cater to diverse perspectives and to acknowledge the distinctive attributes of various European communities. The company claims its approach aligns with industry practice, noting that Google and OpenAI have already used European user data for AI training, and defends the move as necessary to develop AI services that are relevant and beneficial to European users. Meta also highlights that a panel of EU privacy regulators “affirmed” that its original approach met its legal obligations, though groups like NOYB had complained and urged regulators to intervene, advocating an opt-in system in which users actively consent to the use of their data for AI training. Recommended read:
References :