AI Developments: Report #2

Another week where we noticed an exponential development in LLMs and more efforts to moderate answers. Progress has been noted also in multimodal and recommender system development, while more AI startups are receiving massive hype.

What is a liquid neural network, really?

Main Highlight: Liquid neural networks, a concept developed by Ramin Hasani and researchers at MIT, offer a flexible and adaptable approach to machine learning. These networks can scale down to have fewer but richer nodes, making them more efficient and less computationally expensive.

Applications of these systems in robotics show potential for real-world use, including using Raspberry Pi for complex reasoning, reducing the number of neurons for faster solutions, and creating more transparent decision-making processes. These networks require time series data and could help in safety-critical systems by avoiding unnecessary mistakes.

Read Full Article

Brian Heater | Tech Crunch
Publication Date: August 18, 2023

Artificial intelligence for augmentation and productivity

Main Highlight: The MIT Schwarzman College of Computing has awarded seed grants to seven interdisciplinary projects focusing on AI-augmented management. Funded by Andrew W. Houston and Dropbox Inc., these projects aim to leverage artificial intelligence and human-computer interaction to enhance modern workspaces, achieving better management and higher productivity.

The selected projects span a wide range of applications, including memory prosthetics, social scenario simulation, healthcare improvement, and democratizing programming, demonstrating the potential breadth of impact on various sectors of society and the economy.

Read Full Article

Schwarzman College of Computing | MIT
Publication Date: August 18, 2023

Reallusion Elevates Character Animation Workflows With Two-Way Live Sync and OpenUSD Support

Main Highlight: Reallusion has updated its iClone Omniverse Connector, enhancing character animation workflows by offering real-time previews, a bidirectional workflow with Omniverse, and improved support for OpenUSD.

The update facilitates seamless collaboration and expands creative possibilities by including real-time synchronization of projects and enhanced import functionality. These features are designed to make work between iClone and Omniverse quicker and more efficient, and come along with additional bug fixes and improvements.

Read Full Article

DANE JOHNSTON | NVIDIA
Publication Date: August 16, 2023

OpenAI acquires Global Illumination

Main Highlight: OpenAI has acquired the team at Global Illumination, a company known for leveraging AI in creative tools and digital experiences. The team, who previously contributed to major companies like Instagram, Facebook, YouTube, Google, Pixar, and Riot Games, has joined OpenAI to work on core products including ChatGPT.

Read Full Article

OpenAI | Various Authors
Publication Date: August 16, 2023

Using GPT-4 for content moderation

Main Highlight: GPT-4 is being used for content policy development and moderation, significantly speeding up the iteration on policy changes from months to hours and allowing more consistent labeling.

By understanding and interpreting rules and nuances in content policies, GPT-4 is able to adapt instantly to updates, reducing reliance on human moderators.

Read Full Article

OpenAI | Various Authors
Publication Date: August 16, 2023

Research Blogs

Autonomous visual information seeking with large language models

Main Highlight: Google Research has introduced "AVIS: Autonomous Visual Information Seeking with Large Language Models", a novel method that elevates the performance of large language models (LLMs) on visual information seeking tasks.

Utilizing three types of tools - computer vision tools, web search tools, and image search tools - AVIS uses an LLM-powered planner and reasoner to extract and analyze information from various sources.

The method achieved state-of-the-art results on visual information seeking datasets, showing a significant improvement in accuracy and efficiency by incorporating human decision-making data and using a dynamic, structured approach to decision-making.

Read Full Article

Ziniu Hu, Alireza Fathi, | Google
Publication Date: August 18, 2023

Consistent Collaborative Filtering via Tensor Decomposition

Main Highlight: A new model called Sliced Anti-symmetric Decomposition (SAD) has been developed by Apple, for collaborative filtering in recommendation systems, based on implicit feedback.

Unlike traditional methods that estimate latent vectors for users and items, SAD introduces an additional latent vector for each item using a novel three-way tensor view. This innovation extends user-item preferences, suggesting that users may have nonlinear mental models when evaluating items.

The model has demonstrated efficiency in both simulated and real-world datasets, outperforming seven state-of-the-art collaborative filtering models in consistency and accuracy of personalized preferences.

Read Full Article

Shiwen Zhao, Charles Crissman, Guillermo R Sapiro | Apple
Publication Date: August 16, 2023

Which GPU(s) to Get for Deep Learning

Main Highlight: Tim Dettmers provides comprehensive advice and insights on selecting the best GPU for deep learning, emphasizing the importance of GPU RAM, cores, tensor cores, and caches.

The blog post is structured to guide readers through the details of GPU functionality, including comparisons between CPU and GPU, the unique features of NVIDIA's RTX 40 Ampere series, and practical recommendations for various scenarios.

The guide also includes a Q&A section to tackle common misconceptions and address specific questions like cloud vs desktop considerations, cooling strategies, and the comparison between AMD and NVIDIA.

Read Full Article

Tim Dettmers
Updated Publication Date: August 8, 2023

Scientific discovery in the age of artificial intelligence

Main Highlight: The integration of AI into scientific discovery has led to breakthroughs such as self-supervised learning and geometric deep learning, augmenting traditional research methods.

These advancements enable the creation of new designs and help scientists throughout the research process, although challenges remain in data quality and the understanding of AI's capabilities and limitations.

Read Full Paper

Wang, Fu, Du, Huang, Deac, Gao, Liu, Bengio, et.al
Publication Date: August 2, 2023

Research Papers

Scaling Laws for Generative Mixed-Modal Language Models

Main Highlight: The study was conducted to explore the scaling properties of mixed-modal generative language models, examining the interaction between different modalities like text, speech, images, and code.

The research, including over 250 experiments with seven modalities and varying model sizes, has identified new mixed-modal scaling laws that capture both individual modalities and their interactions, predicting competition and synergy.

The findings provide insights into the design and training of mixed-modal models, including guidelines for hyperparameter selection, and will likely advance the development of unified models handling multiple modalities simultaneously.

Read Full Paper

Aghajanyan, Yu, Conneau, Hsu
Publication Date: 10 Jan , 2023

Sigmoid Loss for Language Image Pre-Training

Main Highlight:This new sigmoid loss method performs better, particularly at smaller batch sizes, and allows for further scaling up of the batch size, without requiring additional resources. With this method, the researchers were able to achieve up to 84.5% ImageNet zero-shot accuracy within two days, making it a promising advancement for image-text pre-training.

Read Full Paper

Zhai, Mustafa, Kolesnikov, Beyer
Publication Date: 4 May , 2023

“Do Anything Now”: Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models

Main Highlight:The paper focuses on the misuse of Large Language Models (LLMs) and the emergence of "jailbreak prompts," which are crafted to bypass safeguards and elicit harmful content from LLMs.

Through an extensive study of 6,387 prompts, it identifies characteristics and major attack strategies of jailbreak prompts, such as prompt injection and privilege escalation. The paper also highlights that current LLMs and safeguards are not adequately defending against these threats, with some jailbreak prompts achieving a 0.99 attack success rate on models like ChatGPT and GPT-4, and stresses the need for stronger defense mechanisms.

Read Full Paper

Shen, Chen, Backes, Zhang
Publication Date: 7 Aug , 2023

Platypus: Quick, Cheap, and Powerful Refinement of LLMs

Main Highlight:Platypus, a family of fine-tuned and merged Large Language Models (LLMs), has achieved the strongest performance and tops HuggingFace's Open LLM Leaderboard.

The project introduces Open-Platypus, a curated dataset released to the public, and describes the process of fine-tuning and merging LoRA modules to bring specific domain knowledge to the forefront.

The 13B Platypus model can be trained on a single A100 GPU in 5 hours with 25k questions, offering strong performance with significantly less data and compute resources than other state-of-the-art LLMs, opening opportunities for more improvements in the field.

Read Full Paper

Lee, Hunter, Ruiz
Publication Date: 14 Aug , 2023

TeCH: Text-guided Reconstruction of Lifelike Clothed Humans

Main Highlight:TeCH has introduced a new method for reconstructing 3D clothed human figures from a single image, addressing the unsolved challenge of accurately restoring "unseen regions" of the body. The method leverages descriptive text prompts and a personalized Text-to-Image diffusion model to optimize the 3D human's geometry and texture.

The technology shows promise for various applications in augmented and virtual reality, gaming, and more, but also raises concerns regarding potential misuse for deep-fake avatars and intellectual property issues.

Read Full Paper

Huang, Yi, Xiu et.al
Publication Date: 16 Aug , 2023

XSTEST: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models

Main Highlight:This paper introduces a test suite called XSTEST to identify exaggerated safety behaviors in large language models (LLMs), such as refusing safe requests due to misinterpretation as unsafe.

Using XSTEST, the authors found that the Llama2 model exhibits substantial exaggerated safety behavior, refusin1g 38% of test prompts and partially refusing another 22%.

The findings suggest that this behavior is a result of lexical overfitting, making models overly sensitive to certain words, and that OpenAI's GPT-4 is better calibrated in comparison.

Read Full Paper

Röttger, Kirk, Vidgen, Attanasio, Bianchi et.al
Publication Date: 2 Aug , 2023

Development

Llama-GPT

Main Highlight:A self-hosted, offline, ChatGPT-like chatbot, powered by Llama 2. 100% private, with no data leaving your device.

Repository

Author: GetUmbrel

Danswer

Main Highlight: Danswer allows you to ask natural language questions against internal documents and get back reliable answers backed by quotes and references from the source material so that you can always trust what you get back. You can connect to a number of common tools such as Slack, GitHub, Confluence, amongst others.

Repository

Author: Danswer-AI

CoDeF: Content Deformation Fields for Temporally Consistent Video Processing

Main Highlight: This repo introduces the content deformation field CoDeF, a new video representation comprising a canonical content field for static content and a temporal deformation field for transformations along time.

The CoDeF system has been designed to lift image algorithms for video processing, enabling image-to-image translation to be adapted for video-to-video translation and keypoint detection for keypoint tracking without training. This approach provides superior cross-frame consistency in processed videos and can track non-rigid objects like water and smog.

Repository

Author: Ouyang, Wang, Xiao, Bai, Zhang, Zhou et.al

Practical Tutorials and Resources

CS324 - Large Language Models

Stanford Course on LLMs - Notes are open-sourced in the given links

Liang, Hashimoto, Re, Bommasani, Xie | Stanford

Large Language Models: Foundation Models from the Ground Up

Databricks course available on EDX, starting from ground up to the main elements of Generative AI. Complementary for the LLM specialty certification.

Zaharia, Raymond, Eng | Databricks

LangChain: Various Tutorials

Various tutorials which integrate LangChain with specific tools or frameworks, such as Predibase, Zep or Qdrant

LangChain et. al

DeepLearning AI: Large Language Models with Semantic Search

Free tutorial teaching concepts like dense retrieval, which elevates the relevance of retrieved information, leading to improved search results beyond traditional keyword search, and reranking, which injects the intelligence of LLMs into asearch system, making it faster and more effective.

Cohere x DeepLearning

DeepLearning AI: Evaluating and Debugging Generative AI

Free tutorial teaching concepts like monitoring and tracing of LLMs over time in complex interactions and properly debugging by use case.

W&B x DeepLearning

DeepLearning AI: Building Generative AI Applications with Gradio

Free tutorial teaching how to create apps which integrate easily Generative AI (Hugging Face Integrations).

Hugging Face x DeepLearning

AI Developments: Report #2

What is a liquid neural network, really?

Artificial intelligence for augmentation and productivity

Reallusion Elevates Character Animation Workflows With Two-Way Live Sync and OpenUSD Support

OpenAI acquires Global Illumination

Using GPT-4 for content moderation

Research Blogs

Autonomous visual information seeking with large language models

Consistent Collaborative Filtering via Tensor Decomposition

Which GPU(s) to Get for Deep Learning

Scientific discovery in the age of artificial intelligence

Research Papers

Scaling Laws for Generative Mixed-Modal Language Models

Sigmoid Loss for Language Image Pre-Training

“Do Anything Now”: Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models

Platypus: Quick, Cheap, and Powerful Refinement of LLMs

TeCH: Text-guided Reconstruction of Lifelike Clothed Humans

XSTEST: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models

Development

Llama-GPT

Danswer

CoDeF: Content Deformation Fields for Temporally Consistent Video Processing

Practical Tutorials and Resources

Tags :

Share :

Related Posts

AI Developments: Report #1

AI Developments: Report #3

AI Developments: Report #4