AI Developments: Report #4

This week's highlights include open-source initiatives such as Meta's Code Llama, along with several frameworks that assist LLM development. New surveys also track LLM-based agents and fine-tuning.

AI 'nose' predicts smells from molecular structures

Main Highlight: Scientists have developed a machine-learning tool that can predict the odor of a molecule based solely on its molecular structure. This breakthrough technology could be a game-changer for synthetic chemists in the food and fragrance industries, providing the ability to screen large numbers of molecules for aroma.

Terms to Understand: Machine Learning, Odor Prediction, Molecular Structure

Read Full Article

University of Reading
Publication Date: September 2, 2023

AI tech gives back 'voice' to woman with post-stroke locked-in syndrome

Main Highlight: Researchers from the University of California San Francisco have created a technology that allows people with locked-in syndrome to communicate and express facial emotions via a brain implant and a digital avatar. The system was successfully tested on a woman, enabling her to "speak" and express emotions through the avatar, which used her actual pre-injury voice.

Terms to Understand: Locked-in Syndrome, Brain Implant, Digital Avatar

Read Full Article

Corrie Pelc | Medical News Today
Publication Date: August 30, 2023

Amazon increases fees, ChatGPT comes to the enterprise, and Apple announces a press conference

Main Highlight: This week's major tech developments include Teamshares, a startup that's buying up small businesses without succession plans to expand its fintech product line. Zepto became India's first unicorn of 2023 with a valuation of $1.4 billion and plans to IPO in 2025. Google has introduced BigQuery Studio, a comprehensive data management service, while OpenAI is launching an enterprise version of ChatGPT with enhanced privacy and data analysis capabilities.

Terms to Understand: Teamshares, Zepto, BigQuery Studio, ChatGPT Enterprise

Read Full Article

Kyle Wiggers | TechCrunch
Publication Date: September 3, 2023

Fast-tracking fusion energy’s arrival with AI and accessibility

Main Highlight: The U.S. Department of Energy has funded a project led by MIT's Plasma Science and Fusion Center to create a unified data platform for fusion research that can be processed by AI tools. The platform aims to make fusion data more accessible and organized, thereby accelerating scientific discovery and encouraging diversity in the fusion and data science workforce.

Terms to Understand: DoE Funding, Fusion Data Platform, AI-powered Tools

Read Full Article

MIT News | Julianna Mullen
Publication Date: September 1, 2023

Research Blogs

Modeling and improving text stability in live captions

Main Highlight: The paper "Modeling and Improving Text Stability in Live Captions" tackles the issue of text instability in live captions, often manifesting as distracting 'flicker'. The researchers formalized this problem by developing a vision-based flicker metric and then introduced a stability algorithm that uses tokenized alignment, semantic merging, and smooth animation. Their user study of 123 participants showed that the stabilization techniques significantly improved the viewer experience.

Terms to Understand: Text Stability, Flicker Metric, Stability Algorithm

Read Full Article

Bahirwani, Xu - Google Research
Updated Publication Date: August 30, 2023
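
The idea of measuring caption instability in the entry above can be illustrated with a rough text-level proxy. This is a minimal sketch, not the paper's method: the paper's flicker metric is vision-based, whereas the hypothetical `changed_tokens` helper below only counts already-displayed words that get rewritten or removed between two caption updates. Words appended at the end are treated as harmless.

```python
from difflib import SequenceMatcher


def changed_tokens(previous: str, current: str) -> int:
    """Count tokens of the previous caption that were rewritten or removed.

    Illustrative proxy only (an assumption of this sketch): it ignores
    rendering, timing, and line layout, which a vision-based metric sees.
    """
    prev_tokens = previous.split()
    curr_tokens = current.split()
    matcher = SequenceMatcher(a=prev_tokens, b=curr_tokens, autojunk=False)
    changed = 0
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        # "replace" and "delete" touch words the viewer has already read;
        # "insert" only adds new words, so it is not counted.
        if tag in ("replace", "delete"):
            changed += i2 - i1
    return changed


stable_update = changed_tokens("the cat sat", "the cat sat on")
unstable_update = changed_tokens("the cat sat", "the hat sat on")
```

A stabilization step would aim to keep this count low across successive updates, for example by delaying a rewrite until the recognizer is more confident.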

WeatherBench 2: A benchmark for the next generation of data-driven weather models

Main Highlight: Google Research has announced WeatherBench 2 (WB2), an updated benchmark for next-generation data-driven weather models. WB2 aims to speed up the progress of machine learning-based weather forecasts by offering a trusted, reproducible framework for model evaluation. The machine learning models are showing comparable or even better performance than traditional physics-based models, and they can make quick forecasts on inexpensive hardware.

Terms to Understand: WeatherBench 2 (WB2), Machine Learning in Weather Forecasting, Evaluation Framework

Read Full Paper

Rasp, Bromberg, Google Research*
Publication Date: August 31, 2023
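
To give a feel for what evaluating a gridded forecast involves, here is a minimal sketch of a latitude-weighted root-mean-square error, a common choice for global grids because cells near the poles cover less area than cells near the equator. The function name, the list-of-rows grid layout, and the normalization are assumptions of this sketch and are not taken from the WB2 code base.

```python
import math


def latitude_weighted_rmse(forecast, truth, latitudes_deg):
    """RMSE over a lat/lon grid with cos(latitude) area weights.

    forecast, truth: lists of rows, one row per latitude (assumed layout).
    latitudes_deg: latitude of each row in degrees.
    """
    weights = [math.cos(math.radians(lat)) for lat in latitudes_deg]
    mean_weight = sum(weights) / len(weights)
    # Normalize so the weights average to 1 across latitude rows.
    weights = [w / mean_weight for w in weights]

    total = 0.0
    count = 0
    for w, f_row, t_row in zip(weights, forecast, truth):
        for f, t in zip(f_row, t_row):
            total += w * (f - t) ** 2
            count += 1
    return math.sqrt(total / count)


latitudes = [-60.0, 0.0, 60.0]
truth = [[280.0, 281.0], [300.0, 301.0], [279.0, 278.0]]
forecast = [[value + 2.0 for value in row] for row in truth]
error = latitude_weighted_rmse(forecast, truth, latitudes)
```

With a uniform error of 2 in every cell, the weighted score equals 2 as well, which is a quick sanity check on the normalization.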

Research Papers

Vector Search with OpenAI Embeddings: Lucene Is All You Need

Main Highlight: The paper challenges the prevailing notion that a specialized vector store is essential for leveraging deep neural networks in search applications. Using Lucene's Hierarchical Navigable Small-World (HNSW) indexes, the authors demonstrate that existing search infrastructure can efficiently handle vector search with OpenAI embeddings. This suggests that adding a dedicated vector store to an already complex AI stack may not offer a justified cost-benefit advantage.

Terms to Understand: Lucene, Hierarchical Navigable Small-World (HNSW) indexes, Vector Search

Read Full Paper

Lin, Pradeep, Teofili, Xian
Publication Date: August 30, 2023
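
For readers new to vector search, the operation discussed in the entry above can be sketched as an exact nearest-neighbor lookup by cosine similarity. This is an illustration under stated assumptions: the three-dimensional toy vectors, document ids, and function names are invented here, and nothing below is Lucene's API. An HNSW index approximates the same ranking without comparing the query against every stored vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, corpus, k=2):
    """Return the ids of the k vectors most similar to the query (brute force)."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in corpus.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Toy embeddings (hypothetical); real embedding vectors have far more dimensions.
corpus = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}
nearest = top_k([1.0, 0.05, 0.0], corpus, k=2)
```

The brute-force scan costs one comparison per stored vector, which is the cost that graph-based indexes such as HNSW are designed to avoid at scale.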

Transformers as Support Vector Machines

Main Highlight: The paper establishes a formal link between the optimization geometry of self-attention in transformer models and hard-margin SVM problems. This relationship provides insights into how transformers make decisions when trained with gradient descent. The study reveals that over-parameterization aids in global convergence and the results are validated with experiments across diverse datasets.

Terms to Understand: Self-Attention, Optimization Geometry, Hard-Margin SVM

Read Full Paper

Tarzanagh, Li, Thrampoulidis, Oymak
Publication Date: August 31, 2023
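
As background for the term used in the entry above, the classical hard-margin SVM for linearly separable data with labels y_i in {-1, +1} is usually written as follows. This is the textbook formulation, included only for orientation; the paper's own attention-specific SVM problem is not reproduced here.

```latex
\min_{w,\,b} \ \tfrac{1}{2}\,\lVert w \rVert^{2}
\quad \text{subject to} \quad
y_i \,\bigl(w^{\top} x_i + b\bigr) \ge 1, \qquad i = 1, \dots, n
```

The constraints force every training point onto the correct side of the separating hyperplane with margin at least 1, and minimizing the norm of w makes that margin as wide as possible.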

SAM-Med2D: Segment Anything Model

Main Highlight: The paper introduces SAM-Med2D, a specialized adaptation of the Segment Anything Model (SAM) for medical image segmentation. While SAM performed poorly in medical imaging due to a significant domain gap, SAM-Med2D fills this void by incorporating a large-scale medical image dataset of approximately 4.6M images and 19.7M masks. The model is fine-tuned comprehensively and has shown significantly better performance and generalization capabilities across various medical imaging scenarios compared to the original SAM.

Terms to Understand: SAM (Segment Anything Model), SAM-Med2D, Medical Image Segmentation

Read Full Paper

Cheng, Ye et al.
Publication Date: August 30, 2023

PointLLM: Empowering Large Language Models to Understand Point Clouds

Main Highlight: The paper introduces PointLLM, which enables Large Language Models (LLMs) to understand and process 3D point clouds. This capability extends LLMs beyond text and 2D visual data; the model outperforms 2D baselines and, in human evaluation of object captioning, outperforms human annotators in over 50% of samples. The model employs a two-stage training strategy using a novel dataset of 660K simple and 70K complex point-text instruction pairs.

Terms to Understand: PointLLM, 3D Point Clouds, Large Language Models (LLMs)

Read Full Paper

Xu, Wang, Chen, Pang, Lin
Publication Date: August 31, 2023

Bridging the data gap between children and large language models

Main Highlight: The paper discusses the significant gap in data efficiency between human learners, especially children, and large language models (LLMs). It offers three potential reasons for humans' superior data efficiency: evolutionary advantages, rich sensory experience, and the nature of social, interactive language input. The paper also hints at future research paths to improve LLM efficiency, such as using more grounded, interactive training data.

Terms to Understand: Data efficiency, Large Language Models (LLMs), human learning

Read Full Paper

Frank et al.
Publication Date: August 31, 2023

Development

ChatDev: Create Customized Software with Natural Language Ideas

Main Highlight: ChatDev is a virtual software company made up of intelligent agents serving in various roles like CEO, CTO, Programmer, etc. Its main objective is to offer a highly customizable framework based on large language models for studying collective intelligence. The system is already live with features including customizability for the software development process and both online log and replay modes.

Terms to Understand: ChatDev, Intelligent Agents, Large Language Models

Repository

Author(s): OpenBMB

IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models

Main Highlight: The IP-Adapter is a lightweight yet effective module designed to enable text-to-image diffusion models to use image prompts. With just 22M parameters, it performs comparably or even better than fine-tuned image prompt models and supports multimodal image generation. The repository is continuously updated, and it offers various demos to showcase its capabilities, including face image prompts and fine-grained features.

Terms to Understand: IP-Adapter, Text-to-Image Diffusion Models, Multimodal Image Generation

Repository

Author: Tencent AI Lab

Graph of Thoughts: Solving Elaborate Problems with Large Language Models

Main Highlight: This is the official implementation of Graph of Thoughts: Solving Elaborate Problems with Large Language Models. This framework gives you the ability to solve complex problems by modeling them as a Graph of Operations (GoO), which is automatically executed with a Large Language Model (LLM) as the engine.

Terms to Understand: Graph of Thoughts, Graph of Operations (GoO), Large Language Models (LLMs)

Repository

Author: SPCL
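
The idea of modeling a problem as a graph of operations, described in the entry above, can be sketched in a few lines. This is a minimal illustration under stated assumptions and not the repository's API: `stub_engine` is a hypothetical stand-in for an LLM call, and the split/sort/merge task and node names are invented for the example.

```python
from graphlib import TopologicalSorter


def stub_engine(instruction, payload):
    """Hypothetical stand-in for an LLM; a real engine would receive a prompt."""
    if instruction == "sort":
        return sorted(payload)
    if instruction == "merge":
        left, right = payload
        return sorted(left + right)
    raise ValueError(f"unknown instruction: {instruction}")


def run_graph(numbers):
    """Execute a small operation graph: split, sort each half, then merge."""
    # Each node maps to the set of nodes whose results it depends on.
    dependencies = {
        "split": set(),
        "sort_left": {"split"},
        "sort_right": {"split"},
        "merge": {"sort_left", "sort_right"},
    }
    results = {}
    # static_order() yields every node after all of its dependencies.
    for node in TopologicalSorter(dependencies).static_order():
        if node == "split":
            mid = len(numbers) // 2
            results[node] = (numbers[:mid], numbers[mid:])
        elif node == "sort_left":
            results[node] = stub_engine("sort", results["split"][0])
        elif node == "sort_right":
            results[node] = stub_engine("sort", results["split"][1])
        else:  # "merge"
            results[node] = stub_engine(
                "merge", (results["sort_left"], results["sort_right"])
            )
    return results["merge"]


merged = run_graph([5, 2, 9, 1, 7, 3])
```

The point of the graph structure is that intermediate results can be produced independently and then combined, rather than following a single linear chain of steps.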

Practical Tutorials and Resources

Pen and Paper Exercises in Machine Learning

This is a collection of (mostly) pen-and-paper exercises in machine learning. The exercises are on the following topics: linear algebra, optimisation, directed graphical models, undirected graphical models, expressive power of graphical models, factor graphs and message passing, inference for hidden Markov models, model-based learning (including ICA and unnormalised models), sampling and Monte-Carlo integration, and variational inference.

Gutmann

Awesome Conformal Prediction

A professionally curated list of awesome Conformal Prediction videos, tutorials, books, papers, PhD and MSc theses, articles and open-source libraries.

Valeman

Stanford XCS224U: Natural Language Understanding I Spring 2023

This professional Stanford Online course draws on theoretical concepts from linguistics, natural language processing, and machine learning. Topics include domain adaptation for supervised sentiment, retrieval-augmented in-context learning, advanced behavioral evaluation, analysis methods, and NLP methods.

Christopher Potts / Stanford

DeepLearning.AI: How Business Thinkers Can Start Building AI Plugins With Semantic Kernel

In this course, you’ll learn how to use and build with Microsoft’s open-source orchestrator, Semantic Kernel. Along the way you’ll gain skills in getting the most out of LLMs: developing prompts, writing semantic functions, working with vector databases, and using an LLM for planning.

Microsoft x DeepLearning.AI
