Weekly AI and NLP News — June 6th 2023

Lack of GPUs is slowing down OpenAI, lawyer cites fake cases invented by ChatGPT, and more

Here are your weekly articles, guides, and news about NLP and AI chosen for you by NLPlanet!

😎 News From The Web

  • OpenAI's plans according to Sam Altman. OpenAI plans to prioritize a cheaper and faster GPT-4, longer context windows, and the release of multimodality in 2024. However, the company is currently constrained by GPU shortages and will not be able to make models millions of times bigger in the near future. The fine-tuning API will also be extended to the latest models.

  • JPMorgan Trained AI to Interpret the Federal Reserve's Intent. JPMorgan Chase used a ChatGPT-based model to analyze Federal Reserve statements and speeches for signals about upcoming interest-rate changes. If such readings prove accurate, investors could adjust their strategies ahead of rate decisions and potentially profit.

  • Nvidia demo about speaking to AI game characters. Nvidia showcased a demo in which players speak naturally with AI-driven game characters that respond in real time, giving developers a tool to improve their games' storytelling and player engagement.

  • Lawyer cites fake cases invented by ChatGPT, judge is not amused. A lawyer cited nonexistent cases generated by ChatGPT in a legal filing, underscoring the need to verify AI-generated content for accuracy and legitimacy. While generative AI tools like ChatGPT can benefit the legal industry, they can hallucinate plausible-looking citations and cause ethical and legal problems if their output is not carefully reviewed.

  • Why Nvidia is suddenly one of the most valuable companies in the world. Nvidia's GPUs have become a crucial component of AI development, pushing the company's market value to roughly $939.3 billion. With AI applications demanding huge amounts of data and compute, companies are buying thousands of its expensive chips, and Nvidia's dominance is expected to persist as startups compete with big tech for access to its costly GPUs.

📚 Guides From The Web

  • Improving mathematical reasoning with process supervision. OpenAI researchers found that process supervision, which rewards each correct step of a chain of reasoning rather than only the final answer, significantly improves mathematical problem solving. The process-supervised reward model achieved new state-of-the-art results, beating outcome supervision on both performance and alignment with human reasoning (a toy sketch of the idea follows this section's list).

  • Barkour: Benchmarking animal-level agility with quadruped robots. Google has created Barkour, a quadruped agility benchmark inspired by dog agility competitions that scores robot performance relative to animals. The benchmark encourages efficient, controllable, and versatile locomotion controllers; alongside it, the team trained a new generalist policy with a student-teacher framework to achieve greater robustness, versatility, and dynamism.

  • Combining Text-to-SQL with Semantic Search for Retrieval Augmented Generation. LlamaIndex's SQLAutoVectorQueryEngine can query both structured and unstructured data by combining text-to-SQL's expressivity with semantic search over a vector store: each natural-language question is routed to SQL, to vector retrieval, or to both, and the results are synthesized into a single answer (see the usage sketch below).
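
The process-supervision idea from the first item above can be made concrete with a toy reranking example: a process reward model scores every intermediate reasoning step, while an outcome reward model scores only the final answer. Everything below (the candidate solutions and the hard-coded scores) is hypothetical and only illustrates the selection rule, not OpenAI's actual models or data.

```python
# Toy illustration of process vs. outcome supervision for picking a solution.
# In practice the step scores come from a trained process reward model; here
# they are hard-coded. This is a sketch, not OpenAI's implementation.
from math import prod

candidates = [
    {   # sound reasoning, slightly lower outcome score
        "steps": ["48 = 16 * 3", "sqrt(16) = 4", "sqrt(48) = 4 * sqrt(3)"],
        "step_scores": [0.98, 0.99, 0.95],   # one score per reasoning step
        "outcome_score": 0.90,               # one score for the final answer only
    },
    {   # flawed middle step that an outcome-only model can miss
        "steps": ["48 = 24 * 2", "sqrt(24) = 12", "sqrt(48) = 12 * sqrt(2)"],
        "step_scores": [0.96, 0.05, 0.40],
        "outcome_score": 0.92,
    },
]

def process_score(c):
    # Rewarding each correct step: the solution is only as good as the
    # probability that *every* step is correct.
    return prod(c["step_scores"])

best_by_process = max(candidates, key=process_score)
best_by_outcome = max(candidates, key=lambda c: c["outcome_score"])
print("process supervision picks:", best_by_process["steps"][-1])
print("outcome supervision picks:", best_by_outcome["steps"][-1])
```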
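
For the SQLAutoVectorQueryEngine item, here is a minimal usage sketch. It assumes the mid-2023 llama_index API (import paths and class names have changed in later releases) and that a text-to-SQL query engine (`sql_query_engine`, e.g. over a city_stats table) and an auto-retriever-backed vector query engine (`vector_query_engine`) have already been built, as in the original guide.

```python
# Sketch only: `sql_query_engine` and `vector_query_engine` are assumed to have
# been constructed earlier (text-to-SQL over a structured table, and a
# VectorIndexAutoRetriever-backed engine over unstructured documents).
# Import paths reflect the mid-2023 llama_index release.
from llama_index.query_engine import SQLAutoVectorQueryEngine
from llama_index.tools import QueryEngineTool

sql_tool = QueryEngineTool.from_defaults(
    query_engine=sql_query_engine,
    description="Useful for aggregate questions over the structured city_stats table.",
)
vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_query_engine,
    description="Useful for open-ended questions answered by semantic search over city articles.",
)

# Routes each natural-language question to SQL, to vector retrieval, or to
# both, then synthesizes a single answer.
query_engine = SQLAutoVectorQueryEngine(sql_tool, vector_tool)
response = query_engine.query(
    "Tell me about the arts and culture of the city with the highest population."
)
print(response)
```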

🔬 Interesting Papers and Repositories

  • Neuralangelo. Neuralangelo is an efficient and flexible framework for high-quality 3D surface reconstruction from RGB video captures. Its optimization techniques enable users to create photorealistic 3D models of both object-centric and large-scale real-world scenes with highly detailed 3D geometry.

  • lyuchenyang/Macaw-LLM: Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration. Macaw-LLM is a multi-modal language modeling framework that integrates image, video, audio, and text inputs: visual and audio inputs are encoded with CLIP and Whisper, aligned to the language model's embedding space, and fed to LLaMA for efficient learning (a conceptual sketch of this alignment step appears after this list). It supports multiple languages and can be extended with additional models.

  • Controllable Text-to-Image Generation with GPT-4. Control-GPT improves text-to-image generation by using large language models such as GPT-4 to generate TikZ code sketches, which serve as layout references for diffusion models, strengthening spatial reasoning and making image generation more controllable. Control-GPT sets a new standard for spatial arrangement and object positioning, doubling the accuracy of prior models while letting users specify object positions and sizes.

  • An Open-Ended Embodied Agent with Large Language Models. Voyager is an embodied lifelong-learning agent that explores and masters Minecraft without human intervention. It combines an automatic curriculum, a skill library of reusable programs, and executable code as its action space for composing increasingly complex behaviors, while an iterative prompting mechanism learns from execution errors and environment feedback (an outline of this loop follows at the end of this list).

  • A PhD Student’s Perspective on Research in NLP. A new document highlights research directions in NLP beyond LLMs, briefly describing 14 areas that remain rich in open problems. It is a valuable guide for PhD students and researchers looking for promising topics to explore.
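
To make the Macaw-LLM description above more concrete, here is a rough, hypothetical PyTorch sketch of the alignment step: features from frozen encoders (e.g. CLIP for images and frames, Whisper for audio) are linearly projected into the language model's embedding dimension and prepended to the text token embeddings. The module names and dimensions are illustrative, not the repository's actual code.

```python
import torch
import torch.nn as nn

class MultiModalPrefix(nn.Module):
    """Toy alignment module: project frozen CLIP/Whisper features into the
    LLM's embedding space and prepend them to the text embeddings."""

    def __init__(self, image_dim=768, audio_dim=512, llm_dim=4096):
        super().__init__()
        self.image_proj = nn.Linear(image_dim, llm_dim)  # aligns CLIP features
        self.audio_proj = nn.Linear(audio_dim, llm_dim)  # aligns Whisper features

    def forward(self, image_feats, audio_feats, text_embeds):
        # image_feats: (batch, n_image_tokens, image_dim) from a frozen CLIP encoder
        # audio_feats: (batch, n_audio_tokens, audio_dim) from a frozen Whisper encoder
        # text_embeds: (batch, n_text_tokens, llm_dim) from the LLaMA embedding table
        prefix = torch.cat(
            [self.image_proj(image_feats), self.audio_proj(audio_feats)], dim=1
        )
        # The combined sequence is what the language model then consumes.
        return torch.cat([prefix, text_embeds], dim=1)

# Toy shapes, just to show the wiring.
fusion = MultiModalPrefix()
out = fusion(torch.randn(1, 50, 768), torch.randn(1, 120, 512), torch.randn(1, 32, 4096))
print(out.shape)  # torch.Size([1, 202, 4096])
```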
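
And for the Voyager item, a compact hypothetical outline of the lifelong-learning loop it describes: an automatic curriculum proposes tasks, the LLM writes executable code for each task, errors are fed back into the next prompt, and verified programs are stored in a skill library for reuse. All helper functions are stand-ins, not the actual Voyager code.

```python
# Stub outline of a Voyager-style loop; the helpers are placeholders.
skill_library = {}   # task -> verified program (reusable skills)
completed = []

def propose_next_task(done):                 # automatic curriculum (stub)
    return f"task_{len(done)}"

def write_program(task, skills, feedback):   # LLM call (stub)
    return f"# code for {task} using {len(skills)} skills; feedback={feedback!r}"

def execute(program):                        # environment rollout (stub)
    return True, ""                          # (success, error / feedback)

for _ in range(3):                           # lifelong loop (truncated)
    task = propose_next_task(completed)
    feedback = ""
    for attempt in range(4):                 # iterative prompting on failure
        program = write_program(task, skill_library, feedback)
        success, feedback = execute(program)
        if success:
            skill_library[task] = program    # store the verified skill
            completed.append(task)
            break

print(sorted(skill_library))
```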

Thank you for reading! If you want to learn more about NLP, remember to follow NLPlanet. You can find us on LinkedIn, Twitter, Medium, and our Discord server!