Weekly AI and NLP News — May 29th 2023

The new Intel LLM and Meta's project on 1,000+ languages

Here are your weekly articles, guides, and news about NLP and AI chosen for you by NLPlanet!

😎 News From The Web

  • Intel Announces Aurora genAI, Generative AI Model With 1 Trillion Parameters. Intel has announced Aurora genAI, a 1-trillion-parameter generative model that will be trained on scientific text and structured scientific data, targeting cancer research, systems biology, cosmology, polymer chemistry, materials science, and climate science. Powered by the Aurora supercomputer, it could suggest experiments and accelerate the identification of drug-design targets.

  • Introducing speech-to-text, text-to-speech, and more for 1,100+ languages. Meta's Massively Multilingual Speech (MMS) project combines wav2vec 2.0 self-supervised learning with a new dataset built from readings of religious texts, enabling AI to understand and generate speech in over 1,100 languages. It outperforms existing models on character error rate while covering many times more languages.
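
For a sense of how the released models can be used, here is a minimal speech-recognition sketch with an MMS checkpoint via Hugging Face transformers. The checkpoint name facebook/mms-1b-all, its default language adapter, and the audio path are assumptions; check the MMS documentation for the exact setup.

```python
# Minimal ASR sketch with an assumed MMS checkpoint.
# Requires: pip install transformers torch
from transformers import pipeline

# Downloads the model on first use; MMS checkpoints ship per-language
# adapters (the default adapter is assumed to be English here).
asr = pipeline("automatic-speech-recognition", model="facebook/mms-1b-all")

result = asr("speech.wav")  # "speech.wav" is a placeholder audio file
print(result["text"])
```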

  • Google is starting to let users into its AI search experiment. Google's Search Labs is now open for experimentation, offering AI-generated summaries at the top of search results, a change that could reshape the web's business model and have major implications for SEO.

  • Generative AI Startup Anthropic Secures $450M in Series C Funding to Advance Next-Gen Intelligent Assistants. Anthropic has raised $450M in Series C funding led by Spark Capital, with participation from Google, Salesforce, and Zoom, a signal of major tech players' confidence in the company. The funding will help Anthropic expand its product offerings and develop next-generation AI assistants for business and customer-experience applications.

  • New superbug-killing antibiotic discovered using AI. Researchers used AI to identify a compound that kills Acinetobacter baumannii, a drug-resistant pathogen, with few signs of inducing resistance, showing how AI can surface candidate antibiotics and speed up the discovery of new treatments.

  • Google’s new generative AI tool can jazz up product images. Google's new Product Studio uses generative AI to help Shopping merchants customize product images and improve their ad campaigns, with integration into the Merchant Center for added convenience.

📚 Guides From The Web

  • How To Finetune GPT Like Large Language Models on a Custom Dataset. Lit-Parrot is a nanoGPT-based tool from Lightning AI that provides clean, optimized implementations of open-source LLMs for fine-tuning on custom data, including parameter-efficient adapter methods. The guide offers step-by-step instructions for installation, dataset preparation, and model fine-tuning.
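
As an illustration of the dataset-preparation step, instruction-tuning data is commonly stored as JSON records with instruction, input, and output fields. This Alpaca-style layout is an assumption; the exact schema Lit-Parrot expects may differ, so consult the guide.

```python
import json

# Alpaca-style instruction records (illustrative; field names are an
# assumption, check the Lit-Parrot guide for the expected schema).
records = [
    {
        "instruction": "Summarize the following text.",
        "input": "Intel announced Aurora genAI, a 1-trillion-parameter model...",
        "output": "Intel unveiled a 1T-parameter scientific LLM called Aurora genAI.",
    },
]

with open("custom_dataset.json", "w") as f:
    json.dump(records, f, indent=2)
```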

  • Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA. QLoRA is a 4-bit fine-tuning method that lets Guanaco, a chatbot built with it, reach 97% of ChatGPT's performance on the Vicuna benchmark while training on a single GPU. QLoRA keeps the base model frozen in 4-bit precision and trains adapters on top, achieving its memory efficiency through the 4-bit NormalFloat data type, paged optimizers, and double quantization.
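
The core recipe can be tried in a few lines with recent transformers and bitsandbytes releases. A minimal sketch, assuming the EleutherAI/gpt-neox-20b checkpoint used as an example in the post (any causal LM on the Hub should work):

```python
# 4-bit loading per the QLoRA recipe: NF4 quantization, double quantization,
# and bfloat16 compute. Requires recent transformers, bitsandbytes, accelerate.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",           # the 4-bit NormalFloat data type
    bnb_4bit_use_double_quant=True,      # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_id = "EleutherAI/gpt-neox-20b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
# LoRA adapters (via the peft library) are then attached on top of this
# frozen 4-bit base model for fine-tuning.
```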

  • Welcome to LLM University. LLM University by Cohere is a new program offering a holistic curriculum on NLP and Large Language Models, suitable for beginners and advanced learners. The program covers the architecture, embeddings, similarity, and attention mechanisms of LLMs, and teaches learners how to apply them to real-world scenarios such as semantic search, text generation, and classification. Plenty of code samples are provided for practical implementation.
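
As a taste of the semantic-search material the curriculum covers, here is a toy example that ranks documents by cosine similarity between embeddings. The cohere client usage is an assumption based on its Python SDK; the API key and default embedding model are placeholders.

```python
import numpy as np
import cohere  # pip install cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder key

docs = [
    "How do I reset my password?",
    "Shipping takes 3-5 business days.",
    "Our office is closed on public holidays.",
]
query = "I forgot my login credentials"

# Embed documents and query (default model assumed; see Cohere's docs).
doc_emb = np.array(co.embed(texts=docs).embeddings)
q_emb = np.array(co.embed(texts=[query]).embeddings[0])

# Rank documents by cosine similarity to the query.
scores = doc_emb @ q_emb / (np.linalg.norm(doc_emb, axis=1) * np.linalg.norm(q_emb))
print(docs[int(np.argmax(scores))])  # expected: the password-reset doc
```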

  • The Generative AI Revolution in Games. Generative AI tools like Stable Diffusion and DreamBooth are revolutionizing the gaming industry by allowing artists to create high-quality images in hours, democratizing creative tools, and enabling more risk-taking in game development. Artists skilled in using AI tools collaboratively and in training the AI to reflect a consistent style will likely be in high demand.
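
For context, generating a concept image with Stable Diffusion takes only a few lines with the diffusers library; the checkpoint name and prompt below are illustrative, and a CUDA GPU is assumed.

```python
# Generating a game-art concept image with Stable Diffusion via diffusers.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = pipe("isometric pixel-art tavern interior, game asset concept").images[0]
image.save("tavern_concept.png")
```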

🔬 Interesting Papers and Repositories

  • LIMA: Less Is More for Alignment. LIMA is a 65B-parameter LLaMA model fine-tuned on just 1,000 carefully curated prompts and responses, with no reinforcement learning or human preference modeling. In human evaluations its answers were judged equivalent or preferable to GPT-4's in 43% of cases, supporting the view that most capability is acquired during pretraining and that only limited instruction tuning is needed for alignment.

  • Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training. Sophia is a second-order optimizer that can substantially cut the cost of pre-training language models: it reaches the same validation loss as Adam in 50% fewer steps and integrates easily into existing training pipelines.
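
In spirit, Sophia's update divides gradient momentum by an exponential moving average of an estimated diagonal Hessian and clips the result element-wise. A heavily simplified sketch of one step follows; the hyperparameter values are assumptions and the Hessian estimator (refreshed only every few steps in the paper) is omitted, so see the paper and official code for the real algorithm.

```python
import torch

def sophia_step(param, grad, m, h, lr=1e-4, beta1=0.96, rho=0.04, eps=1e-12):
    """One simplified Sophia-style update (illustrative, not the official code).

    m: EMA of gradients (momentum).
    h: EMA of an estimated diagonal Hessian (estimator omitted here).
    """
    m.mul_(beta1).add_(grad, alpha=1 - beta1)
    # Pre-conditioned step, clipped element-wise to [-1, 1] so that noisy
    # or tiny Hessian estimates cannot blow up the update.
    update = torch.clamp(m / (rho * h + eps), -1.0, 1.0)
    param.data.add_(update, alpha=-lr)
```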

  • Reasoning with Language Model is Planning with World Model. The Reasoning via Planning (RAP) framework treats the LLM as both reasoning agent and world model, using Monte Carlo Tree Search to plan and refine reasoning paths. It outperforms strong baselines on plan generation, math reasoning, and logical inference, including a 33% relative improvement on plan generation, and could strengthen LLMs' reasoning in complex decision-making settings.
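
RAP's planner is built on Monte Carlo Tree Search. As a reference point (not RAP's actual code), the UCT rule at the heart of MCTS scores each candidate step by balancing its average reward against how rarely it has been explored; in RAP, the reward would come from the LLM acting as a world model.

```python
import math

def uct_score(child_value_sum, child_visits, parent_visits, c=1.0):
    """UCT selection score for one child node in MCTS (illustrative)."""
    if child_visits == 0:
        return float("inf")  # explore unvisited children first
    exploitation = child_value_sum / child_visits
    exploration = c * math.sqrt(math.log(parent_visits) / child_visits)
    return exploitation + exploration
```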

  • The False Promise of Imitating Proprietary LLMs. UC Berkeley researchers found that fine-tuning weaker open-source language models to imitate proprietary LLMs yields improvements confined to tasks heavily represented in the imitation data, and fails to close the capability gap on unsupported tasks. They conclude that improving base model capabilities, rather than imitation, is the most promising path for open-source language models.

  • LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond. A recent paper examines how well large language models detect factual inconsistencies in summaries. Although some LLMs performed well, most struggled at the task, and the authors identify problems with current benchmarks, introducing SummEdits, a new benchmark for evaluating factual-inconsistency detection.

  • BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing. BLIP-Diffusion is a subject-driven text-to-image generation model that uses a pre-trained multimodal encoder and a two-stage pre-training strategy for efficient scaling. It enables zero-shot subject-driven generation, fine-tunes up to 20x faster than DreamBooth, and can be extended to other applications without extra training.

Thank you for reading! If you want to learn more about NLP, remember to follow NLPlanet. You can find us on LinkedIn, Twitter, Medium, and our Discord server!