Weekly AI and NLP News — April 24th 2023
Stack Overflow will charge for its training data, Microsoft's new AI chips, and more
Here are your weekly articles, guides, and news about NLP and AI chosen for you by NLPlanet!
😎 News From The Web
Stack Overflow Will Charge AI Giants for Training Data. Stack Overflow plans to charge large-scale AI developers for access to its 50M questions and answers, with the stated aim of boosting LLM data quality and speeding up high-quality LLM development.
RedPajama, a project to create leading open-source models, starts by reproducing the LLaMA training dataset of over 1.2 trillion tokens. RedPajama aims to reproduce the LLaMA models with a 7B-parameter model, using a filtered dataset of 1.2T tokens. Its goal is open-source reproducibility.
Humane previews AI-powered wearable. Humane debuted its AI wearable, which projects information without needing a nearby cell phone. The company has raised $230M, with funding from Microsoft, the OpenAI CEO, and others.
Microsoft Readies AI Chip as Machine Learning Costs Surge. Microsoft is developing a chip called Athena for large language models, aiming to save money and time. Amazon, Google, and Facebook are creating their own AI chips too.
Stability AI Launches the First of its StableLM Suite of Language Models. Stability AI launched StableLM, an open-source language model (3-7B parameters) for text/code that performs well in conversational and coding tasks. Fine-tuned models are available for research.
Introducing Atlassian Intelligence. Atlassian has introduced Atlassian Intelligence, an AI-powered teammate that builds custom teamwork graphs, understands natural language, and answers queries.
📚 Guides From The Web
Intelligence Superabundance. The article examines how AI, by increasing the overall supply of intelligence, can create more demand for tasks that require it. It discusses the effect across various industries.
What is Fuzzy Logic, Robotics & Future of Artificial Intelligence? This article covers the concepts of fuzzy logic and robotics in AI along with future applications in self-driving cars, cybernetics, healthcare, education, and decision-making systems.
🔬 Interesting Papers and Repositories
MiniGPT-4. MiniGPT-4 aligns a visual encoder with a large language model to generate detailed image descriptions and create websites.
The Embedding Archives: Millions of Wikipedia Article Embeddings in Many Languages. Cohere's Embedding Archives contain free, multilingual vectors generated from millions of Wikipedia articles, ideal for AI developers building search systems. Available on Hugging Face Datasets.
suno-ai/bark: 🔊 Text-Prompted Generative Audio Model. Bark is a multilingual text-to-audio model that generates realistic speech, music, and nonverbal sounds. It is available on GitHub for research purposes.
Evaluating Verifiability in Generative Search Engines. The paper finds that responses from generative search engines often lack full citation support: on average, only 51.5% of generated sentences are fully supported by citations. The proposed metrics aim to encourage comprehensive citation use, and the paper emphasizes the need for trustworthy, informative generative search engines.
Learning to Compress Prompts with Gist Tokens. The proposed "gisting" method compresses prompts into reusable "gist" tokens for more efficient computation, achieving up to 26x prompt compression and up to 40% fewer FLOPs with minimal loss in output quality.
Thank you for reading! If you want to learn more about NLP, remember to follow NLPlanet. You can find us on LinkedIn, Twitter, Medium, and our Discord server!