Exploring Large Language Models and Optimizing SIMD Everywhere with RISC-V and Rust
Today’s roundup covers SIMD optimization for ARM and RISC-V vector extensions, the potential of Large Language Models in autonomous driving, bias in fake news detection, the idea of Large Language Models as superpositions of perspectives, and Flux, a liquid type system for Rust. We unpack the Hacker News discussions around each arXiv paper, from the reported speedups in Google’s XNNPACK library to the challenges and opportunities of LLMs in autonomous driving and fake news detection, and the varying controllability of different LLMs.
1) SIMD Optimization for ARM and RISC-V Vector Extensions
Migrating ARM NEON intrinsics code to the RISC-V Vector Extension with SIMDe yielded speedups of 1.51x to 5.13x in Google’s XNNPACK library.
Google’s Highway and the latest version of OpenCV make it possible to write portable SIMD code, improving portability and performance in numerical computation and graphics on RISC-V and ARM vector extensions. View on HN
- Highway is a SIMD library that supports both RISC-V and ARM vector extensions.
- OpenCV has “universal intrinsics” that support RISC-V with scalable vector registers.
- Fixed-width SIMD libraries in languages like Rust, C++, and Zig bake the vector width into the code, so the same code cannot adapt to hardware implementations with different widths.
- ARM’s SVE and RISC-V’s RVV allow for portable code that can work with different vector widths.
- Most SIMD libraries require compile-time knowledge of the vector size, while Highway supports dynamically-sized SIMD.
- Handwritten assembly can provide the highest level of optimization but may not be practical for complex equations or maintainability.
- There are challenges with code duplication and branching when targeting different SIMD widths.
- The RISC-V ecosystem is growing rapidly and maturing.
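To make the fixed-width versus dynamically-sized distinction in the list above concrete, here is a minimal sketch in plain Rust. It uses no SIMD intrinsics and none of the Highway, OpenCV, or SIMDe APIs. The names `add_fixed` and `add_dynamic` and the `lanes` parameter are hypothetical, and the lane count only stands in for a hardware vector width.

```rust
/// Fixed-width style: `LANES` is a compile-time constant, so supporting another
/// vector width means compiling (and dispatching to) another copy of the function.
pub fn add_fixed<const LANES: usize>(a: &[f32], b: &[f32], out: &mut [f32]) {
    assert!(LANES > 0);
    assert!(a.len() == b.len() && a.len() == out.len());
    // Number of elements covered by full LANES-wide blocks.
    let full = a.len() - a.len() % LANES;
    for i in (0..full).step_by(LANES) {
        let xa = &a[i..i + LANES];
        let xb = &b[i..i + LANES];
        let xo = &mut out[i..i + LANES];
        // Fixed-trip-count inner loop: one "vector" operation per block.
        for l in 0..LANES {
            xo[l] = xa[l] + xb[l];
        }
    }
    // Scalar tail for the elements that do not fill a whole block.
    for i in full..a.len() {
        out[i] = a[i] + b[i];
    }
}

/// Length-agnostic style: the lane count is only known at run time, and each
/// iteration asks how many elements it may process, loosely in the spirit of
/// the strip-mined loops used with SVE and RVV.
pub fn add_dynamic(a: &[f32], b: &[f32], out: &mut [f32], lanes: usize) {
    assert!(lanes > 0);
    assert!(a.len() == b.len() && a.len() == out.len());
    let mut i = 0;
    while i < a.len() {
        // Process a full vector, or fewer elements on the final iteration.
        let vl = lanes.min(a.len() - i);
        for l in 0..vl {
            out[i + l] = a[i + l] + b[i + l];
        }
        i += vl;
    }
}
```

Calling `add_fixed::<4>` and `add_fixed::<8>` produces two separate functions, which is the code duplication the discussion mentions. `add_dynamic` has a single body that handles any width and needs no separate tail loop.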
2) Drive Like a Human: Rethinking Autonomous Driving with Large Language Models
The use of a large language model can improve autonomous driving by imitating human driving patterns and adapting through continuous learning.
Commenters raise concerns about using Large Language Models (LLMs) in autonomous driving, including hallucinations, the limited comparison to Reinforcement Learning (RL), and slow inference speeds. They also see value in LLMs for explanations and human interaction, and discuss their impact on traffic laws, motion prediction, and language processing. View on HN
- Relying on Large Language Models (LLMs) for autonomous driving raises concerns about hallucinations and slow inference speeds.
- LLMs could potentially be used as a fall-back mechanism in new situations or to predict human and other car behavior.
- The degree to which hallucinations occur in LLMs is difficult to quantify and poses a risk in safety-critical situations.
- Safety-critical AI models need to be fully explainable to ensure predictability and control.
- The ideal autonomous driving system should drive like a human, accumulating experience and using common sense to solve problems.
- LLMs can be used as a tool for robotics control and may help uncover new control methods.
- Waymo’s MotionLM and other LLM-like techniques are being explored in safety-critical environments like autonomous driving.
- The use of LLMs in self-driving cars requires addressing challenges related to environmental conditions, road obstacles, and real-time decision-making.
3) Bias in Fake News Detection of LLMs
Fake news detectors often misclassify content generated by large language models (LLMs) as fake, but detection accuracy can be improved through adversarial training and datasets.
Fake news detectors prioritize plausibility over truthfulness, but adversarial training with genuine news improves their accuracy. The discussion also covers the limits of human detection and the handling of unreliable websites and conspiracy domains. View on HN
- Fake news detectors are biased against texts generated by large language models.
- Large language models are trained to generate plausible statements, not necessarily truthful ones.
- Adversarial training with LLM-paraphrased genuine news can improve detection accuracy for both human and LLM-generated news.
- There are methodological issues in identifying reliable and unreliable news sources.
- Commenters question whether LLMs can determine if a statement is true.
- Humans can train models to detect fake news, even though they themselves may not be perfect at detecting it.
- Automated validity checks can outperform unaided human judgment.
- Humans can generate and label data with desired properties, regardless of their ability to differentiate the data after it is generated.
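The training-data idea in the list above can be sketched as a small augmentation step. This is one plausible reading of "adversarial training with LLM-paraphrased genuine news", not the paper's actual procedure. The `Example` type, the `augment_with_paraphrases` function, and the `paraphrase` callback (standing in for a call to an LLM) are all hypothetical.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Label {
    Real,
    Fake,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Author {
    Human,
    Llm,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Example {
    pub text: String,
    pub label: Label,
    pub author: Author,
}

/// Returns the original examples plus one extra example for every genuine,
/// human-written article: its paraphrase, still labeled `Real`.
/// `paraphrase` is a hypothetical stand-in for an LLM rewriting the text.
pub fn augment_with_paraphrases<F>(data: &[Example], paraphrase: F) -> Vec<Example>
where
    F: Fn(&str) -> String,
{
    let mut out = data.to_vec();
    for ex in data {
        if ex.label == Label::Real && ex.author == Author::Human {
            out.push(Example {
                text: paraphrase(&ex.text),
                // The label stays Real: only the writing style changed,
                // not the facts being reported.
                label: Label::Real,
                author: Author::Llm,
            });
        }
    }
    out
}
```

The intent is that a detector trained on the augmented set sees LLM-styled text labeled as real, so writing style alone stops being a usable signal for "fake".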
4) Large Language Models as Superpositions of Perspectives
Large Language Models (LLMs) can be viewed as superpositions of perspectives that can adopt different values and traits. Controllability varies by model: GPT-3.5 and GPT-4 are the most controllable, OpenAssistant shows some controllability, and StableVicuna and StableLM lack it. The paper also explores various methods for inducing perspectives.
5) Flux: Liquid Types for Rust
Flux is a refinement (liquid) type system for Rust that verifies low-level, pointer-manipulating programs with faster verification times and fewer annotations.
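As a rough illustration of what a liquid type adds, the sketch below pairs an ordinary Rust function with a logical predicate over its input and output. The attribute in the comment is written in the style of Flux's refinement signatures, but its exact syntax is an assumption and it is kept as a comment so the code compiles with plain rustc. The functions `clamp_nonneg` and `refinement_holds` are hypothetical examples.

```rust
// Flux-style signature (assumed syntax, shown only as a comment):
//   #[flux::sig(fn(x: i32) -> i32{v: v >= 0 && v >= x})]
// Read as: the result `v` is an i32 that is non-negative and at least `x`.
pub fn clamp_nonneg(x: i32) -> i32 {
    if x < 0 {
        0
    } else {
        x
    }
}

/// The same predicate as a run-time check. A refinement type checker aims to
/// prove this for all inputs at compile time, so no such check has to run.
pub fn refinement_holds(x: i32, v: i32) -> bool {
    v >= 0 && v >= x
}
```

Plain Rust can only test the predicate on sample inputs, for example by asserting `refinement_holds(x, clamp_nonneg(x))`. Attaching the predicate to the type lets a checker reject any implementation that violates it.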