Radar Trends to Watch: January 2024 – O’Reilly

More large language models. More and more large language models. Will the new year be any different? There is a difference in this month’s AI news, though: the focus is on tools that make models easier for people to use. Whether you customize an arXiv URL so you can ask questions about a paper, use llamafile to run a model on your laptop (make sure you have enough memory!), or use NotebookLM to query your own documents, AI is becoming widely accessible, and not just as a toy with a web interface.

Artificial intelligence

  • Adding talk2 to the beginning of any arXiv URL (e.g. talk2arxiv.org) loads the paper into an AI chat application so you can talk to it. This is a very clever use of the RAG pattern.
  • Waymo, Alphabet’s autonomous vehicle company, has reported a total of three minor injuries to people over more than 7 million miles of driving. Compare that record to Tesla, Uber, or Cruise.
  • Google’s DeepMind has used a large language model to solve a previously unsolved problem in mathematics. This is arguably the first time a language model has created information that didn’t exist before.
  • The creator of llamafile has offered a series of one-line bash scripts for laptop-based AI.
  • Microsoft has released a small language model called Phi-2. Phi-2 is a 2.7B parameter model that has been extensively trained on “textbook-quality data.” Microsoft claims that its performance is superior to Llama 2’s.
  • Claude, Anthropic’s large language model, can be used in Google Sheets via a browser extension.
  • Google’s NotebookLM is a RAG implementation for individuals: a notebook (similar to Colab or Jupyter) that allows you to upload documents and then ask questions about them.
  • The European Union is close to passing its AI Act, the world’s most significant attempt so far to regulate artificial intelligence.
  • Mistral has released Mixtral 8x7B, a mixture-of-experts model in which a router network selects which of eight sets of 7 billion parameters generates the response to each token. The results compare well with Llama 2. Both Mistral 7B and Mixtral can be run with llamafile.
  • Meta has announced Purple Llama, a project around trust and safety for large language models. They have published a set of benchmarks for evaluating model security, as well as a classifier for filtering unsafe inputs (prompts) and model outputs.
  • The Switch Kit is an open source software development kit that allows you to easily replace OpenAI with an open source language model.
  • Google has announced that its Gemini multimodal AI model is available to software developers through AI Studio and Vertex AI.
  • Progressive upscaling is a technique that starts with a low-resolution image and increases the resolution using AI. It reduces the computing power required to produce high-resolution images. It was implemented as a plugin for Stable Diffusion called DemoFusion.
  • The internet has enabled mass surveillance, but analyzing the resulting exabytes of data has always been the bottleneck. According to Bruce Schneier, AI’s ability to analyze this data and draw conclusions from it enables “mass spying.”
  • A group of over 50 organizations, including Meta, IBM, and Hugging Face, has formed the AI Alliance to focus on developing open source models.
  • DeepMind has developed an AI system that demonstrates social learning: the ability to learn how to solve a problem by observing an expert.
  • Are neural networks the only way to build artificial intelligence? Hivekit develops tools for a distributed spatial rules engine that can provide the communication layer for hives, swarms and colonies.
  • The proliferation of AI testing tools continues with GAIA, a benchmark suite designed to determine whether AI systems are actually intelligent. The benchmark consists of questions that are easy for humans to answer but difficult for computers.
  • Meta has released Seamless, a series of multilingual spoken language models. The models are capable of near real-time translation and are claimed to mimic natural human expression more closely.
  • In an experiment simulating a stock market, an AI stock trading system engaged in “insider trading” after being pressured to earn higher returns and receiving “tips” from company “employees.”
  • What is the best way to run a large language model on your laptop? Simon Willison recommends llamafile, which packages a model along with the weights as a single (large) executable file that works on multiple operating systems.
  • Further work on extracting training data from ChatGPT, this time performed against the production model, shows that these systems may be opaque, but they are not really black boxes.
  • Amazon Q is a new large language model that includes a chatbot and other tools to support office workers. Companies that subscribe to the service can customize it, giving it access to their proprietary data.
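Several items above (talk2arxiv, NotebookLM) rely on the RAG pattern: retrieve the document chunks most relevant to a question, then stuff them into the model’s prompt. Here is a minimal, purely illustrative sketch; the bag-of-words retriever stands in for the embedding model and LLM calls a real system would use, and the chunk texts and query are made up.

```python
import math
import re
from collections import Counter

def vectorize(text: str) -> Counter:
    """Toy stand-in for an embedding: a bag-of-words token count."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Rank document chunks by similarity to the query; keep the top k."""
    qv = vectorize(query)
    return sorted(chunks, key=lambda c: cosine(qv, vectorize(c)), reverse=True)[:k]

def build_prompt(query: str, chunks: list[str]) -> str:
    """Stuff the retrieved context into the prompt sent to the model."""
    context = "\n".join(retrieve(query, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Hypothetical document chunks and query, for illustration only.
chunks = [
    "Mixtral 8x7B is a mixture-of-experts model released by Mistral.",
    "Phi-2 is a 2.7B parameter model trained on textbook-quality data.",
]
print(build_prompt("Who released Mixtral?", chunks))
```

A production system would swap `vectorize` for a real embedding model, store the vectors in a vector database, and send the output of `build_prompt` to an LLM.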
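Mixtral’s mixture-of-experts design can be sketched in a few lines: a gating network scores the experts, the top-k are selected, and their outputs are blended using the renormalized gate weights. The gate scores and one-dimensional “experts” below are made up for illustration; in the real model, routing is a learned softmax applied per token inside each transformer layer.

```python
import math

NUM_EXPERTS = 8   # Mixtral routes among 8 expert networks
TOP_K = 2         # and activates 2 of them per token

def softmax(xs: list[float]) -> list[float]:
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route(gate_scores: list[float], k: int = TOP_K) -> list[tuple[int, float]]:
    """Pick the top-k experts and renormalize their gate weights."""
    ranked = sorted(range(len(gate_scores)),
                    key=lambda i: gate_scores[i], reverse=True)[:k]
    probs = softmax([gate_scores[i] for i in ranked])
    return list(zip(ranked, probs))

def moe_layer(token: float, gate_scores: list[float], experts) -> float:
    """Output = weighted sum of the selected experts' outputs."""
    return sum(w * experts[i](token) for i, w in route(gate_scores))

# Hypothetical experts: each just scales its input differently.
experts = [lambda x, s=s: s * x for s in range(1, NUM_EXPERTS + 1)]
gate_scores = [0.1, 2.0, 0.3, 1.5, 0.0, 0.2, 0.4, 0.1]  # made-up router logits
print(moe_layer(1.0, gate_scores, experts))  # blends experts 1 and 3
```

The point of the design is that only two of the eight expert networks run for any given token, so inference costs far less than the total parameter count suggests.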


Programming

  • A new language superset: Pluto is a superset of Lua. Supersetting could be the “new thing” in language design: TypeScript, Mojo, and a few others (including the early versions of C++) come to mind.
  • Virtualization within containers orchestrated by Kubernetes: Can you imagine a Kubernetes cluster running in a Docker container? Is this a good thing or proof that the complexity of a stack can grow indefinitely?
  • Google engineers propose an alternative to microservices: bounded monoliths deployed by an automated runtime environment that determines where and when they are instantiated. As Kelsey Hightower said, the deployment architecture becomes an implementation detail.
  • The OpenBao project is intended to be an open source fork of HashiCorp’s Vault, analogous to the OpenTofu fork of Terraform. There is speculation that IBM will support both projects.
  • Biscuit is a small, flexible authorization token and protocol designed for use in distributed systems. Each node can validate a Biscuit token using only public information.
  • gokrazy is a minimal Go runtime for the Raspberry Pi and (some) PCs. It minimizes maintenance by eliminating everything that is not necessary to compile and run Go programs.
  • This is clearly something you don’t need: a Brainfuck interpreter written in PostScript. (If you really must know, Brainfuck is arguably the world’s most inconvenient programming language, and PostScript is the language your computer sends to a printer.)
  • Baserow is an open source, no-code tool that combines a spreadsheet with a database. It is similar to Airtable.
  • New Programming Language of the Month: Onyx is a new programming language for generating WebAssembly (Wasm) using Wasmer as the underlying runtime.
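For a sense of how spartan Brainfuck is, here is a minimal interpreter sketched in Python (not the PostScript one linked above). It handles seven of the language’s eight commands; input (`,`) is omitted to keep the sketch short.

```python
def brainfuck(code: str, tape_len: int = 30000) -> str:
    """Interpret a Brainfuck program (without the ',' input command)."""
    tape = [0] * tape_len
    out, ptr, pc = [], 0, 0
    # Precompute a matching-bracket table for the loop commands.
    jumps, stack = {}, []
    for i, c in enumerate(code):
        if c == "[":
            stack.append(i)
        elif c == "]":
            j = stack.pop()
            jumps[i], jumps[j] = j, i
    while pc < len(code):
        c = code[pc]
        if c == ">": ptr += 1                        # move the data pointer right
        elif c == "<": ptr -= 1                      # move it left
        elif c == "+": tape[ptr] = (tape[ptr] + 1) % 256  # increment current cell
        elif c == "-": tape[ptr] = (tape[ptr] - 1) % 256  # decrement current cell
        elif c == ".": out.append(chr(tape[ptr]))    # output current cell
        elif c == "[" and tape[ptr] == 0: pc = jumps[pc]  # skip loop if cell is 0
        elif c == "]" and tape[ptr] != 0: pc = jumps[pc]  # repeat loop otherwise
        pc += 1
    return "".join(out)

# Set cell 0 to 8, loop to add 8 to cell 1 eight times (64), add 1, print.
print(brainfuck("++++++++[>++++++++<-]>+."))  # prints "A"
```

Eight single-character commands, a tape of byte cells, and nothing else: that is the entire language.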


Web

  • Anil Dash predicts the internet will get weird again – as it should. Power is shifting from the established, heavily funded walled gardens to people who just want to be creative.
  • Meta’s Threads has begun testing integration with ActivityPub, allowing Mastodon servers to access it.
  • The HTML Energy movement seeks to recapture the creativity of the early web by building websites from the ground up using HTML and forgoing powerful web frameworks.
  • The best WebAssembly runtime may not be a runtime at all: just transpile it to C.


Security

  • Researchers have discovered a man-in-the-middle attack against SSH, one of the foundations of cybersecurity.
  • A new version of SSH (SSH3) promises to be faster and more feature-rich. It is based on HTTP/3 and written in Go.
  • Security researchers have revealed two key vulnerabilities in OpenAI’s custom GPTs. Malicious actors can extract system prompts and force the system to reveal uploaded files and other data.
  • Meta has made end-to-end encryption (E2EE) the default for all Messenger and Facebook messaging users. Their E2EE implementation is based on Signal’s, and they have built a new storage and retrieval service for encrypted messages.
  • A chatbot controlled by a jailbroken language model can be used to jailbreak other chatbots. Language models are very good at formulating prompts that push other models past their guardrails, with success rates of 40% to 60%. AI security will be a central topic this year.

Quantum computing

  • IBM has developed a 1121-qubit quantum processor and a system of three 133-qubit processor chips that significantly improves the accuracy of quantum gates. Working quantum computers will probably require over a million qubits, but that’s a big step forward.
  • A research group has announced that it can carry out calculations on 48 logical (i.e. error-corrected) qubits. Although their work has a number of limitations, it is an important step towards practical quantum computing.
  • Two articles on post-quantum cryptography explain what it is and why it matters.


Biology

  • Researchers have developed a non-invasive system that can convert human thoughts into text. Users wear a cap with sensors that generate EEG data. The accuracy is not yet very high, but it is already superior to other thought-to-text technologies.
  • Artificial neural networks with brains: Researchers connected cultured human brain cells (organoids) to an interface that allowed them to transmit audio data to the organoids. They found that the organoids could recognize vowels.

Virtual and augmented reality
