While the AI group is still the largest, it’s notable that programming, web, and security are all larger than they have been in recent months. One reason is certainly that we may move AI news to other categories. But I also think AI is harder to impress than it used to be. The AI discussions have been much more about regulation and intellectual property – which makes me wonder if legislation should be a separate category.
Regardless, it is important that OpenAI now enables API users to fine-tune their GPT 4 apps. It is of course an as-a-service. And RISC-V finally seems to be being adopted seriously. Could it compete with Atom and Intel? We will see.
Learn faster. Dig deeper. See further.
AI
- OpenAI has announced ChatGPT Enterprise, a version of ChatGPT aimed at enterprise customers. ChatGPT Enterprise offers enhanced security, the promise of untrained conversations, single sign-on, an admin console, larger 32K context, higher performance, and the removal of usage restrictions.
- Facebook/Meta has released Code LLaMA, a version of their LLaMA 2 model specialized for writing code. It can be used for code generation or completion. Its context window includes 100,000 tokens, making Code LLaMA more accurate for larger programs.
- OpenAI has announced that API users can now optimize GPT-3.5 for their own applications. Fine-tuning for GPT-4 will come later. To ensure security, tuning data is passed through OpenAI’s moderation filter.
- txtai is an open source embedding database. It is a vector database specifically designed to work with natural language problems.
- TextFX is a set of tools that use Google’s PaLM 2 model to play with language. It doesn’t answer questions or write poetry; It allows users to see the possibilities in words, thereby encouraging their own creativity.
- A US judge has ruled that an AI system cannot copyright a work. In this case, the AI itself – and not the human user – should own the copyright. This decision is consistent with the Copyright Office’s guidance: providing prompts to a generative algorithm is not enough to create a copyrighted work.
- Despite ChatGPT’s approximately 50% error rate, a study shows that users prefer ChatGPT’s answers to programming questions over StackOverflow’s answers. ChatGPT’s complete, clear and polite responses seem to be the reason for this preference.
- AI was on the agenda at DefCon, and while the results of a red teaming competition won’t be released for several months, it’s clear that security remains a secondary issue and attacking current AI models is extremely easy is.
- Recognizing emotions is difficult, if not impossible. It’s not clear if there are credible use cases for this. AI systems are particularly bad at this. But companies build products.
- Watermarking has been proposed as a technique to determine whether content is generated by AI, but it is not a panacea. Here are some questions to help assess whether watermarks are useful in a particular situation.
- Zoom and Grammarly have both issued new licensing agreements that allow them to use data collected from users to train AI. Zoom backed down after customer backlash, but that begs the question: Will other apps follow?
- Using large language models for work or play is one thing, but how do you put them into production? 7 Frameworks for Serving LLMs examines some tools for serving language models.
- Simon Willison provides instructions for running LLaMA 2 on a Mac. He also provides slides and a well-edited transcript of his talk on LLMs at North Bay Python.
- PhotoGuard is a tool to protect photos and other images from manipulation by AI systems. It adds data to the image in a way that is undetectable to humans, but results in noticeable distortion when the image changes.
- C2PA is a cryptographic protocol for confirming the origin of electronic documents. It could be used to determine whether documents are generated by AI.
- Google’s DeepMind has developed a vision-voice-action model called RT-2 (Robotic Transformer 2), which combines vision and voice with the ability to control a robot. It learns from both web data (images and text) and robotic data (interactions with physical objects).
programming
- Maccarone is an extension to VSCode that allows you to “delegate” blocks of Python code to AI (GPT-4). The parts of the code under AI control are automatically updated as needed when the surrounding code changes.
- Microsoft adds Python as a scripting language for Excel formulas. Python code runs in an Azure container that contains some commonly used libraries including Matplotlib and Pandas.
- Many companies are building platform engineering teams to make software developers more effective. Here are some ideas for getting started with platform engineering.
- A Google study of Rust’s internal usage supports the claim that Rust makes it easier to create high-quality code. The study also debunks a number of myths about the language. It’s not as hard to learn as most people think (but that’s also a Google study).
- deno_python is a Javascript module that provides integration between Javascript (running on Deno) and Python, allowing Javascript programmers to call key Python libraries and Python functions.
- The Python Steering Council has announced that it will make the Global Interpreter Lock (GIL) optional in a future version of Python. Python’s GIL has long been a barrier to effective multi-threaded computing. The change will be backwards compatible.
network
- Google’s controversial Web Environment Integrity proposal provides web servers with a way to cryptographically authenticate the browser software making a request. WEI could potentially reduce online fraud, but it also poses significant privacy risks.
- Trafilatura is a new tool for web scraping developed based on quantitative research (e.g. compilation of training data for language models). It can extract text and metadata from HTML and generate output in various formats.
- Astro is another open source web framework designed for high performance and easy development.
- Although the “browser wars” are far behind us, it is still difficult for developers to write code that works correctly on all browsers. Baseline is a project from the W3C’s WebDX Community Group that specifies what features web developers can use in the most commonly used browsers.
- How large language models helped redesign a website raises some important questions: When do you stop using ChatGPT and finish the work yourself? When does your own performance begin to atrophy?
- Remember Flash? There’s a museum… And Flash games run in a modern browser using Ruffle, a Flash Player emulator written in WebAssembly.
Security
- Proof-of-Work enters the Tor network. It is used to defend against denial of service attacks. PoW is disabled most of the time, but when traffic appears unusually high, it can be turned on, forcing users to “prove” their humanity (actually, their willingness to do work).
- A look back at this year’s MoveIT attack draws some important conclusions about protecting your assets. Supply chain mapping, third-party risk management, zero trust, and continuous penetration testing are important parts of a security plan.
- Bitwarden has released an open source end-to-end encrypted secrets manager. Secrets Manager enables the secure distribution of API keys, certificates and other sensitive data.
- The US government has announced the AI Cybersecurity Challenge (AIxCC). AIxCC is a two-year competition to develop AI systems that can secure critical software. There are $18.5 million in prizes and the possibility of DARPA funding for up to seven companies.
- OSC&R is the Open Source Supply Chain Attack Reference, a new project that catalogs and describes techniques used to attack software supply chains. It is based on MITER’s ATT&CK framework.
- The Lapsus$ group has emerged as one of the most effective threat actors despite being relatively straightforward. They rely on persistence, clever social engineering, and analyzing vulnerabilities in an organization’s security posture rather than compromising infrastructure.
- The NSA has released a report that provides guidance on how to protect systems from memory security flaws.
- Bruce Schneier has an important perspective on the long-term consequences of the SolarWinds attack. These consequences include the theft of a signing key for an Azure customer account, which in turn was used by attackers to access US government email accounts.
- A new generation of ransomware attacks is targeting IT professionals via fake IT tool advertisements. Although IT professionals are (presumably) more cautious and attentive than other users, they are also high-value targets.
Hardware
- Parmesan cheese makers are experimenting with adding microchips to the cheese rind to authenticate real cheese.
- Adoption of RISC-V, a royalty-free, open source instruction set architecture for microprocessors, is increasing. Could it displace ARM?
- Speculative execution errors have been discovered for current processors from Intel (“Downfall”) and AMD (“Inception”). Patches for Linux have been released.
Operations
Quantum computing
- Peter Shor, inventor of the quantum algorithm for factoring prime numbers (which in turn could be used to break most modern cryptography that is not quantum resistant), has published the lecture notes of the quantum computing course he teaches at MIT.
- Using a Honeywell quantum computer, a material has been found that can improve the efficiency of solar cells. It is likely that the first applications of quantum computing will involve simulation of quantum phenomena rather than pure calculations.
Cryptocurrency
- If you’re interested in WorldCoin’s iris scanning, a cryptographer analyzes his system’s privacy promises. He remains skeptical, but was less unimpressed than expected.
- Paypal has launched a stablecoin that is said to be fully backed by US dollars.