Steganography and LLMs: Hiding in Plain Sight?
Comments on **Bruce Schneier's** blog discuss steganographic techniques, including white text on a white background and phonological respellings of words intended to obscure meaning from **Large Language Models (LLMs)**. The discussion touches on the limitations of these techniques and ways around them, as well as related tools for TEMPEST mitigation and text watermarking.
The comments section of a recent **Bruce Schneier** blog post has sparked a discussion around steganography and its effectiveness against modern **LLMs**. Several commenters explored different techniques for concealing information within text. Here's a breakdown of the key points:
### Simple Hiding Techniques
One commenter suggested a straightforward approach: using white text on a white background to hide information from the human eye while remaining machine-readable. Another mentioned using black font on a black background, drawing a parallel to censorship tactics.
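To illustrate why white-on-white text hides nothing from software, here is a minimal sketch using Python's standard `html.parser`. The page content and the `TextExtractor` and `extract_text` names are hypothetical, made up for this example; the point is only that a text extractor ignores color styling altogether.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects every non-empty text node and ignores styling entirely."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


# Hypothetical page: the span is invisible to a reader on a white background.
PAGE = (
    '<body style="background:#fff">'
    "<p>Quarterly report attached.</p>"
    '<span style="color:#fff">meet at noon</span>'
    "</body>"
)


def extract_text(html):
    """Return the text nodes of an HTML string in document order."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


print(extract_text(PAGE))
```

The hidden span comes out alongside the visible paragraph, which is why this approach conceals content only from someone reading the rendered page.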
### Evading LLMs with Phonological Changes
**Derek Jones** described his attempts to obscure meaning from **LLMs** by introducing phonological changes to words. He presented an example sentence with deliberate misspellings, such as "phashyon es cycklyq." While he anticipated that word tokenization would hinder **LLM** decoding, he found that even smaller models could handle these alterations with relative ease.
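A rough sketch of this kind of respelling might look like the following. The substitution rules are assumptions chosen for illustration and are not the rules Derek Jones used; they only show how a handful of sound-preserving rewrites changes the spelling of every word while leaving the sentence readable aloud.

```python
import re

# Hypothetical respelling rules, applied in order. Illustrative only:
# the commenter's actual transformations are not specified in the source.
RULES = [
    (r"f", "ph"),    # "f" and "ph" sound alike
    (r"i", "y"),     # swap one vowel letter for a similar-sounding one
    (r"c\b", "q"),   # a word-final hard "c" becomes "q"
]


def respell(text):
    """Apply each rule in turn to the lowercased text."""
    out = text.lower()
    for pattern, replacement in RULES:
        out = re.sub(pattern, replacement, out)
    return out


print(respell("fashion is cyclic"))  # phashyon ys cyclyq
```

Because rules this regular are easy to invert from context, it is perhaps unsurprising that even small models coped with the altered text.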
### The Layering Problem
**Clive Robinson** emphasized that the choice of language layer for steganography is crucial. Embedding at higher layers (longer token lengths) produces more coherent stego-text, but the output can jump noticeably from one context to another. He also noted that the underlying concepts aren't new and have been discussed previously.
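As a toy illustration of embedding at the word layer, the sketch below hides one bit per word by choosing between synonyms. The scheme, the synonym pairs, and the template sentence are all assumptions for this example and were not described in the comments; real systems would need far larger vocabularies and a way to keep the surrounding context plausible.

```python
# Hypothetical synonym pairs: index 0 encodes bit 0, index 1 encodes bit 1.
PAIRS = [("big", "large"), ("quick", "fast"), ("begin", "start")]
TEMPLATE = "A {} dog made a {} dash to {} the race."


def embed(bits):
    """Pick one synonym per slot according to the bits to hide."""
    if len(bits) != len(PAIRS):
        raise ValueError("need exactly one bit per synonym pair")
    words = [pair[bit] for pair, bit in zip(PAIRS, bits)]
    return TEMPLATE.format(*words)


def extract(text):
    """Recover the bits by checking which synonym of each pair appears."""
    tokens = text.replace(".", "").split()
    bits = []
    for pair in PAIRS:
        for bit, word in enumerate(pair):
            if word in tokens:
                bits.append(bit)
    return bits


stego = embed([1, 0, 1])
print(stego)
print(extract(stego))
```

The cover sentence stays fluent because whole words are swapped, but the capacity is only three bits per sentence, which hints at the trade-off between coherence and payload at different layers.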
### TEMPEST and Soft Tempest Fonts
The discussion expanded to include TEMPEST, the codename for standards covering compromising electromagnetic emanations (sometimes expanded as "Transient Electromagnetic Pulse Emanation Standard"), and techniques for mitigating information leakage through electromagnetic radiation. Commenters mentioned "Zero Emission Pad," an older Windows program designed to counter TEMPEST vulnerabilities via font smoothing. **Clive Robinson** cited the work of **Markus G. Kuhn** at the University of Cambridge Computer Laboratory on Soft Tempest Fonts and linked to a FAQ on the topic. He cautioned that modern SDR (Software Defined Radio) technology has significantly advanced TEMPEST eavesdropping capabilities, potentially diminishing the effectiveness of older font-based approaches.
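The general idea behind font smoothing as a countermeasure is to soften sharp horizontal edges in the glyphs, since abrupt pixel transitions produce the strongest high-frequency emissions. The sketch below applies a generic three-tap smoothing kernel to a single row of pixel intensities. It is an assumption-laden illustration, not the filter used in Kuhn's Soft Tempest work or in Zero Emission Pad, and the `smooth_row` and `max_step` names are invented for this example.

```python
def smooth_row(row):
    """Low-pass a row of pixel intensities with a [0.25, 0.5, 0.25] kernel.

    Edge pixels reuse their own value for the missing neighbour.
    """
    last = len(row) - 1
    out = []
    for i, value in enumerate(row):
        left = row[max(i - 1, 0)]
        right = row[min(i + 1, last)]
        out.append(0.25 * left + 0.5 * value + 0.25 * right)
    return out


def max_step(row):
    """Largest jump between adjacent pixels, a crude proxy for edge sharpness."""
    return max(abs(a - b) for a, b in zip(row, row[1:]))


# Hypothetical glyph row: a two-pixel-wide vertical stroke on an empty background.
sharp = [0, 0, 255, 255, 0, 0]
soft = smooth_row(sharp)

print(soft)
print(max_step(sharp), max_step(soft))
```

The largest pixel-to-pixel jump is halved after smoothing, which is the kind of edge softening such fonts rely on; whether that is enough against a modern SDR receiver is exactly the doubt raised in the comments.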
### Tools and Resources
Several tools and resources were mentioned, including:
* `snowdrop`: A tool available in Debian for watermarking plaintext English.
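* `Tempest for Eliza`: A free/open-source program that demonstrates monitor insecurity by making the display emit signals that a nearby AM radio can pick up.
* `TempestSDR`: An SDR-based tool, aimed at advanced users, for reconstructing a monitor's image from its electromagnetic emanations.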
### LLMs: Overgrown Autocomplete?
One commenter, **MrC**, humorously observed that the referenced paper seemingly transitions from a clever steganography method to a realization that **LLMs** are essentially advanced autocomplete systems lacking true intelligence or intent.
In conclusion, while various steganographic techniques exist, their effectiveness against modern **LLMs** and advanced surveillance technologies remains a subject of ongoing debate and development. The comments highlight the constant cat-and-mouse game between those seeking to conceal information and those seeking to extract it.