pk.org: CS 419/Lecture Notes

Steganography & Watermarking

Hiding data

Paul Krzyzanowski – December 2025

Introduction

In the 17th century, Sir John Trevanion, an English politician during the time of the English Civil War, faced execution with little hope of escape. On the brink of death, a letter from his loyal servant arrived. To the guards, it seemed like an ordinary, rambling message offering sympathy and encouragement. But to Sir John, the letter carried a hidden meaning, one that only he could decipher.

Worthie Sir John, --Hope, that is ye beste comfort of ye afflicted, cannot much, I fear me, help you now. That I would saye to you, is this only: if ever I may be able to requite that I do owe you, stand not upon asking me. 'Tis not much that I can do: but what I can do, bee ye verie sure I wille. I knowe that, if dethe comes, if ordinary men fear it, it frights not you, accounting it for a high honour, to have such a rewarde of your loyalty. Pray yet that you may be spared this soe bitter, cup. I fear not that you will grudge any sufferings; only if bie submission you can turn them away, 'tis the part of a wise man. Tell me, an if you can, to do for you anythinge that you wolde have done. The general goes back on Wednesday. Restinge your servant to command. R.T.

By applying a secret rule that both men had agreed on earlier, he took the third character after each punctuation mark and assembled a hidden instruction: “Panel at east end of chapel slides.” When he was granted an hour in the chapel for prayer, he followed the instructions, found the hidden panel, and escaped through a concealed tunnel, evading his execution.

Whether or not this story is historically accurate, it is a good illustration of a concealment cipher (or null cipher): a message hidden in plain sight using a pattern known only to the sender and receiver. The letter is an example of steganography, which is the art of hiding the existence of a message rather than scrambling its content.

Unlike cryptography, which scrambles a message to obscure its content, steganography tries to hide the very existence of the message, often within ordinary texts, images, or objects. The word comes from the Greek steganos (covered) and graphia (writing)--literally "covered writing." Sir John's escape illustrates the power and creativity of this technique.

Steganography vs. Watermarking

Steganography and watermarking are closely related concepts, and the terms are often used interchangeably. Both techniques embed information into content. However, they have different goals.

The goal of steganography is primarily to enable one-to-one communication while hiding the existence of a message. An intruder, someone who does not know what to look for, cannot even detect that a message is present in the data. Steganography can be thought of as invisible watermarking.

Three characteristics define steganography:

  1. The cover object appears untouched, meaning that the modifications should be invisible and statistically difficult to detect.

  2. The hidden data is a payload: you embed arbitrary messages to transmit secretly.

  3. The adversary model is one of detection. If an analyst suspects the content contains hidden data, the mission fails.

The primary goal of watermarking is to create an indelible imprint on a message, such that an intruder cannot remove or replace it. It is often used to assert ownership, authenticity, or encode DRM rules. The message may be, but does not have to be, invisible. Watermarking is primarily used for one-to-many communication.

Three characteristics define watermarking:

  1. The watermark does not need to be hidden. Some might be invisible, but secrecy is not required.

  2. The watermark is tied to the object, not a message payload. It could include ownership, license information, or tracking data.

  3. The adversary model is one of removal. The watermark should survive cropping, compression, resampling, screenshotting, and other transformations.

A variant of watermarking is fingerprinting, which embeds uniquely identifying information into each copy of distributed content, allowing the source of a leaked copy to be traced back to a specific recipient.

Classic Techniques in Steganography

Steganography has been used throughout history to transmit secret information without arousing suspicion. Some classic techniques include:

Concealment ciphers (null ciphers) hide secret messages within seemingly ordinary text using predefined patterns or rules. For example, the hidden message might be revealed by taking the first letter of every word, the second letter of each word, or every nth letter in the text. A famous example comes from World War I:

Apparently neutral's protest is thoroughly discounted and ignored. Isman hard hit. Blockade issue affects pretext for embargo on byproducts, ejecting suets and vegetable oils.

Reading the second letter of each word in this message reveals "Pershing sails from NY June I".
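The extraction rule can be sketched in a few lines of Python (a minimal null-cipher decoder for the second-letter scheme):

```python
import string

def second_letters(text):
    """Decode a null cipher: take the second letter of each word."""
    letters = []
    for word in text.split():
        word = word.strip(string.punctuation)  # ignore surrounding punctuation
        if len(word) >= 2:
            letters.append(word[1].lower())
    return "".join(letters)

msg = ("Apparently neutral's protest is thoroughly discounted and ignored. "
       "Isman hard hit. Blockade issue affects pretext for embargo on "
       "byproducts, ejecting suets and vegetable oils.")
print(second_letters(msg))  # pershingsailsfromnyjunei
```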

By World War II, null ciphers were no longer used by spies but by regular people trying to beat the censor. They are now sometimes used for prison or gang communication via innocuous letters or classified ads.

Invisible ink has been used for centuries to conceal messages that only become visible under certain conditions, such as exposure to heat, ultraviolet light, or specific chemicals. During the American Revolution, both George Washington's spies and British agents used invisible ink to transmit sensitive military plans. Lemon juice, for instance, was a common choice and would become visible when exposed to heat, allowing messages to remain hidden from prying eyes until intentionally revealed.

Microdots involve shrinking messages to the size of a dot, often smaller than a period, which can be embedded within images, letters, or documents. These were widely employed during World War II by spies and intelligence agencies. German agents used microdots to conceal detailed blueprints, maps, or secret instructions on documents that appeared harmless. Magnification tools were required to reveal the information, making it an effective tool for espionage.

Hidden text in artwork or objects has been used by artists and craftspeople throughout history to communicate secret meanings or instructions. During the Reformation, Protestant dissenters in Catholic-controlled regions concealed religious or political messages in artwork. Similarly, spies in World War II encoded escape routes and instructions into maps disguised as paintings or decorative items.

Writing messages on one's head and covering them with hair is an unusual but effective technique dating back to ancient Greece. The historian Herodotus recounts how a message was written on a servant's shaved head, which was then allowed to grow hair to conceal it. The servant traveled to deliver the message, which was revealed by shaving their head again. This method ensured that the message remained completely hidden during transit.

Carefully-clipped newspaper articles allowed spies and informants to communicate messages by clipping and arranging words or phrases from newspapers. The resulting collage conveyed the secret message while appearing as an innocent collection of clippings. This technique was particularly common in the 20th century when newspapers were widely available and their contents provided plausible deniability.

Knitting patterns have been used to encode messages where the arrangement of stitches--such as purls and knits--corresponds to letters or words in a secret code. This technique was used by spies during both World Wars.

In World War I, Belgian intelligence recruited elderly women living near railway stations to monitor enemy train movements. As they knitted, they would purl a stitch when they saw an artillery train or drop a stitch (leaving a hole) when a troop railcar passed. This practice continued in World War II, where the Belgian Resistance used similar techniques. Phyllis Latour Doyle, a British secret agent, parachuted into Nazi-occupied Normandy in 1944 and coded over 100 messages into her knitting, which she passed to British intelligence. The use of knitting in espionage became so concerning that during World War II, both the United States and the UK banned the posting of written knitting patterns, fearing they might contain coded messages.

Charles Dickens' novel A Tale of Two Cities, published in 1859, features the fictional character Madame Defarge encoding the names of those condemned to die into her knitting during the French Revolution. While entirely fictional, this literary depiction predates the actual wartime use of knitting codes and may have helped popularize the concept.

Signatures like "XOXO" can contain hidden meanings. Simple patterns commonly used to signify hugs and kisses could be used as a steganographic code if the sender and recipient had a predefined understanding of what the pattern represented. Variations in spacing, capitalization, or repetition could further encode specific instructions or messages.

Word or letter substitution hides messages by subtly altering text, such as changing capitalization, spacing, or fonts. A historical example comes from prisoners in the 16th century who marked specific letters in books to encode messages. Similarly, acrostic poetry, where the first letters of each line spell out a hidden message, was often used during the Renaissance to convey covert information.

Wax-covered tablets were used in ancient Greece. Herodotus described messages carved into wood underneath a wax writing surface. The visible wax layer could contain innocent text while the hidden message was carved beneath.

Pin-prick codes involved micro-perforations over selected letters in typewritten text, books, or newspapers. This technique allowed messages to be hidden in ordinary printed material.

The purpose of all of these methods was the same: ensuring that secret messages could evade detection by blending into everyday objects and activities.

Chaffing and Winnowing

Chaffing and winnowing is a modern steganographic technique introduced by Ron Rivest in 1998. It takes its name from the process of separating wheat (the valuable grain) from chaff (the worthless husks).

This technique doesn't encrypt the message but instead hides it by pairing the real message (the "wheat") with irrelevant data (the "chaff") and transmitting both together. Each message is accompanied by a MAC or digital signature using a key known only to trusted parties. Intruders can see all the messages but can't separate the meaningful message from the noise. The method doesn't rely on hiding the data but rather on making it indistinguishable from meaningless information. It has practical applications in digital communication where authentication without encryption may be needed.
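A minimal sketch of the idea, using HMAC tags for authentication (the packet format and key handling here are illustrative, not Rivest's exact protocol):

```python
import hashlib
import hmac
import os

KEY = b"shared-secret"  # known only to the trusted sender and receiver

def tag(seq, bit):
    """Authentication tag binding a sequence number to a message bit."""
    return hmac.new(KEY, f"{seq}:{bit}".encode(), hashlib.sha256).digest()

def chaff_and_send(bits):
    """Emit (seq, bit, tag) triples: real bits with valid tags, plus chaff."""
    stream = []
    for seq, bit in enumerate(bits):
        stream.append((seq, bit, tag(seq, bit)))        # wheat: valid MAC
        stream.append((seq, 1 - bit, os.urandom(32)))   # chaff: bogus MAC
    return stream

def winnow(stream):
    """Keep only the triples whose tag verifies under the shared key."""
    kept = {}
    for seq, bit, t in stream:
        if hmac.compare_digest(t, tag(seq, bit)):
            kept[seq] = bit
    return [kept[s] for s in sorted(kept)]

assert winnow(chaff_and_send([1, 0, 1, 1])) == [1, 0, 1, 1]
```

An eavesdropper without the key sees both triples for every sequence number and cannot tell wheat from chaff; the receiver simply discards everything that fails verification.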

Image Steganography

Messages can be embedded into images. This is arguably the most common way of using steganography.

There are three common ways of hiding a message in an image:

Metadata fields

Image formats usually contain more than just pixel data. They carry metadata such as camera model, exposure settings, GPS coordinates, and user comments. JPEG, TIFF, and many raw formats support EXIF (Exchangeable Image File Format) metadata; PNG has text chunks.

This shouldn't be considered true steganography, since these fields are well known and hidden only in the sense that they are not part of the image itself. Even so, they can be an effective way to transport data covertly and to bypass content-filtering firewalls that treat images as harmless.

Least significant bit encoding (LSB steganography)

The simplest form of image steganography modifies the least significant bits of pixel values.

An image is a grid of pixels; each pixel is a set of RGB (red, green, and blue) color channels, each represented by 8 bits. Changing the highest-order bit has a dramatic effect on the color. Changing the lowest-order bit changes the value by 1 out of 256, which is usually imperceptible.

To hide a message, we can:

  1. Take the bits of the message.

  2. Walk through the pixels in some order, possibly determined by a secret key.

  3. For each relevant color channel, replace the least significant bit with a bit of the message.

The receiver, knowing the key and the embedding pattern, reads the same pixels and reconstructs the bitstream. To everyone else, the image looks almost identical to the original.

For example, a 40-megapixel camera image contains 120 million RGB color values, providing 15 megabytes of capacity for hidden data.
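The embedding steps above can be sketched on a flat list of 8-bit channel values (a simplified stand-in for decoded pixel data; a real implementation would read pixels with an image library and could use a secret key to permute the embedding order):

```python
def embed_lsb(channels, message):
    """Hide message bytes in the LSBs of a list of 8-bit channel values."""
    bits = [(byte >> i) & 1 for byte in message for i in range(7, -1, -1)]
    if len(bits) > len(channels):
        raise ValueError("cover image too small for message")
    out = list(channels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit   # replace the least significant bit
    return out

def extract_lsb(channels, n_bytes):
    """Read n_bytes back out of the least significant bits."""
    bits = [c & 1 for c in channels[:n_bytes * 8]]
    return bytes(sum(b << (7 - i) for i, b in enumerate(bits[k*8:(k+1)*8]))
                 for k in range(n_bytes))

cover = list(range(256)) * 2          # stand-in for pixel channel data
stego = embed_lsb(cover, b"hi")
assert extract_lsb(stego, 2) == b"hi"
assert max(abs(a - b) for a, b in zip(cover, stego)) <= 1  # change is tiny
```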

Frequency-domain embedding

Some compressed image formats, like JPEG, do not store raw pixel intensities. Instead, they transform each 8×8 block of pixels using the Discrete Cosine Transform (DCT) and store a matrix of frequency coefficients. Low-frequency coefficients describe smooth changes; high-frequency coefficients capture rapid changes such as edges and texture.

Human vision is more sensitive to low-frequency changes than high-frequency ones. For example, you are more likely to notice distortions in a clear blue sky (low frequency) than in a leafy tree (high frequency). JPEG compression takes advantage of this by quantizing high-frequency coefficients more aggressively.

Steganographic schemes can also exploit this. A simple frequency-domain method is:

  1. Apply the DCT to each block.

  2. Select certain mid- or high-frequency coefficients according to a key.

  3. Modify those coefficients slightly to encode the hidden bits.

  4. Apply the inverse DCT to reconstruct the image.

The recipient performs the same transform and reads the chosen coefficients. If the scheme is designed well, the visual quality remains almost unchanged, while the embedded data can still be extracted after moderate recompression or scaling.
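A toy version of the scheme on a single 8-sample block shows the idea (real JPEG embedding works on 2-D 8×8 blocks with quantization tables; this pure-Python 1-D sketch just demonstrates forcing the parity of one quantized coefficient):

```python
import math

def dct(x):
    """DCT-II of a 1-D signal (think: one row of an 8x8 block)."""
    N = len(x)
    return [sum(x[n] * math.cos(math.pi / N * (n + 0.5) * k)
                for n in range(N))
            for k in range(N)]

def idct(X):
    """Inverse of dct() above (DCT-III with normalization)."""
    N = len(X)
    return [X[0] / N + (2.0 / N) * sum(X[k] * math.cos(math.pi / N * (n + 0.5) * k)
                                       for k in range(1, N))
            for n in range(N)]

Q = 8.0  # quantization step: larger = more robust but more distortion

def embed_bit(block, coeff_index, bit):
    """Force the parity of one quantized mid-frequency coefficient."""
    X = dct(block)
    q = round(X[coeff_index] / Q)
    if q % 2 != bit:
        q += 1                      # flip parity to match the hidden bit
    X[coeff_index] = q * Q
    return idct(X)

def extract_bit(block, coeff_index):
    X = dct(block)
    return round(X[coeff_index] / Q) % 2

block = [52, 55, 61, 66, 70, 61, 64, 73]   # sample row of pixel intensities
assert extract_bit(embed_bit(block, 5, 1), 5) == 1
assert extract_bit(embed_bit(block, 5, 0), 5) == 0
```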

The choice of cover medium is crucial. Media with high noise (photos, music) work better because small changes are less noticeable. A photograph of a forest will hide modifications better than a simple graphic with solid colors.

Audio Steganography

Audio signals can also host hidden data. Digital audio is a sequence of samples, typically 16-bit or 24-bit integers. As with images, least significant bits can be modified with minimal audible impact.

More sophisticated schemes use the same psychoacoustic analysis that audio compression algorithms such as MP3 or AAC rely on: place the bits where human listeners simply won't notice the distortion. Techniques like echo hiding, phase coding, and spread spectrum can embed data within audio signals without significantly altering the audio's perceptible qualities.

Echo hiding adds low-amplitude echoes into the audio. These are too subtle for humans to perceive, but a decoder can detect them. The goal is robustness: the watermark should survive compression and other transformations.

Spread spectrum watermarks have been used by Universal Music Group on Spotify audio, though this technique is largely dying out now. Amazon has used inaudible signaling to prevent Echo devices from activating during its commercials.

Modern speech watermarking schemes often use neural networks. A recent example is Meta’s AudioSeal system, which trains one network to embed a watermark and another to detect it. The training objective is to minimize perceptible distortion between the original and watermarked audio while maximizing reliable detection, even after common audio editing operations such as cropping, filtering, noise addition, or recompression. Earlier systems such as WavMark hide a small binary payload in each short window of audio and scan across the clip to find it; AudioSeal’s localized detector is designed to find watermarks faster and more robustly.

While AudioSeal produces the best results of any audio watermarking technology to date, it is still subject to adversarial attacks. Specifically, the more information about the algorithm is disclosed to attackers, the easier it is to mount an attack that will obscure the watermark. The authors propose keeping the training parameters secret. AudioSeal is freely available on GitHub.

Video Steganography

Video files combine all the ingredients: sequences of images, audio tracks, and motion information. This means they offer high capacity for embedded data.

Techniques for video steganography build on the image and audio methods described above: data can be embedded in individual frames, in the audio track, or in motion information.

The challenge is robustness. Real-world video often gets transcoded, downscaled, or re-encoded for streaming. Practical video steganography must survive those transformations and still leave enough signal for the receiver to decode the hidden bits.

Visible watermarking, such as network logos at the bottom-right of a screen, is common but is not steganography since it is intentionally visible.

Text Steganography

The earliest examples of steganography were text-based. In the digital world, we still see variations on that theme. The key idea is that a piece of text can carry more information than what appears in the words alone. Formatting, punctuation, and word choice can all be used to encode bits.

We can roughly group text steganography into four styles: layout-based, syntactic, semantic, and markup-based.

Layout-based text steganography

Layout-based schemes rely on the fact that humans are usually insensitive to minor formatting variations.

Some simple patterns include varying the spacing between words or lines, shifting words or characters by a fraction of a point, and appending trailing whitespace to selected lines.

A famous real-world example of layout-based watermarking is attributed to former British Prime Minister Margaret Thatcher. In the 1980s, she was reportedly frustrated by cabinet documents leaking to the press. Each minister received a copy of sensitive documents that looked identical to the naked eye, but the word processors had been configured so that the pattern of spaces between words encoded an identifier for the recipient. When a document leaked, investigators examined the inter-word spacing and traced the leak back to the corresponding copy. The document layout served as a covert fingerprint.

These techniques appear both in printed documents and in digital documents such as PDFs, where exact coordinates and spacing can be controlled. They are easy to implement and can be very hard to notice visually, but they are fragile under reformatting. If the text is reflowed, copied into a different document, or passed through a formatter that normalizes whitespace and spacing, the hidden information may be destroyed.
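A Thatcher-style spacing channel can be sketched as follows (a toy illustration, not the actual scheme: one space between words encodes a 0, two spaces encode a 1):

```python
def embed_spacing(words, bits):
    """Encode bits in inter-word gaps: one space = 0, two spaces = 1."""
    if len(bits) > len(words) - 1:
        raise ValueError("not enough word gaps for the message")
    out = words[0]
    for i, word in enumerate(words[1:]):
        gap = "  " if i < len(bits) and bits[i] == 1 else " "
        out += gap + word
    return out

def extract_spacing(text):
    """Recover one bit per run of spaces."""
    bits, i = [], 0
    while i < len(text):
        if text[i] == " ":
            run = 0
            while i < len(text) and text[i] == " ":
                run += 1
                i += 1
            bits.append(1 if run == 2 else 0)
        else:
            i += 1
    return bits

cover = "the quick brown fox jumps over the lazy dog".split()
stego = embed_spacing(cover, [1, 0, 1, 1])
assert extract_spacing(stego)[:4] == [1, 0, 1, 1]
```

Note how fragile this is: any tool that normalizes whitespace erases the message, which is exactly the weakness described above.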

Syntactic text steganography

Syntactic schemes use visible punctuation and structure, but in a way that is subtle enough that the text still reads naturally.

A few common tricks include choosing between equivalent punctuation (for example, including or omitting a serial comma), selecting between variant spellings such as "color" and "colour", and deciding whether or not to hyphenate compound words.

Well-designed syntactic schemes try to keep the statistical profile of the text close to that of normal writing, to avoid simple detection based on punctuation or spelling statistics.

Semantic text steganography

Semantic steganography goes a step further and changes word choice while preserving meaning. Here the encoding alphabet consists of sets of synonyms or near-synonyms.

For example, the sender and receiver might agree that "big" encodes a 0 and "large" encodes a 1 wherever either word would fit naturally; each synonym pair in the shared codebook carries one bit.

To send a longer message, the text must contain enough places where such substitutions make sense. The more aggressively the writer uses synonyms for encoding, the more likely the prose will sound unnatural or repetitive.
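A minimal synonym-codebook encoder illustrates the mechanism (the word pairs here are arbitrary examples):

```python
# Codebook: each pair encodes one bit (first word = 0, second word = 1).
PAIRS = [("big", "large"), ("quick", "fast"), ("begin", "start")]
CODE = {w: (i, b) for i, pair in enumerate(PAIRS) for b, w in enumerate(pair)}

def embed(words, bits):
    """Replace each codebook word with the synonym encoding the next bit."""
    out, it = [], iter(bits)
    for w in words:
        if w in CODE:
            try:
                bit = next(it)
            except StopIteration:
                out.append(w)          # message exhausted; leave word alone
                continue
            out.append(PAIRS[CODE[w][0]][bit])
        else:
            out.append(w)
    return out

def extract(words):
    """Read back one bit per codebook word encountered."""
    return [CODE[w][1] for w in words if w in CODE]

cover = "we begin with a big dataset and a quick pass".split()
stego = embed(cover, [1, 0, 1])
assert extract(stego) == [1, 0, 1]
print(" ".join(stego))  # we start with a big dataset and a fast pass
```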

Modern research systems use language models to perform this type of steganography. A model can choose among multiple sentences that all look plausible but collectively encode a bitstream. These systems are a form of linguistic steganography. In practice, capacity is limited and high-quality encoding is computationally expensive. The resulting text is also vulnerable to editing: if a copy editor rewrites sentences, the embedded message is changed or destroyed.

Markup-based steganography

Once text is wrapped in markup such as HTML, XML, or PDF page descriptions, additional opportunities appear. The key observation is that many different textual representations correspond to the same rendered appearance.

Some patterns exploit the flexibility of the markup syntax: attribute ordering, the case of tag names, optional quoting, and insignificant whitespace inside tags can all vary without changing how the page renders.

Other patterns exploit non-printing or invisible content: zero-width characters, comments, hidden spans styled not to display, and unreferenced objects or pages inside a PDF.

These markup-based channels take advantage of the gap between what a human sees on the screen and what is actually stored in the file. They can be surprisingly robust: copying and pasting from a browser may carry along zero-width characters or hidden spans; forwarding a PDF usually preserves hidden objects and pages. On the other hand, tools that sanitize or normalize markup, or that convert to plain text, can strip away many of these signals.
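Zero-width characters make a concrete example of an invisible-content channel (a simple sketch; real tools often interleave the invisible characters throughout the text rather than appending them at the end):

```python
ZW0, ZW1 = "\u200b", "\u200c"  # zero-width space / zero-width non-joiner

def hide(cover, message):
    """Append the message as invisible zero-width characters."""
    bits = "".join(f"{byte:08b}" for byte in message.encode())
    payload = "".join(ZW1 if bit == "1" else ZW0 for bit in bits)
    return cover + payload

def reveal(text):
    """Collect zero-width characters and decode them back to bytes."""
    bits = "".join("1" if c == ZW1 else "0"
                   for c in text if c in (ZW0, ZW1))
    data = bytes(int(bits[i:i+8], 2) for i in range(0, len(bits), 8))
    return data.decode()

stego = hide("An ordinary sentence.", "hi")
assert stego.startswith("An ordinary sentence.")  # looks unchanged on screen
assert reveal(stego) == "hi"
```

Copy-and-paste usually carries these characters along, which is why the channel can survive casual redistribution but is destroyed by any sanitizer that strips non-printing code points.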

Use of hidden text to confuse search engines and spam filters

The same tricks also show up outside of classical steganography. Spammers and search-engine manipulators embed keywords using zero-width characters, white text on a white background, or overlapping objects to confuse email filters and inflate the apparent relevance of a page without changing what a human reader sees. Modern spam filters and search engines explicitly look for these patterns, which turns a once-stealthy channel into a useful detection signal.

Text steganography is easy to explain and implement, and it aligns nicely with the historical techniques that relied on letters and punctuation. In practice, however, it is used less often than media-based steganography because images, audio, and video offer vastly more capacity and tend to survive everyday editing operations better.

Network Steganography

We can also hide information in how data is sent over the network. The overt traffic might be an ordinary web browsing session, video call, or DNS lookup. Additional data can be embedded into fields that most observers ignore.

Examples include unused or rarely inspected protocol header fields, the timing between packets, and data encoded in the names of DNS queries.

These tricks are attractive for malware command-and-control because the covert channel is blended into traffic that security products expect to see. Detecting them often requires more than simple pattern matching.
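As an illustration of one such channel, payload bytes can be split across DNS query names (the domain here is hypothetical, and a real tunnel would also need an attacker-controlled authoritative name server to receive the queries):

```python
import base64

def to_dns_queries(payload, domain="exfil.example.com", chunk=30):
    """Encode payload bytes as a series of DNS query names."""
    enc = base64.b32encode(payload).decode().rstrip("=").lower()
    return [f"{i}-{enc[j:j+chunk]}.{domain}"
            for i, j in enumerate(range(0, len(enc), chunk))]

def from_dns_queries(queries, domain="exfil.example.com"):
    """Reassemble the payload from the observed query names."""
    labels = sorted((q[: -len(domain) - 1] for q in queries),
                    key=lambda s: int(s.split("-")[0]))
    enc = "".join(label.split("-", 1)[1] for label in labels).upper()
    enc += "=" * (-len(enc) % 8)          # restore base32 padding
    return base64.b32decode(enc)

queries = to_dns_queries(b"secret data")
assert from_dns_queries(queries) == b"secret data"
```

To a casual observer the queries look like ordinary DNS traffic, which is why DNS tunneling is a favorite exfiltration and command-and-control channel.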

Steganography for Malware Delivery and Exfiltration

Steganography has become a useful mechanism for attackers to deliver malware because malicious data can be hidden in "innocent" content, such as an image, and neither detected nor blocked by content-inspecting firewalls or intrusion detection systems. Similarly, attackers can use steganography to exfiltrate data from an organization by uploading images, audio, or other non-suspicious data.

The first documented example of malware using steganography was Duqu in 2011, which encrypted data and embedded it in a JPEG file. In 2014, the Lurk malware hid downloader URLs in the least significant bits of an image's pixels; its dropper extracted and followed those URLs to fetch the payload.

In April 2024, a report on the SteganoAmor campaign was published. The hacking group TA558 had been delivering malware through steganography, predominantly targeting the hospitality and tourism sectors in Latin America. The campaign has been implicated in over 320 attacks across various sectors and regions.

The high-level structure of the attack illustrates how steganography fits into a multi-stage infection chain.

  1. Initial delivery
    Victims received phishing emails with malicious Microsoft Office documents. The emails were sent through compromised mail servers, which made them look more legitimate to spam filters.

  2. Exploit and loader
    Opening the document triggered an exploit against CVE-2017-11882, a long-patched vulnerability in the Equation Editor component, but one that still poses a threat to systems running outdated software versions. The exploit downloaded and executed a Visual Basic script from a legitimate third-party service.

  3. Steganographic payload
    The script fetched what appeared to be a normal JPEG image from a remote server. The image contained a base64-encoded payload hidden using steganographic techniques.

  4. Extraction and final stage
    A PowerShell script extracted the hidden data from the image and used it to download and install the final malware payload, which established persistence and connected to the attacker’s infrastructure.

Security products that examined only the file type or superficial properties of downloads would see an image, not an executable. The malicious code was not delivered as a straightforward binary; it was concealed inside a media file, which is often treated as harmless.

Other examples abound; steganography has now become common even in low-end commodity malware, not just nation-state operations.

Watermarking and content authenticity

Steganography and watermarking are related but have different goals.

Watermarks are used to assert ownership, track distribution, or encode usage rules. A visible broadcaster logo in the corner of a video is a simple watermark. More sophisticated schemes hide watermarks in the frequency domain of images or audio.

A typical non-robust watermark might store a serial number in the least significant bits of image pixels. A more robust scheme spreads the watermark across many coefficients. Removing it without noticeably damaging the image becomes difficult.

Watermarking plays a central role in content authenticity, especially in the context of deepfakes and generative AI. The goal is to help users answer questions such as: Did this image come from a real camera? Has it been edited since it was captured? Was it generated by AI?

Classic Techniques in Watermarking

The original watermarks date back to 1282, when papermakers would alter the thickness of paper while it was still wet by imprinting a pattern in the paper mold. This identified the paper maker or trade guild responsible for producing the paper. The dry paper could be rolled again to create an even thickness but varying density. Later, watermarks were used in banknotes to enable the detection of authentic currency. The first use in currency was in the 1661 issue of the Stockholms Banco.

The EURion constellation (also known as Omron rings) is a pattern of five small circles repeated throughout modern banknotes. Software in scanners and image editing programs recognizes this pattern to prevent counterfeiting. The pattern is used in currency from Armenia, Australia, Canada, China, the EU, India, Japan, Mexico, Switzerland, Thailand, the UK, the US, Zimbabwe, and many other countries.

UV watermarking is used in passports, currency, and hand stamps for amusement park or club re-entry. These marks are invisible under normal light but become visible under ultraviolet illumination.

Fragile vs. Robust Watermarks

Fragile watermarks are designed to break if the content is modified. This makes them useful for authentication and tamper detection. Examples include currency, passports, and entry tickets. These become invalid if tampered with, so users have no incentive to remove them.

Robust watermarks are designed to survive transformation. This makes them useful for tracking content authorship and ownership. Examples include photos, videos, audio, and documents. Users may try to remove these, so the watermark must persist through cropping, compression, format conversion, and other modifications.

Printer watermarking and hidden identifiers

Color laser printers and some copiers add their own invisible markings. Many devices embed a pattern of tiny yellow dots on every page they print. Under normal lighting, the dots are extremely hard to see. Under magnification or blue light, the pattern becomes visible.

The dot patterns encode information such as the printer's serial number and the date and time the page was printed.

This system is usually described as a machine identification code or printer watermark. The goal is to create a robust identifier that ties a printed page to a specific device and time. The mechanism is a form of steganography in the physical world: the information is hidden in the layout of barely visible dots, not in any explicit text or barcode, and most users are unaware that it exists.

A widely publicized example occurred in 2017 with Reality Leigh Winner, a U.S. intelligence contractor who leaked a classified NSA report to a news outlet. The document was printed, carried out of a secure facility, and later published as a scanned PDF. Investigators examined the scan and were able to see the faint pattern of yellow dots under magnification. From those dots they recovered the printer’s serial number and timestamp, which helped them identify which printer had been used and which user had printed the document inside the agency.

Content Authenticity

Content authenticity refers to methods that help verify whether digital media (images, video, audio) originated from a trusted source and remains unaltered. This field combines provenance metadata with watermarking, and sometimes steganographic techniques, to allow users to detect manipulation, including deepfakes.

Watermarking Light for Videos

In 2025, a Cornell research team led by assistant professor Abe Davis and graduate student Peter Michael introduced a novel method they term noise-coded illumination. Instead of modifying video pixels after capture, special displays or lights modulate their intensity in subtle patterns that encode information about the scene. Cameras that record under this lighting naturally capture these modulations.

The idea is that the physical scene and the watermark are linked. A generator that fabricates a completely synthetic video will not have the matching lighting pattern. Even if an attacker tries to forge both the content and the code pattern, the effort is higher than just faking pixels.

For more information, see the Cornell news article.

Watermarking for deepfake detection

Instead of detecting deepfakes directly, a number of systems watermark genuine content before it is shared. Two examples illustrate this approach:

FaceSigns is a semi-fragile neural watermarking scheme. It embeds a 128-bit secret into real images. The watermark survives benign image operations like compression, scaling, or color adjustment. If a deepfake model manipulates the face, the watermark is likely to be damaged or destroyed. It achieves near-perfect detection.

FaceGuard enables users to watermark their own genuine photos before posting them. Later, if someone produces a deepfake based on those photos, the watermark either disappears or fails verification.

These methods do not try to label deepfakes directly. Instead, they label trusted originals and treat anything that cannot prove its authenticity as suspicious. That is a subtle but important shift in strategy.

Provenance metadata and C2PA

Beyond watermarking, content authenticity relies on standards for provenance metadata: structured, signed data that records how a file was created and modified.

The Content Authenticity Initiative (CAI), launched by Adobe in 2019 (with the New York Times and Twitter), develops open-source tools and promotes content provenance workflows.

The Coalition for Content Provenance and Authenticity (C2PA), formed in 2021 (a joint effort combining CAI and Microsoft/BBC's Project Origin), defines an open, royalty-free standard for Content Credentials.

The idea is to attach cryptographically protected metadata to media files. This metadata might include the device that captured the content, the time of capture, the identity of the creator or publisher, and the editing operations applied along the way.

Hashes and digital signatures bind the metadata to the pixels or samples. Each tool that edits the file can append a new signed statement. The result is a chain of assertions that describes the life of the content from capture to publication.
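The chain-of-assertions idea can be sketched with a toy manifest that links each signed statement to a hash of the content (this sketch uses an HMAC as a stand-in for the certificate-based digital signatures that C2PA actually specifies; the field names are illustrative):

```python
import hashlib
import hmac
import json

KEY = b"tool-signing-key"  # stand-in for a real signing certificate

def add_assertion(chain, content, statement):
    """Append a signed statement binding this step to the content hash."""
    entry = {
        "statement": statement,
        "content_hash": hashlib.sha256(content).hexdigest(),
        "prev": chain[-1]["sig"] if chain else None,   # link to prior step
    }
    entry["sig"] = hmac.new(KEY, json.dumps(entry, sort_keys=True).encode(),
                            hashlib.sha256).hexdigest()
    return chain + [entry]

def verify(chain, content):
    """Check every signature, the linkage, and the final content hash."""
    prev = None
    for entry in chain:
        body = {k: v for k, v in entry.items() if k != "sig"}
        sig = hmac.new(KEY, json.dumps(body, sort_keys=True).encode(),
                       hashlib.sha256).hexdigest()
        if sig != entry["sig"] or entry["prev"] != prev:
            return False
        prev = entry["sig"]
    return chain[-1]["content_hash"] == hashlib.sha256(content).hexdigest()

edited = b"pixel data v2"
chain = add_assertion([], b"pixel data v1", "captured by camera")
chain = add_assertion(chain, edited, "edited: crop, color balance")
assert verify(chain, edited)
assert not verify(chain, b"tampered pixels")
```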

Viewing tools can then tell you, for example:

This JPEG was captured by camera X at time T, then edited in application Y with the following operations.

If the hashes no longer match the pixels, or if signatures are broken, the viewer can warn that the content may have been altered outside of recognized tools.

C2PA's goal is to fight deepfakes and misinformation by giving people technical means to verify media provenance. Major companies, including Adobe, Microsoft, Google, OpenAI, Sony, Truepic, and Digimarc, are implementing Content Credentials.

OpenAI adds C2PA metadata to DALL-E 3 outputs, allowing external tools to verify image provenance, even though OpenAI acknowledges the metadata can be stripped. YouTube uses C2PA-based Content Credentials to label videos captured with unaltered real-camera footage. TikTok, the first major social media platform to adopt C2PA credentials, attaches "nutrition-label" content credentials to AI-generated content. Digimarc combines digital watermarking with the C2PA standard so that authenticity can be evaluated even if the metadata is stripped.

An increasing number of cameras also support C2PA Content Credentials.

There are limitations: the metadata can be stripped or lost when files are re-encoded, screenshotted, or uploaded to services that discard it; the system helps only where cameras, editing tools, and viewing platforms have adopted the standard; and credentials attest to how content was produced, not to whether what it depicts is true.

Still, Content Credentials and watermarking form a useful toolkit for providing some degree of confidence in the authenticity of content.

Google SynthID

Several large providers are starting to watermark AI-generated content by default. One example is Google’s SynthID family of watermarks, developed by DeepMind. For images and audio, SynthID embeds a robust watermark directly into the pixel values or waveform so that it is difficult to remove without noticeably degrading the content.

For text, SynthID works differently. SynthID-Text does not modify the characters after the fact; instead, it alters the word-selection process of the language model to introduce a statistical signature into the generated text. The system pseudorandomly assigns scores to candidate words that the model might generate and biases the model to choose higher-scoring words slightly more often. A detector that knows how the scores are assigned can then examine a piece of text and compute how over-represented the high-scoring words are. If the skew is larger than what we would expect from normal language, the detector can infer that the text was likely generated by a model using SynthID-Text.
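A toy version of this idea can be sketched as follows. This is not Google's actual algorithm: the score function below is a trivial stand-in for the keyed pseudorandom function a real system would use, and the vocabulary, bias, and threshold are made up for illustration.

```python
import random

def score(word: str) -> int:
    # stand-in for a keyed PRF: 1 = "high-scoring", 0 = "low-scoring"
    return 1 if word[0] in "abcde" else 0

def pick_word(candidates, rng, bias=0.8):
    # generation: slightly prefer high-scoring candidate words
    high = [w for w in candidates if score(w) == 1]
    if high and rng.random() < bias:
        return rng.choice(high)
    return rng.choice(candidates)

def detect(text: str, threshold=0.65) -> bool:
    # detection: are high-scoring words over-represented?
    words = text.split()
    frac_high = sum(score(w) for w in words) / len(words)
    return frac_high > threshold     # ~0.5 expected for unwatermarked text

rng = random.Random(42)
vocab = ["alpha", "bravo", "charlie", "delta", "echo",
         "foxtrot", "golf", "hotel", "india", "juliet"]
watermarked = " ".join(pick_word(vocab, rng) for _ in range(200))
plain = " ".join(rng.choice(vocab) for _ in range(200))
print(detect(watermarked), detect(plain))   # True False
```

Note that the detector needs only the text and the secret scoring function, not the model itself, and that heavy paraphrasing shrinks the statistical skew it relies on.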

SynthID is an example of watermarking applied at the model level: instead of asking users to mark their own content, the generator itself imprints content as it is created, so that trusted detectors can later identify AI-generated material even after common transformations such as copying, minor editing, or format conversion.

This approach is acknowledged to be far from foolproof. Users can make significant edits or ask another chatbot to summarize the text to remove the statistical signature.

Statistical Steganalysis

Can we detect whether content contains hidden data? This question is the domain of steganalysis, and various detection techniques exist.

Histogram analysis examines the distribution of pixel values. In natural images, histograms usually show smooth, continuous curves. Steganography, especially LSB embedding, can create unnatural patterns in the histogram.

Chi-square analysis tests whether LSB bits follow expected patterns. LSB bits are normally not perfectly random. When secret data is embedded into LSBs (especially if it's random-looking encrypted data), it tends to randomize the LSB distribution. The chi-square test measures how far the observed distribution is from the expected one.
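The pairs-of-values chi-square idea can be sketched as follows. This uses synthetic pixel data rather than a real image, and the statistic is simplified for illustration (real tools compute proper p-values over sliding windows). Embedding random bits into LSBs makes the counts of each pair of values (2k, 2k+1) nearly equal, which the statistic exposes.

```python
import random

def chi_square(pixels):
    # pairs-of-values statistic: compare the count of each even value
    # against the average of its pair (2k, 2k+1)
    counts = [0] * 256
    for p in pixels:
        counts[p] += 1
    stat = 0.0
    for k in range(128):
        expected = (counts[2 * k] + counts[2 * k + 1]) / 2
        if expected > 0:
            stat += (counts[2 * k] - expected) ** 2 / expected
    return stat     # small statistic -> pairs equalized -> suspicious

rng = random.Random(1)
# synthetic "clean" image: LSBs correlate with content, so pairs are unbalanced
clean = [2 * rng.randint(0, 127) + (0 if rng.random() < 0.8 else 1)
         for _ in range(100_000)]
# embed a random bit into every LSB
stego = [(p & ~1) | rng.randint(0, 1) for p in clean]

print(chi_square(clean) > chi_square(stego))   # True: embedding flattens the pairs
```

The gap between the two statistics shrinks as less data is embedded, which is exactly the payload-versus-detectability trade-off described below.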

Machine learning approaches train models on clean images to distinguish them from images containing steganography.

There are inherent payload capacity trade-offs: the more data you hide, the greater the risk of introducing detectable artifacts.

Advances in Secure Steganography

In 2023, researchers at the University of Oxford, in collaboration with Carnegie Mellon University, applied minimum entropy coupling to make steganography virtually impossible to detect--similar in concept to Shannon's notion of perfect secrecy. Minimum entropy coupling joins two distributions so that their mutual information is maximized while each marginal distribution is preserved. Because the distribution of the cover content is unchanged, there is no statistical clue that hidden data is present. Neural networks have been used to encode images using these techniques.


Anonymous Communication

Steganography and watermarking deal with the content of communication: hiding it, tagging it, or authenticating it. A different but related problem is how to hide who is talking to whom.

Authentication, in many ways the opposite of anonymity, is a major theme in computer security: certificates, digital signatures, passwords, and authenticated TLS sessions all verify identity. But anonymity and secrecy are also important. Sometimes we care about our privacy.

Anonymous communication is often viewed negatively, associated with spammers, scammers, trafficking in illegal goods, money laundering, and other criminal activity. However, there are legitimate uses for anonymous communication: whistleblowers leaking information, journalists protecting their sources, dissidents avoiding government censorship, law enforcement conducting investigations, and ordinary people who simply want privacy.

The Limits of Privacy Tools

Many services retain information about you: accounts, configuration settings, identity, purchase data, cloud storage (files, email, photos), and browsing history. Sites may also learn your interests through tracking cookies, which feed data mining and targeted advertising.

Browsers offer "private" browsing modes: Apple Private Browsing, Mozilla Private Browsing, Google Chrome Incognito Mode, and Microsoft InPrivate browsing. These modes discard browsing history, cookies, site data, and form entries when the session ends.

However, private browsing is not truly private: your ISP, employer, or network operator can still see your traffic; the websites you visit still see your IP address; and downloaded files and bookmarks remain on your device.

Even encrypted sessions have limitations.

Eavesdroppers can't see the plaintext, but they can see where traffic is coming from and where it's going. ISPs and companies know your IP address and can track you.

If you use a commercial VPN service (such as ExpressVPN or Proton VPN), the ISP and the recipient won't know your IP address, but the VPN provider will. A third-party VPN is essentially a single relay that encrypts traffic between you and itself.

Mobile devices also present tracking challenges. They scan for Bluetooth devices and Wi-Fi access points, broadcasting their MAC address, so ISPs and hotspots can track users by MAC address. Modern devices mitigate this with MAC address randomization: Apple devices, for example, present a different MAC address each time they connect to a new network.

The surface web, deep web, and dark web

The surface web consists of web content that can be indexed by mainstream search engines using web crawlers.

The deep web contains content that search engines cannot find, including unindexed content from dynamically-generated pages, query results from libraries, and government and corporate databases.

The dark web is part of the deep web that has been intentionally hidden. It is not accessible through standard browsers; you need special software, such as the Tor browser. Servers do not register names with DNS and sometimes use a .onion pseudo-top-level domain.

Examples of some .onion domains include:

CIA: http://ciadotgov4sjwlzihbbgxnqg3xiyrg7so2r2o3lt5wz5ypk4sxyjstad.onion/
Facebook: https://www.facebookwkhpilnemxj7asaniu7vnjjbiltxjqhye3mhbshg7kx5tfyd.onion/
ProPublica: http://p53lf57qovyuvwsc6xnrppyply3vtqm7l6pcobkmyqsiofyeznfu5uqd.onion/

What's with their length and all those random characters?

Dark web services generate a public/private key pair; the domain name is derived from the public key, encoded in base32 along with a checksum and a version number.

Note that the CIA's .onion domain starts with ciadotgov and Facebook's starts with facebook. These are referred to as vanity addresses. Services create them by repeatedly generating public/private key pairs, hashing them, and checking whether the result starts with the characters they want. It's a similar process to Bitcoin's proof of work: keep trying. Because each base32 character can take 32 values, matching a specific 7-character prefix requires about 32^7, or roughly 34 billion, attempts on average.
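The brute-force search can be sketched as follows. This simplified construction mimics the shape of v3 onion addresses (public key plus checksum plus version, base32-encoded to 56 characters) but is not Tor's exact algorithm, which uses ed25519 keys and SHA3; a short two-character prefix keeps the demonstration fast.

```python
import base64
import hashlib
import os

def onion_name(pubkey: bytes) -> str:
    # simplified v3-style address: pubkey (32B) + checksum (2B) + version (1B)
    checksum = hashlib.sha256(b".onion checksum" + pubkey).digest()[:2]
    version = b"\x03"
    return base64.b32encode(pubkey + checksum + version).decode().lower()

def find_vanity(prefix: str):
    # keep generating keys until the encoded address starts with the prefix
    attempts = 0
    while True:
        attempts += 1
        pubkey = os.urandom(32)      # stand-in for a real ed25519 public key
        name = onion_name(pubkey)
        if name.startswith(prefix):
            return name + ".onion", attempts

name, attempts = find_vanity("ab")   # 2 chars: ~32^2 = 1024 tries expected
print(name, attempts)
```

Each additional prefix character multiplies the expected work by 32, which is why long vanity prefixes like ciadotgov require enormous compute effort.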

The dark web hosts both legitimate and illicit services. Examples of legitimate services include anonymous access to news (ProPublica, NY Times, BBC News all have .onion addresses), DuckDuckGo search, SecureDrop for anonymous leaking, and even the CIA. Illicit services include markets for drugs, stolen identities, and hacking tools.

Tor: The Onion Router

Tor provides anonymous browsing through a collection of relays around the world run by non-profits, universities, and individuals. Currently there are around 8,000 active relays serving approximately 2 to 2.5 million directly-connecting users.

History

Onion routing was developed in 1995 at the U.S. Naval Research Laboratory to protect U.S. intelligence communications. The goal was to develop a way of communicating over the Internet without revealing who is talking to whom, even if someone is monitoring the network.

The Navy patented onion routing in 1998 and released the code under a free license. The project continued with MIT graduates and Tor was deployed in 2002. The Tor Project was founded in 2006 as a non-profit organization with support from the Electronic Frontier Foundation (EFF).

Anonymity Goals

Tor aims to provide two forms of anonymity:

  1. Unobservability is the inability of an observer to link participants to actions: being able to use a resource or service without others being able to observe that it is being used.

  2. Unlinkability is the inability to associate multiple actions as being related: a user may make multiple uses of resources or services without others being able to link these uses together.

From mix networks to onion routing

David Chaum's mix networks were an early proposal for anonymous email: messages are wrapped in layers of encryption and sent through a chain of mix nodes; each node collects a batch of messages, strips off one layer, shuffles the batch, and forwards the messages in a different order, so an observer cannot match inputs to outputs.

Mix networks provide strong anonymity but introduce significant latency, since nodes must wait to accumulate enough messages to form a useful batch. They are acceptable for email but not for interactive web browsing.

Onion routing takes some ideas from mix networks but removes batching in favor of low-latency forwarding. The key ideas are layered encryption along a pre-built circuit of relays, with each relay knowing only its predecessor and successor, and immediate forwarding of traffic rather than waiting to accumulate batches.

Tor is the most widely deployed onion-routing system today.

Why Multiple Relays?

A single relay is vulnerable to easy correlation attacks. If Eve, the eavesdropper, watches the entry and exit of a relay, she can observe that Alice sends something to the relay and something comes out headed for store.com. Even with a shared relay serving multiple parties and encrypted connections, correlation attacks are still possible.

Using multiple relays makes attacks more difficult because an attacker is unlikely to have access to all relays at different ISPs worldwide. Tor uses three layers of relays by default. This makes it more difficult to know where to look. Correlation by message time and size is still possible but difficult since the relays are scattered across ISPs and across the world.

The Tor Consensus Document

Every relay creates a public-private key pair. The Tor Consensus Document, signed and updated hourly by trusted directory authorities that vote on the state of the network, describes the entire Tor network: all valid relays, their public keys, IP addresses, ports, bandwidth, and which authorities agree on each relay's status. A user bootstraps by downloading this directory information. (Hidden services similarly derive their .onion domain names from their public keys.)

Circuits

Alice selects a list of relays through which her message will flow. This path is called a circuit. No node knows if the previous node is the originator or just another relay. Only the final node (the exit node) knows it is the last node. Tor relays use their own DNS resolvers, not the standard Internet DNS.

To set up a circuit, Alice first connects to Relay1 and sets up a TLS link. She does a one-way authenticated key exchange with Relay1 to agree on a symmetric key S1, and they agree on a random circuit ID number (for example, 123).

To extend the circuit to Relay2, Alice sends a message to Relay1 encrypted with S1, instructing Relay1 to extend to Relay2. Relay1 establishes a TLS link to Relay2. Alice's initial handshake with Relay2 is encrypted with Relay2's public key. Relay2 picks a random circuit ID for this data stream. Alice then does a key exchange with Relay2 to agree on symmetric key S2. All traffic to Relay2 flows through Relay1 and is encrypted with S1.

To extend to Relay3, the same process continues. Alice's messages to Relay2 are encrypted with S2 and then with S1, creating layers of encryption like an onion.

Sending Messages

When Alice sends a message to a destination, each relay strips off a layer of encryption. The message structure looks like an onion: the innermost layer contains the directive to the exit node to open a TCP connection to the destination and send messages. The response follows the same path in reverse, with each relay adding its layer of encryption.
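The layering can be sketched as follows, with a toy XOR stream cipher standing in for the AES encryption used in real Tor cells. Alice wraps the message once per relay key, innermost layer first, and each relay strips exactly one layer on the way out.

```python
import hashlib

def keystream(key: bytes, n: int) -> bytes:
    # derive n bytes of keystream from the key (toy construction)
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]

def xor_layer(key: bytes, data: bytes) -> bytes:
    ks = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))   # XOR is its own inverse

S1, S2, S3 = b"key-relay1", b"key-relay2", b"key-relay3"
message = b"open TCP connection to store.com and send this request"

# Alice builds the onion: the exit node's layer (S3) goes on first
onion = xor_layer(S1, xor_layer(S2, xor_layer(S3, message)))

# each relay peels off its own layer in turn
after_r1 = xor_layer(S1, onion)      # Relay1 sees only ciphertext for Relay2
after_r2 = xor_layer(S2, after_r1)   # Relay2 sees only ciphertext for Relay3
after_r3 = xor_layer(S3, after_r2)   # the exit node recovers the directive
print(after_r3 == message)           # True
```

No single relay sees both Alice's identity and the plaintext: Relay1 knows Alice but sees only ciphertext, while the exit node sees the directive but not who originated it.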

Importantly, Tor is not a VPN. It does not encapsulate IP or TCP packets; instead, it relays application data streams back and forth. Forwarding raw packets would make it too easy to fingerprint the originating system from its TCP header formats and responses. End-to-end TLS between the source and destination can still be used on top of Tor.

Limitations of Anonymity

Tor provides meaningful anonymity but is not perfect.

Correlation attacks are possible if an attacker can observe both the entry and exit of traffic. By correlating timing and size of data at the first relay with outputs of the last relay, an attacker may be able to determine who is talking to whom. You can make correlation attacks difficult by padding or fragmenting messages to be the same size and queuing up multiple messages to shuffle and transmit at once, but this adds latency and traffic overhead.

Compromised exit nodes present another risk. The exit node decrypts the final layer and contacts the service. If the content itself is not encrypted with TLS, the exit node can see it.

A large user base is needed to provide anonymity, and relays should be hosted by a diverse set of third parties so that traffic from many groups is mixed together. For example, a relay inside fbi.gov would reveal that all of its directly-connecting input comes from fbi.gov, which defeats the purpose.

Threats to Tor

Tor faces several ongoing threats.

Censorship is one challenge. Russia blocked most Tor nodes in December 2021, leaving approximately 300,000 Russian users (about 15 percent of Tor users) scrambling for alternatives. Tor managers responded by creating mirror sites and calling for volunteers to create Tor bridges: private nodes using a transport system known as obfs4 that disguises traffic so it doesn't appear related to Tor.

Sybil attacks involve an adversary creating many fake identities in a network to gain influence and disrupt its operation. Security researchers have identified a single anonymous entity operating hundreds of malicious Tor relays, which at their peak accounted for as much as 10 percent of all relays. If one party controls both the first and third hops of a circuit, it becomes easy to infer the information that the middle node is supposed to obfuscate.

IP spoofing attacks have targeted Tor. In October 2024, attackers spoofed Tor-related IP addresses to trigger automated abuse reports, resulting in some relays being blacklisted by major hosting providers and subsequently taken offline.

I2P and Garlic Routing

I2P (Invisible Internet Project) is an alternative to Tor that uses garlic routing instead of onion routing.

Tor uses onion routing, where each message from the source is encrypted with one layer for each relay. Garlic routing additionally bundles multiple messages together at a relay: messages headed to the same next hop, each with its own delivery instructions, are packed together and sent as a single unit. This makes traffic analysis more difficult.

Tor circuits are bidirectional, with responses taking the same path as requests. I2P "tunnels" are unidirectional--one tunnel for outbound and one for inbound traffic, both built by the client. The sender gets acknowledgement of successful message delivery.

I2P focuses on anonymous hosting of services and uses a distributed hash table (DHT) for locating information on servers and routing. Services on top of I2P include anonymously hosted websites (eepsites), email (I2P-Bote), BitTorrent file sharing (I2PSnark), and IRC chat.

Tor currently has far more users, which provides more anonymity, and focuses on anonymous access to services. I2P has a smaller user base but provides strong anonymity for hosting hidden services.

Mesh Networks for Internet-Free Communication

What do you do if the government monitors the Internet or the Internet is not available? This was the problem the 2019 Hong Kong pro-democracy protesters faced.

The solution was to use a peer-to-peer mesh network that does not rely on the Internet. The Bridgefy app uses Bluetooth to discover nearby phones running the same software. Messages hop from phone to phone until they reach their target, supporting both private and broadcast messages. Downloads of Bridgefy increased almost 4,000% over 60 days between July and September 2019.

This approach was originally designed to enable people to communicate at sporting events and concerts. It is also useful in areas hit by storms where Internet infrastructure is down.

During the 2022 Russian invasion of Ukraine, the top apps being downloaded in Ukraine were Signal (the private messaging app), Bridgefy, Maps.me (an offline mapping app), and several "walkie-talkie" apps that enable free communication without sign-ups or personal information. Ukrainians were preparing for either the loss of the Internet in their country or the closure of the free Internet behind a new digital iron curtain.


  1. See this article in Cryptiana for a discussion. The validity of this story is in doubt: although it has been retold by many authors, there do not seem to be any primary sources for it, and some of its details are questionable.