PIR-DNSSEC proposal

⚓ General    📅 2025-01-11    👤 CPerezz    👁️ 161      

CPerezz


Hey everyone. I wrote a proposal on PIR-DNSSEC. It took me some time. It is not a final version, but I would appreciate comments and feedback from anyone!

I’d like comments on specific stuff addressed on HackMD directly: https://hackmd.io/@CPerezz/ryQoCKmLi

And for general comments or feedback feel free to use the forum replies.

A proposal on PIR-DNSSEC


::success The following document is my attempt to explain as much as possible all my ideas and thoughts on the concept of PIR-DNSSEC.

TL;DR: PIR-DNSSEC is the end goal for the DNS protocol. We currently have DNS-over-TLS and DNSSEC, which provide authentication and protection against spoofing and cache poisoning.

But DNS lacks privacy, especially since major DNS operators and resolvers, like Google, have unprecedented power to mass-surveil the world through every single DNS lookup that reaches them.

This protocol is a proposed solution to that problem.
::

> If you already know how DNS and DNSSEC work or simply don’t care, you can directly skip to PIR-DNSSEC section

What is DNS?

The Domain Name System (DNS) is the global distributed database responsible for translating human-readable domain names (such as ethereum.org) into numerical IP addresses (such as 3.124.100.143) that computers use to communicate with each other. It is often compared to a phone book—though, unlike a traditional phone book, DNS is hierarchical, decentralized, and extensively cached.

Key Characteristics of DNS

  • Hierarchical structure:

    • At the root of the DNS hierarchy there are root servers (denoted by “.”).
    • Below the root are Top-Level Domains (TLDs), such as .com, .org, .net, country-code TLDs, and others.
    • Below the TLDs are second-level domains (e.g., ethereum.org), followed by possible subdomains (e.g., sub.ethereum.org), and so forth.
  • Distributed database:

    • The DNS database is spread across millions of DNS servers on the internet, each authoritative for a particular domain or zone.
  • Caching:

    • DNS records can be cached by resolvers (DNS servers or applications performing DNS resolution) for a configurable period of time (the Time To Live, or TTL). This reduces latency and load on authoritative servers.
  • Record types:

    • Common record types include A/AAAA (address records), NS (name server), MX (mail exchange), CNAME (canonical name or alias), TXT (text data), etc.
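
To make the hierarchy concrete, here is a minimal sketch that splits a name into the zones a lookup walks through, from the root down. The function name `zone_chain` is my own; it only manipulates strings and does not contact any server.

```python
def zone_chain(name: str) -> list[str]:
    """Return the names from the root down to `name`, fully qualified."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    chain = ["."]  # the root
    # Walk right to left: TLD first, then second-level domain, then subdomains.
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]) + ".")
    return chain


print(zone_chain("sub.ethereum.org"))
# ['.', 'org.', 'ethereum.org.', 'sub.ethereum.org.']
```

Note that not every label is necessarily its own zone: `sub.ethereum.org` may simply be a record inside the `ethereum.org` zone.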

How does DNS work?

:::danger Add 1 or 2 diagrams of DNS protocol and messaging diagram + the series of videos on DNS
::
  • A user attempts to visit a domain, say www.ethereum.org. The computer checks its local DNS cache or sends a query to a configured DNS resolver (which could be the DNS server of the user’s ISP (hopefully not) or a public DNS server like 1.1.1.1).

  • If the resolver does not have the domain’s record cached, it will send a query up the chain, starting from the DNS root servers (.), then to the TLD server (.org), then to the authoritative name servers for ethereum.org.

  • The authoritative name server for the domain returns the IP address (or other relevant record) for www.ethereum.org to the resolver.

  • The resolver then returns the answer to the user’s application, which can now open a direct connection to the specified IP address.

  • The resolver and potentially the local machine will cache this answer for the record’s TTL. Future lookups for www.ethereum.org during the TTL will be answered immediately from cache, saving the time it takes to send a request and receive a response over the wire.
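
The steps above can be sketched with a toy, in-memory resolver. Everything here is a simplification for illustration: the `REFERRALS` and `AUTHORITATIVE` tables stand in for real servers, the 300-second TTL is an assumed value, and no network traffic is involved.

```python
import time

# Hypothetical in-memory "servers": each zone either refers the resolver to a
# child zone or answers authoritatively. The TTL of 300 s is an assumption.
REFERRALS = {".": "org.", "org.": "ethereum.org."}
AUTHORITATIVE = {"ethereum.org.": {"www.ethereum.org.": ("3.124.100.143", 300)}}


class CachingResolver:
    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.cache = {}            # name -> (address, expiry time)
        self.upstream_queries = 0  # how many queries left the resolver

    def resolve(self, name):
        hit = self.cache.get(name)
        if hit and hit[1] > self.clock():
            return hit[0]          # answered from cache, nothing sent upstream
        zone = "."
        # Follow referrals: root -> TLD -> authoritative zone.
        while zone in REFERRALS and name.endswith(REFERRALS[zone]):
            self.upstream_queries += 1
            zone = REFERRALS[zone]
        self.upstream_queries += 1  # final query to the authoritative server
        address, ttl = AUTHORITATIVE[zone][name]
        self.cache[name] = (address, self.clock() + ttl)
        return address


resolver = CachingResolver()
print(resolver.resolve("www.ethereum.org."), resolver.upstream_queries)
print(resolver.resolve("www.ethereum.org."), resolver.upstream_queries)
```

The second call returns the same address without increasing `upstream_queries`, which is exactly the saving that caching provides until the TTL expires.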


What are the issues with plain DNS?

  • Lack of authentication: The DNS response is not cryptographically signed or verified in a standard DNS implementation. This leaves DNS traffic vulnerable to spoofing (e.g., DNS cache poisoning, man-in-the-middle attacks).

  • Lack of privacy: Queries are sent unencrypted (in the traditional DNS protocol over port 53), so anyone on the path can see which domains are being requested and respond with altered data if they have the capability.

  • Lack of security: The DNS response can suffer man-in-the-middle attacks, meaning an attacker can intercept the DNS communications sent by the user and craft a fake response with a poisoned record that tricks the user into visiting a website that isn’t the one they expect.

  • No built-in data integrity: The DNS protocol was originally designed for a smaller and more trusted network environment (the early Internet/ARPANET). Therefore, it did not include built-in data integrity checks aside from basic checksums, which are insufficient for preventing targeted forgery.


What is DNSSEC?

DNS Security Extensions (DNSSEC) is a set of extensions that adds authentication and data integrity to the DNS protocol using public-key cryptography. Its main goal is to protect clients from forged or manipulated DNS data, such as in man-in-the-middle or cache poisoning attacks.

How does DNSSEC work?

:::danger Add 1 or 2 diagrams of DNS protocol and messaging diagram + the series of videos on DNS
::
  • Chain of trust: Each DNS zone (root, TLD, domain, etc.) has a key pair, consisting of a private key used to sign the zone’s resource records (RRs) and a public key published in the zone’s DNSKEY record. A parent zone (for example, .org for ethereum.org) holds a Delegation Signer (DS) record containing a cryptographic hash of the child zone’s public key (in this case, the key of ethereum.org). The root zone’s public key is widely distributed and trusted, and is known as the trust anchor.

  • Signed resource records: DNSSEC introduces new record types (e.g., RRSIG, DNSKEY, DS, NSEC/NSEC3) to provide signatures and proof of non-existence.

  • Verification: When a resolver that supports DNSSEC queries for a domain, it requests not only the DNS records but also the associated signatures (RRSIG records), by setting the DNSSEC OK (DO) flag in its query.

    • The resolver checks that the child zone’s DNSKEY hashes to the DS record held by the parent zone, and that the DS record itself is signed by the parent zone’s key.
    • Then, using the child zone’s DNSKEY, it verifies the signature (RRSIG) on the resource record.
    • This chain of trust continues up to the root zone’s trusted key.
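
A minimal sketch of this validation walk, under heavy simplifying assumptions: signatures are textbook RSA built from small, well-known Mersenne primes (completely insecure), each zone has a single key instead of separate key-signing and zone-signing keys, and `dnskey`, `ds_record` and `validate` are hypothetical helpers, not a real DNSSEC API or wire format.

```python
import hashlib

E = 65537  # common RSA public exponent


def make_key(p, q):
    """Toy RSA key pair from two primes. Insecure; for illustration only."""
    n = p * q
    return {"n": n, "d": pow(E, -1, (p - 1) * (q - 1))}


def digest(data, n):
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(key, data):
    return pow(digest(data, key["n"]), key["d"], key["n"])


def verify(n, data, signature):
    return pow(signature, E, n) == digest(data, n)


def dnskey(key):
    """Stand-in for a DNSKEY record: the public modulus, as bytes."""
    return str(key["n"]).encode()


def ds_record(dnskey_bytes):
    """Stand-in for a DS record: a hash of the child zone's DNSKEY."""
    return hashlib.sha256(dnskey_bytes).hexdigest().encode()


def validate(trust_anchor, chain, record, rrsig):
    trusted = int(trust_anchor.decode())        # modulus of the key trusted so far
    for link in chain:
        if not verify(trusted, link["ds"], link["ds_sig"]):
            return False                        # parent never signed this DS
        if ds_record(link["dnskey"]) != link["ds"]:
            return False                        # child DNSKEY does not hash to the DS
        trusted = int(link["dnskey"].decode())  # trust moves down to the child
    return verify(trusted, record, rrsig)


root = make_key(2**61 - 1, 2**89 - 1)
org = make_key(2**61 - 1, 2**107 - 1)
eth = make_key(2**89 - 1, 2**107 - 1)
chain = [
    {"dnskey": dnskey(org), "ds": ds_record(dnskey(org)),
     "ds_sig": sign(root, ds_record(dnskey(org)))},
    {"dnskey": dnskey(eth), "ds": ds_record(dnskey(eth)),
     "ds_sig": sign(org, ds_record(dnskey(eth)))},
]
record = b"www.ethereum.org. A 3.124.100.143"
rrsig = sign(eth, record)
print(validate(dnskey(root), chain, record, rrsig))  # True
```

The validator only needs the root key in advance; every other key is accepted because a key that is already trusted vouches for its hash.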

How is it backwards-compatible?

:::danger add some diagram
::
  • Opt-in nature: DNSSEC is optional; zones can choose to sign their data or not. Resolvers can be configured in “best effort” mode, where they look for DNSSEC records but gracefully handle domains that are not yet signed.

  • Fallback behavior: If a resolver does not support DNSSEC, it will ignore DNSSEC records and proceed with plain DNS.

  • Partial deployment: TLDs and many second-level domains have gradually adopted DNSSEC, but it is not enforced for all domains. This partial deployment model means non-DNSSEC zones still work as usual.


PIR-DNSSEC

What is PIR-DNSSEC?

PIR-DNSSEC is a proposed system that combines DNS Security Extensions (DNSSEC) with Private Information Retrieval (PIR) to provide both authentication and privacy for DNS queries. In plain DNSSEC, clients can verify that a DNS response has not been tampered with, but the actual queries are still visible to the DNS resolver and potentially to on-path observers. By integrating PIR, a user can retrieve DNS records without revealing which specific record they need—even to the DNS server—while still benefiting from the authenticity guarantees of DNSSEC.

PIR Ensures Query Privacy

PIR techniques allow a client to retrieve data from a database without revealing which item is being accessed. Normally, when you perform a DNS query, the server knows exactly which domain you’re asking about. With PIR, the request is structured so the server processes an encrypted or obfuscated query and returns the correct data—but it cannot tell which data in the database was selected.

::info Even if the server logs all queries, the design of PIR means each query looks statistically similar to any other query. The server cannot link a specific client request to a specific domain. This is a significant departure from standard DNS or DNS-over-TLS, where the DNS resolver itself always knows your exact lookup. It is also one of the main goals of this work: to grant full privacy even against major DNS operators.

::

::info PIR systems can also reduce or eliminate side-channel leaks that might reveal which record is being fetched (e.g., through query size or request timing). This is especially crucial if we’re trying to hide sensitive or private domain lookups from powerful adversaries, including the operator of the DNS server itself.

::
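
To give an intuition for how a server can answer without knowing what was asked, here is a minimal sketch of the classic two-server XOR construction (information-theoretic PIR). It is not the scheme this proposal commits to, just the simplest one to show. The database contents are assumptions: only the first address appears in this post, the others are documentation addresses.

```python
import ipaddress
import secrets

# Hypothetical 4-record database of equal-length records (packed IPv4 addresses).
DATABASE = [
    ipaddress.IPv4Address(a).packed
    for a in ("3.124.100.143", "192.0.2.10", "198.51.100.20", "203.0.113.30")
]


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def make_queries(index, size):
    """Client: two bit vectors, each uniformly random when seen on its own."""
    mask_a = [secrets.randbits(1) for _ in range(size)]
    mask_b = list(mask_a)
    mask_b[index] ^= 1  # the vectors differ only at the wanted position
    return mask_a, mask_b


def answer(database, mask):
    """Server: XOR together every record selected by the mask."""
    result = bytes(len(database[0]))
    for record, bit in zip(database, mask):
        if bit:
            result = xor_bytes(result, record)
    return result


def retrieve(index):
    mask_a, mask_b = make_queries(index, len(DATABASE))
    # In a real deployment each mask goes to a different, non-colluding server.
    return xor_bytes(answer(DATABASE, mask_a), answer(DATABASE, mask_b))


print(ipaddress.IPv4Address(retrieve(0)))  # 3.124.100.143
```

Every record selected by both masks cancels out in the final XOR, so only the wanted record survives. Each server sees a random-looking bit vector, which is why privacy here depends on the two servers not comparing notes.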

Combined Benefits

End-to-End Integrity + Confidentiality: By combining DNSSEC with PIR, users get both integrity (responses cannot be forged) and privacy (nobody can see which specific domain is being queried). This offers a more complete security solution than using DNSSEC or standard encrypted DNS (like DNS-over-TLS or DNS-over-HTTPS) alone.

Defense Against Multiple Threat Vectors: With standard DNSSEC, on-path attackers cannot forge DNS replies, but they can still see the queries and potentially block them or perform traffic analysis. PIR-DNSSEC makes eavesdropping or direct observation of domain queries impractical, thereby thwarting passive surveillance and metadata analysis.

Applicability in High-Privacy Contexts: In scenarios where privacy is paramount (such as corporate environments, whistleblower platforms, or sensitive government or military networks), the combined approach ensures that both data authenticity and user anonymity are maintained at the DNS resolution layer.

Why PIR-DNSSEC?

A complete private internet

The internet’s foundational protocols were designed primarily for connectivity and openness rather than privacy. DNS, in particular, was established in an era when simply resolving names to IP addresses was the main goal; little attention was paid to preventing operators or on-path observers from monitoring or altering the queries themselves. Today, major DNS operators—especially root operators and large public resolvers—have unprecedented visibility into people’s online activities, because they effectively see every domain lookup. This information can reveal not just browsing patterns but also insights into personal habits, corporate interests, and national behaviors.

By introducing robust privacy measures into DNS (e.g., via PIR-DNSSEC), we move closer to an internet where anonymity and privacy could become default. If DNS queries become truly opaque to service providers and network adversaries, one key element of online tracking and surveillance is eliminated.

A more robust Tor

PIR-DNSSEC could reduce or, in some narrower cases, potentially eliminate the need for certain anonymity tools like Tor (at least where DNS lookups are concerned), since Tor itself currently handles domain resolution through its circuits. Ensuring DNS privacy and authenticity is a critical step toward a more comprehensive approach to internet anonymity.

This could especially help exit nodes in the Tor network, which are the ones in charge of performing DNS lookups: even these lookups could then be made with complete privacy, significantly reducing the attack surface for identifying users.

Current doubts/limitations

Performance Overhead

PIR techniques—whether computational or information-theoretic—are well-known for introducing additional computational and bandwidth overhead. Likewise, DNSSEC relies on cryptographic operations (e.g., signature verification, key retrieval) that add their own overhead. When combining these two, several performance and privacy trade-offs emerge:

  1. Layering PIR on Top of DNSSEC

    • Separate or Combined Retrieval
      DNSSEC requires fetching not just the DNS record but also its associated public keys, DS records, and RRSIGs in order to validate the response. If these items are fetched via standard DNSSEC queries, the domain or TLD being queried may be partially revealed (e.g., requesting the .org key implies we’re validating something under .org).
      • To maintain privacy at every stage, PIR can be applied both to the retrieval of the DNS record and to the retrieval of the corresponding public keys.
        ::danger However, doing so effectively doubles (or more) the number of PIR operations, each carrying significant computational cost.
        ::
      • In some designs, you might use separate PIR servers or databases for the records and the keys, or you might unify them into a single larger database from which both the records and their keys are retrieved privately.
        ::danger This increases the complexity of operating the DNS and brings higher operational and bandwidth costs.
        ::
    • Multiple Queries for Chain of Trust
      DNSSEC validation often involves multiple lookups along the chain of trust (root, TLD, domain).
      ::danger If each of these lookups is converted into a PIR query, the number of PIR operations grows with the depth of the chain. Each step potentially multiplies the computational and bandwidth overhead.
      ::
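
As a rough way to see how the number of PIR operations grows with the chain, consider this back-of-the-envelope model. The assumptions are mine: each delegated zone below the root costs one DS fetch and one DNSKEY fetch (signatures bundled with the records), the root key is a locally configured trust anchor, and the final answer costs one more query.

```python
def pir_queries_per_lookup(zones_in_chain, items_per_zone=2, answer_records=1):
    """Rough count of PIR queries for one fully private, validated lookup.

    zones_in_chain: delegated zones below the root (e.g. org and ethereum.org -> 2).
    items_per_zone: assumed private fetches per zone (DS + DNSKEY by default).
    """
    return zones_in_chain * items_per_zone + answer_records


for depth in (1, 2, 3):
    print(depth, pir_queries_per_lookup(depth))
```

Under these assumptions, a name two delegations deep needs 5 PIR queries instead of the single one that an unauthenticated private lookup would need, and every extra level adds two more.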

2. Computational Complexity

PIR Protocol Costs
  • Single-Server Computational PIR often requires heavy cryptographic computations (e.g., homomorphic encryptions or blind RSA operations), which can be expensive at large scale.
  • Multi-Server Information-Theoretic PIR can reduce per-query computational overhead but requires multiple non-colluding servers. Setting up and maintaining these servers—each with a copy of the DNS and DNSSEC data—adds both infrastructure and operational costs.
    ::danger > Also, IMVHO, it seems unrealistic to count on non-collusion when there are so few actors willing to host the DNS data and run the heavy computation these schemes require purely as an act of kindness.
    ::
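
To illustrate where the single-server cost comes from, here is a toy sketch of computational PIR built on an additively homomorphic scheme. I picked Paillier because it fits in a few lines; it is my choice for illustration, not a recommendation. The primes are tiny Mersenne primes and the database is hypothetical, so nothing here is secure or realistic in size.

```python
import ipaddress
import math
import secrets

# Toy Paillier parameters from two Mersenne primes. Far too small for real use.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P-1, Q-1): private key
MU = pow(LAM, -1, N)


def encrypt(m):
    while True:
        r = secrets.randbelow(N - 1) + 1  # random blinding factor
        if math.gcd(r, N) == 1:
            break
    return (1 + m * N) * pow(r, N, N2) % N2


def decrypt(c):
    return (pow(c, LAM, N2) - 1) // N * MU % N


def make_query(index, size):
    """Client: encrypted one-hot vector; the server cannot tell 0s from the 1."""
    return [encrypt(1 if j == index else 0) for j in range(size)]


def answer(database, query):
    """Server: homomorphically compute Enc(sum_j bit_j * record_j)."""
    result = 1
    for record, ciphertext in zip(database, query):
        result = result * pow(ciphertext, record, N2) % N2
    return result


# Hypothetical database: IPv4 addresses stored as integers smaller than N.
DATABASE = [int(ipaddress.IPv4Address(a))
            for a in ("3.124.100.143", "192.0.2.10", "198.51.100.20")]

reply = answer(DATABASE, make_query(0, len(DATABASE)))
print(ipaddress.IPv4Address(decrypt(reply)))  # 3.124.100.143
```

Note that the server performs one modular exponentiation per record for every single query, and the client uploads one ciphertext per record. That linear work over the whole database is the overhead this section is worried about.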
DNSSEC Signature Verification

For every DNS record retrieved, validating the RRSIG (signature) using the DNSKEY is a known overhead. In a PIR context, you might also need to perform these verifications in a way that does not leak which key is being used for which record (or, alternatively, you reveal it only on the client side after retrieving the keys via PIR).

  • If the client needs to verify all possible DNSKEYs or signatures to hide which domain it actually wants, that increases the client’s computation significantly.
  • Conversely, if a server tries to help with partial validation, it risks leaking partial information about the domain in question unless the server’s computation is also done via private queries or in a trusted execution environment.
:::warning We should probably be able to increase the communication cost with the client and let it act as the verifier for these intermediate steps, if possible. Otherwise, this is one of the most critical parts of the construction, because the issue is not just that it would be slow, but that it may not be possible at all in general.
::

3. Bandwidth and Latency

  • With PIR, clients often download large portions of the database (in an information-theoretic approach) or perform elaborate cryptographic exchanges (in a computational approach). This can lead to substantially more data being transferred compared to plain DNSSEC lookups.
  • If we unify the DNSSEC public keys (root, TLD, zone keys, etc.) into a single large dataset, a PIR query for that dataset may be significantly bigger than a plain DNS or DNSSEC query.

4. Mitigations and Design Considerations

Caching
DNS caching reduces the number of lookups needed. Similarly, one might cache not only DNS records but also signed keys that have been previously retrieved via PIR. However, caching must be carefully designed so that it does not inadvertently leak which records or keys the client has accessed.
::warning This is one of the things I need to dig into more. I’m not sure we can actually cache responses to requests from different individuals and serve them again from the cache. The odds are that there is no way to do so, but it is worth discussing with some experts on the topic.
::
Combined Queries

One approach is to design the PIR scheme so that clients retrieve multiple keys (or entire sets of keys) in a single query, hiding the specific key needed. This could amortize overhead across multiple queries but increases data download size.

Partial vs. Full PIR
Some hybrid designs might use PIR only for the actual DNS record query but retrieve DNSSEC public keys in a less private manner, accepting a partial privacy leak (e.g., TLD or zone-level knowledge). This reduces overhead but does not provide full privacy.
::success Even assuming this is possible to build and operate, this is one of the most appealing solutions (although not the one I’d prefer to see live, of course). We need a way to balance the extreme costs this protocol may add against some trade-off that lets users accept lower privacy in exchange for significantly lower costs.
::
Within this scenario, one can imagine that TLD keys, the keys of big zones, and possibly even some extra information could simply be requested openly, and only at internal-zone levels would we execute the PIR version of the protocol.
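
A small sketch of how a client might plan such a hybrid lookup. The function `plan_lookups` and its `private_below` parameter are hypothetical names of mine; the only thing it shows is which levels would go in the clear and which through PIR.

```python
def plan_lookups(name, private_below=2):
    """Label each level of a lookup as 'open' or 'pir'.

    Levels shallower than `private_below` (root = 0, TLD = 1, ...) are
    fetched with ordinary DNSSEC queries; deeper levels use PIR.
    """
    labels = name.rstrip(".").split(".")
    plan = [(".", "open")]
    for depth in range(1, len(labels) + 1):
        zone = ".".join(labels[-depth:]) + "."
        mode = "open" if depth < private_below else "pir"
        plan.append((zone, mode))
    return plan


for zone, mode in plan_lookups("www.ethereum.org"):
    print(f"{zone:20} {mode}")
```

With the default setting, an observer learns that the client is resolving something under `.org`, but not which domain. Raising `private_below` trades more leakage for fewer PIR operations.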
Multi-Server Deployments

Running multiple authoritative servers that each store a copy of the DNS zone or DNSSEC data can increase reliability and share the computational load of handling PIR queries. However, ensuring that none of these servers collude (if relying on information-theoretic PIR) is an additional challenge.

5. Balancing Privacy and Practical Performance

  • Trade-Off Decisions:
    Ultimately, implementers must decide how important it is to conceal every piece of DNSSEC validation data. If full privacy is a must, the system must incorporate PIR for the retrieval of records, keys, and signature data, leading to a much higher overhead.
  • Research and Optimization:
    • Significant ongoing research aims to optimize PIR protocols (e.g., using more efficient homomorphic operations or compressing the data structure). These optimizations could bring PIR overhead down to a more manageable level for DNS-based use cases.
    • Approaches to reduce the frequency of key retrieval—e.g., caching DNSKEY sets for entire TLDs—might offset repeated PIR queries, but the caching layer must be carefully designed to maintain privacy.

6. Complexity of deployment

DNSSEC deployment already requires key management, signing infrastructure, and resolver configuration. Adding PIR on top might complicate the deployment and require specialized servers and code.

7. Scalability concerns

A naive PIR approach might require large amounts of data to be transmitted, or specialized data structures that must be carefully designed for DNS.
For large zones or high query volumes, a carefully optimized approach is needed.

8. Partial or incremental adoption

Widespread adoption would require many DNS servers to support PIR. In an incremental approach, only specific resolvers or authoritative servers might offer it, limiting overall benefit until critical mass is reached.

9. Backward compatibility

PIR-DNSSEC should ideally still serve clients that only support DNSSEC without PIR, or plain DNS. Striking the right balance of backward compatibility is non-trivial.




Contact me

If you’re interested in this or want to discuss it further, share ideas, etc., you can contact me by email at the following address: pir-dnssec_contact@proton.me

Thanks! Feedback is welcome!

🏷️ dns, pir, dnssec, project-proposal

andy    2025-01-14 👍 👎

You sold me the idea with this TLDR:

“The TLDR, PIR-DNSSEC is the endgoal for the DNS protocol. We currently have DNS-over-TLS and DNSSEC. This grants authentication and spoofing & cache poisoning protection.

But DNS lacks privacy. Specially since major DNS operators and resolvers like Google, have unprecedented power to mass surveil the world through every single DNS lookup that arrives to them.“

Couple general comments:

  • Who’ll adopt this? Servers?
  • What percentage of DNSSEC is adopted over just DNS? What’s the reason/can we extrapolate that to PIR-DNSSEC?
  • What’s the order of magnitude of overhead? Any sort of intuition for a path of reducing the overhead? (outside the list of Mitigations and Design Considerations)

Again, super cool idea! :)

1

andy    2025-01-14 👍 👎

We need better markdown support for those ::Warning:: and ::danger:: labels

2

chance    2025-01-19 👍 1 👎

Adding PIR to anything is strictly good, it’s a powerful way of retrieving data. The problem isn’t figuring out how to use PIR, it’s developing a PIR scheme that actually works.

Multiple dev teams have requested PIR for retrieving leaves from relatively small merkle trees (order of 2**20 maybe). This has been an open problem for years, it’s one of the only things preventing Semaphore from scaling. Right now we have to always load the entire anonymity set, but with PIR we can retrieve leaves to construct a merkle path without downloading the whole tree. Last I looked the SOTA was insufficient for even this case: a small append only set of field elements.

The DNS has on the order of 2**32 public records that change frequently (not append only) where queries need to be resolved sub-second with minimal bandwidth. Is there an existing PIR scheme that could handle this? (and if so can we get merkle tree path retrievals using it 🙏?)

3

dohoon8    2025-01-20 👍 1 👎

I was thinking of using PIR to retrieve the merkle path of a large merkle tree from the server, but it requires logN queries, which probably makes it inefficient and also fragile when the merkle tree is updated. I found an interesting recent ORAM paper (https://eprint.iacr.org/2023/274.pdf) that addresses this problem. (ORAM is PIR + functionality to write to the database)

4