Foundations of Probabilistic Proofs

This course provides a rigorous introduction to probabilistic proofs, which have had a vast impact on theoretical computer science and have been deployed in real-world secure systems. Today probabilistic proofs are invaluable tools for delegation of computation, efficient zero knowledge, hardness of approximation, and more. In this course we study in detail several types of probabilistic proofs: Interactive Proofs (IPs), Probabilistically Checkable Proofs (PCPs), and Interactive Oracle Proofs (IOPs). The course materials consist of lecture slides and worksheets, provided below.

Context

Proofs are at the foundations of mathematics, and verifying the correctness of a mathematical proof is a fundamental computational task. The P versus NP problem, which deals precisely with the complexity of proof verification, is one of the most important open problems in all of mathematics and computer science.

Starting in the 1980s, the complexity-theoretic study of proof verification led to new notions of mathematical proofs, such as Interactive Proofs, Multi-Prover Interactive Proofs, Probabilistically Checkable Proofs, Interactive Oracle Proofs, and more. These new notions have served as tools to address many research questions in theoretical computer science and beyond, cementing probabilistic proofs as one of the most important conceptual and technical contributions to computer science as a whole. The 2012 Turing Award was given to Shafi Goldwasser and Silvio Micali for, among other things, pioneering the development of probabilistic proofs. Several Gödel Prizes were awarded to papers contributing to probabilistic proofs (in 1993, in 2001, and in 2019).

In this course we favor a treatment that highlights two astonishing capabilities of probabilistic proofs.

Delegation of computation: verifying the correctness of a computation exponentially faster than the time to execute the computation itself.
Zero knowledge: attesting the correctness of a computation without revealing information about private inputs to said computation.

These capabilities directly lead to cryptographic proofs such as zkSNARKs (zero-knowledge succinct non-interactive arguments of knowledge), which in turn have been widely deployed across various applications.

Content

The course comprises four units.

• Unit 1: Interactive Proofs. We study IPs (Interactive Proofs), a model where a prover exchanges messages with a verifier across multiple rounds. We study the main capababilities of IPs: capturing languages (believed to be) beyond NP via the sumcheck protocol and Shamir's protocol (leading to the IP=PSPACE theorem); delegation of bounded-depth circuit computations via the GKR protocol; and zero knowledge IPs.

• Unit 2: Probabilistically Checkable Proofs. We study PCPs (Probabilistically Checkable Proofs), a model where a prover outputs a proof string and the verifier queries a few locations of this proof string. We study several PCP constructions. A PCP with exponential proof length and constant query complexity, based on the Blum–Luby–Rubinfeld linearity test. A PCP with polynomial proof length and polylogarithmic query complexity, based on the Rubinfeld–Sudan low-degree test. Finally, we study how to achieve succinct verification for PCPs and delegate any nondeterministic computation in the PCP model, a seminal result in the theory of PCPs.

• Unit 3: Interactive Oracle Proofs. We study IOPs (Interactive Oracle Proofs), a model that simultaneously combines features of IPs and PCPs. We study how to construct IOPs with linear proof length, for circuit-like computations and for machine-like computations. A key component is the FRI protocol, an IOP of proximity for Reed–Solomon codes. Along the way we learn powerful techniques such as the univariate sumcheck protocol and memory checking.

• Unit 4: Additional Topics. We study probabilistic proofs that are holographic, a feature that enables succinct verification for circuit-like computations in an offline-online model. We study proof composition, and its application to establish the PCP Theorem. We compare the power of public-coin versus private-coin proof systems. We study limitations of IPs, PCPs, and IOPs. We study parallel repetition in different proof models. We study the connection between PCPs and hardness of approximation.

Prerequisites

This course assumes basic knowledge of algorithms and computational complexity. Specifically, it is useful to be familiar with the following topics: discrete probability; some abstract algebra (finite fields, polynomials); asymptotic notation and analysis of algorithms (deterministic and probabilistic); models of computation (Turing machines and circuits); and basic complexity classes (such as P, NP, BPP). The bibliographic notes at the end of Lecture 00 provide pointers to background resources to learn about these topics.

Lectures

Lectures are available on GitHub under the Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0). Each lecture is a deck of slides, in PDF and Keynote formats; the Keynote slides have animations while the PDF slides do not. Worksheets for most lectures are provided in this this PDF.

Below is the list of lectures, each with a summary and linked slides.
(You can expand all summaries or collapse all summaries.)

Lecture 00: Introduction [pdf] [key]

A brief overview of the structure and content of the course. We present theoretical and practical motivations for the material in the course.

Unit 1: Interactive Proofs

Lecture 01: Intro to IPs [pdf] [key]

We introduce interactive proofs (IPs), which extend the notion of a mathematical proof by allowing the verifier checking the statement to use randomness and exchange messages with a prover. We present an IP for the language of graph non-isomorphism (GNI), which is not known to be in BPP. We show that every language having an IP is contained in PSPACE. Finally, we discuss strategies for error reduction for an IP.

Lecture 02: Sumcheck Protocol [pdf] [key]

We show that IPs capture all languages in P^{#P}, a rich complexity class that contains NP and coNP (and is believed to be much larger than that). We introduce the concept of arithmetization, which translates logical satisfiability to algebraic constraints. We present the sumcheck protocol, a powerful tool that allows to very efficiently check the sum of a low-degree polynomial over a product domain. Along the way, we discuss the Polynomial Identity Lemma, a basic property of low-degree polynomials.

Lecture 03: IP for PSPACE [pdf] [key]

We show that every language in PSPACE has an IP. We discuss how to arithmetize the PSPACE-complete language TQBF (true fully quantified boolean formulas), and then describe Shamir's protocol (an IP for the resulting algebraic problem). We show that TQBF is PSPACE-complete.

Lecture 04: Doubly-Efficient IPs [pdf] [key]

So far we have studied IPs without considering the complexity of the honest prover. In this lecture we introduce doubly-efficient IPs, which are IPs in which the honest prover runs in polynomial time. The motivation is delegation of computation: in a doubly-efficient IP for a language the verifier should be "cheaper" than the decider of the language (without the prover's help). We describe the (bare-bones) GKR protocol, a doubly-efficient IP for bounded-depth circuit computations.

Lecture 05: Zero-Knowledge IPs [pdf] [key]

We introduce zero knowledge IPs, which enable proving that a statement is true without revealing any information (beyond the fact that the statement is true). We discuss an IP for graph isomorphism (GI), and show that it is zero knowledge against honest verifiers and also malicious verifiers. We discuss limitations of zero-knowledge IPs, leading to a surprising connection between the zero-knowledge IP for GI and the IP for GNI discussed in Lecture 1. Finally, we discuss the GMW protocol for graph 3-colorability, which leads to *computational* zero knowledge IPs for every language in NP.

Unit 2: Probabilistically Checkable Proofs

Lecture 06: Intro to PCPs [pdf] [key]

We introduce probabilistically checkable proofs (PCPs), a model of probabilistic proof wherein the prover sends a (possibly large) proof string which the verifier is able to query at arbitrary positions. We show that PCPs capture all languages in IP (and thus in PSPACE) and tease additional capabilities of PCPs such as delegation of general nondeterministic computations. Finally, we describe Killian’s protocol, a fundamental approach to achieve succinct arguments by combining a PCP and a collision-resistant hash function (a simple cryptographic primitive).

Lecture 07: Linearity Testing [pdf] [key]

We introduce proximity testing: the problem of distinguishing, via few queries to a function, whether the function belongs to a certain function class or is far from (all functions in) that class. Property testing plays a key role in achieving small query complexity for PCPs. We discuss simple examples of proximity tests, and then introduce and analyze the Blum-Ruby-Rubinfeld test for linear functions. Moreover, we discuss local correction for linear functions, which applies to functions that are close to linear. Finally, we sketch an improved analysis of the BLR test that leverages Fourier analysis.

Lecture 08: Exponential-Length PCP [pdf] [key]

We describe a PCP for NP with exponential proof length and constant query complexity. This is achieved via an elegant and powerful approach: a compiler that, leveraging linearity testing and local correction for linear functions (both from Lecture 7), transforms any linear PCP into a corresponding (standard) PCP. Linear PCPs are a variant of PCPs where the verifier makes algebraic (in this case, linear) queries to the PCP string. We describe how to construct linear PCPs for NP with polynomial (and linear) proof length and constant query complexity.

Lecture 09: Low-Degree Testing [pdf] [key]

We study proximity testing for (evaluations of) low-degree polynomials. First we describe a simple proximity test for univariate low-degree polynomials, which is essentially optimal. Then we explain how the simple test extends to the multivariate case, but yields large query complexity. This issue is overcome by the Rubinfeld-Sudan test, which achieves a query complexity that is independent of the number of variables. We analyze the test and discuss low-degree testing in general.

Lecture 10: Polynomial-Length PCP [pdf] [key]

We describe a PCP for NP with polynomial proof length and polylogarithmic query complexity. We follow an approach that is analogous to the one we used in Lecture 8: we combine a low-degree test (which we studied in Lecture 9) and a low-degree PCP (which we construct in this lecture). Along the way we use the sumcheck protocol viewed as a PCP (by unrolling the IP interaction).

Lecture 11: PCPs with Sublinear Verification [pdf] [key]

We prove that PCP=NEXP by describing a PCP for OSAT, a succinct variant of the SAT problem that is NEXP-complete. We show how to arithmetize an OSAT instance, which leads to the problem of zero-on-subcube testing for polynomials. We show a low-degree PCP for this problem, which we combine with a low-degree test. We thus see a first example of sublinear verification for PCPs: a PCP verifier whose time complexity is asymptotically less (indeed, exponentially less) than the time complexity of the (nondeterministic) decider of the language.

Lecture 12: PCP for NTIME [pdf] [key]

We show that PCPs enable delegation of general nondeterministic computations: we prove that every language in NTIME[T] has a PCP with prover time poly(T) and verifier time poly(n,logT). Constructing the PCP involves an efficient "scale down" of the PCP for NEXP, which demands a number of delicate changes, including a different NTIME-complete problem (IOSAT) and a different arithmetization via low-degree multivariate polynomials.

Unit 3: Interactive Oracle Proofs

Lecture 13: Intro to IOPs [pdf] [key]

We introduce interactive oracle proofs (IOPs), a model of probabilistic proof that simultaneously generalizes the IP model and PCP model. We show that IOPs capture the same languages as PCPs (i.e. NEXP). We revisit a PCP for NP and show that interaction can significantly reduce the proof length. Finally, we discuss how IOPs lead to succinct arguments: we describe the (interactive) BCS protocol, an analogue to Kilian’s protocol that achieves succinct arguments by combining an IOP and a collision-resistant hash function.

Lecture 14: Linear-Length IOP for Circuits [pdf] [key]

We explore how IOPs can further reduce proof length compared to PCPs. We describe an IOP for NP with linear proof length and logarithmic query complexity. Specifically, we target a popular NP-complete problem known as rank-1 constraint satisfiability (R1CS). We encode information using univariate polynomials, and accordingly discuss a univariate analogue of the sumcheck protocol. In turn, this motivates the study of IOPs of proximity for the Reed–Solomon code, which we study in the next lecture.

Lecture 15: FRI Protocol [pdf] [key]

We study the problem of testing proximity to the Reed–Solomon code in the IOP model. We motivate and discuss the FRI protocol, an IOP for proximity for Reed–Solomon codes with linear proof length and logarithmic query complexity. We also present simple lower bounds on the soundness of the FRI protocol.

Lecture 16: Analysis of FRI [pdf] [key]

This lecture concludes the analysis of FRI, by upper bounding the soundness error of the protocol.

Lecture 17: Linear-Length IOP for Machines [pdf] [key]

We describe an IOP for (nondeterministic) machine computations with linear proof length and polylogarithmic verifier time. This provides a highly efficient approach to delegate computations in the IOP model. First, we construct an IOP for automata computations. This IOP uses a zero-on-subset subprotocol as a building block, and encodes automata computations via the language of rational constraints. Then, to handle general machine computations, we show how to ensure memory consistency via permutation checks.

Unit 4: Additional Topics

Lecture 18: Holographic Proofs [pdf] [key]

We consider the problem of achieving sublinear verification for general computation. We introduce holographic IOPs, which involve a preprocessing step (run by a trusted indexer) for the non-input dependent part of the computation to be proven. We discuss definitions of holographic IPs, PCPs, and IOPs, and revisit existing protocols to show how they can introduce holography. Finally, we show that holographic proof systems, after a compilation akin to Micali and BCS, lead to preprocessing succinct arguments.

Lecture 19: Proof Composition & PCP Theorem [pdf] [key]

We start proving the PCP Theorem, a seminal result stating that every language in NP has a PCP with polynomial proof length and constant query complexity. The high-level strategy involves proof composition, a powerful paradigm that constructs a PCP from an outer PCP and an inner PCP. Roughly, the resulting PCP inherits the proof length of the outer PCP and the query complexity of the inner PCP. We introduce notions under which proof composition works: robustness and proofs of proximity. Then we prove the PCP Theorem assuming the existence of two ingredients: (1) a robust outer PCP of proximity (PCPP) with polynomial proof length and logarithmic query complexity; and (2) an inner PCPP with exponential proof length and constant query complexity. We also discuss proof composition for IOPs.

Lecture 20: Robust PCPs [pdf] [key]

We show how to transform a PCP into a robust PCP, in order to construct the first building block for the PCP Theorem. A robust PCP is one satisfying a stronger notion of soundness: with high probability, a local view is far from any local view that would make the PCP verifier accept. The transformation involves two steps: query bundling and robustification. Query bundling reduces query complexity to constant, at the expense of alphabet size. Robustification achieves constant robustness, at the expense of query complexity.

Lecture 21: PCPs of Proximity [pdf] [key]

We study PCPs of Proximity (PCPPs), which are PCPs wherein the verifier is given oracle access to a purported witness, and the soundness condition states that the verifier rejects (with high probability) if the given witness is far (in Hamming distance) from the set of valid witnesses. Then we construct PCPPs by modifying two PCPs that we previously constructed, by using a local decoder. This lecture concludes the proof of the PCP Theorem.

Lecture 22: Public vs. Private Coins & Perfect Completeness [pdf] [key]

The IP for graph non-isomorphism in Lecture 1 is an example of a private-coin IP. We show that private coins do not significantly help the IP verifier: any private-coin IP can be transformed into a public-coin IP by adding a single additional round. This reduction introduces a completeness error, which then we show how to remove.

Lecture 23: Limitations of IPs [pdf] [key]

We discuss the limitations of IPs with small communication complexity. First, we show that any IP can be decided by a deterministic machine in time exponential in the two-way communication complexity. Next, we show that any public-coin IP can be decided by a probabilistic machine in time exponential in prover-to-verifier communication. Finally, we discuss extensions of this second result to private-coin IPs.

Lecture 24: Limitations of PCPs and IOPs [pdf] [key]

We study limitations of PCPs and IOPs. We show that one-query PCPs for SAT (with small alphabet) would contradict RETH, and that two-query PCPs over a binary alphabet are trivial. We then consider limitations on soundness for PCPs, introduce the sliding scale conjecture, and motivate its statement. Finally, we investigate analogous questions for IOPs.

Lecture 25: Parallel Repetition [pdf] [key]

We discuss parallel repetition, a method to reduce the soundness error of a proof system. Perhaps surprisingly, the behavior of parallel repetition is strikingly drifferent across different models of probabilistic proof. We see this by considering parallel repetition for IPs, MIPs, and PCPs.

Lecture 26: Hardness of Approximation [pdf] [key]

In prior lectures we discussed cryptographic applications of PCPs and IOPs, as tools towards delegation of computation. In this lecture we discuss applications to hardness of approximation. PCPs can be used to show that NP-hard problems are not only hard to solve, but also hard to approximate within certain factors.

Citation

You can use the snippet below to cite this online course.

@misc{Chiesa2024-fopp-course,
    author = {Chiesa, Alessandro},
    title = {Foundations of Probabilistic Proofs (online course)},
    url = {https://fopp-course.org},
    year = {2024},
}

Acknowledgements

This course was born, in a far more modest form, at UC Berkeley in Fall 2016 and Spring 2017. In subsequent years, I reorganized, refined, and extended this course over several iterations, first at UC Berkeley and then at EPFL (as well as for two summer schools). This progress would not have been possible without the many students who valiantly took the course and helped me focus on improving the rough edges. Moreover, I am especially grateful for the valuable contributions of co-instructors and teaching assistants who helped in prior iterations of this course. Many thanks go to Gal Arnon, Marcel Dall'Agnol, Giacomo Fenzi, Ziyi Guan, Tom Gur, Inbal Livni Navon, Igor Shinkar, Nick Spooner, and Eylon Yogev!