
Why program obfuscation matters

Janmajaya Mall

April 20, 2025

Thanks to Chih-Cheng Liang for reviewing the initial drafts; and Jordi Baylina, Barry Whitehat, Sora Suegami and Khashayar Barooti for interesting use-case discussions


Program obfuscation (a.k.a. iO) is somewhat less known than other cryptographic primitives, mostly because no practical and secure construction exists. But it remains a very active area of research, with an ever-growing list of new ideas, security models, and back and forth between schemes and attacks.

What interests me about program obfuscation is its practical relevance in the world today. Too many impactful and relevant applications, some of them described later, simply wouldn’t be possible without it. Another motivation is to develop applications with minimal trust assumptions that touch people across the globe. If minimal trust weren’t a requirement, then there already exist solutions like threshold FHE, MPC, and TEEs 1.

What follows is a ramble that teases a few applications of program obfuscation and, in doing so, highlights its features and limitations, and hopefully motivates its practical relevance.

Large scale MPC

Three friends, Alice, Bob, and Charlie, are out for dinner at a fancy restaurant with the condition that the one with the highest bank balance will cover for the rest. But they have a problem: how do they find out who has the highest bank balance? Clearly, revealing the total balance in their respective bank accounts is risky.

Using program obfuscation the solution is simple. Describe a program \(\text{P}\) with a hardcoded unknown secret \(\text{S}\). The public key \(\text{Pk}_{\text{S}}\) for \(\text{S}\) is publicly known, such that arbitrary messages encrypted with \(\text{Pk}_{\text{S}}\) can be decrypted with \(\text{S}\). \(\text{P}\) takes 3 tuples and some auxiliary information as inputs. Each tuple contains one friend’s bank balance, encrypted using \(\text{Pk}_{\text{S}}\), along with a signature from the bank attesting to the balance’s validity. The auxiliary information is set to the public key of the bank.

\(\text{P}\) decrypts the balance in each tuple and checks its validity by verifying the corresponding signature using the bank’s public key. It then finds the maximum of the three balances and outputs the corresponding index.
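As a toy illustration of \(\text{P}\)’s logic (a sketch only: a SHA-256 keystream XOR stands in for public-key decryption under \(\text{S}\), and an HMAC stands in for the bank’s signature; all names and key values below are mine):

```python
import hashlib
import hmac

S = b"program-secret"            # hardcoded secret (stand-in for the decryption key)
BANK_KEY = b"bank-signing-key"   # stand-in for the bank's signing/verification key

def xor_keystream(data: bytes, key: bytes) -> bytes:
    # toy cipher: XOR with a SHA-256-derived keystream; encryption == decryption
    ks = hashlib.sha256(key).digest()[: len(data)]
    return bytes(a ^ b for a, b in zip(data, ks))

def bank_sign(balance: int) -> bytes:
    # stand-in for the bank's signature attesting the balance's validity
    return hmac.new(BANK_KEY, balance.to_bytes(8, "big"), hashlib.sha256).digest()

def P(tuples):
    # tuples: [(encrypted balance, bank signature)], one per friend
    balances = []
    for ct, sig in tuples:
        bal = int.from_bytes(xor_keystream(ct, S), "big")
        # check the balance's validity against the bank's attestation
        assert hmac.compare_digest(sig, bank_sign(bal))
        balances.append(bal)
    # output the index of the highest balance, not the balances themselves
    return max(range(len(balances)), key=lambda i: balances[i])

inputs = [(xor_keystream(b.to_bytes(8, "big"), S), bank_sign(b)) for b in (120, 450, 300)]
assert P(inputs) == 1  # the friend at index 1 has the highest balance
```

Note that only the index leaves the program; an obfuscated version of it would keep `S` hidden even from whoever runs it.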

It’s valid to point out that the same can be achieved with existing MPC schemes. However, in contrast to MPC schemes 2, program obfuscation requires minimal interaction and communication: clients send their encrypted inputs once 3 and require no further interaction (i.e. it’s free of liveness requirements). This allows for large-scale MPC.

But the key benefit of program obfuscation over any existing cryptographic primitive is that obfuscated programs hide arbitrary secrets. Thus, it becomes possible to instantiate a global encrypted state composed of private states of multiple people. Then write “obfuscated” programs that take a shard of the global state as an input, mutate it as per instructions, and put the updated shard back into the global state.

Global encrypted trust network

Anonymous sybil resistant identity

For instance, consider an encrypted global trust network 4. To construct an encrypted trust network, the first piece is an anonymous sybil-resistant identity system. But establishing an identity system that is both sybil resistant and anonymous is a hard problem. This is because sybil resistance requires a humanness check, which is hard to do while the other person remains anonymous.

One good example of a sybil-resistant but not anonymous identity system is any government identity (e.g. passport, Aadhaar, etc.). On the other hand, identity systems that use social groups for the humanness check are only somewhat sybil resistant and anonymous. This is because your friends can attest to your unique identifier, allowing you to claim that you’re a human. But nothing prevents them from attesting to multiple different unique identifiers of yours.

Considering sybil-resistant and anonymous identities are hard, let’s bootstrap an anonymous identity system from one that is sybil resistant. Without loss of generality I’ll use passports, for it allows me to gloss over the details of actually issuing the identities. But the techniques apply to any other identity system.

Passports, at least the newer ones, come with a tamper-proof microchip that stores a data blob containing the holder’s personally identifiable information (PII) along with a signature from the passport issuing authority attesting to the validity of the PII. It then becomes easy for others to check a passport’s authenticity by verifying the signature against the claimed issuing authority’s public key. Moreover, the signature helps bridge the passport to an anonymous identity 5 using program obfuscation.

Define an obfuscated program \(\text{P}_{\text{ID}}\) into which 3 secrets are hard-coded: the identity salt \(\text{SALT}_{\text{ID}}\), the authentication salt \(\text{SALT}_{\text{MAC}}\), and the key derivation salt \(\text{SALT}_{\text{KD}}\). \(\text{P}_{\text{ID}}\) takes as inputs the passport holder’s PII and the corresponding signature from the issuing authority. \(\text{P}_{\text{ID}}\) verifies the signature using the issuing authority’s public key (which is publicly known). If verification fails, it returns. Otherwise, \(\text{P}_{\text{ID}}\) produces the anonymous identifier \(\text{Id} = \text{Sha256}(\text{holder's PII} | \text{SALT}_{\text{ID}})\). It then attests to the identifier by producing a message authentication code (MAC) \(\text{MAC}_{\text{Id}} = \text{Sha256}(\text{Id} | \text{SALT}_{\text{MAC}})\). \(\text{P}_{\text{ID}}\) then generates the shared secret \(\text{SS}_{\text{Id}} = \text{Sha256}(\text{Id} | \text{SALT}_{\text{KD}})\). Finally, it outputs \(\text{Id}\), the corresponding MAC \(\text{MAC}_{\text{Id}}\), and the shared secret \(\text{SS}_{\text{Id}}\).

Actual MAC constructions use keyed hash functions, but above, with the intention to simplify, I’ve replaced the key with a salt. Since \(\text{SALT}_{\text{MAC}}\) is kept secret, it works as the key in keyed MACs. But there’s no reason why an actual keyed-hash-based MAC cannot be used.
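In Python stdlib terms, the simplification looks like this (a sketch; the salt values and identifiers are mine):

```python
import hashlib
import hmac

SALT_MAC = b"secret-authentication-salt"  # hard-coded inside the obfuscated program

def mac_via_salt(identifier: bytes) -> bytes:
    # the post's simplified MAC: hash of the message concatenated with a secret salt
    return hashlib.sha256(identifier + SALT_MAC).digest()

def mac_via_keyed_hash(identifier: bytes) -> bytes:
    # a standard keyed-hash MAC (HMAC); the secret salt plays the role of the key
    return hmac.new(SALT_MAC, identifier, hashlib.sha256).digest()

id_a = hashlib.sha256(b"holder's PII" + b"identity-salt").digest()
# verification simply recomputes the tag, which requires knowing SALT_MAC
tag = mac_via_salt(id_a)
assert hmac.compare_digest(tag, mac_via_salt(id_a))
# the two constructions are interchangeable in role here, though their outputs differ
assert mac_via_salt(id_a) != mac_via_keyed_hash(id_a)
```

Either function can play the role of \(\text{MAC}_{\text{Id}}\); the point is only that whoever holds the salt can both produce and verify tags.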

It’s clear that the anonymous identifier \(\text{Id}\) is deterministic in the passport holder’s PII. Moreover, due to \(\text{SALT}_{\text{ID}}\), the anonymous identifier \(\text{Id}\) cannot be traced back to its PII. But there’s a problem: verifying \(\text{MAC}_{\text{Id}}\) for identity \(\text{Id}\) requires \(\text{SALT}_{\text{MAC}}\), which must always remain secret except that it can be hardcoded inside an obfuscated program. And that’s what we do: obfuscate another “verifier” program \(\text{P}_{\text{V}}\) into which the authentication salt \(\text{SALT}_{\text{MAC}}\) is hard-coded. \(\text{P}_{\text{V}}\) takes as input the identity tuple \((\text{Id}, \text{MAC}_{\text{Id}})\) and outputs true if \(\text{MAC}_{\text{Id}}\) is valid for \(\text{Id}\), by checking \(\text{MAC}_{\text{Id}} = \text{Sha256}(\text{Id} | \text{SALT}_{\text{MAC}})\). Otherwise it outputs false.

The shared secret \(\text{SS}_{\text{Id}}\) is used by the identity holder to communicate secretly, using an authenticated encryption scheme (AEAD), with obfuscated programs into which the key derivation salt \(\text{SALT}_{\text{KD}}\) is hard-coded.

program P_ID {
    // hardcoded secrets
    secret salt_id;
    secret salt_mac;
    secret salt_kd;

    // public key of passport issuing authority
    public pk;

    // pii: passport holder's personal identifiable information
    // sig: signature on `pii` by passport issuing authority
    function run(pii, sig) {
        assert(verify(sig, pii, pk));
        let id = sha256(pii | salt_id);
        let mac_id = sha256(id | salt_mac);
        let ss_id = sha256(id | salt_kd);
        return (id, mac_id, ss_id)
    }
}

Pseudocode for program P_ID. The run function executes the program P_ID. Credit CC.

Encrypted trust network

The resulting identity system defines an identity as the tuple \((\text{Id}, \text{MAC}_{\text{Id}})\), and the corresponding shared secret \(\text{SS}_{\text{Id}}\) is kept private by the holder. Now assume both you and your friend are part of the said identity system. To hand out a trust score to your friend, create the trust score object \(\text{Ts} = \text{{Id: Friend's Id, Score: Trust score}}\) and encrypt it under the defined AEAD scheme using your shared secret \(\text{SS}_{\text{Id}}\), \(\text{EncTs} = \text{Enc}(\text{Ts}, \text{SS}_{\text{Id}})\) (the encryption routine of any AEAD scheme requires a nonce, which is sampled randomly here). Then share the following trust score tuple \((\text{EncTs}, \text{Id}, \text{MAC}_{\text{Id}})\) with others.

Decryption fails if it is attempted on an encrypted trust score \(\text{EncTs}\) from a user with identity \(\text{Id}\) using an incorrect shared secret \(\text{SS}_{\text{Id'}}\), and vice versa (i.e. decryption fails when attempted on an incorrectly formed ciphertext using the user’s correct shared secret \(\text{SS}_{\text{Id}}\)). Thus, a successful decryption implies authenticity of the ciphertext, which explains why an AEAD scheme is used. But it is only the identity holder and obfuscated programs that can derive \(\text{SS}_{\text{Id}}\), and thus, to verify the trust score, another program \(\text{P}_{\text{VN}}\) is obfuscated. The secrets \(\text{SALT}_{\text{MAC}}\) and \(\text{SALT}_{\text{KD}}\) are hard-coded into \(\text{P}_{\text{VN}}\). \(\text{P}_{\text{VN}}\) takes as input a trust score tuple \((\text{EncTs}, \text{Id}, \text{MAC}_{\text{Id}})\) and outputs 1 if both \(\text{MAC}_{\text{Id}}\) and \(\text{EncTs}\) are valid (i.e. \(\text{MAC}_{\text{Id}} = \text{Sha256}(\text{Id}|\text{SALT}_{\text{MAC}})\) and decryption of \(\text{EncTs}\) using \(\text{SS}_{\text{Id}} = \text{Sha256}(\text{Id}|\text{SALT}_{\text{KD}})\) succeeds).
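The authenticity property can be seen with a toy encrypt-then-MAC scheme standing in for the AEAD (Python’s stdlib has no AEAD, so this sketch, with all names and salt values mine, only illustrates the behavior):

```python
import hashlib
import hmac
import os

def _keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    # expand key and nonce into n bytes of keystream
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return out[:n]

def enc(msg: bytes, key: bytes, nonce: bytes) -> bytes:
    ct = bytes(a ^ b for a, b in zip(msg, _keystream(key, nonce, len(msg))))
    tag = hmac.new(key, nonce + ct, hashlib.sha256).digest()  # authenticate nonce + ciphertext
    return nonce + ct + tag

def dec(blob: bytes, key: bytes) -> bytes:
    nonce, ct, tag = blob[:16], blob[16:-32], blob[-32:]
    if not hmac.compare_digest(tag, hmac.new(key, nonce + ct, hashlib.sha256).digest()):
        raise ValueError("authentication failed")
    return bytes(a ^ b for a, b in zip(ct, _keystream(key, nonce, len(ct))))

SALT_KD = b"key-derivation-salt"
ss_id = hashlib.sha256(b"Id" + SALT_KD).digest()  # your shared secret
enc_ts = enc(b'{"Id": "friend", "Score": 7}', ss_id, os.urandom(16))

# decryption with the correct shared secret recovers the trust score object
assert dec(enc_ts, ss_id) == b'{"Id": "friend", "Score": 7}'
# decryption with any other shared secret is rejected
failed = False
try:
    dec(enc_ts, b"some other shared secret")
except ValueError:
    failed = True
assert failed
```

The same rejection happens for a tampered ciphertext under the correct key, which is exactly the “successful decryption implies authenticity” behavior the construction relies on.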

To find the trust score for a stranger with identity \(\text{B}\), first gather all trust score tuples from your trusted social network as the collection \(\text{TrustScoreTuples}\). Then ask \(\text{B}\) to authorize aggregating the trust scores they have received in the collection \(\text{TrustScoreTuples}\). To provide the authorization, \(\text{B}\) uses their shared secret \(\text{SS}_{\text{B}}\) to produce the MAC \(\text{MAC}_{\text{IN}} = \text{Sha256}(\text{TrustScoreTuples}|\text{SS}_{\text{B}})\) 6. Then feed \(\text{TrustScoreTuples}\), \(\text{B}\)’s identity tuple \((\text{B}, \text{MAC}_{\text{B}})\), and \(\text{MAC}_{\text{IN}}\) as inputs to the obfuscated program \(\text{P}_{\text{TN}}\).

Two secret salts \(\text{SALT}_{\text{KD}}\), \(\text{SALT}_{\text{MAC}}\) are hard-coded into \(\text{P}_{\text{TN}}\). \(\text{P}_{\text{TN}}\) first checks \(\text{MAC}_{\text{B}} = \text{Sha256}(\text{B} | \text{SALT}_{\text{MAC}})\). This verifies that \(\text{B}\) is a valid identity issued by program \(\text{P}_{\text{ID}}\). \(\text{P}_{\text{TN}}\) then derives the shared secret \(\text{SS}_{\text{B}}\) as \(\text{SS}_{\text{B}} = \text{Sha256}(\text{B} | \text{SALT}_{\text{KD}})\) and verifies that \(\text{B}\)’s data dependent authorization \(\text{MAC}_{\text{IN}}\) is valid for collection \(\text{TrustScoreTuples}\) by checking \(\text{MAC}_{\text{IN}} = \text{Sha256}(\text{TrustScoreTuples} | \text{SS}_{\text{B}})\).

\(\text{P}_{\text{TN}}\) then loops over each trust score tuple. For the \(\text{i}^{th}\) tuple \((\text{A}^\text{i}, \text{MAC}_{\text{A}^\text{i}}, \text{EncTs}_\text{i})\), \(\text{P}_{\text{TN}}\) first validates identity \(\text{A}^\text{i}\) by checking \(\text{MAC}_{\text{A}^\text{i}} = \text{Sha256}(\text{A}^\text{i} | \text{SALT}_{\text{MAC}})\). If not, \(\text{P}_{\text{TN}}\) returns. It then derives the shared secret with \(\text{A}^\text{i}\) as \(\text{SS}_{\text{A}^\text{i}} = \text{Sha256}(\text{A}^\text{i} | \text{SALT}_{\text{KD}})\) and decrypts \(\text{EncTs}_\text{i}\) using \(\text{SS}_{\text{A}^\text{i}}\), \(\text{Ts} = \text{Dec}(\text{EncTs}_\text{i}, \text{SS}_{\text{A}^\text{i}})\). If decryption fails, it returns. \(\text{P}_{\text{TN}}\) then adds \(\text{Ts.Score}\) to the total trust score if \(\text{Ts.Id} = \text{B}\). Once \(\text{P}_{\text{TN}}\) reaches the end of the loop, it returns the total trust score.

program P_TN {
    secret salt_mac
    secret salt_kd

    function check_id_mac(id, mac) {
        assert(mac == sha256(id | salt_mac))
    }

    function shared_secret(id) {
        return sha256(id | salt_kd)
    }

    function run(id: {b, mac_b}, mac_in, trust_scores: [(enc_ts, a, mac_a)]) {
        // validate b's identity
        check_id_mac(b, mac_b)

        // validate b's authorization on `trust_scores`
        assert(mac_in == sha256(trust_scores|shared_secret(b)))

        let total_score = 0
        for (enc_ts, a, mac_a) in trust_scores {
            check_id_mac(a, mac_a)

            // decrypt encrypted trust score `enc_ts` (Dec fails on an invalid ciphertext)
            let ts = Dec(enc_ts, shared_secret(a))

            // add score to `total_score` if score is for `b`
            if ts.id == b {
                total_score += ts.score
            }
        }
        return total_score

    }
}

Pseudocode for program P_TN

The construction only uses lightweight cryptography (i.e. hash functions and MACs) and avoids heavyweight cryptography (i.e. public key encryption, digital signatures). It’s not that the same cannot be done using public key encryption and digital signatures; in fact, it’s quite straightforward. Instead, the main motivation is efficiency. As is true for any modern cryptographic primitive (e.g. zero-knowledge proofs, FHE), and as will very likely remain true for a practical obfuscation scheme, using the primitive as the backend for lightweight crypto is drastically more efficient than using it as the backend for heavyweight crypto.

Smart contracts

Let’s start with a basic question: how are balances stored? If balances are stored as an encrypted list of tuples (address, amount), then for amount transfers the entire balance list has to be fed to the obfuscated program. The program then checks the transaction’s validity and updates the list in time at least proportional to the size of the list. This is because the program cannot find the tuples to update without a linear scan over the list.

This highlights a limitation of obfuscated programs: execution time increases linearly with the size of the memory 7. Hence, one should optimize to reduce the memory footprint of the program.

The alternative model for storing balances, which indeed reduces the memory footprint, is notes and nullifiers. Now the global state consists of a set of encrypted notes and a set of nullifiers, both stored as Merkle trees. Four secrets are hard-coded into the obfuscated transfer program \(\text{P}_{\text{T}}\): the key derivation salt \(\text{SALT}_{\text{KD}}\), the nonce derivation salt \(\text{SALT}_{\text{NONCE}}\), the nullifier salt \(\text{SALT}_{\text{NULL}}\), and the MAC salt \(\text{SALT}_{\text{MAC}}\). \(\text{P}_{\text{T}}\) takes as inputs: the Merkle root \(\text{R}\) of the encrypted notes tree, the encrypted input notes, a Merkle proof for each encrypted input note that it exists in the tree with root \(\text{R}\), spender \(\text{A}\)’s address \(\text{A}_{\text{ADD}}\), receiver \(\text{B}\)’s address \(\text{B}_{\text{ADD}}\), the transfer amount, and \(\text{MAC}_{\text{IN}}\).

Verification of the Merkle proofs validates that the encrypted input notes exist in the state tree with root \(\text{R}\). This prevents a malicious executor from feeding valid input notes, possibly minted on the fly, that do not exist in the global set of encrypted notes and still causing \(\text{P}_{\text{T}}\) to produce valid outputs. Define \(\text{SS}_{\text{A}}\) as a shared symmetric key between \(\text{A}\) and the obfuscated program \(\text{P}_{\text{T}}\), derived as \(\text{SS}_{\text{A}} = \text{Sha256}(\text{A}_{\text{ADD}} | \text{SALT}_\text{KD})\). \(\text{A}\) authorizes the transaction with a MAC \(\text{MAC}_{\text{IN}}\) produced on \(\text{InputTuple} = (\text{R}, \text{[encrypted input notes]}, \text{B}_{\text{ADD}}, \text{amount})\) using the shared secret \(\text{SS}_{\text{A}}\): \(\text{MAC}_{\text{IN}} = \text{Sha256}(\text{InputTuple} | \text{SS}_{\text{A}})\). \(\text{A}\)’s input notes are encrypted under an AEAD scheme using secret \(\text{SS}_{\text{A}}\). There are two reasons to use an AEAD scheme: (1) correct decryption using \(\text{SS}_{\text{A}}\) validates that \(\text{A}\) owns the input note, and the note itself does not require an address field; (2) any incorrectly formed ciphertext is rejected, preventing \(\text{P}_{\text{T}}\) from producing unpredictable outputs.
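The Merkle-proof check that \(\text{P}_{\text{T}}\) relies on (assert_path_at_root, left abstract in the pseudocode) can be sketched as follows; the tree layout here, duplicate-last-node padding in particular, is an assumption of mine:

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # pad odd levels by duplicating the last node
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_path(leaves, index):
    # list of (sibling_hash, sibling_is_right) pairs from leaf level to root
    level = [h(leaf) for leaf in leaves]
    path = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sib = index ^ 1
        path.append((level[sib], sib > index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return path

def assert_path_at_root(root, path, leaf):
    # recompute the root from the leaf and its siblings; fail on mismatch
    node = h(leaf)
    for sibling, sibling_is_right in path:
        node = h(node + sibling) if sibling_is_right else h(sibling + node)
    assert node == root

notes = [b"note0", b"note1", b"note2", b"note3"]
root = merkle_root(notes)
assert_path_at_root(root, merkle_path(notes, 2), b"note2")
```

The proof is logarithmic in the number of notes, which is what lets \(\text{P}_{\text{T}}\) avoid scanning the whole note set.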

\(\text{P}_{\text{T}}\) verifies the Merkle proof of each input note. It then uses \(\text{SS}_{\text{A}}\) to decrypt the input notes. If either Merkle verification or decryption of any note fails, \(\text{P}_{\text{T}}\) returns. It then checks \(\text{MAC}_{\text{IN}}\) on \(\text{InputTuple}\) (i.e. \(\text{MAC}_{\text{IN}} = \text{Sha256}(\text{InputTuple} | \text{SS}_{\text{A}})\)) and returns if \(\text{MAC}_{\text{IN}}\) is invalid. \(\text{P}_{\text{T}}\) then checks that the sum of the amounts in the input notes is at least the transfer amount. If not, it returns. \(\text{P}_{\text{T}}\) then mints a new note with the transfer amount for \(\text{B}\) and encrypts it using the shared symmetric key \(\text{SS}_\text{B} = \text{Sha256}(\text{B}_\text{ADD} | \text{SALT}_\text{KD})\). \(\text{P}_{\text{T}}\) also mints a change-back note for \(\text{A}\) and encrypts it using \(\text{SS}_\text{A}\). Any AEAD scheme requires a nonce for encryption, and here the nonce is derived as \(\text{Nonce} = \text{Sha256}(\text{InputTuple} | \text{SALT}_\text{NONCE})\). The same \(\text{Nonce}\) is used for both encryptions. \(\text{P}_{\text{T}}\) then produces a nullifier for each input note as \(\text{Sha256}(\text{Note} | \text{SALT}_{\text{NULL}})\). \(\text{P}_{\text{T}}\) then sets the output tuple \(\text{OutputTuple} = (\text{R}, \text{[new encrypted notes]}, \text{[nullifiers]})\) and attests to it by producing \(\text{MAC}_\text{OUT} = \text{Sha256}(\text{OutputTuple} | \text{SALT}_\text{MAC})\). \(\text{P}_{\text{T}}\) then outputs \(\text{OutputTuple}\) and the corresponding \(\text{MAC}_\text{OUT}\).

\(\text{OutputTuple}\) represents a state transition from state root \(\text{R}\) and is only valid if \(\text{MAC}_\text{OUT}\) is valid and none of the nullifiers already exist in the global nullifiers set. To validate \(\text{MAC}_\text{OUT}\), another obfuscated “verifier” program \(\text{P}_\text{V}\), into which \(\text{SALT}_\text{MAC}\) is hard-coded, is used. It takes \(\text{OutputTuple}\) and the corresponding \(\text{MAC}_\text{OUT}\) as inputs and outputs 1 if \(\text{MAC}_\text{OUT}\) equals \(\text{Sha256}(\text{OutputTuple} | \text{SALT}_\text{MAC})\), otherwise outputs 0. It’s impossible for anyone to produce valid MACs for arbitrary tuples because \(\text{SALT}_\text{MAC}\) is only known to the programs \(\text{P}_{\text{T}}\) and \(\text{P}_{\text{V}}\).

For \(\text{A}\) to attest to \(\text{InputTuple}\), it needs to learn \(\text{SS}_\text{A}\). But \(\text{A}\) cannot learn the key derivation salt \(\text{SALT}_\text{KD}\). This is solved with another obfuscated program \(\text{P}_\text{KD}\) into which two secrets are hard-coded: the address derivation salt \(\text{SALT}_\text{ADD}\) and the key derivation salt \(\text{SALT}_\text{KD}\). \(\text{P}_\text{KD}\) takes \(\text{A}\)’s secret \(\text{S}_\text{A}\) as input and outputs \(\text{A}\)’s address \(\text{A}_\text{ADD} = \text{Sha256}(\text{S}_\text{A} | \text{SALT}_\text{ADD})\) and \(\text{A}\)’s shared secret \(\text{SS}_\text{A} = \text{Sha256}(\text{A}_\text{ADD} | \text{SALT}_\text{KD})\).
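A minimal sketch of \(\text{P}_\text{KD}\)’s derivation chain (the salt values are placeholders of mine):

```python
import hashlib

SALT_ADD = b"address-derivation-salt"  # hard-coded secret (placeholder value)
SALT_KD = b"key-derivation-salt"       # hard-coded secret (placeholder value)

def p_kd(user_secret: bytes):
    # the address is derived from the user's secret, the shared key from the address
    address = hashlib.sha256(user_secret + SALT_ADD).digest()
    shared_secret = hashlib.sha256(address + SALT_KD).digest()
    return address, shared_secret

a_add, ss_a = p_kd(b"A's secret S_A")
# any obfuscated program holding SALT_KD can re-derive ss_a from the public
# address alone, while on the user side only A (via S_A and P_KD) can obtain it
assert ss_a == hashlib.sha256(a_add + SALT_KD).digest()
```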

Setting the value of the nonce is tricky. First, there exists no means to generate randomness inside the obfuscated program \(\text{P}_{\text{T}}\), which implies that the executor must provide the randomness for the nonce. But if the nonce is controlled directly by the executor, then a malicious executor can cause \(\text{P}_{\text{T}}\) to use the same nonce for encryptions across 2 runs with different inputs, and thus obtain two different note ciphertexts encrypting two different messages under the same shared secret and nonce, which is insecure (it’s well known that nonce reuse is really bad). What a malicious executor cannot do is cause \(\text{P}_{\text{T}}\) to produce two different outputs across 2 runs on the same inputs. Hence, it seems safe to tie the nonce to the inputs. The reason the nonce is not set as a hash of the inputs, but instead as a hash of the inputs concatenated with a secret salt (i.e. \(\text{Nonce} = \text{Sha256}(\text{InputTuple} | \text{SALT}_\text{NONCE})\)), is to make the nonce unpredictable to the executor.
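The derivation and its two properties, determinism in the inputs and unpredictability without the salt, in a few lines (the salt value is mine):

```python
import hashlib

SALT_NONCE = b"secret-nonce-salt"  # hard-coded; unknown to the executor

def derive_nonce(input_tuple: bytes) -> bytes:
    # deterministic in the inputs, unpredictable without SALT_NONCE
    return hashlib.sha256(input_tuple + SALT_NONCE).digest()[:16]

# same inputs -> same nonce, but then the ciphertexts are identical too (no new leak)
assert derive_nonce(b"input-tuple-1") == derive_nonce(b"input-tuple-1")
# different inputs -> a fresh nonce, so no nonce reuse across distinct runs
assert derive_nonce(b"input-tuple-1") != derive_nonce(b"input-tuple-2")
```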

program P_T {
    secret salt_kd
    secret salt_nonce
    secret salt_null
    secret salt_mac

    function assert_path_at_root(root, path, note) {
        // asserts that `path` to `note` is valid from the `root`
    }

    function shared_secret(add) {
        return sha256(add | salt_kd)
    }

    // root R: root of encrypted notes tree
    // in_notes: input notes of A
    // merkle_paths: merkle path for root `R` for each note in `in_notes`
    // a_add: A's address
    // b_add: B's address
    // amount: transfer amount
    // mac_in: Input MAC
    function run(root R, in_notes: [Enc({amount})], merkle_paths, a_add, b_add, amount, mac_in) {
        // verify merkle path in root R for each input note
        for (note, path) in zip(in_notes, merkle_paths) {
            assert_path_at_root(R, path, note)
        }

        // verify MAC `mac_in` over input tuple
        let ss_a = shared_secret(a_add)
        assert(mac_in == sha256((R, in_notes, b_add, amount)|ss_a))


        // total amount in input notes
        let total_in_amount = 0
        for note in in_notes {
            let plain_note = Dec(note, ss_a)
            total_in_amount += plain_note.amount
        }

        assert(total_in_amount >= amount)

        // mint new note for B
        let plain_note_b = {amount: amount}

        // mint change back note for A
        let plain_note_a = {amount: total_in_amount - amount}

        // derive nonce for encryption
        let nonce = sha256([R, in_notes, a_add, b_add, amount]|salt_nonce)

        // Encrypt new notes
        let note_b = Enc(plain_note_b, shared_secret(b_add), nonce)
        let note_a = Enc(plain_note_a, ss_a, nonce)

        // nullify input notes
        let nullifiers = []
        for note in in_notes {
            nullifiers.push(sha256(note|salt_null))
        }

        // produce MAC on the output tuple
        let output_tuple = (
            R, [note_b, note_a], nullifiers
        )
        let mac_out = sha256(output_tuple|salt_mac)

        return (output_tuple, mac_out)

    }
}

Pseudocode for program P_T

Encrypted smart contract

The construction for an actual smart contract with encrypted state is similar to the transfer program \(\text{P}_\text{T}\), with slight modifications. Take as an example an encrypted automated market maker, an obfuscated program \(\text{P}_\text{AMM}\) that facilitates swaps for a single pair, ETH-USD. Like \(\text{P}_\text{T}\), four secret salts are hard-coded into \(\text{P}_\text{AMM}\): \(\text{SALT}_\text{KD}\), \(\text{SALT}_\text{NONCE}\), \(\text{SALT}_\text{NULL}\), \(\text{SALT}_\text{MAC}\). Another hard-coded secret \(\text{S}_\text{STATE}\) (s_state) is used to encrypt the program’s state object \(\text{State}_\text{AMM} = \text{{ETH: X, USD: Y}}\) (under the defined AEAD scheme).

program P_AMM {
    secret salt_kd
    secret salt_nonce
    secret salt_null
    secret salt_mac
    secret s_state // secret using which program's local state is encrypted

    function assert_path_at_root(root, path, note) {
        // asserts that `path` to `note` is valid from the `root`
    }

    function shared_secret(add) {
        return sha256(add | salt_kd)
    }

    // root R: root of encrypted notes tree
    // enc_state: local encrypted state object Enc({ETH: X amount, USD: Y amount}) using secret s_state
    // in_notes: input notes of A. Input note has two fields: (1) `amount` (2) currency `curr`
    // merkle_paths: merkle path for root `R` for each note in `in_notes`
    // a_add: A's address
    // swap_dir: Swap direction. 0 if ETH -> USD. 1 otherwise
    // swap_amount: Swap amount
    // mac_in: Input MAC
    function run(root R, enc_state, in_notes: [Enc({amount, curr})], merkle_paths, a_add, swap_dir, swap_amount, mac_in) {
        // verify merkle path in root R for each input note
        for (note, path) in zip(in_notes, merkle_paths) {
            assert_path_at_root(R, path, note)
        }

        // verify MAC `mac_in` over input tuple
        let ss_a = shared_secret(a_add)
        assert(mac_in == sha256((R, in_notes, swap_dir, swap_amount)|ss_a))

        // currency? ETH=0, USD=1
        let curr = 0
        if swap_dir == 1 {
            curr = 1
        }

        // input `curr` amount
        let in_amount = 0
        for note in in_notes {
            let plain_note = Dec(note, ss_a)
            if plain_note.curr == curr {
                in_amount += plain_note.amount
            }
        }

        // Decrypt local state object using s_state
        let state = Dec(enc_state, s_state)

        // trade
        let out_amount = 0
        if curr == 0 {
            // trade direction: ETH -> USD

            let price = state.usd / state.eth
            out_amount = price * in_amount
            state.usd -= out_amount
            state.eth += in_amount
        } else {
            // trade direction: USD -> ETH

            let price = state.eth / state.usd
            out_amount = price * in_amount
            state.eth -= out_amount
            state.usd += in_amount
        }

        // mint new note for A
        let plain_note = {amount: out_amount, curr: curr}

        // derive nonce for encryption
        let nonce = sha256([R, in_notes, a_add, swap_dir, swap_amount]|salt_nonce)

        // encrypt the new note and the updated state object
        let note_a = Enc(plain_note, ss_a, nonce)
        let enc_updated_state = Enc(state, s_state, nonce)

        // nullify input notes
        let nullifiers = []
        for note in in_notes {
            nullifiers.push(sha256(note|salt_null))
        }

        // produce MAC on the output tuple
        let output_tuple = (
            R, enc_state, enc_updated_state, note_a, nullifiers
        )
        let mac_out = sha256(output_tuple|salt_mac)

        return (output_tuple, mac_out)

    }
}

Pseudocode for \(\text{P}_{\text{AMM}}\). There are two new additions to \(\text{P}_{\text{AMM}}\) over \(\text{P}_{\text{T}}\) to support local state. The first is the encrypted local state object, enc_state, which is taken as an input. The second is the secret s_state, using which the local state object is encrypted and stored externally.


Apart from minor differences due to the functionality differences between \(\text{P}_{\text{T}}\) and \(\text{P}_{\text{AMM}}\), the first major difference is that \(\text{P}_{\text{AMM}}\) stores its local state externally, encrypted using the hardcoded secret s_state. It then takes the local encrypted state, enc_state, as an input. The second difference is that the \(\text{OutputTuple}\) represents two different state transitions. The first, as in the output tuple of \(\text{P}_{\text{T}}\), adds new notes to the encrypted notes state tree and nullifiers to the nullifier tree. The second updates \(\text{P}_{\text{AMM}}\)’s state from enc_state to enc_updated_state. The second transition is only valid if enc_state is \(\text{P}_{\text{AMM}}\)’s latest known state.

Open problems

Composable smart contracts

It’s natural that there will exist multiple encrypted smart contracts, each with different functionalities and independent encrypted states. If so, how does smart contract A access the encrypted state of smart contract B? I haven’t been able to find a construction that is secure and free of undesirable properties. Consider one attempt: force all smart contracts to encrypt their states using the same hard-coded hidden secret s_state. Then, for A to access B’s state, it’s a matter of feeding B’s state as an input to A. However, smart contract A can host arbitrary code and may simply output B’s state in plain, which isn’t secure. This can be fixed with a public list of “acceptable” smart contracts. But then who determines which smart contracts are acceptable?

Probably there can exist a fixed set of relatively straightforward constraints that a smart contract must follow. For example, a smart contract cannot output anything other than encrypted states, encrypted notes, and nullifiers. Or smart contract A must not mutate smart contract B’s state. However, what should be on the list remains an open question.

Offloading execution

All this while we’ve assumed that the user executes obfuscated programs. This explains why the programs accept private inputs in plain (e.g. the transfer amount and swap instructions in the case of smart contracts, and PII in the case of anonymous identity). But what if executing these obfuscated programs is expensive? How can the user offload execution to a beefy server while keeping the necessary inputs private?

There exists a simple transformation from a program P that accepts inputs in plain to a program P’ that accepts the same inputs encrypted: use the user’s shared secret to encrypt the inputs. For instance, if P takes as input InputTuple, then P’ takes as input the encrypted InputTuple, EncInputTuple. To execute, user A encrypts InputTuple as EncInputTuple under an AEAD scheme using \(\text{SS}_\text{A}\) and a random nonce. A then sends EncInputTuple to P’. Upon receiving EncInputTuple, P’ first derives the shared secret \(\text{SS}_\text{A} = \text{Sha256}(\text{A}_\text{ADD}|\text{SALT}_\text{KD})\). It then decrypts EncInputTuple to obtain InputTuple, after which P’ executes like P. If any of the outputs are private to A, P’ encrypts them using the shared secret \(\text{SS}_\text{A}\) (the nonce is derived as \(\text{Sha256}(\text{EncInputTuple}|\text{SALT}_\text{NONCE})\)).
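A sketch of the transformation, with a toy encrypt-then-MAC construction standing in for the AEAD (all names and salt values are mine):

```python
import hashlib
import hmac
import os

SALT_KD = b"key-derivation-salt"  # hard-coded in both P and P' (placeholder value)

def _keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return out[:n]

def aead_enc(msg: bytes, key: bytes, nonce: bytes) -> bytes:
    ct = bytes(a ^ b for a, b in zip(msg, _keystream(key, nonce, len(msg))))
    return nonce + ct + hmac.new(key, nonce + ct, hashlib.sha256).digest()

def aead_dec(blob: bytes, key: bytes) -> bytes:
    nonce, ct, tag = blob[:16], blob[16:-32], blob[-32:]
    assert hmac.compare_digest(tag, hmac.new(key, nonce + ct, hashlib.sha256).digest())
    return bytes(a ^ b for a, b in zip(ct, _keystream(key, nonce, len(ct))))

def p(input_tuple: bytes) -> bytes:
    # placeholder for the original program P's logic
    return b"output for " + input_tuple

def p_prime(a_add: bytes, enc_input_tuple: bytes) -> bytes:
    # derive A's shared secret from the public address, decrypt, then run as P would
    ss_a = hashlib.sha256(a_add + SALT_KD).digest()
    return p(aead_dec(enc_input_tuple, ss_a))

a_add = b"A's address"
ss_a = hashlib.sha256(a_add + SALT_KD).digest()
enc_input = aead_enc(b"InputTuple", ss_a, os.urandom(16))
assert p_prime(a_add, enc_input) == p(b"InputTuple")
```

The server executing p_prime only ever sees EncInputTuple; the plaintext inputs exist only inside the (obfuscated) program.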

But there’s still one missing piece: how does user A derive the shared secret \(\text{SS}_\text{A}\)? If A is not allowed to execute the obfuscated key derivation program locally, then shared secret key derivation does not seem possible using only lightweight cryptography (public key cryptography seems necessary 8).

Hardcoding secrets

In all constructions above I’ve assumed that unknown secrets are hard-coded into obfuscated programs. However, I haven’t exactly described how this works.

As a warning, my answer will still be hand-wavy, because I don’t think program obfuscation schemes are at a stage where a concrete method can be investigated. So the answer remains simple: given that a single person can obfuscate a program P with hardcoded secret S, the obfuscation procedure can be transformed into an MPC protocol that achieves the same functionality, but where the randomness for the procedure is contributed by multiple non-colluding parties.

Obviously, a very interactive MPC protocol that forces each party to stay online throughout a very long procedure isn’t satisfactory. This is because, in practice, not many people participate in protocols that require much work. And much work, even after a practical program obfuscation scheme exists, needs to be done to optimize the MPC protocol such that it can be used for applications that require minimal trust assumptions (e.g. encrypted social networks, encrypted smart contracts, etc.).


  1. To be clear, primitives like FHE/MPC may play some role in practical program obfuscation. TEEs have the highest trust assumption (i.e. you trust the vendor) but are the cheapest, both in terms of runtime & cost. (Threshold) FHE/MPC are mostly interchangeable primitives that have the same t-out-of-N trust assumption. More so, there are suggestions to store secret shards of threshold FHE inside secure enclaves to hopefully get the best of both worlds. However, none of them provide the complete cryptographic guarantees that an ideal program obfuscation scheme would.↩︎

  2. In secret-sharing-based MPC schemes, each user needs to stay online throughout the computation phase. In multi-party FHE, each user needs to come online three times: to upload private inputs to the server; to retrieve the output ciphertext and produce a decryption share; and, finally, to retrieve the other decryption shares to decrypt the output ciphertext.↩︎

  3. The size of the encrypted inputs depends on the PKE scheme and can be as low as the size of a ciphertext of any symmetric encryption scheme.↩︎

  4. A trust network is an oracle to which you can ask: how much can I trust person B? It’s important for “encrypted” to precede “global” to prevent the network from being used for nefarious purposes.↩︎

  5. The bridged anonymous identity is anonymous for everyone except the issuing authority. This is because the issuing authority holds the PII and arbitrary power to generate valid signatures, allowing them to bridge themselves and learn the anonymous identifier.↩︎

  6. It’s necessary for B to authorize aggregating trust scores on given TrustScoreTuples. Otherwise, one can query B’s trust scores for arbitrary collection of trust score tuples and gather a good sense of B’s social network.↩︎

  7. It’s not completely true that the cost increases linearly with the size of the memory. Primitives like ORAM, DE-PIR can be used to get asymptotically better RAM access runtime. However, ORAM in context of program obfuscation has not been explored. Whereas DE-PIR, at the moment, is impractical.↩︎

  8. If public key encryption, a heavyweight primitive, is allowed then shared secret key derivation is straightforward. A encrypts its secret \(\text{S}_\text{A}\) using obfuscated key derivation program \(\text{P}_\text{KD}\)’s public key \(\text{Pk}_{\text{P}_\text{KD}}\). A then sends its public key \(\text{Pk}_{\text{A}}\) and \(\text{Enc}(\text{S}_\text{A}, \text{Pk}_{\text{P}_\text{KD}})\) to \(\text{P}_\text{KD}\). \(\text{P}_\text{KD}\) decrypts \(\text{Enc}(\text{S}_\text{A}, \text{Pk}_{\text{P}_\text{KD}})\) (\(\text{P}_\text{KD}\) has secret key corresponding to \(\text{Pk}_{\text{P}_\text{KD}}\) hard-coded). It then produces \(\text{SS}_\text{A} = \text{Sha256}(\text{S}_\text{A} | \text{SALT}_\text{KD})\) and outputs \(\text{Enc}(\text{SS}_\text{A}, \text{Pk}_\text{A})\).↩︎