Side-Channel Analysis and Fault Injection: CPA & DFA PV204 Security Technologies Łukasz Chmielewski CRoCS, Masaryk University, chmiel@fi.muni.cz Part of the content is reused with permission of Lejla Batina, Radboud University Nijmegen, The Netherlands. April 22nd, 2024 SCA Concepts April 22nd, 2024 1 / 53 Outline 1 Introduction & Reminder 2 EM & other side channels 3 CPA 4 Countermeasures 5 Differential Fault Analysis (DFA) 6 DFA of RSA 7 Conclusions SCA Concepts April 22nd, 2024 2 / 53 Introduction & Reminder Plan for Today 1 2 Concentrate on passive Side-Channel Analysis The main goal is to make Correlation Power Analysis understandable and be aware that there exist countermeasures 3 I will also introduced Differential Fault Analysis + Example 4 The lecture next week will be mostly about Fault Injection and Differential Fault Analysis, but I will also mention more advanced attacks like Template Attacks and higher-order attacks SCA Concepts April 22nd, 2024 3 / 53 Introduction & Reminder Known challenge: embedded crypto devices SCA Concepts April 22nd, 2024 4 / 53 Introduction & Reminder Relevance November 13, 2019 May 28, 2020 SCA Titan: January 7, 2021 October 3, 2019 December 12, 2019 SCA Concepts April 22nd, 2024 5 / 53 Introduction & Reminder Timing side-channel: PIN verification Software for PIN code verification Input: 4-digit PIN code Output: PIN verified or rejected Process CheckPIN (pin[4]) int pin_ok=0; if (pin[0]==5) if (pin[1]==9) if (pin[2]==0) if (pin[3]==2) pin_ok=1; end end end end return pin_ok; EndProcess What are the execution times of the process for PIN inputs [0,1,2,3], [5,3,0,2], [5,9,0,0] The execution time increases as we get closer to [5,9,0,2] SCA Concepts April 22nd, 2024 6 / 53 Introduction & Reminder Power side channel: CMOS leakage Complementary Metal-Oxide-Semiconductor (CMOS) is the most popular technology and the CMOS circuits exhibit several types of leakage The most relevant for side-channel attacks is the charge and discharge of the CMOS load capacitance a.k.a dynamic power consumption Power analysis attack exploits the fact that the dynamic power consumption depends on the data and instructions being processed Dynamic power consumption (Pdyn) is produced by CMOS transitions from state 0 to 1 and from state 1 to 0 Pdyn = CV2 DDP0→1f, where C the transistor capacitance, VDD the power supply voltage, f the frequency and P0→1 the probability of a 0 → 1 transition SCA Concepts April 22nd, 2024 7 / 53 Introduction & Reminder Power side channel: Modeling the leakage Starting point: we use the number of transitions to model the leakage The Hamming distance model counts the number of 0 → 1 and 1 → 0 transitions Example 1: A register R is storing the result of an AES round and initial value v0 gets overwritten with v1 The power consumption because of the register transition v0 → v1 is related to the number of bit flips that occurred Thus it can be modeled as HammingDistance(v0, v1) = HammingWeight(v0 ⊕ v1) Common leakage model for hardware implementations (FPGA, ASIC) SCA Concepts April 22nd, 2024 8 / 53 Introduction & Reminder Power side channel: Modeling the leakage Example 2: In a microcontroller, a register A contains value v0 and an assembly instruction moves the content of register A to B mov rB, rA This instruction transfers v0 from A to B via the CPU, using the bus Typically the bus is precharged at all bits being zeros or one (busInitialValue) The power consumption of the instruction can be modeled as HammingDistance(busInitialValue,v0) = HammingWeight(v0 ⊕ 0) = HW(v0) Common leakage model for software implementations (AVR/ARM) SCA Concepts April 22nd, 2024 9 / 53 Introduction & Reminder Measurement setup schematics Usually power measurements requires physical proximity to the device and customized measurement equipment (resistor, oscilloscope) SCA Concepts April 22nd, 2024 10 / 53 Introduction & Reminder Actual setups DPA setup with ARM CortexM4 FA setup Current Probe Target XY-Table EM-FI Transient Probe VC Glitcher Picoscope Tempest FPGA board for SCA SCA Concepts April 22nd, 2024 11 / 53 Introduction & Reminder Analysis capabilities Simple Power Analysis (SPA): one or a few measurements - visual inspection or some simple signal processing Differential attacks (DPA): multiple measurements - use of statistics, signal processing, etc. Higher order attacks: Univariate vs multivariate Profiled attacks: Template and Deep Learning attacks Combining two or more side-channels Combining side-channel attack with theoretical cryptanalysis SCA Concepts April 22nd, 2024 12 / 53 Introduction & Reminder Simple Power Analysis (SPA) Based on one or a few measurements Mostly discovery of data-(in)dependent but instruction-dependent properties e.g. Symmetric: Number of rounds (resp. key length) Memory accesses (usually higher power consumption) Asymmetric: The key (if badly implemented, e.g. RSA / ECC) Key length Implementation details: for example RSA w/wo CRT SCA Concepts April 22nd, 2024 13 / 53 Introduction & Reminder Reminder: DPA, main concepts The DPA assumption: the attack is possible assuming the existence of a sensitive variable (for which exhaustive key search is possible) that depends on something we know (msg) and something we want to learn (key). Main steps 1 Choose your sensitive variable 2 Collect measurements, known plaintext/ciphertext, sub-key guesses 3 Predict (hypothetical) intermediate values 4 Decide on the leakage model 5 Recover the key by statistical means, using partition or comparison method as the side-channel distinguisher SCA Concepts April 22nd, 2024 14 / 53 Introduction & Reminder DPA with Distance of Means (DoM) as distinguisher Classical 1-bit DPA on AES using DoM LSB = 0 Collect measurements Compute Mean0 Obtain n measurements: e.g. 1000 plaintext xi , power trace pi(t) S-box 8 bits of plaintext 8 bits of key 8 output bits AES impl. LSB LSB(SBox(xi ⊗ k j )Focus: k j ∈ 0,…,255{ }For each key guess: LSB(SBox(xi ⊗ k j )Calculate: LSB = 1 Collect measurements Compute Mean1 1000 measurements * time window t * 256 key guesses Mean0 – Mean1 Maximum difference = best key guess! [Kocher et al.] SCA Concepts April 22nd, 2024 15 / 53 Introduction & Reminder Power analysis notes and literature Very powerful attacks that require contact with the target Countermeasures on different layers required i.e. algorithm, implementation, transistor P. Kocher, J. Jaffe, B. Jun. “Differential Power Analysis”, CRYPTO 1999. T. Eisenbarth et al. “On the Power of Power Analysis in the Real World: A Complete Break of the KeeLoqCode Hopping Scheme”, CRYPTO 2008. Mangard et al. Power Analysis Attacks, Springer, 2006. T. Kasper et al. “All You Can Eat or Breaking a Real-World Contactless Payment System”, Financial Cryptography 2010. J. Balasch et al. “Power Analysis of Atmel CryptoMemory - Recovering Keys from Secure EEPROMs”, CT-RSA 2012. N. Samwel et al. “Breaking Ed25519 in WolfSSL.”, CT-RSA 2018. SCA Concepts April 22nd, 2024 16 / 53 EM & other side channels Electromagnetic side channel SCA Concepts April 22nd, 2024 17 / 53 EM & other side channels EM side channel: Probing Observing a power signal in more complex systems can be messy Complicated SoCs with multiple peripherals Countermeasures trying to flatten the power consumption signal Use an electromagnetic probe instead A probe is used to access the power consumption with less board modifications Smaller probes can focus on interesting locations and ignore interference from unrelated el. components SCA Concepts April 22nd, 2024 18 / 53 EM & other side channels EM side channel: Decapsulation and Microprobing To improve spatial resolution of analysis use a micrometer-sized antenna To exploit more leakage decapsulate the chip using chemicals SCA Concepts April 22nd, 2024 19 / 53 EM & other side channels EM side channel: Decapsulation and Microprobing Left: close inspection of decapsulated ARM processor using a microscope Right: EM emission heat-map of the same chip SCA Concepts April 22nd, 2024 20 / 53 EM & other side channels EM side channel: notes and literature EM enables side-channel attacks both in high proximity scenarios and distance scenarios Main side channel for SoCs, FPGAs, contactless cards due to their complexity and communication methods TEMPEST-like attacks are also targeting private data and authentication methods such as code etc. Gandolfi et al.: Electromagnetic Analysis: Concrete Results, CHES 1999. Andrikos et al.: Location, location, location: Revisiting modeling and exploitation for location-based side channel leakages, ASIACRYPT2019 SCA Concepts April 22nd, 2024 21 / 53 EM & other side channels Exotic side channels Exotic side channels SCA Concepts April 22nd, 2024 22 / 53 EM & other side channels Optical Emission Accessing the chip SRAM cells emits photons that can be detected by a high-resolution camera Visual inspection can reveal the memory location accessed The memory location maps to a specific value (e.g. in the AES LUT), i.e. it maps directly to Sbox(in ⊕ key) Since the input in is known, knowledge of the memory location reveals the key Schlösser et al.: Simple Photonic Emission Analysis of AES, CHES 2012. SCA Concepts April 22nd, 2024 23 / 53 EM & other side channels Out-of-order and speculative execution Meltdown and Spectre Attacks Kernel addresses were access unintentionally due to out-of-order execution Taking all possible branches may also cause issues Foreshadow as a variant of Meltdown on SGX Recent ones: RIDL, ZombieLoad, ... SCA Concepts April 22nd, 2024 24 / 53 CPA Correlation Power Analysis (CPA) SCA Concepts April 22nd, 2024 25 / 53 CPA CPA: the main principle Brier et al.: Correlation Power Analysis with a Leakage Model, CHES 2004 Chapter 6 in the blue book SCA Concepts April 22nd, 2024 26 / 53 CPA CPA Step 1: Choose intermediate value and decide on the leakage model + Sbox ... update in y k Assume a software AES implementation e.g. for AVR microcontrollers (8-bit arch.) Step 1: Choose an intermediate value v of the AES cipher to attack The value v must be a function of the input and the key, i.e. v = f(in, k) A common choice for v is the Sbox output, i.e. v = y = Sbox(in ⊕ k) Throughout the attack the key k must remain constant Throughout the attack the input in is random SCA Concepts April 22nd, 2024 27 / 53 CPA CPA Step 2: Measure the power consumption In Step 2 we record power consumption traces for multiple random inputs Generate randomly n 8-bit inputs. Typically n is large (thousands to millions!) Store the inputs in vector in = [in1in2in3 . . . inn] SCA Concepts April 22nd, 2024 28 / 53 CPA CPA Step 2: Measure the power consumption For every generated 8-bit input we measure the power consumption of the AES implementation over time For every input inj , j = 1, . . . , n we capture a digitized trace over time We denote the trace related to input inj as tj = [t1 j t2 j . . . tm j ] . It contains m points in time (a.k.a. samples) SCA Concepts April 22nd, 2024 29 / 53 CPA CPA Step 2: Measure the power consumption Capturing 6 power traces with m time points (samples) results in the following measurement matrix Note that the power traces originate from the device, i.e. they are related to the secret key stored inside the device We will refer to the unknown key that is stored in the device as kdev SCA Concepts April 22nd, 2024 30 / 53 CPA CPA Step 3: Predict (hypothetical) intermediate values In our device y = Sbox(in ⊕ kdev ), but kdev is unknown! For a given input in we can compute the value y for all possible keys k ∈ {0, 1, . . . , 255} ForAll in ∈ in ForAll k ∈ {0, 1, . . . , 255} Compute y(in, k) = Sbox(in ⊕ k) One of the columns of this value-prediction matrix is the correct one! Divide and Conquer strategy SCA Concepts April 22nd, 2024 31 / 53 CPA CPA Step 4: Leakage model We map the hypothetical intermediate values to hypothetical power consumption values, producing the power-prediction matrix A common choice is Hamming weight but keep in mind that other models may be applicable SCA Concepts April 22nd, 2024 32 / 53 CPA CPA Step 5: Comparison Compare the hypothetical power consumption values with the real measurements using Pearson correlation ForAll columns of measurement matrix ForAll columns of power prediction matrix Compute the correlation between columns The highest correlation value reveals the key Pearson correlation coefficient ρk (L, HW(Vg)) for leakage L, Hamming weight of the intermediate value computed HW(V) and key guess Kg : ρk (L, HW(Vg)) = cov[L, HW(Vg)] Var[L] · Var[HW(Vg)] (1) SCA Concepts April 22nd, 2024 33 / 53 CPA CPA Step 5: Comparison SCA Concepts April 22nd, 2024 34 / 53 Countermeasures Countermeasures Masking and hiding SCA Concepts April 22nd, 2024 35 / 53 Countermeasures The idea Purpose: break the link between the actual data and power consumption Masking: power consumption remains dependent on the data on which computation is performed but not the actual data A random mask concealing every intermediate value Can be on different levels (arithmetic → gate level) Hiding: power consumption is independent of the intermediate values and of the operations Special logic styles like WDDL, MDPL etc. Randomizing in time domain Lowering SNR ratio SCA Concepts April 22nd, 2024 36 / 53 Countermeasures Software countermeasures Time randomization: Operations are randomly shifted in time Use of NOP operations Add random delays Use of dummy variables and instructions (sequence scrambling) Register renaming and nondeterministic processor Idea is to exploit ILP within an instruction stream Processor selects an instruction and a memory access randomly Permuted execution rearranged instructions e.g. S-boxes Masking techniques SCA Concepts April 22nd, 2024 37 / 53 Countermeasures Hardware countermeasures Noise generation: Hardware noise generator from e.g. RNG Total power is increased Desynchronization: Introducing some fake clock cycles during the computation or using a weak jitter Power signal filtering: ex.: RLC filter (R-resistor, C-capacitor, L-inductor) smoothing the pow. cons. signal by removing high frequency components Using active comp. (transistors) in order to keep pow. cons. relatively constant Novel circuit designs e.g. special logic styles SCA Concepts April 22nd, 2024 38 / 53 Countermeasures Masking Principle: randomizing intermediate values with a secret sharing scheme so DPA fails Boolean masking: a dth-order secure Boolean masking scheme splits a sensitive value x into d + 1 shares (x0, x1, ..., xd ), as follows: x = x0 ⊕ x1 ⊕ · · · ⊕ xd The number of traces required for a successful attack grows exponentially w.r.t. the security order d. Probing-secure scheme. We refer to a scheme that uses certain families of shares as t−probing-secure iff any set of at most t intermediate variables is independent from the sensitive values. SCA Concepts April 22nd, 2024 39 / 53 Countermeasures Masking with 2 shares X = X1 ⊕ X2 The leakage L(X) = HW(X1, X2) depends on two variables. It does not reveal any information on the value of X when a DPA is performed SCA Concepts April 22nd, 2024 40 / 53 Differential Fault Analysis (DFA) DFA Bellcore attack in 1995 Differential faults on RSA-CRT signatures Requires 1 correct and 1 wrong signature Attack on DES in 1997 by Biham and Shamir Special attacks on AES, ECC etc. SCA Concepts April 22nd, 2024 41 / 53 Differential Fault Analysis (DFA) DFA on symmetric-key ciphers Basic DFA scenario: adversary obtains a pair of ciphertexts that are derived by encrypting the same plaintext (one is correct value and the other is faulty) two encryptions are identical up to the point where the fault occurred → two ciphertexts can be regarded as the outputs of a reduced-round iterated block cipher where the inputs are unknown but show a small (and possibly known) differential DFA on DES the original attack of Biham and Shamir exploits computational errors occurring in the final rounds of the cipher assumes that one bit of the right half of the DES internal state is flipped at a random position SCA Concepts April 22nd, 2024 42 / 53 Differential Fault Analysis (DFA) Countermeasures Generic approaches Correctness check: encrypt twice Random delays: limits the precision Masking: secret sharing complicates probing wires of the device Hardware countermeasures: light detectors, supply voltage, frequency detectors active shields redundancy: duplication of hardware blocks dual rail implementations SCA Concepts April 22nd, 2024 43 / 53 Differential Fault Analysis (DFA) Symmetric-key countermeasures Introducing redundancy is harder than for PKC Multiple execution is expensive Using the inverse Loop invariant: 2nd variable counting in the opposite way prevents tampering the counter of a loop add a signature that is updated in every run of the loop (checksum) To ensure the integrity of the stored data, Cyclic Redundancy Check (CRC) can be added SCA Concepts April 22nd, 2024 44 / 53 DFA of RSA RSA cryptosystem Key generation: e and the length of n are given Generate “large” prime numbers p, q such that gcd(p − 1, e) = 1, gcd(q − 1, e) = 1; Compute n = p · q and ϕ(n) = (p − 1)(q − 1); Compute d satisfying ed ≡ 1 (mod ϕ(n)) Public key: pk = (n, e); private key: sk = (n, d). Public-key op. c ← Enc(pk, m): c := me (mod n). Secret-key op. m ← Dec(sk, c): m := cd (mod n). There are further limitations to the choice of p, q. e is typically fixed in advance (e.g., 65537 = 216 + 1). CRT Find dp = d (mod p − 1), computed as dp = e−1 (mod p − 1) Find dq = d (mod q − 1) Compute U = p−1 (mod q) cp = c (mod p), cq = c (mod q) SCA Concepts April 22nd, 2024 45 / 53 DFA of RSA RSA with CRT Optimisation of computing a signature giving about 4-fold speedup: n = p · q Signature: s = md mod n Precomputed values dp := d mod (p − 1) dq := d mod (q − 1) iq := q−1 mod p sp := mdp mod p sq := mdq mod q Garner’s method (1965) to recombine sp and sq: s = sq + q · (iq(sp − sq) mod p) SCA Concepts April 22nd, 2024 46 / 53 DFA of RSA Injecting fault in branch sp Assume that an adversary can inject a fault in the computation of sp, resulting in ˆsp. Moreover, an invalid signature ˆs. The adversary can make a correct and an incorrect signatures: s = sq + q · (iq(sp − sq) mod p) ˆs = sq + q · (iq(ˆsp − sq) mod p) s − ˆs = q · (iq(sp − sq) mod p) − (iq(ˆsp − sq) mod p) . Then the adversary can recover q as follows: q = gcd(n, s − ˆs) SCA Concepts April 22nd, 2024 47 / 53 DFA of RSA Injecting fault in branch sq Assume that an adversary can inject a fault in the computation of sq, resulting in ˆsq. The adversary can make a correct and an incorrect signatures: s = sq + q · (iq(sp − sq) mod p) ˆs = ˆsq + q · (iq(sp − ˆsq) mod p) Subtracting the 2 signatures: s − ˆs ≡ (sq − ˆsq) + q · (iq(sp − sq) mod p) −q · (iq(sp − ˆsq) mod p) ≡ (sq − ˆsq) + (q · iq mod p)(sp − sp mod p) − (q · iq mod p) ≡1 (mod p) (sq − ˆsq mod p) ≡ 0 (mod p). =⇒ p = gcd(n, s − ˆs) SCA Concepts April 22nd, 2024 48 / 53 DFA of RSA Countermeasure for RSA Compute a signature twice and compare the two results Verify the signature with the public exponent e, but in some applications (Java card), one does not have access to the public exponent e during signature generation Shamir: random r is first chosen and then modular exp. is computed based on r · n Find s∗ = md (mod r · n) Find Z = md (mod r) If s∗ = Z (mod r) Output s = s∗ (mod n) SCA Concepts April 22nd, 2024 49 / 53 Conclusions Conclusions The goal of this lecture is to go more into detail of Side-Channel and Fault Injection Analyses by presenting in detail the two most powerful techniques against cryptographic implementations. What are the assumptions of CPA? What are the assumptions of DFA on RSA? Provide more background on Side-Channel Attacks Thank you for the attendance :-) SCA Concepts April 22nd, 2024 50 / 53 Conclusions Questions ? SCA Concepts April 22nd, 2024 51 / 53 Extra Masking Principle: randomizing intermediate values with a secret sharing scheme so DPA fails Boolean masking: a dth-order secure Boolean masking scheme splits a sensitive value x into d + 1 shares (x0, x1, ..., xd ), as follows: x = x0 ⊕ x1 ⊕ · · · ⊕ xd The number of traces required for a successful attack grows exponentially w.r.t. the security order d. Probing-secure scheme. We refer to a scheme that uses certain families of shares as t−probing-secure iff any set of at most t intermediate variables is independent from the sensitive values. SCA Concepts April 22nd, 2024 52 / 53 Extra Masking with 2 shares X = X1 ⊕ X2 The leakage L(X) = HW(X1, X2) depends on two variables. It does not reveal any information on the value of X when a DPA is performed SCA Concepts April 22nd, 2024 53 / 53