PV181 Laboratory of security and applied cryptography Seminar 12: Crypto-libraries protected against hardware attacks Łukasz Chmielewski chmiel@fi.muni.cz | PV1811 Outline • Recall + goal of this seminar – Digital signatures – RSA vs. ECC • Side Channel + Fault Injection speed run • Secured X25519 library: sca25519 – Optionally: Demo • Assignment this week: – Securing RSA execution 2 | PV181 encryption decryption message Alice Public key of Bob Bob Adapted Source: Network and Internetwork Security (Stallings) Private key of Bob Encrypted message Decrypted original message | PV1813 Recall: Asymmetric cryptosystem Recall: Digital signature scheme 4 | PV181 Signature algorithm Verification algorithmmessage signed message Alice Public key of Alice Bob Source: Network and Internetwork Security (Stallings) verified message Private key of Alice Is there a difference? Recall: RSA vs. ECC 5 | PV181 • exponentiation ≈ scalar multiplication • multiplication ≈ points addition • squaring ≈ point doubling Why is hardware security important? 6 | PV181 Identity Theft • Premium Content Theft Impersonation Card / Money Theft Phone / Money Theft 7 | PV181 Side- Channel Analysis 8 | PV181 Cookies Example 9 | PV181 Passive vs Active Side Channels 10 | PV181 Passive: analyze device behavior Active: change device behavior Recent Practical Attacks 11 | PV181 Side Channels • Time • Power • Electro Magnetic Emanations • Light • Sound • Temperature • … 12 | PV181 What can be attacked & why? • Type of device? • What kind of primitive? • How much control do you have? • What can you access? • What would be the attacker’s goal? • What is your goal? • Where is the money? • … 13 | PV181 Practical Setup Spectrum 14 | PV181 Some Other Practical Setups 15 | PV181 Actual (overcomplicated?) setup 16 | PV181 Simple Power Analysis (SPA) on RSA 18 | PV181 1996. A = 1 for ( i = n-1; i≥0; i− −) A = A2 mod N if (di = =1) A = A*c mod N end if end for Return A = cd mod N ModExp(c){ } M M M MS S S … S M … S S S S S S S S S 1 0 1 0 0 0 1 0 0 1 0 Probe “By carefully measuring the amount of time required to perform private key operations, attackers may be able to find […] RSA keys.” Differential (Correlation) Power Analysis 19 | PV181 1999 1999 2004 random inputs … ntraces Guess ෡𝒅 bits target state User: • HW of a register • HD between current and previous register state • ID model (value of a register) 𝑓𝑖 = Selection Function(random inputs, ෡𝒅, target state) DPA = Difference of Means 𝑓𝑖 = ቊ 0 𝑖𝑓 𝐻𝑊 ≤ 16 1 𝑖𝑓 𝐻𝑊 > 16 𝑓𝑖 = 𝐻𝑊(𝑟𝑒𝑔_𝑠𝑡𝑎𝑡𝑒) CPA = Pearson correlation ModExp(𝒅) • Problems with the above approaches: • can we attack the key directly? • we often do not get many traces with the same secret • can we use an unprotected device of the same model? • (Possible) Solution: • We profile, i.e. template the unprotected device • We use the profile to break the protected device • Procedure: 1. Choose a model that describes the power consumption 2. Profile the unprotected device to create the template (Template Building) 3. Use the template to break the protected device (Template Matching) • The same steps are always performed but the model can be different. • So often we will not learn the secret but the hamming weight of the secret. • Neural Networks can be used instead of Template Attacks • Attacking Single Trace or Multiple Traces? Both are possible. 20 Controlled unprotected device Attacked protected device Profiled Attacks Goals of Fault Injection 21 | PV181 • The goal is to change a critical value or to change the flow of a program. • Faults can be injected in several ways: – Power glitches can disturb the power supply to the processor, resulting in wrong values read from memory. – Optical glitches with laser can force any elementary circuit to switch, enabling the attacker to achieve a very specific change of data values or behavior. – Clock manipulation by introducing a few very short clock cycles which may lead to the device misinterpreting a value read from memory. – Cutting the power to the processor while performing important computations, hoping to either prevent the system from taking measures against a detected attack or get the system into a vulnerable state when the power is back. • Differential Fault Analysis (DFA) Fault Injection Example: the “unlooper” device 22 | PV181 Fault Injection Example: the “unlooper” device Warm-up question (1): where to glitch? 23 | PV181 RSA-CRT: Differential Fault Analysis • Optimization of computing a signature giving about 3 or 4-fold speed-up • Precompute the following values: – Find dp = d (mod p-1), computed as dp = e-1 (mod p-1) – Find dq = d (mod q-1) – Compute iq = q-1 (mod p) • Computations using mp = m (mod p) and mq = m (mod q) • Signature or encryption (forgetting about hashing): – sp = 𝑚 𝑑 𝑝 (mod p) – sq = 𝑚 𝑑 𝑞 (mod q) – Garner’s method (1965) to recombine sp and sq: • s = sq + q · (iq(sp − sq) (mod p)) • Due to a limited time, we need to skip the math details on how to recover p and q, but it is possible with one fault! – If you are interested, ask me after the seminar; it is a so-called Bellcore attack, see for example: https://eprint.iacr.org/2012/553.pdf 24 | PV181 How to protect against FI? 25 | PV181 • You have to check that the operations was correctly executed, for example: – Duplication of operations; – For signature generation you can verify the result – Some SCA countermeasures will work even for FI • But not all Warm-up Question (2): Software for PIN code verification 26 | PV181 • What is the problem here? • What are the execution times of the process for PIN inputs? • [0,1,2,3], [5,3,0,2], [5,9,0,0] • The execution time increases as we get closer to • [5,9,0,2] Warm-up Task – parity check for DES key 27 | PV181 Warm-up Task – parity check for DES key cont’d 28 | PV181 Tell me what is the key ☺ 29 | PV181 Warm-up Task – parity check for DES key cont’d Question 1: faster and more secure modexp - Montgomery ladder 30 | PV181 x0=x; x1=x2 for j=k-2 to 0 { if dj=0 x1=x0*x1; x0=x0 2 else x0=x0*x1; x1=x1 2 x1=x1 mod N x0=x0 mod N } return x0 Both branches with the same number and type of operations (unlike square and multiply on previous slide) Is it constant-time & secure? Why? Question 2: even more secure modexp 31 | PV181 x0=x; x1=x2 for j=k-2 to 0 { b=dj x(1-b)=x0*x1; xb=xb 2 x1=x1 mod N x0=x0 mod N } return x0 Memory access often is not constant time! Especially in the presence of caches. Is it constant-time & secure? Why? Question 3: even more secure modexp 32 | PV181 x0=x; x1=x2 for j=k-2 to 0 { b=dj x(1-b)=x0*x1; xb=xb 2 x1=x1 mod N x0=x0 mod N } return x0 Memory access often is not constant time! Especially in the presence of caches. Is it constant-time & secure? Why? Question 4: even more more secure modexp 33 | PV181 x0=x; x1=x2; sw = 0 for j=k-2 to 0 { b=dj cswap(x0,x1,b⊕sw) sw = sw⊕di x1=x0*x1; x0=x0 2 x1=x1 mod N x0=x0 mod N } return x0 Constant-time? Depends on the cswap… but it can be ☺ Other-side channels? Depends  Is it constant-time & secure? Why? Question 5: Arithmetic Cswap – constant-time? 34 | PV181 Question 5: Arithmetic Cswap – secure against other side-channels? 35 | PV181 sample 𝒕𝒊 ⋮ Scalar multiplication trace 255 iterations ⋯ sample index samplevalue(V) Apply clustering (e.g. k-means), Template Attack, Deep Learning to the set of 255 samples: bits 0 bits 1 255 Samples255 Montgomery iterations Message and exponent blinding 36 | PV181 𝒄 = 𝒎 𝒅 𝒎𝒐𝒅 𝑵 1. 𝒎 𝒓 = 𝒎. 𝒓−𝒆 𝒎𝒐𝒅 𝑵 2. 𝒅 𝒓 = 𝒅 + 𝒓 ∗ 𝝋(𝒏) 3. 𝒄 𝒓 = 𝒎 𝒓 𝒅 𝒓 𝒎𝒐𝒅 𝒏 4. 𝒄 = 𝒄 𝒓 ∗ 𝒓 𝒎𝒐𝒅 𝒏 message blinding message “unblinding” exponent blinding blinded exponentiation The sequence of operations (S, M) is related to the exponent bits. However: • If d is random: the sequence of exponent bits changes for every RSA execution • If m is random: Intermediate data is random (masked) → hardly predicted! DPA is based on the prediction of intermediate data. Thesis: Any side-channel attack requiring multiple traces are repelled by message and exponent blinding countermeasures. For ECC there are corresponding countermeasures: coordinate blinding, scalar blinding, blinded scalar multiplications, and no unblinding ☺ Message and exponent blinding for CRT? 37 | PV181 𝒄 = 𝒎 𝒅 𝒎𝒐𝒅 𝑵 1. 𝒎 𝒓 = 𝒎. 𝒓−𝒆 𝒎𝒐𝒅 𝑵 2. 𝒅 𝒓 = 𝒅 + 𝒓 ∗ 𝝋(𝒏) 3. 𝒄 𝒓 = 𝒎 𝒓 𝒅 𝒓 𝒎𝒐𝒅 𝒏 4. 𝒄 = 𝒄 𝒓 ∗ 𝒓 𝒎𝒐𝒅 𝒏 message blinding message “unblinding” exponent blinding blinded exponentiation • Message blinding is the same! • Exponent blinding needs to be done twice: sp = 𝑚 𝑑 𝑝 (mod p) = 𝑚 𝑑 𝑝+r*(p-1) (mod p) sq = 𝑚 𝑑 𝑞 (mod q) = 𝑚 𝑑 𝑞+r*(q-1) (mod q) • That does not stop FI attacks! SCA&FI-protected Elliptic Curve library • A protected library for ECDH – key exchange & session key establishment – It will be published in TCHES2023 volume 1 and • presented at Ches 2023 in Prague • Code library available from GitHub • Useful links: – https://eprint.iacr.org/2021/1003 – https://github.com/sca-secure-library-sca25519/sca25519 • Taking care of ECDSA: – https://eprint.iacr.org/2022/1254 – I will add it to the repository later on. 38 | PV181 What to do first • Download (or clone) the code from: – https://github.com/sca-secure-library-sca25519/sca25519 • If you do not know C then it will be tricky but in this case try to be intuitive. • Task 1: have a look at the STM32F407-unprotected: – Please find the starting point. – Please find the scalar multiplication function. • And the scalar multiplication loop. – What the code is doing? 39 Task 1: Unprotected Crypto Library 40 Task 1: Unprotected Crypto Library cont’d 41 Protected Crypto Library – other implementations Ephemeral & Static increase complexity 42 Task 2: Ephemeral Crypto Library • Have a look at the STM32F407-ephermeral (and STM32F407-static): – Find scalar multiplication functions and the scalar multiplication loops • Try to find one side-channel countermeasure and one fault injection countermeasure. Have also a look at the list of implemented countermeasures in: – https://tches.iacr.org/index.php/TCHES/issue/view/312 • Can you explain the countermeasures? • If you have time, then try to find one or two more countermeasures Remark: do not worry – this is a hard exercise. 43 Task 2: Ephemeral Crypto Library - FI 44 Find the same countermeasure in the static implementation. Task 2: Ephemeral Crypto Library - SCA 45 Task 2: Ephemeral Crypto Library – SCA cont’d 46 Task 3: Static Crypto Library – SCA • Find scalar splitting (similar to blinding): 1. Generate 64-bit r and computer r-1 2. Compute P’ = [r-1*k]*P 3. Compute [r]*P’ = [k]P • Does it work? • Find this countermeasure in the static SCA code: Steps 2 and 3. 47 Exercise: Protected Crypto Library 3 Step 2 Step 3 48 Efficiency Demo (Optionally) 49 Demo Instructions • Open in a browser: https://github.com/sca-secure- library-sca25519/sca25519 • And follow the instructions from there – There are some issues related to the libopencm3 library • You need a Discover board and an FTDI cable • git clone https://github.com/sca-secure-library- sca25519/sca25519.git 50 Assignment 9 – Countermeasures • This is a programming assignment. Please upload your scripts/code and the required analysis via the course webpage. • The deadline for submission is Dec. 13, 2023, 8:00. – -3 points for each started 24h after the deadline. • Your code should be contained in one .py file. Please name the submission file as _hw9.zip. Put there both the python code, the analysis document, and all data produced during analysis (as long as the size is reasonable). • The code must contain comments so that it is reasonably easy to understand how to run the script for evaluating each answer. 51 | PV181 Assignment 9 - Tasks 1. Task 1: protect the CRT implementation with exponent blinding in the function TCR_protected! First, test and then modify the code (the result should be the same). In a separate report (max 2 pages), write why the countermeasure works (does not affect the correctness of the result). Then, perform a useful analysis of the efficiency cost of the countermeasure (repeat the experiment a number of times and report a percent increase). [3.5 points] 2. Task 2: protect the CRT implementation with message blinding! Note that this will require knowledge of e. In the document, write why the countermeasure works. Then, perform a useful analysis of the cost of the countermeasure. [3.0 points] 3. Task 3: protect the CRT implementation against fault injection! Any countermeasure is OK. In the document, write why the countermeasure works. Then, perform a useful analysis of the cost of the countermeasure. [2.5 points] 4. Task 4: combine all the countermeasures and measure the time of all additional countermeasures and how well they work. Write that in the report. [1 points] 5. Bonus (3 points): (a) Instead of exponent blinding, implement exponent splitting. How does it compare to blinding? [1 point] (b) Implement another extra countermeasure (any, it can be either SCA or FI). What is its cost? [1 point] (c) Implement yet another extra countermeasure (any, either SCA or FI). What is its cost? [1 point] Remark: we are securing Python code and, for the sake of this exercise, assume that the code is directly executed by the processor (and not interpreted etc.) Consultation: Monday at 13:00 in A406. Good luck!!! 52 | PV181