PV181 Laboratory of security and applied cryptography Seminar 9: Crypto-libraries protected against hardware attacks | PV1811 Łukasz Chmielewski Email: chmiel@fi.muni.cz Consultations: A406, 9.00-11.00 on Fridays Outline • Recall + goal of this seminar – Digital signatures – RSA and a bit about ECC • Side Channel + Fault Injection speed run • Secured X25519 library: sca25519 – Optionally (but unlikely): Demo • Assignment this week: – Securing RSA execution 2 | PV181 encryption decryption message Alice Public key of Bob Bob Adapted Source: Network and Internetwork Security (Stallings) Private key of Bob Encrypted message Decrypted original message | PV1813 Recall: Asymmetric cryptosystem Recall: Digital signature scheme 4 | PV181 Signature algorithm Verification algorithmmessage signed message Alice Public key of Alice Bob Source: Network and Internetwork Security (Stallings) verified message Private key of Alice Is there a difference? RSA (recall) 5 RSA: reminder 1. Secret primes 𝑝, 𝑞: 𝑛 = 𝑝 ∙ 𝑞 2. Public exponent 𝑒: gcd 𝑒, (𝑝 − 1) = gcd 𝑒, (𝑞 − 1) = 1 3. Private exponent 𝑑: 𝑑 ∙ 𝑒 ≡ 1 𝑚𝑜𝑑 𝜑 𝑛 Encryption (public 𝑛, 𝑒): 𝐸 𝑚 = 𝑚 𝑒 𝑚𝑜𝑑 𝑛 = 𝑐 Decryption (private 𝑛, 𝑑): 𝐷 𝑐 = 𝑐 𝑑 𝑚𝑜𝑑 𝑛 = 𝑚 6 | PV181 RSA-CRT + demo • Optimization of computing a signature giving about 3 or 4-fold speed-up • Precompute the following values: – Find dp = d (mod p-1), computed as dp = e-1 (mod p-1) – Find dq = d (mod q-1) – Compute iq = q-1 (mod p) • Computations using mp = m (mod p) and mq = m (mod q) • Signature or encryption (forgetting about hashing): – sp = 𝑚 𝑑 𝑝 (mod p) – sq = 𝑚 𝑑 𝑞 (mod q) – Garner’s method (1965) to recombine sp and sq: • s = sq + q · (iq(sp − sq) (mod p)) • Computations using mp = m (mod p) and mq = m (mod q) • Open RSA.py and run it. Analyze it, what are your conclusions? – What is the speed improvement? 7 | PV181 ECC (recall) 8 Recall: RSA vs. ECC 9 | PV181 • exponentiation ≈ scalar multiplication • multiplication ≈ points addition • squaring ≈ point doubling • The next few slides be ECC recall Elliptic curve example • Example • y2 = x3 - 3 x2 + 5 over ℚ, and ∞ • How would it look over a finite field, • for example: Fp? for p = 7919 10 | PV181 Can you see a pattern? Elliptic curve implementations • Group operation over the curve: addition and doubling 11 | PV181 Elliptic curve implementations’ details • Above operations on the finite field: • … 12 | PV181 ECC keys • Generating key pair – Select a random integer d from [1,n − 1] – Compute P = [d]G = d*G; – Hard to get d from P and G! • Private key: d • Public key: P, – also: G, and curve details are also public • For 256-bit curve – the private key d will be approx. 256-bit long – the public key P is a point on the curve – will be approx 512-bit long, unless compressed 14 | PV181 SCA & FI 15 Why is hardware security important? 16 | PV181 Identity Theft • Premium Content Theft Impersonation Card / Money Theft Phone / Money Theft Cookies Example 17 | PV181 https://www.simplethread.com/great-scott-timing-attack-demo/ Passive vs Active Side Channels 18 | PV181 Passive: analyze device behavior Active: change device behavior Recent Practical Attacks 19 | PV181 What can be attacked & why? • Type of device? • What kind of primitive? • How much control do you have? • What can you access? • What would be the attacker’s goal? • What is your goal? • Where is the money? • … 21 | PV181 Some Practical Setups 22 | PV181 Simple Power Analysis (SPA) on RSA 23 | PV181 1996. A = 1 for ( i = n-1; i≥0; i− −) A = A2 mod N if (di = =1) A = A*c mod N end if end for Return A = cd mod N ModExp(c){ } M M M MS S S … S M … S S S S S S S S S 1 0 1 0 0 0 1 0 0 1 0 Probe “By carefully measuring the amount of time required to perform private key operations, attackers may be able to find […] RSA keys.” Differential (Correlation) Power Analysis 24 | PV181 1999 1999 2004 random inputs … ntraces Guess ෡𝒅 bits target state User: • HW of a register • HD between current and previous register state • ID model (value of a register) 𝑓𝑖 = Selection Function(random inputs, ෡𝒅, target state) DPA = Difference of Means 𝑓𝑖 = ቊ 0 𝑖𝑓 𝐻𝑊 ≤ 16 1 𝑖𝑓 𝐻𝑊 > 16 𝑓𝑖 = 𝐻𝑊(𝑟𝑒𝑔_𝑠𝑡𝑎𝑡𝑒) CPA = Pearson correlation ModExp(𝒅) Goals of Fault Injection 26 | PV181 • The goal is to change a critical value or to change the flow of a program. • Faults can be injected in several ways: – Power glitches – Optical glitches with laser – Clock manipulation by introducing a few very short clock cycles – Cutting the power to the processor while performing important computations • Differential Fault Analysis (DFA) RSA-CRT: Differential Fault Analysis https://eprint.iacr.org/2012/553.pdf • Optimization of computing a signature giving about 3 or 4-fold speed-up • Precompute the following values: – Find dp = d (mod p-1), computed as dp = e-1 (mod p-1) – Find dq = d (mod q-1) – Compute iq = q-1 (mod p) • Computations using mp = m (mod p) and mq = m (mod q) • Signature or encryption (forgetting about hashing): – sp = 𝑚 𝑑 𝑝 (mod p) – sq = 𝑚 𝑑 𝑞 (mod q) – Garner’s method (1965) to recombine sp and sq: • s = sq + q · (iq(sp − sq) (mod p)) • Due to a limited time, we need to skip the math details on how to recover p and q, but it is possible with one fault! – If you are interested, ask me after the seminar; it is a so-called Bellcore attack, see for example: https://eprint.iacr.org/2012/553.pdf 29 | PV181 How to protect against FI? 30 | PV181 • You have to check that the operations was correctly executed, for example: – Duplication of operations; – For signature generation you can verify the result – Some SCA countermeasures will work even for FI • But not all Warm-up Question (1-2): Software for PIN code verification 31 | PV181 • What is the problem here? • What are the execution times of the process for PIN inputs? • [0,1,2,3], [5,3,0,2], [5,9,0,0] • The execution time increases as we get closer to • [5,9,0,2] • How would you perform a fault injection attack here? Warm-up Task – parity check for DES key 32 | PV181 Warm-up Task – parity check for DES key cont’d 33 | PV181 Tell me what is the key ☺ 34 | PV181 Warm-up Task – parity check for DES key cont’d Question 1: faster and more secure modexp - Montgomery ladder 35 | PV181 x0=x; x1=x2 for j=k-2 to 0 { if dj=0 x1=x0*x1; x0=x0 2 else x0=x0*x1; x1=x1 2 x1=x1 mod N x0=x0 mod N } return x0 Both branches with the same number and type of operations (unlike square and multiply on previous slide) Is it constant-time & secure? Why? Question 2: even more secure modexp 36 | PV181 x0=x; x1=x2 for j=k-2 to 0 { b=dj x(1-b)=x0*x1; xb=xb 2 x1=x1 mod N x0=x0 mod N } return x0 Memory access often is not constant time! Especially in the presence of caches. Is it constant-time & secure? Why? Question 3: even more secure modexp 37 | PV181 x0=x; x1=x2 for j=k-2 to 0 { b=dj x(1-b)=x0*x1; xb=xb 2 x1=x1 mod N x0=x0 mod N } return x0 Memory access often is not constant time! Especially in the presence of caches. Is it constant-time & secure? Why? Question 4: even more more secure modexp 38 | PV181 x0=x; x1=x2; sw = 0 for j=k-2 to 0 { b=dj cswap(x0,x1,b⊕sw) sw = sw⊕di x1=x0*x1; x0=x0 2 x1=x1 mod N x0=x0 mod N } return x0 Constant-time? Depends on the cswap… but it can be ☺ Other-side channels? Depends  Is it constant-time & secure? Why? Message and exponent blinding 41 | PV181 𝒄 = 𝒎 𝒅 𝒎𝒐𝒅 𝑵 1. 𝒎 𝒓 = 𝒎. 𝒓−𝒆 𝒎𝒐𝒅 𝑵 2. 𝒅 𝒓 = 𝒅 + 𝒓 ∗ 𝝋(𝒏) 3. 𝒄 𝒓 = 𝒎 𝒓 𝒅 𝒓 𝒎𝒐𝒅 𝒏 4. 𝒄 = 𝒄 𝒓 ∗ 𝒓 𝒎𝒐𝒅 𝒏 message blinding message “unblinding” exponent blinding blinded exponentiation The sequence of operations (S, M) is related to the exponent bits. However: • If d is random: the sequence of exponent bits changes for every RSA execution • If m is random: Intermediate data is random (masked) → hardly predicted! Message and exponent blinding 42 | PV181 𝒄 = 𝒎 𝒅 𝒎𝒐𝒅 𝑵 1. 𝒎 𝒓 = 𝒎. 𝒓−𝒆 𝒎𝒐𝒅 𝑵 2. 𝒅 𝒓 = 𝒅 + 𝒓 ∗ 𝝋(𝒏) 3. 𝒄 𝒓 = 𝒎 𝒓 𝒅 𝒓 𝒎𝒐𝒅 𝒏 4. 𝒄 = 𝒄 𝒓 ∗ 𝒓 𝒎𝒐𝒅 𝒏 message blinding message “unblinding” exponent blinding blinded exponentiation • Message blinding is the same! • Exponent blinding needs to be done twice: sp = 𝑚 𝑑 𝑝 (mod p) = 𝑚 𝑑 𝑝+r*(p-1) (mod p) sq = 𝑚 𝑑 𝑞 (mod q) = 𝑚 𝑑 𝑞+r*(q-1) (mod q) • That does not stop FI attacks! Why do coordinate and scalar blinding protect ECC against SCA? 43 | PV181 𝑴 = 𝒔 𝑷 = 𝒔 𝑿, 𝒀 = [𝒔](𝒙, 𝒚, 𝟏) 1.𝑴 = [𝒔](𝒙. 𝒛, 𝒚. 𝒛, 𝒛) 2. 𝒔 𝒓 = 𝒔 + 𝒓. |𝑬| 3. 𝑴 𝒓 = [𝒔 𝒓](𝒙. 𝒛, 𝒚. 𝒛, 𝒛) 4. coordinate blinding no unblinding scalar blinding blinded scalar mult. The same situation as for RSA. Point blinding is also possible but not presented above. Note: there are of course differences in some detailed countermeasures. CODE INSPECTION PROTECTED CRYPTO LIBRARY 44 SCA&FI-protected Elliptic Curve library • A protected library for ECDH – key exchange & session key establishment – It will be published in TCHES2023 volume 1 and • presented at Ches 2023 in Prague • Code library available from GitHub • Useful links: – https://eprint.iacr.org/2021/1003 – https://github.com/sca-secure-library-sca25519/sca25519 • Taking care of ECDSA: – https://eprint.iacr.org/2022/1254 – I will add it to the repository later on. 45 | PV181 What to do first • Download (or clone) the code from: – https://github.com/sca-secure-library-sca25519/sca25519 • If you do not know C then it will be tricky but in this case try to be intuitive. • Task 1: have a look at the STM32F407-unprotected: – Please find the starting point. – Please find the scalar multiplication function. • And the scalar multiplication loop. – What the code is doing? 46 Task 1: Unprotected Crypto Library 47 Task 1: Unprotected Crypto Library cont’d 48 Protected Crypto Library – other implementations Ephemeral & Static increase complexity 49 Task 2: Ephemeral Crypto Library • Have a look at the STM32F407-ephermeral (and STM32F407-static): – Find scalar multiplication functions and the scalar multiplication loops • Try to find one side-channel countermeasure and one fault injection countermeasure. Have also a look at the list of implemented countermeasures in: – https://tches.iacr.org/index.php/TCHES/issue/view/312 • Can you explain the countermeasures? • If you have time, then try to find one or two more countermeasures Remark: do not worry – this is a hard exercise. 50 Task 2: Ephemeral Crypto Library - FI 51 Find the same countermeasure in the static implementation. Task 2: Ephemeral Crypto Library - SCA 52 Task 2: Ephemeral Crypto Library – SCA cont’d 53 Task 3: Static Crypto Library – SCA • Find scalar splitting (similar to blinding): 1. Generate 64-bit r and computer r-1 2. Compute P’ = [r-1*k]*P 3. Compute [r]*P’ = [k]P • Does it work? • Find this countermeasure in the static SCA code: Steps 2 and 3. 54 Exercise: Protected Crypto Library 3 Step 2 Step 3 55 Efficiency Demo (Optionally) 56 Demo Instructions • Open in a browser: https://github.com/sca-secure- library-sca25519/sca25519 • And follow the instructions from there – There are some issues related to the libopencm3 library • You need a Discover board and an FTDI cable • git clone https://github.com/sca-secure-library- sca25519/sca25519.git 57 CONCLUSIONS & QUESTIONS 58 Assignment 7 – Countermeasures • This is a programming assignment. Please upload your scripts/code and the required analysis via the course webpage. • The deadline for submission is Nov. 28, 2024, 8:00. – -3 points for each started 24h after the deadline. • Your code should be contained in one .py file. Please name the submission file as _hw7.zip. Put there both the python code, the analysis document, and all data produced during analysis (as long as the size is reasonable). • The code must contain comments so that it is reasonably easy to understand how to run the script for evaluating each answer. 59 | PV181 Assignment 7 - Tasks 1. Have a look at the RSA_homework.py file. There are some comments for you there too. Protect the CRT implementation with exponent blinding in the function TCR_protected! First, test and then modify the code (the result should be the same). In a separate report (max 2 pages), write why the countermeasure works (does not affect the correctness of the result). Then, perform a useful analysis of the efficiency cost of the countermeasure (repeat the experiment a number of times and report a percent increase). [2.0 points] 2. Protect the CRT implementation with message blinding! Note that this will require knowledge of the public exponent e. In the document, write why the countermeasure works. Then, perform a useful analysis of the cost of the countermeasure. [3.0 points] 3. Protect the CRT implementation against fault injection! Any countermeasure is OK. In the document, write why the countermeasure works. Then, perform a useful analysis of the cost of the countermeasure. [1.5] 4. Combine all the countermeasures and measure the time of all additional countermeasures and how well they work. Write that in the report. [1.5 points] 5. Instead of exponent blinding, implement exponent splitting. How does it compare to blinding efficiencywise? Order the countermeasures with respect to their efficiency. [2 point] 6. Bonus: - Implement another extra countermeasure (any, it can be either SCA or FI). What is its cost? [1 point] Remark: we are securing Python code and, for the sake of this exercise, assume that the code is directly executed by the processor (and not interpreted etc.) Consultation: Friday at 9:00 in A406. Good luck!!! 60 | PV181