PV181 Laboratory of security
and applied cryptography
Seminar 12:
Crypto-libraries protected against hardware attacks
Łukasz Chmielewski
chmiel@fi.muni.cz
| PV1811
Outline
• Recall + goal of this seminar
– Digital signatures
– RSA vs. ECC
• Side Channel + Fault Injection speed run
• Secured X25519 library: sca25519
– Optionally: Demo
• Assignment this week:
– Securing RSA execution
2 | PV181
encryption decryption
message
Alice
Public key of Bob
Bob
Adapted Source: Network and
Internetwork Security (Stallings)
Private key of Bob
Encrypted
message
Decrypted
original
message
| PV1813
Recall: Asymmetric cryptosystem
Recall: Digital signature scheme
4 | PV181
Signature
algorithm
Verification
algorithmmessage signed
message
Alice
Public key of Alice
Bob
Source: Network and
Internetwork Security (Stallings)
verified
message
Private key of Alice
Is there a difference?
Recall: RSA vs. ECC
5 | PV181
• exponentiation ≈ scalar
multiplication
• multiplication ≈ points addition
• squaring ≈ point doubling
Why is hardware security important?
6 | PV181
Identity Theft
• Premium
Content Theft
Impersonation
Card / Money Theft
Phone / Money Theft
7 | PV181
Side-
Channel
Analysis
8 | PV181
Cookies Example
9 | PV181
Passive vs Active Side Channels
10 | PV181
Passive: analyze device behavior Active: change device behavior
Recent Practical Attacks
11 | PV181
Side Channels
• Time
• Power
• Electro Magnetic Emanations
• Light
• Sound
• Temperature
• …
12 | PV181
What can be attacked & why?
• Type of device?
• What kind of primitive?
• How much control do you have?
• What can you access?
• What would be the attacker’s goal?
• What is your goal?
• Where is the money?
• …
13 | PV181
Practical Setup Spectrum
14 | PV181
Some Other Practical Setups
15 | PV181
Actual (overcomplicated?) setup
16 | PV181
Simple Power Analysis (SPA) on RSA
18 | PV181
1996.
A = 1
for ( i = n-1; i≥0; i− −)
A = A2 mod N
if (di = =1)
A = A*c mod N
end if
end for
Return A = cd mod N
ModExp(c){
}
M M M MS S S …
S
M
… S S S S S S S S S
1 0 1 0 0 0 1 0 0 1 0
Probe
“By carefully measuring the amount of time required to perform
private key operations, attackers may be able to find […] RSA
keys.”
Differential (Correlation) Power Analysis
19 | PV181
1999
1999
2004
random
inputs
…
ntraces
Guess ෡𝒅 bits target state
User:
• HW of a register
• HD between current and previous register state
• ID model (value of a register)
𝑓𝑖 = Selection Function(random inputs, ෡𝒅, target state)
DPA = Difference of Means
𝑓𝑖 = ቊ
0 𝑖𝑓 𝐻𝑊 ≤ 16
1 𝑖𝑓 𝐻𝑊 > 16
𝑓𝑖 = 𝐻𝑊(𝑟𝑒𝑔_𝑠𝑡𝑎𝑡𝑒)
CPA = Pearson correlation
ModExp(𝒅)
• Problems with the above approaches:
• can we attack the key directly?
• we often do not get many traces with the
same secret
• can we use an unprotected device of the
same model?
• (Possible) Solution:
• We profile, i.e. template the unprotected device
• We use the profile to break the protected device
• Procedure:
1. Choose a model that describes the power consumption
2. Profile the unprotected device to create the template (Template Building)
3. Use the template to break the protected device (Template Matching)
• The same steps are always performed but the model can be different.
• So often we will not learn the secret but the hamming weight of the secret.
• Neural Networks can be used instead of Template Attacks
• Attacking Single Trace or Multiple Traces? Both are possible.
20
Controlled
unprotected device
Attacked
protected device
Profiled Attacks
Goals of Fault Injection
21 | PV181
• The goal is to change a critical value or to change the flow
of a program.
• Faults can be injected in several ways:
– Power glitches can disturb the power supply to the processor, resulting in
wrong values read from memory.
– Optical glitches with laser can force any elementary circuit to switch,
enabling the attacker to achieve a very specific change of data values or
behavior.
– Clock manipulation by introducing a few very short clock cycles which may
lead to the device misinterpreting a value read from memory.
– Cutting the power to the processor while performing important computations,
hoping to either prevent the system from taking measures against a detected
attack or get the system into a vulnerable state when the power is back.
• Differential Fault Analysis (DFA)
Fault Injection Example:
the “unlooper” device
22 | PV181
Fault Injection Example: the “unlooper” device
Warm-up question (1): where to glitch?
23 | PV181
RSA-CRT: Differential Fault Analysis
• Optimization of computing a signature giving about 3 or 4-fold speed-up
• Precompute the following values:
– Find dp = d (mod p-1), computed as dp = e-1 (mod p-1)
– Find dq = d (mod q-1)
– Compute iq = q-1 (mod p)
• Computations using mp = m (mod p) and mq = m (mod q)
• Signature or encryption (forgetting about hashing):
– sp = 𝑚 𝑑 𝑝 (mod p)
– sq = 𝑚 𝑑 𝑞 (mod q)
– Garner’s method (1965) to recombine sp and sq:
• s = sq + q · (iq(sp − sq) (mod p))
• Due to a limited time, we need to skip the math details on how to recover
p and q, but it is possible with one fault!
– If you are interested, ask me after the seminar; it is a so-called Bellcore attack, see for
example: https://eprint.iacr.org/2012/553.pdf
24 | PV181
How to protect against FI?
25 | PV181
• You have to check that the operations was correctly
executed, for example:
– Duplication of operations;
– For signature generation you can verify the result
– Some SCA countermeasures will work even for FI
• But not all
Warm-up Question (2):
Software for PIN code verification
26 | PV181
• What is the problem here?
• What are the execution times of
the process for PIN inputs?
• [0,1,2,3], [5,3,0,2], [5,9,0,0]
• The execution time increases as
we get closer to
• [5,9,0,2]
Warm-up Task – parity check for DES key
27 | PV181
Warm-up Task – parity check for DES key
cont’d
28 | PV181
Tell me what is the key ☺
29 | PV181
Warm-up Task – parity check for DES key
cont’d
Question 1:
faster and more secure modexp - Montgomery ladder
30 | PV181
x0=x; x1=x2
for j=k-2 to 0 {
if dj=0
x1=x0*x1; x0=x0
2
else
x0=x0*x1; x1=x1
2
x1=x1 mod N
x0=x0 mod N
}
return x0
Both branches with the same
number and type of operations
(unlike square and multiply on
previous slide)
Is it constant-time & secure? Why?
Question 2:
even more secure modexp
31 | PV181
x0=x; x1=x2
for j=k-2 to 0 {
b=dj
x(1-b)=x0*x1; xb=xb
2
x1=x1 mod N
x0=x0 mod N
}
return x0
Memory access often is not
constant time!
Especially in the presence of
caches.
Is it constant-time & secure? Why?
Question 3:
even more secure modexp
32 | PV181
x0=x; x1=x2
for j=k-2 to 0 {
b=dj
x(1-b)=x0*x1; xb=xb
2
x1=x1 mod N
x0=x0 mod N
}
return x0
Memory access often is not
constant time!
Especially in the presence of
caches.
Is it constant-time & secure? Why?
Question 4:
even more more secure modexp
33 | PV181
x0=x; x1=x2; sw = 0
for j=k-2 to 0 {
b=dj
cswap(x0,x1,b⊕sw)
sw = sw⊕di
x1=x0*x1; x0=x0
2
x1=x1 mod N
x0=x0 mod N
}
return x0
Constant-time? Depends on the
cswap… but it can be ☺
Other-side channels? Depends 
Is it constant-time & secure? Why?
Question 5:
Arithmetic Cswap – constant-time?
34 | PV181
Question 5:
Arithmetic Cswap – secure against other side-channels?
35 | PV181
sample 𝒕𝒊
⋮
Scalar multiplication trace
255
iterations
⋯
sample index
samplevalue(V)
Apply clustering (e.g. k-means), Template Attack, Deep Learning
to the set of 255 samples:
bits 0 bits 1
255 Samples255 Montgomery iterations
Message and exponent blinding
36 | PV181
𝒄 = 𝒎 𝒅 𝒎𝒐𝒅 𝑵
1. 𝒎 𝒓 = 𝒎. 𝒓−𝒆
𝒎𝒐𝒅 𝑵
2. 𝒅 𝒓 = 𝒅 + 𝒓 ∗ 𝝋(𝒏)
3. 𝒄 𝒓 = 𝒎 𝒓
𝒅 𝒓 𝒎𝒐𝒅 𝒏
4. 𝒄 = 𝒄 𝒓 ∗ 𝒓 𝒎𝒐𝒅 𝒏
message blinding
message “unblinding”
exponent blinding
blinded exponentiation
The sequence of operations (S, M) is related to the exponent bits.
However:
• If d is random: the sequence of exponent bits changes for every RSA execution
• If m is random: Intermediate data is random (masked) → hardly predicted!
DPA is based on the prediction of intermediate data.
Thesis: Any side-channel attack requiring multiple traces are repelled by message and exponent blinding
countermeasures.
For ECC there are corresponding countermeasures: coordinate blinding, scalar blinding, blinded scalar
multiplications, and no unblinding ☺
Message and exponent blinding for CRT?
37 | PV181
𝒄 = 𝒎 𝒅 𝒎𝒐𝒅 𝑵
1. 𝒎 𝒓 = 𝒎. 𝒓−𝒆
𝒎𝒐𝒅 𝑵
2. 𝒅 𝒓 = 𝒅 + 𝒓 ∗ 𝝋(𝒏)
3. 𝒄 𝒓 = 𝒎 𝒓
𝒅 𝒓 𝒎𝒐𝒅 𝒏
4. 𝒄 = 𝒄 𝒓 ∗ 𝒓 𝒎𝒐𝒅 𝒏
message blinding
message “unblinding”
exponent blinding
blinded exponentiation
• Message blinding is the same!
• Exponent blinding needs to be done twice:
sp = 𝑚 𝑑 𝑝 (mod p) = 𝑚 𝑑 𝑝+r*(p-1) (mod p)
sq = 𝑚 𝑑 𝑞 (mod q) = 𝑚 𝑑 𝑞+r*(q-1) (mod q)
• That does not stop FI attacks!
SCA&FI-protected Elliptic Curve library
• A protected library for ECDH
– key exchange & session key establishment
– It will be published in TCHES2023 volume 1 and
• presented at Ches 2023 in Prague
• Code library available from GitHub
• Useful links:
– https://eprint.iacr.org/2021/1003
– https://github.com/sca-secure-library-sca25519/sca25519
• Taking care of ECDSA:
– https://eprint.iacr.org/2022/1254
– I will add it to the repository later on.
38 | PV181
What to do first
• Download (or clone) the code from:
– https://github.com/sca-secure-library-sca25519/sca25519
• If you do not know C then it will be tricky but in this
case try to be intuitive.
• Task 1: have a look at the STM32F407-unprotected:
– Please find the starting point.
– Please find the scalar multiplication function.
• And the scalar multiplication loop.
– What the code is doing?
39
Task 1: Unprotected Crypto Library
40
Task 1: Unprotected Crypto Library cont’d
41
Protected Crypto Library – other
implementations
Ephemeral & Static increase complexity
42
Task 2: Ephemeral Crypto Library
• Have a look at the STM32F407-ephermeral (and
STM32F407-static):
– Find scalar multiplication functions and the scalar multiplication loops
• Try to find one side-channel countermeasure and one fault
injection countermeasure. Have also a look at the list of
implemented countermeasures in:
– https://tches.iacr.org/index.php/TCHES/issue/view/312
• Can you explain the countermeasures?
• If you have time, then try to find one or two more
countermeasures
Remark: do not worry – this is a hard exercise.
43
Task 2: Ephemeral Crypto Library - FI
44
Find the same countermeasure
in the static implementation.
Task 2: Ephemeral Crypto Library - SCA
45
Task 2: Ephemeral Crypto Library – SCA
cont’d
46
Task 3: Static Crypto Library – SCA
• Find scalar splitting (similar to blinding):
1. Generate 64-bit r and computer r-1
2. Compute P’ = [r-1*k]*P
3. Compute [r]*P’ = [k]P
• Does it work?
• Find this countermeasure in the static SCA code:
Steps 2 and 3.
47
Exercise: Protected Crypto Library 3
Step 2 Step 3
48
Efficiency Demo (Optionally)
49
Demo Instructions
• Open in a browser: https://github.com/sca-secure-
library-sca25519/sca25519
• And follow the instructions from there
– There are some issues related to the libopencm3 library
• You need a Discover board and an FTDI cable
• git clone https://github.com/sca-secure-library-
sca25519/sca25519.git
50
Assignment 9 – Countermeasures
• This is a programming assignment. Please upload your
scripts/code and the required analysis via the course webpage.
• The deadline for submission is Dec. 13, 2023, 8:00.
– -3 points for each started 24h after the deadline.
• Your code should be contained in one .py file. Please name the
submission file as <uco_number>_hw9.zip. Put there both the
python code, the analysis document, and all data produced
during analysis (as long as the size is reasonable).
• The code must contain comments so that it is reasonably easy
to understand how to run the script for evaluating each answer.
51 | PV181
Assignment 9 - Tasks
1. Task 1: protect the CRT implementation with exponent blinding in the function TCR_protected! First, test
and then modify the code (the result should be the same). In a separate report (max 2 pages), write why
the countermeasure works (does not affect the correctness of the result). Then, perform a useful analysis
of the efficiency cost of the countermeasure (repeat the experiment a number of times and report a
percent increase). [3.5 points]
2. Task 2: protect the CRT implementation with message blinding! Note that this will require knowledge of e.
In the document, write why the countermeasure works. Then, perform a useful analysis of the cost of the
countermeasure. [3.0 points]
3. Task 3: protect the CRT implementation against fault injection! Any countermeasure is OK. In the
document, write why the countermeasure works. Then, perform a useful analysis of the cost of the
countermeasure. [2.5 points]
4. Task 4: combine all the countermeasures and measure the time of all additional countermeasures and
how well they work. Write that in the report. [1 points]
5. Bonus (3 points):
(a) Instead of exponent blinding, implement exponent splitting. How does it compare to blinding? [1 point]
(b) Implement another extra countermeasure (any, it can be either SCA or FI). What is its cost? [1 point]
(c) Implement yet another extra countermeasure (any, either SCA or FI). What is its cost? [1 point]
Remark: we are securing Python code and, for the sake of this exercise, assume that the code is directly
executed by the processor (and not interpreted etc.)
Consultation: Monday at 13:00 in A406.
Good luck!!!
52 | PV181