Write Your Own RSA Encryption Tool — Part 2 RSA

Tech & Math
5 min readMar 13, 2022

RSA, the most famous name in cryptography, what is it? In this series, I will walk you through the all you need to know about RSA — from the mathematical theory behind it to the final implementation in Python.

In this article, you will learn:

1.How RSA works.

2.Mathematics behind RSA

3.Implementation of RSA

Before that, you should have a working knowledge of modular arithmetics. If not, please consult the part 1 to learn or refresh on modular operations.

Rather than presenting dry mathematical theorems to you, let’s go through the process of RSA encryption/decryption process. Following convention, we have Alice and Bob, who are trying to communicate to each other.

RSA Ins and outs

Alice first puts some information to everyone: she will puts a tuple, (N, e), to the internet. Notice that everyone knows this information. Cool, so what is N? The formation of N is as follows: Alice randomly choose 2 really big prime numbers, p, and q, then N = p*q. e is called public key, and we have the following requirements: e must be coprime to least common multiplier of p-1 and q-1. (We say two numbers are coprime, if their gcd, greatest common divisor, is 1).

Before we go further, you might wonder, why do we have such a strange requirement on e? The reason is Extended Euclidean algorithm:

Extended Euclidean algorithm: This is an algorithm that allows us to find an integer m and n such that for any two integer a and b, we have: a*m+b*n = gcd(a,b), where gcd means greatest common divisor.

In this case, since e is coprime to lcm(p-1,q-1), then we can find integers m and n such that 1 = e * m + lcm(p-1,q-1) * n. For now, we ignore all proofs, just keep this fact in mind.

Then, after find appropriate pairs (N,e), Alice will have to generate a number called d, or private key. This time, Alice will keep this d to herself, not sharing to anyone. Here, as you might guess, have another peculiar requirement on d: d is the multiplicative inverse of e in modular system of lcm(p-1,q-1). In math:

d * e = (1 mod lcm(p-1,q-1))

Questions now arise: 1.Is there always a d that satisfies the requirement? 2.Can Alice find it quickly? The answer is Yes to both, thanks to Extended Euclidean algorithm.

Since we now know m and n such that 1 = e * m + lcm(p-1,q-1) * n. Then, we can just let d = m. Then, d * e = m * e = 1-lcm(p-1,q-1)*n. Now, we do modular arithmetics, we know d * e = (1 mod lcm(p-1,q-1)). Also, Extended Euclidean Algorithms have time complexity O(log(max(e, lcm(p-1,q-1)))), which is considered efficient for computers.

Now Alice has finished its part, and it is turn for Bob. Now, Bob wants to send a message msg to Alice. His steps are simple:

  1. He finds out (N, e) sent by Alice.
  2. He makes msg into numbers, calculates O = msg^e (mod N), and sends O to Alice

His steps are done. Simple, huh? Now Alice receives Bob’s message, and she needs to decrypt it. She only needs to do one operation: calculates O^d (mod N) and the result will be msg.

The working mathematics behind it is that msg = O^d (mod N) = (msg^e)^d (mod N), or for any number msg, msg^(e*d) = (msg mod N).

Here is where we learned Euler theorem: if gcd(a,N) = 1, then a^φ(n) = (1 mod N). Here φ(N) is Euler’s totient function: the number of integers in {1, 2, . . ., n-1} which are coprime to N. Interestingly, Euler’s totient function φ(N) = lcm(p-1,q-1).

As usual, we skipped all the proofs, but we can see how msg^(e*d) = (msg mod N). Since d * e = (1 mod lcm(p-1,q-1)), then, d * e =1 + h * lcm(p-1,q-1), = 1 + h * φ(N) where h is just an integer.

Then, (msg^e)^d = msg ^ (e*d) = msg ^ (1+φ(N)) = msg * msg ^ φ(N). Using Euler theorem, we know msg ^ φ(N) = (1 mod N), then msg * msg ^ φ(N) = (1 mod N). Therefore, we proved (msg^e)^d = (msg mod N).

Why RSA is Secure

Util now, we finish encryption and decryption process, but we never discuss why it is hard for someone else to decrypt the message. We translate this into mathematics language. Given msg^e (mod N), how do we find the value of msg? You might notice that, the reason we can recover the value of msg is because we have this magical private key d. The reason we have this magical d is because we use Extended Euclidean Algorithm, and the key of Extended Euclidean Algorithm is that we know value of lcm(p-1,q-1). In other words, this is all due to the fact we know value of p and q. This is to say, given N, if we can factor p and q easily. RSA process will become meaningless.

Why prime factorization is so hard? The reason behind this is too complicated to discuss here. But, keep in mind that for large number, prime factorization has no known efficient algorithms.

Implementation

To implement RSA, we have to identify some key process. In order to simplify things, we ignore the prime number generation (p, q) and selection of public key e. However, in practice, they are extremely important.

We get p, q and e.

Now, the implementation is straightforward. The only part left is to generate private key d. To achieve this, we only need to implement Extended Euclidean Algorithms:

Implementation to get private key d

After that, it is time to test and see amazing RSA algorithm works! (There is only one caveat: the msg should be less than the N)

Some random tests

The full code will be put here, and feel free to grab them for your need:

RSA.py:

import math
# Step 1: We generate two large prime numbers. Here, we take p,q as given:
p = 337
q = 557
N = p*q
print(N)

# Calculate the Euler Totient number of N
Phi_N = math.lcm(p-1, q-1)
print(Phi_N)

# In this step, we generate some public key
e = 17

#Extended Euclidean Algorithm
def extended_euclidean_algo(a,b):
if(a%b==0):
return(b,0,1)
else:
gcd,m,n = extended_euclidean_algo(b,a%b)
m = m — ((a//b) * n)
return(gcd,n,m)

def find_multiplicative_inverse(e,Phi_N):
gcd,m,_= extended_euclidean_algo(e,Phi_N)
if(gcd!=1):
#e number have to be coprime to Phi_N to have mul inverse.
return None
else:
return m%Phi_N

d = find_multiplicative_inverse(e, Phi_N)
print(d)
print((d*e) % Phi_N)

def encrypt(msg):
return (msg**e)%N

def decrypt(encrypted_msg):
return (encrypted_msg**d)%N

--

--