How does the SHA-256 algorithm work?
In cryptography, there are three main kinds of algorithms: symmetric-key cryptography, asymmetric-key cryptography, and hash functions. SHA-256, which stands for <b>Secure Hash Algorithm 256-bit</b>, is an example of a <b>hash function</b>. It takes an input of practically any size (up to 2<sup>64</sup> - 1 bits) and produces a 256-bit output. Here I will try to explain what hash functions are and how the SHA-256 algorithm works.
Hash functions are a class of algorithms in cryptography. A hash algorithm takes an input of any size and produces an output of a fixed size (for SHA-256, the output is 256 bits). The algorithm should also be efficient for a computer to evaluate.
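To make the "any size in, fixed size out" property concrete, here is a small sketch using Python's standard `hashlib` module. The three sample inputs are arbitrary choices for illustration, not anything prescribed by SHA-256.

```python
import hashlib

# Inputs of very different lengths (arbitrary sample data).
samples = [b"", b"abc", b"a" * 1_000_000]

def digest_bits(data: bytes) -> int:
    """Return the size of the SHA-256 digest of `data`, in bits."""
    return len(hashlib.sha256(data).digest()) * 8

for data in samples:
    # The digest size does not depend on the input length.
    print(len(data), "bytes in ->", digest_bits(data), "bits out")
```

Whatever the input length, the digest is 32 bytes, that is, 256 bits.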
For security purposes, the hash functions we use in the real world need to be <b>collision-free</b> (more precisely, collision-resistant). Ideally, every distinct input would give a distinct output, but no hash function can achieve this. Take SHA-256 as an example: it has only 2<sup>256</sup> possible outputs, while the number of possible inputs is vastly larger. So by the pigeonhole principle, <b>it can't give a unique output for every unique input</b>, and collisions must exist. What we actually require is that finding one is computationally infeasible. <br/>
<img src="https://bl6pap004files.storage.live.com/y4mo-EG5mnINZGJbp057eaJgFFlwm9ljJgHH54WsDed5s9rapwRTn7RpNg6CxunURjbX1eWOaWcsHySxYGZyMov51FRY7docS7JgRYcFkGDaf6M3GO2zQY5xhYnYoR2nAOQQ_1okkxw6yDs0UXTGpaM0lcpjcj9APjWtqg4AaTPiNmZO1LvZrWhYZV7FPDk03yb?width=256&height=176&cropmode=none" width="256" height="176" />
In practice, a hashing algorithm is considered collision-free as long as nobody has found a collision for it. For some hashing algorithms, such as MD5, collisions have actually been found. Another important property a cryptographic hashing algorithm should have is that, given only the output, <b>it should be computationally infeasible to deduce an input that produces it</b> (this is called preimage resistance). These properties make cryptographic hashing algorithms safe to use.
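The pigeonhole argument can be seen in action by shrinking the output space. The sketch below keeps only the first 2 bytes (16 bits) of the SHA-256 digest, a deliberately weakened toy hash, and tries the inputs `b"0"`, `b"1"`, `b"2"`, ... until two of them collide. The helper names `truncated_sha256` and `find_collision` are made up for this illustration. With only 65,536 possible outputs a collision is guaranteed within 65,537 inputs, and usually appears much sooner; the full 256-bit output is what makes the same search hopeless for real SHA-256.

```python
import hashlib
from itertools import count

def truncated_sha256(data: bytes, n_bytes: int = 2) -> bytes:
    """First n_bytes of the SHA-256 digest (a toy hash, NOT secure)."""
    return hashlib.sha256(data).digest()[:n_bytes]

def find_collision(n_bytes: int = 2):
    """Hash b"0", b"1", b"2", ... until two inputs share a truncated digest."""
    seen = {}  # truncated digest -> first input that produced it
    for i in count():
        msg = str(i).encode()
        tag = truncated_sha256(msg, n_bytes)
        if tag in seen:
            return seen[tag], msg, tag
        seen[tag] = msg

first, second, tag = find_collision()
print(first, second, tag.hex())
```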
<h2><b><center>Working Of SHA-256 Algorithm</center></b></h2>
Even though MD5 is still widely used, it is no longer considered secure, and SHA-256 is generally recommended for use in software systems. When we pass input through the SHA-256 algorithm, it first <b>adds some padding to the input</b>. Padding is always added, even when the input length is already a multiple of 512 bits. The padding consists of a single 1 bit, followed by as many 0 bits as needed, followed by the length of the original input written as a 64-bit big-endian integer, so that the total length of the padded input is a multiple of 512 bits. <br/>
<img src="https://bl6pap004files.storage.live.com/y4mXWUO7XQIEz9Vss2AKXb6hIKimP3fT-V_tbw6Itsa50ZYJ0CJoLJ0OCXcubPWsPlBKdgYpuWH2cPYPaSQamv0Tw-tqLtPRuBWlJwJY7xnEMgBhgYuSRS6VWIPFHnJsE9T-ESD7s5R0i2tMmRhPIMxYmUFYEwS_Quj7YbaYf26XIhw2H1b3vOlcVVnRlh6KQfa?width=256&height=48&cropmode=none" width="256" height="48" />
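The padding rule defined in the SHA-256 standard (FIPS 180-4) is a 1 bit, then 0 bits, then the original length as a 64-bit big-endian integer. Here is a minimal sketch of it, assuming the message is a whole number of bytes (the standard itself is defined on bits). The function name `sha256_pad` is my own.

```python
def sha256_pad(message: bytes) -> bytes:
    """Pad `message` so its length is a multiple of 64 bytes (512 bits)."""
    bit_len = len(message) * 8
    padded = message + b"\x80"  # a 1 bit followed by seven 0 bits
    # Zero bytes until the length is 56 mod 64, leaving room for 8 length bytes.
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    padded += bit_len.to_bytes(8, "big")  # original length in bits
    return padded

for n in (0, 3, 55, 56, 64):
    print(n, "bytes ->", len(sha256_pad(b"a" * n)), "bytes after padding")
```

Note that a 56-byte or 64-byte message grows to two blocks: there is always at least one padding bit plus the 64-bit length field.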
After adding the padding, the algorithm divides the input into blocks of 512 bits. It then starts from a fixed initial value called the <b>Initialization Vector (IV)</b>, which is 256 bits long. The IV is the same for every input and is made up of the following eight predefined 32-bit words (all values in hex):
<center><i><b>H<sub>0</sub> = 6a09e667
H<sub>1</sub> = bb67ae85
H<sub>2</sub> = 3c6ef372
H<sub>3</sub> = a54ff53a
H<sub>4</sub> = 510e527f
H<sub>5</sub> = 9b05688c
H<sub>6</sub> = 1f83d9ab
H<sub>7</sub> = 5be0cd19</b></i></center>
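These eight words are not arbitrary: the standard defines them as the first 32 bits of the fractional parts of the square roots of the first eight prime numbers. The sketch below recomputes them with exact integer arithmetic (`math.isqrt` needs Python 3.8 or later). The names `PRIMES` and `initial_hash_words` are my own.

```python
from math import isqrt

PRIMES = [2, 3, 5, 7, 11, 13, 17, 19]  # the first eight primes

def initial_hash_words():
    # isqrt(p << 64) equals floor(sqrt(p) * 2**32); its low 32 bits are
    # the first 32 bits of the fractional part of sqrt(p).
    return [isqrt(p << 64) & 0xFFFFFFFF for p in PRIMES]

words = initial_hash_words()
# Concatenating the eight 32-bit words gives the 256-bit IV.
iv = b"".join(w.to_bytes(4, "big") for w in words)

for i, w in enumerate(words):
    print(f"H{i} = {w:08x}")
```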
Because the IV is fixed by the standard, it is the same for every input. We pass the IV and the first 512-bit block through a compression function, whose output is 256 bits long. We then repeat this step, passing the output of the previous step and the next 512-bit block through the compression function, until all the blocks have been processed. <br/>
<img src="https://bl6pap004files.storage.live.com/y4meCRlrKgnvDopRrF2RXv1JzOEnDpmZhDrSnDcoThmoBke1L3X5Uwy_z-A6POVX5nafqLvlfypVflTlXpV9et70oGCLKUEcGIh3eWCR-WZnwhnGGUrykXdPUBRQYL2wO4DeJZ2iifW4wh69bDL2yR8vdtSxeKE-G1Le_0BnAzL-5Z6qR_BlTjz2OBIxIFO94ja?width=256&height=118&cropmode=none" width="256" height="118" />
The final output of the compression function is the output of the SHA-256 algorithm. This way of chaining a compression function over the blocks is called the <b>Merkle-Damgård construction</b>. The function is called a compression function because it takes an input of 768 bits (256 + 512) and returns an output of 256 bits. If the compression function is collision-free, the Merkle-Damgård construction guarantees that SHA-256 as a whole is collision-free as well.
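Putting the pieces together, here is a compact pure-Python sketch of the whole pipeline: padding, the IV, and the compression function chained over the 512-bit blocks. It is meant for reading, not for production use (it is slow and handles byte-aligned input only), and the helper names are my own. The 64 round constants `K` are not discussed above; the standard defines them as the first 32 bits of the fractional parts of the cube roots of the first 64 primes. The result is compared with `hashlib` as a sanity check.

```python
import hashlib
from math import isqrt

MASK = 0xFFFFFFFF  # all arithmetic is on 32-bit words

def _primes(n):
    """First n primes by trial division (fine for n = 64)."""
    found, cand = [], 2
    while len(found) < n:
        if all(cand % p for p in found):
            found.append(cand)
        cand += 1
    return found

def _icbrt(n):
    """Integer cube root by binary search."""
    lo, hi = 0, 1 << (n.bit_length() // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

IV = [isqrt(p << 64) & MASK for p in _primes(8)]   # square-root constants
K = [_icbrt(p << 96) & MASK for p in _primes(64)]  # cube-root round constants

def _rotr(x, n):
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK

def compress(state, block):
    """One compression step: 256-bit state + 512-bit block -> 256-bit state."""
    # Message schedule: 16 words from the block, expanded to 64.
    w = [int.from_bytes(block[i:i + 4], "big") for i in range(0, 64, 4)]
    for t in range(16, 64):
        s0 = _rotr(w[t - 15], 7) ^ _rotr(w[t - 15], 18) ^ (w[t - 15] >> 3)
        s1 = _rotr(w[t - 2], 17) ^ _rotr(w[t - 2], 19) ^ (w[t - 2] >> 10)
        w.append((w[t - 16] + s0 + w[t - 7] + s1) & MASK)
    a, b, c, d, e, f, g, h = state
    for t in range(64):
        big_s1 = _rotr(e, 6) ^ _rotr(e, 11) ^ _rotr(e, 25)
        ch = (e & f) ^ (~e & g)
        t1 = (h + big_s1 + ch + K[t] + w[t]) & MASK
        big_s0 = _rotr(a, 2) ^ _rotr(a, 13) ^ _rotr(a, 22)
        maj = (a & b) ^ (a & c) ^ (b & c)
        t2 = (big_s0 + maj) & MASK
        a, b, c, d, e, f, g, h = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
    # Feed-forward: add the working variables back into the incoming state.
    return [(x + y) & MASK for x, y in zip(state, (a, b, c, d, e, f, g, h))]

def sha256(message: bytes) -> bytes:
    # Padding: 0x80, zero bytes, then the bit length as 8 big-endian bytes.
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    padded += (len(message) * 8).to_bytes(8, "big")
    # Chain the compression function over the blocks, starting from the IV.
    state = IV
    for i in range(0, len(padded), 64):
        state = compress(state, padded[i:i + 64])
    return b"".join(x.to_bytes(4, "big") for x in state)

print(sha256(b"abc").hex())
print(sha256(b"abc") == hashlib.sha256(b"abc").digest())
```

The loop inside `sha256` is the chaining step described above: each call to `compress` takes the previous 256-bit state and the next 512-bit block, and the last state is the digest.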
- Ojas Srivastava, 11:37 PM, 17 Dec, 2021