How does the SHA-256 algorithm work?




In cryptography, there are three main kinds of algorithms: symmetric-key cryptography, asymmetric-key cryptography, and hash functions. SHA-256, which stands for <b>Secure Hash Algorithm 256-bit</b>, is an example of a <b>hash function</b>. It takes an input of any size and produces an output of exactly 256 bits. Here I will try to explain what hash functions are and how the SHA-256 algorithm works.
<h2><b><center>Hash Functions</center></b></h2>
Hash functions are a family of algorithms used in cryptography. A hash algorithm takes an input of any size and produces an output of a fixed size (for SHA-256, the output is 256 bits). It should also be efficient for a computer to calculate. For security purposes, real-world systems need hash functions that are <b>collision-free</b>, meaning that no two different inputs produce the same output. Strictly speaking, this is impossible for any hash function. Take SHA-256 as an example: it has only 2<sup>256</sup> possible outputs, while the number of possible inputs is practically unlimited. So by the pigeonhole principle, <b>it cannot give a unique output for every unique input</b>. <br/> <img src="" width="256" height="176" />
In practice, a hashing algorithm is treated as collision-free (more precisely, collision-resistant) as long as nobody knows a feasible way to find two inputs with the same output. Some hashing algorithms, such as MD5, have been broken in this sense: collisions for them have actually been found. Another important property of a cryptographic hashing algorithm is that, given only the output, <b>it should be practically impossible to deduce the actual input</b>. These properties make cryptographic hashing algorithms safe to use.
<h2><b><center>Working Of SHA-256 Algorithm</center></b></h2>
Even though MD5 is still widely used, SHA-256 is generally the recommended choice for software systems.
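To make the points about fixed output size and the pigeonhole principle concrete, here is a minimal Python sketch using the standard `hashlib` module. The helper names (`sha256_hex`, `tiny_hash`, `find_tiny_collision`) are made up for this illustration. `tiny_hash` is a deliberately weakened toy that keeps only the first byte of the digest, so it has just 256 possible outputs and a collision is guaranteed among 257 distinct inputs. It says nothing about how hard collisions are to find for full SHA-256.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as 64 hex characters (256 bits)."""
    return hashlib.sha256(data).hexdigest()


def tiny_hash(data: bytes) -> int:
    """Hypothetical toy hash: keep only the first byte of SHA-256 (256 possible outputs)."""
    return hashlib.sha256(data).digest()[0]


def find_tiny_collision():
    """Hash 257 distinct inputs into 256 possible outputs; two of them must collide."""
    seen = {}  # maps toy hash value -> first message that produced it
    for i in range(257):
        message = str(i).encode()
        digest = tiny_hash(message)
        if digest in seen:
            return seen[digest], message
        seen[digest] = message
    raise AssertionError("unreachable: 257 inputs cannot fit into 256 outputs")


if __name__ == "__main__":
    print(len(sha256_hex(b"abc")) * 4)            # digest size in bits
    print(len(sha256_hex(b"a" * 1_000_000)) * 4)  # same size for a much larger input
    first, second = find_tiny_collision()
    print(first, second, tiny_hash(first) == tiny_hash(second))
```

The same pigeonhole argument applies to SHA-256 itself. The difference is only that with 2<sup>256</sup> outputs, no one knows a feasible way to actually find two colliding inputs.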
When we pass an input through the SHA-256 algorithm, it first <b>adds some padding to the input</b>. Padding is always added, even if the size of the input is already a multiple of 512 bits. It consists of a single 1 bit, followed by as many 0 bits as needed, followed by the length of the original input written as a 64-bit number, so that the size of the resulting input is a multiple of 512 bits. <br/> <img src="" width="256" height="48" />
After adding the padding, the algorithm divides the input into blocks of 512 bits. It then starts from a fixed initial value called the <b>Initialization Vector (IV)</b>, which is 256 bits long. The IV is made up of the following eight predefined 32-bit words (all values in hex): <center><i><b>H<sub>0</sub> = 6a09e667 H<sub>1</sub> = bb67ae85 H<sub>2</sub> = 3c6ef372 H<sub>3</sub> = a54ff53a H<sub>4</sub> = 510e527f H<sub>5</sub> = 9b05688c H<sub>6</sub> = 1f83d9ab H<sub>7</sub> = 5be0cd19</b></i></center> The IV is therefore the same for every input.
We pass the IV and the first 512-bit block through a compression function, whose output is 256 bits long. We then repeat this step, passing the output of the previous step and the next 512-bit block through the compression function, until all the blocks have been processed. <br/> <img src="" width="256" height="118" />
The final output of the compression function is the output of the SHA-256 algorithm. This way of chaining a compression function over the blocks is known as the <b>Merkle-Damgård Construction</b>. The function is called a compression function because it takes an input of 768 bits (256 + 512) and returns an output of 256 bits. If the compression function is collision-free, then the SHA-256 algorithm built on it is collision-free as well.
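The padding and chaining steps described above can be sketched in Python as follows. This is an illustration under stated assumptions, not a SHA-256 implementation. The names `pad`, `toy_compress` and `merkle_damgard` are hypothetical, and `toy_compress` is only a stand-in with the right input and output sizes (768 bits in, 256 bits out), so the result will not match a real SHA-256 digest. The padding follows the SHA-256 standard, which appends the 64-bit message length after the 1 bit and the zero bits.

```python
import hashlib
import struct

BLOCK_BYTES = 64  # one block is 512 bits

# The eight 32-bit words H0..H7 listed above, concatenated into one 256-bit value.
IV = bytes.fromhex(
    "6a09e667" "bb67ae85" "3c6ef372" "a54ff53a"
    "510e527f" "9b05688c" "1f83d9ab" "5be0cd19"
)


def pad(message: bytes) -> bytes:
    """Append a 1 bit, zero bits, and the 64-bit big-endian length in bits."""
    bit_length = len(message) * 8
    padded = message + b"\x80"  # the 1 bit followed by seven 0 bits
    zero_bytes = (56 - len(padded) % BLOCK_BYTES) % BLOCK_BYTES
    padded += b"\x00" * zero_bytes
    return padded + struct.pack(">Q", bit_length)


def toy_compress(state: bytes, block: bytes) -> bytes:
    """Stand-in compression function (assumption): NOT the real SHA-256 one."""
    return hashlib.sha256(state + block).digest()


def merkle_damgard(message: bytes, compress=toy_compress) -> bytes:
    """Chain the compression function over all 512-bit blocks, starting from the IV."""
    state = IV
    padded = pad(message)
    for start in range(0, len(padded), BLOCK_BYTES):
        block = padded[start:start + BLOCK_BYTES]
        state = compress(state, block)  # 256-bit state + 512-bit block -> 256 bits
    return state


if __name__ == "__main__":
    print(len(pad(b"abc")) // BLOCK_BYTES)   # number of blocks after padding
    print(merkle_damgard(b"abc").hex())
```

Passing the compression function as a parameter keeps the chaining logic separate from the compression step itself, which mirrors how the construction is described: the loop stays the same whichever compression function is plugged in.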

- Ojas Srivastava, 11:37 PM, 17 Dec, 2021