### FPGA Super Hash Processor Design

General Overview and Theory

The goal of this project is to implement an FPGA core that can compute the SHA1, SHA256, and MD5 hash of any given message. The design is implemented and tested for an Intel Arria II FPGA using Quartus Prime and ModelSim.

What is a hash and why do we need them?

Any function that takes arbitrary data of any size as input and outputs data of a fixed size can be considered a hash function, and its results are called hashes. Hashes are used to verify data integrity, to speed up data lookups, in cryptographic algorithms, and to handle large data sets.

Overview of the Hashing Algorithms

The SHA1, SHA256, and MD5 hash algorithms are three of the most popular hash functions in use today. Although SHA1 and MD5 are older and can now be exploited, they are still used in many legacy and non-security-critical systems. SHA256 is more modern and secure, so it is used in many systems that require security, such as the Bitcoin blockchain.

Although the computations of these three algorithms differ, they all require the message to be preprocessed into an input that follows the guidelines below:

1. The processed input must be an integer multiple of 512 bits (512, 1024, 2048, etc.)

2. The last 64 bits of the processed input must hold the size of the message in bits (64-bit binary representation)

3. After the message there is a mandatory padding bit of value 1 plus padding bits of all zeros to make the input a multiple of 512 bits. If the message + padding bit of 1 + size is already a multiple of 512 bits, no zeros are needed.

Examples:

1. If the given message is 204 bits long, the processed input will look as follows:

Input[0:203] = message

Input[204] = 1

Input[205:447] = 0

Input[448:511] = size of message in a 64-bit binary representation

2. If the given message is 564 bits long, the processed input will look as follows:

Input[0:563] = message

Input[564] = 1

Input[565:959] = 0

Input[960:1023] = size of message in a 64-bit binary representation
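The three padding rules can be checked with a small simulation-only sketch. The function names `padded_bits` and `zero_pad_bits` are illustrative and are not part of the core; the assertions use the two examples above plus the boundary case where no zeros are needed.

```systemverilog
// Sketch only: padded input length for a message of msg_bits bits.
// padded_bits and zero_pad_bits are hypothetical helper names.
module tb_pad_length_sketch;

  // Smallest multiple of 512 that holds message + 1 pad bit + 64-bit size
  function automatic longint unsigned padded_bits(input longint unsigned msg_bits);
    return ((msg_bits + 1 + 64 + 511) / 512) * 512;
  endfunction

  // Number of zero bits between the mandatory 1 bit and the size field
  function automatic longint unsigned zero_pad_bits(input longint unsigned msg_bits);
    return padded_bits(msg_bits) - msg_bits - 1 - 64;
  endfunction

  initial begin
    // Example 1: 204-bit message, zeros occupy Input[205:447]
    assert (padded_bits(204) == 512 && zero_pad_bits(204) == 243)
      else $error("204-bit example failed");
    // Example 2: 564-bit message, zeros occupy Input[565:959]
    assert (padded_bits(564) == 1024 && zero_pad_bits(564) == 395)
      else $error("564-bit example failed");
    // message + 1 + 64 is exactly 512 bits: no zeros are needed
    assert (padded_bits(447) == 512 && zero_pad_bits(447) == 0)
      else $error("447-bit boundary case failed");
    $display("padding checks finished");
  end

endmodule
```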

After the message is processed, it is split into individual 512-bit blocks. Then each block sequentially goes through rounds of computation which gives our result, also known as the digest.

| Algorithm | Block Size [bits] | Digest Size [bits] | Computation Rounds Per Block |
|---|---|---|---|
| MD5 | 512 | 128 | 64 |
| SHA1 | 512 | 160 | 80 |
| SHA256 | 512 | 256 | 64 |
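The table can be captured as a small lookup package, which may be convenient for a testbench. This is a sketch: the package name, the enum type, and the function names are assumptions for illustration, while the opcode encodings (00 for MD5, 01 for SHA1, 10 for SHA256) follow the listing at the end of this article.

```systemverilog
// Sketch only: per-algorithm parameters from the table above.
// hash_params_sketch, hash_op_e and the function names are hypothetical.
package hash_params_sketch;

  typedef enum logic [1:0] {
    OP_MD5    = 2'b00,
    OP_SHA1   = 2'b01,
    OP_SHA256 = 2'b10
  } hash_op_e;

  // Computation rounds per 512-bit block
  function automatic int unsigned rounds_per_block(input hash_op_e op);
    case (op)
      OP_SHA1: return 80;
      default: return 64; // MD5 and SHA256
    endcase
  endfunction

  // Digest size in bits; dividing by 32 gives the number of output words
  function automatic int unsigned digest_bits(input hash_op_e op);
    case (op)
      OP_MD5:  return 128;
      OP_SHA1: return 160;
      default: return 256; // SHA256
    endcase
  endfunction

endpackage
```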

Basic Core and Testbench Layout

The Super Hash processor module takes the following inputs and outputs:

INPUTS:

Opcode: The desired hash operation (MD5, SHA256, SHA1)

Message Address: Location of the message in memory

Size: Size of the message

Output Address: Location where the digest must be written in memory

Clock: System clock for module

Reset: When set low, module is reset

OUTPUTS:

Memory Clock: Clock for memory (synced with system clock)

Memory Write Enable: Enable pin to make memory writeable

Done: Output pin which is set high when module has computed digest

Implementation:

This project is implemented using a state machine design with the following eight states.

STATE 1: SETUP

This is the initial state and the state the machine returns to when the reset pin is set low. In this state we initialize our hash constants (h0, h1, …) and fill the first block with zeros before it is filled with its actual values. It is important to do this now because each 512-bit block is processed sequentially and individually. The block is represented by sixteen 32-bit registers (w[0], w[1], …, w[15]), which total 512 bits. By always filling the 16 registers with zeros first, we only need to rewrite the registers we use and avoid having to work out how many zeros are needed for padding. Once this is done we switch to the IDLE state.

STATE 2: IDLE

Here we set the current_block variable to the number of 512-bit blocks needed for the given message size. We also set the registers variable to the total number of 32-bit words the message uses, set the counter variable to zero, set the limit variable to the number of registers we need to read for the current block (at most 16), and set the mem_addr output pin to the address of the first word of the message. Once this is done we move to the READ state.
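The quantities set in this state can be sketched as plain functions and checked against the 505-byte test message. The names `num_blocks`, `num_words`, and `first_limit` are illustrative. `num_blocks` uses the same formula as `determine_num_blocks` in the listing. `num_words` uses a ceiling division, which matches the listing's `size/4 + size[0]` for the 505-byte case but differs from it when `size % 4 == 2`.

```systemverilog
// Sketch only: block, word, and limit counts for a message of size_bytes bytes.
// Function names are hypothetical and not taken from the core.
module tb_block_count_sketch;

  // 512-bit blocks needed for message + 0x80 pad byte + 8 length bytes
  function automatic int unsigned num_blocks(input int unsigned size_bytes);
    return (size_bytes + 8) / 64 + 1;
  endfunction

  // 32-bit words occupied by the message (ceiling division)
  function automatic int unsigned num_words(input int unsigned size_bytes);
    return (size_bytes + 3) / 4;
  endfunction

  // Words to read for the first block, capped at 16
  function automatic int unsigned first_limit(input int unsigned size_bytes);
    int unsigned words = num_words(size_bytes);
    return (words > 16) ? 16 : words;
  endfunction

  initial begin
    assert (num_blocks(505) == 9)    else $error("505 bytes: block count");
    assert (num_words(505) == 127)   else $error("505 bytes: word count");
    assert (first_limit(505) == 16)  else $error("505 bytes: first limit");
    // 55 bytes + 1 + 8 fills one block exactly; 56 bytes spills into a second
    assert (num_blocks(55) == 1 && num_blocks(56) == 2)
      else $error("block boundary case");
    $display("block count checks finished");
  end

endmodule
```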

STATE 3: READ

Here we keep reading words from memory, convert them to big-endian, and store them in our w[] registers. The counter and mem_addr are incremented on each read. Once the counter exceeds the limit, we reset the counter, update the registers variable with the number of words still to be read, and move to the PAD state.

STATE 4: PAD

Here we go through a set of conditional statements that determine whether and where padding is required. Since the block is initially filled with zeros, we only need to handle the “1 bit” of padding and the “size pad” written to the last 64 bits of the final block. Finally we update the limit variable for the next block (we do this now rather than in the previous state because it depends on the registers variable, which was set in the previous cycle) and load our hash working variables (a, b, c, d, …) before moving to the HASH state.
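Because the message size is given in bytes, the “1 bit” always arrives as a 0x80 byte placed right after the last message byte. The sketch below mirrors the masks used by the `magic` function in the listing. The name `insert_pad_byte` and its `valid_bytes` argument (the value of `size % 4`) are illustrative, and the word is assumed to be in big-endian order already.

```systemverilog
// Sketch only: place the 0x80 pad byte after the last message byte of a word.
// insert_pad_byte and valid_bytes are hypothetical names.
module tb_pad_byte_sketch;

  function automatic logic [31:0] insert_pad_byte(input logic [31:0] word,
                                                  input logic [1:0]  valid_bytes);
    case (valid_bytes)
      2'd1:    return (word & 32'hFF000000) | 32'h00800000;
      2'd2:    return (word & 32'hFFFF0000) | 32'h00008000;
      2'd3:    return (word & 32'hFFFFFF00) | 32'h00000080;
      default: return 32'h80000000; // message ended on a word boundary
    endcase
  endfunction

  initial begin
    // Three message bytes 61 62 63, the fourth byte is replaced by the pad
    assert (insert_pad_byte(32'h61626364, 2'd3) == 32'h61626380)
      else $error("3-byte case");
    assert (insert_pad_byte(32'h6162FFFF, 2'd2) == 32'h61628000)
      else $error("2-byte case");
    assert (insert_pad_byte(32'h61000000, 2'd1) == 32'h61800000)
      else $error("1-byte case");
    $display("pad byte checks finished");
  end

endmodule
```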

STATE 5: HASH

This is the state where all the number crunching for the hash function is done. Once the computation is done, the hashing variables are updated and we move to the POST_HASH state.

STATE 6: POST_HASH

In this state we update the current_block variable, set our counter variable to zero, and set our block to all zeros again. We then determine the next stage of action. If there are more words to be read, we point the mem_addr output pin to the correct place and switch to the READ state. If there are no more words to be read but we require a block of padding we move to the PAD state. If all the blocks have been hashed, we move to the OUTPUT state.

STATE 7: OUTPUT

Once the hash results have all been computed, we begin outputting the result by setting the mem_addr pin to the output address and writing our results. When finished we move to the DONE state.

STATE 8: DONE

The done pin is set high and we reset back to the SETUP state.

Results

| | MD5 ONLY | SHA1 ONLY | SHA256 ONLY | ALL THREE |
|---|---|---|---|---|
| FMAX [MHz] | 98.3 | 147.36 | 114.56 | 95.74 |
| ALUTs | 2211 | 2307 | 2406 | 3962 |
| LOGIC REGISTERS | 954 | 1044 | 1208 | 1236 |
| AREA | 3165 | 3351 | 3614 | 5198 |
| CYCLES (505 byte message) | 754 | 899 | 758 | 2411 |
| DELAY [microseconds] | 7.6703967 | 6.10070575 | 6.616620112 | 25.18279 |
| AREA*DELAY [milliseconds * area] | 24.276806 | 20.443465 | 23.91246508 | 130.9001 |

The results above were achieved using the "performance high effort" compiler option on a Windows 10 64-bit machine running Quartus Prime Lite Edition v18. Overall the system achieved a good Fmax and the area was quite low, making this a compact design. However, there is still much room for improvement. The largest gains can be realized by breaking the computation down into smaller functions and implementing pipelining to eliminate superficial data dependencies. The second version of this core will focus on those two things.
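As a sanity check, the derived rows of the table can be reproduced from the measured ones. This assumes that AREA is the sum of ALUTs and logic registers and that DELAY is the cycle count divided by Fmax; both relationships are inferred from the numbers rather than stated explicitly. For the MD5-only build:

$$
\begin{aligned}
\text{AREA} &= 2211 + 954 = 3165 \\
\text{DELAY} &= \frac{754\ \text{cycles}}{98.3\ \text{MHz}} \approx 7.670\ \mu\text{s} \\
\text{AREA} \times \text{DELAY} &\approx 3165 \times 7.670\ \mu\text{s} \approx 24.28\ \text{ms} \cdot \text{area}
\end{aligned}
$$

The same arithmetic gives the other three columns, for example 899 / 147.36 MHz ≈ 6.101 µs for SHA1.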

SystemVerilog Code:

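module super_hash_processor(input logic clk, reset_n, start,
input logic [1:0] opcode,
input logic [31:0] message_addr, size, output_addr,
output logic done, mem_clk, mem_we,
output logic [15:0] mem_addr,
output logic [31:0] mem_write_data,
input logic [31:0] mem_read_data);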

enum logic [3:0] {SETUP=4'b0000, IDLE=4'b0001, READ=4'b0010, PAD=4'b0011, HASH=4'b0100, POST_HASH=4'b0101, OUTPUT=4'b0110, DONE=4'b1110} state;
logic [31:0] counter;
//logic [31:0] temp_address; //uncomment if doing more than one hash
logic [31:0] current_block;
logic [31:0] limit;
logic [31:0] registers;
logic [31:0] w[0:15];
logic [31:0] a, b, c, d, e, f, fsha1, g, h, k, temp;
logic [31:0] h0, h1, h2, h3, h4, h5, h6, h7;
logic [31:0] a1, b1, c1, d1, e1;

assign mem_clk = clk;

// right rotation
function logic [31:0] rightrotate(input logic [31:0] x,
input logic [7:0] r);
begin
rightrotate = (x >> r) | (x << (32-r));
end
endfunction

// SHA256 K constants
parameter int sha256_k[0:63] = '{
32'h428a2f98, 32'h71374491, 32'hb5c0fbcf, 32'he9b5dba5, 32'h3956c25b, 32'h59f111f1, 32'h923f82a4, 32'hab1c5ed5,
32'hd807aa98, 32'h12835b01, 32'h243185be, 32'h550c7dc3, 32'h72be5d74, 32'h80deb1fe, 32'h9bdc06a7, 32'hc19bf174,
32'he49b69c1, 32'hefbe4786, 32'h0fc19dc6, 32'h240ca1cc, 32'h2de92c6f, 32'h4a7484aa, 32'h5cb0a9dc, 32'h76f988da,
32'h983e5152, 32'ha831c66d, 32'hb00327c8, 32'hbf597fc7, 32'hc6e00bf3, 32'hd5a79147, 32'h06ca6351, 32'h14292967,
32'h27b70a85, 32'h2e1b2138, 32'h4d2c6dfc, 32'h53380d13, 32'h650a7354, 32'h766a0abb, 32'h81c2c92e, 32'h92722c85,
32'ha2bfe8a1, 32'ha81a664b, 32'hc24b8b70, 32'hc76c51a3, 32'hd192e819, 32'hd6990624, 32'hf40e3585, 32'h106aa070,
32'h19a4c116, 32'h1e376c08, 32'h2748774c, 32'h34b0bcb5, 32'h391c0cb3, 32'h4ed8aa4a, 32'h5b9cca4f, 32'h682e6ff3,
32'h748f82ee, 32'h78a5636f, 32'h84c87814, 32'h8cc70208, 32'h90befffa, 32'ha4506ceb, 32'hbef9a3f7, 32'hc67178f2
};

// SHA256 hash round
function logic [255:0] sha256_op(input logic [31:0] a, b, c, d, e, f, g, h, w,
input logic [7:0] t);
logic [31:0] S1, S0, ch, maj, t1, t2; // internal signals
begin
S1 = rightrotate(e, 6) ^ rightrotate(e, 11) ^ rightrotate(e, 25);
ch = (e & f) ^ ((~e) & g);
t1 = h + S1 + ch + sha256_k[t] + w;
S0 = rightrotate(a, 2) ^ rightrotate(a, 13) ^ rightrotate(a, 22);
maj = (a & b) ^ (a & c) ^ (b & c);
t2 = S0 + maj;

sha256_op = {t1 + t2, a, b, c, d + t1, e, f, g};
end
endfunction

// MD5 S constants
parameter byte S[0:63] = '{
8'd7, 8'd12, 8'd17, 8'd22, 8'd7, 8'd12, 8'd17, 8'd22, 8'd7, 8'd12, 8'd17, 8'd22, 8'd7, 8'd12, 8'd17, 8'd22,
8'd5, 8'd9,  8'd14, 8'd20, 8'd5, 8'd9,  8'd14, 8'd20, 8'd5, 8'd9,  8'd14, 8'd20, 8'd5, 8'd9,  8'd14, 8'd20,
8'd4, 8'd11, 8'd16, 8'd23, 8'd4, 8'd11, 8'd16, 8'd23, 8'd4, 8'd11, 8'd16, 8'd23, 8'd4, 8'd11, 8'd16, 8'd23,
8'd6, 8'd10, 8'd15, 8'd21, 8'd6, 8'd10, 8'd15, 8'd21, 8'd6, 8'd10, 8'd15, 8'd21, 8'd6, 8'd10, 8'd15, 8'd21
};

// MD5 K constants
parameter int md5_k[0:63] = '{
32'hd76aa478, 32'he8c7b756, 32'h242070db, 32'hc1bdceee,
32'hf57c0faf, 32'h4787c62a, 32'ha8304613, 32'hfd469501,
32'h698098d8, 32'h8b44f7af, 32'hffff5bb1, 32'h895cd7be,
32'h6b901122, 32'hfd987193, 32'ha679438e, 32'h49b40821,
32'hf61e2562, 32'hc040b340, 32'h265e5a51, 32'he9b6c7aa,
32'hd62f105d, 32'h02441453, 32'hd8a1e681, 32'he7d3fbc8,
32'h21e1cde6, 32'hc33707d6, 32'hf4d50d87, 32'h455a14ed,
32'ha9e3e905, 32'hfcefa3f8, 32'h676f02d9, 32'h8d2a4c8a,
32'hfffa3942, 32'h8771f681, 32'h6d9d6122, 32'hfde5380c,
32'ha4beea44, 32'h4bdecfa9, 32'hf6bb4b60, 32'hbebfbc70,
32'h289b7ec6, 32'heaa127fa, 32'hd4ef3085, 32'h04881d05,
32'hd9d4d039, 32'he6db99e5, 32'h1fa27cf8, 32'hc4ac5665,
32'hf4292244, 32'h432aff97, 32'hab9423a7, 32'hfc93a039,
32'h655b59c3, 32'h8f0ccc92, 32'hffeff47d, 32'h85845dd1,
32'h6fa87e4f, 32'hfe2ce6e0, 32'ha3014314, 32'h4e0811a1,
32'hf7537e82, 32'hbd3af235, 32'h2ad7d2bb, 32'heb86d391
};

// MD5 g
function logic[3:0] md5_g(input logic [7:0] t);
begin
if (t <= 15)
md5_g = t;
else if (t <= 31)
md5_g = (5*t + 1) % 16;
else if (t <= 47)
md5_g = (3*t + 5) % 16;
else
md5_g = (7*t) % 16;
end
endfunction

// MD5 f
function logic[31:0] md5_f(input logic [7:0] t);
begin
if (t <= 15)
md5_f = (b & c) | ((~b) & d);
else if (t <= 31)
md5_f = (d & b) | ((~d) & c);
else if (t <= 47)
md5_f = b ^ c ^ d;
else
md5_f = c ^ (b | (~d));
end
endfunction

// MD5 hash round
function logic[127:0] md5_op(input logic [31:0] a, b, c, d, w,
input logic [7:0] t);
logic [31:0] t1, t2; // internal signals
begin
//debug
/*
$display("w == %x\n",w);$display("--------------------------\n");
*/
t1 = a + md5_f(t) + md5_k[t] + w;
t2 = b + ((t1 << S[t])|(t1 >> (32-S[t])));
md5_op = {d, t2, b, c};
end
endfunction

//sha1 hash
function logic [159:0] hash_op(input logic [31:0] a, b, c, d, e, w, input logic [31:0] t);
if(t<=19)begin
k = 32'h5A827999;
fsha1 = (b & c) | ( (~b) & d);
end else

if(t <=39)begin
k = 32'h6ED9EBA1;
fsha1 = b ^ c ^ d;
end else

if(t<=59)begin
k = 32'h8F1BBCDC;
fsha1 = (b & c) | (b & d) | (c & d);
end else

begin
k = 32'hCA62C1D6;
fsha1 = b ^ c ^ d;
end
//debug
/*
$display("a == %x\n",a);$display("b == %x\n",b);
$display("c == %x\n",c);$display("d == %x\n",d);
$display("e == %x\n",e);$display("f == %x\n",f);
$display("w == %x\n",w);
*/
temp = {a[26:0],a[31:27]} + fsha1 + e + k + w;
//debug
/*
$display("temp == %x\n",temp);
$display("-------------------------------");
*/
e = d;
d = c;
c = {b[1:0],b[31:2]};
b = a;
a = temp;
hash_op = {a, b, c, d, e};
endfunction

// convert from little-endian to big-endian
function logic [31:0] changeEndian(input logic [31:0] value);
  changeEndian = {value[7:0], value[15:8], value[23:16], value[31:24]};
endfunction

//appending padding "1"
function logic [31:0] magic(input logic [31:0] value);
begin
  if(size%4 == 1)begin
    magic = ((value & 32'hFF000000) | 32'h00800000);
  end
  if(size%4 == 2) begin
    magic = ((value & 32'hFFFF0000) | 32'h00008000);
  end
  if(size%4 == 3) begin
    magic = ((value & 32'hFFFFFF00) | 32'h00000080);
  end
end
endfunction

//determine number of blocks
function logic [31:0] determine_num_blocks(input logic [31:0] size);
  determine_num_blocks = ((((size)+8)/64)+1);
endfunction

always_ff @(posedge clk, negedge reset_n)
begin
  if (!reset_n) begin
    state <= SETUP;
  end else casex (state)
    SETUP: begin
      //set block to zeros to avoid padding 0s
      for (int i=0; i<16; i++) w[i] <= 0;
      if(opcode == 2'b01 || opcode == 2'b00) begin // set constants depending on operation
        h0 <= 32'h67452301;
        h1 <= 32'hEFCDAB89;
        h2 <= 32'h98BADCFE;
        h3 <= 32'h10325476;
        h4 <= 32'hC3D2E1F0;
      end
      if(opcode == 2'b10) begin
        h0 <= 32'h6a09e667;
        h1 <= 32'hbb67ae85;
        h2 <= 32'h3c6ef372;
        h3 <= 32'ha54ff53a;
        h4 <= 32'h510e527f;
        h5 <= 32'h9b05688c;
        h6 <= 32'h1f83d9ab;
        h7 <= 32'h5be0cd19;
      end
      // SHA1 schedule offsets: same values the HASH state restores after each block,
      // set here so they are defined before the first block
      a1 <= 3;
      b1 <= 8;
      c1 <= 14;
      d1 <= 16;
      e1 <= 16;
      mem_we <= 0; // release the write enable so the next message can be read
      //temp_addr <= output_addr; //uncomment if doing more than one hash
      state <= IDLE;
      done <= 0;
    end

    IDLE: // start
      if(start) begin // READ first word
        current_block<=determine_num_blocks(size);
        registers <= size/4 + size[0];
        if(size >= 64) begin //if message is more than 512 bits
          limit <= 16;
        end
        if (size < 64) begin //if message is less than 512 bits
          limit <= size/4 + size[0];
        end
        counter <= 0;
        state <= READ;
        mem_addr <= message_addr - 1; //account for reading delays
      end

    READ: begin
      mem_addr <= mem_addr + 1;
      w[counter-2] <= changeEndian(mem_read_data); //-2 to account for delays
      counter <= counter + 1;
      state <= READ;
      if (counter>limit)begin
        counter <= 0;
        registers <= registers - limit;
        state <= PAD;
      end
    end

    PAD: begin
      if(current_block == 1)begin
        w[15] <= (size << 3); //append size
        if(size % 4 == 0)begin
          w[0] <= 32'h80000000; //full block of padding
        end
        if(limit < 16) begin
          if(size % 4 > 0)begin //if unfinished word need to magic
            w[limit-1] <= magic(w[limit-1]);
          end
          if(size % 4 == 0)begin //if finished word need to make w[limit] = 1
            w[limit] <= 32'h80000000;
          end
        end
      end
      if(current_block == 2 && size % 4 > 0 && registers < 1)begin //unfinished last word
        w[limit-1] <= magic(w[limit-1]);
      end
      if(registers<16)begin //set limit for next block
        limit <= registers;
      end else begin
        limit <= 16;
      end
      a <= h0; //prep hash constants
      b <= h1;
      c <= h2;
      d <= h3;
      e <= h4;
      f <= h5;
      g <= h6;
      h <= h7;
      state <= HASH;
    end

    HASH: begin
      if(opcode == 2'b00)begin //md5
        if(counter < 64)begin
          if(counter < 16)begin
            {a, b, c, d} <= md5_op(a, b, c, d, w[counter], counter);
          end
        end
        if(counter > 15)begin
          {a, b, c, d} <= md5_op(a, b, c, d, w[md5_g(counter)], counter);
        end
        counter <= counter + 1;
        if(counter==64)begin
          h0 <= h0 + a;
          h1 <= h1 + b;
          h2 <= h2 + c;
          h3 <= h3 + d;
          state <= POST_HASH;
        end
      end

      if(opcode == 2'b01)begin //sha1
        if(counter < 80)begin
          if(counter < 16)begin
            //debug
            //$display("t == %d\n",t);
            {a,b,c,d,e} <= hash_op(a, b, c, d, e, w[counter], counter);
            counter <= counter + 1;
          end
          if( counter >= 16)begin
            //debug
            // $display("t == %d\n",t);
            {a,b,c,d,e} <= hash_op(a, b, c, d, e, ((w[counter-a1]^w[counter-b1]^w[counter-c1]^w[counter-d1])<<1 | (w[counter-a1]^w[counter-b1]^w[counter-c1]^w[counter-d1])>>31), counter);
            w[counter-e1]<= ((w[counter-a1]^w[counter-b1]^w[counter-c1]^w[counter-d1])<<1 | (w[counter-a1]^w[counter-b1]^w[counter-c1]^w[counter-d1])>>31);
            counter <= counter + 1;
            if((((counter+1)-3)%16)==0)begin
              a1 <= a1 + 16;
            end
            if((((counter+1)-8)%16)==0)begin
              b1 <= b1 + 16;
            end
            if((((counter+1)-14)%16)==0)begin
              c1 <= c1 + 16;
            end
            if(((counter+1)%16)==0)begin
              d1 <= d1 + 16;
              e1 <= e1 + 16;
            end
          end
        end
        if (counter == 80)begin
          h0 <= h0 + a;
          h1 <= h1 + b;
          h2 <= h2 + c;
          h3 <= h3 + d;
          h4 <= h4 + e;
          a1 <= 3;
          b1 <= 8;
          c1 <= 14;
          d1 <= 16;
          e1 <= 16;
          state <= POST_HASH;
        end
      end

      if(opcode == 2'b10) begin //sha256
        if(counter < 64)begin
          if(counter < 16)begin
            {a, b, c, d, e, f, g, h} <= sha256_op(a, b, c, d, e, f, g, h, w[counter], counter);
          end
          if(counter> 14 )begin
            w[15] <= w[0] + (rightrotate(w[1], 7) ^ rightrotate(w[1], 18) ^ (w[1] >> 3)) + w[9] + (rightrotate(w[14], 17) ^ rightrotate(w[14], 19) ^ (w[14] >> 10));
            for (int i=0; i<15; i++) w[i] <= w[i+1];
          end
          if(counter > 15)begin
            {a, b, c, d, e, f, g, h} <= sha256_op(a, b, c, d, e, f, g, h, w[15], counter);
          end
          counter <= counter + 1;
        end
        if (counter == 64)begin
          h0 <= h0 + a;
          h1 <= h1 + b;
          h2 <= h2 + c;
          h3 <= h3 + d;
          h4 <= h4 + e;
          h5 <= h5 + f;
          h6 <= h6 + g;
          h7 <= h7 + h;
          state <= POST_HASH;
        end
      end
    end

    POST_HASH:begin
      //set block to zeros again
      for (int i=0; i<16; i++) w[i] <= 0;
      counter <= 0;
      current_block <= current_block - 1; //decrement block counter
      if(limit > 0)begin
        state <= READ;
        mem_addr <= mem_addr - 2;
      end
      if(limit < 1 )begin
        state <= PAD;
      end
      if(current_block == 1)begin
        state <= OUTPUT;
        //mem_addr <= temp_addr //uncomment if doing more than one hash
      end
    end

    OUTPUT:begin
      mem_we <= 1;
      mem_addr <= output_addr; //comment out if doing more than one hash
      if(counter == 0)begin
        mem_write_data <= h0;
      end
      if(counter == 1)begin
        mem_write_data <= h1;
      end
      if(counter == 2)begin
        mem_write_data <= h2;
      end
      if(counter == 3)begin
        mem_write_data<=h3;
        if (opcode == 2'b00) begin
          state <= DONE;
        end
      end
      if(counter==4)begin
        mem_write_data <= h4;
        if (opcode == 2'b01)begin
          state <= DONE;
        end
      end
      if(counter == 5)begin
        mem_write_data <= h5;
      end
      if(counter == 6)begin
        mem_write_data <= h6;
      end
      if(counter == 7)begin
        mem_write_data <= h7;
        if (opcode == 2'b10)begin
          state <= DONE;
        end
      end
      mem_addr <= output_addr + counter;
      counter <= counter + 1;
      //temp_address <= output_addr + counter; //uncomment if doing more than one hash
    end

    DONE: begin
/*
//debug
$display("-------- FINAL RESULT --------");
$display("h0== %x\n",h0);$display("h1 == %x\n",h1);
$display("h2 == %x\n",h2);$display("h3 == %x\n",h3);
$display("h4 == %x\n",h4);
*/
      done <= 1;
      state <= SETUP;
    end
  endcase
end

endmodule

module tb_super_hash_processor();

logic clk, reset_n, start;
logic [ 1:0] opcode;
logic [ 31:0] message_addr, size, output_addr;
logic done, mem_clk, mem_we;
logic [ 15:0] mem_addr;
logic [ 31:0] mem_write_data;
logic [ 31:0] mem_read_data;

logic [127:0] md5_hash; // results here
logic [159:0] sha1_hash; // results here
logic [255:0] sha256_hash; // results here

logic [ 31:0] dpsram[0:16383]; // each row has 32 bits
logic [ 31:0] dpsram_tb[0:16383]; // for result testing, testbench only

logic [ 31:0] message_seed; // modify message_seed below

int message_size = 505; // in bytes
int pad_length;

int t, m;
int outloop;
int cycles;
int total_cycles;
int rounds;

logic correct;

logic [127:0] md5_digest;
logic [159:0] sha1_digest;
logic [255:0] sha256_digest;

logic [ 31:0] h0;
logic [ 31:0] h1;
logic [ 31:0] h2;
logic [ 31:0] h3;
logic [ 31:0] h4;
logic [ 31:0] h5;
logic [ 31:0] h6;
logic [ 31:0] h7;

logic [ 31:0] a, b, c, d, e, f, g, h;
logic [ 31:0] s1, s0;

logic [ 31:0] w[0:79];

// instantiate your design
super_hash_processor super_hash_processor_inst (clk, reset_n, start, opcode, message_addr, size, output_addr, done,
  mem_clk, mem_we, mem_addr, mem_write_data, mem_read_data);

parameter string hnames[0:2] = {"MD5", "SHA1", "SHA256"};

// ---------------------------------------------------------------------------------------

// MD5 S constants
parameter byte S[0:63] = '{
  8'd7, 8'd12, 8'd17, 8'd22, 8'd7, 8'd12, 8'd17, 8'd22, 8'd7, 8'd12, 8'd17, 8'd22, 8'd7, 8'd12, 8'd17, 8'd22,
  8'd5, 8'd9, 8'd14, 8'd20, 8'd5, 8'd9, 8'd14, 8'd20, 8'd5, 8'd9, 8'd14, 8'd20, 8'd5, 8'd9, 8'd14, 8'd20,
  8'd4, 8'd11, 8'd16, 8'd23, 8'd4, 8'd11, 8'd16, 8'd23, 8'd4, 8'd11, 8'd16, 8'd23, 8'd4, 8'd11, 8'd16, 8'd23,
  8'd6, 8'd10, 8'd15, 8'd21, 8'd6, 8'd10, 8'd15, 8'd21, 8'd6, 8'd10, 8'd15, 8'd21, 8'd6, 8'd10, 8'd15, 8'd21
};

// MD5 K constants
parameter int md5_k[0:63] = '{
  32'hd76aa478, 32'he8c7b756, 32'h242070db, 32'hc1bdceee,
  32'hf57c0faf, 32'h4787c62a, 32'ha8304613, 32'hfd469501,
  32'h698098d8, 32'h8b44f7af, 32'hffff5bb1, 32'h895cd7be,
  32'h6b901122, 32'hfd987193, 32'ha679438e, 32'h49b40821,
  32'hf61e2562, 32'hc040b340, 32'h265e5a51, 32'he9b6c7aa,
  32'hd62f105d, 32'h02441453, 32'hd8a1e681, 32'he7d3fbc8,
  32'h21e1cde6, 32'hc33707d6, 32'hf4d50d87, 32'h455a14ed,
  32'ha9e3e905, 32'hfcefa3f8, 32'h676f02d9, 32'h8d2a4c8a,
  32'hfffa3942, 32'h8771f681, 32'h6d9d6122, 32'hfde5380c,
  32'ha4beea44, 32'h4bdecfa9, 32'hf6bb4b60, 32'hbebfbc70,
  32'h289b7ec6, 32'heaa127fa, 32'hd4ef3085, 32'h04881d05,
  32'hd9d4d039, 32'he6db99e5, 32'h1fa27cf8, 32'hc4ac5665,
  32'hf4292244, 32'h432aff97, 32'hab9423a7, 32'hfc93a039,
  32'h655b59c3, 32'h8f0ccc92, 32'hffeff47d, 32'h85845dd1,
  32'h6fa87e4f, 32'hfe2ce6e0, 32'ha3014314, 32'h4e0811a1,
  32'hf7537e82, 32'hbd3af235, 32'h2ad7d2bb, 32'heb86d391
};

// MD5 g
function logic[3:0] md5_g(input logic [7:0] t);
begin
  if (t <= 15)
    md5_g = t;
  else if (t <= 31)
    md5_g = (5*t + 1) % 16;
  else if (t <= 47)
    md5_g = (3*t + 5) % 16;
  else
    md5_g = (7*t) % 16;
end
endfunction

// MD5 f
function logic[31:0] md5_f(input logic [7:0] t);
begin
  if (t <= 15)
    md5_f = (b & c) | ((~b) & d);
  else if (t <= 31)
    md5_f = (d & b) | ((~d) & c);
  else if (t <= 47)
    md5_f = b ^ c ^ d;
  else
    md5_f = c ^ (b | (~d));
end
endfunction

// MD5 hash round
function logic[127:0] md5_op(input logic [31:0] a, b, c, d, w,
  input logic [7:0] t);
  logic [31:0] t1, t2; // internal signals
begin
  t1 = a + md5_f(t) + md5_k[t] + w;
  t2 = b + ((t1 << S[t])|(t1 >> (32-S[t])));
  md5_op = {d, t2, b, c};
end
endfunction

// ---------------------------------------------------------------------------------------

// SHA1 f
function logic [31:0] sha1_f(input logic [7:0] t);
begin
  if (t <= 19)
    sha1_f = (b & c) | ((~b) & d);
  else if (t <= 39)
    sha1_f = b ^ c ^ d;
  else if (t <= 59)
    sha1_f = (b & c) | (b & d) | (c & d);
  else
    sha1_f = b ^ c ^ d;
end
endfunction

// SHA1 k
function logic [31:0] sha1_k(input logic [7:0] t);
begin
  if (t <= 19)
    sha1_k = 32'h5a827999;
  else if (t <= 39)
    sha1_k = 32'h6ed9eba1;
  else if (t <= 59)
    sha1_k = 32'h8f1bbcdc;
  else
    sha1_k = 32'hca62c1d6;
end
endfunction

// SHA1 hash round
function logic [159:0] sha1_op(input logic [31:0] a, b, c, d, e, w,
  input logic [7:0] t);
  logic [31:0] temp, tc; // internal signals
begin
  temp = ((a << 5)|(a >> 27)) + sha1_f(t) + e + sha1_k(t) + w;
  tc = ((b << 30)|(b >> 2));
  sha1_op = {temp, a, tc, c, d};
end
endfunction

// ---------------------------------------------------------------------------------------

// SHA256 K constants
parameter int sha256_k[0:63] = '{
  32'h428a2f98, 32'h71374491, 32'hb5c0fbcf, 32'he9b5dba5, 32'h3956c25b, 32'h59f111f1, 32'h923f82a4, 32'hab1c5ed5,
  32'hd807aa98, 32'h12835b01, 32'h243185be, 32'h550c7dc3, 32'h72be5d74, 32'h80deb1fe, 32'h9bdc06a7, 32'hc19bf174,
  32'he49b69c1, 32'hefbe4786, 32'h0fc19dc6, 32'h240ca1cc, 32'h2de92c6f, 32'h4a7484aa, 32'h5cb0a9dc, 32'h76f988da,
  32'h983e5152, 32'ha831c66d, 32'hb00327c8, 32'hbf597fc7, 32'hc6e00bf3, 32'hd5a79147, 32'h06ca6351, 32'h14292967,
  32'h27b70a85, 32'h2e1b2138, 32'h4d2c6dfc, 32'h53380d13, 32'h650a7354, 32'h766a0abb, 32'h81c2c92e, 32'h92722c85,
  32'ha2bfe8a1, 32'ha81a664b, 32'hc24b8b70, 32'hc76c51a3, 32'hd192e819, 32'hd6990624, 32'hf40e3585, 32'h106aa070,
  32'h19a4c116, 32'h1e376c08, 32'h2748774c, 32'h34b0bcb5, 32'h391c0cb3, 32'h4ed8aa4a, 32'h5b9cca4f, 32'h682e6ff3,
  32'h748f82ee, 32'h78a5636f, 32'h84c87814, 32'h8cc70208, 32'h90befffa, 32'ha4506ceb, 32'hbef9a3f7, 32'hc67178f2
};

// SHA256 hash round
function logic [255:0] sha256_op(input logic [31:0] a, b, c, d, e, f, g, h, w,
  input logic [7:0] t);
  logic [31:0] S1, S0, ch, maj, t1, t2; // internal signals
begin
  S1 = rightrotate(e, 6) ^ rightrotate(e, 11) ^ rightrotate(e, 25);
  ch = (e & f) ^ ((~e) & g);
  t1 = h + S1 + ch + sha256_k[t] + w;
  S0 = rightrotate(a, 2) ^ rightrotate(a, 13) ^ rightrotate(a, 22);
  maj = (a & b) ^ (a & c) ^ (b & c);
  t2 = S0 + maj;

  sha256_op = {t1 + t2, a, b, c, d + t1, e, f, g};
end
endfunction

// ---------------------------------------------------------------------------------------

// left rotation
function logic [31:0] leftrotate(input logic [31:0] x);
begin
  leftrotate = (x << 1) | (x >> 31);
end
endfunction

// right rotation
function logic [31:0] rightrotate(input logic [31:0] x,
  input logic [7:0] r);
begin
  rightrotate = (x >> r) | (x << (32-r));
end
endfunction

// convert from little-endian to big-endian
function logic [31:0] changeEndian(input logic [31:0] value);
  changeEndian = {value[7:0], value[15:8], value[23:16], value[31:24]};
endfunction

// ---------------------------------------------------------------------------------------

// clock generator
always begin
  #10;
  clk = 1'b1;
  #10
  clk = 1'b0;
end

// main testbench
initial
begin
total_cycles = 0;

for (opcode = 0; opcode < 3; opcode = opcode + 1) begin
// RESET HASH CO-PROCESSOR

@(posedge clk) reset_n = 0;
for (m = 0; m < 2; m = m + 1) @(posedge clk);
reset_n = 1;
for (m = 0; m < 2; m = m + 1) @(posedge clk);

// SET MESSAGE LOCATION

size = message_size;

case (opcode)
2'b00: begin // md5
message_addr = 32'd1000;
message_seed = 32'h01234567;
end
2'b01: begin // sha1
message_addr = 32'd2000;
message_seed = 32'h34567012;
end
default: begin // sha256
message_addr = 32'd3000;
message_seed = 32'h45670123;
end
endcase

output_addr = message_addr + ((message_size-1))/4 + 1;

// CREATE AND DISPLAY MESSAGETEXT

$display("-----------\n");
$display("Messagetext\n");
$display("-----------\n");

dpsram[message_addr] = message_seed;
dpsram_tb[0] = changeEndian(message_seed); // change Endian // for testbench only

$display("%x\n", dpsram[message_addr]);

for (m = 1; m < (message_size-1)/4+1; m = m + 1) begin // data generation
dpsram[message_addr+m] = (dpsram[message_addr+m-1]<<1)|(dpsram[message_addr+m-1]>>31);
dpsram_tb[m] = changeEndian(dpsram[message_addr+m]); // change Endian
$display("%x\n", dpsram[message_addr+m]);
end

// START PROCESSOR

start = 1'b1;
for (m = 0; m < 2; m = m + 1) @(posedge clk);
start = 1'b0;

// calculate total number of bytes after padding (before appending total length)
if ((message_size + 1) % 64 <= 56 && (message_size + 1) % 64 > 0)
pad_length = (message_size/64)*64 + 56;
else
pad_length = (message_size/64+1)*64 + 56;

case (message_size % 4) // pad bit 1
0: dpsram_tb[message_size/4] = 32'h80000000;
1: dpsram_tb[message_size/4] = dpsram_tb[message_size/4] & 32'h FF000000 | 32'h 00800000;
2: dpsram_tb[message_size/4] = dpsram_tb[message_size/4] & 32'h FFFF0000 | 32'h 00008000;
3: dpsram_tb[message_size/4] = dpsram_tb[message_size/4] & 32'h FFFFFF00 | 32'h 00000080;
endcase

for (m = message_size/4+1; m < pad_length/4; m = m + 1) begin
dpsram_tb[m] = 32'h00000000;
end

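dpsram_tb[pad_length/4] = message_size >> 29; // append length of message in bits (before pre-processing)
dpsram_tb[pad_length/4+1] = message_size * 8;
pad_length = pad_length + 8; // final length after pre-processing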

outloop = pad_length/64; // break message into 512-bit chunks (64 bytes)

// COMPUTE NUMBER OF ROUNDS

if (opcode == 2'b01) // sha1
rounds = 80;
else // md5 or sha256
rounds = 64;

// SET INITIAL HASH

case (opcode)
2'b00: begin // md5
h0 = 32'h67452301;
h1 = 32'hEFCDAB89;
h2 = 32'h98BADCFE;
h3 = 32'h10325476;
h4 = 32'h00000000;
h5 = 32'h00000000;
h6 = 32'h00000000;
h7 = 32'h00000000;
end
2'b01: begin // sha1
h0 = 32'h67452301;
h1 = 32'hEFCDAB89;
h2 = 32'h98BADCFE;
h3 = 32'h10325476;
h4 = 32'hC3D2E1F0;
h5 = 32'h00000000;
h6 = 32'h00000000;
h7 = 32'h00000000;
end
default: begin // sha256
h0 = 32'h6a09e667;
h1 = 32'hbb67ae85;
h2 = 32'h3c6ef372;
h3 = 32'ha54ff53a;
h4 = 32'h510e527f;
h5 = 32'h9b05688c;
h6 = 32'h1f83d9ab;
h7 = 32'h5be0cd19;
end
endcase

// COMPUTE SHA256 HASH

for (m = 0; m < outloop; m = m + 1) begin
// W ARRAY EXPANSION

for (t = 0; t < rounds; t = t + 1) begin
if (t < 16) begin
w[t] = dpsram_tb[t+m*16];
end else begin
case (opcode)
2'b00: begin // md5
w[t] = w[md5_g(t)];
end
2'b01: begin // sha1
w[t] = leftrotate(w[t-3] ^ w[t-8] ^ w[t-14] ^ w[t-16]);
end
default: begin // sha256
s0 = rightrotate(w[t-15], 7) ^ rightrotate(w[t-15], 18) ^ (w[t-15] >> 3);
s1 = rightrotate(w[t-2], 17) ^ rightrotate(w[t-2], 19) ^ (w[t-2] >> 10);
w[t] = w[t-16] + s0 + w[t-7] + s1;
end
endcase
end
end

// INITIAL HASH AT ROUND K

a = h0;
b = h1;
c = h2;
d = h3;
e = h4;
f = h5;
g = h6;
h = h7;

// HASH ROUNDS

for (t = 0; t < rounds; t = t + 1) begin
case (opcode)
2'b00:
{a, b, c, d} = md5_op(a, b, c, d, w[t], t);
2'b01:
{a, b, c, d, e} = sha1_op(a, b, c, d, e, w[t], t);
default:
{a, b, c, d, e, f, g, h} = sha256_op(a, b, c, d, e, f, g, h, w[t], t);
endcase
end

// FINAL HASH

h0 = h0 + a;
h1 = h1 + b;
h2 = h2 + c;
h3 = h3 + d;
h4 = h4 + e;
h5 = h5 + f;
h6 = h6 + g;
h7 = h7 + h;
end

md5_digest = {h0, h1, h2, h3};
sha1_digest = {h0, h1, h2, h3, h4};
sha256_digest = {h0, h1, h2, h3, h4, h5, h6, h7};

// WAIT UNTIL ENTIRE FRAME IS HASHED, THEN DISPLAY HASH RESULT

wait (done == 1);

// DISPLAY HASH RESULT

$display("-----------------------\n");
$display("correct %s hash result is:\n", hnames[opcode]);
$display("-----------------------\n");
case (opcode)
2'b00: // md5
$display("%x\n", md5_digest);
2'b01: // sha1
$display("%x\n", sha1_digest);
default: // sha256
$display("%x\n", sha256_digest);
endcase

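md5_hash = {
dpsram[output_addr],
dpsram[output_addr+1],
dpsram[output_addr+2],
dpsram[output_addr+3]
};

sha1_hash = {
dpsram[output_addr],
dpsram[output_addr+1],
dpsram[output_addr+2],
dpsram[output_addr+3],
dpsram[output_addr+4]
};

sha256_hash = {
dpsram[output_addr],
dpsram[output_addr+1],
dpsram[output_addr+2],
dpsram[output_addr+3],
dpsram[output_addr+4],
dpsram[output_addr+5],
dpsram[output_addr+6],
dpsram[output_addr+7]
};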

$display("-----------------------\n");
$display("Your %s result is:        \n", hnames[opcode]);
$display("-----------------------\n");
case (opcode)
2'b00: // md5
$display("%x\n", md5_hash);
2'b01: // sha1
$display("%x\n", sha1_hash);
default: // sha256
$display("%x\n", sha256_hash);
endcase

$display("***************************\n");

correct = 1'b0;
case (opcode)
2'b00: // md5
correct = (md5_digest == md5_hash);
2'b01: // sha1
correct = (sha1_digest == sha1_hash);
default: // sha256
correct = (sha256_digest == sha256_hash);
endcase

if (correct) begin
$display("Congratulations! You have the correct %s hash result!\n", hnames[opcode]);
$display("Total number of %s cycles: %d\n\n", hnames[opcode], cycles);
end else begin
$display("Error! The %s hash result is wrong!\n", hnames[opcode]);
end

total_cycles = total_cycles + cycles;

$display("***************************\n");
end

$display("FINAL TOTAL NUMBER OF CYCLES: %d\n\n", total_cycles);
$display("***************************\n");

$stop;
end

// memory model
always @(posedge mem_clk)
begin
if (mem_we) // write
dpsram[mem_addr] = mem_write_data;
else // read
mem_read_data = dpsram[mem_addr];
end

// track # of cycles
always @(posedge clk)
begin
if (!reset_n)
cycles = 0;
else
cycles = cycles + 1;
end

endmodule