长期待咕。
哈希函数
性质
哈希(散列)函数应该满足:
- 单向性,即消息不可能通过摘要被还原
- 无冲性,即不同的消息应有不同的摘要值(显然是一种理想状况)
其应该满足 3 个基本属性:
- 抗碰撞攻击:找到 $M_1\ne M_2\ s.t.\ hash(M_1)=hash(M_2)$
- 抗原像攻击:给定 $hash(M_1)$,找到 $M_2\ne M_1\ s.t.\ hash(M_2)=hash(M_1)$
- 抗第二原像攻击:给定 $M_1,hash(M_1)$,找到 $M_2\ne M_1\ s.t.\ hash(M_2)=hash(M_1)$
MD5
MD5 创建一个 128-bit 消息摘要。
Description
referred to Cryptographic Hashing Functions - MD5, Farhad Ahmed Sagar、https://github.com/pod32g/MD5/blob/master/md5.c、jhsn 三年前的 implementation 等。
据资料所言,一共 5 步(前 2 步用以追加与填充消息,第 3, 4 步用了一些辅助函数)
English Ver.
-
Padding Bits: $len(padded_message) + 64 \equiv 0\ (mod\ 512)$:
- append bits:
'1' + '0...0'
untillen() % 512 == 448 (512 - 64)
- append bits:
-
Append Length:
- add a 64-bit representation of the original message to make
len() % 512 == 0
- divide it into 512-bit blocks, then divide each of them into 16 words (means 32-bit each), we denote them as $M[0,…,N-1]$, where N % 16 == 0.
- add a 64-bit representation of the original message to make
-
Initialize MD Buffer:
-
four word buffer (32-bit each) word A 01 23 45 67 word B 89 ab cd ef word C fe dc ba 98 word D 76 54 32 10
-
-
Process Message in 16-word Blocks:
-
use four Auxiliary Functions:
-
inputs: three 32-bit word
-
use each of these 4 functions in 4 rounds (each round has 16 operations)
-
output: a 32-bit word
-
-
use a table consisting 64-elements:
-
$K[0,…,63]$:
hex(floor(abs(sin(i + 1)) * 2^32))
-
-
ALGORITHM: See my another intro below (Chinese Ver.).
-
-
buffer AA, BB, CC, DD contains MD5 digest of our input message.
Chinese Ver.
其实就两步:填充、计算。
-
填充
将
initial_message
用 bits100..00
追加填充,直到 $len(message)+64\equiv 0\ (mod\ 512)$;然后追加填充 64-bitlen(initial_message) % 2**64
,以使 $len(message)\equiv 0\ (mod\ 512)$;最后按 512-bit 将其划分成块,每一块又分为 16 个 32-bit 小块。 -
计算(过程中一定保证
Zmod(2^32)
)首先介绍几个需要用到的辅助函数:
- 对于每一个 512-bit block,进行 64 轮循环,每轮循环的结构一致,可大致理解为
a0 = am + left_rotate(an + F(a[]) + Message[g] + K[i], s[j])
;而选取的参数、函数需要根据轮数的不同去选择。 - 在每一个 512-bit 运算后,将对应参数
A, B, C, D
累加。最后将其依次拼接起来,即为 MD5 digest。
- 以上
round 3 g = (3 * i + 5) % 16
更正。
- 对于每一个 512-bit block,进行 64 轮循环,每轮循环的结构一致,可大致理解为
Implementation
import struct
import hashlib
from math import sin, floor
from Crypto.Util.number import bytes_to_long
# params
a0, b0, c0, d0 = 0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476
K = [int(floor(2**32 * abs(sin(i + 1)))) for i in range(64)]
s = [
7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22,
5, 9, 14, 20, 5, 9, 14, 20, 5, 9, 14, 20, 5, 9, 14, 20,
4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23,
6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21
]
def rev_hex(hex_str, length):
hex_str = hex_str[2:].rjust(length, '0')
output = '0x'
for i in range(len(hex_str), -1, -2):
output += hex_str[i:i + 2]
return output
def bytes_to_int(a):
b = 0
for i in a:
b = 256 * b + i
return b
def left_rotate(x, s):
return ((x << s) | (x >> 32 - s)) % 2**32
def md5(m):
temp = rev_hex(hex(len(m) * 8 % 2**64), 16)[2:]
length = b''.join([chr(int(temp[i:i + 2], 16)).encode('latin1') for i in range(0, 16, 2)])
m = m + b'\x80'
if len(m) % 64 != 56:
m = m + b'\x00' * ((56 - len(m)) % 64)
m = m + length
m = [m[i:i + 64] for i in range(0, len(m), 64)]
AA, BB, CC, DD = a0, b0, c0, d0
for block in m:
A, B, C, D = AA, BB, CC, DD
M = [int(rev_hex(hex(bytes_to_int(block[i:i + 4])), 8)[2:], 16) for i in range(0, 64, 4)]
for i in range(64):
if i < 16:
F = (B & C) | ((~B) & D)
g = i
elif i < 32:
F = (D & B) | ((~D) & C)
g = (5 * i + 1) % 16
elif i < 48:
F = B ^ C ^ D
g = (3 * i + 5) % 16
else:
F = C ^ (B | (~D))
g = (7 * i) % 16
D, C, B, A = C, B, (B + left_rotate((A + F + K[i] + M[g]) % 2**32, s[i])) % 2**32, D
AA = (AA + A) % 2**32
BB = (BB + B) % 2**32
CC = (CC + C) % 2**32
DD = (DD + D) % 2**32
digest = rev_hex(hex(AA), 8)[2:] + rev_hex(hex(BB), 8)[2:] + rev_hex(hex(CC), 8)[2:] + rev_hex(hex(DD), 8)[2:]
return digest
def md5_test():
message = input().encode()
digest = md5(message)
print(digest)
assert hashlib.md5(message).hexdigest() == digest
md5_test()
几个问题:
- 整个过程均为小端序
- str 转 bytes 时,最后发现用 latin1 可以直接转 ascii>128 的字符,整了好久用极其迂回的方式整出来了。
- 先 + ‘1’ 再去判断字节是否满足条件,全程用 bytes 操作就行。