Adding a Hash to File Names with Python to Prevent Collisions

ShadowC


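All images on this blog are stored on Qiniu Cloud. To keep things simple, they all live in a single folder, so possible file name clashes have to be dealt with.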

2023-11-05 Update: images on this blog are currently proxied and served by nginx.

2023-11-07 Update: images on this blog are now served by a self-hosted minio service.

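To keep file names from colliding, the first idea that comes to mind is adding a hash value. In cryptography, a hash algorithm turns input of arbitrary length into output of fixed length, and it is used to check file integrity against tampering or transmission errors. MD5 is one such algorithm. It is considered broken, since another file with the same MD5 digest can be found in a practical amount of time, but it is still sufficient for plain integrity checks against accidental changes, and it is fast to compute. I therefore append the MD5 digest to the end of each image's file name, before the extension, as a guard against duplicate names. The complete code follows: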

"""
File name: hfn.py
Description: Rename the files in the target directory by adding a hash.

Author: Cheng Shu
Date: 2021-06-20 15:19:46
LastEditTime: 2021-06-20 16:01:13
LastEditors: Cheng Shu
@Copyright © 2020 Cheng Shu
License: MIT License
"""

import os
import sys
import hashlib

def add_hash_to_name(file):
    '''Return the file name with the MD5 hex digest of its content inserted.'''
    md5 = hashlib.md5()
    with open(file, 'rb') as f:
        # Read in 4 KiB blocks so a large file is never loaded all at once;
        # for blog images this hardly matters.
        for byte_block in iter(lambda: f.read(4096), b""):
            md5.update(byte_block)
    digest = md5.hexdigest()

    # Leave the name alone if it already contains this hash,
    # so running the script twice does not add the hash twice.
    if digest in file:
        return file

    # Insert the hash before the last dot (the extension),
    # or append it when the name has no dot at all.
    last_dot = file.rfind('.')
    if last_dot == -1:
        return f"{file}-{digest}"
    return f"{file[:last_dot]}-{digest}{file[last_dot:]}"
            

def main():
    # Validate the arguments.
    if len(sys.argv) != 2:
        sys.exit("Usage: python3 hfn.py [folder]")
    if not os.path.isdir(sys.argv[1]):
        sys.exit("Path is not a valid directory!")

    path = sys.argv[1]
    os.chdir(path)

    # Walk the file list and add the hash to every regular file.
    # To skip a file, type any character before pressing Enter.
    for file in os.listdir('.'):
        if not os.path.isfile(file):
            continue
        if file[0] == '.':
            continue
        print(f"ready to file: [{file}], go on with Enter, omit with any char:", end='')
        s = input()
        if (len(s) != 0):
            continue
        
        new_name = add_hash_to_name(file)
        os.rename(file, new_name)
        print(f"{file} -> {new_name}")
 

if __name__ == "__main__":
    main()
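
The naming rule itself can be tried without touching the disk. The following is only a minimal sketch and not part of hfn.py: `hashed_name` is a hypothetical helper that takes the file content as bytes, and it uses `os.path.splitext` instead of searching for the dot by hand, on the assumption that the hash should go right before the last extension.

```python
import hashlib
import os


def hashed_name(name: str, data: bytes) -> str:
    """Return name with the MD5 hex digest of data inserted before the extension."""
    digest = hashlib.md5(data).hexdigest()
    # Do nothing if the name already carries this digest.
    if digest in name:
        return name
    # splitext splits at the last dot: "a.tar.gz" -> ("a.tar", ".gz").
    stem, ext = os.path.splitext(name)
    return f"{stem}-{digest}{ext}"


# Two files with the same name but different content get different names.
print(hashed_name("logo.png", b"first image"))
print(hashed_name("logo.png", b"second image"))
```

Because the digest depends only on the content, two uploads called `logo.png` clash only if they are byte-for-byte identical, in which case overwriting one with the other is harmless.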

Usage:

$ python3 hfn.py [folder]

For example:

$ python3 .\hfn.py ./test
ready to file: [a.py], go on with Enter, omit with any char:
a.py -> a-a8f04b565e11f11cf5665250dfa1891f.py
ready to file: [b], go on with Enter, omit with any char:
b -> b-2d02e669731cbade6a64b58d602cf2a4
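
Since the digest ends up in the file name, a renamed file can later be checked against its own name. The sketch below is an illustration under assumptions, not something hfn.py does: `name_matches_content` and the `MD5_IN_NAME` pattern are hypothetical names, and the pattern assumes the `-<32 hex characters>` suffix shown in the output above.

```python
import hashlib
import re
import tempfile
from pathlib import Path

# "-" followed by 32 lowercase hex characters, then an extension or the end of the name.
MD5_IN_NAME = re.compile(r"-([0-9a-f]{32})(?=\.|$)")


def name_matches_content(path: Path) -> bool:
    """Return True if the MD5 digest embedded in the file name matches the content."""
    match = MD5_IN_NAME.search(path.name)
    if match is None:
        return False  # no digest in the name, nothing to verify
    return hashlib.md5(path.read_bytes()).hexdigest() == match.group(1)


# Small demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    data = b"example image bytes"
    good = Path(tmp) / f"pic-{hashlib.md5(data).hexdigest()}.png"
    good.write_bytes(data)
    print(name_matches_content(good))   # content and name agree

    good.write_bytes(b"corrupted")
    print(name_matches_content(good))   # content changed after renaming
```

The first check prints `True` and the second prints `False`, because the file content no longer hashes to the digest stored in its name.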