The random Module in the Python Standard Library

Posted 2024-09-26 22:39 by 【1758872】's blog in #Backend Development

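1. Introducing the random module

1.1 Overview of the random module

The random module is the part of the Python standard library that generates pseudo-random numbers. A pseudo-random sequence is produced by a deterministic algorithm (CPython uses the Mersenne Twister): it behaves randomly in a statistical sense, but anyone who knows the generator's internal state can predict it. That is good enough for most applications such as simulation, sampling, and games, but not for security-sensitive data such as passwords or tokens.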
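
The predictability mentioned above is easy to observe: two generators seeded with the same value produce the same sequence. The following is a minimal sketch; the helper name `draw` and the seed value 42 are arbitrary choices made for illustration.

```python
import random

def draw(seed, n=5):
    """Return n pseudo-random integers from a generator seeded with `seed`."""
    rng = random.Random(seed)  # independent generator; the module-level state is untouched
    return [rng.randint(0, 99) for _ in range(n)]

first = draw(42)
second = draw(42)
print(first == second)  # True: same seed, same sequence
```

This reproducibility is useful for tests and simulations, and it is also the reason the module should not be used to generate secrets.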

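2. Basic features of the random module

2.1 Functions for integers

2.1.1 random.randrange()

random.randrange(start, stop[, step])

Returns a randomly selected element from range(start, stop, step).

A step can be given, so the result is drawn from evenly spaced values. This is useful when the random number has to follow a fixed interval or pattern. Note: the right bound (stop) is never included.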

import random

# Random integer N with 0 <= N < 100
print(random.randrange(100))
# Random integer N with 100 <= N < 200
print(random.randrange(100, 200))
# Random integer N with 100 <= N < 200, in steps of 5 (100, 105, ..., 195)
print(random.randrange(100, 200, 5))

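2.1.2 random.randint()

random.randint(a, b)

Returns a random integer N such that a <= N <= b. Equivalent to randrange(a, b+1).

A step cannot be given; any integer in the range may be returned. Note: both the left and the right bound are included.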

import random

# Random integer N with 100 <= N <= 200
print(random.randint(100, 200))
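
To make the relationship between the two functions concrete, here is a small sketch that seeds two independent generators identically and compares randint(1, 6) with randrange(1, 7). The seed 2024 and the variable names are illustrative assumptions, and the equality relies on randint being implemented as randrange(a, b+1), as it is in CPython.

```python
import random

rng_a = random.Random(2024)
rng_b = random.Random(2024)

# Ten simulated dice rolls from each generator
via_randint = [rng_a.randint(1, 6) for _ in range(10)]
via_randrange = [rng_b.randrange(1, 7) for _ in range(10)]
print(via_randint == via_randrange)

# With a step, every result lies on the grid start, start + step, ...
stepped = [random.randrange(100, 200, 5) for _ in range(10)]
print(all(value % 5 == 0 for value in stepped))
```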

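2.1.3 random.getrandbits()

random.getrandbits(k)

Returns a random integer N such that 0 <= N <= 2^k-1. For example, getrandbits(16) returns an integer between 0 and 65535 inclusive. Because Python integers have arbitrary precision, k can be large, which makes this function a good fit for very wide ranges.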

import random

# Random integer N with 0 <= N <= 65535 (2 to the 16th power, minus 1)
print(random.getrandbits(16))
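
As a sketch of the "wide range" use case, the snippet below draws a 128-bit integer and formats it as a fixed-width hexadecimal string. The variable names are illustrative; for anything security-related the secrets module described later is the appropriate tool.

```python
import random

value = random.getrandbits(128)       # 0 <= value < 2**128
print(value.bit_length() <= 128)      # True

# Zero-padded to 32 hex digits (128 bits / 4 bits per digit)
hex_id = f"{value:032x}"
print(hex_id)
```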

2.2 Functions for sequences

2.2.1 random.choice()

random.choice(seq)

Returns a random element from the non-empty sequence seq. If seq is empty, IndexError is raised.

import random

a = ['alice', 'bob', 'helen', 'jack', 'sue']
b = ('alice', 'bob', 'helen', 'jack', 'sue')
c = {'alice', 'bob', 'helen', 'jack', 'sue'}
d = {'alice': 16, 'bob': 18, 'helen': 19, 'jack': 20, 'sue': 22}
e = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
# Pick a random element from a list
print(random.choice(a))
# Pick a random element from a tuple
print(random.choice(b))
# Pick a random character from a string
print(random.choice(e))
# -------------------------------------------
# Sets are not sequences (no indexing), so this raises TypeError
# print(random.choice(c))
# Dicts are not supported either (this raises KeyError); use random.choice(list(d.items())) instead
# print(random.choice(d))
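
The workaround for sets and dicts is to convert them to a sequence first. A minimal sketch, with variable names chosen only for this illustration:

```python
import random

names = {'alice', 'bob', 'helen', 'jack', 'sue'}
ages = {'alice': 16, 'bob': 18, 'helen': 19, 'jack': 20, 'sue': 22}

picked_name = random.choice(sorted(names))       # sorted() turns the set into a list
picked_key = random.choice(list(ages))           # list(dict) yields the keys
picked_item = random.choice(list(ages.items()))  # a (key, value) tuple

print(picked_name, picked_key, picked_item)

# An empty sequence raises IndexError
try:
    random.choice([])
except IndexError:
    empty_raises = True
else:
    empty_raises = False
```

Converting copies the whole container, so for repeated draws from a large set it is cheaper to build the list once and reuse it.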

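2.2.2 random.choices()

random.choices(population, weights=None, *, cum_weights=None, k=1)

population: required. The sequence to choose from (a list, tuple, string, and so on).

weights: optional. The relative weight of each element. If omitted, all elements are equally likely.

cum_weights: optional. The cumulative weights. If cum_weights is given, weights must be omitted; passing both raises TypeError.

k: optional. The number of elements to choose. Defaults to 1.

Draws k elements from the sequence with replacement and returns them as a list (even when k is 1). weights holds the relative weight of each element and cum_weights holds the running totals of those weights; whichever is used must have the same length as population. If neither is given, every element is equally likely to be drawn.

Note: weights are expected to be non-negative. If all weights are zero, ValueError is raised.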

import random
fruits = ['apple', 'banana', 'orange', 'grape', 'watermelon']
weights = [0.1, 0.2, 0.3, 0.2, 0.2]
# Running totals of the weights above: 0.1, 0.1+0.2, 0.3+0.3, 0.6+0.2, 0.8+0.2
cum_weights = [0.1, 0.3, 0.6, 0.8, 1.0]
# Pick one element at random (the result is a one-element list)
chosen_fruit = random.choices(fruits)
# Pick several elements
chosen_fruits = random.choices(fruits, k=3)
# Pick one element according to the given weights
chosen_fruit = random.choices(fruits, weights=weights)
# Pick one element using cumulative weights
chosen_fruit = random.choices(fruits, cum_weights=cum_weights)

Choosing many elements and counting how often each one was chosen:

import random
fruits = ['apple', 'banana', 'orange', 'grape', 'watermelon']
chosen_fruits = random.choices(fruits, k=1000)
fruit_counts = {}
for fruit in chosen_fruits:
    if fruit in fruit_counts:
        fruit_counts[fruit] += 1
    else:
        fruit_counts[fruit] = 1
print(fruit_counts)
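
The same tally can be written with collections.Counter, and adding weights makes their effect visible in the observed frequencies. This is a sketch under assumed values: the integer weights, the seed 0, and the sample size of 10,000 are illustrative choices.

```python
import random
from collections import Counter

fruits = ['apple', 'banana', 'orange', 'grape', 'watermelon']
weights = [1, 2, 3, 2, 2]          # relative weights; they need not sum to 1

rng = random.Random(0)             # seeded only to make the run repeatable
draws = rng.choices(fruits, weights=weights, k=10_000)
counts = Counter(draws)

# Observed frequency of each fruit, most common first
for fruit, n in counts.most_common():
    print(f"{fruit:<12}{n / len(draws):.3f}")
```

With these weights the frequencies should come out near 0.1, 0.2, 0.3, 0.2 and 0.2, with orange drawn most often.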

2.2.3 random.shuffle()

random.shuffle(x)

Shuffles the sequence x in place. The sequence must be mutable, and the function returns None.

import random
fruits = ['apple', 'banana', 'orange', 'grape', 'watermelon']
random.shuffle(fruits)  # Note: the original list itself is modified
print(fruits)  # sample output: ['orange', 'watermelon', 'grape', 'banana', 'apple']
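
Because shuffle() works in place and returns None, assigning its result is a common slip. The sketch below shows the pitfall and one way to keep the original order intact; the variable names are illustrative.

```python
import random

original = [1, 2, 3, 4, 5]

# Pitfall: shuffle() returns None, so `result` is not a shuffled list
result = random.shuffle(original.copy())
print(result)  # None

# Shuffle a copy when the original order must be preserved
shuffled = original.copy()
random.shuffle(shuffled)
print(original)  # [1, 2, 3, 4, 5]
print(shuffled)
```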

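2.2.4 random.sample()

random.sample(population, k, *, counts=None)

Draws k elements from the sequence without replacement and returns them as a new list. Use it for random sampling in which no position is picked twice. The counts parameter requires Python 3.9 or later.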

import random
fruits = ['apple', 'banana', 'orange', 'grape', 'watermelon']
# Returns a new, shuffled list. Note: the original list is not modified
print(random.sample(fruits, len(fruits)))  # sample output: ['orange', 'watermelon', 'apple', 'banana', 'grape']
print(fruits)  # ['apple', 'banana', 'orange', 'grape', 'watermelon']

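Repeated elements can be listed one by one, or specified with the optional keyword-only parameter counts. For example, sample(['red', 'blue'], counts=[4, 2], k=5) is equivalent to sample(['red', 'red', 'red', 'red', 'blue', 'blue'], k=5).

To choose a sample from a range of integers, pass a range() object as the argument. This is especially fast and memory-efficient when sampling from a large population: sample(range(10000000), k=60).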
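
Both forms described above can be tried directly. A minimal sketch, assuming Python 3.9 or later for the counts parameter; the variable names are illustrative.

```python
import random

# Five marbles drawn from a bag holding 4 red and 2 blue
marbles = random.sample(['red', 'blue'], counts=[4, 2], k=5)
print(marbles)

# 60 distinct integers from a population of ten million;
# range() is never expanded into a list
picks = random.sample(range(10_000_000), k=60)
print(len(set(picks)))  # 60: sampling is without replacement
```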

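2.3 Discrete distributions

2.3.1 random.binomialvariate()

random.binomialvariate(n=1, p=0.5)

Binomial distribution. Returns the number of successes in n independent trials, each of which succeeds with probability p. This function was added in Python 3.12.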
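
Since binomialvariate() is only available on recent interpreters, the sketch below wraps it in a hypothetical helper named `binomial` that falls back to counting successes over n simulated trials. The fallback follows the definition above but is slower for large n.

```python
import random

def binomial(n, p):
    """Number of successes in n independent trials with success probability p."""
    if hasattr(random, "binomialvariate"):   # available on Python 3.12+
        return random.binomialvariate(n, p)
    # Fallback: run the n trials one at a time
    return sum(random.random() < p for _ in range(n))

heads = binomial(100, 0.5)   # e.g. heads in 100 fair coin flips
print(heads)
```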

2.4 Real-valued distributions

2.4.1 random.random()

random.random()

Returns a random floating-point number X in the range 0.0 <= X < 1.0.

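2.4.2 random.uniform()

random.uniform(a, b)

Returns a random floating-point number N such that a <= N <= b when a <= b, and b <= N <= a when b < a. Whether the end point b itself can be returned depends on floating-point rounding.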

import random
print(random.uniform(60, 100))
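
The documentation describes uniform(a, b) in terms of the expression a + (b-a) * random(). The sketch below compares the two using identically seeded generators; the seed 7 and the variable names are illustrative assumptions.

```python
import random

a, b = 60, 100
rng_uniform = random.Random(7)
rng_manual = random.Random(7)

via_uniform = rng_uniform.uniform(a, b)
via_formula = a + (b - a) * rng_manual.random()  # scale [0.0, 1.0) onto [a, b]
print(via_uniform, via_formula)
```

The same scaling works for any interval, which is handy when only random() is available.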

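2.4.2.1 Ways to keep n decimal places

2.4.2.1.1 Using the round() function

num = 3.14159
# round() rounds to the nearest value; exact ties go to the even digit, not always up
rounded_num = round(num, 2)  # result: 3.14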
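
round() does not follow the schoolbook "round half up" rule in every case. The short sketch below shows the two effects that most often surprise people: ties go to the even neighbour, and a decimal literal may not be stored exactly.

```python
# Exact ties are rounded to the nearest even integer
ties = [round(0.5), round(1.5), round(2.5)]
print(ties)  # [0, 2, 2]

# 2.675 is stored as a binary float slightly below 2.675, so it rounds down
almost_tie = round(2.675, 2)
print(almost_tie)  # 2.67
```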

2.4.2.1.2 Using format()

num = 3.14159
# Rounds to two decimal places. Note: the result is a string, not a float
formatted_num = "{:.2f}".format(num)  # result: '3.14'

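2.4.2.1.3 Using the decimal module for exact control

from decimal import Decimal, ROUND_HALF_UP

num = Decimal('3.14159')
# The quantum fixes the number of decimal places (two here)
two_places = Decimal('0.01')
# ROUND_HALF_UP is the schoolbook rule: ties round away from zero
rounded_num = num.quantize(two_places, rounding=ROUND_HALF_UP)  # result: Decimal('3.14')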
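
The rounding mode only matters for exact ties, so a value such as 2.665 shows the difference better than 3.14159. The following sketch contrasts two modes and then rounds a value from random.uniform(); converting through str() is an assumption made here so that Decimal sees the digits that would be printed for the float.

```python
import random
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

tie = Decimal('2.665')
cent = Decimal('0.01')

half_up = tie.quantize(cent, rounding=ROUND_HALF_UP)
half_even = tie.quantize(cent, rounding=ROUND_HALF_EVEN)
print(half_up, half_even)  # 2.67 2.66

# Round a random score to two decimal places
score = Decimal(str(random.uniform(60, 100))).quantize(cent, rounding=ROUND_HALF_UP)
print(score)
```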

2.4.2.1.4 Using an f-string (Python 3.6+)

num = 3.14159
# Same formatting as format(); the result is a string
formatted_num = f"{num:.2f}"  # result: '3.14'

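3. Modules related to random

3.1 The secrets module

The secrets module generates cryptographically strong random numbers, suitable for managing passwords, account authentication, security tokens, and other confidential data.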

3.1.1 secrets.choice()

secrets.choice(seq)

Returns a randomly chosen element from a non-empty sequence.

import secrets
# Suppose we have a list of elements
fruits = ['apple', 'banana', 'orange', 'grape', 'watermelon']
# Use secrets.choice to pick one element from the list securely
secure_random_element = secrets.choice(fruits)
print(secure_random_element)

3.1.2 secrets.randbelow()

secrets.randbelow(exclusive_upper_bound)

Returns a random integer in the range [0, exclusive_upper_bound).

import secrets
# Generate a secure random integer N with 0 <= N < 10
secure_int = secrets.randbelow(10)
print(f"Secure random integer: {secure_int}")

3.1.3 secrets.randbits()

secrets.randbits(k)

Returns a non-negative integer with k random bits. For example, secrets.randbits(3) yields a value with 0 <= N < 2^3, that is, a random integer in [0, 8).
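
A short sketch of randbits() at two sizes; the variable names are illustrative.

```python
import secrets

small = secrets.randbits(3)      # 0 <= small < 8
key_int = secrets.randbits(256)  # 0 <= key_int < 2**256

print(small)
print(key_int.bit_length() <= 256)  # True
```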

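3.1.4 secrets.token_bytes()

secrets.token_bytes([nbytes=None])

If nbytes is given, secrets.token_bytes(nbytes) returns a bytes object containing nbytes random bytes. If nbytes is omitted or None, a reasonable default length for security tokens is used (currently 32 bytes; the default may change between versions).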

import secrets
# Generate a random byte string of the default length (currently 32 bytes)
random_token = secrets.token_bytes()
print(len(random_token))   # output: 32
print(random_token) # sample output: b'<\xa1\xc2\xb9k\xf4\x9e$\xd9\x03\xd5H\xb8\x1e\xe2d\xe0\x82x\xe9)\x11\xe9\x04l\xda\x95\xe3\xf0C\x88\xba'
# Generate a random byte string of a given length
specified_length_token = secrets.token_bytes(16)
print(specified_length_token) # sample output: b'\xa2C\x98H\xb2\xca5\n\xb69h{\xfa\xb3n\xc5'

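3.1.5 secrets.token_hex()

secrets.token_hex([nbytes=None])

If nbytes is given, the result is a hexadecimal string of length nbytes*2 (each byte becomes two hex digits). If nbytes is omitted, a default length suitable for security tokens is used: currently 32 bytes, which gives 64 characters.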

import secrets
# Generate a random hex string of the default length (currently 32 bytes, i.e. 64 characters)
random_hex_token = secrets.token_hex()
print(random_hex_token)  # sample output: 53063bd5d38f7f032e146d769567254748dcbc34de37f716043a21d4b9ef0575
# Generate a random hex string from 16 random bytes (32 characters)
specified_length_hex_token = secrets.token_hex(16)
print(specified_length_hex_token)  # sample output: bcce543dee60747639895e2fcffcfaf8
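
The length relationship between token_bytes() and token_hex() can be checked directly. A minimal sketch; nbytes = 16 and the variable names are illustrative.

```python
import secrets

nbytes = 16
raw = secrets.token_bytes(nbytes)
hex_token = secrets.token_hex(nbytes)

print(len(raw))        # 16: one entry per random byte
print(len(hex_token))  # 32: two hex digits per byte

# A hex token can be converted back into its raw bytes
decoded = bytes.fromhex(hex_token)
print(len(decoded))    # 16
```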

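3.1.6 secrets.token_urlsafe()

secrets.token_urlsafe([nbytes=None])

Returns a random URL-safe text string containing nbytes random bytes. The text is Base64 encoded, so on average each byte yields about 1.3 characters. If nbytes is omitted or None, a default length suitable for security tokens is used. The returned string contains only URL-safe characters: letters, digits, underscores, and hyphens.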

import secrets
# Generate a random URL-safe string of the default length
random_urlsafe_token = secrets.token_urlsafe()
print(len(random_urlsafe_token))   # output: 43
print(random_urlsafe_token)        # sample output: t6GrND8P32pL763R36VQIaB70jf8r_uruvSa0wgQYrY
# Generate a random URL-safe string from 16 random bytes
specified_length_urlsafe_token = secrets.token_urlsafe(16)
print(specified_length_urlsafe_token)  # sample output: dUA2mPJbbWtB6rLj8hoC1g
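
The "1.3 characters per byte" figure comes from Base64 encoding 3 bytes as 4 characters, with the trailing padding removed. The sketch below checks the resulting length, ceil(4 * nbytes / 3), for a few sizes; the helper name `expected_length` is hypothetical.

```python
import math
import secrets
import string

URL_SAFE_CHARS = set(string.ascii_letters + string.digits + "-_")

def expected_length(nbytes):
    """Length of unpadded Base64 text for nbytes bytes."""
    return math.ceil(4 * nbytes / 3)

lengths = {n: len(secrets.token_urlsafe(n)) for n in (16, 24, 32)}
print(lengths)  # {16: 22, 24: 32, 32: 43}

token = secrets.token_urlsafe(32)
print(set(token) <= URL_SAFE_CHARS)  # True
```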

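3.1.7 secrets.compare_digest()

secrets.compare_digest(a, b)

Compares a and b and returns True if they are equal, False otherwise. Both arguments must be of the same kind: either two str values (ASCII only) or two bytes-like objects. Unlike the == operator, the comparison is designed to take a similar amount of time wherever the first difference occurs, which reduces the risk of timing side-channel attacks. Prefer it over == when comparing sensitive values, for example when verifying passwords or tokens.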
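
A minimal sketch of token verification with compare_digest(). The function name `is_valid_token` and the way the expected token is held in a module-level variable are assumptions for illustration; a real application would look the token up in its own storage.

```python
import secrets

# Token issued earlier and kept on the server side (illustrative)
expected_token = secrets.token_hex(16)

def is_valid_token(submitted):
    """Compare a user-submitted token with the stored one in a timing-safe way."""
    return secrets.compare_digest(submitted, expected_token)

print(is_valid_token(expected_token))   # True
print(is_valid_token("not-the-token"))  # False
```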

Example: generating system passwords

import string
import secrets

def generate_password(min_length, max_repeat, max_class_repeat, *char_credits):
    characters = [string.digits, string.ascii_uppercase, '!@#$%^&*(){}<>,?`~+-=[]', string.ascii_lowercase]
    all_characters = ''.join(characters)
    # Shuffle with the OS-backed secure generator; random.shuffle() is predictable
    secure_rng = secrets.SystemRandom()

    while True:
        # Draw the required number of characters from each character class
        password = [secrets.choice(char) for char, count in zip(characters, char_credits) for _ in range(count)]
        # Pad with characters from any class until the minimum length is reached
        remaining_len = min_length - sum(char_credits)
        password += [secrets.choice(all_characters) for _ in range(remaining_len)]
        secure_rng.shuffle(password)
        if is_valid_password(min_length, password, characters, max_repeat, max_class_repeat):
            return ''.join(password)
 
 
def is_valid_password(min_length, password, characters, max_repeat, max_class_repeat):
    # Reject passwords that start with '='
    if ''.join(password).startswith("="):
        return False

    # Reject the password if any single character occurs more than min_length // 4 times in total
    repeat_count = max(password.count(char) for char in set(password))
    if repeat_count > min_length // 4:
        return False

    # Reject runs of more than max_class_repeat consecutive characters from the same class
    for char_class in characters:
        char_class_positions = [idx for idx, char in enumerate(password) if char in char_class]
        for idx in char_class_positions:
            if idx < len(password) - max_class_repeat:
                if all(password[idx + i] in char_class for i in range(0, max_class_repeat + 1)):
                    return False
    # Record the positions of every character that occurs more than max_repeat times
    char_repeat = [char for char in set(password) if password.count(char) > max_repeat]
    char_repeat_positions = {}
    for idx, char in enumerate(password):
        if char in char_repeat:
            if char not in char_repeat_positions:
                char_repeat_positions[char] = [idx]
            else:
                char_repeat_positions[char].append(idx)

    # Reject runs of more than max_repeat identical consecutive characters
    for _, index in char_repeat_positions.items():
        for idx in index:
            if idx < len(password) - max_repeat:
                if all(password[idx + i] == password[idx] for i in range(0, max_repeat + 1)):
                    return False
    return True
 
 
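if __name__ == '__main__':
    # Parameters for password generation
    min_length = 16
    char_credits = (3, 3, 3, 3)  # digits, uppercase letters, special characters, lowercase letters
    # Maximum number of times the same character may appear consecutively
    max_repeat = 2
    # Maximum number of consecutive characters from the same class
    max_class_repeat = 3

    # Generate and print passwords
    for _ in range(100):
        generated_password = generate_password(min_length, max_repeat, max_class_repeat, *char_credits)
        print(generated_password)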

Example: generating a password-reset URL

import secrets
import string

# Generate a security token for a password-recovery application
security_token = ''.join(secrets.choice(string.ascii_letters + string.digits) for _ in range(16))
# Build the temporary password-recovery URL
temp_url = f"https://example.com/password-recovery?token={security_token}"
print("Generated temporary password-recovery URL:", temp_url)  # sample output: Generated temporary password-recovery URL: https://example.com/password-recovery?token=90HOEUVFroaJIO5D
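
An alternative sketch builds the same kind of URL with secrets.token_urlsafe(), which yields a token that is already URL-safe and, with 32 random bytes, carries more entropy than 16 alphanumeric characters. The 32-byte size and the use of urlencode are choices made for this illustration; the example.com address is the placeholder from the example above.

```python
import secrets
from urllib.parse import urlencode

# 32 random bytes, Base64-encoded into URL-safe text
reset_token = secrets.token_urlsafe(32)

# urlencode() escapes the query string; a URL-safe token passes through unchanged
reset_url = "https://example.com/password-recovery?" + urlencode({"token": reset_token})
print(reset_url)
```

In practice such a token would also be stored with an expiry time and checked with secrets.compare_digest() when the link is used.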