RPC error: [query], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with: status = StatusCode.RESOURCE_EXHAUSTED details = "grpc: trying to send message larger than max (69641975 vs. 67108864)" debug_error_string = "UNKNOWN:Error received from peer ipv4:127.0.0.1:19531 {created_time:"2023-05-19T15:28:04.787406541+08:00", grpc_status:8, grpc_message:"grpc: trying to send message larger than max (69641975 vs. 67108864)"}" >)>, <Time:{'RPC start': '2023-05-19 15:28:02.361254', 'RPC error': '2023-05-19 15:28:04.788007'}>
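I ran into this problem today while doing embedding recall. It looks like the payload being transferred was too large, so I wanted to compute the size of a variable and filter on it first to avoid this kind of error.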
Shared via paperclub:
The code is as follows:
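import numpy as np


def cal_size(inputs):
    """
    Print the size of a variable, converted to a human-readable unit.
    :param inputs: np.ndarray, list, str, bytes, or any other object
    :return: None; the byte count and the converted value are printed
    """
    def formatter(size_bytes):
        # Divide by 1024 until the value drops below 1024, then label it with that unit.
        units = ["B", "KB", "MB", "GB", "TB", "PB"]
        step = 1024
        for unit in units[:-1]:
            if size_bytes < step:
                return "%.2f %s" % (size_bytes, unit)  # keep two decimal places
            size_bytes = size_bytes / step
        return "%.2f %s" % (size_bytes, units[-1])  # anything larger stays in PB

    if isinstance(inputs, np.ndarray):
        inputs_bytes = inputs.nbytes
    elif isinstance(inputs, list):
        inputs_bytes = np.array(inputs).nbytes
    elif isinstance(inputs, str):
        # __sizeof__() reports the in-memory size of the bytes object, header included
        inputs_bytes = inputs.encode('utf-8').__sizeof__()
    elif isinstance(inputs, bytes):
        inputs_bytes = inputs.__sizeof__()
    else:
        # Fall back to the UTF-8 encoding of the object's string form
        inputs_bytes = str(inputs).encode('utf-8').__sizeof__()
    print(f"Size: {inputs_bytes} ===>> {formatter(inputs_bytes)}")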
cal_size(res)
----------------------------------------------------------------
Size: 127952 ===>> 124.95 KB
Wow, sure enough, that is not small.
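As for the "filter first" step, here is a minimal sketch of one way to do it. It assumes the oversized payload is a 2-D batch of vectors that can be split before sending. The limit of 67108864 bytes (64 MiB) is taken from the error message above. The helper name `split_by_bytes` and the default budget of half the limit are hypothetical choices. Note that `nbytes` only approximates the size of the serialized gRPC message, so some headroom is needed.

```python
import numpy as np

# Limit reported in the error message: 67108864 bytes = 64 MiB.
GRPC_MAX_BYTES = 64 * 1024 * 1024


def split_by_bytes(vectors, max_bytes=GRPC_MAX_BYTES // 2):
    """Split a 2-D array into chunks whose raw size stays under max_bytes.

    Hypothetical helper: nbytes is only an estimate of the serialized
    message size, so the default budget leaves half the limit as headroom.
    """
    vectors = np.asarray(vectors)
    if vectors.ndim != 2:
        raise ValueError("expected a 2-D array of shape (n_vectors, dim)")
    if len(vectors) == 0:
        return []
    # Raw bytes of a single row; guard against zero-width rows.
    row_bytes = max(vectors[0].nbytes, 1)
    # Always send at least one row per chunk, even if a row exceeds the budget.
    rows_per_chunk = max(1, max_bytes // row_bytes)
    return [
        vectors[start:start + rows_per_chunk]
        for start in range(0, len(vectors), rows_per_chunk)
    ]


if __name__ == "__main__":
    batch = np.zeros((1000, 768), dtype=np.float32)
    for chunk in split_by_bytes(batch, max_bytes=1_000_000):
        print(len(chunk), chunk.nbytes)
```

Each chunk can then be sent in its own request. If the oversized message is the response rather than the request, the same idea applies to requesting fewer results per call.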