HDFS Balancing Quick Reference
Posted on Wed 01 May 2024 in Technology • Tagged with HDFS, balancing, big data, quick reference
Quickly check whether balancing is needed
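# Compute the current balance (standard deviation of per-DataNode DFS Used%)
hdfs dfsadmin -report | python3 -c "
import sys, re

used_percents = []
in_datanode = False  # the first 'DFS Used%' line is the cluster-wide summary, not a node
for line in sys.stdin:
    if line.startswith('Name:'):
        in_datanode = True  # per-DataNode sections start with a 'Name:' line
    elif in_datanode and 'DFS Used%:' in line:
        match = re.search(r'(\d+\.?\d*)%', line)
        if match:  # skip values that do not parse as a number
            used_percents.append(float(match.group(1)))

if used_percents:
    avg = sum(used_percents) / len(used_percents)
    variance = sum((x - avg) ** 2 for x in used_percents) / len(used_percents)
    std_dev = variance ** 0.5
    print(f'Standard deviation: {std_dev:.2f}%')
    if std_dev > 15:
        print('⚠️ Balance immediately')
    elif std_dev > 10:
        print('⚠️ Balancing recommended')
    else:
        print('✅ Cluster is balanced')
else:
    print('No DataNode usage found in the report')
"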
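To get a feel for the 10 and 15 cutoffs used above, the same calculation can be run on made-up numbers. This is a minimal sketch: `balance_status` is a hypothetical helper name, the sample percentages are invented for illustration, and the verdict strings only mirror the messages printed by the one-liner.

```python
import statistics


def balance_status(used_percents):
    """Return (std_dev, verdict) for a list of per-DataNode DFS Used% values."""
    # Population standard deviation: the same formula as the one-liner above.
    std_dev = statistics.pstdev(used_percents)
    if std_dev > 15:
        verdict = "balance immediately"
    elif std_dev > 10:
        verdict = "balancing recommended"
    else:
        verdict = "balanced"
    return std_dev, verdict


# Hypothetical usage figures for three-node clusters (not real measurements).
for sample in ([40, 50, 60], [35, 50, 65], [20, 50, 80]):
    std_dev, verdict = balance_status(sample)
    print(f"{sample}: {std_dev:.2f}% -> {verdict}")
```

A spread of 10 points on either side of the mean stays under the first cutoff, 15 points crosses it, and 30 points crosses the second.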
Common balancing commands
Basic balancing
# Standard balancing (recommended)
nohup hdfs balancer -threshold 10 -policy datanode > /tmp/balancer.log 2>&1 &
# Strict balancing
nohup hdfs balancer -threshold 5 -policy datanode > /tmp/balancer.log 2>&1 &
# Relaxed balancing
nohup hdfs balancer -threshold …
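The `-threshold` value is a percentage of disk capacity. With `-policy datanode`, the balancer treats the cluster as balanced once every DataNode's utilization is within that many percentage points of the overall cluster utilization. The sketch below illustrates that rule only; it does not call HDFS, and the function name, node names, and capacities are hypothetical.

```python
def nodes_outside_threshold(nodes, threshold):
    """Return {name: deviation} for nodes whose utilization differs from
    the cluster utilization by more than `threshold` percentage points.

    `nodes` maps a node name to a (used, capacity) pair in the same unit.
    """
    total_used = sum(used for used, _ in nodes.values())
    total_capacity = sum(capacity for _, capacity in nodes.values())
    # Cluster utilization is total used over total capacity,
    # not the mean of the per-node percentages.
    cluster_util = 100.0 * total_used / total_capacity

    outside = {}
    for name, (used, capacity) in nodes.items():
        deviation = 100.0 * used / capacity - cluster_util
        if abs(deviation) > threshold:
            outside[name] = deviation  # positive: over-utilized, negative: under-utilized
    return outside


# Hypothetical four-node cluster, values in TB: (used, capacity).
cluster = {"dn1": (62, 100), "dn2": (50, 100), "dn3": (44, 100), "dn4": (44, 100)}
print(sorted(nodes_outside_threshold(cluster, 10)))
print(sorted(nodes_outside_threshold(cluster, 5)))
```

In this example a threshold of 10 flags only `dn1`, while a threshold of 5 also flags `dn3` and `dn4`. That is why a stricter threshold generally means more data to move and a longer balancer run.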