HDFS Balancer Operations Quick Reference

Posted on Wed 01 May 2024 in Technology • Tagged with HDFS, balancing, big data, quick reference

Quickly check whether balancing is needed

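# Compute the current balance (standard deviation of per-datanode DFS Used%)
hdfs dfsadmin -report | python3 -c "
import sys, re
used_percents = []
in_datanode = False  # the cluster-wide summary also prints a DFS Used% line; skip it
for line in sys.stdin:
    if line.startswith('Name:'):
        in_datanode = True  # a per-datanode section starts here
    elif in_datanode and 'DFS Used%:' in line:
        match = re.search(r'(\d+\.?\d*)%', line)
        if match:
            used_percents.append(float(match.group(1)))
        in_datanode = False
if used_percents:
    avg = sum(used_percents) / len(used_percents)
    variance = sum((x - avg) ** 2 for x in used_percents) / len(used_percents)
    std_dev = variance ** 0.5
    print(f'Standard deviation: {std_dev:.2f}%')
    if std_dev > 15:
        print('⚠️  Balance immediately')
    elif std_dev > 10:
        print('⚠️  Balancing recommended')
    else:
        print('✅ Cluster is balanced')
"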
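The standard deviation above is a summary figure. The balancer itself treats a cluster as balanced when every datanode's utilization is within the threshold of the cluster's utilization. The sketch below applies that per-node check to a list of percentages. It is only an approximation: it uses the mean of the per-node percentages as the cluster utilization, which matches the real figure only when all datanodes have the same capacity. The function name and the sample values are hypothetical.

```python
import statistics

def nodes_outside_threshold(used_percents, threshold):
    """Return (mean, outliers) where outliers lists (index, percent, deviation)
    for nodes whose utilization differs from the mean by more than `threshold`
    percentage points. Assumption: equal-capacity datanodes, so the mean of the
    percentages stands in for cluster utilization."""
    mean = statistics.fmean(used_percents)
    outliers = [
        (i, p, p - mean)
        for i, p in enumerate(used_percents)
        if abs(p - mean) > threshold
    ]
    return mean, outliers

# Hypothetical per-datanode "DFS Used%" values, for illustration only.
sample = [40.0, 50.0, 60.0, 90.0]
mean, outliers = nodes_outside_threshold(sample, threshold=10)

print(f"mean utilization: {mean:.2f}%")
print(f"population std dev: {statistics.pstdev(sample):.2f}%")
for index, percent, deviation in outliers:
    print(f"node {index}: {percent:.2f}% ({deviation:+.2f} points from mean)")
```

A node sitting exactly on the threshold counts as balanced here, so in the sample only the first and last nodes are reported.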

Common balancing commands

Basic balancing

# Standard balancing (recommended)
nohup hdfs balancer -threshold 10 -policy datanode > /tmp/balancer.log 2>&1 &

# Strict balancing
nohup hdfs balancer -threshold 5 -policy datanode > /tmp/balancer.log 2>&1 &

# Loose balancing
nohup hdfs balancer -threshold …
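When the balancer is started from a script rather than a shell, the same invocation can be assembled and launched from Python. This is a minimal sketch, not an official wrapper: `balancer_command` and `start_balancer` are hypothetical helper names, the range check assumes the documented threshold range of 1 to 100, and `start_new_session=True` is only a rough stand-in for `nohup ... &`.

```python
import subprocess

def balancer_command(threshold=10.0, policy="datanode"):
    """Build the argument list for `hdfs balancer` (hypothetical helper)."""
    # Assumption: the balancer accepts thresholds from 1 to 100 (percent).
    if not 1.0 <= threshold <= 100.0:
        raise ValueError(f"threshold out of range: {threshold}")
    if policy not in ("datanode", "blockpool"):
        raise ValueError(f"unknown policy: {policy}")
    return ["hdfs", "balancer", "-threshold", f"{threshold:g}", "-policy", policy]

def start_balancer(log_path="/tmp/balancer.log", **kwargs):
    """Start the balancer in the background, appending output to log_path."""
    with open(log_path, "ab") as log:
        # The child keeps its own copy of the log file descriptor, so the
        # parent's handle can be closed as soon as the process has started.
        return subprocess.Popen(
            balancer_command(**kwargs),
            stdout=log,
            stderr=subprocess.STDOUT,
            stdin=subprocess.DEVNULL,
            start_new_session=True,  # detach from the calling terminal session
        )

# Example: inspect the command without running it.
print(" ".join(balancer_command(threshold=5)))
```

Keeping the command construction separate from the launch makes it possible to log or review the exact invocation before anything touches the cluster.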
