# hdfs diskbalancer -help plan
usage: hdfs diskbalancer -plan <hostname> [options]
Creates a plan that describes how much data should be moved between disks.
    --bandwidth <arg>             Maximum disk bandwidth (MB/s) in integer
                                  to be consumed by diskBalancer. e.g. 10
    --maxerror <arg>              Describes how many errors can be
                                  tolerated while copying between a pair
                                  of disks.
    --out <arg>                   Local path of file to write output to,
                                  if not specified defaults will be used.
    --plan <arg>                  Hostname, IP address or UUID of datanode
                                  for which a plan is created.
    --thresholdPercentage <arg>   Percentage of data skew that is
                                  tolerated before disk balancer starts
                                  working. For example, if total data on a
                                  2 disk node is 100 GB then disk balancer
                                  calculates the expected value on each
                                  disk, which is 50 GB. If the tolerance
                                  is 10% then data on a single disk needs
                                  to be more than 60 GB (50 GB + 10%
                                  tolerance value) for Disk balancer to
                                  balance the disks.
    --v                           Print out the summary of the plan on
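The 60 GB figure in the `--thresholdPercentage` description can be reproduced with a little arithmetic. The sketch below is only an illustration: `ThresholdExample` and `triggerPointGb` are hypothetical names, and it assumes both disks have the same capacity of 100 GB, a value the help text does not state. Under that assumption a 10% tolerance on the usage ratio corresponds to 10 GB on one disk.

```java
final class ThresholdExample {
    /**
     * Returns the amount of data (GB) a single disk must exceed before
     * balancing is triggered, assuming all disks share one capacity.
     */
    static double triggerPointGb(double totalDataGb, int diskCount,
                                 double diskCapacityGb, double thresholdPercentage) {
        // Node-wide usage ratio, e.g. 100 GB / (2 * 100 GB) = 0.5.
        double idealRatio = totalDataGb / (diskCount * diskCapacityGb);
        // The threshold is given in percent, e.g. 10 -> 0.10.
        double threshold = thresholdPercentage / 100.0;
        // Convert the tolerated ratio back into GB on one disk.
        return (idealRatio + threshold) * diskCapacityGb;
    }

    public static void main(String[] args) {
        // Values from the help text; the 100 GB capacity is an assumption.
        double trigger = triggerPointGb(100.0, 2, 100.0, 10.0);
        System.out.printf("balancing starts above %.1f GB%n", trigger);
    }
}
```

With larger disks the same percentage translates into more gigabytes of tolerated skew, because the comparison is made on usage ratios rather than on absolute sizes.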
  /**
   * Computes Volume Data Density. Adding a new volume changes
   * the volumeDataDensity for all volumes. So we throw away
   * our priority queue and recompute everything.
   *
   * We discard failed volumes from this computation.
   *
   * totalCapacity = totalCapacity of this volumeSet
   * totalUsed = totalDfsUsed for this volumeSet
   * idealUsed = totalUsed / totalCapacity
   * dfsUsedRatio = dfsUsedOnAVolume / Capacity On that Volume
   * volumeDataDensity = idealUsed - dfsUsedRatio
   */
  public void computeVolumeDataDensity() {
    long totalCapacity = 0;
    long totalUsed = 0;

    // when we plan to re-distribute data we need to make
    // sure that we skip failed volumes.
    for (DiskBalancerVolume volume : volumes) {
      if (!volume.isFailed() && !volume.isSkip()) {

        if (volume.computeEffectiveCapacity() < 0) {
          // A negative effective capacity means the volume is
          // misconfigured: mark it as skipped and leave it out
          // of the totals.
          skipMisConfiguredVolume(volume);
          continue;
        }

        totalCapacity += volume.computeEffectiveCapacity();
        totalUsed += volume.getUsed();
      }
    }

    if (totalCapacity != 0) {
      this.idealUsed = truncateDecimals(totalUsed /
          (double) totalCapacity);
    }

    for (DiskBalancerVolume volume : volumes) {
      if (!volume.isFailed() && !volume.isSkip()) {
        double dfsUsedRatio =
            truncateDecimals(volume.getUsed() /
                (double) volume.computeEffectiveCapacity());

        volume.setVolumeDataDensity(this.idealUsed - dfsUsedRatio);
      }
    }
  }
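The density formula can also be tried out in isolation. The following is a minimal sketch, not the Hadoop implementation: `VolumeDensitySketch` and `computeDensities` are hypothetical names, volumes are reduced to plain capacity/used arrays, every capacity is assumed to be positive, and the failed/skip handling and decimal truncation of the real method are left out.

```java
import java.util.Arrays;

final class VolumeDensitySketch {
    /**
     * Returns density[i] = idealUsed - used[i] / capacity[i], where
     * idealUsed = sum(used) / sum(capacity). Assumes capacity[i] > 0.
     */
    static double[] computeDensities(long[] capacity, long[] used) {
        long totalCapacity = 0;
        long totalUsed = 0;
        for (int i = 0; i < capacity.length; i++) {
            totalCapacity += capacity[i];
            totalUsed += used[i];
        }

        double[] density = new double[capacity.length];
        if (totalCapacity == 0) {
            return density; // nothing to compare against; all zeros
        }

        double idealUsed = totalUsed / (double) totalCapacity;
        for (int i = 0; i < capacity.length; i++) {
            double dfsUsedRatio = used[i] / (double) capacity[i];
            density[i] = idealUsed - dfsUsedRatio;
        }
        return density;
    }

    public static void main(String[] args) {
        // Two 100 GB disks holding 70 GB and 30 GB (illustrative values).
        double[] d = computeDensities(new long[] {100, 100}, new long[] {70, 30});
        System.out.println(Arrays.toString(d));
    }
}
```

In this example the fuller disk gets a negative density (about -0.2) and the emptier disk a positive one (about 0.2), so the sign indicates whether a volume should give up or receive data.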
  /**
   * Computes whether we need to do any balancing on this volume Set at all.
   * It checks if any disks are out of threshold value
   *
   * @param thresholdPercentage - threshold - in percentage
   *
   * @return true if balancing is needed false otherwise.
   */
  public boolean isBalancingNeeded(double thresholdPercentage) {
    double threshold = thresholdPercentage / 100.0d;

    if (volumes == null || volumes.size() <= 1) {
      // there is nothing we can do with a single volume.
      // so no planning needed.
      return false;
    }

    for (DiskBalancerVolume vol : volumes) {
      boolean notSkip = !vol.isFailed() && !vol.isTransient() && !vol.isSkip();
      // Compare the magnitude of the density, so that both over-full
      // and under-full volumes can trigger balancing.
      Double absDensity =
          truncateDecimals(Math.abs(vol.getVolumeDataDensity()));

      if ((absDensity > threshold) && notSkip) {
        return true;
      }
    }
    return false;
  }
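The threshold check itself is small enough to exercise on its own. This is a hedged sketch under simplifying assumptions: `BalancingCheckSketch` is a hypothetical class, densities are passed in as a plain array, and the failed/transient/skip flags of the real volumes are ignored.

```java
final class BalancingCheckSketch {
    /**
     * Returns true if any density magnitude exceeds the threshold.
     * thresholdPercentage is given in percent, e.g. 10 for 10%.
     */
    static boolean isBalancingNeeded(double[] densities, double thresholdPercentage) {
        if (densities == null || densities.length <= 1) {
            return false; // a single volume cannot be balanced
        }
        double threshold = thresholdPercentage / 100.0;
        for (double density : densities) {
            if (Math.abs(density) > threshold) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Densities of +/-0.2 exceed a 10% threshold; +/-0.05 do not.
        System.out.println(isBalancingNeeded(new double[] {-0.2, 0.2}, 10.0));
        System.out.println(isBalancingNeeded(new double[] {-0.05, 0.05}, 10.0));
    }
}
```

Because the comparison is strict, a volume whose density sits exactly on the threshold does not trigger balancing, which is consistent with the "more than 60 GB" wording in the help text.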
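computeVolumeDataDensity: computes the data density of each disk, defined as the overall space usage ratio across all disks (idealUsed) minus that disk's own space usage ratio (dfsUsedRatio). A disk that is fuller than average therefore has a negative density, and an emptier one has a positive density.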