A Brief Analysis of How Java's HashMap.entrySet() Is Implemented

2022-10-27 10:56:13

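The documentation for Java's HashMap.entrySet() describes the method as follows: it returns a Set that is a view of the HashMap, so changes made to the Map show up in the Set, and vice versa. The original wording is: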

Returns a Set view of the mappings contained in this map. The set is backed by the map, so changes to the map are reflected in the set, and vice-versa.
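The view behavior described in the documentation can be observed with a short experiment. The following is a minimal sketch; the class name EntrySetViewDemo and its run() method are made up for illustration and only use the public java.util API.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class: shows that the entry set is a live view of the map.
class EntrySetViewDemo {
    static Map<String, Integer> run() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);

        // The view is obtained once, before the later changes.
        Set<Map.Entry<String, Integer>> entries = map.entrySet();

        // Map -> Set: a mapping added through the map is visible in the set.
        map.put("c", 3);
        System.out.println("set size after map.put: " + entries.size());

        // Set -> Map: removing through the set's iterator removes the mapping.
        for (Iterator<Map.Entry<String, Integer>> it = entries.iterator(); it.hasNext(); ) {
            if (it.next().getKey().equals("a")) {
                it.remove();
            }
        }

        // Set -> Map: Entry.setValue writes through to the map.
        for (Map.Entry<String, Integer> e : entries) {
            if (e.getKey().equals("b")) {
                e.setValue(20);
            }
        }
        return map;
    }

    public static void main(String[] args) {
        System.out.println("map contents: " + run());
    }
}
```

After run() returns, the map no longer contains "a", and "b" maps to 20, even though both changes were made through the set rather than the map.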

This article takes a brief look at the source code to see how this behavior is implemented.

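First, a quick introduction to HashMap's internal storage. A Map stores key-value data, and the Map interface defines a <k, v> pair as an Entry; HashMap provides its own implementation of the Entry interface. Internally, HashMap maintains an array of Entry objects. (The source quoted in this article comes from the older, pre-Java 8 implementation of HashMap.)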

transient Entry[] table;

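When a new element is put into the map, the array index is computed from the hash of its key. Each array element is the head of a linked list that holds all the Entry objects mapped to that index.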

Entry[] table
    +---+
    | 0 | -> entry_0_0 -> entry_0_1 -> null
    +---+
    | 1 | -> null
    +---+
    |   |
     ...
    |n-1| -> entry_n-1_0 -> null
    +---+
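The step from a key's hash to an array index can be sketched as follows. This is a simplified illustration, not the JDK source: it assumes the table length is a power of two and leaves out the extra hash-spreading step that the real HashMap applies before indexing. The class name BucketIndexSketch is hypothetical.

```java
// Hypothetical sketch of mapping a hash value to a slot of the table array.
class BucketIndexSketch {
    // Assumes tableLength is a power of two, so the mask keeps the low bits.
    static int indexFor(int hash, int tableLength) {
        return hash & (tableLength - 1);
    }

    public static void main(String[] args) {
        int tableLength = 16;
        // "a" (hash 97) and "q" (hash 113) share the same low four bits,
        // so they land in the same slot and would be chained in one list.
        for (String key : new String[] {"a", "q", "b"}) {
            System.out.println(key + " -> slot " + indexFor(key.hashCode(), tableLength));
        }
    }
}
```

Keys whose hashes agree in the masked bits fall into the same slot, which is why each slot needs a linked list rather than a single Entry.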

The entrySet() method returns a special Set, defined as a private inner class of HashMap:

private final class EntrySet extends AbstractSet<Map.Entry<K,V>>

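The method of interest in this Set is iterator(). It is very simple: it returns an instance of EntryIterator. EntryIterator is a subclass of the generic HashIterator<T>, and its body is trivial: its only code is a next() method that calls HashIterator's nextEntry(). The focus therefore shifts to nextEntry(). The relationships are shown in the diagram below: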

HashMap
    |- table <------------------------------------\
    \+ entrySet()                                 |iterates
        |              HashMap.HashIterator<T>    |
        |returns                ^       \- nextEntry()
        V                       -                 ^
HashMap.EntrySet                |                 |
    \- iterator()               |extends          |
            |                   |                 |
            |  instantiates     |                 |calls
            \----------> HashMap.EntryIterator    |
                                        \- next() /

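HashIterator traverses the HashMap by walking the table array. It maintains a few fields: index is the current position in the table array (the next slot to be scanned), current is the entry most recently returned from one of the table's linked lists, and next is the entry that the following call will return. The complete code of nextEntry() is: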

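final Entry<K,V> nextEntry() {
    // Fail fast if the map was structurally modified during iteration.
    if (modCount != expectedModCount)
        throw new ConcurrentModificationException();
    Entry<K,V> e = next;
    if (e == null)
        throw new NoSuchElementException();
    // Advance within the current chain; if the chain is exhausted,
    // scan the table for the next non-empty slot.
    if ((next = e.next) == null) {
        Entry[] t = table;
        while (index < t.length && (next = t[index++]) == null)
            ;
    }
    current = e;
    return e;
}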

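The first if checks whether the map has been structurally modified since the iterator was created. This is the fail-fast check, commonly associated with concurrent modification from another thread, and it is not discussed further here. If next is not null, it is returned and next is updated. The update is what the third if does: if the current linked list has not ended, next simply moves one element forward; otherwise, the loop searches table for the next non-empty slot.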
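The modCount check is not limited to multi-threaded code; a single thread can trigger it by changing the map while iterating. The following is a small sketch of that situation, using a hypothetical class name FailFastDemo and only the public java.util API.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class: triggers the iterator's fail-fast check in one thread.
class FailFastDemo {
    /** Returns true if modifying the map during iteration raised the exception. */
    static boolean modifyWhileIterating() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (Map.Entry<String, Integer> e : map.entrySet()) {
                // Adding a new key is a structural modification that
                // bypasses the iterator, so the next call to next() fails.
                map.put("new-" + e.getKey(), 0);
            }
        } catch (ConcurrentModificationException ex) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("fail-fast triggered: " + modifyWhileIterating());
    }
}
```

Removing entries through the iterator's own remove() method does not trip this check, because the iterator keeps its expected modification count in step with the map.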

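To summarize: HashMap's entrySet() method returns a special Set; this Set is traversed with an EntryIterator, and that iterator operates directly on table, HashMap's internal storage. This is how the "view" behavior is achieved. No auxiliary storage is needed to hold a copy of the entries.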
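The same idea can be reproduced in a few dozen lines. The sketch below is not the JDK code: TinyMap, Node, and advanceToNextSlot are hypothetical names, the table has a fixed size of 8 with no resizing, null keys are not supported, and remove() and the modCount check are left out. It only illustrates how a Set can be a view by letting its iterator walk the table directly.

```java
import java.util.AbstractSet;
import java.util.Iterator;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Set;

// Hypothetical, deliberately tiny chained hash table (not the JDK implementation).
class TinyMap<K, V> {
    static final class Node<K, V> implements Map.Entry<K, V> {
        final K key;
        V value;
        Node<K, V> next;
        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
        public K getKey() { return key; }
        public V getValue() { return value; }
        public V setValue(V v) { V old = value; value = v; return old; }
    }

    private final Node<K, V>[] table;
    private int size;

    @SuppressWarnings("unchecked")
    TinyMap() {
        table = (Node<K, V>[]) new Node[8]; // fixed power-of-two length, never resized
    }

    V put(K key, V value) {
        int i = key.hashCode() & (table.length - 1);
        for (Node<K, V> n = table[i]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                V old = n.value;
                n.value = value;
                return old;
            }
        }
        table[i] = new Node<>(key, value, table[i]); // prepend to the chain
        size++;
        return null;
    }

    // The returned set stores nothing itself; it only knows how to iterate the table.
    Set<Map.Entry<K, V>> entrySet() {
        return new AbstractSet<Map.Entry<K, V>>() {
            @Override public Iterator<Map.Entry<K, V>> iterator() { return new EntryIterator(); }
            @Override public int size() { return TinyMap.this.size; }
        };
    }

    private final class EntryIterator implements Iterator<Map.Entry<K, V>> {
        private int index;        // next slot of table to scan
        private Node<K, V> next;  // next entry to return

        EntryIterator() { advanceToNextSlot(); }

        private void advanceToNextSlot() {
            while (index < table.length && (next = table[index++]) == null) {
                // skip empty slots
            }
        }

        @Override public boolean hasNext() { return next != null; }

        @Override public Map.Entry<K, V> next() {
            Node<K, V> e = next;
            if (e == null) throw new NoSuchElementException();
            if ((next = e.next) == null) advanceToNextSlot();
            return e;
        }
    }

    public static void main(String[] args) {
        TinyMap<String, Integer> map = new TinyMap<>();
        map.put("a", 1);
        Set<Map.Entry<String, Integer>> view = map.entrySet();
        map.put("b", 2); // added after the view was created, still visible through it
        for (Map.Entry<String, Integer> e : view) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```

Because the anonymous set holds no elements of its own, anything put into the map after entrySet() was called is found by the next iteration over the view.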

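P.S. This also shows why entrySet() is the most efficient way to traverse a HashMap when both keys and values are needed: the traversal follows the map's internal storage layout directly. Iterating over keySet() and calling get() for each key, by contrast, performs an additional hash lookup per element.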
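The two traversal styles can be compared side by side. This is an illustrative sketch with a hypothetical class name IterationStyles; it shows where the extra lookup happens and makes no claim about measured timings.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class: two ways to visit every key-value pair.
class IterationStyles {
    static int sumViaEntrySet(Map<String, Integer> map) {
        int sum = 0;
        for (Map.Entry<String, Integer> e : map.entrySet()) {
            sum += e.getValue(); // value is read straight from the entry
        }
        return sum;
    }

    static int sumViaKeySet(Map<String, Integer> map) {
        int sum = 0;
        for (String key : map.keySet()) {
            sum += map.get(key); // one extra hash lookup per key
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        System.out.println(sumViaEntrySet(map) + " " + sumViaKeySet(map));
    }
}
```

Both methods compute the same result; the entrySet() version simply avoids going back through the hash table for each value.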
