hashmap

基本信息

　　HashMap可以接受null键值和值，而Hashtable则不能；HashMap是非synchronized;HashMap很快；以及HashMap储存的是键值对。

注意点

　　１．当两个对象的hashcode相同会发生什么？
　　因为hashcode相同，所以它们的bucket位置相同，‘碰撞’会发生。因为HashMap使用链表存储对象，这个Entry(包含有键值对的Map.Entry对象)会存储在链表中。
　　２．如果两个键的hashcode相同，你如何获取值对象？
　　找到bucket位置之后，会调用keys.equals()方法去找到链表中正确的节点，最终找到要找的值对象。
　　３．如果HashMap的大小超过了负载因子(load factor)定义的容量，怎么办？
　　默认的负载因子大小为0.75，也就是说，当一个map填满了75%的bucket时候，和其它集合类(如ArrayList等)一样，将会创建原来HashMap大小的两倍的bucket数组，来重新调整map的大小，并将原来的对象放入新的bucket数组中。这个过程叫作rehashing，因为它调用hash方法找到新的bucket位置。
　　４．为什么String, Interger这样的wrapper类适合作为键？
　　Ｓtring, Interger这样的wrapper类作为HashMap的键是再适合不过了，而且String最为常用。因为String是不可变的，也是final的，而且已经重写了equals()和hashCode()方法了。其他的wrapper类也有这个特点。不可变性是必要的，因为为了要计算hashCode()，就要防止键值改变，如果键值在放入时和获取时返回不同的hashcode的话，那么就不能从HashMap中找到你想要的对象。不可变性还有其他的优点如线程安全。如果你可以仅仅通过将某个field声明成final就能保证hashCode是不变的，那么请这么做吧。因为获取对象的时候要用到equals()和hashCode()方法，那么键对象正确的重写这两个方法是非常重要的。如果两个不相等的对象返回不同的hashcode的话，那么碰撞的几率就会小些，这样就能提高HashMap的性能。

简单实现一个hashmap

package base_struct.hashmap;

/**
     * 基于链表数组的实现
     * @author dh
     * @param <V>
     * @param <K>
     *
*/
public class DhHashMap<K, V> {
private class Entry<K,V>{
    int hash;
    K key;
    V value;
    Entry<K,V> next;
    Entry(int hash, K key, V value, Entry<K, V> next) {
        this.hash = hash;
        this.key = key;
        this.value = value;
        this.next = next;
    }
}
private static final int DEFAULT_CAPACITY = 1 << 4;

private Entry<K, V>[] table;

private int capacity;

private int size;
public DhHashMap(int capacity) {
    if (capacity < 0) {
        throw new IllegalArgumentException();
    } else {
        table = new Entry[capacity];
        size = 0;
        this.capacity = capacity;
    }
}
public int size() {
    return size;
}
public boolean isEmpty() {
    return size == 0 ? true : false;}

/**
 * 这个hash算法采用位运算提高了效率,由于采用位运算，因此默认长度最好是2的幂最好
 * @param key
 * @return
 */
private int hash(Object key) {
    return (key == null) ? 0 : key.hashCode()&(capacity-1);
}

/**
 * 按照以前邏輯，默认key一样则只修改value,
 * @param key
 * @param value
 */
public void put(K key, V value) {
    if (key == null) {
        throw new IllegalArgumentException();
    }
    int hash = hash(key);
    Entry<K, V> nEntry = new Entry<K, V>(hash, key, value, null);
    Entry<K, V> entry = table[hash];
    while (entry != null) {
        if (entry.key.equals(key)) {
            entry.value = value;
            return;
        }
        entry = entry.next;
    }
    nEntry.next = table[hash];
    table[hash] = nEntry;
    size++;
}
/**
 * h很简单的逻辑，找到确认的hash值后，一个个遍历
 * @param key
 * @return
 */
public V get(K key) {
    if (key == null) {
        throw new IllegalArgumentException();
    }
    int hash = hash(key);
    Entry<K, V> entry = table[hash];
    while (entry != null) {
        if (entry.key.equals(key)) {
            return entry.value;
        }
        entry = entry.next;
    }
    return null;
}
}

>>>操作符

　这个操作符的作用是将当前整数，转换为２进制后，右移。比如
２>>>1,就是10右移1位剩下了一个1，就是1。
　　案例如下：
public static void main(String [] args)
{
System.out.println(4>>> 1);
}
　　结果是２．
public static void main(String [] args)
{
System.out.println(4>>> 2);
}
　　结果是１

1.8增加了对其优化。

　　其原因是hash的的平均分布可能有问题，有可能导致链表过长，从而使得效率变的很低。因此加了一个新的结构红黑树，如果说1.8之前的hashMap是数组+链表，那么现在版本的就是数组+链表+红黑树。当链表长度<8时，默认以前的方式加元素，也就是数组+链表，当链表>8时则将链表转换为红黑树。（红黑树是一个自平衡的二叉查找树，其插入删除时为了保持红黑树的平衡特征，回自旋来保证，可以将其理解为一个具有自平衡的查找2叉树）

hashmap线程不安全

　　hashmap没有做加锁处理，举个简单例子，2个线程插入一个数据，其hashcode都是一样的，在极端情况下，可能导致插入少了一个数据。

java 1.8源码分析

　　这里只分析下put方法，get方法很简单，自己去看看。突然觉得put方法也很简单。

  final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}

　　这里我直接说下流程：

先判断table是否为空，为空，则创建table。
在判断key所在table的位置是否为null,空的话直接创建链表node。否则进入else。
else里面先判断这个链表node的key是否和要put的key一样，一样则覆盖
在判断是否是树节点，是的话，插入树节点的方式插入数据。
否则，遍历链表，要么找到一样的key覆盖，要么创建新的节点，如果是创建节点，那么再判断这个链表长度是否>=8，是的话则将链表转换为红黑树。
再然后，覆盖的已经返回了，创建新节点的，走下一步，hashmap的size加一。在判断是否大于限定值，大于的话则是则扩张table。
好了 over，，，

 /** Implementation for put and putIfAbsent */
final V putVal(K key, V value, boolean onlyIfAbsent) {
    if (key == null || value == null) throw new NullPointerException();
    int hash = spread(key.hashCode());
    int binCount = 0;
    for (Node<K,V>[] tab = table;;) {
        Node<K,V> f; int n, i, fh;
        if (tab == null || (n = tab.length) == 0)
            tab = initTable();
        else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
            if (casTabAt(tab, i, null,
                         new Node<K,V>(hash, key, value, null)))
                break;                   // no lock when adding to empty bin
        }
        else if ((fh = f.hash) == MOVED)
            tab = helpTransfer(tab, f);
        else {
            V oldVal = null;
            synchronized (f) {
                if (tabAt(tab, i) == f) {
                    if (fh >= 0) {
                        binCount = 1;
                        for (Node<K,V> e = f;; ++binCount) {
                            K ek;
                            if (e.hash == hash &&
                                ((ek = e.key) == key ||
                                 (ek != null && key.equals(ek)))) {
                                oldVal = e.val;
                                if (!onlyIfAbsent)
                                    e.val = value;
                                break;
                            }
                            Node<K,V> pred = e;
                            if ((e = e.next) == null) {
                                pred.next = new Node<K,V>(hash, key,
                                                          value, null);
                                break;
                            }
                        }
                    }
                    else if (f instanceof TreeBin) {
                        Node<K,V> p;
                        binCount = 2;
                        if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
                                                       value)) != null) {
                            oldVal = p.val;
                            if (!onlyIfAbsent)
                                p.val = value;
                        }
                    }
                }
            }
            if (binCount != 0) {
                if (binCount >= TREEIFY_THRESHOLD)
                    treeifyBin(tab, i);
                if (oldVal != null)
                    return oldVal;
                break;
            }
        }
    }
    addCount(1L, binCount);
    return null;
}

　　这个是ConcurrentHashMap的put源码。

先根据key计算出位置，然后这个位置为null则cas看看可否插入值成功。
判断是否需要扩容
加锁写数据，和hashmap类似