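Preface:
Recently, while working on encryption, I needed a random 32-character hash value (32 hex digits x 4 = 128 bits). That led me to UUID, so I am taking the chance to summarize it here for everyone's convenience.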
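1 Introduction
UUID stands for Universally Unique Identifier. It is a standard used in software construction and was standardized by the Open Software Foundation (OSF) as part of the Distributed Computing Environment (DCE). Its purpose is to let every element in a distributed system carry uniquely identifying information without a central authority assigning it. That way, anyone can create UUIDs that do not conflict with anyone else's, so there is no need to worry about name collisions when creating database records. The most widely used UUID today is Microsoft's Globally Unique Identifier (GUID); other notable users include the Linux ext2/ext3 file systems, LUKS encrypted partitions, GNOME, KDE, and Mac OS X. An implementation can also be found in the UUID library of the e2fsprogs package.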
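1.1 Definition
A UUID consists of 32 hexadecimal digits, so the theoretical total number of UUIDs is 16^32 = 2^128, about 3.4 x 10^38. In other words, generating 1 trillion UUIDs every nanosecond would still take about 10 billion years to use them all up.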
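The arithmetic above is easy to check with BigInteger. The following is a minimal sketch; the class name `UuidSpace` and its helper methods are my own names for illustration, and a year is assumed to be 365 days.

```java
import java.math.BigInteger;

final class UuidSpace {
    /** Total number of distinct 128-bit values: 16^32 = 2^128. */
    static BigInteger total() {
        return BigInteger.valueOf(16).pow(32);
    }

    /** Whole years needed to exhaust the space at the given rate per nanosecond. */
    static BigInteger yearsToExhaust(BigInteger perNanosecond) {
        // Assumption: one year = 365 days.
        BigInteger nanosPerYear = BigInteger.valueOf(365L * 24 * 60 * 60)
                .multiply(BigInteger.valueOf(1_000_000_000L));
        return total().divide(perNanosecond).divide(nanosPerYear);
    }

    public static void main(String[] args) {
        System.out.println("total UUIDs: " + total());
        // 10^12 (one trillion) UUIDs per nanosecond.
        System.out.println("years: " + yearsToExhaust(BigInteger.TEN.pow(12)));
    }
}
```

The result comes out at a little over 10 billion years, which matches the estimate in the text.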
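1.2 Probability of duplicates
In a randomly generated UUID (for example, one produced by the java.util.UUID class), 122 of the 128 bits are random; 4 bits are used by the version ('Randomly generated UUID') and 2 bits by the variant ('Leach-Salz'). Using the birthday paradox, the probability that two UUIDs share the same value is approximately:
p(n) ≈ 1 - e^(-n^2 / (2x))
The table below gives the collision probability after generating n GUIDs, computed with x = 2^122:
n | Probability |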
---|---|
68,719,476,736 = 2^36 | 0.0000000000000004 (4 x 10^-16) |
2,199,023,255,552 = 2^41 | 0.0000000000004 (4 x 10^-13) |
70,368,744,177,664 = 2^46 | 0.0000000004 (4 x 10^-10) |
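By comparison, a person's chance of being hit by a meteorite in a given year is estimated at 1 in 17 billion, about 0.00000000006 (6 x 10^-11), which is equivalent to creating tens of trillions of GUIDs in one year and getting a single duplicate. Put another way, generating 1 billion UUIDs per second for 100 years gives roughly a 50% chance of a single duplicate. If every person on Earth owned 600 million GUIDs, the chance of one duplicate would also be roughly 50%.
The chance that a duplicate GUID is generated and causes an error is extremely low, so there is no real need to worry about it.
The probability also depends on the quality of the random number generator. To keep the duplicate probability from rising, the values must be generated with a cryptographically secure pseudorandom number generator.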
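The table values in this section can be reproduced with a few lines of Java. This is a sketch that assumes the usual birthday approximation p(n) ≈ 1 - e^(-n^2 / (2x)) with x = 2^122; the class name `UuidCollision` is made up for the example.

```java
final class UuidCollision {
    /** Number of possible values for the 122 random bits of a version 4 UUID. */
    static final double X = Math.pow(2, 122);

    /** Birthday approximation: p(n) = 1 - e^(-n^2 / (2x)). */
    static double probability(double n) {
        // -expm1(-t) equals 1 - e^(-t) but stays accurate when t is tiny.
        return -Math.expm1(-(n * n) / (2 * X));
    }

    public static void main(String[] args) {
        for (int exp : new int[] {36, 41, 46}) {
            double n = Math.pow(2, exp);
            System.out.printf("n = 2^%d -> p = %.1e%n", exp, probability(n));
        }
    }
}
```

`Math.expm1` is used instead of `1 - Math.exp(...)` because the probabilities here are so small that the plain subtraction would lose most of its precision in double arithmetic.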
1.3 The description in the source code
/**
* A class that represents an immutable universally unique identifier (UUID).
* A UUID represents a 128-bit value.
*
* <p> There exist different variants of these global identifiers. The methods
* of this class are for manipulating the Leach-Salz variant, although the
* constructors allow the creation of any variant of UUID (described below).
*
* <p> The layout of a variant 2 (Leach-Salz) UUID is as follows:
*
* The most significant long consists of the following unsigned fields:
* <pre>
* 0xFFFFFFFF00000000 time_low
* 0x00000000FFFF0000 time_mid
* 0x000000000000F000 version
* 0x0000000000000FFF time_hi
* </pre>
* The least significant long consists of the following unsigned fields:
* <pre>
* 0xC000000000000000 variant
* 0x3FFF000000000000 clock_seq
* 0x0000FFFFFFFFFFFF node
* </pre>
*
* <p> The variant field contains a value which identifies the layout of the
* {@code UUID}. The bit layout described above is valid only for a {@code
* UUID} with a variant value of 2, which indicates the Leach-Salz variant.
*
* <p> The version field holds a value that describes the type of this {@code
* UUID}. There are four different basic types of UUIDs: time-based, DCE
* security, name-based, and randomly generated UUIDs. These types have a
* version value of 1, 2, 3 and 4, respectively.
*
* <p> For more information including algorithms used to create {@code UUID}s,
* see <a href="http://www.ietf.org/rfc/rfc4122.txt"> <i>RFC 4122: A
* Universally Unique IDentifier (UUID) URN Namespace</i></a>, section 4.2
* "Algorithms for Creating a Time-Based UUID".
*
* @since 1.5
*/
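1. A UUID represents a 128-bit number, that is, 16 bytes written as 32 hexadecimal digits.
2. A UUID consists of two parts (the high 64 bits and the low 64 bits).
  High 64 bits (the first 16 hex digits)
    32 bits (8 hex digits): time_low
    16 bits (4 hex digits): time_mid
    4 bits (1 hex digit): version
    12 bits (3 hex digits): time_hi
  Low 64 bits (the last 16 hex digits)
    2 bits (less than one hex digit): variant
    14 bits (mask 3FFF): clock_seq
    48 bits (12 hex digits): node
3. The UUID class was introduced in JDK 1.5.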
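To make the layout concrete, here is a small sketch that applies the masks from the Javadoc to the two longs of a UUID. The class name `UuidFields`, its helper methods, and the sample UUID string are all chosen by me for illustration.

```java
import java.util.UUID;

final class UuidFields {
    // Fields of the most significant long.
    static long timeLow(UUID u)  { return (u.getMostSignificantBits() & 0xFFFFFFFF00000000L) >>> 32; }
    static long timeMid(UUID u)  { return (u.getMostSignificantBits() & 0x00000000FFFF0000L) >>> 16; }
    static long version(UUID u)  { return (u.getMostSignificantBits() & 0x000000000000F000L) >>> 12; }
    static long timeHi(UUID u)   { return  u.getMostSignificantBits() & 0x0000000000000FFFL; }

    // Fields of the least significant long.
    static long variant(UUID u)  { return (u.getLeastSignificantBits() & 0xC000000000000000L) >>> 62; }
    static long clockSeq(UUID u) { return (u.getLeastSignificantBits() & 0x3FFF000000000000L) >>> 48; }
    static long node(UUID u)     { return  u.getLeastSignificantBits() & 0x0000FFFFFFFFFFFFL; }

    public static void main(String[] args) {
        // Sample version 1 UUID, used only as an illustration.
        UUID u = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");
        System.out.printf("time_low=%x time_mid=%x version=%d time_hi=%x%n",
                timeLow(u), timeMid(u), version(u), timeHi(u));
        System.out.printf("variant=%d clock_seq=%x node=%x%n",
                variant(u), clockSeq(u), node(u));
    }
}
```

For the sample UUID, the third group `12d3` splits into version 1 and time_hi `2d3`, and the fourth group `a456` splits into variant 2 (binary 10) and clock_seq `2456`.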
2 Detailed code walkthrough
2.1 Getting the timestamp
public long timestamp() {
    if (version() != 1) {
        throw new UnsupportedOperationException("Not a time-based UUID");
    }
    return (mostSigBits & 0x0FFFL) << 48
         | ((mostSigBits >> 16) & 0x0FFFFL) << 32
         | mostSigBits >>> 32;
}
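This uses bit operations to combine the time_low, time_mid, and time_hi fields from section 1.3 into a single 60-bit timestamp, with time_hi as the most significant part.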
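The value returned by timestamp() is a count of 100-nanosecond units since 1582-10-15 UTC, as defined in RFC 4122, so it needs an offset before it can be read as a normal date. Below is a hedged sketch of that conversion; `UuidTime` and `toInstant` are hypothetical names of my own, not part of the JDK.

```java
import java.time.Instant;
import java.util.UUID;

final class UuidTime {
    /** 100-ns units between 1582-10-15 (UUID epoch) and 1970-01-01 (Unix epoch). */
    static final long GREGORIAN_OFFSET = 0x01B21DD213814000L;

    /** Converts the timestamp of a version 1 UUID to an Instant. */
    static Instant toInstant(UUID uuid) {
        // timestamp() throws UnsupportedOperationException for non-version-1 UUIDs.
        long ticks = uuid.timestamp() - GREGORIAN_OFFSET;
        long seconds = Math.floorDiv(ticks, 10_000_000L);
        long nanos = Math.floorMod(ticks, 10_000_000L) * 100;
        return Instant.ofEpochSecond(seconds, nanos);
    }

    public static void main(String[] args) {
        // Sample version 1 UUID, used only as an illustration.
        UUID sample = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");
        System.out.printf("raw timestamp: %x%n", sample.timestamp());
        System.out.println("as Instant:    " + toInstant(sample));
    }
}
```

For the sample UUID the raw timestamp is `2d3e89b123e4567`: time_hi `2d3`, then time_mid `e89b`, then time_low `123e4567`.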
2.2 Getting the variant
/**
* The variant number associated with this {@code UUID}. The variant
* number describes the layout of the {@code UUID}.
*
* The variant number has the following meaning:
* <ul>
* <li>0 Reserved for NCS backward compatibility
* <li>2 <a href="http://www.ietf.org/rfc/rfc4122.txt">IETF RFC 4122</a>
* (Leach-Salz), used by this class
* <li>6 Reserved, Microsoft Corporation backward compatibility
* <li>7 Reserved for future definition
* </ul>
*
* @return The variant number of this {@code UUID}
*/
public int variant() {
    // This field is composed of a varying number of bits.
    // 0    -    -    Reserved for NCS backward compatibility
    // 1    0    -    The IETF aka Leach-Salz variant (used by this class)
    // 1    1    0    Reserved, Microsoft backward compatibility
    // 1    1    1    Reserved for future definition.
    return (int) ((leastSigBits >>> (64 - (leastSigBits >>> 62)))
                  & (leastSigBits >> 63));
}
See the comments in the code for details: each value has a different meaning.
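A quick way to see the variable-length encoding in action is to build UUIDs whose low 64 bits start with each bit pattern and call variant() on them. This is only a sketch using the public UUID(long, long) constructor; the class name `UuidVariant` and the helper `variantOf` are made up for the example.

```java
import java.util.UUID;

final class UuidVariant {
    /** Builds a UUID with the given low 64 bits and returns its variant number. */
    static int variantOf(long leastSigBits) {
        return new UUID(0L, leastSigBits).variant();
    }

    public static void main(String[] args) {
        System.out.println(variantOf(0x0000000000000000L)); // top bits 0xx -> 0 (NCS)
        System.out.println(variantOf(0x8000000000000000L)); // top bits 10x -> 2 (IETF / Leach-Salz)
        System.out.println(variantOf(0xC000000000000000L)); // top bits 110 -> 6 (Microsoft)
        System.out.println(variantOf(0xE000000000000000L)); // top bits 111 -> 7 (future)
    }
}
```

When the top bit is 0, the sign-extending shift `leastSigBits >> 63` yields 0 and masks everything away; when it is 1, the shift yields all ones and the first operand decides between 2, 6, and 7.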
2.3 Getting the version
/**
* The version number associated with this {@code UUID}. The version
* number describes how this {@code UUID} was generated.
*
* The version number has the following meaning:
* <ul>
* <li>1 Time-based UUID
* <li>2 DCE security UUID
* <li>3 Name-based UUID
* <li>4 Randomly generated UUID
* </ul>
*
* @return The version number of this {@code UUID}
*/
public int version() {
    // Version is bits masked by 0x000000000000F000 in MS long
    return (int)((mostSigBits >> 12) & 0x0f);
}
From section 1.3 we know that version occupies 4 bits of the high 64 bits, namely bits 12 to 15.
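The version number depends on how the UUID was created, which is easy to confirm with the factory methods that java.util.UUID already provides. A minimal sketch follows; the class name `UuidVersion`, the helper `manualVersion`, and the sample inputs are my own.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

final class UuidVersion {
    /** Same extraction as version(): shift the high long right by 12, keep 4 bits. */
    static int manualVersion(UUID uuid) {
        return (int) ((uuid.getMostSignificantBits() >> 12) & 0x0f);
    }

    public static void main(String[] args) {
        UUID random = UUID.randomUUID();
        UUID named = UUID.nameUUIDFromBytes("example".getBytes(StandardCharsets.UTF_8));
        // Sample version 1 UUID, used only as an illustration.
        UUID timeBased = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");

        System.out.println(random.version());     // randomly generated
        System.out.println(named.version());      // name-based (MD5)
        System.out.println(timeBased.version());  // time-based
        System.out.println(manualVersion(random) == random.version());
    }
}
```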
2.4 Getting the clock sequence
public int clockSequence() {
    if (version() != 1) {
        throw new UnsupportedOperationException("Not a time-based UUID");
    }
    return (int)((leastSigBits & 0x3FFF000000000000L) >>> 48);
}
2.5 Getting the node
public long node() {
    if (version() != 1) {
        throw new UnsupportedOperationException("Not a time-based UUID");
    }
    return leastSigBits & 0x0000FFFFFFFFFFFFL;
}
Note that if version is not 1, that is, the UUID is not time-based, these methods throw an UnsupportedOperationException.
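Because of that guard, calling node() or clockSequence() on the result of UUID.randomUUID() always fails. One way to deal with it is to check the version first. The sketch below is just one possible approach; `UuidNode` and `nodeOf` are hypothetical names of my own.

```java
import java.util.OptionalLong;
import java.util.UUID;

final class UuidNode {
    /** Returns the node field for version 1 UUIDs, or an empty value otherwise. */
    static OptionalLong nodeOf(UUID uuid) {
        return uuid.version() == 1
                ? OptionalLong.of(uuid.node())
                : OptionalLong.empty();
    }

    public static void main(String[] args) {
        // Sample version 1 UUID, used only as an illustration.
        UUID timeBased = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");
        System.out.printf("node=%x clock_seq=%x%n",
                nodeOf(timeBased).getAsLong(), timeBased.clockSequence());

        // A random (version 4) UUID has no node field to read.
        System.out.println(nodeOf(UUID.randomUUID()).isPresent());
    }
}
```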
2.6 Generating a random UUID
public static UUID randomUUID() {
    SecureRandom ng = Holder.numberGenerator;
    byte[] randomBytes = new byte[16];
    ng.nextBytes(randomBytes);
    randomBytes[6]  &= 0x0f;  /* clear version        */
    randomBytes[6]  |= 0x40;  /* set to version 4     */
    randomBytes[8]  &= 0x3f;  /* clear variant        */
    randomBytes[8]  |= 0x80;  /* set to IETF variant  */
    return new UUID(randomBytes);
}
In essence, it gets 16 random bytes from SecureRandom and then overwrites the version and variant bits.
private UUID(byte[] data) {
    long msb = 0;
    long lsb = 0;
    assert data.length == 16 : "data must be 16 bytes in length";
    for (int i=0; i<8; i++)
        msb = (msb << 8) | (data[i] & 0xff);
    for (int i=8; i<16; i++)
        lsb = (lsb << 8) | (data[i] & 0xff);
    this.mostSigBits = msb;
    this.leastSigBits = lsb;
}
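Array elements 0 to 7 form the high 64 bits and elements 8 to 15 form the low 64 bits. The smaller the array index, the more significant its position in the UUID.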
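Since the byte-array constructor is private, the same steps can be tried from outside the class with the public UUID(long, long) constructor. The following is a sketch of that idea, not the JDK's own code: `UuidFromBytes` and `version4From` are hypothetical names, and ByteBuffer is my substitute for the two shift loops.

```java
import java.nio.ByteBuffer;
import java.security.SecureRandom;
import java.util.UUID;

final class UuidFromBytes {
    /** Stamps version 4 and the IETF variant onto 16 bytes, then packs them big-endian. */
    static UUID version4From(byte[] source) {
        if (source.length != 16) {
            throw new IllegalArgumentException("data must be 16 bytes in length");
        }
        byte[] data = source.clone();   // leave the caller's array untouched
        data[6] &= 0x0f;                // clear version
        data[6] |= 0x40;                // set to version 4
        data[8] &= 0x3f;                // clear variant
        data[8] |= 0x80;                // set to IETF variant
        // ByteBuffer is big-endian by default: bytes 0..7 become the high long.
        ByteBuffer buf = ByteBuffer.wrap(data);
        return new UUID(buf.getLong(), buf.getLong());
    }

    public static void main(String[] args) {
        byte[] random = new byte[16];
        new SecureRandom().nextBytes(random);
        UUID uuid = version4From(random);
        System.out.println(uuid + " version=" + uuid.version() + " variant=" + uuid.variant());
    }
}
```

Feeding in the bytes 0x00 through 0x0f gives `00010203-0405-4607-8809-0a0b0c0d0e0f`, which shows both the byte order and the two bytes that get overwritten (0x06 becomes 0x46, 0x08 becomes 0x88).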
2.7 Getting the full UUID string
public String toString() {
    return (digits(mostSigBits >> 32, 8) + "-" +
            digits(mostSigBits >> 16, 4) + "-" +
            digits(mostSigBits, 4) + "-" +
            digits(leastSigBits >> 48, 4) + "-" +
            digits(leastSigBits, 12));
}
/** Returns val represented by the specified number of hex digits. */
private static String digits(long val, int digits) {
    long hi = 1L << (digits * 4);
    return Long.toHexString(hi | (val & (hi - 1))).substring(1);
}
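From the code we can confirm the UUID format:
FFFFFFFF-FFFF-FFFF-FFFF-FFFFFFFFFFFF
Matching section 1.3, the groups are:
time-low (8 hex digits), time-mid (4 digits), version & time-hi (combined into 4 digits), variant & clock-seq (combined into 4 digits), node (12 digits).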
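The digits helper is worth a closer look, because it is what keeps the leading zeros in each group. The sketch below copies the method into a standalone class (`HexDigits` is a name I made up) so it can be tried in isolation.

```java
final class HexDigits {
    /** Returns val as exactly `digits` hex characters, zero-padded and truncated to the low bits. */
    static String digits(long val, int digits) {
        // Sentinel bit just above the wanted digits, e.g. 0x10000 for 4 digits.
        long hi = 1L << (digits * 4);
        // (val & (hi - 1)) keeps only the low bits; OR-ing in the sentinel forces
        // a leading "1" so toHexString cannot drop leading zeros; substring(1)
        // then removes that sentinel character.
        return Long.toHexString(hi | (val & (hi - 1))).substring(1);
    }

    public static void main(String[] args) {
        System.out.println(digits(0xabL, 4));        // padded on the left
        System.out.println(digits(0x123456789L, 4)); // only the low 4 digits are kept
    }
}
```

This also explains why toString() can pass `mostSigBits >> 16` without masking first: the extra high bits are discarded inside digits.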
To get the full 32-character string without the hyphens, you can do this:
private String getHashCode() {
    UUID uuid = UUID.randomUUID();
    String uuidStr = uuid.toString().replace("-", "");
    return uuidStr;
}
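If the 32-character form is stored somewhere, it is sometimes handy to turn it back into a UUID. Here is a minimal sketch of the round trip; the class `CompactUuid` and the methods `compact` and `expand` are hypothetical names for this example.

```java
import java.util.UUID;

final class CompactUuid {
    /** Removes the four hyphens, leaving 32 hex characters. */
    static String compact(UUID uuid) {
        return uuid.toString().replace("-", "");
    }

    /** Re-inserts the hyphens in the 8-4-4-4-12 layout and parses the result. */
    static UUID expand(String hex32) {
        if (hex32.length() != 32) {
            throw new IllegalArgumentException("expected 32 hex characters");
        }
        String dashed = hex32.substring(0, 8) + "-"
                + hex32.substring(8, 12) + "-"
                + hex32.substring(12, 16) + "-"
                + hex32.substring(16, 20) + "-"
                + hex32.substring(20);
        return UUID.fromString(dashed);
    }

    public static void main(String[] args) {
        UUID original = UUID.randomUUID();
        String hex = compact(original);
        System.out.println(hex + " (" + hex.length() + " chars)");
        System.out.println(expand(hex).equals(original));
    }
}
```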
References: