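Preface
Flink's memory manager manages the memory used for sorting, hashing, and caching. Memory is represented as equally sized memory segments, called memory pages. Operators allocate memory by requesting a number of memory pages. In Flink, memory is further divided into heap and off-heap memory; which type is requested is controlled by configuration parameters.
The memory manager can either pre-allocate all memory or allocate it on demand. With pre-allocation, the memory is occupied and reserved from the start, so requesting memory cannot cause an OutOfMemoryError, and released memory goes back into the memory manager's pool. With on-demand allocation, the memory manager only tracks how many memory segments are currently allocated (bookkeeping only). Releasing a memory page does not add it to the pool; the garbage collector reclaims it instead.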
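The two strategies can be illustrated with a minimal sketch. `ToyPageManager` and everything in it are hypothetical names made up for this illustration; it uses plain `byte[]` pages and is not Flink's `MemoryManager`, only a toy with the same two modes.

```java
import java.util.ArrayDeque;

/** Toy page manager (hypothetical, not Flink code) showing the two allocation strategies. */
final class ToyPageManager {
    private final ArrayDeque<byte[]> pool = new ArrayDeque<>();
    private final int pageSize;
    private final boolean preAllocated;
    private int nonAllocatedPages; // only meaningful in on-demand mode

    ToyPageManager(int totalPages, int pageSize, boolean preAllocated) {
        this.pageSize = pageSize;
        this.preAllocated = preAllocated;
        if (preAllocated) {
            // Pre-allocation: all pages are created and reserved up front.
            for (int i = 0; i < totalPages; i++) {
                pool.add(new byte[pageSize]);
            }
        } else {
            // On-demand: nothing is allocated yet, only a counter is kept.
            nonAllocatedPages = totalPages;
        }
    }

    int availablePages() {
        return preAllocated ? pool.size() : nonAllocatedPages;
    }

    byte[] allocate() {
        if (availablePages() == 0) {
            throw new IllegalStateException("No pages remaining.");
        }
        if (preAllocated) {
            return pool.remove();     // hand out an existing page
        }
        nonAllocatedPages--;          // bookkeeping only
        return new byte[pageSize];    // the real allocation happens here
    }

    void release(byte[] page) {
        if (preAllocated) {
            pool.add(page);           // page goes back into the pool
        } else {
            nonAllocatedPages++;      // page is left to the garbage collector
        }
    }

    public static void main(String[] args) {
        ToyPageManager eager = new ToyPageManager(4, 1024, true);
        byte[] page = eager.allocate();
        System.out.println("available after allocate: " + eager.availablePages());
        eager.release(page);
        System.out.println("available after release: " + eager.availablePages());
    }
}
```

In the pre-allocated mode the page count is simply the pool size; in the on-demand mode the counter is the only record the manager keeps of its budget.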
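Let's now take a closer look at memory management in the task manager.
1. Definitions of heap and off-heap memory
Flink's memory is divided into heap and off-heap memory. Both pool types extend the abstract class MemoryPool, so let's first look at how MemoryPool is defined.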
// ------------------------------------------------------------------------
// Abstract memory pool class
// ------------------------------------------------------------------------
abstract static class MemoryPool {
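// Returns the number of memory pages currently available in the pool
abstract int getNumberOfAvailableMemorySegments();
// Allocates a new memory segment outside the pool (used for on-demand allocation)
abstract MemorySegment allocateNewSegment(Object owner);
// Requests a memory page from the pool
abstract MemorySegment requestSegmentFromPool(Object owner);
// Returns a memory page to the pool
abstract void returnSegmentToPool(MemorySegment segment);
// Empties the pool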
abstract void clear();
}
As the code shows, MemoryPool is only an abstract class; for the concrete implementations we need to read on.
The heap memory pool class:
static final class HybridHeapMemoryPool extends MemoryPool {
/** Container for the available memory segments */
private final ArrayDeque<byte[]> availableMemory;
private final int segmentSize;
HybridHeapMemoryPool(int numInitialSegments, int segmentSize) {
this.availableMemory = new ArrayDeque<>(numInitialSegments);
this.segmentSize = segmentSize;
for (int i = 0; i < numInitialSegments; i++) {
this.availableMemory.add(new byte[segmentSize]);
}
}
@Override
MemorySegment allocateNewSegment(Object owner) {
return MemorySegmentFactory.allocateUnpooledSegment(segmentSize, owner);
}
@Override
MemorySegment requestSegmentFromPool(Object owner) {
byte[] buf = availableMemory.remove();
return MemorySegmentFactory.wrapPooledHeapMemory(buf, owner);
}
@Override
void returnSegmentToPool(MemorySegment segment) {
if (segment.getClass() == HybridMemorySegment.class) {
HybridMemorySegment heapSegment = (HybridMemorySegment) segment;
availableMemory.add(heapSegment.getArray());
heapSegment.free();
}
else {
throw new IllegalArgumentException("Memory segment is not a " + HybridMemorySegment.class.getSimpleName());
}
}
@Override
protected int getNumberOfAvailableMemorySegments() {
return availableMemory.size();
}
@Override
void clear() {
availableMemory.clear();
}
}
Now the off-heap memory pool class:
static final class HybridOffHeapMemoryPool extends MemoryPool {
/** Container for the available memory segments */
private final ArrayDeque<ByteBuffer> availableMemory;
private final int segmentSize;
HybridOffHeapMemoryPool(int numInitialSegments, int segmentSize) {
this.availableMemory = new ArrayDeque<>(numInitialSegments);
this.segmentSize = segmentSize;
for (int i = 0; i < numInitialSegments; i++) {
this.availableMemory.add(ByteBuffer.allocateDirect(segmentSize));
}
}
@Override
MemorySegment allocateNewSegment(Object owner) {
return MemorySegmentFactory.allocateUnpooledOffHeapMemory(segmentSize, owner);
}
@Override
MemorySegment requestSegmentFromPool(Object owner) {
ByteBuffer buf = availableMemory.remove();
return MemorySegmentFactory.wrapPooledOffHeapMemory(buf, owner);
}
@Override
void returnSegmentToPool(MemorySegment segment) {
if (segment.getClass() == HybridMemorySegment.class) {
HybridMemorySegment hybridSegment = (HybridMemorySegment) segment;
ByteBuffer buf = hybridSegment.getOffHeapBuffer();
availableMemory.add(buf);
hybridSegment.free();
}
else {
throw new IllegalArgumentException("Memory segment is not a " + HybridMemorySegment.class.getSimpleName());
}
}
@Override
protected int getNumberOfAvailableMemorySegments() {
return availableMemory.size();
}
@Override
void clear() {
availableMemory.clear();
}
}
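At first glance, the heap and off-heap pool classes look identical. On closer inspection, the key difference is the type of the container that stores the memory (the factory methods used to allocate and wrap segments differ accordingly).
In the heap pool, the container is defined like this:
private final ArrayDeque<byte[]> availableMemory;
In the off-heap pool, the container is defined like this:
private final ArrayDeque<ByteBuffer> availableMemory;
This makes the essential difference between heap and off-heap memory fairly clear. Heap memory is allocated on the JVM heap, as byte[] arrays; off-heap memory lives outside the JVM heap as direct buffers, held as java.nio.ByteBuffer. This is one of the more elegant parts of Flink's memory design. Big-data analysis is generally compute-intensive and very memory-hungry, so relying only on the conventional JVM memory model and keeping all data on the heap would make running out of heap memory far more likely. Flink therefore also uses off-heap memory, where ByteBuffer allows efficient I/O.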
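The difference between the two containers can be seen with the standard java.nio API alone. The sketch below is an illustration, not Flink code; the class and method names (`HeapVsDirectDemo`, `heapBacked`, `offHeap`) are made up for this example.

```java
import java.nio.ByteBuffer;

/** Contrasts a heap-backed buffer with a direct (off-heap) buffer. Hypothetical demo class. */
final class HeapVsDirectDemo {

    /** Heap memory: a byte[] on the JVM heap, wrapped so it can be read through the buffer API. */
    static ByteBuffer heapBacked(int size) {
        return ByteBuffer.wrap(new byte[size]);
    }

    /** Off-heap memory: a direct buffer whose storage lives outside the JVM heap. */
    static ByteBuffer offHeap(int size) {
        return ByteBuffer.allocateDirect(size);
    }

    public static void main(String[] args) {
        ByteBuffer heap = heapBacked(32 * 1024);
        ByteBuffer direct = offHeap(32 * 1024);

        // Both kinds of memory are read and written through the same API.
        heap.putLong(0, 42L);
        direct.putLong(0, 42L);

        System.out.println("heap:   isDirect=" + heap.isDirect() + ", hasArray=" + heap.hasArray());
        System.out.println("direct: isDirect=" + direct.isDirect() + ", value=" + direct.getLong(0));
    }
}
```

Because both kinds share one access API, a pool can hand out pages of either kind without the caller changing how it reads or writes them.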
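2. Analysis of the memory manager class
First, note that the memory manager manages memory for a TaskManager. Each TaskManager is divided into slots, and each slot may be running a different task at the same time. With that in mind, let's walk through the source: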
public class MemoryManager {
private static final Logger LOG = LoggerFactory.getLogger(MemoryManager.class);
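/** Default memory page size. Currently 32 KB. */
public static final int DEFAULT_PAGE_SIZE = 32 * 1024;
/** Minimum memory page size. Currently 4 KB. */
public static final int MIN_PAGE_SIZE = 4 * 1024;
/** Lock object */
private final Object lock = new Object();
/** Memory pool from which memory segments are drawn; either a heap or an off-heap pool. */
private final MemoryPool memoryPool;
/** Memory segments allocated per memory owner. */
private final HashMap<Object, Set<MemorySegment>> allocatedSegments;
/** Memory type of this memory manager: HEAP or OFF_HEAP. */
private final MemoryType memoryType;
/** Mask used to round sizes down to a multiple of the page size. */
private final long roundingMask;
/** Size of a memory page. */
private final int pageSize;
/** Total number of pages, computed at construction time. */
private final int totalNumPages;
/** Total size of the memory managed by this memory manager. */
private final long memorySize;
/** Number of slots of the task manager. */
private final int numberOfSlots;
/** Flag indicating whether the memory manager pre-allocates all memory up front. */
private final boolean isPreAllocated;
/** Number of pages not yet allocated; used for lazy (on-demand) allocation. */
private int numNonAllocatedPages;
/** Flag indicating whether the memory manager has been shut down. */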
private boolean isShutDown;
/**
* Constructor.
* Creates a memory manager with the given capacity and the default page size.
*
* @param memorySize total memory size
* @param numberOfSlots number of slots
*/
public MemoryManager(long memorySize, int numberOfSlots) {
this(memorySize, numberOfSlots, DEFAULT_PAGE_SIZE, MemoryType.HEAP, true);
}
public MemoryManager(long memorySize, int numberOfSlots, int pageSize,
MemoryType memoryType, boolean preAllocateMemory) {
// sanity checks
if (memoryType == null) {
throw new NullPointerException();
}
if (memorySize <= 0) {
throw new IllegalArgumentException("Size of total memory must be positive.");
}
if (pageSize < MIN_PAGE_SIZE) {
throw new IllegalArgumentException("The page size must be at least " + MIN_PAGE_SIZE + " bytes.");
}
if (!MathUtils.isPowerOf2(pageSize)) {
throw new IllegalArgumentException("The given page size is not a power of two.");
}
this.memoryType = memoryType;
this.memorySize = memorySize;
this.numberOfSlots = numberOfSlots;
// assign page size and bit utilities
this.pageSize = pageSize;
this.roundingMask = ~((long) (pageSize - 1));
final long numPagesLong = memorySize / pageSize;
if (numPagesLong > Integer.MAX_VALUE) {
throw new IllegalArgumentException("The given number of memory bytes (" + memorySize
+ ") corresponds to more than MAX_INT pages.");
}
// total number of memory pages
this.totalNumPages = (int) numPagesLong;
if (this.totalNumPages < 1) {
throw new IllegalArgumentException("The given amount of memory amounted to less than one page.");
}
this.allocatedSegments = new HashMap<Object, Set<MemorySegment>>();
this.isPreAllocated = preAllocateMemory;
this.numNonAllocatedPages = preAllocateMemory ? 0 : this.totalNumPages;
final int memToAllocate = preAllocateMemory ? this.totalNumPages : 0;
switch (memoryType) {
// heap memory
case HEAP:
this.memoryPool = new HybridHeapMemoryPool(memToAllocate, pageSize);
break;
// off-heap memory
case OFF_HEAP:
if (!preAllocateMemory) {
LOG.warn("It is advisable to set 'taskmanager.memory.preallocate' to true when" +
" the memory type 'taskmanager.memory.off-heap' is set to true.");
}
this.memoryPool = new HybridOffHeapMemoryPool(memToAllocate, pageSize);
break;
default:
throw new IllegalArgumentException("unrecognized memory type: " + memoryType);
}
LOG.debug("Initialized MemoryManager with total memory size {}, number of slots {}, page size {}, " +
"memory type {}, pre allocate memory {} and number of non allocated pages {}.",
memorySize,
numberOfSlots,
pageSize,
memoryType,
preAllocateMemory,
numNonAllocatedPages);
}
// ------------------------------------------------------------------------
// Shutdown
// ------------------------------------------------------------------------
/**
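* Shuts down the memory manager and tries to release all the memory it manages.
* Depending on implementation details, the garbage collector may not be able to reclaim the memory, because code that allocated memory from the memory manager may still hold references to the allocated segments.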
*/
public void shutdown() {
// -------------------- BEGIN CRITICAL SECTION -------------------
synchronized (lock) {
if (!isShutDown) {
// mark as shut down and release the memory
isShutDown = true;
numNonAllocatedPages = 0;
// go over all allocated segments and free them
for (Set<MemorySegment> segments : allocatedSegments.values()) {
for (MemorySegment seg : segments) {
seg.free();
}
}
memoryPool.clear();
}
}
// -------------------- END CRITICAL SECTION -------------------
}
/**
* Checks whether the memory manager has been shut down.
*
* @return True, if the memory manager is shut down, false otherwise.
*/
public boolean isShutdown() {
return isShutDown;
}
/**
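* Checks whether all memory has been released, i.e. no segments are currently allocated.
*
* @return True if no memory segments are currently allocated; false otherwise.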
*/
public boolean verifyEmpty() {
synchronized (lock) {
return isPreAllocated ?
memoryPool.getNumberOfAvailableMemorySegments() == totalNumPages :
numNonAllocatedPages == totalNumPages;
}
}
// ------------------------------------------------------------------------
// Memory allocation and release
// ------------------------------------------------------------------------
/**
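* Allocates a set of memory pages from the memory manager.
* If the memory manager pre-allocates all memory, the pages are taken directly from the pool;
* if it allocates on demand, new segments are allocated.
*
* @param owner the owner that requests the memory pages
* @param numPages the number of pages to allocate
* @return the list of allocated memory pages
* @throws MemoryAllocationException Thrown if the request exceeds the number of pages still available.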
*/
public List<MemorySegment> allocatePages(Object owner, int numPages) throws MemoryAllocationException {
final ArrayList<MemorySegment> segs = new ArrayList<MemorySegment>(numPages);
allocatePages(owner, segs, numPages);
return segs;
}
public void allocatePages(Object owner, List<MemorySegment> target, int numPages)
throws MemoryAllocationException {
// sanity check
if (owner == null) {
throw new IllegalArgumentException("The memory owner must not be null.");
}
// reserve array space, if applicable
if (target instanceof ArrayList) {
// ensure a minimum capacity; the list can still grow dynamically
((ArrayList<MemorySegment>) target).ensureCapacity(numPages);
}
// -------------------- BEGIN CRITICAL SECTION -------------------
synchronized (lock) {
if (isShutDown) {
throw new IllegalStateException("Memory manager has been shut down.");
}
// in the case of pre-allocated memory, the 'numNonAllocatedPages' is zero, in the
// lazy case, the 'freeSegments.size()' is zero.
if (numPages > (memoryPool.getNumberOfAvailableMemorySegments() + numNonAllocatedPages)) {
throw new MemoryAllocationException("Could not allocate " + numPages + " pages. Only " +
(memoryPool.getNumberOfAvailableMemorySegments() + numNonAllocatedPages)
+ " pages are remaining.");
}
Set<MemorySegment> segmentsForOwner = allocatedSegments.get(owner);
if (segmentsForOwner == null) {
segmentsForOwner = new HashSet<MemorySegment>(numPages);
allocatedSegments.put(owner, segmentsForOwner);
}
if (isPreAllocated) { // everything was pre-allocated
for (int i = numPages; i > 0; i--) {
MemorySegment segment = memoryPool.requestSegmentFromPool(owner);
target.add(segment);
segmentsForOwner.add(segment);
}
}
else {
for (int i = numPages; i > 0; i--) {
MemorySegment segment = memoryPool.allocateNewSegment(owner);
target.add(segment);
segmentsForOwner.add(segment);
}
numNonAllocatedPages -= numPages;
}
}
// -------------------- END CRITICAL SECTION -------------------
}
/**
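* Releases the memory of a specific memory page.
* If the memory manager manages pre-allocated memory, the page is returned to the memory pool. Otherwise the segment is only freed and becomes eligible for garbage collection.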
*/
public void release(MemorySegment segment) {
// check if segment is null or has already been freed
if (segment == null || segment.getOwner() == null) {
return;
}
final Object owner = segment.getOwner();
// -------------------- BEGIN CRITICAL SECTION -------------------
synchronized (lock) {
// prevent double return to this memory manager
if (segment.isFreed()) {
return;
}
if (isShutDown) {
throw new IllegalStateException("Memory manager has been shut down.");
}
// remove the reference in the map for the owner
try {
Set<MemorySegment> segsForOwner = this.allocatedSegments.get(owner);
if (segsForOwner != null) {
segsForOwner.remove(segment);
if (segsForOwner.isEmpty()) {
this.allocatedSegments.remove(owner);
}
}
if (isPreAllocated) {
// release the memory in any case
memoryPool.returnSegmentToPool(segment);
}
else {
segment.free();
numNonAllocatedPages++;
}
}
catch (Throwable t) {
throw new RuntimeException("Error removing book-keeping reference to allocated memory segment.", t);
}
}
// -------------------- END CRITICAL SECTION -------------------
}
/**
* Releases many segments at once.
* Each segment is released the same way as in the single-segment method above.
*/
public void release(Collection<MemorySegment> segments) {
if (segments == null) {
return;
}
// -------------------- BEGIN CRITICAL SECTION -------------------
synchronized (lock) {
if (isShutDown) {
throw new IllegalStateException("Memory manager has been shut down.");
}
// since concurrent modifications to the collection
// can disturb the release, we need to try potentially multiple times
boolean successfullyReleased = false;
do {
final Iterator<MemorySegment> segmentsIterator = segments.iterator();
Object lastOwner = null;
Set<MemorySegment> segsForOwner = null;
try {
// go over all segments
while (segmentsIterator.hasNext()) {
final MemorySegment seg = segmentsIterator.next();
if (seg == null || seg.isFreed()) {
continue;
}
final Object owner = seg.getOwner();
try {
// get the list of segments by this owner only if it is a different owner than for
// the previous one (or it is the first segment)
if (lastOwner != owner) {
lastOwner = owner;
segsForOwner = this.allocatedSegments.get(owner);
}
// remove the segment from the list
if (segsForOwner != null) {
segsForOwner.remove(seg);
if (segsForOwner.isEmpty()) {
this.allocatedSegments.remove(owner);
}
}
if (isPreAllocated) {
memoryPool.returnSegmentToPool(seg);
}
else {
seg.free();
numNonAllocatedPages++;
}
}
catch (Throwable t) {
throw new RuntimeException(
"Error removing book-keeping reference to allocated memory segment.", t);
}
}
segments.clear();
// the only way to exit the loop
successfullyReleased = true;
}
catch (ConcurrentModificationException | NoSuchElementException e) {
// this may happen in the case where an asynchronous
// call releases the memory. fall through the loop and try again
}
} while (!successfullyReleased);
}
// -------------------- END CRITICAL SECTION -------------------
}
/**
* Releases all segments that belong to the given owner.
*
* @param owner The owner memory segments are to be released.
*/
public void releaseAll(Object owner) {
if (owner == null) {
return;
}
// -------------------- BEGIN CRITICAL SECTION -------------------
synchronized (lock) {
if (isShutDown) {
throw new IllegalStateException("Memory manager has been shut down.");
}
// get all segments
final Set<MemorySegment> segments = allocatedSegments.remove(owner);
// all segments may have been freed previously individually
if (segments == null || segments.isEmpty()) {
return;
}
// free each segment
if (isPreAllocated) {
for (MemorySegment seg : segments) {
memoryPool.returnSegmentToPool(seg);
}
}
else {
for (MemorySegment seg : segments) {
seg.free();
}
numNonAllocatedPages += segments.size();
}
segments.clear();
}
// -------------------- END CRITICAL SECTION -------------------
}
// ------------------------------------------------------------------------
// Properties, sizes, and size conversions
// ------------------------------------------------------------------------
/**
* Returns the memory type: heap or off-heap.
*/
public MemoryType getMemoryType() {
return memoryType;
}
/**
* Returns whether all memory is pre-allocated.
*
* @return True if the memory manager pre-allocates the memory, false if it allocates as needed.
*/
public boolean isPreAllocated() {
return isPreAllocated;
}
/**
* Returns the size of a memory page.
* @return The size of the pages handled by the memory manager.
*/
public int getPageSize() {
return pageSize;
}
/**
* Returns the total memory size of the memory manager.
*
* @return The total size of memory.
*/
public long getMemorySize() {
return memorySize;
}
/**
* Returns the total number of pages in this memory pool.
* @return The total number of memory pages managed by this memory manager.
*/
public int getTotalNumPages() {
return totalNumPages;
}
/**
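* Computes how many pages correspond to the given fraction of the total memory, per slot. The result is truncated to a whole number of pages, so a remainder smaller than one page is not included.
*
* @param fraction the fraction of the total memory per slot
* @return The number of pages corresponding to the memory fraction, per slot.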
*/
public int computeNumberOfPages(double fraction) {
if (fraction <= 0 || fraction > 1) {
throw new IllegalArgumentException("The fraction of memory to allocate must within (0, 1].");
}
return (int) (totalNumPages * fraction / numberOfSlots);
}
/**
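* Computes the amount of memory, in bytes, that each slot receives.
*
* @param fraction the fraction of the total memory given to the task slots (the task manager does not hand all of its memory to the slots)
* @return The size in bytes of the memory corresponding to the fraction, per slot.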
*/
public long computeMemorySize(double fraction) {
return pageSize * (long) computeNumberOfPages(fraction);
}
/**
* Rounds the given value down to a multiple of the memory manager's page size.
*
* @return The given value, rounded down to a multiple of the page size.
*/
public long roundDownToPageSizeMultiple(long numBytes) {
return numBytes & roundingMask;
}
}
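Most of the explanation of MemoryManager is in the code comments above. In short, the hard parts are how pages are requested from the memory pool and how they are released and reclaimed. These are worth studying closely.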
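The size arithmetic at the end of the class is easy to check by hand. The sketch below copies the two formulas from the listing (the rounding mask and the pages-per-slot computation) into a standalone class; `PageMath`, its method names, and the sample numbers (1 GiB of memory, 4 slots, fraction 0.5) are assumptions chosen only for illustration.

```java
/** Standalone copy of the page arithmetic from the listing, with made-up sample values. */
final class PageMath {

    /** Rounds numBytes down to a multiple of pageSize; pageSize must be a power of two. */
    static long roundDown(long numBytes, int pageSize) {
        long roundingMask = ~((long) (pageSize - 1)); // clears the low bits
        return numBytes & roundingMask;
    }

    /** Same formula as computeNumberOfPages: a slot's share of the total pages. */
    static int pagesPerSlot(int totalNumPages, double fraction, int numberOfSlots) {
        return (int) (totalNumPages * fraction / numberOfSlots);
    }

    public static void main(String[] args) {
        int pageSize = 32 * 1024;                           // the default page size
        long memorySize = 1024L * 1024 * 1024;              // 1 GiB, sample value
        int totalNumPages = (int) (memorySize / pageSize);  // 32768 pages

        System.out.println(roundDown(100_000L, pageSize));  // 98304 = 3 pages

        int pages = pagesPerSlot(totalNumPages, 0.5, 4);
        System.out.println(pages);                          // 4096 pages per slot
        System.out.println((long) pageSize * pages);        // 134217728 bytes = 128 MiB
    }
}
```

The mask trick only works because the constructor rejects page sizes that are not a power of two.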
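Some open questions remain, for example:
- Can heap memory and off-heap memory be converted into each other?
- How does the system know when to request heap memory and when to request off-heap memory?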
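On the second question, the listing gives a partial hint: the constructor receives a MemoryType, and its warning message names the configuration keys 'taskmanager.memory.off-heap' and 'taskmanager.memory.preallocate'. A plausible reading is that the caller derives the memory type from configuration before constructing the manager. The sketch below shows that idea only; `MemoryTypeChooser`, its local `MemoryType` enum, and the use of `java.util.Properties` are stand-ins invented for this example, not Flink's actual configuration code.

```java
import java.util.Properties;

/** Hypothetical sketch: deriving the memory type from a configuration flag. */
final class MemoryTypeChooser {

    /** Local stand-in for the MemoryType used in the listing. */
    enum MemoryType { HEAP, OFF_HEAP }

    /** Picks OFF_HEAP only when the off-heap key is set to true; HEAP otherwise. */
    static MemoryType choose(Properties conf) {
        // Key name taken from the warning message in the constructor shown earlier.
        String value = conf.getProperty("taskmanager.memory.off-heap", "false");
        return Boolean.parseBoolean(value) ? MemoryType.OFF_HEAP : MemoryType.HEAP;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(choose(conf));
        conf.setProperty("taskmanager.memory.off-heap", "true");
        System.out.println(choose(conf));
    }
}
```

Under this reading the choice is made once, at construction time, and a single manager never mixes the two kinds of memory.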
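Summary
This post only covers the memory manager of the task manager, yet it already touches a fair amount of low-level knowledge. Compared with C++ engineers, we Java engineers tend to find low-level memory operations difficult. I am well aware that the analysis above still does not reach the lowest level. If I learn more, I will update this post.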