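Handling Common File Types

This section briefly introduces how to use the Java API and some third-party libraries to handle the following five types of files: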

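Properties files: a common kind of configuration file, used to change a program's behavior without changing its code.
CSV: CSV stands for Comma-Separated Values and is a very common file type. Many log files are CSV, and CSV is often used to exchange tabular data. As we will see shortly, CSV looks simple, but the complexity of handling it is often underestimated.
Excel: programs often need to export tabular data in Excel format so that users can view it easily, and often need to accept Excel files as input for batch data import.
HTML: web pages are written in HTML, and we often need to analyze HTML pages to extract information of interest.
Compressed files: there are many compression formats and many compression tools. In most cases we can rely on those tools and do not need to write a program ourselves, but in some cases we need to compress or decompress files programmatically.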

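Properties Files

A properties file is usually very simple: each line represents one property, a property is a key-value pair, and the key and value are separated by an equals sign (=) or a colon (:). Such files are generally used to configure program parameters. Programs that connect to a database, for example, often keep the database settings in a configuration file. Suppose there is a file config.properties with roughly the following content: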

db.host = 192.168.10.100
db.port : 3306
db.username = zhangsan
db.password = mima1234

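Handling such a file with character streams is fairly easy, but Java has a dedicated class, java.util.Properties, which is also simple to use. Its main methods are:

public synchronized void load(InputStream inStream) throws IOException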
public String getProperty(String key)
public String getProperty(String key, String defaultValue)

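load loads properties from a stream, and getProperty returns the value of a property. A default value can be supplied; if no configured value is found, the default is returned. The configuration file above can be read with code like the following: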

Properties prop = new Properties();
// try-with-resources closes the stream even if load fails
try (InputStream in = new FileInputStream("config.properties")) {
    prop.load(in);
}
String host = prop.getProperty("db.host");
int port = Integer.parseInt(prop.getProperty("db.port", "3306"));

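The benefits of using the Properties class are:

Whitespace is handled automatically: spaces before and after the separator = are ignored.
Blank lines are ignored automatically.
Comments are supported: lines starting with # or ! are treated as comments and ignored.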
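The points above can be checked with a small sketch. It loads hypothetical sample text (modeled on config.properties, with a comment and a blank line added for illustration) through Properties.load(Reader), which follows the same line rules as load(InputStream). The class name, the SAMPLE constant, and the db.timeout key are assumptions made for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class PropertiesDemo {
    // Hypothetical sample content: comments, a blank line, and both separators.
    static final String SAMPLE =
            "# database settings\n"
          + "\n"
          + "db.host = 192.168.10.100\n"
          + "db.port : 3306\n"
          + "! another comment style\n"
          + "db.username=zhangsan\n";

    // Parses properties text held in a String.
    static Properties parse(String text) throws IOException {
        Properties prop = new Properties();
        prop.load(new StringReader(text));
        return prop;
    }

    public static void main(String[] args) throws IOException {
        Properties prop = parse(SAMPLE);
        // Spaces around the separator are not part of the key or the value.
        System.out.println(prop.getProperty("db.host"));
        System.out.println(prop.getProperty("db.port"));
        // db.timeout is not defined, so the supplied default is returned.
        System.out.println(prop.getProperty("db.timeout", "30"));
        // Comment lines and blank lines do not produce entries.
        System.out.println(prop.size());
    }
}
```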

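Properties also has a limitation: a file loaded through load(InputStream) is read as ISO 8859-1, so Chinese cannot be written directly, and non-ASCII characters in the configuration file have to be written as Unicode escapes. For example, you cannot write the following directly in the configuration file:

name=老马

"老马" has to be replaced with its Unicode escapes, as follows: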

name=\u8001\u9A6C

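In a Java IDE such as Eclipse, the properties file editor replaces Chinese with Unicode escapes automatically. With other editors, you can write the Chinese text first and then convert it with the native2ascii command provided by the JDK (JDK 8 and earlier; the tool was removed in JDK 9). For example: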

native2ascii -encoding UTF-8 native.properties ascii.properties

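native.properties is the input and contains Chinese; ascii.properties is the output, with the Chinese replaced by Unicode escapes; -encoding specifies the encoding of the input file, here UTF-8.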
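As a hedged illustration of the encoding issue, the sketch below loads the escaped form through load(InputStream) and, as an alternative not covered above, loads UTF-8 text through load(Reader), which has been available since Java 6 and lets the caller choose the encoding. The class and method names are made up for this example, and the data is kept in memory instead of in a file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class PropertiesEncodingDemo {
    // Escaped form, as produced by native2ascii, read through load(InputStream).
    static String loadEscaped() throws IOException {
        // The doubled backslashes keep the escapes as literal text in the data.
        byte[] bytes = "name=\\u8001\\u9A6C\n".getBytes(StandardCharsets.ISO_8859_1);
        Properties prop = new Properties();
        prop.load(new ByteArrayInputStream(bytes));
        return prop.getProperty("name");
    }

    // Plain UTF-8 text read through load(Reader) with an explicit charset.
    static String loadUtf8() throws IOException {
        // Here the two characters themselves are encoded as UTF-8 bytes.
        byte[] bytes = "name=\u8001\u9A6C\n".getBytes(StandardCharsets.UTF_8);
        Properties prop = new Properties();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8)) {
            prop.load(reader);
        }
        return prop.getProperty("name");
    }

    public static void main(String[] args) throws IOException {
        // Both routes should yield the same two-character name.
        System.out.println(loadEscaped().equals(loadUtf8()));
    }
}
```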

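CSV Files

CSV stands for Comma-Separated Values. Typically, one line represents one record, a record contains multiple fields, and fields are separated by commas. In practice, however, the delimiter is not necessarily a comma; it may be another character such as a tab ('\t'), a colon (':'), or a semicolon (';'). Log files produced by programs are often CSV files, and CSV is also a commonly used format for importing and exporting tabular data.
CSV looks very simple. For example, when we saved the student list in the previous chapter, we used the CSV format: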

张三,18,80.9
李四,17,67.5

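With the character streams introduced earlier, handling CSV seems easy: read the file line by line and split each line with String.split. In fact, CSV has some complications, the most important being:

What if a field's content contains the delimiter?
What if a field's content contains a line break?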

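For these questions CSV has a reference standard, RFC 4180, but in practice different programs often handle them in their own ways. Fortunately, the approaches are broadly similar, and there are roughly two of them.

Use a quote character such as ": wrap the field content in " on both sides, and if the content itself contains ", write it as two ".
Use an escape character, commonly \: prefix special characters with \, and if the content contains \ itself, write it as two \.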

For example, suppose a field's content has two lines:

hello, world \ abc
"老马"

With the first approach, the content becomes:

"hello, world \ abc
""老马"""

With the second approach, the content becomes:

hello\, world \\ abc\n"老马"
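To make the first (quoting) convention concrete, here is a minimal hand-written sketch of a parser for it. It is not how Apache Commons CSV or any other library is implemented; the class name, the SAMPLE text, and the fixed comma delimiter are assumptions for illustration only. It also shows why splitting on line breaks first gives the wrong record count.

```java
import java.util.ArrayList;
import java.util.List;

public class QuotedCsvSketch {
    // Two records; the second field of the first record contains a comma,
    // a line break, and quote characters.
    static final String SAMPLE =
            "1,\"hello, world \\ abc\n\"\"\u8001\u9A6C\"\"\"\n2,plain\n";

    // Parses comma-delimited text where fields may be wrapped in double quotes
    // and a quote inside a quoted field is written as two quotes.
    static List<List<String>> parse(String text) {
        List<List<String>> records = new ArrayList<>();
        List<String> record = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < text.length() && text.charAt(i + 1) == '"') {
                        field.append('"');   // doubled quote is a literal quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    field.append(c);         // commas and line breaks are data here
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                record.add(field.toString());
                field.setLength(0);
            } else if (c == '\n') {
                record.add(field.toString());
                field.setLength(0);
                records.add(record);
                record = new ArrayList<>();
            } else if (c != '\r') {
                field.append(c);
            }
        }
        // Flush the last record when the text does not end with a line break.
        if (field.length() > 0 || !record.isEmpty()) {
            record.add(field.toString());
            records.add(record);
        }
        return records;
    }

    public static void main(String[] args) {
        // Naive line splitting sees three lines for two records.
        System.out.println(SAMPLE.split("\n").length);
        List<List<String>> records = parse(SAMPLE);
        System.out.println(records.size());
        System.out.println(records.get(0).get(1));
    }
}
```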

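CSV has other details as well, and different programs handle them differently, for example:

How to represent a null value
How to treat blank lines and whitespace between fields
How to represent comments

Plain character streams make these complications hard to handle. A third-party library, Apache Commons CSV, provides good support for CSV; see its official website for details. Here we briefly introduce its usage. An important class in Apache Commons CSV is CSVFormat, which represents a CSV format and has many methods for defining the specifics, such as: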

// set the delimiter
public CSVFormat withDelimiter(final char delimiter)
// set the quote character
public CSVFormat withQuote(final char quoteChar)
// set the escape character
public CSVFormat withEscape(final char escape)
// set the string that represents a null value
public CSVFormat withNullString(final String nullString)
// set the separator between records
public CSVFormat withRecordSeparator(final char recordSeparator)
// set whether whitespace around fields is ignored
public CSVFormat withIgnoreSurroundingSpaces(
    final boolean ignoreSurroundingSpaces)

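For example, if the CSV format uses a semicolon (;) as the delimiter, " as the quote character, N/A to represent null, and ignores whitespace around fields, the CSVFormat can be created as follows:

CSVFormat format = CSVFormat.newFormat(';')
        .withQuote('"').withNullString("N/A")
        .withIgnoreSurroundingSpaces(true);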

Besides custom formats, the CSVFormat class also defines some predefined formats, such as CSVFormat.DEFAULT and CSVFormat.RFC4180.
CSVFormat has a method that parses a character stream:

public CSVParser parse(final Reader in) throws IOException

The return type is CSVParser, which has the following methods for obtaining records:

public Iterator<CSVRecord> iterator()
public List<CSVRecord> getRecords() throws IOException
public long getRecordNumber()

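CSVRecord represents one record and has the following methods for accessing its fields:

// get a value by column index, starting from 0
public String get(final int i)
// get a value by column name
public String get(final String name)
// number of fields
public int size()
// iterator over the fields
public Iterator<String> iterator()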

The basic code for parsing a CSV file is as follows:

CSVFormat format = CSVFormat.newFormat(';')
        .withQuote('"').withNullString("N/A")
        .withIgnoreSurroundingSpaces(true);
Reader reader = new FileReader("student.csv");
try{
    for(CSVRecord record : format.parse(reader)){
        int fieldNum = record.size();
        for(int i=0; i<fieldNum; i++){
            System.out.print(record.get(i)+" ");
        }
        System.out.println();
    }
}finally{
    reader.close();
}

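Besides parsing, Apache Commons CSV can also write CSV files through the CSVPrinter class, which has many print methods, such as:

// print one record; each argument is one field value
public void printRecord(final Object... values) throws IOException
// print one record from an Iterable of field values
public void printRecord(final Iterable<?> values) throws IOException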

Example code:

CSVPrinter out = new CSVPrinter(new FileWriter("student.csv"),
        CSVFormat.DEFAULT);
out.printRecord("老马", 18, "看电影,看书,听音乐");
out.printRecord("小马", 16, "乐高;赛车;");
out.close();

The content of the output file student.csv is:

"老马",18,"看电影,看书,听音乐"
"小马",16,乐高;赛车;

Excel

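Excel has two main formats, with the extensions .xls and .xlsx. .xlsx is the default extension for Excel files in Office 2007 and later. The POI library is widely used in Java to process Excel files and other Microsoft documents; see its official website for details. The main types used when handling Excel files with POI are:

Workbook: represents an Excel file. It is an interface with two main implementations, HSSFWorkbook and XSSFWorkbook; the former corresponds to the .xls format and the latter to the .xlsx format.
Sheet: represents a worksheet.
Row: represents a row.
Cell: represents a cell.

For example, code that saves a student list to student.xls could be: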

public static void saveAsExcel(List<Student> list) throws IOException {
    Workbook wb = new HSSFWorkbook();
    Sheet sheet = wb.createSheet();
    for(int i = 0; i < list.size(); i++) {
        Student student = list.get(i);
        Row row = sheet.createRow(i);
        row.createCell(0).setCellValue(student.getName());
        row.createCell(1).setCellValue(student.getAge());
        row.createCell(2).setCellValue(student.getScore());
    }
    OutputStream out = new FileOutputStream("student.xls");
    wb.write(out);
    out.close();
    wb.close();
}

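To save in the .xlsx format instead, replace the first line with the following (and give the output file an .xlsx name):

Workbook wb = new XSSFWorkbook();

POI also makes it easy to parse Excel files, using the create method of WorkbookFactory, as shown below: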

public static List<Student> readAsExcel() throws Exception   {
  Workbook wb = WorkbookFactory.create(new File("student.xls"));
  List<Student> list = new ArrayList<Student>();
  for(Sheet sheet : wb){
      for(Row row : sheet){
          String name = row.getCell(0).getStringCellValue();
          int age = (int)row.getCell(1).getNumericCellValue();
          double score = row.getCell(2).getNumericCellValue();
          list.add(new Student(name, age, score));
      }
  }
  wb.close();
  return list;
}

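The above only covers the basics. For more, such as setting cell formats, colors, and fonts, see the POI documentation.

HTML

HTML is the format of web pages. In day-to-day work you may need to analyze HTML pages and extract information of interest. There are many HTML parsers; we briefly introduce one of them, jsoup.
Suppose we want to extract the title and link of each article in the main content of a web page. How can this be done?
The CSS selector that locates the article list can be:

#cnblogs_post_body p a

Let's look at the code (assuming the file is articles.html):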

Document doc = Jsoup.parse(new File("articles.html"), "UTF-8");
Elements elements = doc.select("#cnblogs_post_body p a");
for(Element e : elements){
    String title = e.text();
    String href = e.attr("href");
    System.out.println(title+", "+href);
}

The output is (partial):

计算机程序的思维逻辑 (1) - 数据和变量, http://www.cnblogs.com/swiftma/p/5396551.html
计算机程序的思维逻辑 (2) - 赋值, http://www.cnblogs.com/swiftma/p/5399315.html

jsoup can also connect to a URL directly and analyze the page. For example, the first line of the code above can be replaced with the following, where url is a String holding the address of the page:

Document doc = Jsoup.connect(url).get();

For more on jsoup, see its official website.

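Compressed Files

There are many compression formats. The Java SDK supports two of them, gzip and zip. gzip compresses a single file only, while a zip file can contain multiple files.
Let's look at gzip first. There are two main classes: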

java.util.zip.GZIPOutputStream
java.util.zip.GZIPInputStream

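They are subclasses of OutputStream and InputStream respectively, and both are decorators: wrapping an existing stream in GZIPOutputStream compresses the data, and wrapping an existing stream in GZIPInputStream decompresses it. For example, code to compress a file could be: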

public static void gzip(String fileName) throws IOException {
    InputStream in = null;
    String gzipFileName = fileName + ".gz";
    OutputStream out = null;
    try {
        in = new BufferedInputStream(new FileInputStream(fileName));
        out = new GZIPOutputStream(new BufferedOutputStream(
                new FileOutputStream(gzipFileName)));
        copy(in, out);
    } finally {
        if(out != null) {
            out.close();
        }
        if(in != null) {
            in.close();
        }
    }
}

The copy method called here was introduced in the previous chapter. Code to decompress a file could be:

public static void gunzip(String gzipFileName, String unzipFileName) throws IOException {
    InputStream in = null;
    OutputStream out = null;
    try {
        in = new GZIPInputStream(new BufferedInputStream(
                new FileInputStream(gzipFileName)));
        out = new BufferedOutputStream(new FileOutputStream(
                unzipFileName));
        copy(in, out);
    } finally {
        if(out != null) {
            out.close();
        }
        if(in != null) {
            in.close();
        }
    }
}
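The decorator idea can be tried without touching the file system. The following sketch, under the assumption that copy is a plain read/write loop (the version from the previous chapter is not shown here, so this one is a stand-in), compresses and decompresses a byte array in memory. The class and method names are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipRoundTrip {
    // Stand-in for the copy helper mentioned in the text (assumed shape).
    static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
    }

    // Wrapping the target stream in GZIPOutputStream compresses what is written.
    static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (OutputStream out = new GZIPOutputStream(bytes)) {
            out.write(data);
        } // closing the stream finishes the gzip data
        return bytes.toByteArray();
    }

    // Wrapping the source stream in GZIPInputStream decompresses what is read.
    static byte[] gunzip(byte[] compressed) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            copy(in, bytes);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "hello gzip, hello gzip, hello gzip".getBytes(StandardCharsets.UTF_8);
        byte[] restored = gunzip(gzip(original));
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```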

A zip file can contain multiple files in one archive. The main classes in the Java API are:

java.util.zip.ZipOutputStream
java.util.zip.ZipInputStream

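They too are subclasses of OutputStream and InputStream respectively, and both are decorators, but they cannot be used as simply as GZIPOutputStream/GZIPInputStream.

ZipOutputStream can write multiple files. It has an important method: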

public void putNextEntry(ZipEntry e) throws IOException

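This method must be called before each file is written; it signals that a zip entry, ZipEntry, is about to be written. Each entry has a name, which is the relative path of the file inside the archive; a name ending with the character '/' denotes a directory. The ZipEntry constructor is: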

public ZipEntry(String name)

Let's look at some code that compresses a file or a directory:

public static void zip(File inFile, File zipFile) throws IOException {
    ZipOutputStream out = new ZipOutputStream(new BufferedOutputStream(
            new FileOutputStream(zipFile)));
    try {
        if(!inFile.exists()) {
            throw new FileNotFoundException(inFile.getAbsolutePath());
        }
        inFile = inFile.getCanonicalFile();
        String rootPath = inFile.getParent();
        if(!rootPath.endsWith(File.separator)) {
            rootPath += File.separator;
        }
        addFileToZipOut(inFile, out, rootPath);
    } finally {
        out.close();
    }
}

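The parameter inFile is the input, which can be a regular file or a directory; zipFile is the output; rootPath is the parent directory, used to compute each file's relative path. The method mainly calls addFileToZipOut to add files to the ZipOutputStream: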

private static void addFileToZipOut(File file, ZipOutputStream out, String rootPath) throws IOException {
    // ZIP entry names use '/' as the separator on every platform
    String relativePath = file.getCanonicalPath().substring(
                rootPath.length()).replace(File.separatorChar, '/');
    if(file.isFile()) {
        out.putNextEntry(new ZipEntry(relativePath));
        InputStream in = new BufferedInputStream(new FileInputStream(file));
        try {
            copy(in, out);
        } finally {
            in.close();
        }
    } else {
        // a name ending in '/' marks a directory entry
        out.putNextEntry(new ZipEntry(relativePath + "/"));
        for(File f : file.listFiles()) {
            addFileToZipOut(f, out, rootPath);
        }
    }
}

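It also calls copy to write file content into the ZipOutputStream, and recurses for directories. ZipInputStream is used to decompress zip files; it has a corresponding method for getting entries: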

public ZipEntry getNextEntry() throws IOException

A return value of null means there are no more entries. Code like the following can decompress a file with ZipInputStream:

public static void unzip(File zipFile, String destDir) throws IOException {
    ZipInputStream zin = new ZipInputStream(new BufferedInputStream(
            new FileInputStream(zipFile)));
    if(!destDir.endsWith(File.separator)) {
        destDir += File.separator;
    }
    try {
        ZipEntry entry = zin.getNextEntry();
        while(entry != null) {
            extractZipEntry(entry, zin, destDir);
            entry = zin.getNextEntry();
        }
    } finally {
        zin.close();
    }
}

extractZipEntry is called to process each entry:

private static void extractZipEntry(ZipEntry entry, ZipInputStream zin, String destDir) throws IOException {
    if(!entry.isDirectory()) {
        File parent = new File(destDir + entry.getName()).getParentFile();
        if(!parent.exists()) {
            parent.mkdirs();
        }
        OutputStream entryOut = new BufferedOutputStream(
                new FileOutputStream(destDir + entry.getName()));
        try {
            copy(zin, entryOut);
        } finally {
            entryOut.close();
        }
    } else {
        new File(destDir + entry.getName()).mkdirs();
    }
}
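The putNextEntry/getNextEntry protocol can also be exercised entirely in memory. The sketch below is a simplified illustration, not a replacement for the file-based code above: entry names and contents come from a Map instead of the file system, and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipRoundTrip {
    // Writes each map entry as one zip entry; names ending in '/' are directories.
    static byte[] zip(Map<String, byte[]> files) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(bytes)) {
            for (Map.Entry<String, byte[]> e : files.entrySet()) {
                out.putNextEntry(new ZipEntry(e.getKey())); // start a new entry
                out.write(e.getValue());                    // entry content
                out.closeEntry();
            }
        }
        return bytes.toByteArray();
    }

    // Reads entries until getNextEntry returns null, skipping directory entries.
    static Map<String, byte[]> unzip(byte[] zipped) throws IOException {
        Map<String, byte[]> files = new LinkedHashMap<>();
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(zipped))) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                ByteArrayOutputStream content = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                int n;
                // read returns -1 at the end of the current entry
                while ((n = zin.read(buf)) != -1) {
                    content.write(buf, 0, n);
                }
                files.put(entry.getName(), content.toByteArray());
            }
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        Map<String, byte[]> files = new LinkedHashMap<>();
        files.put("docs/", new byte[0]);
        files.put("docs/a.txt", "hello".getBytes("UTF-8"));
        files.put("b.txt", "world".getBytes("UTF-8"));
        for (Map.Entry<String, byte[]> e : unzip(zip(files)).entrySet()) {
            System.out.println(e.getKey() + " -> " + new String(e.getValue(), "UTF-8"));
        }
    }
}
```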

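Random File Access

Here we first introduce how to use RandomAccessFile, and then show how to use it to implement a simple key-value database.

Usage

RandomAccessFile has the following constructors: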

public RandomAccessFile(String name, String mode) throws FileNotFoundException
public RandomAccessFile(File file, String mode) throws FileNotFoundException

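The parameters name and file are straightforward: a file path and a File object. What about mode? It is the open mode, and it can take four values.

"r": read only.
"rw": read and write.
"rws": like "rw", for reading and writing; in addition, every update to the file's content or metadata must be synchronized to the storage device.
"rwd": like "rw", for reading and writing; in addition, every update to the file's content must be synchronized to the storage device. Unlike "rws", metadata updates need not be synchronized.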

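Although RandomAccessFile is not a subclass of InputStream/OutputStream, it has similar methods for reading and writing bytes. It also implements the DataInput/DataOutput interfaces. We have covered most of these methods before; a few are listed here as a reminder:

// read one byte; the result is the low 8 bits, in the range 0-255, or -1 at end of file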
public int read() throws IOException
public int read(byte b[]) throws IOException
public final int readInt() throws IOException
public final void writeInt(int v) throws IOException
public void write(byte b[]) throws IOException

RandomAccessFile has two more read methods:

public final void readFully(byte b[]) throws IOException
public final void readFully(byte b[], int off, int len) throws IOException

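Unlike the corresponding read methods, these guarantee that the expected number of bytes is read; if the end of the file is reached first, they throw EOFException.
RandomAccessFile maintains an internal file pointer that marks the current read/write position, and every read/write operation updates it automatically. Unlike a stream, RandomAccessFile lets you get this pointer and change it. The relevant methods are:

// get the current file pointer
public native long getFilePointer() throws IOException
// move the file pointer to pos
public native void seek(long pos) throws IOException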

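RandomAccessFile adjusts the file pointer through native methods that ultimately call the operating system's API.
InputStream has a skip method that skips n bytes of the input stream; by default it does so by actually reading n bytes. RandomAccessFile has a similar method, but it is implemented by changing the file pointer: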

public int skipBytes(int n) throws IOException
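A short sketch of the file pointer in action: it writes ten int values to a temporary file, then jumps straight to one of them with seek, and checks where skipBytes leaves the pointer. The class name, method names, and temp-file prefix are assumptions for this illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class SeekDemo {
    // Writes the squares 0, 1, 4, ... as 4-byte ints to a temporary file.
    static File writeSquares(int count) throws IOException {
        File file = File.createTempFile("seek-demo", ".bin");
        file.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            for (int i = 0; i < count; i++) {
                raf.writeInt(i * i);
            }
        }
        return file;
    }

    // Reads the int at the given index without reading the ones before it.
    static int readIntAt(File file, int index) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.seek((long) index * Integer.BYTES);
            return raf.readInt();
        }
    }

    // Returns the file pointer after asking to skip n bytes from the start.
    static long pointerAfterSkip(File file, int n) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.skipBytes(n);
            return raf.getFilePointer();
        }
    }

    public static void main(String[] args) throws IOException {
        File file = writeSquares(10);
        System.out.println(readIntAt(file, 7));          // the eighth value, 7 * 7
        System.out.println(pointerAfterSkip(file, 8));   // pointer moved by 8 bytes
        System.out.println(pointerAfterSkip(file, 100)); // skipping stops at end of file
    }
}
```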

RandomAccessFile can get the file length directly, returned as the number of bytes in the file:

public native long length() throws IOException

It can also change the file length directly:

public native void setLength(long newLength) throws IOException

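If the current file length is less than newLength, the file is extended, and the content of the extended part is undefined. If the current file length is greater than newLength, the file is shrunk and the excess is truncated; if the current file pointer is greater than newLength, it becomes newLength after the call.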
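The effect of setLength on the length and the file pointer can be observed with a small experiment. This is only a sketch on a temporary file; the class name and the resize helper are made up for the example.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class SetLengthDemo {
    // Writes initialSize bytes, calls setLength(newLength), and returns
    // { file length, file pointer } as observed afterwards.
    static long[] resize(int initialSize, long newLength) throws IOException {
        File file = File.createTempFile("setlength-demo", ".bin");
        file.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.write(new byte[initialSize]); // the pointer now sits at initialSize
            raf.setLength(newLength);
            return new long[] { raf.length(), raf.getFilePointer() };
        }
    }

    public static void main(String[] args) throws IOException {
        long[] shrunk = resize(100, 10);
        // Shrinking below the pointer pulls the pointer back to the new length.
        System.out.println(shrunk[0] + " " + shrunk[1]);
        long[] grown = resize(100, 200);
        // Growing changes the length but leaves the pointer where it was.
        System.out.println(grown[0] + " " + grown[1]);
    }
}
```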
RandomAccessFile also has the following methods, which deserve caution:

public final void writeBytes(String s) throws IOException
public final String readLine() throws IOException

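At first glance, writeBytes writes a string directly and readLine reads a string line by line. In fact, both methods are problematic: neither has any notion of encoding, and both assume that one byte represents one character, which clearly does not hold for Chinese. These two methods should therefore be avoided.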
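The problem is easy to reproduce. The sketch below writes the same two-character Chinese string once with writeBytes/readLine and once as explicitly encoded UTF-8 bytes with a length prefix; the length-prefix layout is just one possible alternative chosen for illustration, and the class and method names are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

public class WriteBytesPitfall {
    private static RandomAccessFile openTemp() throws IOException {
        File file = File.createTempFile("writebytes-demo", ".bin");
        file.deleteOnExit();
        return new RandomAccessFile(file, "rw");
    }

    // writeBytes keeps only the low 8 bits of each char, and readLine turns
    // each byte back into one char, so non-ASCII text does not survive.
    static String viaWriteBytes(String s) throws IOException {
        try (RandomAccessFile raf = openTemp()) {
            raf.writeBytes(s + "\n");
            raf.seek(0);
            return raf.readLine();
        }
    }

    // Encoding explicitly: a 4-byte length followed by the UTF-8 bytes.
    static String viaUtf8(String s) throws IOException {
        try (RandomAccessFile raf = openTemp()) {
            byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
            raf.writeInt(bytes.length);
            raf.write(bytes);
            raf.seek(0);
            byte[] back = new byte[raf.readInt()];
            raf.readFully(back);
            return new String(back, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        String name = "\u8001\u9A6C"; // the two-character name used earlier
        System.out.println(name.equals(viaWriteBytes(name))); // text is damaged
        System.out.println(name.equals(viaUtf8(name)));       // text survives
    }
}
```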

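Java Programming Practice

Designing a Key-Value Database: BasicDB

For everyday file reading and writing, streams are enough, but streams are a poor fit for some systems programs; RandomAccessFile is closer to the operating system and is more convenient and efficient there.
Below, we look at how to use RandomAccessFile to implement a simple key-value database, which we call BasicDB.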

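BasicDB Features

BasicDB provides an interface similar to Map: entries can be saved, looked up, and deleted by key, but the data is persisted to files. In addition, unlike HashMap/TreeMap, which keep all data in memory, BasicDB keeps only metadata such as the index in memory, while the values are stored in a file. Compared with HashMap/TreeMap, BasicDB can use far less memory and store far more key-value pairs, especially when the values are large. BasicDB relies on the index and on RandomAccessFile's random read/write capability for efficiency.

BasicDB Design

We adopt the following simple design.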

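Each key-value pair is split into two parts. Values are stored in a separate .data file; a key together with the position of its value in the .data file is called an index entry, and the index is stored in a .meta file.
In the .data file, every value occupies a fixed 1024 bytes: the first 4 bytes hold the actual length, followed by the actual content; if the content is shorter than 1020 bytes, the rest is padded with zero bytes.
The index is stored both in the .meta file and in memory. It is read entirely into memory at initialization; updates to the index are not written to the file immediately, only when flush is called.
Deleting a key-value pair does not modify the .data file, but removes the entry from the index and records the freed slot. The next added key-value pair reuses a freed slot, and all freed slots are also recorded in the .meta file.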
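The fixed-size slot layout described above can be sketched in isolation. This is a simplified illustration of the layout only, without the index or the free-slot queue; the class name, SLOT_SIZE, and the writeSlot/readSlot helpers are hypothetical names chosen for the example.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

public class SlotSketch {
    static final int SLOT_SIZE = 1024;
    // 4 bytes of each slot hold the length, leaving 1020 for content.
    static final int MAX_DATA_LENGTH = SLOT_SIZE - Integer.BYTES;

    // Writes length, content, and zero padding so the slot is always full.
    static void writeSlot(RandomAccessFile db, long pos, byte[] data) throws IOException {
        if (data.length > MAX_DATA_LENGTH) {
            throw new IllegalArgumentException("data too long: " + data.length);
        }
        db.seek(pos);
        db.writeInt(data.length);
        db.write(data);
        db.write(new byte[MAX_DATA_LENGTH - data.length]); // zero padding
    }

    // Reads the length first, then exactly that many content bytes.
    static byte[] readSlot(RandomAccessFile db, long pos) throws IOException {
        db.seek(pos);
        byte[] data = new byte[db.readInt()];
        db.readFully(data);
        return data;
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("slot-sketch", ".data");
        file.deleteOnExit();
        try (RandomAccessFile db = new RandomAccessFile(file, "rw")) {
            writeSlot(db, 0, "first value".getBytes(StandardCharsets.UTF_8));
            writeSlot(db, SLOT_SIZE, "second".getBytes(StandardCharsets.UTF_8));
            // Overwriting slot 0 in place does not disturb slot 1.
            writeSlot(db, 0, "new".getBytes(StandardCharsets.UTF_8));
            System.out.println(new String(readSlot(db, 0), StandardCharsets.UTF_8));
            System.out.println(new String(readSlot(db, SLOT_SIZE), StandardCharsets.UTF_8));
            System.out.println(db.length()); // two full slots
        }
    }
}
```

Because every slot has the same size, a value can be overwritten or reused in place, at the cost of wasting space for short values and capping the value length.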

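For now we ignore consistency problems caused by concurrent access, abnormal shutdown, and the like. The design is fairly crude, but it demonstrates some basic concepts.

BasicDB Implementation

Below is the code of BasicDB: first the internal fields and the constructor, then the implementation of the main methods: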

package BasicDB;

import java.util.*;
import java.io.*;

public class BasicDB {
    private static final int MAX_DATA_LENGTH = 1020;
    // padding bytes
    private static final byte[] ZERO_BYTES = new byte[MAX_DATA_LENGTH];
    // suffix of the data file
    private static final String DATA_SUFFIX = ".data";
    // suffix of the metadata file, which holds the index and the free slots
    private static final String META_SUFFIX = ".meta";

    // index: key -> position of the value in the .data file
    Map<String, Long> indexMap;
    // free slots; each element is a position in the .data file
    Queue<Long> gaps;

    // value data file
    RandomAccessFile db;
    // metadata file
    File metaFile;
    // constructor
    public BasicDB(String path, String name) throws IOException {
        // File(parent, child) inserts the path separator when needed
        File dataFile = new File(path, name + DATA_SUFFIX);
        metaFile = new File(path, name + META_SUFFIX);

        db = new RandomAccessFile(dataFile, "rw");

        if (metaFile.exists()) {
            loadMeta();
        } else {
            indexMap = new HashMap<>();
            gaps = new ArrayDeque<>();
        }
    }
    // load the metadata into memory
    private void loadMeta() throws IOException {
        DataInputStream in = new DataInputStream(new BufferedInputStream(new FileInputStream(metaFile)));
        try {
            loadIndex(in);
            loadGaps(in);
        } finally {
            in.close();
        }
    }
    // load the index
    private void loadIndex(DataInputStream in) throws IOException {
        int size = in.readInt();
        indexMap = new HashMap<String, Long>((int) (size / 0.75f) + 1, 0.75f);
        for (int i = 0; i < size; i++) {
            String key = in.readUTF();
            long index = in.readLong();
            indexMap.put(key, index);
        }
    }
    // save the index
    private void saveIndex(DataOutputStream out) throws IOException {
        out.writeInt(indexMap.size());  // number of key-value pairs
        for (Map.Entry<String, Long> entry : indexMap.entrySet()) {  // iterate over the entries
            out.writeUTF(entry.getKey());  // save the key
            out.writeLong(entry.getValue());  // save the position of the value
        }
    }
    // load the free slots
    private void loadGaps(DataInputStream in) throws IOException {
        int size = in.readInt();
        gaps = new ArrayDeque<>(size);
        for (int i = 0; i < size; i++) {
            long index = in.readLong();
            gaps.add(index);
        }
    }
    // save the free slots
    private void saveGaps(DataOutputStream out) throws IOException {
        out.writeInt(gaps.size());  // number of free slots
        for (Long pos : gaps) {  // save each position
            out.writeLong(pos);
        }
    }
    // save the index and the free slots
    private void saveMeta() throws IOException {
        DataOutputStream out = new DataOutputStream(new BufferedOutputStream(new FileOutputStream(metaFile)));
        try {
            saveIndex(out);  // save the index
            saveGaps(out);   // save the free slots
        } finally {
            out.close();
        }
    }
    // read the value stored at the given position
    private byte[] getData(long pos) throws IOException {
        db.seek(pos);   // move to the slot
        int length = db.readInt();  // read the actual data length
        byte[] data = new byte[length];
        db.readFully(data);   // read the data
        return data;
    }
    // write a value into the slot at the given position
    private void writeData(long pos, byte[] data) throws IOException {
        if (data.length > MAX_DATA_LENGTH) {
            throw new IllegalArgumentException("maximum allowed length is "
                    + MAX_DATA_LENGTH + ", data length is " + data.length);
        }
        db.seek(pos);  // move the file pointer to the slot
        db.writeInt(data.length);  // write the data length
        db.write(data);   // write the content
        db.write(ZERO_BYTES, 0, MAX_DATA_LENGTH - data.length);  // zero padding
    }
    // reuse a free slot if there is one, otherwise append at the end of the file
    private long nextAvailablePos() throws IOException {
        if (!gaps.isEmpty()) {
            return gaps.poll();
        } else {
            return db.length();
        }
    }
    // save a key-value pair
    public void put(String key, byte[] value) throws IOException {
        Long index = indexMap.get(key);
        if (index == null) {
            index = nextAvailablePos();   // pick a slot for the value
            indexMap.put(key, index);
        }
        writeData(index, value);
    }
    // get the value for a key
    public byte[] get(String key) throws IOException {
        Long index = indexMap.get(key);   // look up the position of the value
        if (index != null) {
            return getData(index);
        }
        return null;
    }
    // remove a key-value pair
    public void remove(String key) {
        Long index = indexMap.remove(key);  // remove from the index
        if (index != null) {
            gaps.offer(index);   // add the slot to the free-slot queue
        }
    }
    // write the metadata to disk
    public void flush() throws IOException {
        saveMeta();
        db.getFD().sync();   // getFD returns the file descriptor; sync makes sure the file content reaches the device
    }
    // close the database
    public void close() throws IOException {
        flush();  // synchronize data first
        db.close();
    }
}

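This section introduced RandomAccessFile, which supports random reads and writes and is closer to the operating system's API; when implementing some systems programs it is more convenient and efficient than streams. Using RandomAccessFile, we implemented a very simple key-value database and walked through its interface, design, and implementation. Along the way we also showed some uses of the containers and streams introduced earlier.
Although this database is simple and crude, it does have some good properties: it uses little memory, can store a large number of key-value pairs, and can access values efficiently by key.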

Study notes on 《疯狂Java讲义》 (39): Common File Types & Random File Access