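POI Excel Basics (Part 1)

POI 5.2.3
Official website

GitHub

POI-HSSF and POI-XSSF/SXSSF are Java APIs for accessing files in Microsoft Excel formats.

HSSF: short for "Horrible SpreadSheet Format".

    Handles Excel 97-2003 files, which use the .xls extension.

XSSF:

    Handles files from Excel 2007 onward, which use the .xlsx extension.

SXSSF:

    A low-memory-footprint mode built on top of XSSF, available since POI 3.8. It also produces .xlsx files.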

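Excel versions differ in a few ways, listed below, and these limits indirectly constrain what the POI APIs can do.

1. Supported number of rows and columns

    Excel 97-2003: a sheet holds at most 65536 rows and 256 columns.

    Excel 2007 and later: a sheet holds at most 1048576 rows and 16384 columns.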
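
The limits above can be turned into a quick bounds check before writing a cell. The following is a minimal library-free sketch; the class and method names (`SheetLimits`, `fitsXls`, `fitsXlsx`) are made up for illustration and are not part of POI:

```java
// Illustrative helper only; names are hypothetical and not part of POI.
class SheetLimits {
    // Per-sheet limits quoted in the text above.
    static final int XLS_MAX_ROWS = 65_536;
    static final int XLS_MAX_COLS = 256;
    static final int XLSX_MAX_ROWS = 1_048_576;
    static final int XLSX_MAX_COLS = 16_384;

    /** True if the zero-based (row, col) position fits in an Excel 97-2003 (.xls) sheet. */
    public static boolean fitsXls(int row, int col) {
        return row >= 0 && row < XLS_MAX_ROWS && col >= 0 && col < XLS_MAX_COLS;
    }

    /** True if the zero-based (row, col) position fits in an Excel 2007+ (.xlsx) sheet. */
    public static boolean fitsXlsx(int row, int col) {
        return row >= 0 && row < XLSX_MAX_ROWS && col >= 0 && col < XLSX_MAX_COLS;
    }

    public static void main(String[] args) {
        // Row index 65536 is the 65537th row: too many for .xls, fine for .xlsx.
        System.out.println(fitsXls(65_536, 0));
        System.out.println(fitsXlsx(65_536, 0));
    }
}
```

Because rows and columns are 0-based in POI, the largest valid index is always one less than the limit.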

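2. File size

    .xlsx files are compressed more heavily than .xls files, so for the same amount of data an .xlsx file is usually much smaller.

3. Compatibility

    Excel 97-2003 cannot open .xlsx files.

    Excel 2007 and later can open .xls files.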

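Overview

HSSF is the POI project's pure Java implementation of the Excel '97(-2007) file format. XSSF is the POI project's pure Java implementation of the Excel 2007 OOXML (.xlsx) file format.

HSSF and XSSF provide ways to create, modify, read and write Excel spreadsheets. They provide:

  • low-level structures for those with special needs
  • an event model API (eventmodel api) for efficient read-only access
  • a full user model API (usermodel api) for creating, reading and modifying XLS and XLSX files

If you are moving from the pure HSSF usermodel to the combined SS usermodel, which supports both HSSF and XSSF, see the SS usermodel conversion guide.

Another way to generate a spreadsheet is through the Cocoon serializer (though you will still be using HSSF indirectly). With Cocoon you can serialize any XML data source (for example, an ESQL page that outputs SQL results) by simply applying a stylesheet and designating the serializer.

If you are only reading spreadsheet data, use the eventmodel api in either the org.apache.poi.hssf.eventusermodel package or the org.apache.poi.xssf.eventusermodel package, depending on your file format.

If you need to modify spreadsheet data, use the usermodel api. You can also generate spreadsheets this way.

Note that the usermodel system has a higher memory footprint than the low-level eventmodel system, but its main advantage is that it is much simpler to work with. Also note that because the Excel 2007 OOXML (.xlsx) files handled by the newer XSSF are XML-based, processing them takes more memory than the older binary (.xls) files handled by HSSF.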

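SXSSF (Since POI 3.8 beta3)

Since 3.8-beta3, POI provides SXSSF, a low-memory-footprint API built on top of XSSF.

SXSSF is an API-compatible streaming extension of XSSF, meant for cases where very large spreadsheets have to be produced and heap space is limited. SXSSF achieves its low memory footprint by limiting access to the rows within a sliding window, whereas XSSF gives access to every row in the document. Older rows that are no longer in the window become inaccessible once they have been written to disk.

In auto-flush mode the size of the access window can be specified, so that a fixed number of rows is kept in memory. Once that number is reached, creating an additional row removes the row with the lowest index from the access window and writes it to disk. Alternatively, the window size can be set to grow dynamically; it can then be trimmed periodically, as needed, by an explicit call to flushRows(int keepRows).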
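
The auto-flush behaviour described above can be modelled in a few lines of plain Java. This is a toy model of the sliding window only, not POI code and not how SXSSF is actually implemented; `RowWindowSketch` and its methods are hypothetical names, and the "disk" is just a list:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model of a sliding row window; NOT POI code and not SXSSF's real implementation.
class RowWindowSketch {
    private final int windowSize;                                       // rows kept in memory
    private final TreeMap<Integer, String> inMemory = new TreeMap<>();  // the access window
    private final List<String> flushed = new ArrayList<>();             // stands in for the file on disk

    public RowWindowSketch(int windowSize) {
        this.windowSize = windowSize;
    }

    /** Adds a row; if the window is then over capacity, the lowest-index row is flushed. */
    public void createRow(int index, String content) {
        inMemory.put(index, content);
        if (inMemory.size() > windowSize) {
            flushed.add(inMemory.pollFirstEntry().getValue());
        }
    }

    /** Returns the row while it is still in the window, or null once it has been flushed. */
    public String getRow(int index) {
        return inMemory.get(index);
    }

    public int flushedCount() {
        return flushed.size();
    }

    public static void main(String[] args) {
        RowWindowSketch sheet = new RowWindowSketch(3);
        for (int i = 0; i < 5; i++) {
            sheet.createRow(i, "row " + i);
        }
        System.out.println(sheet.getRow(0)); // flushed, no longer accessible
        System.out.println(sheet.getRow(4)); // still inside the window
    }
}
```

The point of the model is the trade-off: memory stays bounded by the window size, and in exchange rows that have left the window can no longer be read or changed.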

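Because of the streaming nature of the implementation, SXSSF has the following limitations compared with XSSF:

  • Only a limited number of rows are accessible at any point in time.
  • Sheet.clone() is not supported.
  • Formula evaluation is not supported.

See the SXSSF How-To for more details.

The table below summarizes the comparative features of POI's spreadsheet APIs:

[Image: feature comparison table of the POI spreadsheet APIs; not included in this copy]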

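1. Developer's guide to HSSF and XSSF features

Want to use HSSF and XSSF to read and write spreadsheets? This guide is for you. If you are after a more in-depth treatment of the HSSF and XSSF user APIs, see the HOWTO guide, which contains actual descriptions of how to use this stuff.

1.1 Creating a new Workbook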

Workbook wb = new XSSFWorkbook();
...
try (OutputStream fileOut = new FileOutputStream("workbook.xlsx")) {
    wb.write(fileOut);
}

1.2 Creating a new Sheet

Workbook wb = new HSSFWorkbook();  // or new XSSFWorkbook();
Sheet sheet1 = wb.createSheet("new sheet");
Sheet sheet2 = wb.createSheet("second sheet");
// Note that a sheet name in Excel must not exceed 31 characters
// and must not contain any of the following characters:
// 0x0000
// 0x0003
// colon (:)
// backslash (\)
// asterisk (*)
// question mark (?)
// forward slash (/)
// opening square bracket ([)
// closing square bracket (])
// You can use org.apache.poi.ss.util.WorkbookUtil#createSafeSheetName(String nameProposal)
// for a safe way to create valid names; this utility replaces invalid characters with a space (' ')
String safeName = WorkbookUtil.createSafeSheetName("[O'Brien's sales*?]"); // returns " O'Brien's sales   "
Sheet sheet3 = wb.createSheet(safeName);
try (OutputStream fileOut = new FileOutputStream("workbook.xls")) {
    wb.write(fileOut);
}
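
To make the naming rules in the comments above concrete, here is a simplified, library-free sketch that applies only the rules listed there (31-character limit, forbidden characters replaced by a space). `SafeSheetNameSketch` and `sanitize` are hypothetical names; the real WorkbookUtil.createSafeSheetName may handle further cases (such as null or empty input) that this sketch ignores:

```java
// Simplified sketch of the listed rules only; not the actual POI implementation.
class SafeSheetNameSketch {
    private static final int MAX_LENGTH = 31;

    /** True for the characters the comments above list as forbidden in sheet names. */
    static boolean isInvalid(char c) {
        switch (c) {
            case 0x0000:
            case 0x0003:
            case ':':
            case '\\':
            case '*':
            case '?':
            case '/':
            case '[':
            case ']':
                return true;
            default:
                return false;
        }
    }

    /** Truncates to 31 characters and replaces each forbidden character with a space. */
    public static String sanitize(String proposal) {
        int len = Math.min(proposal.length(), MAX_LENGTH);
        StringBuilder sb = new StringBuilder(len);
        for (int i = 0; i < len; i++) {
            char c = proposal.charAt(i);
            sb.append(isInvalid(c) ? ' ' : c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Same input as the snippet above; quotes make the spaces visible.
        System.out.println("\"" + sanitize("[O'Brien's sales*?]") + "\"");
    }
}
```

In real code, prefer the POI utility itself; the sketch is only meant to show what "safe" means here.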

1.3 Creating cells

Workbook wb = new HSSFWorkbook();
//Workbook wb = new XSSFWorkbook();
CreationHelper createHelper = wb.getCreationHelper();
Sheet sheet = wb.createSheet("new sheet");

// Create a row and put some cells in it.
// Rows and cells are both 0-based.
Row row = sheet.createRow(0);
// Create a cell and put a value in it.
Cell cell = row.createCell(0);
cell.setCellValue(1);
// Or do it on one line.
row.createCell(1).setCellValue(1.2);
row.createCell(2).setCellValue(
     createHelper.createRichTextString("This is a string"));
row.createCell(3).setCellValue(true);
// Write the output to a file
try (OutputStream fileOut = new FileOutputStream("workbook.xls")) {
    wb.write(fileOut);
}

1.4 Creating date cells

Workbook wb = new HSSFWorkbook();
//Workbook wb = new XSSFWorkbook();
CreationHelper createHelper = wb.getCreationHelper();
Sheet sheet = wb.createSheet("new sheet");
// Create a row and put some cells in it. Rows are 0 based.
Row row = sheet.createRow(0);
// Create a cell and put a date value in it.  The first cell is not styled as a date.
Cell cell = row.createCell(0);
cell.setCellValue(new Date());

// We style the second cell as a date (and time).  It is important to
// create a new cell style from the workbook, otherwise you can end up
// modifying the built-in style and affecting not only this cell but other cells.
CellStyle cellStyle = wb.createCellStyle();
cellStyle.setDataFormat(
        createHelper.createDataFormat().getFormat("m/d/yy h:mm"));
cell = row.createCell(1);
cell.setCellValue(new Date());
cell.setCellStyle(cellStyle);

//you can also set date as java.util.Calendar
cell = row.createCell(2);
cell.setCellValue(Calendar.getInstance());
cell.setCellStyle(cellStyle);
// Write the output to a file
try (OutputStream fileOut = new FileOutputStream("workbook.xls")) {
    wb.write(fileOut);
}
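
The first cell above shows up as a plain number because Excel stores a date as a serial day count and relies on the cell style to display it as a date. The following is a minimal library-free sketch of that number, assuming Excel's default 1900 date system and dates on or after 1900-03-01; `ExcelSerialDateSketch` and `toExcelSerial` are hypothetical names, not POI methods:

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Illustrative only: assumes the 1900 date system and dates from March 1900 onward.
class ExcelSerialDateSketch {
    // Effective day zero for such dates (Excel counts a non-existent 29 Feb 1900).
    private static final LocalDate EPOCH = LocalDate.of(1899, 12, 30);

    /** Whole days since the epoch, plus the time of day as a fraction of 24 hours. */
    public static double toExcelSerial(LocalDateTime dt) {
        long days = ChronoUnit.DAYS.between(EPOCH, dt.toLocalDate());
        double fraction = dt.toLocalTime().toSecondOfDay() / 86_400.0;
        return days + fraction;
    }

    public static void main(String[] args) {
        // Noon on 1 January 2023
        System.out.println(toExcelSerial(LocalDateTime.of(2023, 1, 1, 12, 0))); // 44927.5
    }
}
```

Applying a date format such as "m/d/yy h:mm", as done for the second and third cells, is what makes Excel render this number as a date.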

1.5 Working with different types of cells

Workbook wb = new HSSFWorkbook();
Sheet sheet = wb.createSheet("new sheet");
Row row = sheet.createRow(2);
row.createCell(0).setCellValue(1.1);
row.createCell(1).setCellValue(new Date());
row.createCell(2).setCellValue(Calendar.getInstance());
row.createCell(3).setCellValue("a string");
row.createCell(4).setCellValue(true);
row.createCell(5).setCellType(CellType.ERROR);
// Write the output to a file
try (OutputStream fileOut = new FileOutputStream("workbook.xls")) {
    wb.write(fileOut);
}

1.6 Files vs InputStreams

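When opening a workbook, either an .xls HSSFWorkbook or an .xlsx XSSFWorkbook, the workbook can be loaded from either a File or an InputStream. Using a File object allows for lower memory consumption, while an InputStream requires more memory because it has to buffer the whole file.

If you use WorkbookFactory, it is very easy to use one or the other: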

// Use a file
Workbook wb = WorkbookFactory.create(new File("MyExcel.xls"));
// Use an InputStream, needs more memory
Workbook wb = WorkbookFactory.create(new FileInputStream("MyExcel.xlsx"));

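If you use HSSFWorkbook or XSSFWorkbook directly, you should generally go through POIFSFileSystem or OPCPackage to have full control of the lifecycle (including closing the file when done):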

// HSSFWorkbook, File
POIFSFileSystem fs = new POIFSFileSystem(new File("file.xls"));
HSSFWorkbook wb = new HSSFWorkbook(fs.getRoot(), true);
....
fs.close();
// HSSFWorkbook, InputStream, needs more memory
POIFSFileSystem fs = new POIFSFileSystem(myInputStream);
HSSFWorkbook wb = new HSSFWorkbook(fs.getRoot(), true);
// XSSFWorkbook, File
OPCPackage pkg = OPCPackage.open(new File("file.xlsx"));
XSSFWorkbook wb = new XSSFWorkbook(pkg);
....
pkg.close();
// XSSFWorkbook, InputStream, needs more memory
OPCPackage pkg = OPCPackage.open(myInputStream);
XSSFWorkbook wb = new XSSFWorkbook(pkg);
....
pkg.close();

1.7 Alignment options

public static void main(String[] args) throws Exception {
    Workbook wb = new XSSFWorkbook(); //or new HSSFWorkbook();
    Sheet sheet = wb.createSheet();
    Row row = sheet.createRow(2);

    // Row height is set with the Row methods setHeight and setHeightInPoints.
    // They differ in units: setHeightInPoints takes points, setHeight takes 1/20 of a point.
    row.setHeightInPoints(30);
    createCell(wb, row, 0, HorizontalAlignment.CENTER, VerticalAlignment.BOTTOM);
    createCell(wb, row, 1, HorizontalAlignment.CENTER_SELECTION, VerticalAlignment.BOTTOM);
    createCell(wb, row, 2, HorizontalAlignment.FILL, VerticalAlignment.CENTER);
    createCell(wb, row, 3, HorizontalAlignment.GENERAL, VerticalAlignment.CENTER);
    createCell(wb, row, 4, HorizontalAlignment.JUSTIFY, VerticalAlignment.JUSTIFY);
    createCell(wb, row, 5, HorizontalAlignment.LEFT, VerticalAlignment.TOP);
    createCell(wb, row, 6, HorizontalAlignment.RIGHT, VerticalAlignment.TOP);
    // Write the output to a file
    try (OutputStream fileOut = new FileOutputStream("xssf-align.xlsx")) {
        wb.write(fileOut);
    }
    wb.close();
}
/**
 * Creates a cell and aligns it a certain way.
 *
 * @param wb     the workbook
 * @param row    the row to create the cell in
 * @param column the column number to create the cell in
 * @param halign the horizontal alignment for the cell.
 * @param valign the vertical alignment for the cell.
 */
private static void createCell(Workbook wb, Row row, int column, HorizontalAlignment halign, VerticalAlignment valign) {
    Cell cell = row.createCell(column);
    cell.setCellValue("Align It");
    CellStyle cellStyle = wb.createCellStyle();
    cellStyle.setAlignment(halign);
    cellStyle.setVerticalAlignment(valign);
    cell.setCellStyle(cellStyle);
}

1.8 Working with borders

Workbook wb = new HSSFWorkbook();
Sheet sheet = wb.createSheet("new sheet");
// Create a row and put some cells in it. Rows are 0 based.
Row row = sheet.createRow(1);
// Create a cell and put a value in it.
Cell cell = row.createCell(1);
cell.setCellValue(4);
// Style the cell with borders all around.
CellStyle style = wb.createCellStyle();
style.setBorderBottom(BorderStyle.THIN);
style.setBottomBorderColor(IndexedColors.BLACK.getIndex());
style.setBorderLeft(BorderStyle.THIN);
style.setLeftBorderColor(IndexedColors.GREEN.getIndex());
style.setBorderRight(BorderStyle.THIN);
style.setRightBorderColor(IndexedColors.BLUE.getIndex());
style.setBorderTop(BorderStyle.MEDIUM_DASHED);
style.setTopBorderColor(IndexedColors.BLACK.getIndex());
cell.setCellStyle(style);
// Write the output to a file
try (OutputStream fileOut = new FileOutputStream("workbook.xls")) {
    wb.write(fileOut);
}
wb.close();

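1.9 Iterating over rows and cells

Sometimes you just want to iterate over all the sheets in a workbook, all the rows in a sheet, or all the cells in a row. This can be done with a simple for loop.

These iterators are available by calling workbook.sheetIterator(), sheet.rowIterator(), and row.cellIterator(), or implicitly by using a for-each loop. Note that rowIterator and cellIterator iterate over the rows or cells that have been created, skipping empty rows and cells.

for (Sheet sheet : wb) {
    for (Row row : sheet) {
        for (Cell cell : row) {
            // Do something here
        }
    }
}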

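1.10 Iterating over cells, with control of missing / blank cells

In some cases, when iterating, you need full control over how missing or blank rows and cells are treated, and you need to make sure you visit every cell and not just those defined in the file. (The CellIterator only returns cells defined in the file, which is largely those with values or styling, but that depends on Excel.)

In such cases, fetch the first and last column information for a row, then call getCell(int, MissingCellPolicy) to fetch the cell. Use a MissingCellPolicy to control how blank or null cells are handled.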


// Decide which rows to process
int rowStart = Math.min(15, sheet.getFirstRowNum());
int rowEnd = Math.max(1400, sheet.getLastRowNum());
for (int rowNum = rowStart; rowNum < rowEnd; rowNum++) {
   Row r = sheet.getRow(rowNum);
   if (r == null) {
      // This whole row is empty
      // Handle it as needed
      continue;
   }
   int lastColumn = Math.max(r.getLastCellNum(), MY_MINIMUM_COLUMN_COUNT);
   for (int cn = 0; cn < lastColumn; cn++) {
      // In POI 5.x the policy constants live in the Row.MissingCellPolicy enum
      Cell c = r.getCell(cn, Row.MissingCellPolicy.RETURN_BLANK_AS_NULL);
      if (c == null) {
         // The spreadsheet is empty in this cell
      } else {
         // Do something useful with the cell's contents
      }
   }
}
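
The inner loop above visits every column index, whether or not a cell exists there. The same idea can be modelled without POI: below, a sparse row is a map from column index to text, and a missing entry plays the role of a missing cell. This is a sketch under those assumptions; `DenseRowSketch` and `toCsvLine` are hypothetical names:

```java
import java.util.StringJoiner;
import java.util.TreeMap;

// Library-free model: a TreeMap stands in for a Row, a missing key for a missing cell.
class DenseRowSketch {
    /** Visits columns 0 .. max(lastCellNum, minColumns) - 1, writing "" for undefined cells. */
    public static String toCsvLine(TreeMap<Integer, String> sparseRow, int minColumns) {
        // Mirrors Row.getLastCellNum(): index of the last defined cell PLUS ONE, or -1 if none.
        int lastCellNum = sparseRow.isEmpty() ? -1 : sparseRow.lastKey() + 1;
        int lastColumn = Math.max(lastCellNum, minColumns);
        StringJoiner line = new StringJoiner(",");
        for (int cn = 0; cn < lastColumn; cn++) {
            String value = sparseRow.get(cn); // null means the cell is missing
            line.add(value == null ? "" : value);
        }
        return line.toString();
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> row = new TreeMap<>();
        row.put(0, "a");
        row.put(3, "d");
        // Only columns 0 and 3 are defined, but all six columns are emitted.
        System.out.println(toCsvLine(row, 6)); // a,,,d,,
    }
}
```

An iterator-based loop over the same row would produce only "a" and "d", which is why index-based access matters when column positions have to line up, for example when exporting to CSV.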

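1.11 Getting the cell contents

To get the contents of a cell, you first need to know what kind of cell it is (asking a string cell for its numeric contents will throw an IllegalStateException, for example). So you will want to switch on the cell's type, and then call the appropriate getter for that cell.

In the code below, we loop over every cell in one sheet, print out the cell's reference (e.g. A3), and then the cell's contents.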

// import org.apache.poi.ss.usermodel.*;
DataFormatter formatter = new DataFormatter();
Sheet sheet1 = wb.getSheetAt(0);
for (Row row : sheet1) {
    for (Cell cell : row) {
        CellReference cellRef = new CellReference(row.getRowNum(), cell.getColumnIndex());
        System.out.print(cellRef.formatAsString());
        System.out.print(" - ");
        // get the text that appears in the cell by getting the cell value and applying any data formats (Date, 0.00, 1.23e9, $1.23, etc)
        String text = formatter.formatCellValue(cell);
        System.out.println(text);
        // Alternatively, get the value and format it yourself
        // (case labels of an enum switch are written unqualified)
        switch (cell.getCellType()) {
            case STRING:
                System.out.println(cell.getRichStringCellValue().getString());
                break;
            case NUMERIC:
                if (DateUtil.isCellDateFormatted(cell)) {
                    System.out.println(cell.getDateCellValue());
                } else {
                    System.out.println(cell.getNumericCellValue());
                }
                break;
            case BOOLEAN:
                System.out.println(cell.getBooleanCellValue());
                break;
            case FORMULA:
                System.out.println(cell.getCellFormula());
                break;
            case BLANK:
                System.out.println();
                break;
            default:
                System.out.println();
        }
    }
}
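
The cell reference printed above (such as A3) is derived from the zero-based row and column indexes. As a rough illustration of that mapping, here is a library-free sketch; `CellRefSketch`, `columnLetters` and `a1` are hypothetical names, and in real code CellReference does this for you:

```java
// Illustrative sketch of A1-style references; not the POI CellReference class.
class CellRefSketch {
    /** Converts a zero-based column index to letters: 0 -> A, 25 -> Z, 26 -> AA. */
    public static String columnLetters(int col) {
        StringBuilder sb = new StringBuilder();
        // Bijective base 26: work on the 1-based number and peel off the last letter each time.
        for (int n = col + 1; n > 0; n = (n - 1) / 26) {
            sb.insert(0, (char) ('A' + (n - 1) % 26));
        }
        return sb.toString();
    }

    /** Builds a reference such as "A3" from zero-based row and column indexes. */
    public static String a1(int row, int col) {
        return columnLetters(col) + (row + 1);
    }

    public static void main(String[] args) {
        System.out.println(a1(2, 0));  // A3
        System.out.println(a1(0, 26)); // AA1
    }
}
```

Note the off-by-one on rows: POI indexes are 0-based, while the numbers shown in Excel start at 1.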

1.12 Text extraction

For most text extraction requirements, the standard ExcelExtractor class should provide all you need.

try (InputStream inp = new FileInputStream("workbook.xls")) {
    HSSFWorkbook wb = new HSSFWorkbook(new POIFSFileSystem(inp));
    ExcelExtractor extractor = new ExcelExtractor(wb);
    extractor.setFormulasNotResults(true);
    extractor.setIncludeSheetNames(false);
    String text = extractor.getText();
    wb.close();
}

For very fancy text extraction, XLS to CSV and so on, take a look at /poi-examples/src/main/java/org/apache/poi/examples/hssf/eventusermodel/XLS2CSVmra.java

1.13 Fills and colors

Workbook wb = new XSSFWorkbook();
Sheet sheet = wb.createSheet("new sheet");
// Create a row and put some cells in it. Rows are 0 based.
Row row = sheet.createRow(1);
// Aqua background
CellStyle style = wb.createCellStyle();
style.setFillBackgroundColor(IndexedColors.AQUA.getIndex());
style.setFillPattern(FillPatternType.BIG_SPOTS);
Cell cell = row.createCell(1);
cell.setCellValue("X");
cell.setCellStyle(style);
// Orange "foreground", foreground being the fill foreground not the font color.
style = wb.createCellStyle();
style.setFillForegroundColor(IndexedColors.ORANGE.getIndex());
style.setFillPattern(FillPatternType.SOLID_FOREGROUND);
cell = row.createCell(2);
cell.setCellValue("X");
cell.setCellStyle(style);
// Write the output to a file (.xlsx, because the workbook is an XSSFWorkbook)
try (OutputStream fileOut = new FileOutputStream("workbook.xlsx")) {
    wb.write(fileOut);
}
wb.close();

1.14 Merging cells

Workbook wb = new HSSFWorkbook();
Sheet sheet = wb.createSheet("new sheet");
Row row = sheet.createRow(1);
Cell cell = row.createCell(1);
cell.setCellValue("This is a test of merging");
sheet.addMergedRegion(new CellRangeAddress(
        1, //first row (0-based)
        1, //last row  (0-based)
        1, //first column (0-based)
        2  //last column  (0-based)
));
// Write the output to a file
try (OutputStream fileOut = new FileOutputStream("workbook.xls")) {
    wb.write(fileOut);
}
wb.close();

1.15 Working with fonts

Workbook wb = new HSSFWorkbook();
Sheet sheet = wb.createSheet("new sheet");
// Create a row and put some cells in it. Rows are 0 based.
Row row = sheet.createRow(1);
// Create a new font and alter it.
Font font = wb.createFont();
font.setFontHeightInPoints((short)24);
font.setFontName("Courier New");
font.setItalic(true);
font.setStrikeout(true);
// Fonts are set into a style so create a new one to use.
CellStyle style = wb.createCellStyle();
style.setFont(font);
// Create a cell and put a value in it.
Cell cell = row.createCell(1);
cell.setCellValue("This is a test of fonts");
cell.setCellStyle(style);
// Write the output to a file
try (OutputStream fileOut = new FileOutputStream("workbook.xls")) {
    wb.write(fileOut);
}
wb.close();

Note that the maximum number of unique fonts in a workbook is limited to 32767. You should reuse fonts in your application instead of creating a font for each cell. Examples:

Wrong:

for (int i = 0; i < 10000; i++) {
    Row row = sheet.createRow(i);
    Cell cell = row.createCell(0);
    CellStyle style = workbook.createCellStyle();
    Font font = workbook.createFont();
    font.setBold(true);
    style.setFont(font);
    cell.setCellStyle(style);
}

Correct:

CellStyle style = workbook.createCellStyle();
Font font = workbook.createFont();
font.setBold(true);
style.setFont(font);
for (int i = 0; i < 10000; i++) {
    Row row = sheet.createRow(i);
    Cell cell = row.createCell(0);
    cell.setCellStyle(style);
}

1.16 Custom colors

XSSF:

XSSFWorkbook wb = new XSSFWorkbook();
XSSFSheet sheet = wb.createSheet();
XSSFRow row = sheet.createRow(0);
XSSFCell cell = row.createCell(0);
cell.setCellValue("custom XSSF colors");
XSSFCellStyle style1 = wb.createCellStyle();
style1.setFillForegroundColor(new XSSFColor(new java.awt.Color(128, 0, 128), new DefaultIndexedColorMap()));
style1.setFillPattern(FillPatternType.SOLID_FOREGROUND);
// Apply the style, otherwise the custom color never shows up on the cell
cell.setCellStyle(style1);

1.17 Reading and rewriting workbooks

try (InputStream inp = new FileInputStream("workbook.xls")) {
    //InputStream inp = new FileInputStream("workbook.xlsx");
    Workbook wb = WorkbookFactory.create(inp);
    Sheet sheet = wb.getSheetAt(0);
    Row row = sheet.getRow(2);
    Cell cell = row.getCell(3);
    if (cell == null)
        cell = row.createCell(3);
    cell.setCellType(CellType.STRING);
    cell.setCellValue("a test");
    // Write the output to a file
    try (OutputStream fileOut = new FileOutputStream("workbook.xls")) {
        wb.write(fileOut);
    }
}

1.18 Using newlines in cells

Workbook wb = new XSSFWorkbook();   //or new HSSFWorkbook();
Sheet sheet = wb.createSheet();
Row row = sheet.createRow(2);
Cell cell = row.createCell(2);
cell.setCellValue("Use \n with word wrap on to create a new line");
//to enable newlines you need set a cell styles with wrap=true
CellStyle cs = wb.createCellStyle();
cs.setWrapText(true);
cell.setCellStyle(cs);
//increase row height to accommodate two lines of text
row.setHeightInPoints((2*sheet.getDefaultRowHeightInPoints()));
//adjust column width to fit the content
sheet.autoSizeColumn(2);
try (OutputStream fileOut = new FileOutputStream("ooxml-newlines.xlsx")) {
    wb.write(fileOut);
}
wb.close();

1.19 Data formats

Workbook wb = new HSSFWorkbook();
Sheet sheet = wb.createSheet("format sheet");
CellStyle style;
DataFormat format = wb.createDataFormat();
Row row;
Cell cell;
int rowNum = 0;
int colNum = 0;
row = sheet.createRow(rowNum++);
cell = row.createCell(colNum);
cell.setCellValue(11111.25);
style = wb.createCellStyle();
style.setDataFormat(format.getFormat("0.0"));
cell.setCellStyle(style);
row = sheet.createRow(rowNum++);
cell = row.createCell(colNum);
cell.setCellValue(11111.25);
style = wb.createCellStyle();
style.setDataFormat(format.getFormat("#,##0.0000"));
cell.setCellStyle(style);
try (OutputStream fileOut = new FileOutputStream("workbook.xls")) {
    wb.write(fileOut);
}
wb.close();
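
To get a feel for what the two format strings above display, the sketch below renders the same value with the JDK's java.text.DecimalFormat, which happens to accept both patterns. This is only an approximation: Excel format codes and DecimalFormat patterns are different pattern languages that overlap for simple cases like these, the sketch assumes Excel rounds halves up when displaying, and `FormatPreviewSketch` and `preview` are hypothetical names:

```java
import java.math.RoundingMode;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Approximate preview only; this does not use or reproduce Excel's own formatting engine.
class FormatPreviewSketch {
    /** Formats the value with a DecimalFormat pattern, using US separators and half-up rounding. */
    public static String preview(String pattern, double value) {
        DecimalFormat df = new DecimalFormat(pattern, DecimalFormatSymbols.getInstance(Locale.US));
        df.setRoundingMode(RoundingMode.HALF_UP); // DecimalFormat defaults to HALF_EVEN
        return df.format(value);
    }

    public static void main(String[] args) {
        System.out.println(preview("0.0", 11111.25));        // 11111.3
        System.out.println(preview("#,##0.0000", 11111.25)); // 11,111.2500
    }
}
```

The stored cell value stays 11111.25 in both cases; a data format changes only how the number is displayed.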

1.20 Fit Sheet to One Page

Workbook wb = new HSSFWorkbook();
Sheet sheet = wb.createSheet("format sheet");
PrintSetup ps = sheet.getPrintSetup();
sheet.setAutobreaks(true);
ps.setFitHeight((short)1);
ps.setFitWidth((short)1);
// Create various cells and rows for spreadsheet.
try (OutputStream fileOut = new FileOutputStream("workbook.xls")) {
    wb.write(fileOut);
}
wb.close();

Images

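Images are part of the drawing support. To add an image, just call createPicture() on the drawing patriarch. At the time of writing, the following types are supported:

  • PNG
  • JPG
  • DIB

It should be noted that any existing drawings may be erased once you add an image to a sheet.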

//create a new workbook
Workbook wb = new XSSFWorkbook(); //or new HSSFWorkbook();
//add picture data to this workbook.
InputStream is = new FileInputStream("image1.jpeg");
byte[] bytes = IOUtils.toByteArray(is);
int pictureIdx = wb.addPicture(bytes, Workbook.PICTURE_TYPE_JPEG);
is.close();
CreationHelper helper = wb.getCreationHelper();
//create sheet
Sheet sheet = wb.createSheet();
// Create the drawing patriarch.  This is the top level container for all shapes.
Drawing drawing = sheet.createDrawingPatriarch();
//add a picture shape
ClientAnchor anchor = helper.createClientAnchor();
//set top-left corner of the picture,
//subsequent call of Picture#resize() will operate relative to it
anchor.setCol1(3);
anchor.setRow1(2);
Picture pict = drawing.createPicture(anchor, pictureIdx);
//auto-size picture relative to its top-left corner
pict.resize();
//save workbook
String file = "picture.xls";
if(wb instanceof XSSFWorkbook) file += "x";
try (OutputStream fileOut = new FileOutputStream(file)) {
    wb.write(fileOut);
}

Picture.resize() works only for JPEG and PNG. Other formats are not yet supported.

Reposted from blog.csdn.net/chinusyan/article/details/130706238