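Reposted from: http://blog.csdn.net/zgf19930504/article/details/49506567
Java offers many ways to parse XML. Common techniques include SAX, DOM, JDOM, DOM4J, and JAXB. SAX is a streaming parser: it reads the document in a single forward pass, cannot go back, and uses little memory. DOM, JDOM, DOM4J, and JAXB load the entire XML document into memory before working with it, so they use much more memory and parse comparatively slowly. Below, the author gives a simple performance comparison.
【1. Parsing speed comparison of SAX, DOM, JDOM, DOM4J, and JAXB】
【Format of students_bigfile.xml, size 82.6 MB】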
- <?xml version="1.0" encoding="UTF-8"?>
- <Students>
- <!-- This is Student element 1 -->
- <Student grade="2" index="1">
- <Name>zong_0</Name>
- <Age>20</Age>
- <Sex>boy</Sex>
- <Address>beijing No.0</Address>
- <Number>1000</Number>
- </Student>
- <!-- This is Student element 2 -->
- <Student grade="1" index="2">
- <Name>zong_1</Name>
- <Age>21</Age>
- <Sex>girl</Sex>
- <Address>beijing No.1</Address>
- <Number>1001</Number>
- </Student>
- <!-- This is Student element 3 -->
- <Student grade="2" index="3">
- <Name>zong_2</Name>
- <Age>22</Age>
- <Sex>boy</Sex>
- <Address>beijing No.2</Address>
- <Number>1002</Number>
- </Student>
- <!-- Omitted; 500,000 Student fragments in total -->
- </Students>
【Student class】 Because JAXB parsing is involved, the Student class below was generated with xjc.
- //
- // This file was generated by the JavaTM Architecture for XML Binding(JAXB) Reference Implementation, v2.2.4-2
- // See <a href="http://java.sun.com/xml/jaxb">http://java.sun.com/xml/jaxb</a>
- // Any modifications to this file will be lost upon recompilation of the source schema.
- // Generated on: 2015.10.29 at 01:04:05 PM CST
- //
- package org.zgf.xml.jaxb.bean;
- import javax.xml.bind.annotation.XmlAccessType;
- import javax.xml.bind.annotation.XmlAccessorType;
- import javax.xml.bind.annotation.XmlAttribute;
- import javax.xml.bind.annotation.XmlElement;
- import javax.xml.bind.annotation.XmlRootElement;
- import javax.xml.bind.annotation.XmlType;
- /**
- * <p>
- * Java class for anonymous complex type.
- *
- * <p>
- * The following schema fragment specifies the expected content contained within
- * this class.
- *
- * <pre>
- * <complexType>
- * <complexContent>
- * <restriction base="{http://www.w3.org/2001/XMLSchema}anyType">
- * <sequence>
- * <element ref="{}Name"/>
- * <element ref="{}Age"/>
- * <element ref="{}Sex"/>
- * <element ref="{}Number"/>
- * <element ref="{}Address"/>
- * </sequence>
- * <attribute name="index" type="{http://www.w3.org/2001/XMLSchema}string" />
- * <attribute name="grade" type="{http://www.w3.org/2001/XMLSchema}string" />
- * </restriction>
- * </complexContent>
- * </complexType>
- * </pre>
- *
- *
- */
- @XmlAccessorType(XmlAccessType.FIELD)
- @XmlType(name = "", propOrder = { "name", "age", "sex", "number", "address" })
- @XmlRootElement(name = "Student")
- public class Student {
- @XmlElement(name = "Name", required = true)
- protected String name;
- @XmlElement(name = "Age", required = true)
- protected String age;
- @XmlElement(name = "Sex", required = true)
- protected String sex;
- @XmlElement(name = "Number", required = true)
- protected String number;
- @XmlElement(name = "Address", required = true)
- protected String address;
- @XmlAttribute(name = "index")
- protected String index;
- @XmlAttribute(name = "grade")
- protected String grade;
- /**
- * Gets the value of the name property.
- *
- * @return possible object is {@link String }
- *
- */
- public String getName() {
- return name;
- }
- /**
- * Sets the value of the name property.
- *
- * @param value
- * allowed object is {@link String }
- *
- */
- public void setName(String value) {
- this.name = value;
- }
- /**
- * Gets the value of the age property.
- *
- * @return possible object is {@link String }
- *
- */
- public String getAge() {
- return age;
- }
- /**
- * Sets the value of the age property.
- *
- * @param value
- * allowed object is {@link String }
- *
- */
- public void setAge(String value) {
- this.age = value;
- }
- /**
- * Gets the value of the sex property.
- *
- * @return possible object is {@link String }
- *
- */
- public String getSex() {
- return sex;
- }
- /**
- * Sets the value of the sex property.
- *
- * @param value
- * allowed object is {@link String }
- *
- */
- public void setSex(String value) {
- this.sex = value;
- }
- /**
- * Gets the value of the number property.
- *
- * @return possible object is {@link String }
- *
- */
- public String getNumber() {
- return number;
- }
- /**
- * Sets the value of the number property.
- *
- * @param value
- * allowed object is {@link String }
- *
- */
- public void setNumber(String value) {
- this.number = value;
- }
- /**
- * Gets the value of the address property.
- *
- * @return possible object is {@link String }
- *
- */
- public String getAddress() {
- return address;
- }
- /**
- * Sets the value of the address property.
- *
- * @param value
- * allowed object is {@link String }
- *
- */
- public void setAddress(String value) {
- this.address = value;
- }
- /**
- * Gets the value of the index property.
- *
- * @return possible object is {@link String }
- *
- */
- public String getIndex() {
- return index;
- }
- /**
- * Sets the value of the index property.
- *
- * @param value
- * allowed object is {@link String }
- *
- */
- public void setIndex(String value) {
- this.index = value;
- }
- /**
- * Gets the value of the grade property.
- *
- * @return possible object is {@link String }
- *
- */
- public String getGrade() {
- return grade;
- }
- /**
- * Sets the value of the grade property.
- *
- * @param value
- * allowed object is {@link String }
- *
- */
- public void setGrade(String value) {
- this.grade = value;
- }
- @Override
- public String toString() {
- return "Student [name=" + name + ", age=" + age + ", sex=" + sex + ", number=" + number + ", address=" + address + ", index=" + index + ", grade=" + grade + "]";
- }
- }
【Time taken to parse the 82.6 MB XML document】
【Memory consumption comparison】
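【To keep the parsers from affecting each other's memory usage within a single test case, the author ran each parser as its own test case, always against the same XML document.】【SAX (170 MB) < DOM4J (410 MB) < JAXB (690 MB) < JDOM (750 MB) < DOM (950 MB)】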
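Running each parser in isolation can be approximated in code with a small timing and heap probe. The sketch below is an assumption for illustration, not the author's harness (the figures in this post come from a JVM monitoring tool). `ParseProbe`, `Task`, and `measure` are hypothetical names. It reports the heap delta after the task rather than the peak, so treat the result as a rough indication only.

```java
/** Hypothetical probe: times one task and reports the approximate heap growth. */
final class ParseProbe {

    /** A unit of work that may throw, e.g. one parser run. */
    interface Task {
        void run() throws Exception;
    }

    /** Returns { elapsed milliseconds, approximate heap growth in bytes }. */
    static long[] measure(Task task) throws Exception {
        Runtime rt = Runtime.getRuntime();
        System.gc(); // only a hint to the JVM; the baseline is therefore approximate
        long usedBefore = rt.totalMemory() - rt.freeMemory();
        long start = System.nanoTime();

        task.run();

        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        long usedAfter = rt.totalMemory() - rt.freeMemory();
        // Heap in use after the run minus the baseline; this is not the peak usage.
        return new long[] { elapsedMs, Math.max(0, usedAfter - usedBefore) };
    }

    public static void main(String[] args) throws Exception {
        // Stand-in workload; replace the lambda body with a single parser run.
        long[] result = measure(() -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append(i);
            }
        });
        System.out.println("elapsed ms = " + result[0] + ", heap delta bytes = " + result[1]);
    }
}
```

Launching one JVM per parser, as the author did with separate test cases, keeps one parser's garbage from being counted against the next.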
【1. SAX parsing】
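As a minimal sketch of streaming parsing with the JDK's built-in SAX API, the example below counts Student elements and sums the Age values in one forward pass. `SaxStudentStats` and `StatsHandler` are hypothetical names used for illustration, not the author's test code. The handler keeps only a few fields of state, which is why memory stays low, and it has to track by hand which element it is inside, which is where the extra programming effort comes from.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical example: single-pass statistics over the Students format. */
final class SaxStudentStats {

    /** Receives parser events; holds only counters and a small text buffer. */
    static final class StatsHandler extends DefaultHandler {
        int students;
        long ageSum;
        private final StringBuilder text = new StringBuilder();
        private boolean inAge;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            if ("Student".equals(qName)) {
                students++;
            } else if ("Age".equals(qName)) {
                inAge = true;
                text.setLength(0); // start collecting the text of this Age element
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Text may arrive in several chunks, so append instead of assigning.
            if (inAge) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("Age".equals(qName)) {
                ageSum += Long.parseLong(text.toString().trim());
                inAge = false;
            }
        }
    }

    /** Parses the stream once and returns the handler holding the results. */
    static StatsHandler parse(InputStream in) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        StatsHandler handler = new StatsHandler();
        parser.parse(in, handler);
        return handler;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Students>"
                + "<Student grade=\"2\" index=\"1\"><Name>zong_0</Name><Age>20</Age></Student>"
                + "<Student grade=\"1\" index=\"2\"><Name>zong_1</Name><Age>21</Age></Student>"
                + "</Students>";
        StatsHandler h = parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        System.out.println("students=" + h.students + ", ageSum=" + h.ageSum); // students=2, ageSum=41
    }
}
```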
【2. DOM parsing】
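For contrast, here is a minimal sketch of tree-based parsing with the JDK's built-in DOM API. `DomStudentReader` and `readNames` are hypothetical names used for illustration, not the author's test code, and the sketch assumes every Student has a Name child. The call to `builder.parse` builds the whole document tree in memory before any element can be read, which is consistent with DOM showing the highest memory figure in this comparison.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical example: reads "index:name" pairs from the Students format. */
final class DomStudentReader {

    static List<String> readNames(InputStream in) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(in); // the entire tree is materialized here

        NodeList students = doc.getElementsByTagName("Student");
        List<String> names = new ArrayList<>();
        for (int i = 0; i < students.getLength(); i++) {
            Element student = (Element) students.item(i);
            // Assumes each Student has a Name child, as in the sample file.
            String name = student.getElementsByTagName("Name").item(0).getTextContent();
            names.add(student.getAttribute("index") + ":" + name);
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Students>"
                + "<Student grade=\"2\" index=\"1\"><Name>zong_0</Name><Age>20</Age></Student>"
                + "<Student grade=\"1\" index=\"2\"><Name>zong_1</Name><Age>21</Age></Student>"
                + "</Students>";
        List<String> names = readNames(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        System.out.println(names); // [1:zong_0, 2:zong_1]
    }
}
```

Random access to any node is what the tree buys; the price is holding all 500,000 Student elements in memory at once.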
【3. JDOM parsing】
【4. DOM4J parsing】
【5. JAXB parsing】
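【Overall comparison】
1. Parsing speed (fastest first): SAX > DOM4J > JAXB > JDOM > DOM
2. Memory usage (lowest first): SAX < DOM4J < JAXB < JDOM < DOM
3. Programming complexity (most complex first): SAX > DOM > JDOM > DOM4J > JAXB
In summary, the author recommends three of these technologies: JAXB, DOM4J, and SAX.
SAX: fastest parsing and smallest memory footprint, but hard to program; handling business logic is fairly complex.
DOM4J: fairly fast parsing, fairly large memory footprint, fairly easy to program; handling business logic is somewhat simpler.
JAXB: somewhat slower parsing, fairly large memory footprint, easiest to program; handling business logic is simplest.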
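【Notes】
1. Project source code download: Download
2. The JVM memory settings need to be adjusted to run the sample project; for how, see "How to change the JVM memory settings".
3. For JVM memory monitoring, see "JVM memory monitoring tools".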