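Finals are coming up and I'm reviewing statistics. I haven't finished, I feel pretty frustrated, and I keep wanting to do something else, so let's write some code for fun.
For the Pearson correlation coefficient calculation I referred to: https://blog.csdn.net/Anglebeat/article/details/40299273 (the code at that address contains a small error).
To be specific about the error: its temp variable is counted twice.
The code below is copied straight from that blog with a small fix for the error, so I didn't bother changing the author names (out of respect for the original).
The first class computes the denominator. It is named DenominatorCalculate.java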
import java.util.List;

/**
 * Computes the denominator of the Pearson correlation coefficient.
 *
 * @author alan-king
 */
public class DenominatorCalculate {
    /** Returns sqrt(sum((x - xAverage)^2) * sum((y - yAverage)^2)). */
    public double calculateDenominator(List<String> xList, List<String> yList) {
        int size = xList.size();

        // Mean of x
        double xSum = 0.0;
        for (int i = 0; i < size; i++) {
            xSum += Double.parseDouble(xList.get(i));
        }
        double xAverage = xSum / size;

        // Mean of y
        double ySum = 0.0;
        for (int i = 0; i < size; i++) {
            ySum += Double.parseDouble(yList.get(i));
        }
        double yAverage = ySum / size;

        // Sums of squared deviations from the means
        double xSquares = 0.0;
        double ySquares = 0.0;
        for (int i = 0; i < size; i++) {
            xSquares += Math.pow(Double.parseDouble(xList.get(i)) - xAverage, 2);
            ySquares += Math.pow(Double.parseDouble(yList.get(i)) - yAverage, 2);
        }

        // Denominator of the correlation coefficient
        return Math.sqrt(xSquares * ySquares);
    }
}
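For reference, the quantity these classes assemble is the standard sample Pearson coefficient. The method above returns the square-root term in the denominator, and the numerator is the sum of products of deviations:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

Here $\bar{x}$ and $\bar{y}$ correspond to xAverage and yAverage in the code, and $n$ to size.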
The second class computes the numerator. It is named NumeratorCalculate.java
import java.util.List;

/**
 * Computes the numerator of the Pearson correlation coefficient.
 *
 * @author alan-king
 */
public class NumeratorCalculate {
    // Input data, stored as numeric strings
    protected List<String> xList, yList;

    public NumeratorCalculate(List<String> xList, List<String> yList) {
        this.xList = xList;
        this.yList = yList;
    }

    /** Returns sum((x - xAverage) * (y - yAverage)). */
    public double calcuteNumerator() {
        int xSize = xList.size();
        int ySize = yList.size();
        if (xSize != ySize) {
            throw new IllegalArgumentException("xList and yList must have the same size");
        }

        // Mean of x
        double temp = 0.0;
        for (int x = 0; x < xSize; x++) {
            temp += Double.parseDouble(xList.get(x));
        }
        double xAverage = temp / xSize;

        // Mean of y; temp must be reset first so the x total is not counted again
        temp = 0.0;
        for (int x = 0; x < ySize; x++) {
            temp += Double.parseDouble(yList.get(x));
        }
        double yAverage = temp / ySize;

        // Sum of products of deviations
        double result = 0.0;
        for (int x = 0; x < xSize; x++) {
            result += (Double.parseDouble(xList.get(x)) - xAverage)
                    * (Double.parseDouble(yList.get(x)) - yAverage);
        }
        return result;
    }
}
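As a quick sanity check on the numerator/denominator split, here is a minimal self-contained sketch that computes the same coefficient in one method, without Excel or POI. The class name PearsonCheck and the sample values are made up for illustration; they are not from the referenced blog or the textbook file.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for checking results; not part of the three classes in this post.
class PearsonCheck {
    // Returns sum((x-mx)(y-my)) / sqrt(sum((x-mx)^2) * sum((y-my)^2)).
    static double pearson(List<String> xList, List<String> yList) {
        int n = xList.size();
        if (n == 0 || n != yList.size()) {
            throw new IllegalArgumentException("lists must be non-empty and equally long");
        }
        // Parse each string once instead of on every pass
        double[] x = new double[n];
        double[] y = new double[n];
        double xSum = 0.0, ySum = 0.0;
        for (int i = 0; i < n; i++) {
            x[i] = Double.parseDouble(xList.get(i));
            y[i] = Double.parseDouble(yList.get(i));
            xSum += x[i];
            ySum += y[i];
        }
        double xMean = xSum / n;
        double yMean = ySum / n;
        double sxy = 0.0, sxx = 0.0, syy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - xMean;
            double dy = y[i] - yMean;
            sxy += dx * dy; // numerator
            sxx += dx * dx; // first factor under the square root
            syy += dy * dy; // second factor under the square root
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    public static void main(String[] args) {
        // Made-up data: y is exactly 2x, so the coefficient should be 1
        List<String> x = Arrays.asList("1", "2", "3", "4", "5");
        List<String> y = Arrays.asList("2", "4", "6", "8", "10");
        System.out.println(pearson(x, y)); // prints 1.0
    }
}
```

Feeding the same lists to NumeratorCalculate and DenominatorCalculate and dividing should give a matching value, which is an easy way to confirm the temp fix took effect.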
The third class contains the main method. I kept the name used in the blog above: CallClass.java
import java.io.FileInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;

/**
 * @author SinWang
 */
public class CallClass {
    public static void main(String[] args) throws IOException {
        List<String> xList = new ArrayList<String>();
        List<String> yList = new ArrayList<String>();
        System.out.println("Batch Pearson correlation calculation");
        String filePath = ".\\例11.6.xls";
        // try-with-resources closes the stream and the workbook when done
        try (FileInputStream stream = new FileInputStream(filePath);
             HSSFWorkbook workbook = new HSSFWorkbook(stream)) { // open the existing Excel file
            HSSFSheet sheet = workbook.getSheet("Sheet3"); // get the sheet with the given name
            for (Row row : sheet) {
                for (Cell cell : row) {
                    // First column goes into xList, second column into yList
                    if (cell.getColumnIndex() == 0) {
                        // getNumericCellValue() returns the cell value as a double
                        double x = cell.getNumericCellValue();
                        System.out.print(x + "\t");
                        xList.add(Double.toString(x));
                    } else if (cell.getColumnIndex() == 1) {
                        double y = cell.getNumericCellValue();
                        System.out.print(y);
                        yList.add(Double.toString(y));
                    }
                }
                System.out.println();
            }
        }
        NumeratorCalculate nc = new NumeratorCalculate(xList, yList);
        double numerator = nc.calcuteNumerator();
        DenominatorCalculate dc = new DenominatorCalculate();
        double denominator = dc.calculateDenominator(xList, yList);
        double CORR = numerator / denominator;
        System.out.println("Result:");
        System.out.println("CORR = " + CORR);
    }
}
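I want to go into more detail about CallClass, because it is the class I mainly rely on for batch operation. It implements only part of what I want (otherwise I wouldn't call this post part one). Later I will split it into separate objects so the batch code can be reused.
This class reads an Excel document named “例11.6.xls”. The file comes from the worked examples of Statistics (统计学), 6th edition, China Renmin University Press. (This is not an ad; I'm only using it for revision.)
Which packages did I use? Here is the download link: https://www.apache.org/dyn/closer.lua/poi/release/bin/poi-bin-3.17-20170915.zip After unzipping the archive, find the POI jar and put it in the same directory as the code.
[Figure: screenshot of my run]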
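The reusable batch step mentioned for CallClass could take roughly the following shape. This is only a sketch under assumptions: the class name BatchPearson, the method names, and the idea of holding each spreadsheet column as a double[] are all hypothetical, and the data in main is made up. Reading the columns from Excel is left out so the example runs without POI.

```java
// Hypothetical sketch of a reusable batch helper; not the author's eventual design.
class BatchPearson {
    // Pearson coefficient of two equally long, non-empty arrays.
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        if (n == 0 || n != y.length) {
            throw new IllegalArgumentException("arrays must be non-empty and equally long");
        }
        double xSum = 0.0, ySum = 0.0;
        for (int i = 0; i < n; i++) {
            xSum += x[i];
            ySum += y[i];
        }
        double xMean = xSum / n;
        double yMean = ySum / n;
        double sxy = 0.0, sxx = 0.0, syy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - xMean;
            double dy = y[i] - yMean;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    // columns[k] holds every observation of variable k.
    // Returns a symmetric matrix whose entry [i][j] is the correlation of columns i and j.
    static double[][] correlationMatrix(double[][] columns) {
        int k = columns.length;
        double[][] r = new double[k][k];
        for (int i = 0; i < k; i++) {
            r[i][i] = 1.0; // by definition, assuming the column is not constant
            for (int j = i + 1; j < k; j++) {
                r[i][j] = pearson(columns[i], columns[j]);
                r[j][i] = r[i][j];
            }
        }
        return r;
    }

    public static void main(String[] args) {
        // Made-up columns: the second is 2x the first, the third is the first reversed
        double[][] columns = {
            {1, 2, 3},
            {2, 4, 6},
            {3, 2, 1}
        };
        double[][] r = correlationMatrix(columns);
        for (double[] row : r) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```

Working on double[] rather than List<String> avoids re-parsing every value on each pass. The Excel-reading code would then only need to fill one array per column before calling correlationMatrix.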