Part of your problem can be explained far better by a reply that Josh made to
a similar sort of question that was concerned with floats becoming doubles;
"For clarity, here is the code which seems to reproduce your problem:
HSSFWorkbook wb = new HSSFWorkbook();
wb.createSheet("Sheet1").createRow(0).createCell(0).setCellValue(34.44F); //
F for float literal
OutputStream os = new FileOutputStream("Test34.44.xls");
wb.write(os);
os.close();
In Excel(2007) the cell value displays as 34.4399986267089.
POI uses double (forcing a conversion from float) because that's what Excel
needs.
The root of the problem is that many seemingly simple decimal values cannot
be representable exactly with an IEEE floating point (for example 0.01).
In this example, the float->double conversion seems to be the main source of
the precision loss, as seen with the following code:
System.out.println((double)34.44F); // outputs "34.439998626708984"
It's not just in type conversion that these floating point rounding
limitations can be seen. The following code uses only doubles:
System.out.println(0.05+0.01); // outputs "0.060000000000000005"
Since, POI/Excel uses doubles, doubles have higher precision than floats and
conversions are especially prone to rounding errors, your best bet is to use
doubles as much as possible. This won't eliminate all rounding errors, but
should improve things a lot. "
The reason I quoted that is that you may not be able to get exactly the
number the user entered owing to this sort of problem; but you should be
able to get very close.
Firstly, can I ask why you are extracting the value from the cell as a
String? Typically, you would do something like this, an example I have
copied from the Quick Guide here
http://poi.apache.org/spreadsheet/quick-guide.html#CellContents
http://poi.apache.org/spreadsheet/quick-guide.html#CellContents ;
Sheet sheet1 = wb.getSheetAt(0);
for (Row row : sheet1) {
for (Cell cell : row) {
CellReference cellRef = new CellReference(row.getRowNum(),
cell.getCellNum());
System.out.print(cellRef.formatAsString());
System.out.print(" - ");
switch(cell.getCellType()) {
case Cell.CELL_TYPE_STRING:
System.out.println(cell.getRichStringCellValue().getString());
break;
case Cell.CELL_TYPE_NUMERIC:
if(DateUtil.isCellDateFormatted(cell)) {
System.out.println(cell.getDateCellValue());
} else {
System.out.println(cell.getNumericCellValue());
}
break;
case Cell.CELL_TYPE_BOOLEAN:
System.out.println(cell.getBooleanCellValue());
break;
case Cell.CELL_TYPE_FORMULA:
System.out.println(cell.getCellFormula());
break;
default:
System.out.println();
}
}
}
As you can see, the basic process is to test the cell's type and call the
appropriate method to retrieve the contents as a double, String, etc. Excel
stores - at least I believe it does - numeric values as doubles, so this is
likely to get you as close as possible to the value actually stored in the
cell. It could be though that this is not the exact value the user entered
owing to the issues Josh outlined. I would guess the only way to be certain
to get excatly the value the user entered would be to use numeric String
literal values throughout.
Turning now to seeing the formatted value of the cell, you should not have
to provide the formatting String. It should be possible to call the
formatCellValue() method of the DataFormater class directly, passing the
reference to the cell and receiving a String containing the cell's contents
correctly formatted. I am not at my own machine so cannot test this piece of
code but it should work;
// Assuming that you have opened a Workbook, got the reference to a sheet,
row and then
// a cell, storing the latter into a variable called cell you should be able
to do something like this;
org.apache.poi.ss.usermodel.DataFormatter formatter = new
org.apache.poi.ss.usermodel.DataFormatter();
System.out.println(formatter.formatCellValue(cell));
That method call should use the data format that was applied to the cell
when it was added to the sheet. It means you do not need to supply the data
format String.
If this code fails, please let me know. I will be home later and can try to
put something better together. We are an Office free zone at work now -
thankfully - and so I cannot test anything here with absolute confidence,
sorry.
Post by Frank PrinsDear all,
I would like to make sure if I am correct on the number formatting, when
reading from Excel with the HSSF libraries. I am reading an xls file,
created with Excel 2003, and wonder about the possibilities to extract
the numbers from a sheet.
Suppose a user enters a number like '3.1415' in a cell, which has a
number formatting with two decimals, there are several manners of state
- the floating point number which Excel uses internally, could be
something like '3.14150000003'
- the formatted number, shown in Excel, '3.14'
- the original number, '3.1415'
Am I correct in saying, both the first two occurrences can be retrieved
a) String value = cell.toString();
b) HSSFDataFormatter formatter = new HSSFDataFormatter();
String value = formatter.formatCellValue(cell);
c) HSSFDataFormatter formatter = new HSSFDataFormatter();
double tariffDoubleValue = cell.getNumericCellValue();
String value = formatter.formatRawCellContents(tariffDoubleValue, 0,
"#.####");
Using the last possibility, you need to make the assumption that all
numbers are entered using the format "#,####"
Am I correct in this approach, or is there a possibility to retrieve the
original entered number? Tried to find some an answers in the list
archive and api, but could not find it.
thanks & regards,
Frank
--
View this message in context: http://www.nabble.com/Number-Formatting-in-HSSF-tp22408718p22413448.html
Sent from the POI - User mailing list archive at Nabble.com.