How to handle null rows using Apache POI?
I am using Apache POI to read an xlsx file, and it works well. My question is how to handle rows that come back as null. My file contains 500 rows, but POI reports 105667 rows, and the rows beyond the data are null. The code I used:
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileNotFoundException;
    import java.io.IOException;
    import java.io.InputStream;
    import java.text.SimpleDateFormat;
    import java.util.logging.Level;
    import java.util.logging.Logger;
    import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
    import org.apache.poi.ss.usermodel.Cell;
    import org.apache.poi.ss.usermodel.DateUtil;
    import org.apache.poi.ss.usermodel.Row;
    import org.apache.poi.ss.usermodel.Sheet;
    import org.apache.poi.ss.usermodel.Workbook;
    import org.apache.poi.ss.usermodel.WorkbookFactory;
    import org.apache.poi.xssf.usermodel.XSSFRow;
    import org.apache.poi.xssf.usermodel.XSSFSheet;
    import org.apache.poi.xssf.usermodel.XSSFWorkbook;

    /**
     *
     * @author SAMEEK
     */
    public class readXLSXFile {

        public int getNumberOfColumn(String fileName, int sheetIndex) throws FileNotFoundException, IOException {
            File inputFile = null;
            FileInputStream fis = null;
            XSSFWorkbook workbook = null;
            XSSFSheet sheet = null;
            XSSFRow row = null;
            int lastRowNum = 0;
            int lastCellNum = 0;

            // Open the workbook
            inputFile = new File(fileName);
            fis = new FileInputStream(inputFile);
            workbook = new XSSFWorkbook(fis);
            sheet = workbook.getSheetAt(sheetIndex);
            lastRowNum = sheet.getLastRowNum();
            for (int i = 0; i < lastRowNum; i++) {
                row = sheet.getRow(i);
                if (row != null) {
                    if (row.getLastCellNum() > lastCellNum) {
                        lastCellNum = row.getLastCellNum();
                    }
                }
            }
            return lastCellNum;
        }

        public int getNumberOfRow(String fileName, int sheetIndex) throws FileNotFoundException, IOException {
            File inputFile = null;
            FileInputStream fis = null;
            XSSFWorkbook workbook = null;
            XSSFSheet sheet = null;
            int lastRowNum = 0;

            // Open the workbook
            inputFile = new File(fileName);
            fis = new FileInputStream(inputFile);
            workbook = new XSSFWorkbook(fis);
            sheet = workbook.getSheetAt(sheetIndex);
            lastRowNum = sheet.getLastRowNum();
            return lastRowNum;
        }

        public String[] getSheetName(String fileName) throws FileNotFoundException, IOException {
            int totalsheet = 0;
            int i = 0;
            String[] sheetName = null;
            File inputFile = null;
            FileInputStream fis = null;
            XSSFWorkbook workbook = null;

            // Open the workbook
            inputFile = new File(fileName);
            fis = new FileInputStream(inputFile);
            workbook = new XSSFWorkbook(fis);
            totalsheet = workbook.getNumberOfSheets();
            sheetName = new String[totalsheet];
            while (i < totalsheet) {
                sheetName[i] = workbook.getSheetName(i);
                i++;
            }
            return sheetName;
        }

        public int getNumberOfSheet(String fileName) throws FileNotFoundException, IOException {
            int totalsheet = 0;
            File inputFile = null;
            FileInputStream fis = null;
            XSSFWorkbook workbook = null;
            XSSFSheet sheet = null;
            int lastRowNum = 0;

            // Open the workbook
            inputFile = new File(fileName);
            fis = new FileInputStream(inputFile);
            workbook = new XSSFWorkbook(fis);
            totalsheet = workbook.getNumberOfSheets();
            return totalsheet;
        }

        public String[][] getSheetData(String fileName, int sheetIndex) throws FileNotFoundException, IOException, InvalidFormatException {
            String[][] data = null;
            int i = 0;
            int j = 0;
            Cell cell = null;
            long emptyrowcount = 0;
            InputStream inputStream = new FileInputStream(fileName);

            // Create a workbook object.
            Workbook wb = WorkbookFactory.create(inputStream);
            wb.setMissingCellPolicy(Row.CREATE_NULL_AS_BLANK);
            Sheet sheet = wb.getSheetAt(sheetIndex);

            // Iterate over all the row and cells
            int noOfColumns = getNumberOfColumn(fileName, sheetIndex);
            System.out.println("noOfColumns::" + noOfColumns);
            int noOfRows = getNumberOfRow(fileName, sheetIndex) + 1;
            System.out.println("noOfRows::" + noOfRows);
            data = new String[noOfRows][noOfColumns];
            for (int k = 0; k < noOfRows; k++) {
                Row row = sheet.getRow(k);
                if (row == null) {
                } else {
                    j = 0;
                    for (int l = 0; l < noOfColumns; l++) {
                        // Cell cell = cit.next();
                        cell = row.getCell(j);
                        if (cell.getCellType() == cell.CELL_TYPE_BLANK) {
                            cell = row.getCell(j, Row.CREATE_NULL_AS_BLANK);
                        }
                        data[i][j] = getCellValueAsString(cell);
                        j++;
                    }
                    i++;
                }
            }
            return data;
        }

        /**
         * This method for the type of data in the cell, extracts the data and
         * returns it as a string.
         */
        public static String getCellValueAsString(Cell cell) {
            String strCellValue = null;
            if (cell != null) {
                switch (cell.getCellType()) {
                    case Cell.CELL_TYPE_STRING:
                        strCellValue = cell.toString();
                        break;
                    case Cell.CELL_TYPE_NUMERIC:
                        if (DateUtil.isCellDateFormatted(cell)) {
                            SimpleDateFormat dateFormat = new SimpleDateFormat("dd/MM/yyyy");
                            strCellValue = dateFormat.format(cell.getDateCellValue());
                        } else {
                            Double value = cell.getNumericCellValue();
                            Long longValue = value.longValue();
                            strCellValue = new String(longValue.toString());
                        }
                        break;
                    case Cell.CELL_TYPE_BOOLEAN:
                        strCellValue = new String(new Boolean(cell.getBooleanCellValue()).toString());
                        break;
                    case Cell.CELL_TYPE_BLANK:
                        strCellValue = "";
                        break;
                }
            }
            return strCellValue;
        }

        public static void main(String s[]) {
            try {
                readXLSXFile readXLSxFile = new readXLSXFile();
                String[][] sheetData = readXLSxFile.getSheetData("F:/work.xlsx", 0);
                int columnLength = 0;
                columnLength = readXLSxFile.getNumberOfColumn("F:/work.xlsx", 0);
                int rowLength = 0;
                rowLength = readXLSxFile.getNumberOfRow("F:/work.xlsx", 0);
                int h = 0;
                int j = 0;
                while (j < rowLength) {
                    h = 0;
                    while (h < columnLength) {
                        System.out.print("\t " + sheetData[j][h]);
                        h++;
                    }
                    System.out.println("");
                    j++;
                }
            } catch (InvalidFormatException ex) {
                Logger.getLogger(readXLSFile.class.getName()).log(Level.SEVERE, null, ex);
            } catch (FileNotFoundException ex) {
                Logger.getLogger(readXLSFile.class.getName()).log(Level.SEVERE, null, ex);
            } catch (IOException ex) {
                Logger.getLogger(readXLSFile.class.getName()).log(Level.SEVERE, null, ex);
            }
        }
    }
How can I handle null rows in an Excel sheet?
Source: https://stackoverflow.com/questions/9171861
Best answer
Your solution is certainly awkward in its use (and abuse) of monads:
- It is usual to build monads piecemeal by stacking several transformers
- It is less usual, but still happens sometimes, to stack several states
- It is very unusual to stack several Maybe transformers
- It is even more unusual to use MaybeT to interrupt a loop
Your code is a bit too pointless (point-free):
    (`when` mzero) . isJust =<< runMaybeT (mapM_ f bases)
instead of the easier to read
    isHappy <- isJust <$> runMaybeT (mapM_ f bases)
    when isHappy mzero
Focusing now on function solve1, let us simplify it. An easy way to do so is to remove the inner MaybeT monad. Instead of a forever loop which breaks when a happy number is found, you can go the other way around and recurse only if the number is not happy.
Moreover, you don't really need the State monad either, do you? One can always replace the state with an explicit argument.
Applying these ideas solve1 now looks much better:
    solve1 :: [Integer] -> IsHappyMemo Integer
    solve1 bases = go 2
      where
        go i = do
            happyBases <- mapM (\b -> isHappy Set.empty b i) bases
            if and happyBases
                then return i
                else go (i+1)
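The `solve1` above relies on `isHappy` and `IsHappyMemo` from the original question, which are not reproduced here. As a self-contained sketch of the same two ideas (recurse only while the candidate is not happy, and pass the visited set as an explicit argument instead of keeping it in a State monad), the following uses hypothetical, unmemoized helpers `digitSquareSum`, `isHappyIn` and `smallestHappy`. These names are assumptions for illustration, not the asker's code.

```haskell
import qualified Data.Set as Set

-- Sum of the squares of the digits of n when written in base b.
digitSquareSum :: Integer -> Integer -> Integer
digitSquareSum b = go 0
  where
    go acc 0 = acc
    go acc n = let (q, r) = n `quotRem` b in go (acc + r * r) q

-- A positive number is happy in base b if iterating digitSquareSum reaches 1.
-- The set of values already seen is an explicit argument, not monadic state.
isHappyIn :: Integer -> Integer -> Bool
isHappyIn b = go Set.empty
  where
    go seen n
      | n == 1 = True
      | n `Set.member` seen = False -- entered a cycle that never reaches 1
      | otherwise = go (Set.insert n seen) (digitSquareSum b n)

-- Smallest integer >= 2 that is happy in every given base.
-- Plain recursion replaces the "forever loop broken by MaybeT" pattern.
smallestHappy :: [Integer] -> Integer
smallestHappy bases = go 2
  where
    go i
      | all (\b -> isHappyIn b i) bases = i
      | otherwise = go (i + 1)
```

In GHCi, `smallestHappy [10]` evaluates to `7` and `smallestHappy [2, 3]` to `3`. This version recomputes every chain from scratch, so it trades the memo cache of the original for simplicity.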
I would be more than happy with that code. The rest of your solution is fine. One thing that bothers me is that you throw away the memo cache for every subproblem. Is there a reason for that?
    solve :: [String] -> String
    solve =
        concat .
        (`evalState` Map.empty) .
        mapM f .
        zip [1 :: Integer ..]
      where
        f (idx, prob) = do
            s <- solve1 . map read . words $ prob
            return $ "Case #" ++ show idx ++ ": " ++ show s ++ "\n"
Wouldn't your solution be more efficient if you reused it instead?
    solve :: [String] -> String
    solve cases = (`evalState` Map.empty) $ do
        solutions <- mapM f (zip [1 :: Integer ..] cases)
        return (unlines solutions)
      where
        f (idx, prob) = do
            s <- solve1 . map read . words $ prob
            return $ "Case #" ++ show idx ++ ": " ++ show s
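To make the difference between a fresh cache per case and one shared cache concrete, here is a minimal sketch that assumes a simplified base-10 memo (`Map Integer Bool`) rather than the asker's `IsHappyMemo`. The names `Memo`, `isHappyM`, `freshCachePerCase` and `sharedCache` are hypothetical. It uses `Control.Monad.State` from mtl and `Data.Map` from containers, the same libraries the answer's code already depends on.

```haskell
import Control.Monad.State (State, evalState, gets, modify)
import qualified Data.Map as Map

-- Cache of base-10 happiness results, threaded through State.
type Memo = State (Map.Map Integer Bool)

-- Sum of the squares of the decimal digits of n.
digitSquareSum10 :: Integer -> Integer
digitSquareSum10 0 = 0
digitSquareSum10 n = let (q, r) = n `quotRem` 10 in r * r + digitSquareSum10 q

-- Memoized happiness test for positive integers in base 10.
isHappyM :: Integer -> Memo Bool
isHappyM 1 = return True
isHappyM 4 = return False -- in base 10 every unhappy number reaches 4
isHappyM n = do
  cached <- gets (Map.lookup n)
  case cached of
    Just answer -> return answer
    Nothing -> do
      answer <- isHappyM (digitSquareSum10 n)
      modify (Map.insert n answer)
      return answer

-- Each case starts from an empty cache, so nothing is reused between cases.
freshCachePerCase :: [Integer] -> [Bool]
freshCachePerCase = map (\n -> evalState (isHappyM n) Map.empty)

-- One evalState around the whole mapM: later cases see earlier results.
sharedCache :: [Integer] -> [Bool]
sharedCache ns = evalState (mapM isHappyM ns) Map.empty
```

Both functions return the same answers. The only difference is where `evalState` sits: wrapping the entire `mapM` keeps a single cache alive across all cases, which is the change the second `solve` above makes.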