Package org.apache.poi.xssf.extractor
Class XSSFEventBasedExcelExtractor
- java.lang.Object
-
- org.apache.poi.xssf.extractor.XSSFEventBasedExcelExtractor
-
- All Implemented Interfaces:
Closeable, AutoCloseable, POITextExtractor, POIXMLTextExtractor, ExcelExtractor
- Direct Known Subclasses:
XSSFBEventBasedExcelExtractor
public class XSSFEventBasedExcelExtractor extends Object implements POIXMLTextExtractor, ExcelExtractor
Implementation of a text extractor for OOXML Excel files that uses SAX event-based parsing.
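To make "SAX event-based parsing" concrete, the following is a minimal sketch that uses only the JDK's SAX parser, not POI itself. It streams a simplified, hypothetical sheet XML fragment and emits cell text as the parser reports each element, without building an in-memory workbook model. The class name `SaxSheetTextSketch`, the `extract` helper, and the tab/newline output layout are assumptions made for illustration; POI's own handler also covers styles, comments, and number formatting, which are omitted here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative only: a JDK-only sketch of streaming text extraction from sheet XML.
class SaxSheetTextSketch {

    /**
     * Streams a simplified sheet XML and returns the cell text,
     * tab-separated within a row and newline-terminated per row.
     */
    static String extract(String sheetXml, List<String> sharedStrings, boolean formulasNotResults) {
        StringBuilder out = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder chars = new StringBuilder();
            private boolean sharedString;   // true when the cell has t="s"
            private String formula;         // text of <f>, if present
            private String value;           // text of <v>, if present
            private boolean firstCellInRow;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                chars.setLength(0); // start collecting text for the new element
                if ("row".equals(qName)) {
                    firstCellInRow = true;
                } else if ("c".equals(qName)) {
                    sharedString = "s".equals(atts.getValue("t"));
                    formula = null;
                    value = null;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                chars.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("f".equals(qName)) {
                    formula = chars.toString();
                } else if ("v".equals(qName)) {
                    value = chars.toString();
                } else if ("c".equals(qName)) {
                    String text;
                    if (formula != null && formulasNotResults) {
                        text = formula;                                    // formula instead of result
                    } else if (sharedString && value != null) {
                        text = sharedStrings.get(Integer.parseInt(value)); // index into shared strings
                    } else {
                        text = value == null ? "" : value;
                    }
                    if (!firstCellInRow) {
                        out.append('\t');
                    }
                    out.append(text);
                    firstCellInRow = false;
                } else if ("row".equals(qName)) {
                    out.append('\n');
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(sheetXml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sheet XML", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String xml = "<worksheet><sheetData>"
                + "<row><c t=\"s\"><v>0</v></c><c><v>2</v></c></row>"
                + "<row><c><f>B1*2</f><v>4</v></c></row>"
                + "</sheetData></worksheet>";
        System.out.print(extract(xml, Arrays.asList("Total"), false));
    }
}
```

Because each cell is appended to the output as soon as its closing tag is seen, memory use depends on the output size rather than on an object model of the whole sheet, which is the usual motivation for an event-based extractor.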
-
-
Nested Class Summary
Nested Classes (modifier and type, class):
protected class XSSFEventBasedExcelExtractor.SheetTextExtractor
-
Field Summary
Fields (modifier and type, field):
protected boolean concatenatePhoneticRuns
protected OPCPackage container
protected boolean formulasNotResults
protected boolean includeCellComments
protected boolean includeHeadersFooters
protected boolean includeSheetNames
protected boolean includeTextBoxes
protected Locale locale
protected POIXMLProperties properties
-
Constructor Summary
Constructors:
XSSFEventBasedExcelExtractor(String path)
XSSFEventBasedExcelExtractor(OPCPackage container)
-
Method Summary
Methods (modifier and type, method, description):
protected SharedStrings createSharedStringsTable(XSSFReader xssfReader, OPCPackage container)
POIXMLProperties.CoreProperties getCoreProperties()
    Returns the core document properties.
POIXMLProperties.CustomProperties getCustomProperties()
    Returns the custom document properties.
POIXMLDocument getDocument()
    Returns the opened document.
POIXMLProperties.ExtendedProperties getExtendedProperties()
    Returns the extended document properties.
OPCPackage getFilesystem()
boolean getFormulasNotResults()
boolean getIncludeCellComments()
boolean getIncludeHeadersFooters()
boolean getIncludeSheetNames()
boolean getIncludeTextBoxes()
Locale getLocale()
OPCPackage getPackage()
    Returns the opened OPCPackage container.
String getText()
    Processes the file and returns the text.
boolean isCloseFilesystem()
void processSheet(XSSFSheetXMLHandler.SheetContentsHandler sheetContentsExtractor, Styles styles, Comments comments, SharedStrings strings, InputStream sheetInputStream)
    Processes the given sheet.
void setCloseFilesystem(boolean doCloseFilesystem)
void setConcatenatePhoneticRuns(boolean concatenatePhoneticRuns)
    Concatenate text from <rPh> text elements in SharedStringsTable. Default is true.
void setFormulasNotResults(boolean formulasNotResults)
    Should we return the formula itself, and not the result it produces? Default is false.
void setIncludeCellComments(boolean includeCellComments)
    Should cell comments be included? Default is false.
void setIncludeHeadersFooters(boolean includeHeadersFooters)
    Should headers and footers be included? Default is true.
void setIncludeSheetNames(boolean includeSheetNames)
    Should sheet names be included? Default is true.
void setIncludeTextBoxes(boolean includeTextBoxes)
    Should text from textboxes be included? Default is true.
void setLocale(Locale locale)
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.poi.ooxml.extractor.POIXMLTextExtractor
checkMaxTextSize, close, getMetadataTextExtractor
-
-
-
-
Field Detail
-
container
protected final OPCPackage container
-
properties
protected final POIXMLProperties properties
-
locale
protected Locale locale
-
includeTextBoxes
protected boolean includeTextBoxes
-
includeSheetNames
protected boolean includeSheetNames
-
includeCellComments
protected boolean includeCellComments
-
includeHeadersFooters
protected boolean includeHeadersFooters
-
formulasNotResults
protected boolean formulasNotResults
-
concatenatePhoneticRuns
protected boolean concatenatePhoneticRuns
-
-
Constructor Detail
-
XSSFEventBasedExcelExtractor
public XSSFEventBasedExcelExtractor(String path) throws XmlException, OpenXML4JException, IOException
-
XSSFEventBasedExcelExtractor
public XSSFEventBasedExcelExtractor(OPCPackage container) throws XmlException, OpenXML4JException, IOException
-
-
Method Detail
-
setIncludeSheetNames
public void setIncludeSheetNames(boolean includeSheetNames)
Should sheet names be included? Default is true.
- Specified by:
- setIncludeSheetNames in interface ExcelExtractor
-
getIncludeSheetNames
public boolean getIncludeSheetNames()
- Returns:
- whether to include sheet names
- Since:
- 3.16-beta3
-
setFormulasNotResults
public void setFormulasNotResults(boolean formulasNotResults)
Should we return the formula itself, and not the result it produces? Default is false.
- Specified by:
- setFormulasNotResults in interface ExcelExtractor
-
getFormulasNotResults
public boolean getFormulasNotResults()
- Returns:
- whether to include formulas but not results
- Since:
- 3.16-beta3
-
setIncludeHeadersFooters
public void setIncludeHeadersFooters(boolean includeHeadersFooters)
Should headers and footers be included? Default is true.
- Specified by:
- setIncludeHeadersFooters in interface ExcelExtractor
-
getIncludeHeadersFooters
public boolean getIncludeHeadersFooters()
- Returns:
- whether or not to include headers and footers
- Since:
- 3.16-beta3
-
setIncludeTextBoxes
public void setIncludeTextBoxes(boolean includeTextBoxes)
Should text from textboxes be included? Default is true
-
getIncludeTextBoxes
public boolean getIncludeTextBoxes()
- Returns:
- whether or not to extract textboxes
- Since:
- 3.16-beta3
-
setIncludeCellComments
public void setIncludeCellComments(boolean includeCellComments)
Should cell comments be included? Default is false.
- Specified by:
- setIncludeCellComments in interface ExcelExtractor
-
getIncludeCellComments
public boolean getIncludeCellComments()
- Returns:
- whether cell comments should be included
- Since:
- 3.16-beta3
-
setConcatenatePhoneticRuns
public void setConcatenatePhoneticRuns(boolean concatenatePhoneticRuns)
Concatenate text from <rPh> text elements in SharedStringsTable. Default is true.
- Parameters:
- concatenatePhoneticRuns - true if runs should be concatenated, false otherwise
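As an illustration of what concatenating phonetic runs means, the sketch below parses a small, hypothetical shared-strings fragment with the JDK's SAX parser and either keeps or skips the text nested inside <rPh> elements. It is not POI's implementation: the class name `PhoneticRunsSketch`, the `readStrings` helper, and the choice to append the phonetic text directly after the base text with no separator are assumptions, and POI's actual output format may differ.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative only: shows the effect of including or skipping <rPh> text.
class PhoneticRunsSketch {

    /** Returns one string per <si> entry; <rPh> text is included only when requested. */
    static List<String> readStrings(String sstXml, boolean concatenatePhoneticRuns) {
        List<String> result = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder current = new StringBuilder();
            private boolean inText;      // inside a <t> element
            private boolean inPhonetic;  // inside an <rPh> element

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if ("si".equals(qName)) {
                    current.setLength(0);
                } else if ("rPh".equals(qName)) {
                    inPhonetic = true;
                } else if ("t".equals(qName)) {
                    inText = true;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Keep base text always; keep phonetic text only when the flag is set.
                if (inText && (!inPhonetic || concatenatePhoneticRuns)) {
                    current.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("t".equals(qName)) {
                    inText = false;
                } else if ("rPh".equals(qName)) {
                    inPhonetic = false;
                } else if ("si".equals(qName)) {
                    result.add(current.toString());
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(sstXml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse shared strings XML", e);
        }
        return result;
    }

    public static void main(String[] args) {
        // Base text is "Tokyo" in kanji; the phonetic run is its katakana reading.
        String xml = "<sst><si><t>\u6771\u4EAC</t>"
                + "<rPh sb=\"0\" eb=\"2\"><t>\u30C8\u30A6\u30AD\u30E7\u30A6</t></rPh></si>"
                + "<si><t>plain</t></si></sst>";
        System.out.println(readStrings(xml, true));
        System.out.println(readStrings(xml, false));
    }
}
```

With the flag set to false, only the base text of each entry is returned, which is typically what a search index wants when phonetic readings would otherwise duplicate the content.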
-
setLocale
public void setLocale(Locale locale)
-
getLocale
public Locale getLocale()
- Returns:
- locale
- Since:
- 3.16-beta3
-
getPackage
public OPCPackage getPackage()
Returns the opened OPCPackage container.
- Specified by:
- getPackage in interface POIXMLTextExtractor
- Returns:
- the opened OPCPackage
-
getCoreProperties
public POIXMLProperties.CoreProperties getCoreProperties()
Returns the core document properties.
- Specified by:
- getCoreProperties in interface POIXMLTextExtractor
- Returns:
- the core document properties
-
getExtendedProperties
public POIXMLProperties.ExtendedProperties getExtendedProperties()
Returns the extended document properties.
- Specified by:
- getExtendedProperties in interface POIXMLTextExtractor
- Returns:
- the extended document properties
-
getCustomProperties
public POIXMLProperties.CustomProperties getCustomProperties()
Returns the custom document properties.
- Specified by:
- getCustomProperties in interface POIXMLTextExtractor
- Returns:
- the custom document properties
-
processSheet
public void processSheet(XSSFSheetXMLHandler.SheetContentsHandler sheetContentsExtractor, Styles styles, Comments comments, SharedStrings strings, InputStream sheetInputStream) throws IOException, SAXException
Processes the given sheet.
- Throws:
- IOException
- SAXException
-
createSharedStringsTable
protected SharedStrings createSharedStringsTable(XSSFReader xssfReader, OPCPackage container) throws IOException, SAXException
- Throws:
- IOException
- SAXException
-
getText
public String getText()
Processes the file and returns the text.
- Specified by:
- getText in interface ExcelExtractor
- Specified by:
- getText in interface POITextExtractor
-
getDocument
public POIXMLDocument getDocument()
Description copied from interface: POIXMLTextExtractor
Returns the opened document.
- Specified by:
- getDocument in interface POITextExtractor
- Specified by:
- getDocument in interface POIXMLTextExtractor
- Returns:
- the opened document
-
setCloseFilesystem
public void setCloseFilesystem(boolean doCloseFilesystem)
- Specified by:
- setCloseFilesystem in interface POITextExtractor
-
isCloseFilesystem
public boolean isCloseFilesystem()
- Specified by:
- isCloseFilesystem in interface POITextExtractor
-
getFilesystem
public OPCPackage getFilesystem()
- Specified by:
- getFilesystem in interface POITextExtractor
-
-