Package org.apache.poi.hwpf
Class HWPFDocumentCore
- java.lang.Object
-
- org.apache.poi.POIDocument
-
- org.apache.poi.hwpf.HWPFDocumentCore
-
- All Implemented Interfaces:
Closeable, AutoCloseable
- Direct Known Subclasses:
HWPFDocument, HWPFOldDocument
public abstract class HWPFDocumentCore extends POIDocument
This class holds much of the core of a Word document, but without some of the table structure information. You generally want to work with one of HWPFDocument or HWPFOldDocument.
-
-
Field Summary
Fields (Modifier and Type, Field, Description):
- protected CHPBinTable _cbt: Contains formatting properties for text
- protected FileInformationBlock _fib: The FIB
- protected FontTable _ft: Holds fonts for this document.
- protected ListTables _lt: Holds list tables
- protected byte[] _mainStream: Main document stream buffer
- protected ObjectPoolImpl _objectPool: Holds OLE2 objects
- protected PAPBinTable _pbt: Contains formatting properties for paragraphs
- protected StyleSheet _ss: Holds styles for this document.
- protected SectionTable _st: Contains formatting properties for sections.
- protected static int FIB_BASE_LEN: Size of the unencrypted part of the FIB
- protected static int RC4_REKEYING_INTERVAL: [MS-DOC] 2.2.6.2/3 Office Binary Document ...
- protected static String STREAM_OBJECT_POOL
- protected static String STREAM_TABLE_0
- protected static String STREAM_TABLE_1
- protected static String STREAM_WORD_DOCUMENT
-
Constructor Summary
Constructors (Modifier, Constructor, Description):
- protected HWPFDocumentCore()
- HWPFDocumentCore(InputStream istream): This constructor loads a Word document from an InputStream.
- HWPFDocumentCore(DirectoryNode directory): This constructor loads a Word document from a specific point in a POIFSFileSystem, probably not the default.
- HWPFDocumentCore(POIFSFileSystem pfilesystem): This constructor loads a Word document from a POIFSFileSystem
-
Method Summary
Methods (Modifier and Type, Method, Description):
- CHPBinTable getCharacterTable()
- protected byte[] getDocumentEntryBytes(String name, int encryptionOffset, int len): Reads OLE Stream into byte array - if an EncryptionInfo is available, decrypt the bytes starting at encryptionOffset.
- String getDocumentText(): Returns document text, i.e. text information from all text pieces, including OLE descriptions and field codes
- EncryptionInfo getEncryptionInfo()
- FileInformationBlock getFileInformationBlock()
- FontTable getFontTable()
- ListTables getListTables()
- byte[] getMainStream()
- static int getMaxRecordLength()
- ObjectsPool getObjectsPool()
- abstract Range getOverallRange(): Returns the range that covers all text in the file, including main text, footnotes, headers and comments
- PAPBinTable getParagraphTable()
- abstract Range getRange(): Returns the range which covers the whole of the document, but excludes any headers and footers.
- SectionTable getSectionTable()
- StyleSheet getStyleSheet()
- abstract StringBuilder getText(): Internal method to access document text
- abstract TextPieceTable getTextTable()
- static void setMaxRecordLength(int length)
- protected void updateEncryptionInfo()
- static POIFSFileSystem verifyAndBuildPOIFS(InputStream istream): Takes an InputStream, verifies that it's not RTF or PDF, builds a POIFSFileSystem from it, and returns that.
-
Methods inherited from class org.apache.poi.POIDocument
clearDirectory, close, createInformationProperties, getDirectory, getDocumentSummaryInformation, getEncryptedPropertyStreamName, getPropertySet, getPropertySet, getSummaryInformation, initDirectory, readProperties, replaceDirectory, validateInPlaceWritePossible, write, write, write, writeProperties, writeProperties, writeProperties
-
-
-
-
Field Detail
-
STREAM_OBJECT_POOL
protected static final String STREAM_OBJECT_POOL
- See Also:
- Constant Field Values
-
STREAM_WORD_DOCUMENT
protected static final String STREAM_WORD_DOCUMENT
- See Also:
- Constant Field Values
-
STREAM_TABLE_0
protected static final String STREAM_TABLE_0
- See Also:
- Constant Field Values
-
STREAM_TABLE_1
protected static final String STREAM_TABLE_1
- See Also:
- Constant Field Values
-
FIB_BASE_LEN
protected static final int FIB_BASE_LEN
Size of the unencrypted part of the FIB
- See Also:
- Constant Field Values
-
RC4_REKEYING_INTERVAL
protected static final int RC4_REKEYING_INTERVAL
[MS-DOC] 2.2.6.2/3 Office Binary Document ... Encryption: "... The block number MUST be set to zero at the beginning of the stream and MUST be incremented at each 512 byte boundary. ..."
- See Also:
- Constant Field Values
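The block-numbering rule quoted above can be illustrated with a small standalone sketch. It is not POI code: the class `Rc4BlockSketch`, its methods, and the constant value 512 (taken from the quoted specification text, and only assumed to equal RC4_REKEYING_INTERVAL) are hypothetical names used for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only; not part of Apache POI. */
final class Rc4BlockSketch {
    // Assumption: mirrors the 512-byte boundary quoted from [MS-DOC].
    static final int REKEYING_INTERVAL = 512;

    /** Block number for an absolute stream offset: zero at the start, +1 at each boundary. */
    static int blockNumber(long streamOffset) {
        return (int) (streamOffset / REKEYING_INTERVAL);
    }

    /**
     * Splits the byte range [start, start + length) into runs that each stay
     * inside one block. Each entry is {blockNumber, offsetInStream, runLength}.
     */
    static List<long[]> splitByBlock(long start, int length) {
        List<long[]> runs = new ArrayList<>();
        long pos = start;
        long end = start + length;
        while (pos < end) {
            // First offset that belongs to the next block.
            long blockEnd = (pos / REKEYING_INTERVAL + 1) * REKEYING_INTERVAL;
            long runEnd = Math.min(blockEnd, end);
            runs.add(new long[] { blockNumber(pos), pos, runEnd - pos });
            pos = runEnd;
        }
        return runs;
    }
}
```

A cipher that is re-keyed per block would process each run returned by `splitByBlock` with the key derived for that run's block number.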
-
_objectPool
protected ObjectPoolImpl _objectPool
Holds OLE2 objects
-
_fib
protected FileInformationBlock _fib
The FIB
-
_ss
protected StyleSheet _ss
Holds styles for this document.
-
_cbt
protected CHPBinTable _cbt
Contains formatting properties for text
-
_pbt
protected PAPBinTable _pbt
Contains formatting properties for paragraphs
-
_st
protected SectionTable _st
Contains formatting properties for sections.
-
_ft
protected FontTable _ft
Holds fonts for this document.
-
_lt
protected ListTables _lt
Holds list tables
-
_mainStream
protected byte[] _mainStream
Main document stream buffer
-
-
Constructor Detail
-
HWPFDocumentCore
protected HWPFDocumentCore()
-
HWPFDocumentCore
public HWPFDocumentCore(InputStream istream) throws IOException
This constructor loads a Word document from an InputStream.
- Parameters:
istream - The InputStream that contains the Word document.
- Throws:
IOException - If there is an unexpected IOException from the passed in InputStream.
-
HWPFDocumentCore
public HWPFDocumentCore(POIFSFileSystem pfilesystem) throws IOException
This constructor loads a Word document from a POIFSFileSystem.
- Parameters:
pfilesystem - The POIFSFileSystem that contains the Word document.
- Throws:
IOException - If there is an unexpected IOException from the passed in POIFSFileSystem.
-
HWPFDocumentCore
public HWPFDocumentCore(DirectoryNode directory) throws IOException
This constructor loads a Word document from a specific point in a POIFSFileSystem, probably not the default. Used typically to open embedded documents.
- Parameters:
directory - The DirectoryNode that contains the Word document.
- Throws:
IOException - If there is an unexpected IOException from the passed in POIFSFileSystem.
-
-
Method Detail
-
setMaxRecordLength
public static void setMaxRecordLength(int length)
- Parameters:
length - the max record length allowed for HWPFDocumentCore
-
getMaxRecordLength
public static int getMaxRecordLength()
- Returns:
- the max record length allowed for HWPFDocumentCore
-
verifyAndBuildPOIFS
public static POIFSFileSystem verifyAndBuildPOIFS(InputStream istream) throws IOException
Takes an InputStream, verifies that it's not RTF or PDF, builds a POIFSFileSystem from it, and returns that.
- Throws:
IOException
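The kind of check described for verifyAndBuildPOIFS can be sketched with the standard library alone by peeking at the leading bytes of the stream. This is an illustration of the idea, not POI's implementation: the class `HeaderSniffSketch`, its methods, and the choice to reject with an IllegalArgumentException are assumptions made for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Illustrative only; not part of Apache POI. */
final class HeaderSniffSketch {
    private static final byte[] RTF = "{\\rtf".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] PDF = "%PDF".getBytes(StandardCharsets.US_ASCII);
    // Signature at the start of an OLE2 compound file.
    private static final byte[] OLE2 = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    /** Classifies the leading bytes of a file as "RTF", "PDF", "OLE2" or "UNKNOWN". */
    static String classify(byte[] head) {
        if (startsWith(head, RTF)) return "RTF";
        if (startsWith(head, PDF)) return "PDF";
        if (startsWith(head, OLE2)) return "OLE2";
        return "UNKNOWN";
    }

    /** Peeks at up to 8 bytes, rewinds the stream, and rejects RTF and PDF input. */
    static void requireNotRtfOrPdf(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(8);
        byte[] head = new byte[8];
        int n = 0;
        while (n < head.length) {
            int r = in.read(head, n, head.length - n);
            if (r < 0) break; // end of stream before 8 bytes were read
            n += r;
        }
        in.reset(); // leave the stream positioned at its start for the real parser
        String kind = classify(Arrays.copyOf(head, n));
        if (kind.equals("RTF") || kind.equals("PDF")) {
            throw new IllegalArgumentException("The document is really a " + kind + " file");
        }
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

Wrapping the input in a `java.io.BufferedInputStream` before calling `requireNotRtfOrPdf` gives a stream that supports mark/reset.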
-
getRange
public abstract Range getRange()
Returns the range which covers the whole of the document, but excludes any headers and footers.
-
getOverallRange
public abstract Range getOverallRange()
Returns the range that covers all text in the file, including main text, footnotes, headers and comments
-
getDocumentText
public String getDocumentText()
Returns document text, i.e. text information from all text pieces, including OLE descriptions and field codes
-
getText
@Internal public abstract StringBuilder getText()
Internal method to access document text
-
getCharacterTable
public CHPBinTable getCharacterTable()
-
getParagraphTable
public PAPBinTable getParagraphTable()
-
getSectionTable
public SectionTable getSectionTable()
-
getStyleSheet
public StyleSheet getStyleSheet()
-
getListTables
public ListTables getListTables()
-
getFontTable
public FontTable getFontTable()
-
getFileInformationBlock
public FileInformationBlock getFileInformationBlock()
-
getObjectsPool
public ObjectsPool getObjectsPool()
-
getTextTable
public abstract TextPieceTable getTextTable()
-
getMainStream
@Internal public byte[] getMainStream()
-
getEncryptionInfo
public EncryptionInfo getEncryptionInfo() throws IOException
- Overrides:
getEncryptionInfo in class POIDocument
- Throws:
IOException
-
updateEncryptionInfo
protected void updateEncryptionInfo()
-
getDocumentEntryBytes
protected byte[] getDocumentEntryBytes(String name, int encryptionOffset, int len) throws IOException
Reads an OLE stream into a byte array. If an EncryptionInfo is available, the bytes are decrypted starting at encryptionOffset. If encryptionOffset is -1, no decryption is attempted.
- Parameters:
name - the name of the stream
encryptionOffset - the offset from which to start decrypting, use -1 for no decryption
len - length of the bytes to be read, use Integer.MAX_VALUE for all bytes
- Returns:
- the read bytes
- Throws:
IOException - if the stream can't be found
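The parameter contract described for getDocumentEntryBytes can be modelled in a few lines of standalone code. This is a hedged sketch, not POI's implementation: `PartialDecryptSketch`, `readEntry` and `xorStandIn` are hypothetical names, and the XOR function is only a stand-in for a real decryptor. A real stream cipher would typically depend on the absolute position in the stream, which this position-independent stand-in ignores.

```java
import java.util.Arrays;
import java.util.function.UnaryOperator;

/** Illustrative only; not part of Apache POI. */
final class PartialDecryptSketch {

    /**
     * Returns at most len bytes of stored. Bytes before encryptionOffset are
     * copied as-is; bytes from encryptionOffset onward go through decryptor.
     * An encryptionOffset of -1 (or a null decryptor) means no decryption.
     */
    static byte[] readEntry(byte[] stored, int encryptionOffset, int len,
                            UnaryOperator<byte[]> decryptor) {
        int n = Math.min(len, stored.length); // Integer.MAX_VALUE reads everything
        byte[] out = Arrays.copyOf(stored, n);
        if (encryptionOffset < 0 || decryptor == null || encryptionOffset >= n) {
            return out;
        }
        byte[] tail = Arrays.copyOfRange(out, encryptionOffset, n);
        byte[] plain = decryptor.apply(tail);
        System.arraycopy(plain, 0, out, encryptionOffset,
                Math.min(plain.length, n - encryptionOffset));
        return out;
    }

    /** Hypothetical stand-in for a cipher: XORs every byte with a fixed key. */
    static UnaryOperator<byte[]> xorStandIn(byte key) {
        return in -> {
            byte[] result = in.clone();
            for (int i = 0; i < result.length; i++) {
                result[i] ^= key;
            }
            return result;
        };
    }
}
```

Passing the unencrypted header length as `encryptionOffset` keeps that header readable while the rest of the stream is decrypted.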
-
-