Recently I was challenged to see if I could extract files that have no file header or footer from Microsoft Office documents. The files embedded in the documents were text files, configuration files, and source code files. I do not know why these files were placed in Office documents, except perhaps to hide them. The files were not obfuscated and could be extracted manually. Manual extraction works for a small set of documents, but it was not feasible for a large set. The documents were located on EnCase images, which made it easier for me to extract them with an EnScript. The files in the Office documents were stored as OLE objects. To view the OLE objects in an Office document within EnCase, right-click the document and select "View File Structure". This action is shown in Figure 1 below.
Figure 1 - Viewing Office Document "File Structure".
Viewing each document this way to see which OLE objects it contains is also a manual process. EnCase recently introduced a way to automate the "View File Structure" process: the File Mounter EnScript. This EnScript lets you view the embedded file structure of many file types, such as Thumbs.db, zip archives, and Office documents (YES). Running it to automatically mount these types of files looks like Figure 2 below.
Figure 2 - Mounting Office Documents with File Mounter EnScript.
Once all the Office documents have been mounted, you can extract all the OLE objects that are not picture files with my OLExtract EnScript. Get it below.
This EnScript looks for all the OLENative entries in the mounted documents and extracts them to your EnCase default export folder. The extracted files are grouped by the document they came from, and files with the same name are given an incrementing number. The results of running OLExtract will look similar to Figure 3 below.
Figure 3 - Results of OLExtract
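For readers without EnCase, the extract-and-export behavior described above can be roughly sketched in Python. This is not the OLExtract EnScript itself. It assumes the stream layout that open-source OLE parsers commonly use for Ole10Native data (a size field, a flags field, three NUL-terminated strings, then a length-prefixed payload), and the function names and the `_1`, `_2` suffix style are hypothetical choices made for illustration.

```python
from __future__ import annotations

import struct
from pathlib import Path


def read_cstring(buf: bytes, pos: int) -> tuple[str, int]:
    """Read a NUL-terminated string starting at pos; return (text, next_pos)."""
    end = buf.index(b"\x00", pos)
    return buf[pos:end].decode("latin-1"), end + 1


def parse_ole10native(stream: bytes) -> tuple[str, bytes]:
    """Return (embedded file name, payload) from an Ole10Native-style stream.

    Assumed layout: uint32 total size, uint16 flags, label, source path,
    two uint32 fields, temp path, uint32 payload size, payload.
    """
    pos = 4 + 2                                  # skip total size and flags
    label, pos = read_cstring(stream, pos)       # file name shown for the object
    _source, pos = read_cstring(stream, pos)     # original path (unused here)
    pos += 8                                     # skip two uint32 fields
    _temp, pos = read_cstring(stream, pos)       # temp path (unused here)
    (size,) = struct.unpack_from("<I", stream, pos)
    pos += 4
    payload = stream[pos:pos + size]
    if len(payload) != size:
        raise ValueError("payload is shorter than its declared size")
    return label, payload


def unique_path(folder: Path, name: str) -> Path:
    """Pick a path in folder that does not exist yet, adding _1, _2, ... if needed."""
    candidate = folder / name
    stem, suffix = candidate.stem, candidate.suffix
    counter = 1
    while candidate.exists():
        candidate = folder / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate


def export_object(export_root: Path, document_name: str, stream: bytes) -> Path:
    """Write one embedded file into a folder named after its source document."""
    label, payload = parse_ole10native(stream)
    folder = export_root / document_name
    folder.mkdir(parents=True, exist_ok=True)
    # Keep only the last path component so a label cannot escape the folder.
    safe_name = Path(label.replace("\\", "/")).name
    if safe_name in ("", ".", ".."):
        safe_name = "unnamed"
    target = unique_path(folder, safe_name)
    target.write_bytes(payload)
    return target
```

Calling `export_object` twice for the same document with two embedded files that are both named `config.ini` would produce `config.ini` and `config_1.ini` in that document's folder, which mirrors the grouping and incrementing described above.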
OLExtract will ignore JPEGs stored as OLE objects because they are stored differently than other types of files in the document. You can easily extract JPEGs from Office documents using file carving.
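As a rough illustration of what file carving means here, the sketch below scans raw bytes for the JPEG start-of-image marker (`FF D8 FF`) and the next end-of-image marker (`FF D9`). It is a deliberately naive header/footer carver with a hypothetical function name, not the method any particular forensic tool uses. A JPEG that contains an embedded thumbnail has an earlier end marker inside it, so this approach can cut such files short.

```python
from __future__ import annotations

SOI = b"\xff\xd8\xff"   # JPEG start-of-image marker plus the next marker prefix
EOI = b"\xff\xd9"       # JPEG end-of-image marker


def carve_jpegs(data: bytes) -> list[bytes]:
    """Return every byte run that starts at an SOI marker and ends at the next EOI."""
    carved = []
    pos = 0
    while True:
        start = data.find(SOI, pos)
        if start == -1:
            break                      # no more headers
        end = data.find(EOI, start + len(SOI))
        if end == -1:
            break                      # header without a footer; stop carving
        carved.append(data[start:end + len(EOI)])
        pos = end + len(EOI)           # continue after this footer
    return carved
```

To try it, read a document into memory with `open(path, "rb").read()`, pass the bytes to `carve_jpegs`, and write each returned item to its own `.jpg` file.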