As part of Help Builder, one of the things I do is export Help documents to Word format. This works pretty well, as Word is for the most part smart enough to translate the HTML into a workable and almost nicely formatted document.

 

A simple way to do this is to open the document in Word and then copy the entire selection and paste it into another document that contains the proper format template to export to. It looks something like this (using Visual FoxPro 8 code):

 

llError = .F.
TRY
   *** Start Word invisibly through COM Automation
   oWord = CREATEOBJECT("Word.Application")
   oWord.Visible = .F.
   DOEVENTS
CATCH
   MESSAGEBOX("Unable to load the Word COM object:" + CHR(13) + CHR(13) + ;
              MESSAGE())
   llError = .T.
ENDTRY

*** Bail out if Word could not be started
IF llError
   RETURN
ENDIF

 

*** Start by loading the HTML file
oDoc = oWord.Documents.Open(lcFile)
DOEVENTS

*** Select and copy the whole thing to the clipboard
oWord.Selection.WholeStory()
oWord.Selection.Copy()
oDoc.Close()

*** Copy the template file
COPY FILE (THISFORM.oHelp.cProjPath + ;
           "templates\msword\helpbuildertemplate.doc") TO ;
          (FORCEEXT(lcFile, "doc"))

*** Open the copy, paste the clipboard content into it, and save
oDoc = oWord.Documents.Open(FORCEEXT(lcFile, "doc"))
oWord.Selection.Paste()
oDoc.SaveAs(FORCEEXT(lcFile, "doc"))

 

This works great with one exception: all the images in the document are treated as external to it, i.e. linked images that must be present on disk. What I really need, though, are images that are embedded into the document.

 

After much painful research through the woefully incomplete docs for Office Automation, I found a way to embed ‘most’ images easily (I’ll come back to the ‘most’ part shortly, as it is the reason for this rant). The following is a VBA macro I created that I call from the VFP code:

 

' This macro replaces external image links with
' embedded images so the document is self-contained.
' It must be run while the image files are still in place.
Sub ReplaceImages()
   Dim oField As Field

   For Each oField In ActiveDocument.Fields
      If oField.Type = wdFieldIncludePicture Then
         ' Store the picture data in the document instead of only linking to it
         oField.LinkFormat.SavePictureWithDocument = True
      End If
   Next
End Sub

 

It took me a while to find the SavePictureWithDocument option. It determines whether images are embedded in the document or live externally as linked files.

 

Well, after some back and forth I found out that the above approach is simple but causes major problems with large documents. On small documents it works well, but on larger ones (where “large” refers to the images being embedded) memory usage goes through the roof and Word locks up. So… back to the drawing board and some adjustments to code I originally used, which is more complex but interactively removes each image and then puts it back into the document as an embedded image:

 

'***************************************************************************
'*** ReplaceImages
'*****************
' This macro replaces external image links with
' embedded images so the document is self-contained.
' It must be run while the image files are still in place.
Sub ReplaceImages()
   Dim oField As Field
   Dim lcPath As String, lcText As String, lcCode As String
   Dim lnLoc1 As Long, lnLoc2 As Long

   ' Folder of the active document, including the trailing backslash
   lcPath = ActiveDocument.FullName
   lcPath = Left(lcPath, InStrRev(lcPath, "\"))

   For Each oField In ActiveDocument.Fields
      If oField.Type = wdFieldIncludePicture Then
         ' The image path is the first quoted string in the field code
         lcText = oField.Code.Text
         lnLoc1 = InStr(1, lcText, Chr(34))
         lnLoc2 = InStr(lnLoc1 + 1, lcText, Chr(34))
         lcCode = Mid(lcText, lnLoc1 + 1, lnLoc2 - 1 - lnLoc1)

         ' Normalize slashes and the doubled backslashes used in field codes
         lcCode = Replace(lcCode, "/", "\")
         lcCode = Replace(lcCode, "\\", "\")

         oField.Select

         ' Try the path as is first, then relative to the document folder
         If Not FileExists(lcCode) Then
            lcCode = lcPath & lcCode
         End If

         If FileExists(lcCode) Then
            ' Insert the picture over the selected field as an embedded image
            Selection.InlineShapes.AddPicture FileName:=lcCode, LinkToFile:=False, SaveWithDocument:=True
         End If
      End If
   Next
End Sub

 

 

Private Function FileExists(ByVal FileName As String) As Boolean
   Dim FileSize As Long

   On Error GoTo FileExists_Error

   ' FileLen raises a run-time error if the file cannot be found
   FileSize = FileLen(FileName)
   FileExists = True
   Exit Function

FileExists_Error:
   FileExists = False
End Function

 

This code works well even on a very large document of over 500 pages. It’s still not fast, but it doesn’t seem to strain Word’s resources much, and it shows activity on screen so the app doesn’t appear locked up.
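The path-parsing step in the macro is the fiddly part, so it can help to look at it in isolation. The following is a minimal sketch only: `ExtractPicturePath` is a hypothetical helper name that is not part of the original macro, and it assumes the field code contains the image path as its first quoted string, as in ` INCLUDEPICTURE "images/wwhelp.gif" \* MERGEFORMATINET `.

```vba
' Hypothetical helper (not part of the original macro).
' Returns the first quoted string in a field code, with forward slashes
' and doubled backslashes normalized to single backslashes.
' Returns "" if the field code contains no quoted string.
Function ExtractPicturePath(ByVal FieldCode As String) As String
   Dim lnLoc1 As Long
   Dim lnLoc2 As Long
   Dim lcPath As String

   ' Opening quote
   lnLoc1 = InStr(1, FieldCode, Chr(34))
   If lnLoc1 = 0 Then Exit Function

   ' Closing quote
   lnLoc2 = InStr(lnLoc1 + 1, FieldCode, Chr(34))
   If lnLoc2 = 0 Then Exit Function

   lcPath = Mid(FieldCode, lnLoc1 + 1, lnLoc2 - lnLoc1 - 1)
   lcPath = Replace(lcPath, "/", "\")
   lcPath = Replace(lcPath, "\\", "\")

   ExtractPicturePath = lcPath
End Function
```

Inside the loop this would be called as `lcCode = ExtractPicturePath(oField.Code.Text)`. An empty result means the field had no quoted path, which the inline version does not check for.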

 

Now to the MOST part. The For Each loop does not catch all images. There appears to be a bug in how the Fields collection is populated, as it does not include all images from the HTML. Specifically, if the image is marked up with any extra attributes, it does not show up in the Fields collection. Something as simple as:

 

<img src="images/wwhelp.gif" align="right">

 

causes the image to not be included in the Fields collection, which bites big time. Removing the align attribute (or any other extra attribute, like HSPACE) causes the image to show up fine.

 

I have yet to figure out how to get at those images, but in the meantime I’ve been removing these attributes from the default generation code in Help Builder, which works for some of the automatically generated images such as icons for headers and class/data lists.
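One avenue that might be worth trying is to look beyond the Fields collection. This is an untested sketch, and it rests on an assumption the observations above do not confirm: that Word exposes the missing images through the document's InlineShapes or Shapes collections as linked pictures. `EmbedLinkedShapes` is a hypothetical name. Note that it relies on SavePictureWithDocument, so the memory problems described earlier may well come back on large documents.

```vba
' Untested sketch, under the assumption that the images missing from the
' Fields collection are exposed as linked pictures in InlineShapes or Shapes.
' Returns the number of linked pictures that were switched to embedded.
Function EmbedLinkedShapes(ByVal oDoc As Document) As Long
   Dim oInline As InlineShape
   Dim oShape As Shape
   Dim lnCount As Long

   ' Pictures that sit in the text flow
   For Each oInline In oDoc.InlineShapes
      If oInline.Type = wdInlineShapeLinkedPicture Then
         oInline.LinkFormat.SavePictureWithDocument = True
         lnCount = lnCount + 1
      End If
   Next

   ' Floating pictures, for example ones with text wrapped around them
   For Each oShape In oDoc.Shapes
      If oShape.Type = msoLinkedPicture Then
         oShape.LinkFormat.SavePictureWithDocument = True
         lnCount = lnCount + 1
      End If
   Next

   EmbedLinkedShapes = lnCount
End Function
```

Calling `EmbedLinkedShapes(ActiveDocument)` and comparing the result with the number of images in the HTML would at least show whether these collections see the images that the Fields loop misses.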

 

It always amazes me how things like this get through a testing process.