Manipulate CAB Files Programmatically

Use the Windows API to work with CAB files from within your applications.

by Ken Getz

Pay attention to what's going on as you install almost any Microsoft product, and you'll notice Microsoft's use of cabinet (CAB) files. These files, normally stored with a .cab extension, contain one or more compressed files. (The popular ZIP format is a similar compressed file format.) With a bit of ingenuity, you can use CAB files in your own applications.

What you need:
Visual Basic 6.0

The Windows API includes the functionality you need if you want to ship your data in a compressed format, or if you want to run your own code to decompress a CAB file as part of your application. Unfortunately, Microsoft has left this functionality mostly undocumented, and the examples you'll find on MSDN are geared toward C/C++ developers. In this article, I'll show you how to decompress CAB files from within your application by using the CABFile class (download the code from the VBPJ Web site).

Although there's no room here to talk about all the downloadable code, I'll discuss much of the class's important code. I'll also show you how the SetupIterateCabinet API function works, and how you use its callback function to take actions on a specified CAB file.

What types of features would you need from your own CABFile class? You'd certainly want the ability to retrieve information about the CAB file, including the number of files and information about each file it contains. If you're retrieving information about all the files, you might want it in Extensible Markup Language (XML) format, which makes the information easier to manipulate later. You'd want to be able to extract all the files, or only one, and you'd need the ability to specify the output path, plus the output filename when extracting a single file. In addition, you'd want the class to raise events as it does its work, so your user interface can react accordingly.

The CABFile class does all these things, but it can't handle one important task: creating or modifying a CAB file. The Setup API functions used here provide no mechanism for this task; you need to use the MakeCab.exe program that comes with Visual Basic instead. This program, which is well documented in MSDN, allows you to create CAB files as necessary (see Resources).
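
As a rough sketch of that last step, the following VB6 fragment shells out to MakeCab.exe to compress a single file. The procedure name and both paths are placeholders, and the code assumes MakeCab.exe can be found on the system path. Building a multi-file CAB requires a directive file, which the MakeCab documentation covers.

```vb
' Compress one file into a new CAB by shelling out to MakeCab.exe.
' MakeSingleFileCab and both paths are hypothetical; substitute your own.
Public Sub MakeSingleFileCab()
   Dim strSource As String
   Dim strCab As String

   strSource = "C:\Data\Report.txt"
   strCab = "C:\Data\Report.cab"

   ' Basic form: makecab <source> <destination>.
   ' The doubled quotes protect paths that contain spaces.
   Shell "makecab.exe """ & strSource & """ """ & strCab & """", vbHide

   ' Shell returns immediately, so the CAB file might not
   ' exist yet when the next line of your code runs.
End Sub
```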

A warning: Microsoft never intended developers to extract files from the system CAB files using the Windows API, and doesn't support doing so. For that task, you're supposed to shell out to the Extract.exe program. You're on your own on this one. There's no inherent reason why you can't use the CABFile class with system CAB files, but you'll get no support from Microsoft if you do and something goes wrong.

Use the CABFile Class
The CABFile class discussed in this article provides several properties, methods, and events (see Table 1). Using the class is simple; for this test case, I made a copy of the xmldso.cab file that's normally found in the Windows 2000 C:\WinNT\Java\classes folder, renamed it as xmldsoDemo.cab, and placed it into the folder with the demo code (see Listing 1 to test most of the CABFile class's features). The demo code opens xmldsoDemo.cab, retrieves a listing of all the files in the CAB file, and demonstrates the use of several methods and properties (see Figure 1).

You can use the CABFile class in your own applications either by creating an ActiveX DLL that includes the functionality, or by simply importing the class into your project. In addition, you need to include the CabFilesCallback standard module. This class requires the support of that one standard module because SetupIterateCabinet (the API function the CABFile class uses) requires a callback function, and because callback functions can exist only in standard modules.
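
A minimal sketch of calling the class from a form or module might look like the following. The member names CabName, FileCount, and GetXML come from this article, but Table 1 and Listing 1 hold the real signatures. The sketch assumes that GetXML takes no arguments and returns a string, and the procedure name is made up for illustration.

```vb
' Hypothetical test procedure; assumes the CABFile class and the
' CabFilesCallback module have been added to the project.
Public Sub ShowCabContents()
   Dim cf As CABFile

   Set cf = New CABFile

   ' Every other member fails with an error until CabName is set.
   cf.CabName = App.Path & "\xmldsoDemo.cab"

   Debug.Print "Files in cabinet: " & cf.FileCount

   ' Assumption: GetXML returns the file listing as an XML string.
   Debug.Print cf.GetXML

   Set cf = Nothing
End Sub
```

The Extract and GetInfo methods follow the same pattern: set CabName first, then call the method.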

 
Figure 1. Retrieve CAB File Information.

The Windows API provides only a single function that does all this work: SetupIterateCabinet. Part of the group of API functions that handle setup functionality, the SetupIterateCabinet function handles all the CAB file manipulations. Microsoft developers wrote SetupIterateCabinet so it requires a user-defined callback function to do its work. Unlike many other API functions, this one doesn't supply a default callback function. In other words, VB developers using SetupIterateCabinet must supply a function that handles notifications from SetupIterateCabinet; it's in this callback function that you do all your work.

The CABFile class handles all the details of calling SetupIterateCabinet, but if you're going to modify the class's behavior or functionality, you need some idea of how it all works. You must supply four values to call the SetupIterateCabinet function: your CAB file's full path; an unused, reserved value (0, in the CABFile class); the callback function's address; and a long integer specifying contextual information. Because the CABFile class provides entry points for counting, reporting, and extracting files, it uses the final long integer parameter to indicate to the callback function exactly what the callback function should do.
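
For reference, a declaration along these lines matches the four values just described. The Declare statement follows the ANSI entry point exported by setupapi.dll; the enum's name and its numeric values are assumptions, because the article shows only the member names (sicCount, sicReport, sicGetXML, sicExtract).

```vb
' Goes in the declarations section of the CABFile class module.
' Declares in a class module must be Private.
Private Declare Function SetupIterateCabinet _
   Lib "setupapi.dll" Alias "SetupIterateCabinetA" ( _
   ByVal CabinetFile As String, _
   ByVal Reserved As Long, _
   ByVal MsgHandler As Long, _
   ByVal Context As Long) As Long

' Context values handed through to the callback function.
' The enum name and the default numbering (0, 1, 2, 3) are
' assumed; only the member names appear in the article.
Private Enum SetupIterateCabinetAction
   sicCount
   sicReport
   sicGetXML
   sicExtract
End Enum
```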

SetupIterateCabinet returns a nonzero value if it succeeds, and zero if it fails. If SetupIterateCabinet fails, you can retrieve Err.LastDllError to find out the error code that caused it to fail. (The CABFile class includes a procedure, ErrToText, which converts many common API errors into text. The class uses Err.Raise to raise DLL errors back out to the caller, providing the ErrToText function's return value as the error's text.) For example, the FileCount property procedure in the CABFile class calls SetupIterateCabinet like this:

lngReturn = SetupIterateCabinet( _
   CabName, 0, _
   AddressOf CabinetCallback, sicCount)

CabName is the name of the CAB file to be interrogated, 0 is a placeholder for the reserved parameter, AddressOf CabinetCallback is the callback procedure's address, and sicCount is one of a set of enumerated values, indicating to the callback procedure that you asked it to count the files in the CAB file.

The formal declaration for SetupIterateCabinet's callback function looks like this:

Public Function CabinetCallback( _
   ByVal Context As Long, _
   ByVal Notification As Long, _
   ByVal Param1 As Long, _
   ByVal Param2 As Long) As Long
End Function

When SetupIterateCabinet calls this procedure, Context contains the long integer passed into the SetupIterateCabinet function as contextual information. Notification contains one of three possible notification messages (see Table 2). Param1 and Param2 contain parameters that SetupIterateCabinet sends to your callback function, and the parameter contents depend on the particular notification message you're reacting to. In general, the callback function should handle all the notification messages in Table 2.
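
Table 2 isn't reproduced here, so the following sketch fills in the three notification codes and the possible return values as they are defined in the Windows setupapi.h header. The SkeletonCallback function itself is illustrative only: it skips every file and is not the callback the CABFile class uses.

```vb
Option Explicit

' Notification codes sent to the callback (from setupapi.h).
Public Const SPFILENOTIFY_FILEINCABINET As Long = &H11
Public Const SPFILENOTIFY_NEEDNEWCABINET As Long = &H12
Public Const SPFILENOTIFY_FILEEXTRACTED As Long = &H13

' Return values for SPFILENOTIFY_FILEINCABINET.
Public Const FILEOP_ABORT As Long = 0
Public Const FILEOP_DOIT As Long = 1
Public Const FILEOP_SKIP As Long = 2

' Return value for the other two notifications.
Public Const NO_ERROR As Long = 0

' Hypothetical callback that walks the cabinet without
' extracting anything. It must live in a standard module.
Public Function SkeletonCallback( _
   ByVal Context As Long, _
   ByVal Notification As Long, _
   ByVal Param1 As Long, _
   ByVal Param2 As Long) As Long

   Select Case Notification
      Case SPFILENOTIFY_FILEINCABINET
         ' Sent once per file: skip it, extract it, or abort.
         SkeletonCallback = FILEOP_SKIP
      Case SPFILENOTIFY_FILEEXTRACTED
         ' Sent after a file is written; NO_ERROR continues.
         SkeletonCallback = NO_ERROR
      Case SPFILENOTIFY_NEEDNEWCABINET
         ' Sent when a file continues in another cabinet.
         SkeletonCallback = NO_ERROR
      Case Else
         SkeletonCallback = NO_ERROR
   End Select
End Function
```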

Cast the Value
The sample project's callback function doesn't look exactly like this. Instead, the callback function used in the project allows VB to "cast" the value in Param1 (that is, treat the data at that address as a specific data type), making the function easier to use. Note that instead of declaring Param1 as a ByVal Long integer, the callback function declares Param1 as a ByRef user-defined type (UDT):

Public Function CabinetCallback( _
   ByVal Context As Long, _
   ByVal Notification As Long, _
   ByRef Param1 As FileInCabinetInfo, _
   ByVal Param2 As Long) As Long

End Function

Param1's original value is the data structure's address, so the UDT must be a ByRef parameter. Making the UDT a ByRef parameter allows VB to take the bytes at the address provided by the long integer Param1, and show the bytes to you as a FileInCabinetInfo data structure. The ByRef keyword is optional, but it makes the code easier to understand.

One more thing to note about the callback function: It's easiest to create classes that use callback functions if the callback function itself has access to all the API declarations, constants, and enums used by the class. Unfortunately, you must place the callback function in a standard module, which doesn't contain any of that useful information. The callback function in this example works around this problem by doing nothing more than calling a Friend method in the class module (see Listing 2). SetupIterateCabinet calls the callback function in the standard module, which ends up calling the corresponding method in the class.

To use this technique, you must provide some way to tell the real callback function how to find your callback method. In this example, the CABFile class must call the SetCabFile procedure in the standard module and pass in a reference to itself each time you invoke any property or method that requires the callback function. This technique limits you to one active CABFile object at a time, because there's only a single CABFile variable in the standard module. If you find this to be a limitation, you need to look into some other technique for managing the callback mechanism.
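
The standard module's side of this arrangement can be sketched as follows. The SetCabFile name and the module-level CabFile variable come from the article, but the procedure body is an assumption about what Listing 2 contains.

```vb
' In the CabFilesCallback standard module.
Option Explicit

' The single CABFile object that currently owns the callback.
' Because there is only one variable, only one CABFile object
' can be iterating a cabinet at any moment.
Private CabFile As CABFile

' Assumed implementation: store the reference so that
' CabinetCallback can forward notifications to the object.
Public Sub SetCabFile(cf As CABFile)
   Set CabFile = cf
End Sub
```

One way to lift the single-object limit would be to pass an identifying value through the Context parameter and look the object up in a collection, at the cost of giving up Context as the action indicator.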

You'll also find some general-purpose, path-handling code in this standard module. You probably want to use this code from within your own applications, so it makes sense to place this code where the CABFile class and your code can get at it.

Call SetupIterateCabinet
Each of the CABFile class's methods and properties follows the same set of four steps. In each procedure, the code first ensures that the class's CabName property has been set. Without this property value, each method will fail (raising an error indicating the problem):

If Len(CabName) = 0 Then
   Err.Raise errNoCabFile, conClass, _
      GetErrorText(errNoCabFile)
Else
   ' Do the real work.
End If

Second, the code calls the SetCabFile procedure in the standard module, so the "fake" callback function can call the real one in the CABFile object.

Third, the code calls the SetupIterateCabinet function, passing in the appropriate parameters. The final parameter, indicating the call's context, is important because it tells the callback function exactly what steps the callback function needs to take:

' From the Extract method.
lngReturn = SetupIterateCabinet( _
   CabName, 0, AddressOf CabinetCallback, sicExtract)

Finally, the code checks the return value from SetupIterateCabinet after that function has done its work. A return value of 0 means an error has occurred in the CAB file processing, so the method raises the error back out to the caller:

If lngReturn = 0 Then
   Err.Raise Err.LastDllError, _
      conClass, ErrToText(Err.LastDllError)
End If
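
Put together, the four steps might look like this inside the class. This is a reconstruction from the fragments above, not the downloadable code: it assumes the class-level names shown elsewhere in this article (CabName, mlngCount, conClass, errNoCabFile, GetErrorText, ErrToText) and copies Err.LastDllError into a local variable before raising the error.

```vb
' Inside the CABFile class module (sketch only).
Public Property Get FileCount() As Long
   Dim lngReturn As Long
   Dim lngErr As Long

   ' Step 1: make sure a CAB file has been specified.
   If Len(CabName) = 0 Then
      Err.Raise errNoCabFile, conClass, _
         GetErrorText(errNoCabFile)
   Else
      mlngCount = 0

      ' Step 2: tell the standard module which object
      ' should receive the notifications.
      Call SetCabFile(Me)

      ' Step 3: iterate the cabinet in "count" mode.
      lngReturn = SetupIterateCabinet( _
         CabName, 0, AddressOf CabinetCallback, sicCount)

      ' Step 4: zero means the API call failed.
      If lngReturn = 0 Then
         lngErr = Err.LastDllError
         Err.Raise lngErr, conClass, ErrToText(lngErr)
      End If

      FileCount = mlngCount
   End If
End Property
```
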

Param1 contains a FileInCabinetInfo UDT when your callback function receives the SPFILENOTIFY_FILEINCABINET notification:

Public Type FileInCabinetInfo
   NameInCabinet As Long
   FileSize As Long
   Win32Error As Long
   DosDate As Integer
   DosTime As Integer
   DosAttribs As Integer
   FullTargetName(1 To MAXPATH) As Byte
End Type
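
The DosDate and DosTime members hold packed MS-DOS date and time values. A helper such as the following can unpack them into a VB Date; the function name is made up for this sketch and is not part of the CABFile class.

```vb
' Convert the packed MS-DOS date and time words from a
' FileInCabinetInfo structure into a VB Date.
Public Function DosDateTimeToDate( _
   ByVal DosDate As Integer, _
   ByVal DosTime As Integer) As Date

   Dim lngDate As Long
   Dim lngTime As Long

   ' VB Integers are signed. Widen to Long and mask so the
   ' top bit is treated as data rather than as a sign.
   lngDate = CLng(DosDate) And &HFFFF&
   lngTime = CLng(DosTime) And &HFFFF&

   ' Date: bits 9-15 = year - 1980, 5-8 = month, 0-4 = day.
   ' Time: bits 11-15 = hour, 5-10 = minute, 0-4 = seconds / 2.
   DosDateTimeToDate = _
      DateSerial(1980 + (lngDate \ 512), _
         (lngDate \ 32) And 15, lngDate And 31) + _
      TimeSerial(lngTime \ 2048, _
         (lngTime \ 32) And 63, (lngTime And 31) * 2)
End Function
```

For example, DosDateTimeToDate(10351, 0) yields midnight on 15 March 2000, because 10351 packs year 20 (2000), month 3, and day 15.
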

Param2 contains the address of a buffer containing the CAB filename you're working with. Note that many text values you work with when using SetupIterateCabinet are provided as addresses (long integers) rather than as neat VB strings. The CABFile class includes the StringFromPointer function, which calls the RtlMoveMemory API function (aliased as CopyMemory, for this application's sake), so it can convert from addresses to VB strings. This procedure is useful in other situations as well. Whenever the API provides you with the address of a string buffer, you need this type of function to copy the data into a VB string:

Private Function StringFromPointer( _
   ByVal ptr As Long) As String
   Dim lngLen As Long
   Dim strBuffer As String

   ' Given a string pointer, copy the
   ' value of the string into a new, safe location.

   lngLen = lstrlen(ptr)
   strBuffer = Space(lngLen)
   Call CopyMemory(ByVal strBuffer, _
      ByVal ptr, lngLen)
   StringFromPointer = strBuffer
End Function
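
StringFromPointer relies on two API declarations that aren't shown above. Declarations along these lines should work; they use the standard kernel32 entry points, and the parameter names are only conventions.

```vb
' Declarations section of the CABFile class module.

' Returns the length, in bytes, of the null-terminated
' ANSI string at the given address.
Private Declare Function lstrlen _
   Lib "kernel32" Alias "lstrlenA" ( _
   ByVal lpString As Long) As Long

' Copies Length bytes from Source to Destination. The
' "As Any" parameters let the caller pass a string
' (ByVal) or a raw address (ByVal Long).
Private Declare Sub CopyMemory _
   Lib "kernel32" Alias "RtlMoveMemory" ( _
   Destination As Any, _
   Source As Any, _
   ByVal Length As Long)
```

With these in place, a call such as StringFromPointer(Param1.NameInCabinet) returns the name of the file currently being reported.
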

I modified the callback function a bit so the first parameter is always passed as a FileInCabinetInfo structure, because most of the CABFile class's procedures end up triggering the SPFILENOTIFY_FILEINCABINET notification. This modification saves you from having to retrieve the data manually from the address passed to the callback function. In other words, by using this specification for the callback function, you allow VB to cast the value in Param1 into the correct datatype automatically. In reaction to other notification messages, the CABFile class uses the LSet statement to copy the bytes from the FileInCabinetInfo structure into other types of UDTs when you need to retrieve the data. The actual callback function declaration looks like this:

Public Function CabinetCallback( _
   ByVal Context As Long, _
   ByVal Notification As Long, _
   ByRef Param1 As FileInCabinetInfo, _
   ByVal Param2 As Long) As Long

   If Not CabFile Is Nothing Then
      CabinetCallback = CabFile.CabCallBack( _
         Context, Notification, Param1, Param2)
   End If

End Function

Cast the Parameter Into Any UDT
Given the information in Param1 and Param2, the callback procedure can cast Param1 into any UDT. For example, the SPFILENOTIFY_FILEEXTRACTED notification passes a FILEPATHS structure in Param1. The CABFile class converts the parameter into the correct type within the code that handles this message:

Dim fp As FILEPATHS
LSet fp = Param1
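
The FILEPATHS type isn't shown in the article. A VB translation of the Windows structure looks like the following, with a small hypothetical helper that uses it. The helper assumes the StringFromPointer function described earlier.

```vb
' VB translation of the Setup API's FILEPATHS structure.
Private Type FILEPATHS
   Target As Long       ' Address of the target path string
   Source As Long       ' Address of the source path string
   Win32Error As Long   ' 0 if the file was extracted cleanly
   Flags As Long
End Type

' Hypothetical helper: return the path of the file that was
' just extracted, or an empty string if extraction failed.
Private Function ExtractedTargetName( _
   fici As FileInCabinetInfo) As String

   Dim fp As FILEPATHS

   ' LSet copies the raw bytes; the first 16 bytes of the
   ' incoming structure line up with the four Long members.
   LSet fp = fici
   If fp.Win32Error = 0 Then
      ExtractedTargetName = StringFromPointer(fp.Target)
   End If
End Function
```
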

The callback function you must provide (and the one provided in the CABFile class) includes a Select Case statement, which takes action based on the notification message it receives from SetupIterateCabinet. Listing 3 shows the entire CabCallBack procedure from the CABFile class. This example doesn't handle the SPFILENOTIFY_NEEDNEWCABINET message, but it does handle the other two messages.

The callback function might receive the SPFILENOTIFY_FILEINCABINET message because you called the FileCount property, or because you called the Extract, GetInfo, or GetXML methods. How does this procedure know which one you called? All the callback function gets is the notification message, so you might wonder how to discern which activity should take place. Luckily, the folks at Microsoft considered this, and added the final parameter (Context As Long) for SetupIterateCabinet. This parameter passes whatever value you place into it to the callback function, in the callback function's first parameter. So, the callback function in the CABFile class uses another Select Case statement to determine which action it should take:

Select Case Context
   Case sicCount
      mlngCount = mlngCount + 1
      CabCallBack = FILEOP_SKIP
   Case sicReport
      CabCallBack = HandleReport(Param1)
   Case sicGetXML
      CabCallBack = HandleXML(Param1)
   Case sicExtract
      CabCallBack = HandleExtract(Param1)
End Select

Each of the cases in the Select Case block except sicCount calls a different procedure to handle the required work. You might want to investigate the HandleReport, HandleXML, and HandleExtract functions. These functions retrieve information only from the FileInCabinetInfo structure passed to the callback function, and manipulate the information as necessary to carry out their tasks. What matters is the return value, which is the value passed back to SetupIterateCabinet. Each function indicates what the next step should be (extract or skip the file) by passing back the appropriate return value (see Table 2). In addition, the CAB file might contain information about the various compressed files' relative paths. Much of the code in the CABFile class handles these paths, and the class uses these relative paths throughout the processing of all the files within the CAB file.
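
To make the extract case concrete, here is a simplified sketch of what such a handler has to do: write the destination path into FullTargetName and return FILEOP_DOIT. It is not the article's HandleExtract function. The name and the output-folder parameter are invented, it ignores relative paths stored in the CAB, and it assumes the MAXPATH constant, the FILEOP_ values from Table 2, and StringFromPointer are available.

```vb
' Hypothetical, simplified extract handler.
Private Function HandleExtractSketch( _
   fici As FileInCabinetInfo, _
   ByVal strOutputFolder As String) As Long

   Dim strTarget As String
   Dim abytTarget() As Byte
   Dim i As Long

   ' Build the full destination path for this file.
   strTarget = strOutputFolder & "\" & _
      StringFromPointer(fici.NameInCabinet)

   ' Convert to a null-terminated ANSI byte array.
   abytTarget = StrConv(strTarget & vbNullChar, vbFromUnicode)

   ' Skip the file if the path won't fit in the buffer.
   If UBound(abytTarget) + 1 > MAXPATH Then
      HandleExtractSketch = FILEOP_SKIP
      Exit Function
   End If

   ' Param1 was passed ByRef, so these bytes land directly
   ' in the structure that SetupIterateCabinet reads back.
   ' FullTargetName is declared 1 To MAXPATH, hence i + 1.
   For i = 0 To UBound(abytTarget)
      fici.FullTargetName(i + 1) = abytTarget(i)
   Next i

   ' Ask SetupIterateCabinet to extract the file.
   HandleExtractSketch = FILEOP_DOIT
End Function
```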

Given the MakeCab.exe program, and the code in the CABFile class provided here, you should be able to create and work with CAB files from within any VB program. Although the details of passing information from one place to another are somewhat gruesome, once you've worked through the intricacies (as the CABFile class has done), it's not hard to use the functionality. You'll find a whole raft of useful functions provided by the Setup APIs in Windows if you dig into the Windows API. Check out the entire group of API functions (SetupIterateCabinet, SetupPromptForDisk, and so on) in MSDN for some enlightening reading (see Resources). You'll find other API tricks available to you, as well.


Ken Getz, a senior consultant with MCW Technologies, splits his time between programming, writing, and training. Ken wrote and appears in video training for . He is the coauthor of Visual Basic Language Developer's Handbook and Access 2000 Developer's Handbook Set (Sybex). Reach him at .

 