The Case of the Failed MSI
Posted by William Diaz on December 3, 2015
I recently began encountering a problem where one of our application packages was hanging during the install process. I could see the program directory correctly populated with application files, but the install seemed to hang at the end, and the msiexec processes continued to run indefinitely. After manually killing each process, I checked whether the application had actually finished installing. When trying to run one of the applications that has a dependency on one of the failed MSI’s APIs, I encountered the following:
Further, Outlook was crashing when trying to load the application’s main component.
Faulting application name: OUTLOOK.EXE, version: 14.0.7012.1000, time stamp: 0x514a1b69
My guess was that the application’s components were not getting registered. To confirm, I manually registered all the components in the program directory via the command line:
FOR /R "C:\Program Files (x86)\ApplicationFolder" %G IN (*.dll) DO "%systemroot%\system32\regsvr32.exe" /s "%G"
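The same loop can be sketched in Python, which makes it easier to keep each regsvr32 exit code. This is only an illustration, not the command used above: build_regsvr32_commands and register_all are hypothetical helper names, and the install path in the usage note is the placeholder from the command line.

```python
import os
import subprocess
from pathlib import Path


def build_regsvr32_commands(root, regsvr32=None):
    """Return one silent regsvr32 command per *.dll found under root."""
    if regsvr32 is None:
        # Same executable the FOR loop calls; SystemRoot is assumed to be set.
        system_root = os.environ.get("SystemRoot", r"C:\Windows")
        regsvr32 = os.path.join(system_root, "system32", "regsvr32.exe")
    dlls = sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() == ".dll"
    )
    return [[regsvr32, "/s", str(p)] for p in dlls]


def register_all(root):
    """Run each command and return (dll path, exit code) pairs. Windows only."""
    results = []
    for cmd in build_regsvr32_commands(root):
        completed = subprocess.run(cmd)
        results.append((cmd[-1], completed.returncode))
    return results
```

Calling register_all(r"C:\Program Files (x86)\ApplicationFolder") from an elevated prompt would attempt every DLL in the tree; a non-zero exit code in the result marks a DLL that did not register.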
Afterwards, the application with the dependency launched without error and Outlook was no longer crashing. However, the root issue still needed to be identified, as I had no idea what else in the MSI was failing to complete. After manually deleting the components from the file system (the application was missing from Programs and Features), I ran the MSI again and this time used Process Explorer to look inside the hung msiexec processes. There were multiple msiexec processes running, but one of them was eating CPU resources, and this was likely where I would find the culprit:
Opening the process details, I selected the Threads tab. The thread that is doing all the work (or in this case getting hung up) is indicated by the Cycles Delta column. From here, I opened the thread by selecting it and clicking Stack:
Stacks are read from the bottom up. You can see a third-party component here (HCApi – McAfee HIPS) at play. Note that this might not be entirely abnormal, as you can always expect any anti-virus suite to be hooking itself into any number of processes. So, to confirm what I was seeing, I used the Task Manager to dump the msiexec process that was using the CPU time by right-clicking it and selecting Create dump file (which can also be done via Process Explorer):
Dumps can be analyzed with WinDbg or a tool like DebugDiag 2.0. I have grown increasingly lazy and forgetful over the years with WinDbg. It has a very steep learning curve, and if you don’t use it much, it’s easy to forget everything except the old faithful !analyze -v or !analyze -v -hang. I used DebugDiag instead. When it is installed, simply right-click the dump file, select Analyze Crash/hang Issue from the context menu, and point it to the dump file.
It will do its best to figure out the issue in the most vague way and you almost always need to do some interpretation of your own. For the most part, the analysis summary can be ignored.
A little further down, the report points me to what I was seeing in Process Explorer: the problem thread that was using all the CPU time.
You can click the Thread ID like a hyperlink to follow it down in the report, expanding the thread. It reveals a familiar sight. However, there is a bit more insight, as the entry point is revealed and I can see the problem has something to do with the CustomAction table of the MSI itself.
Through a quick process of elimination, using Orca to remove some of the rows in the CustomAction table, I narrowed the cause down to the action ISSelfRegisterCosting.
Without that row, the MSI install completes normally, or at least so it seems. I have no idea what effect removing this action might have elsewhere, so this is merely a hack. To further confirm the problem is being caused by McAfee, I performed the install on a virtual machine where McAfee is not installed, and it proceeded normally. I then reached out to our McAfee enterprise admin and asked him to disable HIPS on one of my workstations. After he did so, the MSI ran and the application installed as expected. He inherits the problem.
Update
This has since been corrected by McAfee with a HIPS update. One of the DLLs trying to get registered was getting blocked.
Cary Roys said
This is not terribly shocking. Self-registration of *.dll’s rarely causes weird problems, but the problems are really, really weird when they do pop up. Things like crashing Msiexec.exe long after self-registration occurs, and the like.
I have also seen HIPS or other security software hang Msiexec.exe indefinitely during the PublishProduct action, when the cached *.msi is placed in c:\Windows\Installer, since that can trigger heuristics in rare cases.
Good job on the detective work with all the Procmon, WinDbg tools, but would the logfile not have told you the same thing, after killing msiexec?
William Diaz said
I could see the log getting hung on the dll registrations. But to actually see where the msi was failing required killing the correct msiexec process so the install process could correctly begin the rollback. There, the log would have been useful:
Action 11:55:44: Rollback. Rolling back action:
Rollback: ISSelfRegisterFiles
MSI (s) (54:68) [11:55:44:431]: Executing op: ActionStart(Name=ISSelfRegisterFiles,,)
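Pulling the ActionStart entries out of a verbose log can be scripted. The following is a minimal sketch that assumes the line format shown in the excerpt above; action_starts and last_action are hypothetical helper names. Windows Installer logs are often written as UTF-16, so the file may need to be opened with that encoding.

```python
import re

# Matches lines such as:
# MSI (s) (54:68) [11:55:44:431]: Executing op: ActionStart(Name=ISSelfRegisterFiles,,)
ACTION_START = re.compile(
    r"\[(?P<time>\d{2}:\d{2}:\d{2}:\d{3})\]: "
    r"Executing op: ActionStart\(Name=(?P<name>[^,)]+)"
)


def action_starts(log_lines):
    """Return (timestamp, action name) for every ActionStart op in the log."""
    found = []
    for line in log_lines:
        match = ACTION_START.search(line)
        if match:
            found.append((match.group("time"), match.group("name")))
    return found


def last_action(log_lines):
    """Return the final ActionStart seen, or None if the log has none."""
    starts = action_starts(log_lines)
    return starts[-1] if starts else None
```

Run against the three log lines above, last_action returns the 11:55:44:431 timestamp paired with ISSelfRegisterFiles. On a log from a hung install that has not yet rolled back, the last entry is a reasonable first place to look.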
cne9999 said
There is a reason for the ISSelfReg table and functionality. The log told you the answer and there was no need to go so deep in debugging. I would have seen that action not finishing and focused on the corresponding table’s contents. With the native SelfReg table you can’t determine the order of the registrations. With the ISSelfReg actions and table you can. This is why developers use it. This is a flaw in the native MSI SelfReg table and API calls, where there is no way to tell the API what order to register objects in. InstallShield resolved this by allowing you to choose the order of registration. That is important when you have COM objects that depend on another COM object already having been fully registered and ready to go. Most developers get around this by forcing a reboot and registering objects after the reboot with a run-once action. I see it all the time.
Many EXE’s that require DLL’s already in place to run a function that extracts its COM data fall into this category. This is much more common than you think, but it is usually handled by post-install actions in the application itself or proprietary CA’s scheduled where they need to be.
So my answer above still stands, but it may cause the application to fail on one or more registrations if these interdependencies are there. Just eliminating the CA that performs the IS (InstallShield) self-registration could cause modules to not register at all, making parts or all of the application unusable, or leave some obscure menu item in the application not functioning. There is no way to tell without deep application testing. But again, I would focus on migrating whatever is in the ISSelfReg table to the SelfReg table. Even though this may cause some objects not to register, it should be the first move. From there, look at the ISSelfReg table to see what order the objects are being registered in and write your own CA to do the same using the native Windows EXE (Regsvr32).
There is one last item… Non-COM-server COM registration. If one of those ‘ordered’ objects needs to be there for an EXE from another CA to run correctly, you would need to know the extraction call for the EXE if it is not a COM server. I could write a book on that alone. You need a good disassembler and/or a good dependency scanner to determine the non-server call that allows the EXE to extract its data.
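As a rough illustration of what an order-preserving replacement could look like, here is a sketch that sorts (order, path) pairs and calls Regsvr32 on each in turn. The row shape is an assumption made for the example, not the actual ISSelfReg schema, and ordered_commands and register_in_order are hypothetical names.

```python
import subprocess


def ordered_commands(rows, regsvr32="regsvr32.exe"):
    """rows: iterable of (order, dll_path); lower order registers first."""
    ordered = sorted(rows, key=lambda row: row[0])
    return [[regsvr32, "/s", path] for _order, path in ordered]


def register_in_order(rows):
    """Run the commands one at a time, stopping at the first failure. Windows only."""
    for cmd in ordered_commands(rows):
        code = subprocess.run(cmd).returncode
        if code != 0:
            # Report the DLL that failed along with regsvr32's exit code.
            return cmd[-1], code
    return None
```

Stopping at the first failure keeps a dependent object from being registered before the object it relies on; whether that is the right behavior depends on the application.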
And on and on and on…..
William Diaz said
Hardly going deep into debugging here, or at all. I’ve always said that using debugging tools does not make you a debugger. The purpose is to identify the cause and leave it to the developer (or in this case, McAfee) to find out why HIPS is hanging up the install. The MSI here is not an in-house product, and we should not be making modifications to it to work around anti-virus. What if we come across another IS MSI using the same CA and also getting hung? Nothing in the MSI logs is going to tell us that our anti-virus suite is the culprit. Thread stacks and Windows debugging tools are the entry-level tools that we are going to be using to identify that.
William Diaz said
I found an MSI packed inside a non-InstallShield setup exhibiting the same issue. It also failed on DLL registration. The only difference in this case was that the MSI would seem to time out after about a minute but report complete. It wasn’t until the components for the app were manually registered that it would work. I don’t think the issue is specifically related to MSI products but to some component registration that McAfee does not like. In this case, the hang happens on the component add-in for Outlook, and it looks like the registration is being handled by a separate process:
MSI (s) (E8:10) [15:27:22:101]: Executing op: CustomActionSchedule(Action=_CBD9642D_2568_4CAD_83C4_72EA1D687202,ActionType=3090,Source=C:\Users\username\AppData\Roaming\PremiereGlobal\GlobalMeet Outlook Toolbar\adxregistrator.exe,Target=/install="ConferencingAddin.dll" /privileges=user /CLRVersion="2.0.50727",)
MSI (s) (E8:10) [15:28:19:650]: Executing op: ActionStart
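The gap between those two timestamps is what gives the stall away. A small sketch for finding such gaps in a verbose log follows; it assumes the bracketed HH:MM:SS:mmm timestamp format shown above and does not handle a log that crosses midnight. parse_stamp and slow_steps are hypothetical names, and the 30-second threshold is arbitrary.

```python
import re
from datetime import datetime

TIMESTAMP = re.compile(r"\[(\d{2}:\d{2}:\d{2}:\d{3})\]")


def parse_stamp(line):
    """Return the bracketed time on a log line as a datetime, or None."""
    match = TIMESTAMP.search(line)
    if match is None:
        return None
    # %f reads the three millisecond digits as a fraction of a second.
    return datetime.strptime(match.group(1), "%H:%M:%S:%f")


def slow_steps(log_lines, threshold_seconds=30.0):
    """Return (seconds, line) for each stamped line followed by a long gap."""
    stamped = [(parse_stamp(line), line) for line in log_lines]
    stamped = [(t, line) for t, line in stamped if t is not None]
    gaps = []
    for (t0, line), (t1, _next_line) in zip(stamped, stamped[1:]):
        delta = (t1 - t0).total_seconds()
        if delta > threshold_seconds:
            gaps.append((delta, line))
    return gaps
```

For the two lines above, slow_steps reports a gap of roughly 57.5 seconds attached to the CustomActionSchedule line, which matches the "about a minute" timeout.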
Owen said
Great debugging Kevin! Sometimes the Verbose MSILog can also turn you on to this CustomAction that is causing the problem (a lot of times, it is custom actions)
Bill said
The ISSelfReg table and all of its associated custom actions use the InstallShield self-registration method and embedded DLL. This DLL is probably what was caught by McAfee. If you have InstallShield, you can open the MSI via MST and change it from the custom InstallShield registration method in the General Information node, in the Lockdown Permissions field, to ‘Traditional Windows Installer Handling’, and InstallShield will migrate the ISLockPermissions table to the LockPermissions table for you. Otherwise you need to manually remove the CA and manually migrate the settings. This is the short answer. But in the end there is no way McAfee should flag that action unless its heuristics are set too high or the InstallShield DLL is infected, which it most likely is not.
tupham81 said
Very typical issue. Did you try installing with the /q switch, in other words with no UI? Is your custom action configured to run under User or System context? Also with Outlook closed?
Is your MSI compiled from a setup capture or is it vendor provided? I’m guessing vendor provided, as you were using Orca to edit the MSI directly. From memory, the custom actions beginning with “ISSetup” usually were just registering files or running DLLs unrelated to the actual application. Given that it’s causing Outlook to crash and it’s trying to call an RPC function, your best bet is to take out the custom action.
William Diaz said
We would rather not edit 3rd party installer properties. Anyway, as noted, the issue was due to McAfee HIPS. An update since has fixed it.