This should have been a standard installation of our flagship app - same program we have running on dozens of other client sites. As far as I can tell, things are configured correctly: the app pool user has the correct permissions, COM is set to run with the launching user, etc...
Things work fine in File Mode (in fact they are currently live and conducting business with File Mode enabled). And if we switch to COM, we'll get quick responses initially. But after a few hits the system will become unresponsive - if you try to load an app page it will just sit and spin - eventually you may get an "All servers busy" message. Yet you can go into Module Admin and most of the server instances will be idle. If you unload and reload the COM servers, things will go back to normal for a bit, but again after several tries it will act as if no servers are available. The number of hits it will accept before it stops responding seems to vary - from 3 or 4 to as many as 40, but usually less than 15. There does not appear to be a particular method that triggers the problem.
Any idea what the issue could be? They had originally set things up using ISAPI so I had them switch to .NET, thinking it was some ISAPI quirk. But we get the same behavior under the .NET Handler.
--stein
Hello,
This line of code in the server's OnInit helped me a lot when I moved from File to COM mode. It logs the time consumed by each line of code:
SET COVERAGE TO ("c:\mylog_" + TRANSFORM(THIS.GetProcessId())+ ".log")
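Once a log like this exists, the slow spots can be ranked with a small script. This is only a rough sketch in Python, not part of Web Connection or VFP. It assumes the coverage log is comma-delimited with execution time, class, procedure, line number, file and call stack level in that order. The function name and the sample rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict

def slowest_lines(log_text, top=5):
    """Sum execution time per (file, procedure, line); return the slowest entries."""
    totals = defaultdict(float)
    for row in csv.reader(io.StringIO(log_text)):
        if len(row) < 5:
            continue  # skip blank or truncated rows
        try:
            seconds = float(row[0])
            line_no = int(row[3])
        except ValueError:
            continue  # skip rows that do not match the assumed layout
        totals[(row[4].strip(), row[2].strip(), line_no)] += seconds
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:top]

# Hypothetical sample rows in the assumed layout
coverage_sample = r"""0.000120,,process,10,c:\app\main.fxp,1
0.250000,,process,11,c:\app\main.fxp,1
0.300000,,process,11,c:\app\main.fxp,1
"""

for (source, proc, line_no), seconds in slowest_lines(coverage_sample):
    print(f"{seconds:.6f}s  {source}  {proc}  line {line_no}")
```

With one log per process ID, as in the line above, each COM server instance can be checked separately.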
Marcel
I've never heard of behavior like this. In the Module Admin, can you see each hit as it happens in COM (with the 2 sec delay) for the initial ones if you run them one at a time?
Is the number of hits consistent before the hang? It might also be good to monitor the EXE and memory usage in Task Manager to see if servers reload and/or use a lot of memory for the hits that work.
As I noted, the number of hits before it stops responding is variable, but usually not much more than 10...
Since the site is live, we'll need to find a time when traffic is low so we can see what happens in the module admin and task mgr.
--stein
That sort of sounds like the servers get hit once and then lock up. If you're running in Round Robin mode, the servers get hit one after the other, which means each one gets hit once. If it fails/locks at the very end of the execution, that might account for it.
Looking at TaskMan should help determine whether that's the case along with the previous comments.
+++ Rick ---
Today we're seeing just a couple of pages come up before it stops responding to requests under COM. But at the same time that multiple requests are timing out in the browsers, the status page shows that the servers are idle!
We get the same (non) response in either Load Based or Round Robin mode.
FWIW, this is under Server 2016 running on a 4 core VM. As noted, it works fine in File Mode but we get this weird behavior under COM with both ISAPI and .NET.
I'm wondering if we should just have them rebuild the server from scratch and do a fresh install of everything.
--stein
So in that case are all those hits working before it locks up? That looks like ~30 hits?
It doesn't look like this is a problem with the server implementation (Fox code) since the servers appear to be idle - i.e. not hung.
But I don't know what would cause this short of some sort of COM failure, and that should generate errors rather than just sit there and do nothing.
The admin requests work though, right, since you can get to the Admin page. Servers are marked as available (idle in the display) so they should be picked up out of the pool.
I'm not sure what could possibly cause this and I have never seen this sort of scenario.
+++ Rick ---
Thinking about this some more, I think the problem may have to do with server Impersonation. Is it possible that they are using impersonation in the Application Pool (i.e. passthrough security from Windows Auth to the user account)? If that's the case, the cause may be that a user logs on, the Web app's account changes, and all of a sudden it no longer has the rights to access the COM object.
+++ Rick ---
I can check whether they are using Impersonation. That would show up as a setting in web.config, right?
--stein
Perhaps - could also be machine wide. You probably need to check that inside of the IIS Admin tool.
Couple of things you can check:
- Check the Module Admin Page and see what Server Account is set to (bottom). The account should be whatever you specify in the Application Pool, not IUSR or your user account (or $MachineName$ for Network Service)
- Make sure that the Application Pool uses Integrated not Classic
- In the Admin Tool check ASP.NET Impersonation - should be disabled (if using the .NET Module)
- In Web.Config check for (find) Impersonation - it shouldn't be set anywhere
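The web.config part of that checklist could be scripted if searching by hand is awkward on a client machine. A minimal sketch in Python, assuming impersonation would show up as the standard ASP.NET `<identity impersonate="true" />` element, either at the top level or inside a `<location>` block. The function name and the sample config are hypothetical, and it only sees the text it is given, so machine-wide settings would still need to be checked in the IIS Admin tool.

```python
import xml.etree.ElementTree as ET

def find_impersonation(config_xml):
    """Return the scopes in which <identity impersonate="true"> is set."""
    hits = []

    def walk(node, scope):
        for child in node:
            child_scope = scope
            if child.tag == "location":
                # Settings inside <location path="..."> apply to that path only
                child_scope = "location path=%r" % child.get("path", "")
            if child.tag == "identity" and child.get("impersonate", "").lower() == "true":
                hits.append(scope)
            walk(child, child_scope)

    walk(ET.fromstring(config_xml), "root")
    return hits

# Hypothetical web.config content for illustration
config_sample = """<configuration>
  <system.web>
    <identity impersonate="true" />
  </system.web>
</configuration>"""

print(find_impersonation(config_sample))
```

An empty list means no impersonation was found in that file. To run it against a real site, read the web.config text from disk and pass it in.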
If all else fails, try changing the AppPool to SYSTEM and running the application. Does it work then? If that doesn't work either, then you can rule out a permissions problem at the application level and look specifically for system level problems (DCOM configuration for supporting components most likely).
+++ Rick ---
Thanks - again, this is on a client's machine so I will need to work with their techs to follow up on this. I thought about running under SYSTEM but the app uses a SQL database on a separate server - pretty sure that the local SYSTEM will not have access to it.
--stein
I can get to the Module page to check the server account:
They tell me that the opcesql account has admin privileges on that server.
This all looks fine assuming the account has rights. You should double check the permissions for DCOM on the server for the specific account.
As I mentioned, to make sure things are working correctly it's a good idea to temporarily try using the SYSTEM account for running the application.
As to SQL Server connections across the network - you can use SQL Server security (uid/pwd) instead of Windows pass-through security to connect, so there's no requirement for a specific account. Again, this can be temporary, but running in a known-to-work security environment can pinpoint whether the problems are related to security and logins or to something else. If it still fails with SYSTEM, then the issue is unlikely to be security/login related.
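To make the difference concrete, here is a small sketch in Python of the two kinds of ODBC connection string. The server, database, user and password are placeholders, the helper name is made up, and the legacy "SQL Server" driver name is an assumption about what is installed. It does not escape values that contain semicolons.

```python
def sql_connection_string(server, database, uid=None, pwd=None, driver="SQL Server"):
    """Build an ODBC connection string for SQL Server.

    With uid/pwd it uses SQL Server security; without them it falls back to
    Windows pass-through security (Trusted_Connection).
    """
    parts = ["Driver={%s}" % driver, "Server=%s" % server, "Database=%s" % database]
    if uid is not None:
        parts += ["Uid=%s" % uid, "Pwd=%s" % (pwd or "")]
    else:
        parts.append("Trusted_Connection=Yes")
    return ";".join(parts) + ";"

# Placeholder values for illustration only
print(sql_connection_string("dbhost", "appdb"))
print(sql_connection_string("dbhost", "appdb", uid="appuser", pwd="secret"))
```

The second form does not depend on which Windows account the COM server runs under, which is what makes it useful for a temporary test under SYSTEM.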
+++ Rick ---
Just wanted to post an update on this situation - the client is still running in File mode as we have still not figured out how to make things work consistently under COM. If we switch to COM, things appear to work fine initially but we will start getting occasional pages that come up with no content - entirely blank. (I have not seen the "all servers busy" message that the customer reported earlier). Checking the WC Error Log will show entries like this:
11/21/2019 08:46:22 pm Processing Error - https://mell-base.uce.auburn.edu/wconnect/Person.awp2?Mode=SIGNUP
Error: 202
Message: Invalid path or file name.
Code:
Program: expandpage
Line No: 914
Build: AW4.EXE 2019100
There are several weird things about this:
- I can't see any way an invalid path could show up in the context of the line in question.
- Our internal error handler was completely bypassed. For a legit error, the user sees a "Sorry, please contact our office" message instead of a blank page.
- The same method/parameter combo works fine on all other customer sites.
- The same page will work fine as soon as we switch back to File mode.
At Rick's suggestion I got them to do a screen share so I could check their IIS configuration, but I did not come across anything obviously amiss in the limited time we had. They are running Windows Server 2016 Standard with Web Connection 7.06.2 (.NET Handler).
--sg
That sounds like ExpandPage() is not able to load the page template. The path in question might be a missing page template, for example if the template is compiled in but the app runs with runtime compilation. Could it be that the virtual sits on a network share that's maybe not visible anymore?
+++ Rick ---