Slide 1
WW TSS-04: Advanced Troubleshooting for Wonderware Application Server
Rich Liddell, Technical Account Manager, Global Customer Support, [email protected]
Javier Aldán, Technical Account Manager, Global Customer Support, [email protected]
© 2013 Invensys. All Rights Reserved. The names, logos, and taglines identifying the products and services of Invensys are proprietary marks of Invensys or its subsidiaries. All third party trademarks and service marks are the proprietary marks of their respective owners.

Slide 2
Agenda
• Tools & Technique
• Install
• Deploy\undeploy
• Multi-Galaxy Communication
• Tech Notes & Tech Alerts

Slide 3
Common tools
• SMC Logger
• Platform Manager
• Object Viewer
• Task Manager
• MiniDump
• Windows Event
• System Files
• Wonderware Developer Network (WDN)

Slide 4
Troubleshooting 101 - What Did You Change?

Slide 5
System Management Console (SMC)

Slide 6
Object Viewer – Locate Process ID

Slide 7
Object Viewer
Find off-scan or quarantined objects:
• Uncheck Search by Name
• Check only show objects

Slide 8
Object Viewer
Find Object ID.

Slide 9
Secret Dialog Menu

Slide 10
How can I tell if someone deployed something?
"Did you Deploy?" "Nope…." "Are Sure?!?"
• Gobject_Change_Log
• Objects affected
• Operation performed
• User Comment
• User Logged on

Slide 11
Return all operations for past 24 hours
    SELECT Change.change_date, Change.user_profile_name,
           Oper.operation_name, user_comment, GObj.tag_name
    FROM gobject_change_log Change
    JOIN lookup_operation Oper ON Change.operation_id = Oper.operation_id
    JOIN gobject GObj ON GObj.gobject_id = Change.gobject_id
    WHERE Change.change_date > DATEADD(hour, -24, GETDATE())
    --AND GObj.tag_name = 'UserDefined_001'
    --AND (Oper.operation_name LIKE '%Deploy%'
    --     OR Oper.operation_name = 'ModifiedAutomationObjectOnly')
    ORDER BY Change.change_date DESC

Slide 12
Engine Attributes

Slide 13
Automatic MiniDump Generation
• MiniDump enables any ArchestrA process to write its process information to a dump file if it terminates abnormally or hits an exception error.
• A generated minidump file is created automatically at the default path: <drive>:\Program Files\ArchestrA\Framework\minidump
• The minidump can be quite large depending on the process (200-800 MB).

Slide 14
Enable MiniDump
    Windows Registry Editor Version 5.00

    [HKEY_LOCAL_MACHINE\SOFTWARE\ArchestrA\Framework\Debug]
    "MinidumpEnabled"=dword:00000001
    "MaximumDumpFilesAllowed"=dword:0000000a
    "MinidumpType"=dword:00000003

Slide 15
Tech Note 726
Capturing a Memory Dump File Using the Microsoft® Debug Diagnostic Tool (32bit)

Slide 16
Wonderware Developer Network
https://wdn.wonderware.com

Slide 17
Contact Wonderware
Via Email: [email protected]
Via Phone (US): 1-800-WONDER1 (or 1-800-966-3371)
Via Phone (international): 1-949-639-8500
• You will need to have a UserID.

Slide 18
Install issues…..

Slide 19
Unable to Modify install
• You have successfully installed.
• You go to modify the install via Add\Remove Programs.
• You then see a quick flash and nothing happens.
• You uninstall and reinstall, yet see the same behavior.

Slide 20
Missing Install folder
C:\Program Files (x86)\Common Files\ArchestrA\Install\{F7D0430C-1E73-4546-BB8C-3E23DF991668}\External

Slide 21
We need to copy it from the Root of the install:

Slide 22
Disable UAC during install
You must disable User Account Control (UAC) before installing.
• Failing to do this can result in functional issues
  – Which can lead to a re-install
*NOTE: UAC must also be disabled on IDE and GR nodes. ArchestrA System Platform 2012 R2 supports UAC-enabled operations on run-time nodes.

Slide 23
Combating Deploy issues

Slide 24
Deploy issues
Failed to deploy <PcName>: Failed to get the bootstrap's version
• The security policies in the customer domain are blocking some unauthenticated RPC calls with anonymous impersonation level.
• Static cloaking has been enabled for Bootstrap and GR, and code has been added to impersonate the ArchestrA user before deploy/undeploy.

Slide 25
Failed to get the bootstrap's version

Slide 26
Failed to get the bootstrap's version

Slide 27
Known Issue with Windows 7 Client
3.1 SP3
• L00124127
3.1 SP3 p01
• L00122194
3.5
• L00122265
3.6
• L00123983

Slide 28
Deployment issues: What can go wrong?
• DCOM
• NMX Local mode
• Versions
• NIC Binding order
• aaBootstrap not responding
• aaLogger hanging
• Global Data Cache
• Platforms still deployed but removed from GR

Slide 29
Deployment Troubleshooting Checklist
• Check time synchronization between Platform nodes
• Configure network binding order when using multiple networks
• Disable TCP Offload Engine (TOE)
• Set up the ArchestrA Admin Account on all Platforms
• OSConfiguration utility – 3.1sp3p1 on 2008 sometimes requires the version from 2012 R2

Slide 30
Deployment Troubleshooting Checklist
• Make sure all Platforms have the same version and hotfixes
• Check firewall settings (required ports are documented in the ReadMe of the product)
• Check Tech Note 461, Troubleshooting Bootstrap communication

Slide 31
Deployment Logflags
Category packages responsible for deployment:
• PlatformCategory Package – responsible for deploying the platform engine.
• EngineCategory Package – responsible for deploying redundant or non-redundant application engines.
• ApplicationCategory Packages – responsible for deploying Areas, DI Objects, and Application Objects.

Slide 32
Deployment Logflags
Components involved in the deploy/undeploy process:
• WWPackageManager – component used by aaGR clients (IDE, GRAccess) to interact with WWPackageServer.
• WWPackageServer – component running under aaGR (a service running on the Galaxy Repository node), which is used for interacting with the database, and for validating and sorting the objects that have to be deployed, etc.

Slide 33
Deployment Logflags
Components involved in the deploy/undeploy process:
• Bootstrap – service that has to be installed on every IAS node, which among other functions is used during platform deployment/undeployment.
• Platform Install Manager – responsible for installing all code modules on local or remote nodes using MSI.

Slide 34
Deployment Logflags
Components involved in the deploy/undeploy process:
• File Copy Service – responsible for copying the files to remote nodes.
• DCOMTransport – the underlying transport used by the File Copy Service to transfer files between nodes.

Slide 35
GPO enabled
Unable to deploy/undeploy or configure objects in AppServer v3.6 with customer GPO enabled; access is denied due to insufficient permissions in the objects' "...\CheckedIn" and "...\CheckedOut" folders.
Hot Fix
• L00126108 (3.6)

Slide 36
Known Issue
Deploying a redundant engine without cascade causes all running objects to be lost.
• Hot Fix
  – L00126469 (3.6)

Slide 37
Global Data Cache distribution
aaGlobalDataCacheMonitorSvr
• ArchestrA GlobalDataCacheMonitorServices. This service appears in the Task Manager once a platform is deployed to the machine. It handles information for the Areas and alarms via XML, and also handles security calls.

Slide 38
Global Data Cache

Slide 39
Overview Global Data Cache
GR Node: aaBootstrap.exe
Remote Platform: aaBootstrap.exe
aaGlobalDataCacheMonitorSvr

Slide 40
Global Data Cache Issue
Couldn't get platform name - maybe the platform is not available at this time. IPlatformInformationClerk2::GetPlatformIdentity(PlatformID=xx), hr = 80040405
A Platform or Engine mismatch occurred because of non-functional Data Cache distribution between the Platforms.
To resolve the mismatch, redeploy the remote Platform.
• Hotfix
  – L00125442 (3.1 SP3 p01)
  – Addressed in 3.6 p01 release

Slide 41
Global Data Cache Issue
GlobalDataCache folders do not sync if the aaGlobalDataCacheMonitor service crashes or is restarted.
Hotfix
• L00125643 (3.6)
• Addressed in 3.6 p01 release

Slide 42
Orphaned platforms
Connection accepted from address <nodename1>, which differs from existing entry, address <nodename2>.
New connection will be denied
• The root cause is an orphaned platform that was removed from the Galaxy improperly and is still trying to connect to the Galaxy.
• Identify the node where the platform is running and remove it by using Platform Remover.

Slide 43
Platform Exceed Maximum Heartbeats

Slide 44
Platform Exceed Maximum Heartbeats
Solution: set the proper value in your Platform and AppEngine Configuration Editor.

Slide 45
Platform Remover (Killer)
Run as Administrator
Clear out Checkpoint files
• C:\Program Files (x86)\ArchestrA\Framework\Bin\CheckPointer
Clear out Cache folder
• <RootDrive>\ProgramData\ArchestrA\Cache

Slide 46
Platform Remover (Killer)
Fails to run when there are more than 100 platforms.

Slide 47
Scripting Considerations
• Using the right script
• Debugging
• LogMessage()
• What is Async for?
• Script Timeout/Error

Slide 48
Engineering Efficiency
• Script Editor
• Auto complete function
• Me
• MyContainer
• Scripts
• Multi level Undo-Redo
• Line Numbering
• Consistent color coding
• Syntax Error Indication

Slide 49
2014
Engineering Efficiency
• Scripting: Exception Handling
Trap Exception
Handle Exception

Slide 50
2014
Let the Engine / Object Relax While First Loading
Use a While True script instead of an On True script for large tasks (such as IO set reference).
Delay with: If Script.ExecutionCnt == 2

Slide 51
Use LogMessage()
Why have needless log messages going to the logger unless required? Always wrap them in an IF statement:
    If Me.Debug Then
        LogMessage(Me.msg);
    EndIf;

Slide 52
Async Scripts
• SQL scripts should always be Async
• Engine.AsynScriptMaxThread default size is 5
• Engine.AsyncScriptsWaitingCnt
  • Use this for sizing AsynScriptMaxThread

Slide 53
Keep it Clean

Slide 54
Keep it Clean
WAS Clean-up Guide:
• Improves time to open templates and objects.
• Improves time to check in objects and templates.
• Deploying the InTouch app is faster.
• Restoring a Galaxy is faster.
• Backup is faster and smaller.

Slide 55
Keep it Clean
Tech Note 930
https://wdnresource.wonderware.com/support/kbcd/html/1/t002746.htm

Slide 56
Multi-Galaxy Communication?

Slide 57
Remote data
Symptom: View does not show remote Galaxy data
Possible reasons:
• MxData Service is not deployed
• Discovery Services are not configured correctly
• Platform is not deployed on the node where MxDataService is running
• Remote node is not reachable

Slide 58
Secure Write
Symptom: Writes do not work from InTouchView when security is enabled
Possible reasons:
• Security mode of Galaxy is set to “Galaxy Security”
• Security mode of InTouch is not set to “ArchestrA”
• User has not logged into the remote Galaxy at least once
• Default User Authentication service is not deployed on GR node
• Security modes of the local and remote Galaxies do not match
• User does not have sufficient permissions to perform the write
• Remote node is not reachable

Slide 59
ASBService OS Account
1. What if the ASBService OS account is not permitted? What account can be used to start the service?
2. Can the ASBService OS account be disabled?

Slide 60
ASBService related warnings
3. “ASBSecurity Proxy: Connect null FindResponse finding IManageASBSecurity on the SR node”
• The ArchestrA Watchdog service needs to be started before creating a new Galaxy
• Once the ArchestrA Watchdog service is fixed, the platforms have to be redeployed

Slide 61
ASBService related warnings
4. aaServicesDeployAgentHost -:- ASBSecurity Proxy: CallDisconnect delegate caught exception The communication object, System.ServiceModel.Channels.ServiceChannel, cannot be used for communication because it is in the Faulted state.
• Tech Alert 173
• Uninstall / Reinstall product

Slide 62
ASBService OS Account
Tech Alert 173
Cannot Create a Galaxy or Connect to Any Existing Galaxy After Renaming a Computer if Wonderware Application Server 2012 R2 (Version 3.6) is already installed on the Computer

Slide 63
Failed to UnpairWithGR…
If one of the Galaxies used as a Galaxy Pair in a Multi-Galaxy Configuration is unavailable, the pair cannot be "unpaired."

Slide 64
Failed to UnpairWithGR…
• System Platform requires both paired Galaxies to be present for unpairing to occur cleanly. Apart from the orphaned Galaxy pair appearing in the paired Galaxy list, there is no adverse impact on the system's operation. To reduce orphaned Galaxy pairs, unpair Galaxies before disconnecting them from the network.

Slide 65
Hotfix
When using FSGateway in a multi-galaxy configuration and adding a large number of tags to FSGateway using an OPC Client, the tags get stuck in an initializing state.
Hotfix
• L00124824

Slide 66
Questions?

Slide 67
Latest issues

Slide 68
100% CPU on aaEngine.exe
Engines get stuck at 100% CPU
• NmxSvc has been modified to ensure that it doesn't send an incorrect disconnect message to the remote platforms.
  – Hotfix
    • L00124013 (3.5 p01)
    • L00127549 (3.6)
*Addressed in 3.6 p01 release.

Slide 69
RDI object
Bad items that do not exist in the PLC cause RDI to take the AppEngine down over time.
• Hotfix
  – L00128094 (3.6)

Slide 70
Old Alarms
Old alarms showing in Alarm Control
• They cannot be acknowledged
Hotfix
• L00127843 (3.6)

Slide 71
Tech Alerts
Tech Alert 173
• Cannot Create a Galaxy or Connect to Any Existing Galaxy After Renaming a Computer if Wonderware Application Server 2012 R2 (Version 3.6) is Already Installed on the Computer

Slide 72
Tech Alerts
Tech Alert 174
System Corruption Can Result when Importing Object Files (aaPKG) Created in a Higher Application Server Version
Cannot deploy objects after importing objects developed in 3.1 SP3 P01 to 3.1 SP3 (exists in all versions of Application Server up to 2012 R2)

Slide 73

Slide 74
Tech Alerts
Tech Alert 180
Silenced Alarms are not Logged in the WWAlmDB Database
Tech Alert 181
Platform Fails to Deploy on Server 2003 SP2 or XP SP3 Nodes When Using App Server 3.6 P01

Slide 75
Wonderware Developer Network
https://wdn.wonderware.com

Slide 76
Contact Wonderware
Via Email: [email protected]
Via Phone (US): 1-800-WONDER1 (or 1-800-966-3371)
Via Phone (international): 1-949-639-8500
• You will need to have a UserID.

Slide 77
Questions?
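
Appendix sketch: building the Slide 11 change-log query
The change-log query on Slide 11 carries its optional filters as commented-out lines. If the "and ... or ..." pair is uncommented as written, the unparenthesized OR would let rows through regardless of the 24-hour window. The following is a minimal sketch, not part of the original deck, of one way to assemble the same query with the optional filters grouped safely. The function name build_change_log_query and its parameters are hypothetical, and the "?" placeholder style assumes a database driver that accepts qmark parameters; the slides do not name a client library.

```python
from typing import List, Optional, Tuple


def build_change_log_query(
    hours: int = 24,
    tag_name: Optional[str] = None,
    operation_like: Optional[str] = None,
    operation_equals: Optional[str] = None,
) -> Tuple[str, List[object]]:
    """Return (sql, params) for the Slide 11 change-log query.

    Hypothetical helper: the optional arguments mirror the commented-out
    filter lines on the slide. Placeholders use the '?' (qmark) style.
    """
    if hours <= 0:
        raise ValueError("hours must be positive")

    # The time window is always applied, as on the slide.
    where = ["Change.change_date > DATEADD(hour, ?, GETDATE())"]
    params: List[object] = [-hours]

    if tag_name is not None:
        where.append("GObj.tag_name = ?")
        params.append(tag_name)

    # Group the operation filters in parentheses so the OR cannot
    # bypass the date and tag conditions.
    op_filters = []
    if operation_like is not None:
        op_filters.append("Oper.operation_name LIKE ?")
        params.append(operation_like)
    if operation_equals is not None:
        op_filters.append("Oper.operation_name = ?")
        params.append(operation_equals)
    if op_filters:
        where.append("(" + " OR ".join(op_filters) + ")")

    sql = (
        "SELECT Change.change_date, Change.user_profile_name, "
        "Oper.operation_name, user_comment, GObj.tag_name\n"
        "FROM gobject_change_log Change\n"
        "JOIN lookup_operation Oper "
        "ON Change.operation_id = Oper.operation_id\n"
        "JOIN gobject GObj ON GObj.gobject_id = Change.gobject_id\n"
        "WHERE " + "\n  AND ".join(where) + "\n"
        "ORDER BY Change.change_date DESC"
    )
    return sql, params
```

Calling build_change_log_query(operation_like="%Deploy%") would return the query text plus the parameter list [-24, "%Deploy%"], which can then be handed to whatever client is used against the Galaxy database. Keeping the values out of the SQL text also avoids quoting problems with tag names.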