Uploaded by Carlos Andres Morales Poveda

WW TSS-04 Advanced Troubleshooting for Wonderware Application Server

Slide 1
WW TSS-04: Advanced
Troubleshooting for Wonderware
Application Server
Rich Liddell, Technical Account Manager, Global Customer Support, [email protected]
Javier Aldán, Technical Account Manager, Global Customer Support, [email protected]
© 2013 Invensys. All Rights Reserved. The names, logos, and taglines identifying the products and services of Invensys are proprietary marks of Invensys or its subsidiaries.
All third party trademarks and service marks are the proprietary marks of their respective owners.
Agenda
• Tools & Technique
• Install
• Deploy\undeploy
• Multi-Galaxy Communication
• Tech Notes & Tech Alerts
Slide 3
Common tools
SMC Logger
Platform Manager
Object Viewer
Task Manager
MiniDump
Windows Event System Files
Wonderware Developer Network (WDN)
Slide 4
Troubleshooting 101 - What Did You Change?
Slide 5
System Management Console (SMC)
Slide 6
Object Viewer – Locate Process ID
Slide 7
Object Viewer
Find off scan or quarantined objects:
• Uncheck Search by Name
• Check only show objects
Slide 8
Object Viewer
Find Object ID.
Slide 9
Secret Dialog Menu
Slide 10
How can I tell if someone deployed something?
"Did you deploy?" "Nope…." "Are you sure?!?"
• Gobject_Change_Log
• Objects affected
• Operation performed
• User Comment
• User Logged on
Slide 11
Return all operations for past 24 hours
SELECT Change.change_date, Change.user_profile_name,
       Oper.operation_name, user_comment, GObj.Tag_name
FROM gobject_change_log Change
JOIN lookup_operation Oper
  ON Change.operation_id = Oper.operation_id
JOIN Gobject GObj
  ON GObj.gobject_Id = Change.gobject_Id
WHERE Change.change_date > DATEADD(hour, -24, GETDATE())
-- Optional filters; uncomment as needed.
-- Keep the OR inside parentheses so it does not bypass the date filter.
--AND GObj.Tag_name = 'UserDefined_001'
--AND (Oper.operation_name LIKE '%Deploy%'
--     OR Oper.operation_name = 'ModifiedAutomationObjectOnly')
ORDER BY Change.change_date DESC
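When this query is run repeatedly with different time windows or filters, it can help to build it from parameters rather than editing the commented lines by hand. The following is a minimal Python sketch, not part of the original deck. The function name `build_change_log_query` is hypothetical, the `?` placeholders assume a pyodbc-style driver, and the table and column names are taken from the query on this slide.

```python
def build_change_log_query(hours=24, tag_name=None, operation_like=None):
    """Return (sql, params) for the Galaxy change-log query.

    hours          -- how far back to look (the slide uses 24)
    tag_name       -- optional exact match on GObj.Tag_name
    operation_like -- optional substring match on the operation name
    """
    lines = [
        "SELECT Change.change_date, Change.user_profile_name,",
        "       Oper.operation_name, user_comment, GObj.Tag_name",
        "FROM gobject_change_log Change",
        "JOIN lookup_operation Oper ON Change.operation_id = Oper.operation_id",
        "JOIN Gobject GObj ON GObj.gobject_Id = Change.gobject_Id",
        "WHERE Change.change_date > DATEADD(hour, ?, GETDATE())",
    ]
    # DATEADD needs a negative offset to look into the past.
    params = [-abs(int(hours))]
    if tag_name is not None:
        lines.append("AND GObj.Tag_name = ?")
        params.append(tag_name)
    if operation_like is not None:
        lines.append("AND Oper.operation_name LIKE ?")
        params.append(f"%{operation_like}%")
    lines.append("ORDER BY Change.change_date DESC")
    return "\n".join(lines), params


sql, params = build_change_log_query(48, operation_like="Deploy")
print(sql)
print(params)
```

The returned pair can be passed to a database cursor's `execute(sql, params)` call. Values are sent as parameters rather than concatenated into the SQL text.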
Slide 12
Engine Attributes
Slide 13
Automatic MiniDump Generation
• When enabled, MiniDump lets any ArchestrA process write its process information to a dump file if it terminates abnormally or hits an exception error.
• If a minidump file is generated, it is created automatically at the default path:
<drive>:\Program Files\ArchestrA\Framework\minidump
• The minidump can be quite large depending on the process (200 to 800 MB)
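Because each dump can run to hundreds of megabytes, it is worth checking how much space the dump folder is using. The following is a minimal Python sketch, not part of the original deck. The function name `summarize_dumps` is hypothetical, and the `*.dmp` extension is an assumption about how the files are named.

```python
from pathlib import Path


def summarize_dumps(folder):
    """Return (file_count, total_megabytes) for *.dmp files in folder.

    The .dmp extension is an assumption; adjust the pattern if the
    dump files on your node are named differently.
    """
    files = [p for p in Path(folder).glob("*.dmp") if p.is_file()]
    total_bytes = sum(p.stat().st_size for p in files)
    return len(files), total_bytes / (1024 * 1024)


if __name__ == "__main__":
    # Replace with the minidump path on your node.
    count, megabytes = summarize_dumps(".")
    print(f"{count} dump file(s), {megabytes:.1f} MB in total")
```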
Slide 14
Enable MiniDump
Windows Registry Editor Version 5.00
[HKEY_LOCAL_MACHINE\SOFTWARE\ArchestrA\Framework\Debug]
"MinidumpEnabled"=dword:00000001
"MaximumDumpFilesAllowed"=dword:0000000a
"MinidumpType"=dword:00000003
Slide 15
Tech Note 726
Capturing a Memory Dump File Using the Microsoft® Debug Diagnostic Tool (32bit)
Slide 16
Wonderware Developer Network
https://wdn.wonderware.com
Slide 17
Contact Wonderware
Via Email: [email protected]
Via Phone (US): 1-800-WONDER1 (or 1-800-966-3371)
(International): 1-949-639-8500
• You will need to have a UserID.
Slide 18
Install issues…..
Slide 19
Unable to Modify install
• You have successfully installed the product.
• You go to modify the install via Add\Remove Programs.
• You see a quick flash and nothing happens.
• You uninstall and reinstall, yet the behavior is the same.
Slide 20
Missing Install folder
C:\Program Files (x86)\Common Files\ArchestrA\Install\{F7D0430C-1E73-4546-BB8C-3E23DF991668}\External
Slide 21
We need to copy it from the Root of the install:
Slide 22
Disable UAC during install
Must disable User Account Control (UAC) before installing
• Failing to do this can result in functional issues
– Which can lead to re-install
*NOTE: UAC also must be disabled on IDE and GR nodes.
ArchestrA System Platform 2012 R2 supports UAC-enabled operations on run-time nodes
Slide 23
Combating Deploy issues
Slide 24
Deploy issues
Failed to deploy <PcName>: Failed to get the bootstrap's version
• Cause: the security policies in the customer domain are blocking some unauthenticated RPC calls with anonymous impersonation level.
• Fix: static cloaking has been enabled for Bootstrap and GR, and code was added to impersonate the ArchestrA user before deploy/undeploy.
Slide 25
Failed to get the bootstrap's version
Slide 26
Failed to get the bootstrap's version
Slide 27
Known Issue with Windows 7 Client
• 3.1 SP3: L00124127
• 3.1 SP3 p01: L00122194
• 3.5: L00122265
• 3.6: L00123983
Slide 28
Deployment issues: What can go wrong?
•DCOM
•NMX Local mode
•Versions
•NIC Binding order
•aaBootstrap not responding
•aaLogger hanging
•Global Data Cache
•Platforms still deployed but removed from GR
Slide 29
Deployment Troubleshooting Checklist
• Check time synchronization between Platform nodes
• Configure network binding order when using multiple networks
• Disable TCP Offload Engine (TOE)
• Set up the ArchestrA Admin Account on all Platforms
• OSConfiguration utility: 3.1 SP3 P01 on 2008 sometimes requires the version from 2012 R2
Slide 30
Deployment Troubleshooting Checklist
• Make sure all Platforms have the same Version and Hotfixes
• Check Firewall Settings (required Ports are documented in the ReadMe of the Product)
• Check Tech Note 461, Troubleshooting Bootstrap communication
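The "same Version and Hotfixes" check is easy to get wrong by eye on larger systems. The following is a minimal Python sketch of one way to compare a hand-collected inventory; it is not part of the original deck. The function name, the node names, and the inventory format are all hypothetical, and the version and hotfix values have to be gathered from each node by whatever means you already use.

```python
from collections import defaultdict


def find_version_mismatches(inventory):
    """Group nodes by (version, hotfixes).

    inventory maps node name -> (version string, iterable of hotfix IDs).
    Returns an empty dict when every node matches, otherwise a dict
    mapping each distinct (version, hotfixes) combination to its nodes.
    """
    groups = defaultdict(list)
    for node, (version, hotfixes) in inventory.items():
        groups[(version, frozenset(hotfixes))].append(node)
    if len(groups) <= 1:
        return {}
    return {key: sorted(nodes) for key, nodes in groups.items()}


# Hypothetical inventory: APP02 is missing a hotfix the other nodes have.
inventory = {
    "GR01": ("3.6", {"L00126108"}),
    "APP01": ("3.6", {"L00126108"}),
    "APP02": ("3.6", set()),
}
for (version, hotfixes), nodes in find_version_mismatches(inventory).items():
    print(version, sorted(hotfixes), nodes)
```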
Slide 31
Deployment Logflags
Category packages responsible for deployment:
PlatformCategory Package – responsible for deploying the platform engine
EngineCategory Package – responsible for deploying redundant or non-redundant application engines
ApplicationCategory Packages – responsible for deploying Areas, DI Objects, and Application Objects
Slide 32
Deployment Logflags
Components involved in deploy/undeploy process:
WWPackageManager – component used by aaGR clients (IDE, GRAccess) to interact with WWPackageServer.
WWPackageServer – component running under aaGR (service running on the Galaxy Repository node), which is used for interacting with the database and for validating and sorting the objects that have to be deployed.
Slide 33
Deployment Logflags
Components involved in deploy/undeploy process:
Bootstrap – service that has to be installed on every IAS node, which among other functions is used during platform deployment/undeployment.
Platform Install Manager – responsible for installing all code modules on local or remote nodes using MSI.
Slide 34
Deployment Logflags
Components involved in deploy/undeploy process:
File Copy Service – responsible for copying the files to remote nodes.
DCOMTransport – the underlying transport used by the File Copy Service to transfer files between nodes.
Slide 35
GPO enabled
Unable to deploy/undeploy or configure objects in AppServer v3.6 with customer GPO enabled; access denied due to insufficient permissions in objects' "...\CheckedIn" and "...\CheckedOut" folders
Hot Fix
• L00126108 (3.6)
Slide 36
Known Issue
Deploy of a redundant engine without cascade causes all running objects to be lost.
• Hot Fix
– L00126469 (3.6)
Slide 37
Global Data Cache distribution
aaGlobalDataCacheMonitorSvr
• ArchestrA GlobalDataCacheMonitorServices. This service appears in the Task Manager once a platform is deployed to the machine. It handles information for the Areas and alarms via XML, and also handles security calls.
Slide 38
Global Data Cache
Slide 39
Overview Global Data Cache
[Diagram labels: GR Node, aaBootstrap.exe, Remote Platform, aaBootstrap.exe, aaGlobalDataCacheMonitorSvr]
Slide 40
Global Data Cache Issue
Couldn't get platform name - maybe the platform is not available at this time. IPlatformInformationClerk2::GetPlatformIdentity(PlatformID=xx), hr = 80040405
A Platform or Engine mismatch occurred because Data Cache distribution between the Platforms was not functioning.
To resolve the mismatch, redeploy the remote Platform.
• Hotfix
– L00125442 (3.1 SP3 p01)
– Addressed in 3.6 p01 release
Slide 41
Global Data Cache Issue
GlobalDataCache folders do not sync if the aaGlobalDataCacheMonitor service crashes or is restarted.
Hotfix
• L00125643 (3.6)
• Addressed in 3.6 p01 release
Slide 42
Orphaned platforms
Connection accepted from address <nodename1>, which differs from existing entry , address <nodename2>. New connection will be denied
• Root cause is an orphan platform which was removed from the galaxy improperly and is still trying to connect to the Galaxy
• Identify the node where the platform is running and remove it by using Platform Remover
Platform Exceed Maximum Heartbeats
Slide 44
Platform Exceed Maximum Heartbeats
Solution:
Set the proper value in your Platform and AppEngine Configuration Editor
Slide 45
Platform Remover (Killer)
Run as Administrator
Clear out Checkpoint files
• C:\Program Files (x86)\ArchestrA\Framework\Bin\CheckPointer
Clear out Cache folder
• <RootDrive>\ProgramData\ArchestrA\Cache
Slide 46
Platform Remover (Killer)
Fails to run when there are more than 100 platforms.
Slide 47
Scripting Considerations
•Using the right script
•Debugging
•Logmessage()
•What is Async for
•Script Timeout/Error
© Invensys proprietary & confidential
Slide 48
Engineering Efficiency
• Script Editor
• Auto complete function
• Me
• MyContainer
• Scripts
• Multi level Undo-Redo
• Line Numbering
• Consistent color coding
• Syntax Error Indication
Slide 49
2014
Engineering Efficiency
• Scripting: Exception Handling
Trap Exception
Handle Exception
Slide 50
2014
Let the Engine / Object Relax While First Loading
Use a While True script instead of an On True script for large tasks (such as setting IO references).
Delay with:
If Script.ExecutionCnt == 2
Slide 51
Use LogMessage()
Why send needless log messages to the logger unless required? Always wrap them in an IF statement:
If Me.Debug Then
    LogMessage(Me.msg);
EndIf;
Slide 52
Async Scripts
• SQL scripts should always be Async
• Engine.AsynScriptMaxThread default size is 5
• Engine.AsyncScriptsWaitingCnt
  • Use this for sizing AsynScriptMaxThread
Slide 53
Keep it Clean
Slide 54
Keep it Clean
WAS Clean-up Guide:
Improves time to open templates and objects.
Improves time to check-in objects and templates.
Deploying the InTouch app is faster.
Restoring a Galaxy is faster.
Backup is faster and smaller.
Slide 55
Keep it Clean
Tech Note 930
https://wdnresource.wonderware.com/support/kbcd/html/1/t002746.htm
Slide 56
Multi-Galaxy Communication?
Slide 57
Remote data
Symptom:
View does not show remote Galaxy data
Possible reasons:
MxData Service is not deployed
Discovery Services are not configured correctly
Platform is not deployed on the node where MxDataService is running
Remote node is not reachable
Slide 58
Secure Write
Symptom:
Writes do not work from InTouchView when security is enabled
Possible reasons:
Security mode of Galaxy is set to “Galaxy Security”
Security mode of InTouch is not set to “ArchestrA”
User has not logged into the remote Galaxy at least once
Default User Authentication service is not deployed on GR node
Security mode of local and remote Galaxies does not match
User does not have sufficient permissions to perform the write
Remote node is not reachable
Slide 59
ASBService OS Account
1. What if the ASBService OS account is not permitted? What account can be used to start the service?
2. Can the ASBService OS account be disabled?
Slide 60
ASBService related warnings
3. “ASBSecurity Proxy: Connect null FindResponse finding IManageASBSecurity on the SR node”
• The ArchestrA Watchdog service needs to be started before creating a new Galaxy
• Once the ArchestrA Watchdog service is fixed, the platforms have to be redeployed
Slide 61
ASBService related warnings
4. aaServicesDeployAgentHost -:- ASBSecurity Proxy: CallDisconnect delegate caught exception The communication object, System.ServiceModel.Channels.ServiceChannel, cannot be used for communication because it is in the Faulted state.
• Tech Alert 173
• Uninstall / Reinstall product
Slide 62
ASBService OS Account
Tech Alert 173
Cannot Create a Galaxy or Connect to Any Existing Galaxy After Renaming a Computer if Wonderware Application Server 2012 R2 (Version 3.6) is Already Installed on the Computer
Slide 63
Failed to UnpairWithGR…
If one of the Galaxies used as a Galaxy Pair in a Multi-Galaxy Configuration is unavailable, the pair cannot be "unpaired."
Slide 64
Failed to UnpairWithGR…
• System Platform requires both paired Galaxies to be present for unpairing to occur cleanly. Apart from the orphaned Galaxy pair showing in the paired Galaxy list, there is no adverse impact on the system's operation. To reduce orphaned Galaxy pairs, unpair Galaxies before disconnecting them from the network.
Slide 65
Hotfix
When using FSGateway in a multi-galaxy configuration and adding a large number of tags to FSGateway using an OPC Client, the tags get stuck in an initializing state.
Hotfix
• L00124824
Slide 66
Questions?
Slide 67
Latest issues
Slide 68
100% CPU on aaEngine.exe
Engines get stuck at 100% CPU
• NmxSvc is modified to ensure that it doesn't send an incorrect disconnect message to the remote platforms.
– Hotfix
• L00124013 (3.5 p01)
• L00127549 (3.6)
*Addressed in 3.6 p01 release.
Slide 69
RDI object
Bad items that do not exist in the PLC cause RDI to take the AppEngine down over time.
• Hotfix
–L00128094 (3.6)
Slide 70
Old Alarms
Old Alarms showing in Alarm Control
• They cannot be Acknowledged
Hotfix
• L00127843 (3.6)
Slide 71
Tech Alerts
TA # 173
• Cannot Create a Galaxy or Connect to Any Existing Galaxy After Renaming a Computer if Wonderware Application Server 2012 R2 (Version 3.6) is Already Installed on the Computer
Slide 72
Tech Alerts
Tech Alert 174
System Corruption Can Result when Importing Object Files (aaPKG) Created in a Higher Application Server Version
Cannot deploy objects after importing objects developed in 3.1 SP3 P01 to 3.1 SP3 (exists in all versions of Application Server up to 2012 R2)
Slide 73
Slide 74
Tech Alerts
Tech Alert 180
Silenced Alarms are not Logged in the WWAlmDB Database
Tech Alert 181
Platform Fails to Deploy on Server 2003 SP2 or XP SP3 Nodes When Using App Server 3.6 P01
Slide 75
Wonderware Developer Network
https://wdn.wonderware.com
Slide 76
Contact Wonderware
Via Email: [email protected]
Via Phone (US): 1-800-WONDER1 (or 1-800-966-3371)
(International): 1-949-639-8500
• You will need to have a UserID.
Slide 77
Questions?
Slide 78
Slide 79