- Posted by Gavin Soorma
- On May 15, 2018
- oratop, TFA, tfactl, trace file analyzer
Trace File Analyzer Collector, also known as TFA, is a diagnostic collection utility that greatly simplifies diagnostic data collection for both Oracle Database and Oracle Clusterware/Grid Infrastructure RAC environments.
Trace File Analyzer provides a central and single interface for all diagnostic data collection and analysis.
When a problem occurs, TFA collects all the relevant data at the time of the problem and consolidates it, even across multiple nodes in a clustered Oracle RAC environment. Because only the relevant diagnostic data is collected, packaged and uploaded to Oracle Support, resolution times are shorter. All the required diagnostic data is collected via a single tfactl command, instead of having to look individually through a number of database and cluster alert logs, trace files and dump files.
In addition to the core functionality of gathering, consolidating and processing diagnostic data, Trace File Analyzer comes bundled with a number of support tools which provide other useful information such as upgrade readiness, health checks for both Engineered and non-Engineered systems, OS performance graphs, Top SQL queries etc.
Oracle Trace File Analyzer is shipped with Oracle Grid Infrastructure (from version 11.2.0.4). However, it is recommended to download the latest TFA version from My Oracle Support Note 1513912.1, since the TFA bundled with Oracle Grid Infrastructure lacks many of the new features and bug fixes and, more importantly, the Oracle Database Support Tools bundle.
Oracle releases new versions of TFA several times a year. At the time of writing the most current version is Trace File Analyzer 18.1.1, which is available for download via MOS Note 1513912.1.
Select the version appropriate to your operating system. Note that TFA supports Oracle databases and Grid Infrastructure versions 11.2 upwards.
For a new installation, the recommended location is /opt/oracle.tfa. If TFA is already installed, it will be upgraded as part of the installation.
It is recommended to carry out the installation as the root user.
If root access is not available, the installation can be carried out by the ORACLE_HOME owner, but TFA will then run with reduced capabilities. Automatic collections and collection from remote hosts will not be available, nor will collection and analysis of files that the ORACLE_HOME owner cannot read, such as /var/log/messages and the log files of certain clusterware daemon processes.
The TFA download also includes Java Runtime Environment (JRE) version 1.8 which is required for running TFA.
To install TFA, download the appropriate platform specific zip file, copy it to the required machine and unzip. Then execute the file installTFA<platform>.
[root@autprorac1 oracle]# unzip TFA-LINUX_v18.1.1.zip
The installation prompts whether a local install or a cluster install is to be carried out. A cluster installation requires password-less SSH user equivalency for the root user to all cluster nodes. If this is not already configured, the installation can optionally set it up for the root user account.
The Oracle TFA has a daemon process which is configured to start automatically on system start up. It runs from init on UNIX systems or init/upstart/systemd on Linux and in the case of Microsoft Windows runs as a Windows Service.
To start or stop Oracle Trace File Analyzer daemon manually we can use the tfactl start or tfactl stop commands.
We can also enable or disable the automatic restarting of the Oracle Trace File Analyzer daemon via the tfactl enable or tfactl disable commands.
TFA is invoked via the tfactl command which in turn can be run from the command line or from within the Shell interface or via the TFA Menu interface.
TFA is configured to collect diagnostic information automatically for a number of specific Oracle errors and we can also configure it to collect diagnostics for any other user-defined Oracle errors as well.
For instance the following Oracle and Cluster errors would have automatic diagnostic collection enabled:
When TFA detects a problem, it collects the relevant diagnostic data related to the problem, going back 12 hours by default, and trims the log files it collects so that only the information required for problem diagnosis is gathered.
It then packages the diagnostic data and, in a clustered environment, consolidates it on one node. The diagnostic data is stored in the TFA repository and, if configured, TFA sends an email notification that a problem has been detected and that the packaged collection is available for upload to Oracle Support.
On-demand Analysis and Diagnostic Collection
In addition to the automatic collection which is configured by default, we can use TFA to analyze all logs and identify any recent errors by performing an on-demand collection and analysis of diagnostic data.
We can collect diagnostic data based on a search string such as 'ORA-00600' and also specify how far back in time, or over which time interval, the diagnostic data should be analyzed.
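As a minimal sketch of the search-plus-time-window idea, the helper below only prints a tfactl analyze command rather than running it. The function name build_analyze_cmd is hypothetical, and the validation assumes the look-back window is written as a number followed by h or d, the form used in the examples in this post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) a tfactl analyze command.
# Only the -search and -last flags shown in this post are used.
build_analyze_cmd() {
    local pattern="$1"            # search string, e.g. ORA-00600
    local window="${2:-1d}"       # look-back window, defaults to one day
    local count="${window%[hd]}"  # strip a single trailing h or d

    # Reject windows with no unit, no digits, or anything other than digits.
    case "$count" in
        ''|*[!0-9]*|"$window")
            echo "invalid window: $window (expected e.g. 8h or 1d)" >&2
            return 1
            ;;
    esac

    printf 'tfactl analyze -search "%s" -last %s\n' "$pattern" "$window"
}

# Dry run: shows the command that would be executed.
build_analyze_cmd "ORA-00600" 8h
```

Printing the command first makes it easy to review before pasting it into a session on the database host.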
Oracle Database support tools bundle
The bundle is only available in the TFA version downloaded from My Oracle Support via Note 1513912.1.
- ORAchk and EXAchk: Perform health checks as well as upgrade readiness checks of the entire stack for both Engineered and non-Engineered systems
- oswatcher: Utility to capture and store performance metrics from the operating system
- procwatcher: Monitor and collect stack traces of database and clusterware processes using tools like oradebug and pstack
- oratop: Utility similar to the Unix top utility which gives a real-time overview of performance from a database perspective and can be used in combination with top to get a more complete overview of system performance
- summary, alertsummary, events: High level configuration summary as well as event details along with a summary of events from clusterware and database alert logs across all nodes
- param: Find and display database and OS parameters that match a specified pattern
- changes: Reports system changes for a given period of time, including database parameters, operating system parameters and patches applied
- Other utilities and tools: vi ls grep ps
One Command Service Request Data Collections
Very often when we raise a Service Request with Oracle Support, we are asked to provide additional log and trace files to help Oracle Support better diagnose the problem. Collecting the various log and trace files individually can be a laborious task and we may miss out collecting an important log file required by Oracle Support.
Oracle TFA now provides a single command SRDC (Service Request Data Collection) to collect exactly what is needed by Oracle Support (as well as the DBA) to diagnose a specific problem.
A wide variety of SRDCs are available covering Oracle errors like ORA-00600 and ORA-07445, database performance problems (dbperf), database resource problems (dbunixresource), database install and upgrade problems (dbinstall, dbupgrade), database storage problems (dbasm) etc.
Based on the SRDC, TFA scans and analyzes the relevant log and trace files and then trims those files so that they contain only the required diagnostic information. The data is then packaged into a zip file which can be uploaded to Oracle Support.
For example, the TFA command tfactl diagcollect -srdc dbperf will generate a bundled package containing files like the AWR report, ADDM report, ASH report, OSWatcher and ORAchk performance related checks.
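The SRDC invocation can be wrapped in a small guard, sketched below. The function name srdc_cmd is hypothetical, and the accepted list contains only the SRDC names mentioned in this post, so it is not a complete catalogue of what TFA supports. The helper prints the command instead of running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the diagcollect command for a known SRDC type.
# The list below is limited to the SRDC names mentioned in this post.
srdc_cmd() {
    local srdc="$1"
    case "$srdc" in
        ORA-00600|ORA-07445|dbperf|dbunixresource|dbinstall|dbupgrade|dbasm)
            printf 'tfactl diagcollect -srdc %s\n' "$srdc"
            ;;
        *)
            echo "SRDC not in this post's list: $srdc" >&2
            return 1
            ;;
    esac
}

# Dry run for a database performance collection.
srdc_cmd dbperf
```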
Trace File Analyzer Repository
TFA stores all diagnostic data collections in the repository. The maximum size of the repository is the lower of 10 GB or 50% of the free disk space available in the repository directory. The repository is located in the sub-directory tfa/repository under the Trace File Analyzer top-level installation directory.
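The sizing rule can be worked through with a few lines of shell arithmetic. This is only an illustration of the "lower of the two" rule as described here, not TFA's own code. The function name repo_max_size_mb is hypothetical and 10 GB is assumed to mean 10240 MB.

```shell
#!/usr/bin/env bash
# Hypothetical helper: repository size limit in MB for a given amount of
# free disk space (also in MB), following the rule described in this post.
repo_max_size_mb() {
    local free_mb="$1"
    local cap_mb=10240                      # 10 GB, assuming 1 GB = 1024 MB
    local half_free_mb=$(( $free_mb / 2 ))  # 50% of the free space

    # The limit is whichever of the two values is lower.
    if [ "$half_free_mb" -lt "$cap_mb" ]; then
        echo "$half_free_mb"
    else
        echo "$cap_mb"
    fi
}

repo_max_size_mb 50000   # plenty of free space: the 10 GB cap applies
repo_max_size_mb 8000    # little free space: half of it applies
```

On a real host the free space could be fed in from df, for example `df -Pk /opt/oracle.tfa | awk 'NR==2 {print int($4/1024)}'`, assuming the repository lives under the recommended installation location.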
The amount of data collected in the repository is determined by the Trace Level parameter which defaults to the value 1. The possible values are in the range 1-4 and a higher value will obviously lead to the repository being filled at a faster rate.
The Oracle TFA daemon process monitors the repository and automatically purges it when free space falls below 1 GB. By default only collections older than 12 hours are purged, and this minimum age can be changed via the minagetopurge parameter.
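The purge rule combines two conditions, which the sketch below makes explicit. It is an illustration of the policy as described here rather than the daemon's real logic. The function name collection_purgeable is hypothetical, 1 GB is assumed to mean 1024 MB, and the optional third argument stands in for minagetopurge.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds (exit 0) when a collection would be purged.
# Usage: collection_purgeable FREE_MB AGE_HOURS [MIN_AGE_HOURS]
collection_purgeable() {
    local free_mb="$1" age_hours="$2" min_age_hours="${3:-12}"

    # Both conditions must hold: free space under 1 GB (assumed 1024 MB)
    # and the collection older than the minimum age to purge.
    [ "$free_mb" -lt 1024 ] && [ "$age_hours" -gt "$min_age_hours" ]
}

if collection_purgeable 800 20; then
    echo "purge"
else
    echo "keep"
fi
```

The Oracle documentation shows the minimum age being changed with a command of the form `tfactl set minagetopurge=48`. Treat the exact syntax as something to confirm against your TFA version.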
Trace File Analyzer Command Examples
- Viewing System and Cluster Summary
tfactl summary
- To find all errors in the last one day
tfactl analyze -last 1d
- To find all occurrences of a specific error (in this case ORA-00600 errors)
tfactl analyze -search "ora-00600" -last 8h
- To set the notification email to use
tfactl set notificationAddress=firstname.lastname@example.org
- Enable or disable Automatic collections (ON by default)
tfactl set autodiagcollect=OFF
- Adjusting the Diagnostic Data Collection Period
tfactl diagcollect -last 1h
tfactl diagcollect -from "2018-03-21"
tfactl diagcollect -from "2018-03-21" -to "2018-03-22"
- Analyze, trim and zip all files updated in the last 12 hours, including Cluster Health Monitor and OSWatcher data, from across all nodes of the cluster
tfactl diagcollect -all -last 12h
- Run collection from specific nodes in a RAC cluster
tfactl diagcollect -last 1d -node rac01
- Run collection for a specific database
tfactl diagcollect -database hrdb -last 1d
- Uploading collections to Oracle Support
Execute tfactl setupmos to configure Oracle Trace File Analyzer with MOS user name and password followed by
tfactl diagcollect -last 1d -sr 1234567
- Search database alert logs for the string "ORA-" from the past one day
tfactl analyze -search "ORA-" -comp db -last 1d
- Display a summary of events collected from all alert logs and system logs from the past six hours
tfactl analyze -last 6h
- View the summary of a TFA deployment. This displays cluster node information as well as information about the database and grid infrastructure software homes, such as version, patches installed and databases running.
tfactl summary
- Grant access to a user
tfactl access add -user oracle
- List users with TFA access
tfactl access lsusers
- Run orachk
tfactl run orachk
- Display current configuration settings
tfactl print config
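The time-window variants of diagcollect listed above can be generated by one small helper, sketched here under a few assumptions. The function name diagcollect_cmd and its last/range sub-commands are hypothetical. Only the -last, -from and -to flags from the examples are used, and the command is printed rather than executed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a tfactl diagcollect command for a time window.
# Usage: diagcollect_cmd last 12h
#        diagcollect_cmd range FROM [TO]
diagcollect_cmd() {
    case "$1" in
        last)
            printf 'tfactl diagcollect -last %s\n' "$2"
            ;;
        range)
            if [ -n "${3:-}" ]; then
                # Both ends of the interval were given.
                printf 'tfactl diagcollect -from "%s" -to "%s"\n' "$2" "$3"
            else
                # Open-ended: everything from the given date onwards.
                printf 'tfactl diagcollect -from "%s"\n' "$2"
            fi
            ;;
        *)
            echo "usage: diagcollect_cmd last WINDOW | range FROM [TO]" >&2
            return 1
            ;;
    esac
}

diagcollect_cmd last 12h
diagcollect_cmd range 2018-03-21 2018-03-22
```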