Wednesday, November 29, 2017

What does db2top Start Date and Start Time actually mean?

While supporting a high-volume transaction period in our shop, I noticed that db2top Start Date and Start Time (option d for database) showed a very recent date and time. I was 100% sure that the DB was not restarted or activated recently.

[-]19:57:36,refresh=2secs(0.002)                                 Database                                AIX,member=[1/1],DB2INST1:MYDB

                                  lqqqqqqqqqqqqqqwqqqqqqqqqqqqwqqqqqqqqqqqqwqqqqqqqqqqqqwqqqqqqqqqqqk
                                  x              x         25%x         50%x         75%x       100%x
                                  xMaxActSess    x                                                  x
                                  xSortMemory    x                                                  x
                                  xLogUsed       x-                                                 x
                                  xFCM BufLow    x                                                  x
                                  mqqqqqqqqqqqqqqvqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqj

                              Start Date Start Time     Status    Shthres    Buffers     FCMBuf   OtherMem
                              2017/11/25   09:23:44     Active          0      16.4G     768.0K       2.2G

                                Sessions    ActSess   LockUsed LockEscals  Deadlocks   LogReads  LogWrites
                                      48          0         0%          0          0          0          0

                                 L_Reads    P_Reads   HitRatio    A_Reads     Writes   A_Writes  Lock Wait
                                   7,197          0    100.00%      0.00%          0          0          0

                                Sortheap    SortOvf PctSortOvf AvgPRdTime AvgDRdTime AvgPWrTime AvgDWrTime
                                   20.0K          3      2.50%       0.00       0.00       0.00       0.00


My co-worker did some research and gave a beautiful explanation. Thanks Mr. Manoj. Here it is.

This command gives you info on when the DB was deactivated/activated. But there are some caveats, read carefully:

Go to the db2diaglog path and issue this.

cat db2diag.log|grep -ip "FUNCTION: DB2 UDB, base sys utilities, sqeLocalDatabase::FreeResourcesOnDBShutdown"

The output will look something like this. Give attention to the last line in each paragraph:

2017-10-17-15.52.19.081050-240 E1495007A440         LEVEL: Event
PID     : 15073374             TID : 26036          PROC : db2sysc 0
INSTANCE: db2inst1             NODE : 000           DB   : MYDB
APPHDL  : 0-3
HOSTNAME: myhost
EDUID   : 26036                EDUNAME: db2agent (idle) 0
FUNCTION: DB2 UDB, base sys utilities, sqeLocalDatabase::FreeResourcesOnDBShutdown, probe:15579
STOP    : DATABASE: MYDB : DEACTIVATED: YES

2017-10-24-07.35.32.144318-240 E110103488A527       LEVEL: Event
PID     : 33572592             TID : 29101          PROC : db2sysc 0
INSTANCE: db2inst1             NODE : 000           DB   : MYDB
APPHDL  : 0-29188              APPID: 100.00.000.00.12345.171024112052
AUTHID  : myid                 HOSTNAME: myhost
EDUID   : 29301                EDUNAME: db2agent (idle) 0
FUNCTION: DB2 UDB, base sys utilities, sqeLocalDatabase::FreeResourcesOnDBShutdown, probe:15579
STOP    : DATABASE: MYDB : DEACTIVATED: NO

2017-11-25-09.23.43.726522-300 E261286318A515       LEVEL: Event
PID     : 33572592             TID : 40505          PROC : db2sysc 0
INSTANCE: db2inst1             NODE : 000           DB   : MYDB
APPHDL  : 0-52137              APPID: 172.16.104.17.38242.171125142340
AUTHID  : MYID                 HOSTNAME: myhost
EDUID   : 40505                EDUNAME: db2agent (mydb) 0
FUNCTION: DB2 UDB, base sys utilities, sqeLocalDatabase::FirstConnect, probe:1000
START   : DATABASE: MYDB : ACTIVATED: NO

ACTIVATED or DEACTIVATED “YES” means, the database has been activated or deactivated by an user that has the access to do it, such a DBA or System Admin.

ACTIVATED “NO” means, the database has been activated by DB2 automatically when the database was not started manually, and a CONNECT TO (or an implicit connect) has been issued in an application.

DEACTIVATED “NO” means, the database has been deactivated by DB2 automatically when the last application has disconnected from the DB.


Best practice is to activate the DB manually as soon as the database is started.

Monday, June 26, 2017

Notepad++ Technical Tips & Tricks


  • To add a text at the start or end of each line, use regular expressions.
  • Click Ctrl+H, and in the "Find What" space, enter ^b if you want to add a text at the beginning of the line or $ if you want to add a text at the end of the line. In the "Replace With" space, enter the text that you want to add at the beginning or end of the line and click Replace or Replace All

Thursday, August 4, 2016

How to find the database connection strings?

To find the database connection strings such as url, port number etc:

Database URL can be found using the command ifconfig -a.

en1: flags=xxxxx,480
        inet xxx.xxx.xx.xx netmask 0xffffff00 broadcast xxx.xxx.xx.xx
         tcp_sendspace 12345 tcp_recvspace 12345 rfc1323 1
en2: flags=xxxxx,480
        inet xxx.xxx.xx.xx netmask 0xffffff00 broadcast xxx.xxx.xx.xx
         tcp_sendspace 12345 tcp_recvspace 12345 rfc1323 1
lo0: flags=xxxxx,c0
        inet xxx.xxx.xx.xx netmask 0xff000000 broadcast xxx.xxx.xx.xx
         tcp_sendspace 12345 tcp_recvspace 12345 rfc1323 1


Port number can be found using these steps:

  • db2 get dbm cfg|grep -i svcename
    •  TCP/IP Service name                          (SVCENAME) = db2_db2inst1

  • Search for the svcename value in /etc/services file (cat /etc/services|grep -i db2_db2inst1)
    • db2_db2inst1   50000/tcp 

  • Check if the port is listening
    • netstat -Aan|grep -i 50000

Friday, May 20, 2016

DB2 Export error - SQL10018N The disk is full

While trying to run this export command
db2 "export to /filepath/schema.tablename.ixf of ixf messages v.msgs select * from wcsproln.catentryattr

I got this error "SQL10018N  The disk is full.  Processing was ended.

df -g showed that /proddump had 19 GB free space.

Looked into the db2diag log and found this:

2016-05-17-18.51.46.775102-240 E48658514A815      LEVEL: Error (OS)
PID     : 48627918             TID  : 1           PROC : db2bp
INSTANCE: db2inst1             NODE : 000
APPID   : *LOCAL.db2inst1.160517215412
EDUID   : 1
FUNCTION: DB2 UDB, oper system services, sqlowrite, probe:60
MESSAGE : ZRC=0x850F000C=-2062614516=SQLO_DISK "Disk full."
          DIA8312C Disk was full.
CALLED  : OS, -, write
OSERR   : EFBIG (27) "A file cannot be larger than the value set by ulimit."
DATA #1 : File handle, PD_TYPE_SQO_FILE_HDL, 8 bytes
0x0FFFFFFFFFFFE170 : 0000 0004 0000 0080                        ........
DATA #2 : unsigned integer, 8 bytes
34
DATA #3 : signed integer, 8 bytes
-1
DATA #4 : String, 105 bytes
Search for ossError*Analysis probe point after this log entry for further
self-diagnosis of this problem.

I did some research on this error message "A file cannot be larger than the value set by ulimit.” and found that file size limit set may be too low.

Command “ulimit -a” showed the limits set:

time(seconds)        unlimited
file(blocks)         2097151
data(kbytes)         131072
stack(kbytes)        32768
memory(kbytes)       32768
coredump(blocks)     2097151
nofiles(descriptors) 2000
threads(per process) unlimited
processes(per user)  unlimited

There are two ways to change these values:

  1. Edit the limits file under /etc/security/limits (takes effect after reboot) 
  2. Use the chuser command to change individual user settings (logout and login required)
Used the chuser command to change the fsize value:

chuser fsize=5000000 db2inst1

Logged off and logged back in and tried the export command. Voila !! It worked.

Wednesday, May 11, 2016

DB2 LUW - Connection refused. ERRORCODE=-4499, SQLSTATE=08001

Application developer says that he is not able to connect to the database, he's getting a message similar to this:
"Error opening socket to server /nnn.nnn.nnn.nnn on port 50,000 with message: Connection refused. ERRORCODE=-4499, SQLSTATE=08001"

How do you resolve it?


  • Get the TCP/IP service name: 
    • Issue this command: db2 get dbm cfg|grep -i svce
    • The output will be something like this:
    • TCP/IP Service name                          (SVCENAME) = svce_db2inst1
    •  
  • Verify the port number:
    • Issue this command: cat /etc/services|grep -i svce_db2inst1
    • The output will be something like this: svce_db2inst1   50000/tcp
    •  
  • Make sure the port is listening:
    • Issue this command: netstat -Aan|grep -i 50000
    • If you do not see any output, it means the port is not listening and that's a problem
  • If the port is not listening, check the db2set parameter:
    • Issue this command:  db2set
    • If you don't see a DB2COMM parameter, then it needs to be set for applications to communicate with DB2
    • Update the DB2COMM parameter with this command: db2set db2comm=TCPIP
    • Stop and restart the DB2 instance: db2stop and db2start 
  • After restarting DB2, check if the port is listening by issuing the command "netstat -Aan|grep -i 50000". If the port is listening, you'll see the output similar to this:
    • f1000e00057aebb8 tcp4       0      0  *.50000               *.*                   LISTEN
    • f1000e00054ddbb8 tcp4       0      0  nnn.nnn.nnn.nnn.50000   nnn.nnn.nnn.mmm 33xxx   ESTABLISHED
    • f1000e000b3653b8 tcp4       0      0  nnn.nnn.nnn.nnn.50000   nnn.nnn.nnn.mmm 33yyy   ESTABLISHED
HTH !!

Monday, May 2, 2016

DB2 LUW - List of tables involved in a referential integrity chain

How do you find all the tables that are connected through referential integrity?

For example, let's assume that I've a table called TABA. This is a parent of TABB and TABB is a parent of TABC and TABC is a parent of TABD and so on. How do I find the all these related tables?

Method 1 - Query SYSCAT.REFERENCES:

db2 "select substr(TABNAME,1,30) as PARENT, substr(CONSTNAME,1,20) as CONSTRAINT, substr(REFTABNAME,1,30) as CHILD, substr(FK_COLNAMES,1,40) as REF_COL from syscat.references where tabname='TABA'"

This will give the child tables of TABA. Run the above query again for the child tables. Continue doing this until the last child table in the chain.

Method 2 - Through db2expl:

Run db2expl on "delete from tabschema.tabname" and parse the output.

db2expln -d lwscomdb -g -q "delete from SCHEMA.TABA" -t | grep -i "Delete:  Table Name"

This will do a db2 explain on the query to delete and then parse the output to get all the table names involved in the referential integrity chain.



My personal preference is Method 2.

Which method do you prefer and why? If you use a different method, please post it as well.

Monday, September 15, 2014

Set up mainframe SLIP trap

Often times, when you are diagnosing a problem on mainframe, if you contact the tech support of the product, you will be asked to set up a SLIP trap.

When you set up SLIP trap, you can indicate what kinds of events you want trapped and the system conditions for the trap, then specify what action the system is to take when the event occurs during the specified conditions. More info on SLIP trap

How do you set up SLIP trap?

Your SLIP trap may look something as simple as this:

SLIP SET,A=SVCD,COMP=0C4,ID=KC01,JOBLIST=(jobname),END

or something as complicated as this:

SLIP SET,IF,ID=slipid,A=SYNCSVCD,ML=1,ASID=(xxxxx),
ASIDLST=(xxxxx,yyyy,zzzz),PVTMOD=(module,00B0),
DATA=((12R?+60E,EQ,54475249,OR,12R?+60E,EQ,E3C7D9C9),
AND,12R?+756,EQ,E4D5D2D5,
AND,12R?+4A8,EQ,C4C9E2E3,AND,12R?+4AC,EQ,E2C5D9E5),END


If the command spreads for 2 lines are less, you can enter it in SDSF - System Command Extension panel:


However, if it spans for more than 2 lines, you cannot enter it in this panel. You need to create a member in SYS1.PARMLIB with the naming convention IEASLPii and save the command in this member. After saving this member, issue this command on SDSF

/T SLIP=ii

This will enable the SLIP trap.

After you set up the trap, run this MVS system command to display it's status:
/RO lpar,D SLIP=slipid

HTH.

Wednesday, August 13, 2014

DFSMS Message ARC1237I - No Space for Migration Copy

This morning I was trying to SMP/E RECEIVE a bunch of PTFs as part of a DB2 tool maintenance. After receiving 20 PTFs, the job went down with a storage space abend IEC032I E37-04 on the SMPPTS dataset.

I noticed that the SMPPTS dataset already had 16 extents. I tried to add some cylinders, but I couldn't. So I tried to change the SMS volume by HMIGRATEing and HRECALLing the dataset. But the HMIG command failed with this message:

ARC1001I hlq.SMPPTS MIGRATE FAILED, RC=0037, REAS=0000
ARC1237I NO SPACE FOR MIGRATION COPY    

I tried migrating the dataset directly to Level 2 by issuing this command:

HMIG / ML2

When I issued this command, it failed with the message:

ARC1001I hlq.SMPPTS MIGRATE FAILED, RC=0080, REAS=0000
ARC1280I MIGRATION FAILED - DATA SET IS IN NEED OF BACKUP

I issued this command to take a DFSMS backup:

HBACKDS

After the backup was complete, I tried to migrated to L2, it worked !!

After the migrate was complete, I recalled the dataset. It went to a different volume with more space and also had only 1 extent filled.

Monday, July 21, 2014

DB2 Performance Improvement tip using COALESCE function

One of the developers at our shop fed his query to an optimizer tool and the optimized query looked kind of weird.

This is the original query:
DELETE FROM TABLE_A
 WHERE COL1 IN (SELECT COL1
                          FROM TABLE_A R
                               INNER JOIN TABLE_B O
                                  ON O.COL2 = R.COL1
                         WHERE O.COL3 != R.COL4)


Query suggested by the optimizer:
DELETE
  FROM TABLE_A
 WHERE EXISTS (SELECT 'X'
                 FROM TABLE_A R
                      INNER JOIN TABLE_B O
                         ON R.COL4 <> COALESCE (O.COL3, O.COL3)
                            AND R.COL1 = COALESCE (O.COL2, O.COL2)
                WHERE COL1 = TABLE_A.COL1
                ORDER BY R.COL4)

The puzzling part here is in the revised query: COALESCE (O.COL3, O.COL3). Why would anyone feed the same column twice to the COALESCE function? It's going to give the same result.

Here is the explanation by IBM though: http://www-01.ibm.com/support/docview.wss?uid=swg21259831

In their own words:

Question
Why might adding a special form of the COALESCE predicate produce cheaper data access plans and result in better performance.
Cause
A "no-op" coalesce() predicate of the form "COALESCE(X, X) = X" introduces an estimation error into the planning of any query using it. Currently the DB2 query compiler doesn't have the capability of dissecting that predicate and determining that all rows actually satisfy it. As a result, the predicate artificially reduces the estimated number of rows coming from some part of a query plan. This smaller row estimate usually reduces the row and cost estimates for the rest of query planning, and sometimes results in a different plan being chosen because relative estimates between different candidate plans have changed.
Why can this do-nothing predicate sometimes improve query performance? The addition of the "no-op" coalesce() predicate introduces an error that masks something else that is preventing optimal performance.

What some performance enhancement tools do is a brute-force test: the tool repeatedly introduces the predicate into different places in a query, operating on different columns, to try to find a case where, by introducing an error, it stumbles onto a better-performing plan. This is also true of a query developer hand-coding the "no-op" predicate into a query. Typically, the developer will have some insight on the data to guide the placement of the predicate.

Using this method to improve query performance is a short-term solution which does not address root cause. It hides potential areas for performance improvements and could have the following implications:
This workaround does not guarantee permanent performance improvements. The DB2 query compiler might eventually handle the predicate better, or it might be affected by other random factors.
Other queries might be affected by the same root cause. The performance of your system in general might be suffering as a result.
Answer
Try to identify and address root cause by determining why the original plan chosen by the query compiler did not perform optimally.

Although the actual estimates in a better-performing plan using "COALESCE(X, X) = X" cannot be trusted, such a plan can still be useful because its "shape" (for example, its join order and access methods) can be used as clues to determine where a problem might lie with the original plan.

Here are some examples of questions that can help to better identify the source of performance issues:
Are the statistics out-of-date?
Should more statistics be collected?
Are there statistical correlations that have not been detected?
Are there other characteristics of the data that are not yet captured or modeled?
Are there hardware problems? (e.g. a disk is misbehaving)
Are there configuration problems?
Are any queries stuck behind a lock?

Friday, September 27, 2013

DB2 Stored Procedure Last Used Date

Ever wonder when was a stored procedure last used/invoked in DB2 LUW? Knowing this may be helpful in determining whether a stored procedure is still in use or how recently it was used. Run this query to find this information.

SELECT PR.PROCSCHEMA
      ,PR.PROCNAME
      ,PK.LASTUSED
  FROM SYSCAT.PROCEDURES PR
      ,SYSCAT.ROUTINEDEP RD
      ,SYSCAT.PACKAGES   PK
 WHERE PR.SPECIFICNAME =  RD.ROUTINENAME
   AND RD.BNAME        =  PK.PKGNAME
   AND PR.PROCSCHEMA  <> 'SYSIBMADM'
   AND PK.LASTUSED     > '01/01/0001'
 ORDER BY PK.LASTUSED DESC
  WITH UR;

Remember that, you can use this query only from DB2 LUW V9.7 onwards.