Perth, Australia
+ (61) 417713124
prosolutions@gavinsoorma.com

GoldenGate Initial Load Methods Oracle source to SQL Server 2012 target

  • Posted by Gavin Soorma
  • On April 1, 2014
  • 0 Comments
  • BCP, BULK INSERT, extract, GoldenGate, Initial Load Methods, Oracle, sourceistable, SQL server 2012

In this post we will look at three different methods of performing an initial data load from an Oracle 11g source database running on an HP-UX IA64 platform to a SQL Server 2012 target database hosted on Windows 2012 Datacenter.

The three methods we are using here are:

1) Oracle GoldenGate Direct Load over network without trail files
2) Oracle GoldenGate File to Replicat method
3) Oracle GoldenGate File with SQL Server BULK INSERT

These are some of the results obtained in our testing:

Initial load extract:

Between 2 and 3 million rows per minute

PRD.SH_BATCH_LOG table with 8001500 rows extracted in 4:30 minutes

PRD.AC_TRANSACTION_RACI table with 74104323 rows extracted in 21 minutes

Initial load replicat:

Between 1 to 1.5 million rows every 2 minutes

With single replicat process, table PRD.SH_BATCH_LOG with 7895001 rows took 15 minutes.

With 3 parallel replicat processes, the same 7.8 million row table was loaded in under 5 minutes. Each replicat processed about 2.6 million rows each.

3 parallel replicat processes pushed CPU utilization to around 60-70% mark but not higher.

Using 5 parallel replicat processes we were able to load a 177 million row table in little over 3 hours

The best performance obtained was using SQL Server BULK INSERT, where we were able to load 8 million rows in around 2 minutes.

 

1) Oracle GoldenGate Direct Load over network without trail files

Note – in the Direct Load method, no trail files are created – but this is not very efficient method for a large table.

GGSCI (db02) 2> edit params defgen

DEFSFILE ./dirdat/source.def,
USERID GGATE_OWNER@REDEVDB2, PASSWORD ggate
TABLE PRD.T_PRODUCT_LINE;


oracle@db02:/u01/oracle/goldengate > ./defgen paramfile /u01/oracle/goldengate/dirprm/defgen.prm

Copy source.def to ./dirdef directory on Windows 2012 server

Create the Initial Load Extract

GGSCI (db02) 5> add extract extinit1 sourceistable
EXTRACT added.


extract extinit1
userid ggate_owner password ggate
RMTHOST DCV-RORSQL-N001.local, MGRPORT 7809
RMTTASK REPLICAT, GROUP rinit1
TABLE PRD.T_PRODUCT_LINE;

Next create the table in the SQL Server database

Create the initial load replicat on the target SQL Server GoldenGate environment

GGSCI (DCV-RORSQL-N001) 28> add replicat repinit1 specialrun
REPLICAT added.


GGSCI (DCV-RORSQL-N001) 29> edit params rinit1

replicat rinit1
TARGETDB sqlserver2012
SOURCEDEFS ./dirdef/source.def
MAP PRD.T_PRODUCT_LINE, TARGET PRD.T_PRODUCT_LINE;

Start the initial load extract

GGSCI (db02) 11> start extract extinit1

Sending START request to MANAGER ...
EXTRACT EXTINIT1 starting


GGSCI (db02) 12> info extract extinit1

EXTRACT    EXTINIT1  Initialized   2014-02-25 10:26   Status RUNNING
Checkpoint Lag       Not Available
Log Read Checkpoint  Not Available
                     First Record         Record 0
Task                 SOURCEISTABLE


GGSCI (db02) 13> info extract extinit1

EXTRACT    EXTINIT1  Last Started 2014-02-25 10:59   Status RUNNING
Checkpoint Lag       Not Available
Log Read Checkpoint  Table PRD.T_PRODUCT_LINE
                     2014-02-25 10:59:03  Record 1
Task                 SOURCEISTABLE


GGSCI (db02) 14> info extract extinit1

EXTRACT    EXTINIT1  Last Started 2014-02-25 10:59   Status STOPPED
Checkpoint Lag       Not Available
Log Read Checkpoint  Table PRD.T_PRODUCT_LINE
                     2014-02-25 10:59:06  Record 801
Task                 SOURCEISTABLE

When the Extract shows the status STOPPED, we can check the target PRD.T_PRODUCT_LINE via the SQL Server 2012 Management Studio and we find that 801 rows have been inserted in the table.

Note – in this method we do not need to start the replicat process on the target.

 

2) Oracle GoldenGate File to Replicat method

In this method we will create 3 replicat processes which will be running in parallel processing the trail files which are generated by the extract process.

The table has 74 million rows.

Create initial load extract

GGSCI (db02) 15> add extract extinit2 sourceistable
EXTRACT added.

GGSCI (db02) 3> edit params extinit2

extract extinit2
userid ggate_owner password ggate
RMTHOST DCV-RORSQL-N001.local, MGRPORT 7809,  tcpbufsize 10485760, tcpflushbytes 10485760
rmtfile ./dirdat/te, maxfiles 999999, megabytes 400, purge
reportcount every 300 seconds, rate
TABLE PRD.AC_TRANSACTION_RACI;

As in the first method, we create the definitions file using DEFGEN and then copy the generated file to the dirdef directory on the target SQL Server GoldenGate software home.


oracle@db02:/u01/oracle/goldengate > ./defgen paramfile /u01/oracle/goldengate/dirprm/defgen.prm

Create three parallel replicat groups

GGSCI (DCV-RORSQL-N001) 39> add replicat repinit2  exttrail ./dirdat/te
REPLICAT added.


GGSCI (DCV-RORSQL-N001) 40> add replicat repinit3  exttrail ./dirdat/te
REPLICAT added.


GGSCI (DCV-RORSQL-N001) 41> add replicat repinit4  exttrail ./dirdat/te
REPLICAT added.



GGSCI (DCV-RORSQL-N001) 42> edit params repinit2


GGSCI (DCV-RORSQL-N001) 43> edit params repinit3


GGSCI (DCV-RORSQL-N001) 44> edit params repinit4


GGSCI (DCV-RORSQL-N001) 45> view params repinit2
replicat repinit2
targetdb sqlserver2012
SOURCEDEFS ./dirdef/source.def
reportcount every 60 seconds, rate
overridedups
end runtime
MAP PRD.AC_TRANSACTION_RACI, TARGET PRD.AC_TRANSACTION_RACI , filter (
@RANGE (1,3));


GGSCI (DCV-RORSQL-N001) 46> view params repinit3
replicat repinit3
targetdb sqlserver2012
SOURCEDEFS ./dirdef/source.def
reportcount every 60 seconds, rate
overridedups
end runtime
MAP PRD.AC_TRANSACTION_RACI, TARGET PRD.AC_TRANSACTION_RACI , filter (
@RANGE (2,3));


GGSCI (DCV-RORSQL-N001) 47> view params repinit4
replicat repinit4
targetdb sqlserver2012
SOURCEDEFS ./dirdef/source.def
reportcount every 60 seconds, rate
overridedups
end runtime
MAP PRD.AC_TRANSACTION_RACI, TARGET PRD.AC_TRANSACTION_RACI , filter (
@RANGE (3,3));

Start the initial load extract



GGSCI (db02) 4> start extract extinit2

Sending START request to MANAGER ...
EXTRACT EXTINIT2 starting

GGSCI (db02) 5> info extract extinit2

EXTRACT    EXTINIT2  Initialized   2014-02-25 11:27   Status RUNNING
Checkpoint Lag       Not Available
Log Read Checkpoint  Not Available
                     First Record         Record 0
Task                 SOURCEISTABLE


GGSCI (db02) 6> !
info extract extinit2

EXTRACT    EXTINIT2  Last Started 2014-02-25 11:56   Status RUNNING
Checkpoint Lag       Not Available
Log Read Checkpoint  Table PRD.AC_TRANSACTION_RACI
                     2014-02-25 11:56:51  Record 1
Task                 SOURCEISTABLE


GGSCI (db02) 9> !
info extract extinit2

EXTRACT    EXTINIT2  Last Started 2014-02-25 11:56   Status RUNNING
Checkpoint Lag       Not Available
Log Read Checkpoint  Table PRD.AC_TRANSACTION_RACI
                     2014-02-25 11:57:31  Record 2312001
Task                 SOURCEISTABLE

While the initial load extract is running start the three parallel replicat processe


GGSCI (DCV-RORSQL-N001) 48> start replicat repinit2

Sending START request to MANAGER ('MANAGER') ...
REPLICAT REPINIT2 starting


GGSCI (DCV-RORSQL-N001) 49> start replicat repinit3

Sending START request to MANAGER ('MANAGER') ...
REPLICAT REPINIT3 starting


GGSCI (DCV-RORSQL-N001) 50> start replicat repinit4

Sending START request to MANAGER ('MANAGER') ...
REPLICAT REPINIT4 starting


GGSCI (DCV-RORSQL-N001) 51> info replicat repinit2

REPLICAT   REPINIT2  Last Started 2014-02-25 11:59   Status RUNNING
Checkpoint Lag       00:02:45 (updated 00:00:02 ago)
Process ID           5792
Log Read Checkpoint  File ./dirdat/te000000
                     2014-02-25 11:57:10.276246  RBA 4893789

While the 3 replicat processes are running, we can see that they are each processing almost the same number of rows and the initial load task has been distributed between the 3 parallel replicat processes


GGSCI (DCV-RORSQL-N001) 54> stats replicat repinit2 latest

Sending STATS request to REPLICAT REPINIT2 ...

Start of Statistics at 2014-02-25 12:02:46.

Replicating from PRD.AC_TRANSACTION_RACI to PRD.AC_TRANSACTION_RACI:

*** Latest statistics since 2014-02-25 11:59:45 ***
        Total inserts                                 100663.00
        Total updates                                      0.00
        Total deletes                                      0.00
        Total discards                                     0.00
        Total operations                              100663.00

End of Statistics.


GGSCI (DCV-RORSQL-N001) 55> stats replicat repinit3 latest

Sending STATS request to REPLICAT REPINIT3 ...

Start of Statistics at 2014-02-25 12:02:56.

Replicating from PRD.AC_TRANSACTION_RACI to PRD.AC_TRANSACTION_RACI:

*** Latest statistics since 2014-02-25 11:59:45 ***
        Total inserts                                 100071.00
        Total updates                                      0.00
        Total deletes                                      0.00
        Total discards                                     0.00
        Total operations                              100071.00

End of Statistics.


GGSCI (DCV-RORSQL-N001) 56> stats replicat repinit4 latest

Sending STATS request to REPLICAT REPINIT4 ...

Start of Statistics at 2014-02-25 12:03:02.

Replicating from PRD.AC_TRANSACTION_RACI to PRD.AC_TRANSACTION_RACI:

*** Latest statistics since 2014-02-25 11:59:47 ***
        Total inserts                                  98042.00
        Total updates                                      0.00
        Total deletes                                      0.00
        Total discards                                     0.00
        Total operations                               98042.00

End of Statistics.

We now see that the initial load extract has stopped and it has extracted 74 million rows


GGSCI (db02) 14> !
info extract extinit2

EXTRACT    EXTINIT2  Last Started 2014-02-25 11:56   Status STOPPED
Checkpoint Lag       Not Available
Log Read Checkpoint  Table PRD.AC_TRANSACTION_RACI
                     2014-02-25 12:17:53  Record 74104323
Task                 SOURCEISTABLE

 

3) Oracle GoldenGate File with SQL Server BULK INSERT

In this method we use the SQL Server 2012 BULK INSERT to process the text file which is generated by the GoldenGate extract process.

Create the initial load extract

Note the parameter used in the extract file – FORMATASCII, BCP

This parameter instructs Oracle GoldenGate to write the output to a text file which is compatible with the SQL Server BCP utility.


GGSCI (db02) 19> add extract extbcp sourceistable
EXTRACT added.

GGSCI (db02) 20> edit params extbcp

"/u01/oracle/goldengate/dirprm/extbcp.prm" 6 lines, 181 characters
extract extbcp
userid ggate_owner, password ggate
FORMATASCII, BCP
RMTHOST DCV-RORSQL-N001.local, MGRPORT 7809
rmtfile ./dirdat/myobjects.dat PURGE
TABLE GGATE_OWNER.MYOBJECTS;

Start the initial load extract

GGSCI (db02) 1> start extract extbcp

Sending START request to MANAGER ...
EXTRACT EXTBCP starting


GGSCI (db02) 2> info extract extbcp

EXTRACT    EXTBCP    Last Started 2014-02-25 12:37   Status RUNNING
Checkpoint Lag       Not Available
Log Read Checkpoint  Table GGATE_OWNER.MYOBJECTS
                     2014-02-25 12:37:05  Record 1
Task                 SOURCEISTABLE


GGSCI (db02) 3> !
info extract extbcp

EXTRACT    EXTBCP    Last Started 2014-02-25 12:37   Status STOPPED
Checkpoint Lag       Not Available
Log Read Checkpoint  Table GGATE_OWNER.MYOBJECTS
                     2014-02-25 12:37:06  Record 78165
Task                 SOURCEISTABLE

Create the initial load replicat

GGSCI (DCV-RORSQL-N001) 62> edit params repbcp


GGSCI (DCV-RORSQL-N001) 63> view params repbcp
targetdb sqlserver2012
GENLOADFILES  bcpfmt.tpl
SOURCEDEFS ./dirdef/source.def
extfile ./dirdat/myobjects.dat
assumetargetdefs
MAP GGATE_OWNER.MYOBJECTS, TARGET GGATE_OWNER.MYOBJECTS;

Start the replicat from the command line


D:\app\product\GoldenGate>replicat paramfile ./dirprm/repbcp.prm reportfile ./dirrpt/repbcp.rpt

***********************************************************************
               Oracle GoldenGate Delivery for SQL Server
Version 12.1.2.0.1 17597485 OGGCORE_12.1.2.0.T2_PLATFORMS_131206.0309
Windows x64 (optimized), Microsoft SQL Server on Dec  6 2013 12:44:54

Copyright (C) 1995, 2013, Oracle and/or its affiliates. All rights reserved.


                    Starting at 2014-02-25 12:41:14
***********************************************************************

Operating System Version:
Microsoft Windows , on x64
Version 6.2 (Build 9200: )

Process id: 1840

Description:

***********************************************************************
**            Running with the following parameters                  **
***********************************************************************

2014-02-25 12:41:14  INFO    OGG-03059  Operating system character set identified as windows-1252.

2014-02-25 12:41:14  INFO    OGG-02695  ANSI SQL parameter syntax is used for parameter parsing.

2014-02-25 12:41:15  INFO    OGG-01552  Connection String: provider=SQLNCLI11;initial catalog=PRD;data source=DCV-RORSQL-N001;persist security info=false;integrated security=sspi.

2014-02-25 12:41:15  INFO    OGG-03036  Database character set identified as windows-1252. Locale: en_US.

2014-02-25 12:41:15  INFO    OGG-03037  Session character set identified as windows-1252.

2014-02-25 12:41:15  INFO    OGG-03528  The source database character set, as dtermined from the table definition file, is UTF-8.
Using following columns in default map by name:
  object_id, object_name, object_type

File created for BCP initiation: MYOBJECTS.bat
File created for BCP format:     MYOBJECTS.fmt

Load files generated successfully.

In SQL Server 2012 Management Studio load the data into SQL Server table via the BULK INSERT command.

  bulk insert [PRD].[GGATE_OWNER].[MYOBJECTS] from 'D:\app\product\GoldenGate\dirdat\myobjects.dat'
  with(
  DATAFILETYPE = 'char',
   FIELDTERMINATOR = '\t',
ROWTERMINATOR = '0x0a'
   );


(78165 row(s) affected)

 

0 Comments

Leave Reply

Your email address will not be published. Required fields are marked *