Table Compare
The table comparison task (tablecompare
) performs the comparison of data stored in two tables of similar structure.
By default the comparison runs on all columns for all rows, but can be limited to:
By default, tables are ordered using their primary key, if available. The ordering can be changed using the Order element.
The table compare task is defined by the XML element tablecompare
whose attributes configure the execution parameters.
The following attributes configure the task:
Attribute | Type | Purpose | Default |
---|---|---|---|
mode | string | Comparison mode. Refer to comparison mode for detailed description. | key |
left | string | Name of the data source on the left side of the comparison | |
right | string | Name of the data source on the right side of the comparison | |
ignoretrailingnulls | bool | Ignores trailing NULL when comparing CHAR and VARCHAR columns |
Defined in Data comparison |
ignoretrailingspaces | bool | Ignores trailing blanks when comparing CHAR and VARCHAR columns |
Defined in Data comparison |
ignoreinbinaries | bool | Ignores trailing blanks and NULLs on FOR BIT DATA , BINARY and VARBINARY columns defined for CHAR and VARCHAR |
Defined in Data comparison |
diffslimit | integer | Differences limit. When greater than zero, the comparison stops when the specified number of differences is detected. | Zero |
warnmissingdbcs | bool | Warns when DBCS bytes not found in the codepage. | true |
warnmmalformeddbcs | bool | Warns when malformed DBCS sequences. | true |
count | bool | If set to false rows are not counted before copy |
true |
charasbinary | bool | If set to true CHAR/VARCHAR columns are treated as BINARY. See char as binary |
false |
first | integer | Compares only the first N rows | |
db2timestampfix | bool | When set to true, DB2 TIMESTAMP get cast as CHAR and date/time errors admitted by DB2 are automatically normalized | false |
compareebcdicbycollate | bool | If true , EBCDIC CHAR fields are compared using EBCDIC collating sequence. If false, comparison is dony by ASCII value |
false |
The following elements configure the table comparison:
Element | Purpose |
---|---|
left | Fully qualified name of the table on the left side |
right | Fully qualified name of the table on the right side |
columns | Defines explicitly the columns to be compared. If omitted all columns are compared |
where | Defines the row selection criteria. If omitted, all rows are compared |
exclude | Defines the regular expression based column exclusion criteria, if omitted no column is excluded |
order | Defines the ordering of the tables being compared |
cast | Defines optional column manipulation and casting |
replace | Overrides and defines the datacomparison string replacement definition set in general configuration |
compare | Defines a custom comparison (layout or plugin) for specific columns |
layouts | Defines the layouts to be used in the layout based comparison |
Comparison Mode
The comparison can be executed in the following modes:
key
: tables are compared by primary key, as follows:- When columns in primary key match
- If remaining columns match, rows are considered equal and next row is fetched from both side
- If remaining columns don’t match, rows are considered different and next row is fetched from both sides
- When columns in primary key don’t match
- If primary key on left is greater than primary key on right, row on left is marked as missing on right and next row on right is fetch
- If primary key on left is less than primary key on right, row on right is marked as missing on left and next row on left is fetch
- When columns in primary key match
sequential
: tables are compared as follows:- When all columns match, rows are considered equal and next row is fetched from both side
- When all columns don’t match
- If row on left is greater than row on right, row on left is marked as missing on right and next row on right is fetch
- If row on left is less than row on right, row on right is marked as missing on left and next row on left is fetch
parallel
: tables are compared side by side as follows:- All columns are compared and, independently from the comparison, results both rows from both sides are fetched after the comparison
Left and Right
The left
and right
elements are used to provide the fully qualified of the objects being compared.
The following attributes are used to configure left and right objects:
Attribute | Type | Purpose | Default |
---|---|---|---|
collate | string | Defines the collating sequence used to sort the table. The collating sequence must be a valid collate identifier of the selected database. If omitted, the datasource default collating sequence (as defined in the configuration) is used. If neither the database or the element define a collating sequence, no collate is used to order the table | |
bitdatacodepage | string | Defines the codepage to translate the binary data (i.e. CHAR FOR BIT DATA ) from EBCDIC to ASCII. If not specified and no default bitdatacodepage is set for the datasource, no translation is performed (default). For the list of supported codepages, refer to Codepages. |
As defined in database datasource configuration |
layoutencoding | string | Defines the encoding (ebcdic or ascii) of the layout based columns | As defined in database datasource configuration |
layoutzoned | string | Defines the SIGN mode (ascii or ebcdic) of the ZONED fields of the layout based columns | As defined in database datasource configuration |
layoutendian | string | Defines the endianness of integer fields of the layout based columns | As defined in database datasource configuration |
layoutcodepage | string | Default codepage to be used for the EBCDIC/ASCII conversion of the layout based columns | As defined in database datasource configuration |
Example
<tablecompare mode="key" left="DB2" right="ORA">
<left>ADMINISTRATOR.TEST01</left>
<right>FAB.TEST01</right>
</tablecompare>
<tablecompare name="TBLCMP11" mode="key" left="DB2" right="QS">
<left collate="UCA500R1_AS">ADMINISTRATOR.TEST04</left>
<right collate="SQL_EBCDIC037_CP1_CS_AS">dbo.TEST04</right>
</tablecompare>
<tablecompare name="TBLCMP12" mode="key" left="DB2" right="QS">
<left bitdatacodepage="ENUS037">ADMINISTRATOR.TESTBD2</left>
<right>dbo.TESTBD1</right>
</tablecompare>
<tablecompare name="IGNORE" mode="key" left="DB2" right="QS"
ignoretrailingspaces="true"
ignoretrailingnulls="true"
ignoreinbinaries="true">
<left>ADMINISTRATOR.test_nulls_1</left>
<right>dbo.test_nulls_2</right>
</tablecompare>
Columns
columns
element is used to provide the list of columns to compare. If omitted, all columns are compared.
Note
If the provided list does not contain one or more columns being part of the primary key, those columns are automatically added to the list.
Each column is defined by the column
element.
Example
<tablecompare name="TBLCMP01" mode="key" left="QS" right="QS">
<left>dbo.TEST01</left>
<right>dbo.TEST01</right>
<columns>
<column>PK1</column>
<column>C1</column>
<column>C2</column>
</columns>
</tablecompare>
Where
The where
element defines the row selection criteria for one or both sides of the comparison. The criteria is expressed as an SQL condition.
The following attributes are used to configure the where element:
Attribute | Type | Purpose | Default |
---|---|---|---|
side | string | Side where the selection criteria applies: both , left or right |
both |
Example
<tablecompare name="TBLCMP01" mode="key" left="QS" right="QS">
<left>dbo.TEST01</left>
<right>dbo.TEST01</right>
<where side="left">
<![CDATA[
PK1 > 'KEY02'
]]>
</where>
</tablecompare>
Order
The order
element sets the sort criteria for one or both sides of the comparison. Tables can be sorted in the following modes:
key
: the table is order by primary key. If the exclude attribute is defined, the element value defines the regular expression for the exclusionindex
: the table is order by the specified index. If the index name indicated, does not contain the schema name, the schema of the table is assumed by default. If the exclude attribute is defined, the element value defines the regular expression for the exclusionuser
: the table is sorted according to the user provided ORDER BY clause specified in the element value
When mode is key
or index
, optionally, some columns can be excluded from the key, defining the attribute exclude
, which value can be:
name
: the columns whose name matches the provided regular expression are excluded from comparisontype
: the columns whose type matches the provided regular expression are excluded from comparison
The following attributes are used to configure the where element:
Attribute | Type | Purpose | Default |
---|---|---|---|
side | string | Side where the selection criteria applies: both , left or right |
both |
mode | string | Order by mode: user , index or key |
user |
exclude | string | Exclusion mode: name or type |
name |
Example
<!-- ORDER BY WHOLE PRIMARY KEY -->
<tablecompare name="TBLCMP02" mode="key" left="QS" right="QS">
<left>dbo.TEST01</left>
<right>dbo.TEST01</right>
<order mode="key"/>
</tablecompare>
<!-- ORDER BY USER PROVIDED ORDER BY CLAUSE -->
<tablecompare name="TBLCMP03" mode="key" left="QS" right="QS">
<left>dbo.TEST01</left>
<right>dbo.TEST01</right>
<order mode="user">PK1 ASC, PK2 DESC, COLX ASC</order>
</tablecompare>
<!-- ORDER BY USER PROVIDED ORDER BY CLAUSE WITH DEFAULT MODE-->
<tablecompare name="TBLCMP05" mode="key" left="QS" right="QS">
<left>dbo.TEST01</left>
<right>dbo.TEST01</right>
<order>PK1 ASC, PK2 DESC, COLX ASC </order>
</tablecompare>
<!-- ORDER BY PRIMARY KEY EXCLUDING KEYPARTS BY TYPE -->
<tablecompare name="TBLCMP06" mode="key" left="QS" right="QS">
<left>dbo.TEST01</left>
<right>dbo.TEST01</right>
<order mode="key" exclude="type">^CHAR|INT</order>
</tablecompare>
<!-- ORDER BY PRIMARY KEY EXCLUDING KEYPARTS BY NAME -->
<tablecompare name="DB2ZOSVSSQL" mode="sequential" left="DB2ZOS" right="QS" >
<left>IBMUSER.TEST_IDX_01</left>
<right>dbo.test_idx_01</right>
<order mode="index" >IX_TEST_IDX_01_CHAR</order>
</tablecompare>
<!-- ORDER BY INDEXES WHICH NAME IS DIFFERENT -->
<tablecompare name="IDX2" left="QS" right="QS" >
<left>dbo.test_idx_01</left>
<right>dbo.test_idx_02</right>
<order mode="index" side="left">dbo.ix_test_idx_01_char</order>
<order mode="index" side="right">dbo.ix_test_idx_02_char</order>
</tablecompare>
Exclude
The exclude
element is used to define the regular expression to exclude columns from the comparison.
There are two types of exclusion criteria:
name
the columns whose name matches the provided regular expression are excluded from comparisontype
the columns whose type matches the provided regular expression are excluded from comparison
The following attributes configure the exclude element:
Attribute | Type | Purpose | Default |
---|---|---|---|
type | string | Exclusion mode: name or type |
name |
side | string | Side where the selection criteria applies: both , left or right |
both |
Example
<tablecompare name="TBLCMP09" mode="key" left="QS" right="QS">
<left>dbo.TEST03A</left>
<right>dbo.TEST03B</right>
<exclude side="right" type="name">F.*B</exclude>
<exclude side="right" type="type">DATETIME2</exclude>
</tablecompare>
Cast
The cast
element is used to define the data manipulation and casting clauses for the columns.
There are two types of cast criteria:
name
: the columns whose name matches the provided regular expression are accessed with the specified clausetype
: the columns whose type matches the provided regular expression are accessed with the specified clause
The following attributes are used to configure the exclude element:
Attribute | Type | Purpose | Default |
---|---|---|---|
type | string | Defines the cast mode: name or type |
type |
arg | string | Defines the regular expression to match columns | |
side | string | Defines the set where the cast applies: left , source , right , target or both |
both |
The clause is defined as SQL and the following special registers are interpolated:
Register | Interpolate |
---|---|
{name} | Column name |
{length} | Column size |
{value} | Host variable's value |
Example
<tablecopy name="CAST.COPY" source="QS" target="QS1" truncate="true" mode="bulk">
<source>dbo.TEST01</source>
<target>dbo.TEST01_COPY</target>
<cast type="name" arg="C1[01]" side="source">UPPER({name})</cast>
</tablecopy>
<tablecompare name="CAST.CMP.OK" left="QS" right="QS1" >
<left>dbo.TEST01</left>
<right>dbo.TEST01_COPY</right>
<cast type="name" arg="C1[01]" side="left">UPPER({name})</cast>
</tablecompare>
Char As Binary
If enabled and the database encoding is EBCDIC, all CHAR/VARCHAR columns are treated as binary (i.e. CHAR FOR BIT DATA) and therefore converted according to the specified codepage (when specified).
The feature can be activated in two ways:
- setting the attribute
charasbinary
totrue
on the task elemement, or - adding the element
charasbinary
The charasbinary
element can be used to override the codepage or to exclude specific columns from the treatment.
The following attributes configure charasbinary
:
Attribute | Type | Purpose | Default |
---|---|---|---|
enabled | bool | If set to false , the feature is deactivated |
true |
codepage | bool | codepage to use | codepage defined for the table |
The following elements configure the table copy:
Element | Purpose |
---|---|
exclude | Exclude columns from the cast. This element works like the element exclude of the task. |
Example
<tableunload name="UNLOAD.DB2.CHARASBIN" db="DB2ZOS" repository="TMPA" charasbinary="true">
<table bitdatacodepage="ENUS370">IBMUSER.TEST01</table>
<file recfmt="fixed" reclen="*">TMP01ZOS.UNLOAD.DB2.ASCII</file>
</tableunload>
<tablecopy name="ZOSCOPY.CHARASBIN.NATIVE" mode="native" source="DB2ZOS" target="QS" truncate="true" >
<source bitdatacodepage="ENUS037">IBMUSER.TEST01</source>
<target>dbo.TEST01_COPY</target>
<charasbinary/>
</tablecopy>
<tablecopy name="ZOSCOPY.CHARASBIN.NATIVE" mode="native" source="DB2ZOS" target="QS" truncate="true" >
<source bitdatacodepage="ENUS037">IBMUSER.TEST01</source>
<target>dbo.TEST01_COPY</target>
<charasbinary codepage="ENUS037" enabled="true">
<exclude side="right" type="name">F.*B</exclude>
<exclude side="right" type="type">DATETIME2</exclude>
</charasbinary>
</tablecopy>
Compare
The compare
element customizes the comparison of the column(s) matching all the provided criteria.
There are two possible customization modes:
- plugin mode: the column is compared by a user provide plug-in (default)
- layout mode: the column is compared according to one or more layouts defined in the
layouts
element
The criteria are:
Criteria | Description |
---|---|
leftname | the name of the column on the left side |
rightname | the name of the column on the right side |
lefttype | the SQL type of the column on the left side |
righttype | the SQL type of the column on the right side |
Note
If no criteria is defined, all columns are matched.
The following attributes are used to configure the compare element:
Attribute | Type | Purpose | Default |
---|---|---|---|
mode | string | Defines the custom comparison mode (plugin, layout) | plugin |
leftname | string | Name of the column on the left side | |
rightname | string | Name of the column on the right side | |
lefttype | string | SQL type of the column on the left side | |
righttype | string | SQL type of the column on the right side | |
path | string | Indicates to load the class script from the specified file | |
assembly | string | Indicates to load the class assembly from the specified DLL | |
class | string | Indicates the name of the class to use | |
plugin | string | Indicates to load the class from the plugin alias defined in the Ianus configuration file | |
whitespaces | string | Defines the white spaces handling mode (asis, trim, ignore) | asis |
The user provided comparison plugin must implement the interface HPE.Ianus.Scripting.IColumnComparator
. For a detailed description of the comparison class please refer to Column Comparator .
Layouts
The layouts
element defines one or more record layout to convert the layout based columns.
The following elements configure the layouts:
Element | Purpose |
---|---|
layout | Adds a layout definition. Refer to Layout |
match | Defines the match algorithm. Refer to Match |
Layout
The layout
element defines a one or more record layouts. Each layout must be identified by a unique name (task-wide).
By default, the layout fields are defined using XML:
- Each layout element defines one single layout
- Record fields are defined using
field
elements
But there is also the possibility to define the layout using COBOL data definition syntax:
- Each layout element defines one or more layouts (one each level 01)
- Record fields are defined using COBOL notation
For further information on layout definition, refer to Record Layouts
Match
The match
elements define one ore more "match class". Whenever more than on layout is provided, Ianus needs a match class to drive the selection of the correct layout for each record compared.
This class must implement the interface HPE.Ianus.Scripting.IColumnLayoutMatch
. For a detailed description of the match class please refer to the Column Layout Matcher section.
The class can be provided as:
- C# script: the code is compiled on the fly by Ianus and declared either:
- inline, by adding the code in the element value, or
- by reference, providing the filename of the script on the path attribute. If the script file name is not absolute, the script is searched in the same path of the job script.
- Class library: the class can be coded in any .NET language such as C# or VB.NET but also COBOL for .NET and compiled as .NET class library. The class can be referenced either:
- directly: providing the filename of the DLL containing the class on the assembly attribute and the full name of the class on the class attribute, or
- by alias: providing the name of the plugin alias defined in the Plug-ins section of the Environment Configuration File.
The following attributes configure the element:
Attribute | Type | Purpose | Default |
---|---|---|---|
path | string | Indicates to load the class script from the specified file | |
assembly | string | Indicates to load the class assembly from the specified DLL | |
class | string | Indicates the name of the class to use | |
plugin | string | Indicates to load the class from the plugin alias defined in Plug-ins |
Example
<tablecompare name="TBCMP1" left="QS" right="QS1">
<left>dbo.test_cmp_1</left>
<right>dbo.test_cmp_2</right>
<compare lefttype="CHAR" assembly="C:\MyProject\bin\Debug\ColumnComparators.dll"
class="ColumnComparators.StringCompareIgnoreCase"/>
<compare leftname="UF2" plugin="StringCompareIC"/>
</tablecompare>
<tablecompare name="${STEP}.OK" left="${SRC_DB}" right="${TRG_DB}" >
<left>${SRC_SCHEMA}.CHAR_BINARIES_A</left>
<right>${TRG_SCHEMA}.CHAR_BINARIES_B</right>
<layouts>
<layout type="cobol" format="free" length="auto">
01 SYS053-DESCR.
03 FIELD01 PIC S9(4) COMP.
03 FIELD02 PIC X(4).
</layout>
</layouts>
<compare mode="layout" sourcename="LAYOUTV" layout="SYS053-DESCR"/>
<compare mode="layout" sourcename="LAYOUTF" layout="SYS053-DESCR"/>
</tablecompare>
Replace
Character or String replacement is enabled by the replace
element, that must contain the following elements:
Element | Type | Purpose | Default |
---|---|---|---|
from | string | The source string to replace | |
to | string | The replacement string |
Example
<tablecopy name="COPY_TRUNC" source="QS" target="QS1" truncate="true" mode="hybrid">
<replace>
<from>�</from>
<to>✽</to>
</replace>
<source>dbo.test_replace_01</source>
<target>dbo.test_replace_02</target>
</tablecopy>
<tablecompare name="COPY_TRUNC" left="QS" right="QS1" truncate="true" >
<replace>
<from>�</from>
<to>✽</to>
</replace>
<left>dbo.test_replace_01</left>
<right>dbo.test_replace_02</right>
</tablecompare>
Comparison Log
Ianus will report on both console and log file the result of the comparison, indicating the differences, when detected.
For each difference detected, Ianus will log a message structured as follows:
- Difference indicator, a single character (
<
,!
or>
) indicating:- When a row is on the left side only (
<
) - When a row with the same PKs is on both sides but some columns are different
- When a row is on the right side only (
>
)
- When a row is on the left side only (
- Row number, prefixed by #
- Primary key(s) content, encolsed into square brakets (
[
and]
), of the rows causing the difference - Names of the column(s) containing different data (after the
@
character)
Example
[info] Key based comparison started
[warn] < #1 [KEY01 ][A ][ 1][2015-01-01 00:00:00.000000]
[warn] ! #3 [KEY02 ][A ][ 2][2015-01-01 00:00:00.000000] @ C12
[warn] > #5 [KEY03 ][A ][ 3][2015-01-02 00:00:00.000000]
[warn] > #6 [KEY04 ][1 ][ 2][2015-01-02 00:00:00.000000]
[info] Compared 7 vs 8 rows
[warn] IanusTest/dbo.TEST01 vs IanusTest/dbo.TEST02 reported 4 differences
Note
Ianus will trace actual content of the columns causing the difference in the log file.
Status codes
Status | Status code | Description |
---|---|---|
Ready | -1 | Task is initialized, but not yet started |
Running | -2 | Task is running |
Success | 0 | Task completed successfully |
Warnings | 1 | Task completed with warnings |
Errors | 2 | Task completed with errors |
Aborted | 9 | Task cannot be executed |