Sunday, 10 July 2016

Informatica CDC logic implementation using MD5 Function

In SCD type-1  mappings, most of the time there is no change in source records but still we do blind updates to target table based on the key columns.

For Instance,

Source table A:



This increases overall execution time as the table size grows. This will eventually degrade your session performance.

Workarounds: -

We can identify changed data records from source and mark only those records for updates where we really have change in records.
This is called Change Data Capture.

In Informatica, We can use MD5() function to implement change data capture.
MD5 function will calculate 32-character hexadecimal code for each record. This will be stored in target table column.

If MD5 value for Source record is different from Target MD5 value
           Then it will be marked for update
 Else Skip the record as there is no change in record.








Thursday, 4 February 2016

[ERROR] Bad int8 external representation (SQL-HY000) - NETEZZA Sql

This is caused to due to invalid data getting queried in SELECT Query.

Common data errors:

1. For Integer/Decimal fields we are getting Character values.
2. Date or TimeStamp Datatype we are getting incorrect date formats or invalid characters.


For Instance,

We have two tables and have to do union on them.

TABLE A                                               TABLE B
COL1   VARCHAR                                COL1 VARCHAR
COL2   INTEGER                                   COL2  INTEGER
COL3   VARCHAR                                 COL3  VARCHAR
COL4   VARCHAR                                 COL4   VARCHAR


When we are doing Union of two datasets. If the column order is incorrect we would be be select wrong data for other datatype.
In below query, Integer datatype is getting mismatched with VARCHAR datatype.

This would lead to BAD INT8 error.


SELECT
COL1
COL2
COL3
COL4
FROM TABLE A

UNION

SELECT
COL1
COL3
COL2
COL4
FROM TABLE A


WORKAROUND:

Correct the SELECT order for all column in both the SQLs






Wednesday, 3 February 2016

[ERROR] The Integration Service fails the session, as Netezza may not be able to serialize execution of queries." when running a PowerCenter session with a ​Netezza target OR Could not serialize - transaction aborted

A PowerCenter session with a ​Netezza Bulk Writer target fails with the following error:

[ERROR] The Integration Service fails the session, as Netezza may not be able to serialize execution of queries.

OR


Could not serialize - transaction aborted


CAUSE :

Data is being inserted or updated in same target table from different instances or sessions.
For Ex.
Target Update Instance 1 is running below command:- 

UPDATE SET ROW1 FROM TABLE A

Target Update Instance 2 is running below command :-

UPDATE SET ROW2 FROM TABLE A


OR 

Target Insert Instance  is running below command:- 

INSET ROW 3 FROM TABLE A

Target Update Instance  is running below command :-

UPDATE SET ROW2 FROM TABLE A

This would cause serialization error in Netezza target


WORKAROUND :

1. If you have primary Key defined on target table, Check if "Ignore Key Constraint" is CHECKED for all instance in Session Level properties.

2. Try using Relational Writer instead of Bulk writer. This will cause performance impact.

3. If we have multiple instance of Updates or Inserts. Try creating separate pipeline for each target instance.