What is Persistent Lookup cache? What is its significance?
Persistent means continuing to exist.
The Integration Service saves or deletes lookup cache files after a successful session based on the Lookup Cache Persistent property. By default, the Integration Service deletes the lookup cache files once the session succeeds.
But if you set the persistent property, the cache continues to exist even after the session completes.
What is the use of keeping the cache after the session completes?
If the lookup table does not change between sessions, you can configure the Lookup transformation to use a persistent lookup cache. The Integration Service saves and reuses cache files from session to session, eliminating the time required to read the lookup table.
Can you explain what error tables in Informatica are and how we do error handling in Informatica?
To capture errors at the transformation level, enable the "Error Log Type" property on the Config Object tab of the session properties: choose the option "Relational Database" and specify the error log DB connection object in the next property. Once this is done, four tables are created in the database:
- PMERR_DATA. Stores data and metadata about a transformation row error and its corresponding source row.
- PMERR_MSG. Stores metadata about an error and the error message.
- PMERR_SESS. Stores metadata about the session.
- PMERR_TRANS. Stores metadata about the source and transformation ports, such as name and datatype, when a transformation error occurs.
You can then query these tables to find the error that caused the mapping to fail, as the sketch below shows.
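A minimal sketch of such a check, assuming the default PMERR_* table names and commonly available columns such as SESS_INST_ID, TRANS_NAME, ERROR_MSG and ERROR_TIMESTAMP (verify the exact column names against your error-log schema before use):
-- List the most recent transformation errors together with the session they belong to.
SELECT s.SESS_INST_ID,
       m.TRANS_NAME,
       m.ERROR_MSG,
       m.ERROR_TIMESTAMP
FROM   PMERR_SESS s
JOIN   PMERR_MSG  m ON m.SESS_INST_ID = s.SESS_INST_ID
ORDER  BY m.ERROR_TIMESTAMP DESC;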
Difference between IIF and DECODE function?
Decode syntax: DECODE( value, first_search, first_result [, second_search, second_result]... [, default] )
EX: DECODE( quarter_variable, 1, 'winter',
            2, 'spring',
            3, 'summer',
            'autumn')
Here the value being searched is quarter_variable. It is first compared with 1; if quarter_variable = 1, 'winter' is returned.
If not, quarter_variable is compared with 2; if quarter_variable = 2, 'spring' is returned.
........... and so on.
We are comparing one single value (quarter_variable) with many search values (1, 2, 3, etc.), and the value being compared is mentioned only once, so DECODE is more readable.
Consider the same example written using IIF syntax below.
IIF syntax: IIF( condition, value1 [, value2] )
EX: IIF(quarter_variable = 1, 'winter',
    IIF(quarter_variable = 2, 'spring',
    IIF(quarter_variable = 3, 'summer',
    'autumn')))
The logic is exactly the same as in the DECODE example: quarter_variable is compared with 1, then 2, then 3, and the matching season is returned.
The difference is that with IIF we have to write out the comparison condition in every nested IIF, whereas DECODE performs the comparison against the single search value by itself.
DECODE can also be used in a SQL SELECT statement, for example:
SELECT supplier_name,
decode(supplier_id, 10000, 'IBM',
                    10001, 'Microsoft',
                    10002, 'Hewlett Packard',
                    'Gateway') result
FROM suppliers;
Difference between Normal load and Bulk load?
In normal mode the database writes the rows to its log, so the data can be recovered if the session fails. In bulk mode the database bypasses the log, which makes loading faster but means the load cannot be recovered; normal mode is correspondingly slower.
Bulk mode can also fail the session when the target table has a primary key or indexes defined, so these are usually dropped before a bulk load and recreated afterwards.
What is the difference between constraint based load ordering and target load plan?
Target load order (or) Target load plan is used to specify the order in which the integration service loads the targets. You can specify a target load order based on the source qualifier transformations in a mapping. If you have multiple source qualifier transformations connected to multiple targets, you can specify the order in which the integration service loads the data into the targets.
Use of Target Load Order:
Target load order will be useful when the data of one target depends on the data of another target. For example, the employees table data depends on the departments data because of the primary-key and foreign-key relationship. So, the departments table should be loaded first and then the employees table. Target load order is useful when you want to maintain referential integrity when inserting, deleting or updating tables that have the primary key and foreign key constraints.
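For illustration, a minimal sketch of the kind of constraint that forces this order (table and column names are assumed for the example):
-- Parent table: must be loaded first.
CREATE TABLE departments (
    dept_id   NUMBER PRIMARY KEY,
    dept_name VARCHAR2(50)
);
-- Child table: each row references an existing department.
CREATE TABLE employees (
    emp_id    NUMBER PRIMARY KEY,
    emp_name  VARCHAR2(50),
    dept_id   NUMBER REFERENCES departments (dept_id)
);
-- Loading employees before the matching departments rows exist would fail
-- with a foreign-key violation, which is why departments must load first.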
Target Load Order Setting:
You can set the target load order or plan in the Mapping Designer. Follow the steps below to configure the target load order:
1. Log in to the PowerCenter Designer and open a mapping that contains multiple target load order groups.
2. Click Mappings in the menu bar and then Target Load Plan. A dialog box pops up listing all the source qualifier transformations in the mapping and the targets that receive data from each source qualifier.
3. Select a source qualifier from the list.
4. Click the Up and Down buttons to move the source qualifier within the load order.
5. Repeat steps 3 and 4 for other source qualifiers you want to reorder.
6. Click OK.
Constraint based load: If a mapping loads both a parent target table and a child target table, check the "Constraint based load ordering" option on the Config Object tab of the session.
The Integration Service will then load data into the parent table first and into the child table afterwards.
How does the Informatica server sort string values in the Rank transformation?
When the Informatica server runs in ASCII data movement mode, it sorts session data using a binary sort order. If you configure the session to use a binary sort order, the Informatica server calculates the binary value of each string and returns the specified number of rows with the highest binary values for the string.
In an Update Strategy, which target gives better performance: a relational table or a flat file? Why?
A flat file target only supports inserts; you cannot update a flat file. So a relational table is the better choice when using an Update Strategy.
What are the output files that the Informatica server creates while running a session?
The following output files are created each time a workflow is executed:
- Reject file: This file contains the rows of data that the writer does not write to the targets.
- Output file: If the session writes to a target file, the Informatica server creates the target file based on the file properties entered in the session property sheet.
- Session log file: The Informatica server creates a session log file for each session. It writes information about the session into the log file, such as the initialization process, the creation of SQL commands for reader and writer threads, errors encountered, and the load summary. The amount of detail in the session log file depends on the tracing level that you set.
- Cache files: When the Informatica server creates a memory cache, it also creates cache files. For transformations that use a cache (such as Aggregator, Joiner, Rank and Lookup) it creates index and data cache files.
- Indicator file: If you use a flat file as a target, you can configure the Informatica server to create an indicator file. For each target row, the indicator file contains a number to indicate whether the row was marked for insert, update, delete, or reject.
What are mapping parameters and variables, and in which situations do we use them?
If we need to change certain attributes of a mapping every time the session runs, it would be very tedious to edit the mapping each time. Instead, we use mapping parameters and variables and define their values in a parameter file; to change an attribute value we simply edit the parameter file. This keeps the process simple.
A mapping parameter value remains constant throughout the session run; to change it we have to edit the parameter file.
The value of a mapping variable, however, can be changed during the run using the variable functions. For example, if we need to increment a value by 1 after every session run, we use a mapping variable.
With a mapping parameter we would have to manually edit the value in the parameter file after every session run.
For more info refer http://miky-schreiber.com/wordpress/?p=23
Difference between connected and unconnected lookup transformation in Informatica?
  1. A connected Lookup receives input values directly from the mapping pipeline, whereas an unconnected Lookup receives values from a :LKP expression in another transformation.
  2. A connected Lookup transformation processes every row that passes through it, whereas an unconnected Lookup is called only when the :LKP expression is evaluated, and it returns a value based on that expression.
  3. If the lookup condition is not matched, a connected Lookup returns the default value of the output port, whereas an unconnected Lookup returns NULL.
  4. A connected Lookup can use a static or dynamic cache, whereas an unconnected Lookup can use only a static cache.
  5. A connected Lookup can return multiple columns from the same row, whereas an unconnected Lookup has one return port and returns one column from each row.
  6. A connected Lookup supports user-defined default values, whereas an unconnected Lookup does not.
How do we implement recovery strategy while running concurrent batches?
Concurrent means existing, happening, or done at the same time.
When sessions are loading concurrently and one session fails, the other sessions continue to run normally.
We can simply restart the failed session task if the Informatica server has not yet performed a commit for it; otherwise we perform recovery on the session. If the session cannot be recovered, we truncate the data in the target table and run the session again.
What is Data driven?
In a mapping you might need to perform insert, update, or delete operations based on the kind of data you receive. In that case you set the session property "Treat source rows as" to Data Driven.
This means the decision to insert, update, or delete is made dynamically, row by row, by the Update Strategy transformation in the mapping.
When you use an Update Strategy in the mapping, the session property defaults to Data Driven.
What is a batch? Explain the types of batches.
A group of sessions is called a batch. Two kinds of batches are possible: concurrent and sequential.
If the sessions have source-to-target dependencies we have to run them sequentially; otherwise they can run concurrently.
What are the types of meta data repository stores?
  • Source definitions. Definitions of database objects (tables, views, synonyms) or files that provide source data.
  • Target definitions. Definitions of database objects or files that contain the target data.
  • Multi-dimensional metadata. Target definitions that are configured as cubes and dimensions.
  • Mappings. A set of source and target definitions along with transformations containing business logic that you build into the transformation. These are the instructions that the Informatica Server uses to transform and move data.
  • Reusable transformations. Transformations that you can use in multiple mappings.
  • Mapplets. A set of transformations that you can use in multiple mappings.
  • Sessions and workflows. Sessions and workflows store information about how and when the Informatica Server moves data. A workflow is a set of instructions that describes how and when to run tasks related to extracting, transforming, and loading data. A session is a type of task that you can put in a workflow. Each session corresponds to a single mapping.
Can you use the mapping parameters or variables created in one mapping in another mapping?
  1. First create a workflow variable in the workflow.
  2. In the first session, go to Edit --> Components --> Post-session on success variable assignment and assign the mapping variable to the workflow variable.
  3. In the second session, do the same in the Pre-session variable assignment, i.e. assign the workflow variable back to a mapping variable.
Why do we use stored procedures in our ETL application?
A stored procedure is used to drop indexes before the load and recreate them afterwards.
To check whether enough space is available in the target database.
For implementing complex business rules.
When we can join tables at the Source qualifier itself, why do we go for joiner transformation?
A Joiner can join heterogeneous sources (for example, a flat file with a relational table), whereas a Source Qualifier can only join tables that come from the same relational database.
In a Joiner transformation, you should specify the table with fewer rows as the master table. Why?
During the join, the Joiner caches the master source and compares each row of the detail source against it. The fewer the rows in the master, the smaller the cache and the fewer the comparisons, which speeds up the join process.
Difference between Cached lookup and Un-cached lookup?
When the lookup is cached, the Integration Service reads the lookup table once, stores the rows in a cache, and looks up values in the cache. When it is uncached, the Integration Service queries the lookup table for every input row, as the sketch below illustrates.
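A minimal sketch of the difference in terms of the SQL issued, assuming a hypothetical CUSTOMERS lookup table with condition port CUST_ID and return port CUST_NAME (names are illustrative, and the exact generated SQL may differ):
-- Cached lookup: one query at the start of the session builds the cache.
SELECT CUST_NAME, CUST_ID
FROM   CUSTOMERS
ORDER  BY CUST_ID;

-- Uncached lookup: a query like this runs for every input row,
-- with the current row's value bound to the lookup condition.
SELECT CUST_NAME
FROM   CUSTOMERS
WHERE  CUST_ID = :input_cust_id;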
What are the limitations of the joiner transformation?
1. There should not be an Update Strategy transformation before the Joiner.
2. None of the direct inputs to the Joiner transformation may be a Sequence Generator, i.e. a Sequence Generator transformation cannot be connected directly to the Joiner.
How can you delete duplicate rows without using a Dynamic Lookup? What other ways are there to remove duplicate rows?
- We can use DISTINCT in the SQL override of the Source Qualifier (see the SQL sketch below).
- We can GROUP BY the columns on which duplicates may occur.
- We can use a Sorter transformation with the Distinct option enabled in its properties.
- We can use an Expression transformation to flag the duplicate rows and then filter them out.
- We can use an Aggregator transformation and group by the key columns there.
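For the first two options, a minimal sketch of the Source Qualifier SQL override, assuming an illustrative source table SRC_CUSTOMERS with key column CUST_ID:
-- Option 1: DISTINCT removes fully duplicated rows.
SELECT DISTINCT CUST_ID, CUST_NAME, CITY
FROM   SRC_CUSTOMERS;

-- Option 2: GROUP BY the key on which duplicates occur,
-- keeping one representative value for the remaining columns.
SELECT CUST_ID,
       MAX(CUST_NAME) AS CUST_NAME,
       MAX(CITY)      AS CITY
FROM   SRC_CUSTOMERS
GROUP  BY CUST_ID;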
How do I create indexes after the load process is done?
Use Post-SQL in the session properties.
OR
On Unix, call a stored procedure from a shell script after the load:
#!/usr/bin/ksh
sqlplus -s username/password<<END
execute proc_create_idx;
exit;
END
The procedure might look like this (the index, table and column names are illustrative):
create or replace procedure proc_create_idx
as
begin
  EXECUTE IMMEDIATE 'CREATE INDEX dept_idx ON dept (dept_id)';
end;
/
Why should I not use an Update Strategy transformation before a Joiner?
1. If a pipeline contains an Update Strategy transformation, each record from that pipeline is flagged accordingly (insert, update, delete or reject) based on all of its columns. If you then use a Joiner, you are joining another set of columns onto that already-flagged record, which can violate the rule implied by the flag.
2. A Joiner transformation accepts input from two sources with an equi-join condition only. A Sequence Generator cannot be one of those direct inputs, because its generated values give us nothing meaningful on which to define the equi-join condition. Hence a Sequence Generator transformation cannot be used directly before a Joiner.
