Tuesday, June 29, 2010

Odi Load Multiple XML Files

Loading multiple XML files into Oracle using ODI is illustrated in this post.

The standard reverse engineering driver will parse the dtd or if the dtd is not available, it will parse the XML and generate a DTD on the fly in the same location where the XML file exists.

To load multiple XML files we will have to employ ODI variables. An ODI variable is needed to store the name of the file that needs to be loaded to Oracle.
Declare the variable in your project V_MYXMLFILE with the following query:
select xmlfilename from my_metadata_table.

This my_metadata_table should exist in an Oracle schema and should contain a column xmlfilename which lists all the XML filenames (alongwith path).
Each time the variable is refreshed, it will populate the first value from the table. You can follow the instructions found at http://odiexperts.com/?p=524 to loop through each row in the table.

Now create 2 ODI procedures:
  1. Truncate XML Schema
  2. Load XML Schema
1.) Truncate XML Schema will contain 1 command "Truncate"

Technology = XML and Schema = Logical XML schema that you want to work on.
The "Command on Target" will be "TRUNCATE SCHEMA MYSCHEMA".
Here MYSCHEMA is the name of the XML schema that is defined in Topology Manager using the s tag as in
jdbc:snps:xml?d=E:/XMLDIR/myXML.dtd&s=MYSCHEMA

Truncate XML Schema will wipe out all the records in the data structure that holds the XML data. It can be in memory or it can be set to a physical database schema by using &db_props=mydbparams parameter in the xml server definition. Here mydbparams is a reference to a physical file mydbparams.properties created in the $ODI_HOME/oracledi/drivers.

2.) Load XML Schema will contain 2 commands:

Show variable value - This will be created using Jython Technology with the following commands :
a = 'Filename = '+ '#V_MYXMLFILE'
raise(a)
Also, mark the Ignore Errors = Yes, for this step.

This is cheat to display the value of the ODI variable during execution in the Operator.
Also, this step is somehow needed so that the value of the variable can be accessed by the next step.
If you remove this step, the next step (which actually loads the XML schema with the contents of the file) errors out.

Load XML Schema
- This will be created using XML Technology and XML logical schema.
Command on Target = LOAD FILE "#MYPROJECT_NAME.V_MYXMLFILE" ON SCHEMA MYSCHEMA INSERT_ONLY


Now you can create a package using these procedures and refresh the variable V_MYXMLFILE with location/name of the file and execute Load XML Schema.
The procedure "Load XML Schema" can also be executed in a loop to load multiple XML files (that adhere to the same DTD) into the database tables.


9 comments:

Badger said...

I know this was some time ago, but I'd like to point you to the ODIXMLCONCAT tool at http://download.oracle.com/docs/cd/E14571_01/integrate.1111/e12643/appendix_a.htm#CEGDHGHF
This allows you to append multiple XML files together (sorting out the inconsistencies that would introduce) so that you run one import/interface rather than looping through many.

Ankit Jain said...

Thanks for the comment Craig.
We have large XML files, sometimes over 100MB and concatenating hundreds of them blows the agent's memory. Hence, we prefer the loop approach.

Other than that, I have used odixmlconcat to concatenate upto 12000 xml files (3-5 KB each) successfully. And it works just perfect.

priya said...

Hi Ankit,

I am new to odi and i am still facing some issues could you please clarify certain points as a am stuck badly
We are performing DB to XML transformation and an intermediate staging table is used.
IKM – IKM SQL to SQL incremental update
CKM – CKM SQL
LKM – LKM SQL to SQL
1. Work table C$_ is created both for staging area and target XML physical schema.
2. SNP_CHECK_TAB is created on the target XML schema. Shouldn't it have been created on the staging area.
3. I am facing issues because of above two points Since after an unsuccessful run and during re-run the work table and check table of target xml schema are first not getting deleted with the message that the table doesn't exist and in the next step when it tries to create a new table the step fails saying the table already exists. I have tried using the DOD (drop on disconnect property during xml data server creation but i am still facing the issue).
thanks

KSandbergFL said...

This is exactly what I need... however, I am having one small problem. The LOAD FILE command is failing for me on the load of the second file, with an error saying that it cannot save the first file... i'm not sure why it's trying to save the 1st XML file anyway, is there a way to get around this?

KSandbergFL said...

I got it working, but not exactly using the method you describe. It might be my version of ODI (11.1.1.3), but the LOAD FILE command does not work, wrapped in double-quotes as you show.

You have to put the double-quotes inside the variable value itself, so that when LOAD FILE interprets the variable, it gets the double-quotes as part of the variable's contents.

Also, I had problems using TRUNCATE SCHEMA... I ended up using DROP SCHEMA, followed by a CREATE SCHEMA WITH DTD, followed by a LOAD FILE ON SCHEMA, finally a SET SCHEMA. The Oracle ODI manual specifically states that the LOAD FILE command does NOT set the context to the schema that LOAD FILE loaded into... so you have to follow LOAD FILE with a SET SCHEMA command. Hope that helps

Anonymous said...

Thank you! This is my first time working with loading XML Files into SQL tables using ODI and with this I was able to load three sample test files with a matching XML Schema into an Oracle Table.

Bas said...

Can anyone tell me the commands to refresh the schema i.e DROP, CREATE and LOAD. Im new to ODI
any help appreciated. Thank you

madhu sudhan said...

hi experts

i have 64 multipls xml files with same structure,this xml files loaded in single target table in database,iam trying but its not geeting results so pls help me to solve this issue.multiple xml files to single target table

Anonymous said...

you have not provided any detail of the error you are receiving.
Whenver, you need help, you need to provide the exact version numbers,
source data, target structure, commands that you are issuing, the error you are getting.
Without this information, its like shooting in the dark.