Using Kettle to load a product file

I happened to get involved in helping someone load product data from their supplier. (Their initials are CK and hence the file and transform names.) 

You will probably have realised that loading OpenVPMS product data is not just a simple matter of loading up a flat file.  To get things set up correctly not only do we need to link products to their supplier, but we need to set the Product Type (because a lot of the reporting in OPV has selection via product type, and because the product type also defines the invoice order and (optionally) discounts.

You may have also understood that OPV divides products into medications, merchandise, and services, and that our product loader has to cope with this.

Why Kettle?  There is also the dataload tool, and its quite possible to use dataload to load product data.  However, if you are starting from a data file provided by the product vendor, you need to process this in the xml format required by dataload.  You almost need Kettle to do this processing, so you may as well use Kettle to do the load the data - it certainly gives us flexibility and powerful processing facilities.

If you look inside the attached zip (it has been renamed to so that I could attach it) you will find 5 kettle transforms and the kettle job that includes them (so you can run the job rather than the five transforms).  There is also an extra transform (see below) than can be used to update prices at a later time. [Note however, that with the advent of the product export/import in OpenVPMS 1.7 this is not really useful.]

Also in the zip are two csv files. CK-supplier contains the single supplier, CK-products contains the products.  Note that this is a severely truncated and editied product set.  The orginal one contained some 7500 items and has been truncated to protect the supplier's commercial information.

The csv file was generated by loading the supplied text file into Excel and then manipulating it, adding the extra columns, and saving it as a CSV file.  So if you want to use these transforms to load your own supplier's data, then you need to process this to match the format of this product file (or modify the transforms).

I also added extra columns as follows:

ProdClass - a single letter to indicate the type of product G=goods=merchandise; M=medication, S=service.  If you add your own supplier's data, you can start by tagging everything as G, but you will need to go through and mark the things that should be medication as M. [Recall that medications have a DrugSchedule and a dispensing labels and instructions.]  You can also add service items like Consults.

SupplierNo - this is set to 1 for every item.  If you include products from another supplier, put 2 here and the add the corresponding line to CK-supplier

Markup - this is the markup index - see the code in the Process step of the CK Products transform, and adjust as you need.

Species - put the species here for items that are species specific. Instructions in the Process step of the CK Products transform.  You can leave this column all blank if you want.

FixedPrice - this column is here if you want to add service products, or if you want to put a 'flag-fall' price on some goods or medications. [eg putting a fixed price of $10 on a can of dog food with a unit price of $3 means that 10 cans cost $10+10x$3 = $40.]  No problem if you leave blank.

UOM - unit of measure for the item.  If you have previously set up the 'Units of Measure' lookup, then you can use the appropriate codes here. No problem if you leave blank.

DrugSched - for the scheduled drugs, put their schedule here (ie S4, S8 etc).  Note that although everything has a schedule (ie dog food is S1 - non-dangerous), only medications have a slot for the schedule.  This means that if there is some item which you think of as merchandise, but it has a schedule that you need to record, then you Must make the item a medication.

ProdType - the Product Type for the item.  What is set here needs to match the Name field in the CK_productTypes.csv file. 


The transforms are as follows: 

  • CK Supplier loads the supplier(s) from the CK_supplier.csv file.
  • CK Load Product Type loads the product types from the file.
  • CK Products loads the products and their prices.
  • CK Product Supplier links products to suppliers and adds in the supplier specific information.
  • CK Set Product Type sets the product type for each product.

There is also  a CK New Prices transform.  This uses the same product input file (for simplicity sake) but just updates the prices. [Remember that in OPV a price update does NOT replace the previous prices – it just add the newer ones with the new date.  Hence these supersede the old one, but you can see what the old prices were. This happens because the prices are ‘collections’. If you updated a simple field (like Name) then the new data will replace the old.]



a) I have not set all the available datafields

b) you will see that I set both the product Name and Printed Name to the name from the Description field in the data file.  OPV has done its 'properCase' processing of the Name field, but the standard archetype does not set properCase for the Printed Name field.  If you want the name on invoices etc to come out the same as the proper cased name, delete line 2 of the mappings in each of the OPV Loader steps so that Printed Name does not get set (and will default to what is in the Name field).

c) you will have to adjust the folder names set in the xxx.csv steps.  You will find that they are all set to " C:\Users\Tim\Documents\HKG Operations\OpenVPMS\Kettle\"

d) naming convention:  all variables generated by the Process step are prefixed by 'o'.  Hence fields like DrugSched come from the input file and have not been altered.



Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.
Hi Tim Any chance of uploading the .zip file again so I can have a look at the transforms? I can't see any file attachments on this post. Cheers John

John - has been done.  Regards, Tim G

Syndicate content