Recently I was asked to install and run a bridge between an MS SQL database containing SAP datasets and a SugarCRM OnDemand (cloud) instance.
Thus, I created a development environment with the same data and connection. The steps for running Spoon (the GUI of Kettle, now part of Pentaho Data Integration) follow.
Applicable to Pentaho Versions: 7 – 8.1
For version 8.1, the file is pdi-ce-8.1.0.0-365.zip and may be found at https://sourceforge.net/projects/pentaho/files/Pentaho%208.1/client-tools/
2.1 Go to Control Panel -> System -> Advanced -> Environment variables
2.2 Set the JAVA_HOME variable to the path of your Java installation.
2.3 Add the JAVA path to the system PATH variable
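The same settings can be applied from an elevated command prompt; a sketch, assuming Java is installed under the path shown (adjust to your actual version):

```bat
:: Hypothetical Java path - point these to your actual installation
setx JAVA_HOME "C:\Program Files\Java\jre1.8.0_171" /M
setx PATH "%PATH%;C:\Program Files\Java\jre1.8.0_171\bin" /M
```

The /M switch writes the system-wide variable. The graphical dialog above is the safer route, since setx on PATH rewrites the whole value.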
To test it, run Spoon by double-clicking the Spoon.bat file.
If it does not start, edit the .bat (Windows) or .sh (Linux) file:
Open the Spoon.bat or Spoon.sh file and find the line with PENTAHO_DI_JAVA_OPTIONS.
Change the original -Xms1024m to -Xms512m and the original -Xmx2048m to -Xmx1024m (note that JVM memory options are case-sensitive).
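After the edit, the line in Spoon.bat should look roughly like this (the exact default line may differ between versions; any other options on the line stay as-is):

```bat
set PENTAHO_DI_JAVA_OPTIONS="-Xms512m" "-Xmx1024m"
```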
Try again. You should see
Notice
The original values are the minimal memory configuration for production; the proposed values are for development only. For big databases, the new 512m and 1024m may not be enough for some use cases.
Microsoft Native driver
https://www.microsoft.com/en-us/download/details.aspx?id=54670
For MySQL
https://dev.mysql.com/downloads/file/?id=479586
Take the jar files and put them inside data-integration\lib
e.g.
Save the file sqljdbc_4.1\enu\jre7\sqljdbc41.jar inside data-integration\lib
select HOST_NAME();
select SERVERPROPERTY('InstanceName');
select @@version;
ALTER LOGIN sa ENABLE;
GO
ALTER LOGIN sa WITH PASSWORD = '<enterStrongPasswordHere>';
GO
6.1 – In case you need Windows integrated authentication, add the DLL file sqljdbc_auth.dll, found in the sqljdbc_4.1\enu\auth\x64 folder, to the data-integration\lib folder.
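With the DLL in place, integrated authentication is requested through a JDBC URL property; a sketch (host and database names are placeholders):

```
jdbc:sqlserver://localhost;databaseName=crm_78_stage;integratedSecurity=true;
```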
If working locally, you should make sure the database accepts connections.
Notice
You may need to ask the customer to make sure the database is accessible; suggest the following steps:
Open port 49170 in SQL Server. To do that, open SQL Server Configuration Manager. For version 2016, the file is
C:\Windows\SysWOW64\SQLServerManager13.msc
For 2012, the file is
C:\Windows\SysWOW64\SQLServerManager11.msc
In the left-side tree, go to SQL Server Network Configuration -> Protocols for <Your DB>, e.g. SQLEXPRESS
– Open TCP/IP and change the status to Enabled.
– Go to the second tab, IP Addresses, and scroll to the LAST section – IPAll
– Put 0 in TCP Dynamic Ports
– Put 49170 in TCP Port
Save by clicking OK.
Restart the SQL Server:
– Go to SQL Server Services, select the active instance, right-click, and choose Restart.
More in:
If using Windows Firewall:
Open the Control Panel (right-click the Windows button -> Run, type control <enter>).
In the Search Control Panel box, type "Firewall".
Click Windows Firewall, then Advanced settings on the left side.
Click Inbound Rules on the left side, then right-click and select New Rule on the right side. Next.
Choose the "Port" radio button. Next.
Choose TCP and type 49170 in the Specific local ports box. Next.
Choose the "Allow the connection" radio button. Next.
Check the 3 checkboxes. Next.
Name it 49170 SQL Server. Click Finish.
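The same inbound rule can be created in one line from an elevated command prompt instead of the wizard (the rule name is arbitrary):

```bat
netsh advfirewall firewall add rule name="49170 SQL Server" dir=in action=allow protocol=TCP localport=49170
```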
You should see something like this.
Create an icon for Spoon on the desktop.
Run it and create a new configuration, following steps 1 to 6.
In this case:
Host name: LAPTOP-DEV01
Database name: crm_78_stage
Instance name: SQLEXPRESS
User name: sa
Password: password
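For reference, these values correspond to a JDBC URL of roughly the following form (assuming the fixed port 49170 configured earlier; with a fixed port, the instance name is optional):

```
jdbc:sqlserver://LAPTOP-DEV01:49170;databaseName=crm_78_stage;user=sa;password=password;
```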
If you have questions about updating the firewall or the SQL Server TCP/IP settings, visit
https://www.youtube.com/watch?v=5UkHYNwUtCo
For each database engine, we have an alternative way to check the total number of rows in each table. We could check the size of each table as well.
In MS SQL Server:
In this case, a lightweight one
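A lightweight option reads the row counts from the catalog views instead of scanning each table (a sketch; not necessarily the exact query used originally, and counts can be approximate under concurrent writes):

```sql
-- Approximate row count per table from catalog metadata (fast, no table scans)
SELECT t.name AS Name, SUM(p.rows) AS Rows
FROM sys.tables t
JOIN sys.partitions p
  ON p.object_id = t.object_id
 AND p.index_id IN (0, 1)  -- 0 = heap, 1 = clustered index
GROUP BY t.name
ORDER BY t.name;
```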
For MySQL:
SELECT table_name AS Name, table_rows AS Rows
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = 'schema';
(For InnoDB tables, table_rows is an approximate count.)
Inside the Pentaho folder, or in the adequate disk/folder, create two folders for saving the work done:
e.g.
d:\pentaho\pdi-rawdata
d:\pentaho\pdi-transformation
Also create the Kettle files folder:
d:\pentaho\pdi-kettle
The last folder should be included in a Bitbucket repository, under DevOps projects
A transformation is needed to import the files from Databases.
File->New->Transformation
13 – A simple transformation process is composed of the following steps, named by convention:
13.1 – <module>_rawdata –
Module, in this case, is the SugarCRM or Salesforce module the data is going to.
This transformation simply gets the data from the data source, and should also rename the fields from the original data source to the SugarCRM or Salesforce field mapping.
That means the data source could be an XLSX table, a CSV file, or a SQL dataset from any DB engine.
This is the typical appearance of this step when connecting to a database.
13.2 – <module>
This step gets the data, prepares the JSON payload, connects to SugarCRM or Salesforce to get the credentials, and creates the JSON object to inject into the CRM via an API request.
This step starts from the CSV generated in the <module>_rawdata step.
13.3 – Other transformations
Other transformations may be required, and they can usually be built the same way. In the next example, we just transformed the input database into an XLSX file.
Its name should be <module>_<transformation>. In the following case, it would be articles_xlsx.
14 – Now you should create the <module>_rawdata transformation. If data is coming from a database, use the connection already tested in the previous steps.
Notice that we take this moment to also rename the fields. This way, we save at least one step.
For testing, try to limit the query in the input component.
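As a sketch, such a limited query with inline renaming could look like this (the table and column names are hypothetical, not from the actual SAP schema):

```sql
-- TOP limits the preview size; the aliases already rename fields to the Sugar mapping
SELECT TOP 100
    KUNNR AS account_number,
    NAME1 AS name,
    ORT01 AS billing_address_city
FROM dbo.sap_customers;
```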
15 – Press preview. You should see a sample of the rows you want to send.
16 – The step “Rename the variables” will not be used in this case.
17 – In the CSV output step, make sure you point the file to a valid and writable path. Only the options Add filenames to result and Create parent folder should be checked.
17.1 – The separator here is the comma and the enclosure is the double quote ("). Enable Force the enclosure, disable the enclosure fix, and set Header to true. Format: CR+LF terminated (Windows, DOS); Compression: None; Encoding: UTF-8.
17.2 – IMPORTANT: in the Fields tab, click Minimal width.
18 – Save and test the transformation
tbd
Some JSON sources and targets, when converted, may end up in JavaScript's internal UTF-16 encoding. To handle that, use the following in the Script component when necessary:
// Before sending (on client)
var encodedObject = unescape(encodeURIComponent(JSON.stringify(myObject)));
// After sending (on server)
var objectReceived = JSON.parse(decodeURIComponent(escape(messageArgs[0])));
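A quick round trip shows the two calls are inverses; myObject here is a hypothetical payload with a non-ASCII character:

```javascript
// Hypothetical payload with a non-ASCII character
var myObject = { name: "Zürich", qty: 3 };

// "Before sending": serialize, then squeeze the UTF-8 bytes into a one-byte-per-char string
var encodedObject = unescape(encodeURIComponent(JSON.stringify(myObject)));

// "After sending": reverse the two steps and parse
var objectReceived = JSON.parse(decodeURIComponent(escape(encodedObject)));

console.log(objectReceived.name); // → Zürich
```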
The look and feel of a typical SugarCRM transformation without using any custom endpoint is shown above. If you need to keep track of the "Assigned to / Assigné à" field, take a look at:
File: Accounts.ktr
There you will notice a second path where the system takes the assigned full name and does the necessary transformation through a custom endpoint, which can be found in
Bitbucket Repository: zitta.sugarondemand.com
File: rest_example.php
It should go under custom\clients\base\api\
In red, you will find the objects you need to update:
1 – Each module will have a different CSV file. Therefore, it is necessary to update its name;
2 – Get Variables is automatically set by the job. If you want to test, the default value is already set to POST;
3 – The Script step has the endpoint that points to where you want to put the data;
4 – SugarCRM REST: it has the variables for the URL, OAUTH, and BULK. You will want to update it according to the instance you are working with.
Working with jobs is not different from working with transformations.
A typical view of a sequenced job is like the following.
The start content should be like this when activating the task via CRON / Scheduler:
Passing parameters to the existing transformations should be made like the following when using the transformations mentioned above.
For setting up our jobs, though, we do not use the Start step's scheduler. We call the jobs from the Linux or Windows scheduler.
This allows us to run the jobs at a given interval.
https://wiki.pentaho.com/display/EAI/Kitchen+User+Documentation#KitchenUserDocumentation-Scheduling
A batch file (or .sh when using Linux) should be created to trigger the command line. Here is an example. You need to run Kitchen from its working folder.
run_quotes.bat:
cd c:\data-integration
c:\data-integration\kitchen.bat /file:c:\SUGAR\QuotesJob.kjb /rep:"LocalRepo" /level:Basic > c:\SUGAR\QuotesJob_log.log
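On Linux, the equivalent shell script for cron would look like this (a sketch; the installation paths are assumptions, and Kitchen options use the -option=value form there):

```sh
#!/bin/sh
# run_quotes.sh - trigger the Quotes job from cron
cd /opt/data-integration
./kitchen.sh -file=/opt/SUGAR/QuotesJob.kjb -rep="LocalRepo" -level=Basic > /opt/SUGAR/QuotesJob_log.log
```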
A typical usage can be viewed under Windows Scheduler.
When creating a new task, try using Create Task instead of Create Basic Task.