Friday, March 26, 2010

ROW_NUMBER() window function

0 comments
ROW_NUMBER() OVER function can be used as a function to limit rows by specifying no of rows to view.

SELECT * FROM
( SELECT
ROW_NUMBER() OVER (ORDER BY sort_key ASC) AS row_number,
COLUMNS
FROM tablename
) A(name of select statement)
WHERE row_number <= 10


Thursday, March 25, 2010

Select statement to Extract text file in a column and to save images

0 comments
1. Text File in a column
Select statement can be utilized as to select data from a text file in a single column where 1 is as ID, Asia as region and the last column will contain the data from a text file.
SELECT 1 AS ID, 'ASIA' AS REGION, *
FROM OPENROWSET ( BULK 'C:\mysql.TXT',SINGLE_CLOB) AS MYTABLE
in the OPENROWSET parenthesis replace the path with your text file path.
Save an image in a binary format
Following query will save picture in sql db in binary format.

UPDATE CompanyInfo SET Logo = A.BulkColumn FROM
(SELECT * FROM OPENROWSET(BULK N'C:\Image\abc.jpg', SINGLE_BLOB) B) A

Things to consider when executing.
1. Place Power Link image somewhere at client database server’s machine.
2. Replace C:\Image\abc.jpg with the physical image path.
3. Check CompanyInfo table that it contains a record.
4. Execute above script to update image for that record.

Tuesday, March 23, 2010

Select Column Level Fields from more than one table

0 comments
How many books have been purchased along with their quotations in the month of Jan 2010, in order to aggregated values from different table following is an easy way to do it.

Select count(b.book_id) as "Total Books"
,(Select count(q.id)
from tbl_q_received as "q"
where date_format(b.purchase_date,'%y-%m-d')=date_format(b.purchase_date,'%y-%m-d')) as "Total Quotations"
from tbl_book b
where b.purchase_date between '2010-01-01 00:00:00' and '2010-01-3123:59:59'

Sunday, March 21, 2010

How to Choose Date Format in MYSQL

2 comments
While writing a query in MYSQL for getting result up to Hour in dates one should use date format define below because it would not duplicate the values for the same records while matching them to dimension table.
Date_format('your_date','%d-%m-%y-H')
note: Capital H is for (1-24 hours format) while small h is for (1-12 hours format)

Example:
Select date_format(registration_date,'%d-%m-%y-%H') as "Reg_date", count(emp_id)
from emp
group by Reg_date


in order to get not NULL value one should define a separate field for selecting date from a database because defining manually first time date value in field would never bring null value and all your measures may be zero but not null...because it would generate an error i.e. can not convert nvarchar value to field INT

or another solution to it is to set default value to zero.





Wednesday, March 10, 2010

Single Insertion for all values in ETL script

0 comments
It is common practice to get single value from the transactional database and then add to it immediately in the database and then updating rest of the values but it is not efficient because in huge data it is very difficult to update i.e. it is time consuming therefore it is suggested to first select all the values i.e. KPIs from the transactional database using transformation in the Pentaho and store these KPIs values in variable using SET VARIABLE as mentioned below

select your kpis query values using input

SET VARIABLE

Field Name Variable name Scope default value
from input My_Variable valid in root job 0(in case of int)

in the set variable you can store each value you get from the query using INPUT instead of updating get all these values store in the variable using GET VARIABLE in below mentioned way

GET VARIABLE

Name of Variable Name of same variable set as in SET VARIABLE
name to variable as in database ${variable name as defined in SET VARIABLE}

pick a INSERT from OUTPUT get all the variables in the last grid defined in the GET variable and get only those values which want to look up in the database and on the basis of that you want to insert values in the destination............................. in this is way it not required for your ETL to every time get value and update it in the warehouse on the basis of some criteria instead it is better to have all the values ready to insert in one go using single insertion weather it is 1000 columns or a single column think how complected it is to fill single row through 999 updating including a single insertion.......





Hour Basis Records from Transactional Databases to Warehouse

0 comments
To get hour basis records from the source to the warehouse it is recommended to have a table having a date field in the warehouse database i.e. (example 2010.01.01 17:00:00) .

first extract that date and then add 29 minutes to it using add(minutes,date) function for every KPI (Key Performance Indicator) set this date value in your SET VARIABLE and then access it in every desire Transformation through out he ETL script

It is possible within this short time period a record may not exist in the source data and the insertion will through an error for not entering string or nvarchar value in case of Integer value in the destination so remember set the default value to zero 0 while setting the name to variable in the SET VARIABLE property pallet. finally having successfully entering all the values in the destination update last processing date in the same table you created in the destination with the addition of 30 minutes.

Friday, March 5, 2010

Setting and Getting varibles in Pentaho Transformations

0 comments
while working with the variables in pentaho Transformations one can select specific records in global and local variables with the help of SET VARIABLE AND GET VARIABLE available in the transformations

for example:
to get a record from source and then save it into SET VARIABLE there you can define the scope of the variable i.e. valid in root etc.. in the first column name the variable and then select the scope of the variable from the drop down combo box...


now for accessing the same variables' value in any where through the transformations and job what you need to do it take a GET VARIABLE from the menu and open it then in the first column name the variable and then access the same variable you declare in SET VARIABLE
with following style

SET Variable:

variable name Scope

my_variable Valid in root job only

GET VARIABLE::

new variable name declared variable name

my_new_name_for _variable ${my_variable} or %%my_variable%%

by default it is string but u can change the data type through introducing the SELECT VALUES from the menu .......go in the meta data and select the data type for the required Fields and retain the rest of the other fields in their own data types......................