zondag 15 juli 2012

The business value of (un) structured data (big data)

Introduction

In this blogpost I would like to discuss the big data hype and I want to share some thoughts that I have about the business value of Big Data. Two weeks ago I joined a BIDutch session about Big Data. A couple of presentations were given by presenters and they shared their viewpoints on Big Data. I've gathered some thoughts, draw some pictures and mixed them together with information that was presented at the Big data sessions. Below you can see the results.

Big data and the Business value

At this moment there is a believe that structured data is only 5% of the total amount of data that is available around us. This means that the other information is not structured. There are a couple of types of information: structured, semi structured, quasi structured and unstructured information. I will call semistructured, quasi structured and unstructured information all unstructured information to avoid some confusion. Anyway, the other 95% of information is unstructured. Vendors, goeroes, and other people are saying that Big data is growing rapidly (the four V's) but what about the business value?

One of the presenters stated that Big data has more impact on operational level than on strategic level(!). That started me thinking. I'm not sure whether these are general accepted insights of big data. And, perhaps it will move towards more strategic decision making in the future. But, at this moment it seems that the business value of Big data is more focussed on operational level than on strategic level. Is Big Data bridging a gap of business intelligence on the lower part of organizations perhaps? Much of the Business Intelligence implementations are focussed on Tactical and operational levels, nowadays.



In the diagram above I've drawn the types of data (unstructured and structured) in the middle and draw two triangles on the left and on the right of rectangle. On the left the business impact of unstructured information on an organisation and on the right the impact of structured information on an(other) organization. There are two dimensions in this diagram and these are the assumed impact of the (un)structured information on the organisation as a whole and the assumed impact on the operational, tactical and strategic levels.

Conclusion

This blogpost are just some thoughts about big data. It describes possible business impact of big data on businesses. Discussing the impact on a organisation as a whole and the impact of unstructured data on strategic, tactical and operational levels.

Let me know what you think.

Greetz,
Hennie

maandag 9 juli 2012

Table Scan, Index Scan, Index Seek and RID/Key Lookup operators

Introduction

Currently studying for some Microsoft SQL Server exams and I ran into some articles about tablescan-, indexscan-, index seek- and RID/key lookup operators. I was looking for information about the differences between the types of executionplan operators. This blogpost is a description about queryplan operators regarding indexes.


Table Scan

Table Scan means that the whole table is scanned and every row is returned. Typically, this happens when there is no index defined on the table.


Clustered Index Scan

A Clustered Index Scan is when SQL server reads the whole index looking for matches.  Since a scan touches every row in the index, the cost is proportional to the number of rows in the table. This is only useful for small tables.

There seems some discussion on the Pinal Dave's blog about the whether the Index Scan is the same as a Table Scan. It seems to me that there are some slight differences but these can be ignored.


Clustered Index Seek

An Clustered Index Seek is where SQL server uses the B-tree of the index to seek directly for matching records. Clustered Index seeks are preferred for selective queries and this means that fewer rews are returned when this is used. General speaken the Clustered Index Seek step is used when the optimizer decides that the Clustered index can be used for the index seek otherwise a Index Scan is done (which is mainly the same as a Table Scan).


Non-Clustered Index Seek

Whereas the Clustered Index Seek retrieves records at the leaf levels of the B-tree, the non-clustered index has pointers to the actual data in the clustered index or table (heap).


Key Lookup

A Key Lookup happens when a index does not contains all the information to answer a query and the query optimizer decides to use the information of another index to complete the requested information.


RID lookup

Sames as Key Lookup but then the information of a table is used, instead of a index.


Conclusion

In my opinion the following operators can be ordered by best practice when optimizing your queries:
  1. Clustered Index seek (in case of large selective tables).
  2. Non Clustered Index Seek.
  3. Key lookup.
  4. RID lookup.
  5. Index scan (may be useful for small tables)
  6. Table scan

Greetz,
Hennie

vrijdag 6 juli 2012

Creating a SQL Server 2012 playground (part XII)

Introduction

With Master Data Services of Microsoft, you can create a centralized data source and keep it up to date and reduce redundancies accross applications. Also, Master data is a concept than can easily be translated to dimensions (as in starschemas and facts). Therefore it is a concept that is interesting to understand.

This blogpost covers the installation and configuring Master Data Services. Mainly, the installation steps are :
  1. Installation or verification of the installation of Master Data Services
  2. Configuration of MDS :  Create the Master Data Services Database
  3. Configuration of MDS :  Configuration of the MDS Website
  4. Installation  Microsoft SQL Server 2012 Master Data Services Add-in For Microsoft Excel
MDS is supported on x64 operating systems only.

This blogpost is one in a series of blogposts:
  • Creating a VM environment with virtualbox (part I).
  • Configuration of the domain controller (part II).
  • Creating AD users and installing SQL Server 2012 (part III).
  • Installation of Sharepoint (part IV).
  • Adding the tabular mode instance to the SQL Server installation (part V).
  • Adding the powerpivot mode instance to the SQL Server installation (part VI).
  • Configuring SharePoint Central Administration (part VII).
  • Installing Reporting Services Sharepoint mode as Single Server Farm (part VIII).
  • Installing MS SQL Server Powerpivot for Excel 2010 (part IX).
  • Installation of SSDT and the SSDT Power tools (part X).
  • Installation of Contoso and AdventureWorks databases (part XI).
  • Installation of Master Data Services (part XII).
  • Installation of Data Quality Services (part XIII).
  • etc.

Verify installation

If you haven't done, install the Master Data Services component with the SQL Server 2012 installation disc. I've already installed MDS during the standard installation of SQL Server 2012.


Master Data Services has been installed!

Configuration of MDS

Master Data Service isn't ready for normal usage yet and therefore we need to configure MDS  with the configuration manager. There are two steps to follow: Create the Master Data Services database (step 2) and the configuration of the MDS Website (step 3). Step 1 was the verification of the installation of MDS.

2. Create the Master Data Services Database
The Master Data Services Configuration tool can be found in the Start menu.



Review the information that is shown in the Server configuration window and ensure that all the required software has been installed.


Click on “Databases” in the left pane and click the “Create Database” button to start the Create Database wizard. Follow the steps in the wizard and supply the required information to create a new MDS database.


The wizard is start up.


Enter the SQL Server instance name in the window.


Enter the databasename at "Database name".


Enter the administrator account at "User name".


And the creation of the database is in progress.


After the database is created, system settings can be adjusted.



3. Configuration of the MDS Website
The next step is to configure the MDS Website. Click “Web Configuration” on the left part of the window to configure a new IIS web application for MDS. The MDS web application can be created as its own web site (root application) or within another web site. The simplest configuration is to create the MDS web application within the existing Default Web Site.
 
Select “Default Web Site” in the Web Site drop down list and Click the "Create Application…" button to create a new MDS web application. Accept the defaults for web application Alias and Application Pool Name (for a default installation offcourse). Type the User name and Password. This should be the same account as provided in the Service Account step of the Create Database wizard. And, click the OK button to create the new web application.

Click a couple of times on the OK button and leave the default settings as proposed bij Microsoft. The Web Configuration should look similar to that shown below



And then choose the web application you will be taken to the MDS getting started page.


MDS Add in for Excel

The "Microsoft SQL Server 2012 Master Data Services Add-in For Microsoft Excel" gives multiple users the ability to update Master data with Excel. Download the .msi from Microsoft. Double click on the .msi and press Next, Next, install and finish.

Conclusion

In this blogpost i've described the installation of MDS.


Greetz,
Hennie


zondag 1 juli 2012

Creating a SQL Server 2012 playground (part XI)

Introduction

In this blogpost I'll describe the installation of the two demo sample databases: Contoso and AdventureWorks. I've downloaded the Contoso software from the Microsoft Download Center and the Adventureworks from Codeplex.  I've copied them to my VM environment and installed them. In this blogpost I'll show you the installation of these databases and the end result of this process.



This blogpost is one in a series of blogposts:
  • Creating a VM environment with virtualbox (part I).
  • Configuration of the domain controller (part II).
  • Creating AD users and installing SQL Server 2012 (part III).
  • Installation of Sharepoint (part IV).
  • Adding the tabular mode instance to the SQL Server installation (part V).
  • Adding the powerpivot mode instance to the SQL Server installation (part VI).
  • Configuring SharePoint Central Administration (part VII).
  • Installing Reporting Services Sharepoint mode as Single Server Farm (part VIII).
  • Installing MS SQL Server Powerpivot for Excel 2010 (part IX).
  • Installation of SSDT and the SSDT Power tools (part X).
  • Installation of Contoso and AdventureWorks databases (part XI).
  • Installation of Master Data Services (part XII).
  • Installation of Data Quality Services (part XIII).
  • etc.


Installation

The first thing I've done is the installation of the Contoso databases on SQL Server 2012. Download the Contoso software and save the two executable files to your hard disk. Copy them to the VM and execute them:
  • ContosoBIdemoBAK.exe
  • ContosoBIdemoABF.exe


There is one SQL database and one SSAS database. Restore them to the Server with the Restore option. 


On the Codeplex site more demo database are available. A familiar Demodatabase is the AdventureWorks database which is present since SQL Server 2005 version. Now Microsoft has made them available for SQL Server 2012. On Codeplex there quite a couple of downloads available, but you don't need them all. These are the one I've used for my blogpost:
  • AdventureWorks2012_CS
  • AdventureWorks2012
  • AdventureWorksDW2012
  • AdventureWorksLT2012


Reattach the databases in SQL Server 2012 and you're in business.

There are also a couple of analytical databases (tabular and multidimensional) available on Codeplex and these are:
  • AdventureWorks Tabular Model SQL 2012.
  • AW Internet Sales Tabular model.
  • AdventureWorksDW2012Multidimensional.

I've installed them and the endsituation is like the screendump below.



Conclusion

In this blogpost I've described the installation of the Contoso and the adventureworks databases.

Greetz,
Hennie