Digging into SQL Server 2012 columnstore index

Date:


The SQL Server 11.0 launch (code named “Denali”) introduces a brand new knowledge warehouse question acceleration characteristic based mostly on a brand new kind of index known as the columnstore. Columnstore indexing is formally introduced in SQL Server 2012. It’s working based mostly on xVelocity reminiscence optimised expertise and it improves knowledge warehouse question efficiency considerably. Attributable to the truth that knowledge warehousing, resolution assist methods and enterprise intelligence functions are rising in a short time, we want to have the ability to learn and course of very massive knowledge units rapidly and precisely into helpful data and data. Columnstore index expertise is particularly acceptable for knowledge warehousing knowledge units. It improves the frequent knowledge warehousing queries’ efficiency considerably.

Columnstore index is storing knowledge for every column and joins all of the columns to finish the index. There are numerous benefits of utilizing columnstore indexing as compared with the normal rowstore indexing. The time period “rowstore” is utilizing to explain both a heap or a B-tree that accommodates a number of rows per web page. As columnstore indexing is fairly new, it has some restrictions and limitations. So, you have to be conscious of these limitations if you find yourself planning to implement columnstore index in your knowledge warehouse. On this article we are going to talk about in regards to the under matters:

§  How columnstore index works?

§  Advantages of utilizing columnstore indexes

§  Restrictions of columnstore indexes

§  The way to create a SQL Server columnstore index?

§  Planning for creating columnstore index

§  Selecting columns for a columnstore index

Whereas rowstore indexing shops a number of rows per web page, columnstore index shops every column in disk pages individually. The next picture illustrates the distinction between columnstore and rowstore indexing from storage perspective:

clip_image001[14]

As you may see C1, C2…C6 are saved in several pages, so:

·         solely the columns wanted in a question are fetched from the disk

·         because of the redundancy of information inside a column it’s simpler for knowledge compression

·         due to the info compression and ceaselessly accessed elements of generally used columns are nonetheless stay in reminiscence, therefore, buffer hit fee is improved.

As mentioned, columnstore is working based mostly on xVelocity expertise that’s in frequent with SQL Server Evaluation Providers Tabular Mannequin in addition to PowerPivot. Truly, it doesn’t imply that columnstore indexes have to slot in reminiscence; nonetheless, they will use out there server reminiscence successfully to maneuver parts of columns out and in of reminiscence on demand. As columnstore indexes retailer all knowledge for separate columns in separate pages, utilizing columnstore indexes improves I/O scan efficiency considerably.

There are a number of advantages of utilizing columnstore indexes as compared with rowstore indexes as under:

·         As mentioned, solely the columns wanted in a question are fetched from the disk, so, the info warehouse question efficiency is means quicker for frequent knowledge warehouse queries

·         As knowledge is extremely compressed utilizing xVelocity expertise the disk area reduces successfully

·         Because the pages are considerably compressed, the pages containing probably the most ceaselessly accessed columns stay in reminiscence

·         As batch mode processing that’s a complicated question execution expertise that processes chunks of columns is used, the CPU utilization is decreased.

Columnstore indexing is a brand new expertise, so, you have to be conscious of its restrictions if you’re planning to implement columnstore indexes. The next restrictions needs to be thought-about:

·         Columnstore index is out there solely in SQL Server Enterprise, Developer and Analysis editions, so, you’ll face to the next error message if you wish to use columnstore index in different editions of SQL Server 2012: “CREATE INDEX assertion failed as a result of a columnstore index can’t be created on this version of SQL Server.”

·         Tables containing columnstore indexes can’t be up to date. This restriction is likely to be eliminated within the subsequent releases of SQL Server. Now, find out how to insert, replace or delete knowledge in a desk that accommodates a columnstore index? There are three options for this function; nonetheless, plainly the primary resolution is extra easy than the others.

1.       Drop the columnstore index, carry out any INSERT, UPDATE, DELETE or MERGE operations, and recreate the columnstore index.

2.      Partition the desk and swap partitions. For a bulk insert:

§  insert knowledge right into a staging desk

§  construct a columnstore index on the staging desk

§  swap the staging desk into an empty partition

For different updates:

§  swap a partition out of the principle desk right into a staging desk

§  disable or drop the columnstore index on the staging desk

§  carry out the replace operations

§  rebuild or re-create the columnstore index on the staging desk

§  swap the staging desk again into the principle desk.

3.      Place static knowledge right into a predominant desk with a columnstore index, and put new knowledge and up to date knowledge more likely to change, right into a separate desk with the identical schema that doesn’t have a columnstore index. Apply updates to the desk with the latest knowledge. To question the info, rewrite the question as two queries, one towards every desk, after which mix the 2 end result units with UNION ALL. The sub-query towards the massive predominant desk will profit from the columnstore index. If the updateable desk is far smaller, the shortage of the columnstore index may have much less impact on efficiency. Whereas additionally it is attainable to question a view that’s the UNION ALL of the 2 tables, chances are you’ll not see a transparent efficiency benefit. The efficiency will rely upon the question plan, which is able to rely upon the question, the info, and cardinality estimations. The benefit of utilizing a view is that an INSTEAD OF set off on the view can divert updates to the desk that doesn’t have a columnstore index and the view mechanism can be clear to the person and to functions. In case you use both of those approaches with UNION ALL, take a look at the efficiency on typical queries and determine whether or not the comfort of utilizing this method outweighs any lack of efficiency profit.

Word: As we mentioned, the tables containing columnstore index, can’t be up to date. However, it doesn’t appear to be a good suggestion to make use of columnstore to make a read-only desk. As a result of, columnstore index isn’t designed for this specific function and it’s attainable that Microsoft removes this restriction within the subsequent releases of SQL Server.  

·         Columnstore indexes aren’t supporting greater than 1024 columns

·         Solely nonclustered columnstore indexes can be found (there isn’t a clustered columnstore index)

·         A columnstore index can’t be a novel index

·         Creating columnstore indexes on a view or listed view isn’t supported

·         Columnstore indexes can not embody a sparse column (an atypical column that has an optimized storage for null values)

·         Columnstore indexes can not act as major keys or overseas keys (keep in mind that a columnstore index can’t be a novel index)

·         Columnstore indexes can’t be modified utilizing “ALTER INDEX” assertion. Nevertheless, the “ALTER INDEX” assertion can be utilized to disable and rebuild a columnstore index. So the one strategy to modify a columnstore index is to drop and recreate the columnstore index.

·         The key phrase “INCLUDE” isn’t supported to create a columnstore index

·         Sorting isn’t allowed in a columnstore index, so, “ASC” and “DESC” key phrases aren’t supported. Truly, columnstore indexes are ordered in line with the compression algorithm. Values chosen from a columnstore index is likely to be sorted by the search algorithm, however you could use the ORDER BY clause to ensure sorting of a end result set.

·         A columnstore index doesn’t use and even preserve statistics as rowstore index does

·         A columnstore index doesn’t assist FILESTREAM attribute, so, solely the columns within the desk that aren’t used within the columnstore index can comprise the FILESTREAM attribute.

·         As column retailer index is optimized for in-memory processing, so, server reminiscence limitations needs to be thought-about

·         Columnstore indexes don’t assist SEEK, so, if the desk trace FORCESEEK is used, the optimizer is not going to take into account the columnstore index.

·         Columnstore indexes can’t be mixed with web page and row compression, as columnstore indexes are already compressed in a distinct format.

·         Replication isn’t supported for tables containing columnstore index

·         Change monitoring and alter knowledge seize aren’t supported

·         Filestream isn’t supported

·         The next knowledge sorts can’t be included in a columnstore index:

1.       binary and varbinary

2.      ntext , textual content, and picture

3.      varchar(max) and nvarchar(max)

4.      uniqueidentifier

5.      rowversion (and timestamp)

6.      sql_variant

7.      decimal (and numeric) with precision larger than 18 digits

8.     datetimeoffset with scale larger than 2

9.      CLR sorts (hierarchyid and spatial sorts)

10.  xml

Making a columnstore index is rather like creating another index. Typically, there are two methods to create a columnstore index, creating index utilizing T-SQL statements or utilizing SSMS (SQL Server Administration Studio).

Making a columnstore index utilizing T-SQL

In a question editor window execute the next assertion:

 

CREATE NONCLUSTERED COLUMNSTORE INDEX IndexName

    ON TableName (Column1, Column2, …)

Making a columnstore index utilizing SSMS

Open SQL Server Administration Studio (SSMS) and connect with a SQL Server database engine. Keep in mind that columnstore index is out there simply in SQL Server 201 Enterprise Version.

1.       From “Object Explorer”-> broaden the instance-> broaden the databases-> broaden the database-> broaden the table-> proper click on on “Indexes”->  New Index-> Non-Clustered Columnstore Index

clip_image002[10]

2.      In “New Index” window-> Index Identify (kind a reputation)-> Add-> choose the column-> OK-> OK

clip_image004[8]

Now the columnstore index is created and you’ll see it within the “Indexes” in object explorer.

clip_image005[8]

As columnstore index is a brand new expertise, it has many limitations and restrictions. Though the entire columnstore index restrictions needs to be thought-about, one of the crucial basic and necessary restrictions of columnstore index is that it’s NOT out there in all variations of SQL Server 2012. So, it’s actually necessary to know what model of SQL Server goes for use in manufacturing setting. In case your organisation isn’t going to make use of SQL Server 2012 Enterprise version, you can not use columnstore index in any respect. So, it’s important to plan to create rowstore indexes in your knowledge warehouse.

Attributable to the truth that the indexing is basically associated to the queries, it needs to be investigated in a case by case foundation. Though columnstore indexing is bettering the question efficiency, nonetheless, in some instances it would trigger poorer question efficiency.

A few of the efficiency good thing about a columnstore index is derived from the compression methods that scale back the variety of knowledge pages that have to be learn and manipulated to course of the question. Compression works greatest on character or numeric columns which have massive quantities of duplicated values. For instance, dimension tables might need columns for postal codes, cities, and gross sales areas. If many postal codes are situated in every metropolis, and if many cities are situated in every gross sales area, then the gross sales area column can be probably the most compressed, the town column would have considerably much less compression, and the postal code would have the least compression. Though all columns are good candidates for a columnstore index, including the gross sales area code column to the columnstore index will obtain the best profit from columnstore compression, and the postal code will obtain the least.

References: SQL Server 2012 Books On-line, SQL Server Technical Article: Columnstore Indexes for Quick Information Warehouse Question Processing in SQL Server 11.0; November 2010

 

LEAVE A REPLY

Please enter your comment!
Please enter your name here

Share post:

Subscribe

spot_imgspot_img

Popular

More like this
Related

The right way to Cut back Enterprise Dangers

Should you go away your contact heart uncovered...

Japanese authorities confer on weak yen, trace at intervention choice By Reuters

By Tetsushi Kajimoto TOKYO (Reuters) - Japan's...

Why healthcare outsourcing is on the rise

For greater than twenty years, Black E book...