Oracle Consulting Oracle Training Oracle Support Development
Oracle Books
SQL Server Books
IT Books
Job Interview Books
Rampant Horse Books
911 Series
Pedagogue Books

Oracle Software
Write for Rampant
Publish with Rampant
Rampant News
Rampant Authors
Rampant Staff
Oracle News
Oracle Forum
Oracle Tips
Articles by our Authors
Press Releases
SQL Server Books

Oracle 11g Books

Oracle tuning

Oracle training

Oracle support

Remote Oracle


Privacy Policy


  SQL Server Tips by Burleson

SQL Server Collation

A collation is a particular set of rules for using characters for a language or alphabet. Although the same language might be spoken by several nations, there are national or cultural variations, not to mention nations with several languages. Microsoft Windows created the Language ID Reference Number (LCID), a 32 bit code for nearly 200 languages. The LCID is more than a number that identifies a national language, it contains information encoded in its bits:

The first ten bits are the primary language ID in the range 0x200 to 0x3FF. Bits ten to fifteen are the sub-language ID in the range 0x20 to 0x3F, for the same language from different regions. Bits sixteen to nineteen are the sort ID and the remaining twelve bits are reserved and should be zero.

Collations are responsible for determining the correct characters and how they are sorted or compared. In SQL Server 2000, different collations can coexist down to the level of columns.

Each SQL Server collation determines:

* The sort order for Unicode.
* The sort order for ASCII.
* The code page used for ASCII.

ASCII characters in multilingual databases are problematic because ASCII has different character sets, called code pages. Converting between code pages is difficult because identical characters from different code pages might have different ASCII codes and some characters have no equivalent. With characters in Unicode there is no such problem.

The Sort ID field determines the sort order, which is very important when comparing or sorting data. There are five considerations about the sort order:

* Ascending or descending? (is ‘a’>‘b’?)
* Case-sensitive? (is ‘a’>‘A’?)
* Accent-sensitive? (is ‘a’=‘á’?)
* Character width? (is ‘a’>‘aa’?)
* Kana character types? (is ‘?’=‘?’?)

SQL Server 2000 has two types of collations:

* Windows collations – use the Windows locale.
* SQL collations - provided for backwards compatibility with sort orders.

A SQL collation name consists of four components:

* SortRules – name of the alphabet or language.
* Pref - uppercase preference. (optional)
* CodePage - code page. (optional)
* CaseSensitivity + AccentSensitivity or BIN

CaseSensitivity can be either CI (case insensitive) or CS (case sensitive)

AccentSensitivity can be either AI (accent insensitive) or AS (accent sensitive)

BIN means that binary sort order is used, instead of text order.

For example SQL_Latin1_General_Pref_CP437_CI_AS is Latin1_General(alphabet), Pref(uppercase preference), CP437(code page 437), CI(case insensitive) and AS(accent sensitive).

The above book excerpt is from:

Super SQL Server Systems
Turbocharge Database Performance with C++ External Procedures

ISBN: 0-9761573-2-2
Joseph Gama, P. J. Naughter  

Linux Oracle commands syntax poster

ION Oracle tuning software

Oracle data dictionary reference poster

Oracle Forum

BC Oracle consulting support training

BC remote Oracle DBA   



 Copyright © 1996 -2017 by Burleson. All rights reserved.

Oracle® is the registered trademark of Oracle Corporation. SQL Server® is the registered trademark of Microsoft Corporation. 
Many of the designations used by computer vendors to distinguish their products are claimed as Trademarks

Hit Counter