Collation refers to a set of rules that determine how data is sorted and compared. Character data is sorted using rules that define the correct character sequence, with options for specifying case-sensitivity, accent marks, kana character types, and character width.
Collation is a very interesting concept but I quite often see it is heavily neglected. I have seen developer and DBA looking for a workaround to fix collation error rather than understanding if the side effect of the workaround. Collation is a very deep subject. Today in most simple way I would like to explain that different collation can return different result. Without understanding business needs (and sensitivity) one should not change the collation of the columns or database.
Note: Different languages will have different sort orders.
b, etc. are treated in the same way, then it is case-insensitive. A computer treats
a differently because it uses ASCII code to differentiate the input. The ASCII value of
A is 65, while
a is 97. The ASCII value of
B is 66 and
b is 98.
O are treated in the same way, then it is accent-insensitive. A computer treats
A differently because it uses ASCII code for differentiating the input. The ASCII value of
a is 97 and
A 225. The ASCII value of
o is 111 and
O is 243.
When Japanese kana characters Hiragana and Katakana are treated differently, it is called Kana sensitive.
When a single-byte character (half-width) and the same character when represented as a double-byte character (full-width) are treated differently then it is width sensitive.
To test collation effects try executing the following SQL script:
— Creating table
CREATE TABLE TestCollationTable
FirstName_CI VARCHAR(20) COLLATE Latin1_General_CI_AS,
FirstName_CS VARCHAR(20) COLLATE Latin1_General_CS_AS,
— Inserting data
INSERT INTO [TestCollationTable] VALUES (‘Ahmed’, ‘Ahmed’);
INSERT INTO [TestCollationTable] VALUES (‘ahmed’, ‘ahmed’);
— If you run the following script, you will see the final result (effect) of collation:
ORDER BY [FirstName_CI];
ORDER BY [FirstName_CS];