Proposal For Implementation of T591 in Apache Derby

SQL 2003 Specifies Unique Constraint over null able column as an optional feature (T591).

According to the sql spec when a unique constraint is created over null able columns it should consider nulls as unequal ie if one or more part of the unique key is null, it shouldn't cause a constraint violation if the non null parts are equal. The null part of the key ensures that the key is unique irrespective of the value of non null part of the key.

Current behavior
Now Apache Derby support Unique Constraint only over not null columns (mandatory feature E141-2). It is mandatory to declare the columns, participating in a unique constraint as not null, failing to do so results an SQLException (state 42831).
For example

create table table_one (id integer, data varchar (20))
alter table table_one add constraint unique_key unique (i)

will fail because id is not declared as not null.
This proposal is to support creation of unique constraint over null able columns (ie the above example shouldn't cause any exception).

Rational Behind the Proposal

1. Better compliance to the sql specification.
2. Will make it easier to port (to derby) an application which relies on this feature.

Some Examples of the new proposed behavior

1st Insert

2nd Insert

Remark

null, null, 1

null, null,1

no exception

null, 1, null

null, null,1

no exception

1, 1, null

1, 1, null

no exception

null, null, null

null, null, null

no exception

1, 1, 1

1, 1, 1

Unique Constraint Violation


In the above sample there will be a failure (due to unique constraint violation) only if all key parts are non null and have values which are already held by another single record.

Changes from existing behavior


This will be a new feature and will not effect the existing behavior of Unique Constraint. Applications relying on existing behavior will experience no change as the unique constraint of non nullable column will continue to behave the same way.

Currently creation of unique constraint checks if the key parts are null able and fails if one or more key part are null able. After this support is implemented this check will be disabled and constraint creation will succeed for null able columns.

Upgrades

Soft Upgrade

In case of soft upgrades this feature will not be available.

Hard Upgrade

During had upgrade existing backing indexes will be dropped and recreated as almost unique indexes. It will be possible to create unique constraint over null able column in upgraded data base. After hard upgrade user will able to create unique constraint over null able columns and will also be able to drop not null constraints even if the column is part of a unique constraint.

Impact on Test Cases

All the existing test cases which rely on failure to create unique constraint over null able column will fail.
A new set of test cases will be required to test this functionality. A subset of the new test cases will be

1. Inserting duplicate keys after creating unique constraint (with and without nulls in key parts)
2. Creating unique constraint on table with existing data (with and without duplicate nulls).
3. Dropping not null constraint after creating unique constraint before and after inserting keys.
4. null ordering of unique constraints. Test with order by null first/last clause involving columns participating in unique constraint.

5. Changing key parts to null, changing key parts to non null.

Documentation

Manuals discussing unique constraint in Apache Derby will need changes to include unique constraints over null able columns.
Sections affected by this change will be

(PDF manuals English version)

Derby Reference Manual
Page 24
Section explaining not null condition can't be dropped if the column is part of unique constraint or primary key.

Page 37
Description of Unique Constraint a clause need to be added that uniqueness is ensured only in case of non null keys. Keys with null parts may have duplicate values.

Page 59
Constraint clause has common description for Unique Constraint and Primary Key. A new description for Unique Constraint will be required.
Page 60
Backing index section mentions that Unique Constraint creates a backing unique index. If the new backing index is not exposed to database user, Unique constraint will need a proper explanation about what its using as backing index.

Error Codes

Description of following error codes needs to modified.

42831
42Z20
X0Y63



Implementation Specifics



Internally unique constraint will be backed by no unique index but during insert, checking the immediate left and right slot to find duplicate (if there is no null in the new key). If a duplicate key is found it the insert will be rejected.

A new attribute in BTree to tell the insert routine that its not a non unique index but almost unique index. New attribute in BTree will be a persistent attribute so the file system will not be compatible with the older versions and will require a hard upgrade routine to migrate the old indexes to new version.

While creating the constraint on a table with existing records, a merge sort is perform to sort the keys before creating index. A new merge sort class will be required to sort this almost unique index (allowing duplicates only if there is null in it).