Definition

Both ANSI and ISO published a revision of the SQL standard in 2011, referred to as SQL:2011 [1]. An important new feature of SQL:2011 is the ability to create and manipulate tables whose rows are associated with one or more temporal period definitions.

Historical Background

The first attempt at extending the SQL standard for temporal data support was made in 1995. A set of proposals incorporating the language extensions based on (but not identical to) TSQL2 [2] was submitted to both ANSI and ANSI SQL committees for consideration. However, this effort was abandoned in 2001 because of various reasons. A new attempt at standardizing temporal extensions with a different set of constructs was made in 2008. These language extensions are now part of SQL:2011.

Foundations

The literature on temporal databases recognizes two dimensions of time for temporal data support, e.g., see [3]:

  • Valid time, the time period during which some information is regarded as correctly reflecting the reality

  • Transaction time, the time period during which that information is recorded in the database

In SQL:2011, application-time period tables provide support for valid time, while system-versioned tables provide support for transaction time.

Period Definitions

A major new extension in SQL:2011 is the notion of a period definition. A period definition is a named table component, identifying a pair of columns that capture the period start and the period end time. A period definition can be either an application-time period definition or a system-time period definition. Currently, users are allowed to define at most one application-time period and at most one system-time period per table. Period definitions are only allowed inside a CREATE TABLE statement or an ALTER TABLE statement.

For an application-period table, users get to pick the name of the application-time period, while for a system-versioned table, the SQL standard prescribes the name “SYSTEM_TIME” for the name of system period. For both kinds of tables, users get to pick the names of columns that act as the start and end columns of the period, and the data types of both columns are required to be the same. For an application-period table, the data types of the period start and end columns must be either DATE or a timestamp type, while for a system-versioned table, the data types of the period start and end columns must be a timestamp type with a sub-second precision of at least 6 digits.

SQL:2011 has adopted a closed-open period model, i.e., a period represents all times starting from and including the start time, continuing to but excluding the end time. For a given row, the period end time must be greater than its period start time; in fact, declaring a period definition in a table implies a table constraint that enforces this property.

Application-Time Period Tables

Any table that contains a period definition with a user-defined name is an application-time period table. For example:

CREATE TABLE Emp(ENo INTEGER,EStart DATE,EEnd DATE,EDept INTEGER,PERIOD FOR EPeriod (EStart, EEnd))

Users can pick any name they want for the name of the period. The user is in charge of setting the values of application-time period start and end columns of each row, and the user is free to assign any time values, either in the past, current, or future, for the start and end columns as long as the value of end column is greater than the value of start column. The user is also free to update the values of start and end columns to correct for errors.

The conventional INSERT statement provides sufficient support for setting the initial values of application-time period start and end columns. For example, the following INSERT statement inserts one row into the Emp table:

INSERT INTO EmpVALUES (22217,DATE ‘2010-01-01’,DATE '2011-11-12', 3)

The resulting table looks as shown below (assuming it was empty before):

Eno

EStart

EEnd

EDept

22217

2010-01-01

2011-11-12

3

The conventional UPDATE statement can be used to modify the rows of application-time period tables (including the application-time period start and end times). Similarly, the conventional DELETE statement can be used to delete rows of application-time period tables.

A new feature in SQL:2011 is the ability to specify updates or deletes that are effective within a specified period. This is provided by a syntactic extension to both UPDATE and DELETE statements that lets users specify the period of interest. For example, the following UPDATE statement changes the department of the employee whose number is 22217 from 3 to 4 for the period from Feb. 3, 2011, to Sept. 10, 2011:

UPDATE EmpFOR PORTION OF EPeriodFROM DATE '2011-02-03'TO DATE '2011-09-10'SET EDept = 4WHERE ENo = 22217

Let P be the period from Feb. 3, 2011, to Sept. 10, 2011. To execute this statement, the DBMS locates all rows whose application-time period overlaps P. Any overlapping row whose application-time period is contained in P is simply updated. If an overlapping row whose application-time period has a portion either strictly before or strictly after P, then that row gets split into two or three contiguous rows depending on the extent of overlap, and of these, the row whose application-time period is contained in P is updated. For example, suppose the following is the only overlapping row:

ENo

EStart

EEnd

EDept

22217

2010-01-01

2011-11-12

3

Since the application-time period of the above row extends beyond P at both ends, the result of the UPDATE statement will be these three rows:

ENo

EStart

EEnd

EDept

22217

2010-01-01

2011-02-03

3

22217

2011-02-03

2011-09-10

4

22217

2011-09-10

2011-11-12

3

The DELETE statement is similarly enhanced with FOR PORTION OF syntax to facilitate deletes that are only effective within a specified period. For example, the following DELETE statement removes the employee whose number is 22217 for the period from Feb. 3, 2011, to Sept. 10, 2011:

DELETE EmpFOR PORTION OF EPeriodFROM DATE '2011-02-03'TO DATE '2011-09-10'WHERE ENo = 22217

Similar to the UPDATE example, any row whose application-time period is contained in P from Feb. 3, 2011, to Sept. 10, 2011, is simply deleted. If an overlapping row whose application-time period has a portion either strictly before or strictly after P, then that row gets split into two or three contiguous rows, and of these, the row whose application-time period is contained in P is deleted. For example, suppose the following is the only overlapping row:

ENo

EStart

EEnd

EDept

22217

2010-01-01

2011-11-12

3

The result of the statement will be these two rows:

ENo

EStart

EEnd

EDept

22217

2010-01-01

2011-02-03

3

22217

2011-09-10

2011-11-12

3

Constraints on Application-Time Period Tables

Constraints on application-time period tables need additional capabilities to account for the temporal aspect of the data. For example, unlike regular tables, unique constraints defined on application-time period tables need to allow for multiple rows with the same key value as long as their application-time periods do not overlap. SQL:2011 provides additional syntax for defining the primary and unique key constraints for application-time period tables to permit such keys, as shown below:

ALTER TABLE EmpADD PRIMARY KEY (ENo,EPeriod WITHOUT OVERLAPS)

Similarly, unlike regular tables, referential constraints defined on application-time period tables need to check the application-time period of every row in a child table is contained in the application-time period of the corresponding matching row (or the union of application-time periods of two or more contiguous matching rows) in the parent table. Again, SQL:2011 provides additional syntax for defining the referential constraints for application-time period tables to check for such a condition, as shown below:

ALTER TABLE EmpADD FOREIGN KEY(Edept, PERIOD EPeriod)REFERENCES Dept(DNo, PERIOD DPeriod)

Querying Application-Time Period Tables

In SQL:2011, application-time period tables can be queried using the regular query syntax. For example, to retrieve the department where the employee 22217 worked as of January 2, 2011, one can express the query as

SELECT Name, EdeptFROM EmpWHERE ENo = 22217AND EStart <= DATE '2011-01-02'AND EEnd > DATE '2011-01-02'

A simpler way to formulate the above query would be to employ one of the period predicates provided in SQL:2011 for expressing conditions involving periods: CONTAINS, OVERLAPS, EQUALS, PRECEDES, SUCCEEDS, IMMEDIATELY PRECEDES, and IMMEDIATELY SUCCEEDS. For example, the above query could also be expressed using the CONTAINS predicate, as shown below:

SELECT Ename, EdeptFROM EmpWHERE ENo = 22217 ANDEPeriod CONTAINS DATE '2011-01-02'

If one wanted to know all the departments where the employee whose number is 22217 worked during the period from January 1, 2010, to January 1, 2011, one could formulate the query as

SELECT Ename, EdeptFROM EmpWHERE ENo = 22217AND EStart < DATE '2011-01-01'AND EEnd > DATE '2010-01-01'

Alternatively, the same query could be expressed using the OVERLAPS predicate as

SELECT Ename, EdeptFROM EmpWHERE ENo = 22217 ANDEPeriod OVERLAPSPERIOD (DATE '2010-01-01',DATE '2011-01-01')

Period predicates are functionally similar to the well-known Allen’s interval operators [4], but there are some differences between the two. For example, unlike Allen’s overlaps operator, SQL:2011’s OVERLAPS predicate is a true test of period overlap. Similarly, unlike Allen’s contains operator, SQL:2011’s CONTAINS predicate is a true test of period containment. The reader is referred to [5] for further discussion on the correspondence between SQL:2011’s period predicates and Allen’s operators.

System-Versioned Tables

Any table that contains a period definition with the standard-specified name, SYSTEM_TIME, and includes the keywords WITH SYSTEM VERSIONING in its definition is a system-versioned table. For example:

CREATE TABLE EmpENo INTEGER,Sys_start TIMESTAMP(12) GENERATE ALWAYSAS ROW START, Sys_end TIMESTAMP(12)GENERATE ALWAYS AS ROW END, ENameVARCHAR(30), PERIOD FOR SYSTEM_TIME(Sys_start, Sys_end) ) WITH SYSTEMVERSIONING

Users are allowed to define at most one period named SYSTEM_TIME for a given table. Unlike application-time period tables, users are not allowed to assign values for the start and end columns of rows on insertion, nor to update the start and end times of the rows once they are inserted. The values for these columns are chosen by the system. This is the reason why the definitions of Sys_start or Sys_end columns in the above table definition must include the keywords GENERATED ALWAYS.

Whenever a row is inserted into a system-versioned table, the system sets the value of the start time to a special value, called transaction timestamp, and the value of end column to the highest value of the column’s data type. Every transaction is assumed to be associated with a transaction timestamp. SQL:2011 leaves it up to SQL-implementations to pick an appropriate value for the transaction timestamp of a transaction, but it does require the transaction timestamp of a transaction to remain fixed during the entire transaction.

Similar to application-time periods, system-time periods use closed-open period model. At any given point in time, a row in a system-versioned table is regarded as current system row if the system-time period of that row contains the current time. A row that is not a current system row is regarded as a historical system row.

INSERT into a system-versioned table automatically sets the value of Sys_start column to the transaction timestamp and the value of Sys_end column to the highest value of the column’s data type. For example, assume the following INSERT statement executed in a transaction whose transaction timestamp is 2012-01-01 09:00:00 (the sub-second portion of the timestamp value is not shown in any of the examples for readability):

INSERT INTO Emp (ENo, EName)VALUES (22217, 'Joe')

The resulting table looks as shown below (assuming it was empty before):

ENo

Sys_Start

Sys_End

EName

22217

2012-01-01 09:00:00

9999-12-31 23:59:59

Joe

UPDATE and DELETE on system-versioned tables only operate on current system rows, and they result in the automatic insertion of a historical system row for every current system row that is updated or deleted. Users are not allowed to update or delete historical system rows. Users are also not allowed to modify the system-time period start or the end time of both current system rows and historical system rows.

An UPDATE statement on a system-versioned table first inserts a copy of the old row with its system-time period end time set to the transaction timestamp, indicating that the row ceased to be current as of the transaction timestamp. It then updates the row while changing its system-period start time to the transaction timestamp, indicating that the updated row be the current system row as of the transaction timestamp. For example, suppose the content of the Emp table is as shown above. The following UPDATE statement changes the name of the employee whose number is 22217 from Joe to Tom effective from the transaction timestamp of the transaction in which the UPDATE statement was executed:

UPDATE EmpSET EName = 'Tom'WHERE ENo = 22217

A historical system row that corresponds to the state of the row prior to the update is first inserted, and then the update is performed. Assuming the above statement is executed in a transaction with the transaction timestamp 2012-02-03 10:00:00, the final result will be these two rows:

ENo

Sys_Start

Sys_End

EName

22217

2012-01-01 09:00:00

2012-02-03 10:00:00

Joe

22217

2012-02-03 10:00:00

9999-12-31 23:59:59

Tom

In this example, the row whose name is Tom is the current system row, while the row whose name is Joe is the historical system row. Note that historical system rows created as a result of sequence of updates for a given row form one contiguous chain without any gap between their system-time periods.

A DELETE statement on a system-versioned table does not actually delete the qualifying rows; instead it changes the system-time period end time of those rows to the transaction timestamp, indicating that those rows ceased to be current as of the transaction timestamp. For example, suppose the content of the Emp table is as shown above. The following DELETE statement simply changes the system-time period end time of the current system row for the employee 22217 to the transaction timestamp of the transaction in which the DELETE statement was executed:

DELETE FROM EmpWHERE ENo = 22217

Assuming the above statement is executed in a transaction with the transaction timestamp 2012-06-01 00:00:00, the final result will be the following rows:

ENo

Sys_Start

Sys_End

EName

22217

2012-01-01 09:00:00

2012-02-03 10:00:00

Joe

22217

2012-02-03 10:00:00

2012-06-01 00:00:00

Tom

In contrast to the application-time period tables, FOR PORTION OF SYSTEM_TIME is not needed (and hence not allowed) for the UPDATE and DELETE statements on system-versioned tables.

Constraints on System-Versioned Tables

Unlike application-time period tables, the definition of constraints on system-versioned tables need no additional capabilities to account for the temporal aspect of the data they capture. This is because constraints on system-versioned tables need only be enforced on the current system rows. Historical system rows in a system-versioned table form immutable snapshots of the past. Any constraints that were in effect when a historical system row was created would have already been checked when that row was a current system row, so there is no need to enforce constraints on historical system rows.

For example, the following ALTER TABLE statement specifies ENo column as the primary key of Emp table:

ALTER TABLE EmpADD PRIMARY KEY (ENo)

The above constraint ensures there exists exactly one current system row with a given ENo value.

Similarly, the following ALTER TABLE statement specifies a referential constraint between Emp and Dept tables:

ALTER TABLE EmpADD FOREIGN KEY (Edept)REFERENCES Dept (DNo)

The above constraint is again enforced only on the current system rows of Emp and Dept tables.

Querying System-Versioned Tables

SQL:2011 provides three syntactic extensions for retrieving rows from system-versioned tables. The first extension is the FOR SYSTEM_TIME AS OF syntax that is useful for querying the table content as of a specified point in time. For example, the following query retrieves the rows of Emp that were current as of Jan. 2, 2011:

SELECT ENo,EName,Sys_Start,Sys_EndFROM Emp FOR SYSTEM_TIME AS OFTIMESTAMP '2011-01-02 00:00:00'

The second and third extensions allow for retrieving the content of a system-versioned table between any two points in time. The following query returns all rows that were current starting from TIMESTAMP '2011-01-02 00:00:00’up to (but not including) TIMEST AMP '2011-12-31 00:00:00':

SELECT ENo,EName,Sys_Start,Sys_EndFROM Emp FOR SYSTEM_TIME FROMTIMESTAMP '2011-01-02 00:00:00’TOTIMESTAMP '2011-12-31 00:00:00'

In contrast, the following query returns all rows that were current starting from TIMESTAMP '2011-01-02 00:00:00’up to (and including) TIMESTAMP '2011-12-31 00:00:00':

SELECT ENo,EName,Sys_Start,Sys_EndFROM Emp FOR SYSTEM_TIME BETWEENTIMESTAMP '2011-01-02 00:00:00'ANDTIMESTAMP '2011-12-31 00:00:00'

If a query on system-versioned tables does not specify any of the above three syntactic options, then that query is assumed to specify FOR SYSTEM_TIME AS OF CURRENT_TIMESTAMP by default, and the query returns only the current system rows as the result. For example, the following query returns only the current system rows of Emp table:

SELECT ENo,EName,Sys_Start,Sys_EndFROM Emp

Bitemporal Tables

In SQL:2011, a table that contains both the system-time period definition and an application-time period definition and includes the keywords WITH SYSTEM VERSIONING is essentially a bitemporal table. Such tables combine the capabilities of both system-versioned and application-time period tables. Rows in such tables are associated with both the system-time period and the application-time period. Queries on such tables can specify predicates on both application-time periods and system-time periods to qualify rows that will be returned in the query result.

Cross-References