Saturday, January 23, 2016

SQLite, RowID, INDEX and AUTOINCREMENT

In case you already worked with SQLite you most likely already come across the concept of ROWIDs and their relations to PRIMARY KEY (INDEX). I had a hard time to understand all these topics and I want to share what I learned here (all commands are also available in a GIST).

A ROWID is an identifier so SQLite can identify each row. You can create tables without ROWID but for now simply assume that every table has a hidden field ROWID. You can display it by including the ROWID (or _ROWID_) column in your query. Here’s an example:

CREATE TABLE names_none (name TEXT);

INSERT INTO [names_none] VALUES ("JAMES");
INSERT INTO [names_none] VALUES ("DAVID");
INSERT INTO [names_none] VALUES ("ROBERT");
INSERT INTO [names_none] VALUES ("MICHAEL");
INSERT INTO [names_none] VALUES ("WILLIAM");

SELECT ROWID as RowID, * FROM [names_none];

The result (I’m using DB Browser for SQlite):

clip_image001

So, even this table with only one column has a row ID. You can also use ROWID for normal queries:

select ROWID,* from names_none where ROWID>3;

clip_image002

Because it already has a unique index for each row, SQLite will just reuse the ROWID for and PRIMARY KEY field.

CREATE TABLE names_primarykey (ID INTEGER PRIMARY KEY, name TEXT);

INSERT INTO [names_primarykey] VALUES (NULL, "JAMES");
INSERT INTO [names_primarykey] VALUES (NULL, "DAVID");
INSERT INTO [names_primarykey] VALUES (NULL, "ROBERT");
INSERT INTO [names_primarykey] VALUES (NULL, "MICHAEL");
INSERT INTO [names_primarykey] VALUES (NULL, "WILLIAM");

SELECT ROWID as RowID, * FROM [names_primarykey];

clip_image003

As you can see ROWID and ID are identical. Finally, let’s create a table with PRIMARY KEY AUTOINCREMENT and see if the ROWID is reused again.

CREATE TABLE names_autoincrement (IDA INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT);

INSERT INTO [names_autoincrement] VALUES (NULL, "JAMES");
INSERT INTO [names_autoincrement] VALUES (NULL, "DAVID");
INSERT INTO [names_autoincrement] VALUES (NULL, "ROBERT");
INSERT INTO [names_autoincrement] VALUES (NULL, "MICHAEL");
INSERT INTO [names_autoincrement] VALUES (NULL, "WILLIAM");

SELECT ROWID as RowID, * FROM [names_autoincrement];

clip_image004

SQlite is reusing ROWID again.

Now we will delete some records from the tables to check what happens with the ROWIDs.


delete from [names_none] where name="JAMES" or name="ROBERT";

delete from [names_primarykey] where name="JAMES" or name="ROBERT";

delete from [names_autoincrement] where name="JAMES" or name="ROBERT";


There is no change and all ROWIDs for all tables still look the same.

clip_image005

However, this changes when we compact the database using the VACUUM command. Do the following in DB Browser for SQLite:

  • File -> Close Database (Yes, save changes)
  • File -> Open Database
  • File -> Compact Database, OK

Query the tables again:

SELECT ROWID as RowID, * FROM [names_none];

clip_image006

For the table without index, the ROWIDs has changed, SQLite reorganized them starting with 1. “David” was “2” before and is now “1”. “Michael” was ROWID 4 before, and is now 2.

The other two tables are unaffected, the still have the same ROWIDs.

clip_image007

clip_image008

In other words: SQLite does only change the ROWID if you execute VACUMM and if that table does not have a PRIMARY KEY field.

Right now, there we no difference between using PRIMARY KEY and PRIMARY KEY AUTOINCREMENT. Additionally, the SQLite docs say that AUTOINCREMENT is using extra CPU, memory and disk I/O and should be avoided when possible.

To order to demonstrate the difference between PRIMARY KEY and PRIMARY KEY AUTOINCREMENT, we will insert rows that have a ROWID of 9223372036854775807 which is the largest possible integer supported by SQLite:


INSERT INTO [names_none](ROWID, name) VALUES (9223372036854775807,"XENA");

INSERT INTO [names_primarykey] VALUES (9223372036854775807, "XENA");

INSERT INTO [names_autoincrement] VALUES (9223372036854775807, "XENA");

SQLite will be happy to insert these values into each table:

clip_image009

Now we try to add a row to each table:

INSERT INTO [names_none] VALUES ("ZULU");

clip_image010

SQLite has chosen a random ROWID below 922... , so the insert works.

INSERT INTO [names_primarykey] VALUES (NULL, "ZULU");

clip_image011

The same happened here: Random ROWID below the maximum value and the insert worked.

Finally, we try the same for the PRIMARY KEY AUTOINCREMENT table:

INSERT INTO [names_autoincrement] VALUES (NULL, "ZULU");

This time, we get an error:

database or disk is full: INSERT INTO [names_autoincrement] VALUES (NULL, "ZULU");

The reason for this is that a PRIMARY KEY AUTOINCREMENT tells SQLite to always return a bigger value then the last inserted key (ROWID). Hence, because we already used the largest integer, SQLite can’t go any further and stops with an error.

In other words: You should use AUTOINCREMENT if you depend on that the values are always increasing. For example, if your software generates bills and you print on each bill the unique key, it would be rather confusing for the users that a bill generated Mondays might have a bigger number than the one generated on Thursday.

But, it’s not likely that you every reach the point where your AUTOINCREMENT field stops working: If your key starts with 1 and a table get inserted 100 records per second for 360 days a year, it would take about 2.924.712.087 years (!!) for it to reach 9223372036854775807 value.