sqlite-history/README.md
# sqlite-history
[![PyPI](https://img.shields.io/pypi/v/sqlite-history.svg)](https://pypi.org/project/sqlite-history/)
[![Tests](https://github.com/simonw/sqlite-history/workflows/Test/badge.svg)](https://github.com/simonw/sqlite-history/actions?query=workflow%3ATest)
[![Changelog](https://img.shields.io/github/v/release/simonw/sqlite-history?include_prereleases&label=changelog)](https://github.com/simonw/sqlite-history/releases)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/sqlite-history/blob/main/LICENSE)
Track changes to SQLite tables using triggers
For more on this project: [sqlite-history: tracking changes to SQLite tables using triggers](https://simonwillison.net/2023/Apr/15/sqlite-history/)
## Installation
Install this library using `pip`:
pip install sqlite-history
## Usage
This library can be used to configure triggers on a SQLite database such that any inserts, updates or deletes against a table will have their changes recorded in a separate table.
You can enable history tracking for a table using the `configure_history()` function:
import sqlite_history
import sqlite3
conn = sqlite3.connect("data.db")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT)")
sqlite_history.configure_history(conn, "table1")
Or you can use the CLI interface, available via `python -m sqlite_history`:
python -m sqlite_history data.db table1 [table2 table3 ...]
Use `--all` to configure it for all tables:
python -m sqlite_history data.db --all
## How this works
Given a table with the following schema:
<!-- [[[cog
import cog
create_table_sql = """
CREATE TABLE people (
id INTEGER PRIMARY KEY,
name TEXT,
age INTEGER,
weight REAL
);
""".strip()
cog.out(
"```sql\n{}\n```".format(create_table_sql)
)
]]] -->
```sql
CREATE TABLE people (
id INTEGER PRIMARY KEY,
name TEXT,
age INTEGER,
weight REAL
);
```
<!-- [[[end]]] -->
This library will create a new table called `_people_history` with the following schema:
<!-- [[[cog
from sqlite_history import sql
import sqlite3
db = sqlite3.connect(":memory:")
db.execute(create_table_sql)
columns_and_types = sql.table_columns_and_types(db, "people")
history_schema = sql.history_table_sql("people", columns_and_types)
cog.out(
"```sql\n{}\n```".format(history_schema.strip())
)
]]] -->
```sql
CREATE TABLE _people_history (
_rowid INTEGER,
id INTEGER,
name TEXT,
age INTEGER,
weight REAL,
_version INTEGER,
_updated INTEGER,
_mask INTEGER
);
CREATE INDEX idx_people_history_rowid ON _people_history (_rowid);
```
<!-- [[[end]]] -->
The `_rowid` column references the `rowid` of the row in the original table that is being tracked. If a row has been updated multiple times there will be multiple rows with the same `_rowid` in this table.
The `id`, `name`, `age` and `weight` columns represent the new values assigned to the row when it was updated. These can also be `null`, which might represent no change or might represent the value being set to `null` (hence the `_mask` column).
The `_version` column is a monotonically increasing integer that is incremented each time a row is updated.
The `_updated` column is a timestamp showing when the change was recorded. This is stored in milliseconds since the Unix epoch - to convert that to a human-readable UTC date you can use `strftime('%Y-%m-%d %H:%M:%S', _updated / 1000, 'unixepoch')` in your SQL queries.
The `_mask` column is a bit mask that indicates which columns changed in an update. The bit mask is calculated by adding together the following values:
1: id
2: name
4: age
8: weight
Tables with different schemas will have different `_mask` values.
A `_mask` of `-1` indicates that the row was deleted.
The following triggers are created to populate the `_people_history` table:
<!-- [[[cog
triggers_sql = sql.triggers_sql("people", [c[0] for c in columns_and_types])
cog.out(
"```sql\n{}\n```".format(triggers_sql.strip())
)
]]] -->
```sql
CREATE TRIGGER people_insert_history
AFTER INSERT ON people
BEGIN
INSERT INTO _people_history (_rowid, id, name, age, weight, _version, _updated, _mask)
VALUES (new.rowid, new.id, new.name, new.age, new.weight, 1, cast((julianday('now') - 2440587.5) * 86400 * 1000 as integer), 15);
END;
CREATE TRIGGER people_update_history
AFTER UPDATE ON people
FOR EACH ROW
BEGIN
INSERT INTO _people_history (_rowid, id, name, age, weight, _version, _updated, _mask)
SELECT old.rowid,
CASE WHEN old.id != new.id then new.id else null end,
CASE WHEN old.name != new.name then new.name else null end,
CASE WHEN old.age != new.age then new.age else null end,
CASE WHEN old.weight != new.weight then new.weight else null end,
(SELECT MAX(_version) FROM _people_history WHERE _rowid = old.rowid) + 1,
cast((julianday('now') - 2440587.5) * 86400 * 1000 as integer),
(CASE WHEN old.id != new.id then 1 else 0 end) + (CASE WHEN old.name != new.name then 2 else 0 end) + (CASE WHEN old.age != new.age then 4 else 0 end) + (CASE WHEN old.weight != new.weight then 8 else 0 end)
WHERE old.id != new.id or old.name != new.name or old.age != new.age or old.weight != new.weight;
END;
CREATE TRIGGER people_delete_history
AFTER DELETE ON people
BEGIN
INSERT INTO _people_history (_rowid, id, name, age, weight, _version, _updated, _mask)
VALUES (
old.rowid,
old.id, old.name, old.age, old.weight,
(SELECT COALESCE(MAX(_version), 0) from _people_history WHERE _rowid = old.rowid) + 1,
cast((julianday('now') - 2440587.5) * 86400 * 1000 as integer),
-1
);
END;
```
<!-- [[[end]]] -->
## Development
To contribute to this library, first checkout the code. Then create a new virtual environment:
cd sqlite-history
python -m venv venv
source venv/bin/activate
Now install the dependencies and test dependencies:
pip install -e '.[test]'
To run the tests:
pytest