Motivation
When we compact data files, row ids change, so we have to update the index files on every compaction. Updating the index files invalidates them in the cache, which degrades query performance. If row ids were stable when rows were moved, this would not happen.
Scope
This epic makes row ids stable after moving. It does not make them stable after updates. Rows that are updated will be deleted and appended under new ids.
A future epic will cover "primary keys", at which point row ids will be stable after updates as well as moves. That is kept out of scope for now to keep the workload of this epic manageable.
Design
In very simple terms:
Add row ids as auto-incrementing u64 ids. The manifest will track max_row_id and assign new ids in a process similar to how fragment ids are assigned during the commit loop.
Each fragment metadata will contain a small row id index. This index maps from row id to row address. (Row address is what we currently call _rowid.) In most cases, such as after an append, this will be a simple range of values (max_row_id + 1)..(physical_rows + max_row_id + 1).
Deletion files will be superseded by tombstones contained in the row id index. This cuts down on the total number of files to manage.
A new feature flag will be introduced to make sure older readers don't try to interpret these new row ids.
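The design points above can be illustrated with a small sketch. It is not the proposed implementation: the names `assign_row_ids`, `RowIdIndex`, and `compact` are hypothetical, a row address is modeled as a `(fragment_id, offset)` tuple purely for readability, and the index is a plain dictionary rather than a compact range encoding. It only shows how ids could be handed out from max_row_id, how tombstones could live in the index, and why a row id still resolves after compaction moves the row.

```python
from typing import Iterable, List, Optional, Set, Tuple


def assign_row_ids(max_row_id: int, physical_rows: int) -> Tuple[range, int]:
    """Hand out ids for an append, using the range from the design:
    (max_row_id + 1)..(physical_rows + max_row_id + 1), end exclusive.
    Returns the new ids and the updated max_row_id."""
    ids = range(max_row_id + 1, physical_rows + max_row_id + 1)
    return ids, max_row_id + physical_rows


class RowIdIndex:
    """Hypothetical per-fragment index: row id -> row address."""

    def __init__(self, fragment_id: int, row_ids: Iterable[int]) -> None:
        self.fragment_id = fragment_id
        # Offsets follow the order in which rows are stored in the fragment.
        self.offsets = {row_id: off for off, row_id in enumerate(row_ids)}
        # Tombstones stand in for a separate deletion file.
        self.tombstones: Set[int] = set()

    def delete(self, row_id: int) -> None:
        if row_id in self.offsets:
            self.tombstones.add(row_id)

    def address(self, row_id: int) -> Optional[Tuple[int, int]]:
        """Return (fragment_id, offset), or None if absent or deleted."""
        if row_id in self.tombstones or row_id not in self.offsets:
            return None
        return (self.fragment_id, self.offsets[row_id])


def compact(indices: List[RowIdIndex], new_fragment_id: int) -> RowIdIndex:
    """Move live rows into one new fragment, keeping their row ids."""
    live = [
        row_id
        for index in indices
        for row_id in index.offsets
        if row_id not in index.tombstones
    ]
    return RowIdIndex(new_fragment_id, live)


# Two appends, one delete, then a compaction.
ids_a, max_id = assign_row_ids(max_row_id=9, physical_rows=3)
frag_a = RowIdIndex(0, ids_a)
ids_b, max_id = assign_row_ids(max_id, physical_rows=2)
frag_b = RowIdIndex(1, ids_b)
frag_a.delete(11)
merged = compact([frag_a, frag_b], new_fragment_id=2)
```

In this sketch, row id 12 lives at address `(0, 2)` before compaction and `(2, 1)` afterwards. Only the row address changes, so anything keyed by row id would not need rewriting.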
Plan