Cosimo: Rating Storage

Deck files contain only prompt/response pairs. Review history is stored separately so deck files stay easy to edit by hand.

Decision

Use SQLite for review history.

The review history is event data: each exposure records which datum was shown, when it was shown, the user's pre-response guess when one was provided, and the user's post-response feedback. SQLite is a good fit because it gives us durable append behaviour, indexes for scheduling queries, validation constraints, and safe updates without adding punctuation noise to the editable deck format.

Flat File Alternative

An append-only flat file such as CSV, TSV, or line-delimited JSON would be easy to inspect and back up. It would also avoid a database dependency.

The tradeoffs are poor for this data:

CSV and JSON add escaping and punctuation noise around numeric review events.
Querying review history by datum and time would require scanning the whole file or building a second index.
Updates for future scheduling metadata would need full-file rewrites or a compaction mechanism.
Concurrent writes are easier to corrupt than SQLite transactions.

Flat files remain appropriate for decks because humans edit prompt/response content. SQLite is better for review events because the application writes and queries them.

Schema

Current schema version: 15

CREATE TABLE cosimo_schema_version (
    version INTEGER PRIMARY KEY
);

CREATE TABLE review_events (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    datum_key TEXT NOT NULL,
    reviewed_at_unix_seconds INTEGER NOT NULL,
    prompt_guess INTEGER NOT NULL CHECK (prompt_guess BETWEEN 0 AND 4),
    recall_feedback INTEGER NOT NULL CHECK (recall_feedback BETWEEN 1 AND 4)
);

CREATE INDEX review_events_datum_time
    ON review_events (datum_key, reviewed_at_unix_seconds, id);

CREATE TABLE card_schedules (
    datum_key TEXT PRIMARY KEY,
    memory_state TEXT NOT NULL CHECK (
        memory_state IN ('new', 'learning', 'review', 'relearning')
    ),
    due_at_unix_seconds INTEGER NOT NULL,
    stability_days REAL NOT NULL CHECK (stability_days >= 0),
    difficulty REAL NOT NULL CHECK (difficulty >= 0),
    elapsed_days INTEGER NOT NULL CHECK (elapsed_days >= 0),
    scheduled_days INTEGER NOT NULL CHECK (scheduled_days >= 0),
    interval_seconds INTEGER NOT NULL CHECK (interval_seconds >= 0),
    repetitions INTEGER NOT NULL CHECK (repetitions >= 0),
    lapses INTEGER NOT NULL CHECK (lapses >= 0),
    last_reviewed_at_unix_seconds INTEGER,
    last_recall_feedback INTEGER CHECK (
        last_recall_feedback IS NULL OR last_recall_feedback BETWEEN 1 AND 4
    ),
    scheduler_algorithm TEXT NOT NULL DEFAULT 'legacy_fsrs_like_v1'
        CHECK (scheduler_algorithm IN ('legacy_fsrs_like_v1', 'fsrs6')),
    learning_step_index INTEGER CHECK (
        learning_step_index IS NULL OR learning_step_index >= 0
    ),
    CHECK (
        (memory_state IN ('learning', 'relearning')) =
        (learning_step_index IS NOT NULL)
    )
);

CREATE TABLE card_states (
    datum_key TEXT NOT NULL,
    state_type TEXT NOT NULL CHECK (
        state_type IN ('deferred', 'flag', 'suspended')
    ),
    ended_at_unix_seconds INTEGER CHECK (
        ended_at_unix_seconds IS NULL OR ended_at_unix_seconds >= 0
    ),
    PRIMARY KEY (datum_key, state_type)
);

CREATE INDEX card_states_active_type
    ON card_states (state_type, ended_at_unix_seconds, datum_key);

CREATE TABLE card_first_seen (
    datum_key TEXT PRIMARY KEY,
    first_seen_at_unix_seconds INTEGER NOT NULL CHECK (
        first_seen_at_unix_seconds >= 0
    )
);

CREATE INDEX card_first_seen_time
    ON card_first_seen (first_seen_at_unix_seconds, datum_key);

CREATE TABLE card_reversals (
    datum_key TEXT PRIMARY KEY,
    created_by TEXT NOT NULL DEFAULT 'manual' CHECK (created_by IN ('manual', 'bulk')),
    bulk_batch_id INTEGER,
    created_at_unix_seconds INTEGER,
    CHECK ((created_by = 'bulk') = (bulk_batch_id IS NOT NULL))
);

CREATE TABLE card_reversal_batches (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    created_at_unix_seconds INTEGER NOT NULL
);

CREATE TABLE review_passes (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    started_at_unix_seconds INTEGER NOT NULL,
    finished_at_unix_seconds INTEGER NOT NULL,
    was_scheduled INTEGER NOT NULL CHECK (was_scheduled IN (0, 1)),
    ended_early INTEGER NOT NULL CHECK (ended_early IN (0, 1)),
    initial_questions INTEGER NOT NULL CHECK (initial_questions >= 0),
    total_questions INTEGER NOT NULL CHECK (total_questions >= 0),
    unique_cards_reviewed INTEGER NOT NULL CHECK (unique_cards_reviewed >= 0),
    CHECK (finished_at_unix_seconds >= started_at_unix_seconds),
    CHECK (unique_cards_reviewed <= total_questions)
);

CREATE INDEX review_passes_started_at
    ON review_passes (started_at_unix_seconds, id);

CREATE TABLE review_pass_items (
    pass_id INTEGER NOT NULL,
    review_event_id INTEGER NOT NULL,
    position INTEGER NOT NULL CHECK (position > 0),
    was_repeat INTEGER NOT NULL CHECK (was_repeat IN (0, 1)),
    PRIMARY KEY (pass_id, position),
    FOREIGN KEY (pass_id) REFERENCES review_passes(id) ON DELETE CASCADE,
    FOREIGN KEY (review_event_id) REFERENCES review_events(id) ON DELETE CASCADE,
    UNIQUE (pass_id, review_event_id)
);

CREATE INDEX review_pass_items_event
    ON review_pass_items (review_event_id);

datum_key normally uses the stable card ID stored in the deck terminator: =1= in a deck file becomes card:1 in the review database. Legacy review events may still use an older content-derived key such as fnv1a64:...; when a legacy deck is opened and assigned card IDs, matching old review events are migrated to the new card key.

Users who add new cards by hand should use the unnumbered === terminator rather than choosing numeric IDs themselves. Cosimo then assigns IDs after checking both the deck and dormant review-storage rows. Existing numeric terminators should be preserved when editing existing cards; manually reusing a number can deliberately reconnect content to old history and should be treated as a repair operation, not normal editing. Deck metadata may contain next_card_id, which records the next ID Cosimo should consider even after dormant database rows are scrubbed. Scrub updates that high-water mark before deleting inactive rows, so later new cards do not reuse IDs solely because their old database rows were cleaned up.

Generated reverse cards are not stored in the deck file. card_reversals stores the parent card keys that should have a generated reverse direction. The generated reverse card then uses a child key of the form card:1:reverse. Review events, schedules, and suspension state for the reverse direction are stored under that child key, separately from the parent card:1 key. Removing the generated reverse card deletes only the card_reversals row; any existing reverse review events and schedule rows remain dormant and are used again if the parent is made reversible later. Removing the parent card removes the generated reverse card from the active deck view; the old reverse rows remain in the database but are ignored unless a deck backup containing the original parent card ID is restored.

created_by, bulk_batch_id, and card_reversal_batches record whether an active reverse card was created manually or by a bulk reversal operation. This lets Undo Last Bulk Reversal remove only the reverse cards from the latest bulk batch while preserving manual reverse cards and older bulk batches. Rows created by older schema versions are treated as manual reversals because they were not created by a recorded batch.

The database does not store prompt or response text. Those remain in the deck file.

card_first_seen stores the first time this review database saw a physical deck card. It is used for dashboard intake and deck-flow analytics, not for scheduling. The timestamp is not guaranteed to be the user's original card creation date: existing cards are recorded the first time a schema-version-13 or later database is opened or the dashboard is generated. Generated reverse cards are not counted as deck-card intake.

card_states stores sparse per-study-item state that is not part of the deck text. Current state types are suspended, flag, and deferred. Suspension and flag rows are active when ended_at_unix_seconds = NULL; clearing one of those states records an end time. Deferral rows are active while ended_at_unix_seconds is in the future, so the end time is the point at which the card becomes available again. Re-enabling the same state updates the same row, so the table stores current state plus the last cleared marker rather than a full event log. Generated reverse cards use their own child datum key, so a parent card and its reverse direction can be suspended, flagged, or deferred independently. Because this is current state rather than history, Cosimo removes expired deferrals and inactive cleared state rows when opening a deck, before starting a scheduled or forced review pass, and during database scrub or space-reclaim actions.

review_passes stores pass-level data for later analysis: when the pass started and finished, whether it was scheduled, whether it ended early, the initial number of questions, the final number of answered questions, and the number of unique cards reviewed. review_pass_items links each pass to the existing review_events rows in pass order and preserves whether an item was a same-pass repeat in older review data. This keeps per-card ratings normalised in review_events while preserving the session shape needed for pass-level analytics.

Deferring a card before revealing its answer does not create a review event and does not count as an answered question. Current review-pass rows do not store a separate deferred-card count; the active deferral itself is stored as card state.

Rating Values

prompt_guess stores the optional pre-response confidence rating:

0 Not answered: the user revealed the response without recording confidence.
1 Don't know.
2 Probably don't know.
3 Probably know.
4 Certainly know.

Existing schema-version-1 databases created before 0 was allowed are normalised in place when opened. The table is rebuilt with the wider prompt_guess check constraint and existing review events are copied unchanged. The schema version then advances linearly:

Version 1 to 2 adds card_suspensions.
Version 2 to 3 adds review_passes and review_pass_items.
Version 3 to 4 adds interval_seconds to card_schedules; existing rows are initialised from scheduled_days * 86400.
Version 4 to 5 adds card_reversals.
Version 5 to 6 adds card_reversal_batches and the created_by, bulk_batch_id, and created_at_unix_seconds metadata on card_reversals; existing reversible rows are treated as manual reversals.
Version 6 to 7 adds card_reversal_exclusions, which stored parent cards that should not be made reversible by bulk operations in older betas.
Version 7 to 8 adds card_states, moves existing card_suspensions rows into active suspended state rows, and drops the old suspension table.
Version 8 to 9 allows deferred rows in card_states.
Version 9 to 10 retires card_reversal_exclusions. During application load, older rows are written into the deck as non-reversible card markers and the legacy table is then removed.
Version 10 to 11 corrects stored lapse aggregates so only Again on a review-state card counts as a scheduled lapse. The migration recalculates lapses and the final learning/review state from stored review events, but deliberately does not recalculate due dates, difficulty, stability, or intervals.
Version 11 to 12 adds scheduler metadata to card_schedules: a scheduler_algorithm marker and a nullable learning_step_index. Existing schedules are marked as legacy_fsrs_like_v1; learning and relearning rows receive step index 0, while new and review rows keep no step index. The column also accepts fsrs6 for schedules created by the current scheduler.
Version 12 to 13 adds card_first_seen, which records when the review database first observes each physical deck card for dashboard deck-flow analytics.
Version 13 to 14 recomputes card_schedules from stored review_events using the current FSRS-6 scheduler. Review events are unchanged. Schedule rows that cannot be derived from review events are removed, because card_schedules is derived scheduler state rather than the authoritative review history.
Version 14 to 15 removes empty review_passes rows that have no review_pass_items. A pass without an answered question is not study activity; review events and ordinary linked pass rows are unchanged.

recall_feedback stores the post-response recall rating:

1 Again.
2 Hard.
3 Good.
4 Easy.

Recall feedback is still required before a review event is recorded. Revealing without a confidence rating therefore creates a normal review event with prompt_guess = 0 once the user supplies recall feedback.

Exam mode currently does not record review events and does not update schedules. It collects typed answers, grades exact and accepted non-exact answers for the exam summary, and leaves the review database unchanged. When the exam setup's Exact option is selected, non-exact answers are counted incorrect without manual grading and are shown only for answer review. Timed exam limits are summary/report metadata only: when time expires, the current answer text is submitted if present, but no review event, schedule, or review pass row is written. This keeps exam mode as an assessment/reporting flow rather than another scheduling input.

Integrity Checks

Cosimo runs PRAGMA quick_check whenever it opens a review database. If SQLite reports damage or cannot read the file as a database, opening the deck fails before Cosimo applies schema setup, migrates legacy card IDs, or writes deck backups.

After opening and migrating the database, Cosimo also performs higher-level consistency checks against the active deck. It validates stored rating values and schedule memory states through the Rust typed APIs, checks that stored datum keys have the expected shape, runs SQLite's foreign-key check for review pass links, rejects duplicate active deck keys, and rejects active reversible cards whose responses are no longer unique. Dormant history for removed cards or removed reverse directions is allowed to remain in the database so restoring a deck backup or re-enabling a reverse card can recover that history.

Scheduling Use

The scheduler stores FSRS-style memory state in card_schedules: difficulty, stability, due date, scheduled interval, repetitions, lapses, scheduler algorithm marker, optional learning/relearning step index, and whether the card is new, learning, review, or relearning. scheduled_days is the coarse day count used for the next due date after any deterministic interval fuzzing has been applied. interval_seconds is the actual interval used with the last review time to produce the next due timestamp, so learning and relearning steps can use sub-day intervals without encoding fractional days. New schedules use the current FSRS-6 scheduler: 1-minute and 10-minute learning steps for new cards, and a 10-minute relearning step after a review-state lapse. Review-state intervals of sufficient length are fuzzed deterministically from stable card/review data, bounded by elapsed days and the maximum interval; learning and relearning steps are not fuzzed. Before calculating a new FSRS-6 state, Cosimo clamps existing stored stability and difficulty to the reference FSRS bounds, so out-of-range old or manually edited schedule rows cannot feed unbounded values into the next update. The scheduler_algorithm column records whether a stored row was produced by the older legacy_fsrs_like_v1 scheduler or by the current fsrs6 scheduler. The old scheduler is no longer available as executable scheduling code; the legacy marker is retained only so old stored rows can be recognised. Schema version 14 replays stored review events and rewrites derived schedule rows with the current FSRS-6 scheduler, so ordinary migrated databases should no longer contain mixed legacy/current schedule semantics. On ordinary review writes, Cosimo still replays the card's stored review events under FSRS-6 before applying the new answer, so the stored schedule remains consistent with the review history. learning_step_index is present only for learning and relearning cards.

lapses follows the FSRS/Anki meaning: it is incremented only when a card that was in review state is marked Again and moves into relearning. Again while new, learning, or already relearning is stored as a review event but does not increment this counter.

The four recall ratings are state-dependent scheduling inputs:

On new and learning cards, Again restarts the first learning step, Hard keeps the card in the current short-step path, Good advances to the next learning step or graduates at the final step, and Easy graduates immediately.
On review-state cards, Again is the lapse transition into relearning, Hard remains a successful recall but produces a cautious interval, Good produces the ordinary next review interval, and Easy produces a longer interval.
On relearning cards, Again repeats the relearning step without adding another lapse, Hard keeps the card in relearning for another short step, and Good or Easy return the card to review state.

Suspension is stored in card_states with state_type = 'suspended'. A suspended card remains in the deck and keeps its review history, but Cosimo excludes it from scheduled passes, forced full-deck passes, and status-bar due/next-review calculations until it is unsuspended.

Flagging is stored in card_states with state_type = 'flag'. A flagged card remains fully reviewable and schedulable. The flag is only a user-visible marker for later attention, shown in the ready list and exposed through the flagged card filter. Parent and generated reverse cards can be flagged independently.

Deferral is stored in card_states with state_type = 'deferred' and a future ended_at_unix_seconds. A deferred card remains in the deck and keeps its stored schedule, but Cosimo excludes it from scheduled passes, forced passes, ready-screen due counts, and status-bar due/next-review calculations until the deferral end time is reached. Clearing a deferral records the current time as the end time. Parent and generated reverse cards can be deferred independently.

Reversibility is stored separately in card_reversals, keyed by the parent card. Generated reverse cards are included in scheduling, forced passes, filters, card details, and learning-dashboard analytics while the parent is marked reversible. Standard and reverse directions can therefore be in different memory states and have different due dates.

Cards that should not be reversed are stored in the deck file, because that is authored knowledge about the card rather than study history. A numbered deck terminator such as =7i= means card 7 is non-reversible; plain =7= makes no statement about reversibility. Marking a card non-reversible removes any active generated reverse card, but old reverse review events and schedules remain dormant. Bulk reversal excludes these cards. Allowing reversal again removes only the deck marker; the card can then be made reversible manually or by a later bulk operation.

Tools -> Scrub Review Database removes dormant database rows that are not reachable from the current deck and its active generated reverse cards. It deletes inactive review events, schedules, inactive card states, state rows for removed cards, reversal rows, unused bulk-reversal batches, and review-pass rows that included scrubbed events. Active standard cards, active generated reverse cards, current suspensions, current flags, and current future deferrals are kept. Non-reversible markers are deck data, not database rows. Expired deferrals and other inactive state rows can also be removed by deck open, review start, scrub, and reclaim-space maintenance. Scrub runs SQLite VACUUM after the delete transaction commits so the database file can reclaim free pages. After a scrub, removed reverse-card or removed-card history cannot be recovered by re-enabling reversal or restoring only the deck file; restore a matching review-database backup if that dormant history is needed.

Tools -> Reclaim Database Space runs SQLite VACUUM without deleting dormant rows. It exists for space reclamation after ordinary database churn or after a scrub has already decided what logical data should be removed.

Cosimo's post-response ratings are FSRS's four review grades:

1 Again
2 Hard
3 Good
4 Easy

The pre-response confidence rating is used by the learning dashboard for confidence calibration intervals, but it does not currently affect FSRS scheduling. Not answered confidence values are therefore included in statistics but do not enter calibration intervals or change the scheduling grade, which comes from recall feedback.

The review event log remains the audit trail. The scheduling table is the current per-card state used to decide when cards are due.