Humans more easily remember or learn items when they are studied a few times over a long period of time (spaced presentation), rather than studied repeatedly in a short period of time.

Hermann Ebbinghaus, the German psychologist who introduced the forgetting curve

Forgetting is key to learning. Forgetting is what separates what is useful from what is not. Sometimes, we want to make a piece of knowledge stick in memory despite what our memory thinks. Spaced repetition is an effective solution to this problem, and Anki is the most popular open-source tool to help you apply it.

What You Will Learn
  • The history of SRS algorithms.
  • How the Anki SRS algorithm is implemented.
  • How the Anki SRS algorithm differs from other known implementations.
  • How the Anki SRS algorithm can be improved.

The Anki source code is published under AGPL v3. The code presented in this article has been slightly adapted for learning and readability purposes.

Prerequisites

I assume that you have used Anki before. All code examples use the Python language, mainly because Anki was implemented only in Python until recently, and also because it’s a great language for novice programmers. You don’t need to have a solid understanding of the language to follow the article as the code uses basic Python syntax.

SRS Primer

The role of any Spaced Repetition System (SRS) algorithm is to determine what the user should review now, or, put differently, when the next review must happen for every remembered item. The goal of any implementation is to counteract the effect of the forgetting curve:

As soon as we learn or review a piece of knowledge, the decay begins. SRS algorithms need to determine the optimal interval between two reviews to ensure we haven’t completely forgotten it (memory retention = 0%) while trying to limit as much as possible the number of reviews. In practice, most algorithms use 10% for the forgetting index (= 90% of items are remembered correctly) so that we don’t have too many items to review again while keeping the number of reviews close to optimal.
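To make this concrete, here is a small sketch (my own illustration, not code from Anki) that models the forgetting curve as an exponential decay R = exp(-t/S) and computes the interval after which retention drops to the 90% target:

```python
import math

def retention(t, stability):
    """Retrievability after t days, for a memory of the given stability (in days)."""
    return math.exp(-t / stability)

def interval_for_retention(stability, target=0.9):
    """Days until retention decays to the target (solves exp(-t/S) = target)."""
    return -stability * math.log(target)

# With a stability of 10 days, the next review should happen after ~1.05 days
# to catch the memory just before retention drops below 90%.
print(round(interval_for_retention(10), 2))
```

The exponential model and the stability parameter are assumptions for illustration only; real SRS algorithms estimate the equivalent of S indirectly, from the review history.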

The details of the different algorithms differ greatly between systems. We will start by reviewing the most popular ones in history before introducing Anki’s solution.

The Leitner System (1970-)

While Hermann Ebbinghaus is credited with the initial research behind SRS, the Leitner system is often cited as the first algorithm. This system uses a physical box, as depicted by the following illustration:

Here is a small Python program implementing the logic behind the original Leitner system:

from queue import Queue
import random

CARDS_PER_CM = 5

BOX = [
    Queue(1 * CARDS_PER_CM),
    Queue(2 * CARDS_PER_CM),
    Queue(5 * CARDS_PER_CM),
    Queue(8 * CARDS_PER_CM),
    Queue(14 * CARDS_PER_CM),
]

def add(card, i):
    BOX[i].put(card)
    if BOX[i].full():
        study()

def review(card):
    # Fake the user answering: 3 chances out of 4 to answer correctly
    return random.choice([True, True, True, False])

def study():
    for index, partition in enumerate(BOX):
        if partition.full():
            # Time to review the cards
            print(f"Time to study partition {index + 1}!")
            cards_to_review = []
            while not partition.empty():
                cards_to_review.append(partition.get())
            for card in cards_to_review:
                answer = review(card)
                if answer and index + 1 < len(BOX):
                    # Promote to the next partition
                    new_index = index + 1
                elif not answer:
                    # Demote back to the first partition
                    new_index = 0
                else:
                    # Keep in the same partition
                    new_index = index
                add(card, new_index)

if __name__ == "__main__":
    # Populate the box
    for i in range(140):
        add("New Card", 0)
    # Study
    study()

The original Leitner system cannot really be considered a spaced repetition system. There is no concept of an (optimum) interval. The system simply prioritizes which items to review based on the available physical space in each partition.

An alternative method using three boxes, where incorrect answers are only moved back to the previous box, is often preferred:

Here is a program implementing this new logic:

from queue import Queue
import random
from datetime import datetime, timedelta

A = 0
B = 1
C = 2

SYSTEM = [
    Queue(),  # Box A: every day
    Queue(),  # Box B: every 2-3 days (ex: Tuesday & Friday)
    Queue(),  # Box C: every week (ex: Sunday)
]

def add(card, i):
    """Add a new card in the Leitner system."""
    SYSTEM[i].put(card)

def review(card):
    """Answer a single card."""
    return random.choice([True, True, True, False])

def study_box(number):
    """Review all cards in a box."""
    cards_to_review = []
    while not SYSTEM[number].empty():
        cards_to_review.append(SYSTEM[number].get())
    for card in cards_to_review:
        answer = review(card)
        if answer and number < C:
            # Promote
            new_number = number + 1
        elif not answer and number > A:
            # Demote
            new_number = number - 1
        else:
            # Keep in the same box
            new_number = number
        add(card, new_number)

def study(day):
    """Study the boxes according to the weekday."""
    weekday = day.weekday()
    # Box A is studied every day
    study_box(A)
    if weekday in (1, 4):  # Tuesday & Friday
        study_box(B)
    if weekday == 6:       # Sunday
        study_box(C)

if __name__ == "__main__":
    # Populate the box
    for i in range(140):
        add("New Card", A)
    # Study (over 10 days)
    for i in range(10):
        day = datetime.today() - timedelta(days=10 - i)
        study(day)

The modern Leitner system assigns intervals to the different boxes. Variants exist with more boxes but for this system to be considered a spaced repetition system, we would need a lot more boxes to have longer and longer intervals between reviews.

SM-0 (1985)

Algorithms are precise instructions to carry out. As we have seen with the Leitner system, algorithms don’t have to be executed on computers at all. We can manually perform what a computer does, except we need a lot more time. The first version of the SuperMemo algorithm was also designed to be executed manually.

The SM-0 algorithm (aka the paper-and-pencil SuperMemo method) was published in 1985 and relies on paper books filled with tables.

Although the algorithm was designed to be executed manually, we can still capture the logic using code:

import random
from datetime import date, timedelta
from queue import Queue

# The table of repetition intervals determines the number of days between
# two successive reviews.
# SM-0 applies the factor 1.7 between two successive values.
# Ex: 4, 7, 12, 20, ...
TABLE_REPETITION_INTERVALS = [4]  # First review after 4 days
for i in range(1, 15):  # 15 repetitions max
    prev = TABLE_REPETITION_INTERVALS[i - 1]
    TABLE_REPETITION_INTERVALS.append(round(prev * 1.7))

# The book containing the pages to review
DATABOOK = []

# The book containing the page numbers to review day after day
# NB: We use a sparse dictionary where only dates
# with one or more pages to review are present
SCHEDULE_BOOK = {}  # <date, [page numbers]>

# Fake the user answering the question
def review_question(question, repetitions):
    """
    Randomly answer a question.
    The chance of answering correctly increases with the number of repetitions.
    """
    return random.choice([True] * repetitions * 4 + [False])

# A single page in the data book.
class Page:

    def __init__(self, questions, answers):
        # "Question field" column
        self.questions = questions
        # "Answer field" column
        self.answers = answers
        # "Repetition scores" column
        # => Determined during the review session
        # "Repetitions" column
        self.repetitions = []

    def review(self):
        remaining_questions = Queue()
        for question in self.questions:
            remaining_questions.put(question)
        # Review until there are no more wrongly answered questions
        iteration = 1
        # Memorize the number of wrong answers during the first iteration
        U = 0
        while not remaining_questions.empty():
            questions_to_review = []
            while not remaining_questions.empty():
                questions_to_review.append(remaining_questions.get())
            for question in questions_to_review:
                if not review_question(question, iteration):
                    # Review again
                    remaining_questions.put(question)
                    if iteration == 1:
                        U += 1
            iteration += 1
        self.repetitions.append({
            "No": len(self.repetitions) + 1,
            "Dat": str(date.today()),
            "U": U,
        })

if __name__ == "__main__":
    # Add a new page for illustration purposes
    DATABOOK.append(Page(
        questions=["Question 1", "Question 2", "Question 3"],
        answers=["Answer 1", "Answer 2", "Answer 3"],
    ))
    page_number = len(DATABOOK) - 1

    # Mark the page to review according to the table of repetition intervals
    now = date.today()
    for interval in TABLE_REPETITION_INTERVALS:
        review_date = str(now + timedelta(days=interval))
        if review_date not in SCHEDULE_BOOK:
            SCHEDULE_BOOK[review_date] = []
        print(f"Page {page_number} to review on {review_date}")
        SCHEDULE_BOOK[review_date].append(page_number)

    # Review sessions during one year
    for i in range(365):
        day = str(now + timedelta(days=i))
        if day not in SCHEDULE_BOOK:
            # Nothing to review today
            continue
        # Review each planned page
        for page in SCHEDULE_BOOK[day]:
            print(f"Reviewing page {page} on {day}")
            DATABOOK[page].review()
1. The grade of the answer does not influence the next interval. Difficult items are reviewed again the same day, but the next intervals are fixed, determined with a factor of 1.7 when the page is created.

The SM-0 algorithm can be challenging in practice for different reasons:

  • All items on a given page are reviewed at the same time. For hard-to-remember items (items that require more than 3 reviews on a given day to be recalled), SM-0 recommends duplicating them on a new page in your book. These items will be reviewed more frequently, and some will maybe be duplicated again if still too hard to remember.
  • The intervals are determined using an estimation of the average case (×1.7), but the ideal intervals depend on the complexity of the material and your familiarity with the subject. You probably need shorter intervals for scientific subjects like mathematics.
  • Last but not least, executing the algorithm manually works, but is far from being a smooth learning experience…

Enter the computer.

SM-2 (1987)

Unlike physical systems where cards are grouped in the same box/partition/page and are reviewed collectively, digital systems consider each item separately. For example, the SuperMemo algorithm called SM-2 assigns a specific level of difficulty to every card and determines the appropriate intervals between repetitions using this specific value (called the E-Factor).

Now, the same logic but implemented as code:

import random
from datetime import date, timedelta
from queue import Queue

def grade(question, repetitions):
    # Increase the chance of success with the number of repetitions
    choices = [0] * 1 * repetitions + [1] * 2 * repetitions + \
              [2] * 3 * repetitions + [3] * 4 * repetitions + \
              [4] * 5 * repetitions + [5] * 6 * repetitions
    return random.choice(choices)

# Settings
I1 = 1
I2 = 6
MIN_EF = 1.3

class Item:

    def __init__(self, question, answer):
        self.question = question
        self.answer = answer
        self.EF = 2.5
        self.I = I1
        self.next_review = date.today() + timedelta(days=self.I)
        self.repetitions = 0

    def review(self, day, q):
        self.EF = max(self.EF + (0.1 - (5 - q) * (0.08 + (5 - q) * 0.02)), MIN_EF)
        if q < 3:
            self.I = I1
        elif self.I == I1:
            self.I = I2
        else:
            self.I = round(self.I * self.EF)
        self.next_review = day + timedelta(days=self.I)
        self.repetitions += 1
        # Items graded below 4 must be reviewed again the same day
        return q < 4

if __name__ == "__main__":
    # Populate items
    items = []
    for i in range(1, 100):
        items.append(Item(f"Q{i}", f"A{i}"))

    # Review over one year
    for i in range(365):
        day = date.today() + timedelta(days=i)
        items_to_review = Queue()
        for item in items:
            if item.next_review == day:
                items_to_review.put(item)
        while not items_to_review.empty():
            item = items_to_review.get()
            q = grade(item.question, item.repetitions + 1)
            if item.review(day, q):
                # Review again the same day until the grade is at least 4
                items_to_review.put(item)
1. The E-Factor never goes lower than 1.3. SuperMemo found that items with a low E-Factor were repeated annoyingly often, and that the root cause was usually their formulation rather than the review process. Such items must often be reformulated to conform to the minimum information principle. We will see later how Anki manages these cards.
2. The E-Factor is always initialized to the same difficulty value (2.5). It decreases on bad grades and increases on good grades.
3. Unlike SM-0, the grades (= item difficulty) influence the factor used to determine the next interval.
4. Like SM-0, difficult items are reviewed again the same day.
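To see how the grades drive the scheduling, here is the E-Factor update from the listing above in isolation (the helper name is mine):

```python
MIN_EF = 1.3

def update_ef(ef, q):
    """Apply the SM-2 E-Factor update for a grade q between 0 and 5."""
    return max(ef + (0.1 - (5 - q) * (0.08 + (5 - q) * 0.02)), MIN_EF)

print(round(update_ef(2.5, 5), 2))  # a perfect answer raises the factor: 2.6
print(round(update_ef(2.5, 4), 2))  # a correct answer keeps it unchanged: 2.5
print(round(update_ef(2.5, 3), 2))  # a hesitant answer lowers it: 2.36
print(round(update_ef(1.3, 0), 2))  # the factor never drops below 1.3
```

Note that only grades of 5 increase the factor: 4 is neutral, and anything below slowly makes the item's intervals grow more conservatively.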

The SM-2 algorithm, while relatively basic, remains popular even today as you will discover in the rest of this article.

Anki Algorithm

From Wikipedia:

“The SM-2 algorithm, created for SuperMemo in the late 1980s, forms the basis of the spaced repetition methods employed in the program. Anki’s implementation of the algorithm has been modified to allow priorities on cards and to show flashcards in order of their urgency.”


The Anki source code includes different versions of its SRS algorithm (called the Scheduler), all inspired by SM-2. V2 has been in use since 2018, and V3 is on the horizon. For this article, we can ignore the differences between these versions. Check the source code on GitHub if you are interested in the details of V1, V2, and V3.

We will analyze the V2.1 scheduler as it is the version I’m familiar with. We will use version 2.10.0 of Anki Desktop to ignore recent refactorings (the rewrite of backend code in Rust, the introduction of Protocol Buffer messages, the factoring out of common code among scheduler versions using inheritance, etc.). This will help keep the code easy to grasp.

Here is a refresher on Anki terminology:

As outlined by the schema, we will focus on the core abstractions (Collection, Note, Card) that affect how the SRS algorithm works. In addition, cards in Anki are scheduled differently according to their state:

Here is an overview of the Anki algorithm:

The use of separate new/review queues tries to remedy a common complaint with the standard SM-2 algorithm: repeatedly failing a card causes it to get stuck in “low interval hell” (also known as “ease hell”). In Anki, the initial acquisition process does not influence the ease factor.

Part 1: Settings

Unlike previous systems, Anki is highly configurable. Not all settings affect the SRS algorithm. Here are the default setting values that we will use:

# Whether new cards should be mixed with reviews, or shown first or last
NEW_CARDS_DISTRIBUTE = 0
NEW_CARDS_LAST = 1
NEW_CARDS_FIRST = 2

# The initial factor when a card gets promoted
STARTING_FACTOR = 2500

# Default collection configuration
colConf = {
    'newSpread': NEW_CARDS_DISTRIBUTE,
    'collapseTime': 1200,
}

# Default deck configuration
deckConf = {
    'new': {
        'delays': [1, 10],
        'ints': [1, 4],
        'initialFactor': STARTING_FACTOR,
        'perDay': 20,
    },
    'rev': {
        'perDay': 200,
        'ease4': 1.3,
        'maxIvl': 36500,
        'hardFactor': 1.2,
    },
    'lapse': {
        'delays': [10],
        'mult': 0,
        'minInt': 1,
        'leechFails': 8,
    },
}
1. If there are no more cards to review now but the next card in learning is due in less than collapseTime seconds, show it now.

  • collapseTime: Tells Anki how to behave when there is nothing left to study in the current deck except cards in learning. + Setting: Preferences > Basic > Learn ahead limit * 60 (default: 20 minutes)
2. The settings differ based on the queue a card belongs to. For example, when learning (new) cards, the delay increases by graduating steps, whereas for review (rev) cards the delay is multiplied by a given factor. The meaning of the individual settings will become clearer when we detail the logic.

  • new.delays: The list of successive delays between the learning steps of the new cards. The first delay will be used when you press the Again button on a new card. The Good button will advance to the next step. Once all steps have been passed, the card will become a review card and will appear on a different day. + Setting: Preferences > New Cards > Learning steps (Default: 1m 10m)
  • new.ints: The list of delays according to the button pressed while leaving the learning mode after pressing “Good” or “Easy.” + Setting: Preferences > New Cards > Graduating interval/Easy interval (Default: 1 and 4)
  • new.initialFactor: The ease multiplier new cards start with. By default, the Good button on a newly-learned card will delay the next review by 2.5x the previous delay. + Setting: Preferences > Advanced > Starting ease (Default: 2.50),
  • new.perDay: The maximum number of new cards to introduce in a day, if new cards are available. + Setting: Preferences > Daily Limits > New cards/day (Default: 20)
  • rev.perDay: The maximum number of review cards to show in a day, if cards are ready for review. + Setting: Preferences > Daily Limits > Maximum reviews/day (Default: 200)
  • rev.ease4: An extra multiplier that is applied to a review card’s interval when you rate it Easy. + Setting: Preferences > Advanced > Easy bonus (Default: 1.30)
  • rev.maxIvl: The maximum number of days a review card will wait. When reviews have reached the limit, Hard, Good and Easy will all give the same delay. + Setting: Preferences > Advanced > Maximum interval (Default: 36500)
  • rev.hardFactor: The multiplier applied to a review interval when answering Hard. + Setting: Preferences > Advanced > Hard interval (Default: 1.20)
3. When you forget a review card, it is said to have “lapsed”, and the card must be relearned. The default behavior for lapsed reviews is to reset the interval to 1 day (minInt, i.e. make the card due tomorrow) and put it in the learning queue for a refresher in 10 minutes (delays).

  • lapse.delays: The list of successive delays between the learning steps of lapsed cards. By default, pressing the Again button on a review card will show it again 10 minutes later. + Setting: Preferences > Lapses > Relearning steps (Default: 10m)
  • lapse.minInt: The minimum interval given to a review card after answering Again. + Setting: Preferences > Lapses > Minimum interval (Default: 1)
  • lapse.mult: The multiplier applied to a review interval when answering Again. + Setting: Preferences > Advanced > New interval (Default: 0)
  • lapse.leechFails: The number of times Again needs to be pressed on a review card before it is marked as a leech. + Setting: Preferences > Lapses > Leech threshold (Default: 8)
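To make these settings concrete, here is a hypothetical helper (not a function from the Anki code base, and ignoring refinements such as the bonus for overdue cards) showing how ease4, hardFactor, mult, minInt, and maxIvl shape the next interval of a review card:

```python
deckConf = {
    'rev': {'ease4': 1.3, 'maxIvl': 36500, 'hardFactor': 1.2},
    'lapse': {'mult': 0, 'minInt': 1},
}

def next_interval(ivl, factor, ease):
    """Sketch the next interval in days (ease: 1=Again, 2=Hard, 3=Good, 4=Easy)."""
    rev, lapse = deckConf['rev'], deckConf['lapse']
    if ease == 1:
        # Again: apply the lapse multiplier, floored at minInt
        return max(lapse['minInt'], int(ivl * lapse['mult']))
    if ease == 2:
        # Hard: fixed multiplier, the ease factor is not used
        new_ivl = ivl * rev['hardFactor']
    elif ease == 3:
        # Good: multiply by the ease factor
        new_ivl = ivl * factor
    else:
        # Easy: ease factor plus the extra Easy bonus
        new_ivl = ivl * factor * rev['ease4']
    return min(int(new_ivl), rev['maxIvl'])

# A 10-day interval with the default 2.5 ease factor:
for ease in (1, 2, 3, 4):
    print(ease, next_interval(10, 2.5, ease))  # 1, 12, 25 and 32 days
```

This illustrates the spread between the four buttons: with the defaults, Again collapses the interval entirely while Easy more than triples it.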

Part 2: Model

Let’s begin with the model. Anki stores cards in an SQLite database. In this tutorial, we will mimic the same model but we will store the cards directly in memory inside the collection object. We will also ignore decks completely as they mostly allow reviewing different cards using different settings or at different times but don’t profoundly change how Anki works.

class Collection:

    def __init__(self, id=None):
        d = datetime.datetime.today()
        d = datetime.datetime(d.year, d.month, d.day)
        # Timestamp of the creation date in seconds
        self.crt = int(time.mktime(d.timetuple()))
        # In-memory list of cards
        self.cards = []
        self.sched = Scheduler(self)

    def addNote(self, note):
        self.cards.append(Card(note))

class Note:

    def __init__(self):
        self.id = intId()
        self.tags = []

    def addTag(self, tag):
        if tag not in self.tags:
            self.tags.append(tag)

class Card:

    def __init__(self, note, id=None):
        self.id = intId()
        self.note = note
        # Timestamp of the creation date in seconds
        self.crt = intTime()
        # 0=new, 1=learning, 2=review, 3=relearning
        self.type = 0
        # Queue type:
        # -1=suspended => leeches (as manual suspension is not supported)
        #  0=new       => new (never shown)
        #  1=lrn       => learning/relearning
        #  2=rev       => review (as for type)
        self.queue = 0
        # The interval. Negative = seconds, positive = days
        self.ivl = 0
        # The ease factor in permille.
        # Ex: 2500 = the interval will be multiplied by 2.5
        # the next time you press "Good".
        self.factor = 0
        # The number of reviews
        self.reps = 0
        # The number of times the card went from a "was answered correctly"
        # to a "was answered incorrectly" state
        self.lapses = 0
        # Of the form a*1000+b, with:
        # a => the number of reps left today
        # b => the number of reps left till graduation
        # Ex: '2004' = 2 reps left today and 4 reps till graduation
        self.left = 0
        # Due is used differently for different card types:
        # - new => note id or random int
        # - lrn => integer timestamp in seconds
        # - rev => integer day, relative to the collection's creation time
        self.due = self.id
1. The Scheduler implementation will be the main topic of the remainder of this section.
2. The identifiers are initialized using a helper function intId(), which uses the current time and ensures that two successive calls return different values. Here is the definition:

import time

def intId():
    """Returns a unique integer identifier."""
    t = intTime(1000)
    # Make sure the next call to the function returns a different value
    while intTime(1000) == t:
        time.sleep(1)
    return t

def intTime(scale=1):
    "The time in integer seconds. Pass scale=1000 to get milliseconds."
    return int(time.time() * scale)

The Scheduler is the largest class we will cover. A scheduler in Anki is an object supporting two methods:

  • getCard(): Returns the next card to review
  • answerCard(card, ease): Updates the card after an answer (ease: 1 for “Again”, 2 for “Hard”, 3 for “Good”, and 4 for “Easy”)
class Scheduler:

    def __init__(self, col):
        # The collection used to retrieve the cards
        self.col = col
        # An upper limit for new and review cards
        self.queueLimit = 50
        # An upper limit for learning cards
        self.reportLimit = 1000
        # The number of cards already reviewed today
        self.reps = 0
        # The number of days since the collection creation
        self.today = self._daysSinceCreation()
        # The timestamp of the end of day
        self.dayCutoff = self._dayCutoff()
        # The timestamp in seconds to determine the learn ahead limit
        self._lrnCutoff = 0
        self.reset()
1. The attribute today represents the number of days since the collection creation. It is used when searching for review cards, whose attribute due stores a number of days relative to it. The value is initialized like this:

class Scheduler:

    def _daysSinceCreation(self):
        # Note: 86400 seconds = 1 day
        startDate = datetime.datetime.fromtimestamp(self.col.crt)
        return int((time.time() - time.mktime(startDate.timetuple())) // 86400)
2. The attribute dayCutoff represents the timestamp of the beginning of the next day. Anki allows customizing the hour at which a day ends. Here, we simply use midnight:

class Scheduler:

    def _dayCutoff(self):
        date = datetime.datetime.today()
        date = date.replace(hour=0, minute=0, second=0, microsecond=0)
        if date < datetime.datetime.today():
            date = date + datetime.timedelta(days=1)
        stamp = int(time.mktime(date.timetuple()))
        return stamp
3. The attribute _lrnCutoff is related to the setting collapseTime (also called the learn ahead limit). The method _updateLrnCutoff() initializes and updates it:

class Scheduler:

    def _updateLrnCutoff(self, force):
        nextCutoff = intTime() + colConf['collapseTime']
        if nextCutoff - self._lrnCutoff > 60 or force:
            self._lrnCutoff = nextCutoff
            return True
        return False

Part 3: Queues Management

The method reset(), called on the last line of the Scheduler’s constructor, initializes the queues managed by Anki:

class Scheduler:

    def reset(self):
        self._resetLrn()
        self._resetRev()
        self._resetNew()

    # New cards
    #################################################################

    def _resetNew(self):
        self._newQueue = []
        self._updateNewCardRatio()

    def _fillNew(self):
        if self._newQueue:
            return True
        lim = min(self.queueLimit, deckConf["new"]["perDay"])
        self._newQueue = list(filter(lambda card: card.queue == 0,
                                     self.col.cards))
        self._newQueue.sort(key=lambda card: card.due)
        self._newQueue = self._newQueue[:lim]
        if self._newQueue:
            return True

    def _updateNewCardRatio(self):
        if colConf['newSpread'] == NEW_CARDS_DISTRIBUTE:
            if self._newQueue:
                newCount = len(self._newQueue)
                revCount = len(self._revQueue)
                self.newCardModulus = (newCount + revCount) // newCount
                # If there are cards to review, ensure modulus >= 2
                if revCount:
                    self.newCardModulus = max(2, self.newCardModulus)
                return
        self.newCardModulus = 0  # = Do not distribute new cards

    # Learning cards
    #################################################################

    def _resetLrn(self):
        self._updateLrnCutoff(force=True)
        self._lrnQueue = []

    def _fillLrn(self):
        if self._lrnQueue:
            return True
        cutoff = intTime() + colConf['collapseTime']
        self._lrnQueue = list(filter(lambda card: card.queue == 1 and
                                     card.due < cutoff, self.col.cards))
        self._lrnQueue.sort(key=lambda card: card.id)
        self._lrnQueue = self._lrnQueue[:self.reportLimit]
        return self._lrnQueue

    # Review cards
    #################################################################

    def _resetRev(self):
        self._revQueue = []

    def _fillRev(self):
        if self._revQueue:
            return True
        lim = min(self.queueLimit, deckConf["rev"]["perDay"])
        self._revQueue = list(filter(lambda card: card.queue == 2 and
                                     card.due <= self.today, self.col.cards))
        self._revQueue.sort(key=lambda card: card.due)
        self._revQueue = self._revQueue[:lim]
        if self._revQueue:
            r = random.Random()
            r.seed(self.today)
            r.shuffle(self._revQueue)
            return True
1. By default, the queues are empty. Anki defers filling them until a card is retrieved.
2. The method _updateNewCardRatio() determines the frequency of new cards (only when new cards are spread among the other cards). For example, if there are 50 review cards and 10 new cards, the modulus will be 6, so that a new card is shown after every 5 review cards. The attribute reps of the Scheduler counts the cards reviewed during the current study session and is used together with newCardModulus to determine whether the next card must be a new card or a review card.
3. Anki collects all cards in queue 0 (= new) and sorts them by due date before keeping the first N cards, based on the daily limit.
4. Anki collects all cards in queue 1 (= lrn) that are due and sorts them by id, which amounts to sorting by creation timestamp.
5. Anki collects all cards in queue 2 (= rev) that are due and sorts them by due date before keeping the first N cards, shuffled, based on the daily limit.
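The ratio computation is easy to check by hand. Extracting the formula from _updateNewCardRatio() into a standalone function (the function itself is mine):

```python
def new_card_modulus(new_count, rev_count):
    """Show a new card every Nth repetition when distributing new cards."""
    modulus = (new_count + rev_count) // new_count
    # If there are cards to review, ensure the modulus is at least 2
    if rev_count:
        modulus = max(2, modulus)
    return modulus

print(new_card_modulus(10, 50))  # 6: a new card after every 5 review cards
print(new_card_modulus(30, 30))  # 2: never two new cards in a row
```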

The logic to initialize the queues is ready but will be executed in the next step when retrieving a card to study.

Part 4: Card Retrieving

The main entry point is the method getCard().

class Scheduler:

    def getCard(self):
        card = self._getCard()
        if card:
            self.reps += 1
        return card

This method delegates to _getCard() and simply increments the counter of studied cards, unless the study session is over (no card was returned).

class Scheduler:

    def _getCard(self):
        "Return the next due card or None."
        # Learning card due?
        c = self._getLrnCard()
        if c:
            return c
        # New first, or time for one?
        if self._timeForNewCard():
            c = self._getNewCard()
            if c:
                return c
        # Card due for review?
        c = self._getRevCard()
        if c:
            return c
        # New cards left?
        c = self._getNewCard()
        if c:
            return c
        # Collapse or finish
        return self._getLrnCard(collapse=True)

    # New cards
    ##########################################################################

    def _getNewCard(self):
        if self._fillNew():
            return self._newQueue.pop()

    def _timeForNewCard(self):
        "True if it's time to display a new card when distributing."
        if not self._newQueue:
            return False
        if colConf['newSpread'] == NEW_CARDS_LAST:
            return False
        elif colConf['newSpread'] == NEW_CARDS_FIRST:
            return True
        elif self.newCardModulus:
            return self.reps and self.reps % self.newCardModulus == 0

    # Learning queues
    ##########################################################################

    def _getLrnCard(self, collapse=False):
        if self._fillLrn():
            return self._lrnQueue.pop()

    # Reviews
    ##########################################################################

    def _getRevCard(self):
        if self._fillRev():
            return self._revQueue.pop()
1. By default, Anki shows cards in a well-defined order:

  • New cards when newSpread == NEW_CARDS_FIRST
  • Learning cards that are due
  • New cards when newSpread == NEW_CARDS_DISTRIBUTE (default)
  • Review cards
  • New cards when newSpread == NEW_CARDS_LAST
2. The methods _fillXXX() return True when a queue is not empty, in which case we simply have to pop an element from it.

The queues are now initialized lazily, when the first card is retrieved from each of them. This works great for the current session, but when a new day begins, Anki must reinitialize the queues because other cards may have reached their due date.

class Scheduler:

    def reset(self):
        self._updateCutoff()
        self._resetLrn()
        self._resetRev()
        self._resetNew()

    def _updateCutoff(self):
        # Days since the collection was created
        self.today = self._daysSinceCreation()
        # End of day cutoff
        self.dayCutoff = self._dayCutoff()

    def _checkDay(self):
        # Check if the day has rolled over
        if time.time() > self.dayCutoff:
            self.reset()

    def getCard(self):
        self._checkDay()
        card = self._getCard()
        if card:
            self.reps += 1
        return card
1. The method _updateCutoff() is called every time the queues are reset (= once a day). When this happens, a new day has begun and the day limits must be refreshed too.
2. The method _checkDay() is called every time we retrieve a new card to study. This way, if the current day is over, the queues are reset before returning the next card.

Part 5: Card Updating

Now that we have a method to retrieve the cards to study, we will turn our attention to the core part of the SRS algorithm. Every time we study a card, the card must be rescheduled. In short, we need to update the attribute due (= the next review date) of the card, but the logic varies according to its current state (ex: the current queue, ease factor, and interval).

class Scheduler:

    def answerCard(self, card, ease):
        assert 1 <= ease <= 4
        assert 0 <= card.queue <= 4
        card.reps += 1
        if card.queue == 0:
            self._answerNewCard(card, ease)
        elif card.queue in [1, 3]:
            self._answerLrnCard(card, ease)
        elif card.queue == 2:
            self._answerRevCard(card, ease)
        else:
            assert 0

We will detail each case separately.

Part 5.1: Answering New Cards

class Scheduler:

    def _answerNewCard(self, card, ease):
        # The card came from the new queue, move it to learning
        card.queue = 1
        card.type = 1
        # Init reps to graduation
        card.left = self._startingLeft(card)

    def _startingLeft(self, card):
        conf = self._lrnConf(card)
        tot = len(conf['delays'])
        tod = self._leftToday(conf['delays'], tot)
        return tot + tod * 1000

    def _leftToday(self, delays, left, now=None):
        "The number of steps that can be completed by the day cutoff."
        if not now:
            now = intTime()
        delays = delays[-left:]
        ok = 0
        for i in range(len(delays)):
            now += delays[i] * 60
            if now > self.dayCutoff:
                break
            ok = i
        return ok + 1
1. Anki simply updates the attribute queue to move a card to a different queue. When the destination queue is reset (ex: for tomorrow’s session), the card will automatically be inserted into it.
2. The attribute type is similar to the attribute queue (they share the same values 0, 1, and 2). In practice, queue and type may differ, for example after a lapse. When pressing “Again” on a review card, the card is moved back to the learning queue (queue = 1), but the type is unchanged (type = 2) to remember that the card was previously a review card. This will be useful when graduating the card back to the review queue after relearning.
3. The attribute left is particular. Its numeric format keeps two pieces of information: how many times the card can still be reviewed today, and how many steps remain before graduation. The methods _startingLeft() and _leftToday() implement this logic. You can safely ignore the details.
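If the encoding of left feels abstract, this little round trip (hypothetical helpers, not Anki code) packs and unpacks the two counters:

```python
def pack_left(today, till_graduation):
    """Encode 'reps doable today' and 'reps till graduation' in a single int."""
    return today * 1000 + till_graduation

def unpack_left(left):
    """Decode the two counters back from the packed value."""
    return left // 1000, left % 1000

left = pack_left(2, 4)
print(left)               # 2004
print(unpack_left(left))  # (2, 4)
```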

So, when answering a new card, the card is automatically promoted to the learning queue.

Part 5.2: Answering Learning Cards

class Scheduler:

    def _answerLrnCard(self, card, ease):
        conf = self._lrnConf(card)
        # Immediate graduation?
        if ease == 4:
            self._rescheduleAsRev(card, conf, True)
        # Next step?
        elif ease == 3:
            # Graduation time?
            if (card.left % 1000) - 1 <= 0:
                self._rescheduleAsRev(card, conf, False)
            else:
                self._moveToNextStep(card, conf)
        elif ease == 2:
            self._repeatStep(card, conf)
        else:
            # Back to the first step
            self._moveToFirstStep(card, conf)

    def _lrnConf(self, card):
        if card.type == 2:
            return deckConf["lapse"]
        else:
            return deckConf["new"]
1. The settings differ according to whether the card comes from the review queue or the new queue. For example, the steps used after a lapse differ from those used when learning a new card for the first time.

We will detail what happens depending on which button was pressed when answering the card.

After pressing “Again”…

self._moveToFirstStep(card, conf)

The card is moved back to the first step:

class Scheduler:
    def _moveToFirstStep(self, card, conf):
        card.left = self._startingLeft(card)
        # relearning card?
        if card.type == 3:
            self._updateRevIvlOnFail(card, conf)
        return self._rescheduleLrnCard(card, conf)

    def _updateRevIvlOnFail(self, card, conf):
        card.ivl = self._lapseIvl(card, conf)

    def _lapseIvl(self, card, conf):
        ivl = max(1, conf['minInt'], int(card.ivl*conf['mult']))
        return ivl

    def _rescheduleLrnCard(self, card, conf, delay=None):
        # normal delay for the current step?
        if delay is None:
            delay = self._delayForGrade(conf, card.left)
        card.due = int(time.time() + delay)
        card.queue = 1
        return delay

    def _delayForGrade(self, conf, left):
        left = left % 1000
        delay = conf['delays'][-left]
        return delay*60
1. We restore the attribute left as if the card were new.
2. We process lapses differently. By default, we reset the attribute ivl to 1 (next review in one day).
3. The card due date is determined by adding the next step delay to the current time. The card remains in the learning queue (1).
4. The method _delayForGrade() is a helper to retrieve the next step interval. It extracts the number of remaining steps from the attribute left (ex: 1002 => 2 remaining steps) and uses the setting delays to find the matching delay (ex: 1m 10m 1d => next study in 10m).
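As an illustration, the extraction logic can be reproduced standalone (the function below and the sample steps are illustrative, not Anki’s defaults):

```python
# Standalone sketch of _delayForGrade(): hypothetical steps "1m 10m 1d",
# stored in minutes as [1, 10, 1440].
def delay_for_grade(delays, left):
    left = left % 1000           # ex: 1002 -> 2 remaining steps
    return delays[-left] * 60    # index from the end, convert to seconds

delays = [1, 10, 1440]
delay_for_grade(delays, 1002)    # -> 600 seconds (2 steps left => 10 minutes)
delay_for_grade(delays, 1003)    # -> 60 seconds (3 steps left => first step, 1 minute)
```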

After pressing “Hard”…

self._repeatStep(card, conf)

The current card step is repeated. This means the attribute left is unchanged: we still have the same number of remaining steps before graduation. The difference is that the card will be rescheduled after a delay slightly longer than the previous one. We average the last and next delays (ex: steps 1m 10m 20m, at step 2 => repeat in 15m).

class Scheduler:
    def _repeatStep(self, card, conf):
        delay = self._delayForRepeatingGrade(conf, card.left)
        self._rescheduleLrnCard(card, conf, delay=delay)

    def _delayForRepeatingGrade(self, conf, left):
        # halfway between last and next
        delay1 = self._delayForGrade(conf, left)
        delay2 = self._delayForGrade(conf, left-1)
        avg = (delay1+max(delay1, delay2))//2
        return avg
1. We reuse the method _rescheduleLrnCard() introduced just before to update the card’s due date.
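The averaging can be checked with a standalone sketch (function names and steps are illustrative):

```python
# Sketch of _delayForRepeatingGrade() with hypothetical steps "1m 10m 20m".
def delay_for_grade(delays, left):
    return delays[-(left % 1000)] * 60

def delay_for_repeating_grade(delays, left):
    # halfway between the current step delay and the next one
    delay1 = delay_for_grade(delays, left)       # current step: 10m
    delay2 = delay_for_grade(delays, left - 1)   # next step: 20m
    return (delay1 + max(delay1, delay2)) // 2

delays = [1, 10, 20]
delay_for_repeating_grade(delays, 1002)  # -> 900 seconds (15m)
```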

After pressing “Good”…

# graduation time?
if (card.left % 1000) - 1 <= 0:
    self._rescheduleAsRev(card, conf, False)
else:
    self._moveToNextStep(card, conf)

The decision depends on whether there are remaining steps:

Case 1: If we have finished the last step, the card graduates to the review queue:

class Scheduler:
    def _rescheduleAsRev(self, card, conf, early):
        lapse = card.type == 2
        if lapse:
            self._rescheduleGraduatingLapse(card)
        else:
            self._rescheduleNew(card, conf, early)

    def _rescheduleGraduatingLapse(self, card):
        card.due = self.today+card.ivl
        card.type = card.queue = 2

    def _rescheduleNew(self, card, conf, early):
        card.ivl = self._graduatingIvl(card, conf, early)
        card.due = self.today+card.ivl
        card.factor = conf['initialFactor']
        card.type = card.queue = 2

    def _graduatingIvl(self, card, conf, early):
        if card.type in (2, 3):
            return card.ivl
        if not early:
            # graduate
            ideal = conf['ints'][0]
        else:
            # early remove
            ideal = conf['ints'][1]
        return ideal
1. When a lapse is graduated, we add the previous interval to the current date to determine the due date.
2. When a new card is graduated, we initialize the two key attributes of the SRS algorithm: the ease factor and the interval. These fields will be necessary to determine the next due date for review cards.
3. When graduating a new card, the initial interval differs depending on whether we completed all the steps (“Good”) or pressed “Easy” to graduate the card immediately (1 vs 4 days by default).

Case 2: If there are remaining steps:

class Scheduler:
    def _moveToNextStep(self, card, conf):
        # decrement real left count and recalculate left today
        left = (card.left % 1000) - 1
        card.left = self._leftToday(conf['delays'], left)*1000 + left
        self._rescheduleLrnCard(card, conf)
1. The attribute left is updated to decrement the number of remaining steps and to recalculate the number of reviews that can still be completed today.

After pressing “Easy”…

self._rescheduleAsRev(card, conf, True)

The card is graduated to the review queue similarly to when we complete every step. The only exception is that the initial interval will be larger as explained in the previous point.

Part 5.3: Answering Review Cards

class Scheduler:
    def _answerRevCard(self, card, ease):
        if ease == 1:
            self._rescheduleLapse(card)
        else:
            self._rescheduleRev(card, ease)

After pressing “Again”…

class Scheduler:
    def _rescheduleLapse(self, card):
        conf = self.col.deckConf["lapse"]
        card.lapses += 1
        card.factor = max(1300, card.factor-200)
        suspended = self._checkLeech(card, conf)
        if not suspended:
            card.type = 2
            delay = self._moveToFirstStep(card, conf)
        else:
            # no relearning steps
            self._updateRevIvlOnFail(card, conf)
            delay = 0
        return delay

    # Leeches
    ##########################################################################

    def _checkLeech(self, card, conf):
        if card.lapses >= conf['leechFails']:
            # add a leech tag
            f = card.note
            f.addTag("leech")
            # suspend
            card.queue = -1
            return True
1. The number of lapses for this card is incremented.
2. The ease factor is reduced by 0.2 (but never below 1.3, as recommended by SM-2).
3. If the number of lapses reaches the value of the setting leechFails, the card is marked as a leech: a tag is added to the note and the card is moved to the queue -1 (= suspended). The card will therefore be ignored when filling the different queues, as no method _fillXXX() considers cards in the queue -1.
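A quick sketch (with a hypothetical starting factor) shows how repeated lapses drive the ease factor down to its floor:

```python
# Each lapse subtracts 200 (0.2) from the ease factor, floored at 1300
# (1.3). Factors are stored as integers: 2500 means 2.5.
def lapse_factor(factor):
    return max(1300, factor - 200)

factor = 1600
history = []
for _ in range(3):
    factor = lapse_factor(factor)
    history.append(factor)
# history -> [1400, 1300, 1300]: the factor bottoms out at 1.3
```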

After pressing “Hard,” “Good,” “Easy”…

The card will be rescheduled in an “ideal” number of days. In practice, most cards reside in the review queue, and the “Again” button is rarely pressed. This means the core logic of the Anki SRS algorithm is determined by the following methods.

class Scheduler:
    def _rescheduleRev(self, card, ease):
        # update interval
        self._updateRevIvl(card, ease)
        # then the rest
        card.factor = max(1300, card.factor+[-150, 0, 150][ease-2])
        card.due = self.today + card.ivl

    def _updateRevIvl(self, card, ease):
        card.ivl = self._nextRevIvl(card, ease)

    # Interval management
    ##########################################################################

    def _nextRevIvl(self, card, ease):
        "Next review interval for CARD, given EASE."
        delay = self._daysLate(card)
        conf = self.col.deckConf["rev"]
        fct = card.factor / 1000
        hardFactor = conf.get("hardFactor", 1.2)
        if hardFactor > 1:
            hardMin = card.ivl
        else:
            hardMin = 0
        ivl2 = self._constrainedIvl(card.ivl * hardFactor, conf, hardMin)
        if ease == 2:
            return ivl2
        ivl3 = self._constrainedIvl((card.ivl + delay // 2) * fct, conf, ivl2)
        if ease == 3:
            return ivl3
        ivl4 = self._constrainedIvl(
            (card.ivl + delay) * fct * conf['ease4'], conf, ivl3)
        return ivl4

    def _daysLate(self, card):
        "Number of days later than scheduled."
        return max(0, self.today - card.due)

    def _constrainedIvl(self, ivl, conf, prev):
        ivl = max(ivl, prev+1, 1)
        ivl = min(ivl, conf['maxIvl'])
        return int(ivl)
1. The attribute ivl determines the next due date (we add it to the current date to get the value of the attribute due).
2. The ease factor is changed by subtracting 0.15 for “Hard” cards or adding 0.15 for “Easy” cards. The ease factor is left unchanged for “Good” cards; only their interval grows to increase the period between studies.
3. The method _nextRevIvl() determines the next interval:

  • “Hard”: the current interval is multiplied by the hard interval (1.2 by default).
  • “Good”: the current interval is multiplied by the current ease (+ a bonus if the card was late).
  • “Easy”: the current interval is multiplied by the current ease times the easy bonus (1.3 by default) (+ a bonus if the card was late).
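To make these rules concrete, here is a standalone sketch of the same computation on a hypothetical card: a 10-day interval, an ease factor of 2500 (2.5), answered on time (no lateness bonus), with default settings:

```python
# Sketch of _nextRevIvl() outside the Scheduler class, with the
# constraint logic of _constrainedIvl() inlined as a helper.
def constrained_ivl(ivl, max_ivl, prev):
    # at least prev + 1 and 1 day, at most the configured maximum
    return int(min(max(ivl, prev + 1, 1), max_ivl))

ivl, factor, delay = 10, 2500, 0                 # hypothetical card, on time
fct = factor / 1000                              # 2.5
max_ivl, hard_factor, ease4 = 36500, 1.2, 1.3    # default settings

ivl2 = constrained_ivl(ivl * hard_factor, max_ivl, ivl)             # Hard -> 12
ivl3 = constrained_ivl((ivl + delay // 2) * fct, max_ivl, ivl2)     # Good -> 25
ivl4 = constrained_ivl((ivl + delay) * fct * ease4, max_ivl, ivl3)  # Easy -> 32
```

Each ease yields a strictly longer interval than the previous one, thanks to the prev + 1 constraint.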

We are done 🎉. The complete code is available in the companion GitHub repository. A more complete annotated version is also available in the same repository including two additional features described next.

Bonus: Day Boundaries

Anki treats small steps and steps that cross a day boundary differently. With small steps, the cards are shown as soon as the delay has passed, in preference to other due cards in review. This is done so that you can answer the card as closely to the calculated delay as possible. In contrast, if the interval crosses a day boundary, it is automatically converted to days.

In the implementation, the code splits the learning queue into two distinct queues: sub-day learning and day learning.

class Scheduler:
    # ...
    def _resetLrn(self):
        self._lrnQueue = []
        self._lrnDayQueue = []

    # ...
    def _rescheduleLrnCard(self, card, conf, delay=None):
        # normal delay for the current step?
        if delay is None:
            delay = self._delayForGrade(conf, card.left)
        card.due = int(time.time() + delay)
        # due today?
        if card.due < self.dayCutoff:
            card.queue = 1
        else:
            # the card is due in one or more days, so we need to use the
            # day learn queue
            ahead = ((card.due - self.dayCutoff) // 86400) + 1
            card.due = self.today + ahead
            card.queue = 3

    # ...
    def _getCard(self):
        # learning card due?
        c = self._getLrnCard()
        if c:
            return c
        # new first, or time for one?
        if self._timeForNewCard():
            c = self._getNewCard()
            if c:
                return c
        # card due for review?
        c = self._getRevCard()
        if c:
            return c
        # day learning card due?
        c = self._getLrnDayCard()
        if c:
            return c
        # new cards left?
        c = self._getNewCard()
        if c:
            return c
        # collapse or finish
        return self._getLrnCard(collapse=True)
1. The previous queue is split into two queues:

  • _lrnQueue (queue == 1) = sub-day learning queue
  • _lrnDayQueue (queue == 3) = day learning queue

2. Learning cards are rescheduled in the sub-day queue 1 when the next review is planned before the end of the day’s review session; the due date is then an epoch timestamp in seconds. Otherwise, the card is rescheduled in the day learning queue 3 and the due date becomes a day number.
3. Sub-day learning cards are prioritized first, to review them as close to their delay in seconds as possible. Day learning cards are reviewed last since their delay in days tolerates more flexibility (reviewing them the next day is not as bad as for sub-day learning cards).
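A small sketch (with hypothetical timestamps) of the conversion performed by _rescheduleLrnCard() when a step crosses the day boundary:

```python
# A learning step lands ~25 hours after today's cutoff: the due date is
# converted from epoch seconds to a day number and the card moves to queue 3.
day_cutoff = 1_700_000_000      # hypothetical end of today (epoch seconds)
today = 500                     # hypothetical day number of the collection
due = day_cutoff + 90_000       # next step ends ~25h after the cutoff

if due < day_cutoff:
    queue = 1                   # sub-day learning: due stays in seconds
else:
    ahead = ((due - day_cutoff) // 86400) + 1
    due = today + ahead         # day learning: due becomes a day number
    queue = 3
# -> due == 502, queue == 3 (due the day after tomorrow)
```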

Bonus: Fuzzing

When you select an ease button on a review card, Anki also applies a small amount of random “fuzz” to prevent cards that were introduced at the same time and given the same ratings from sticking together and always coming up for review on the same day.

Here is the code:

def _fuzzedIvl(self, ivl):
    min, max = self._fuzzIvlRange(ivl)
    return random.randint(min, max)

def _fuzzIvlRange(self, ivl):
    if ivl < 2:
        return [1, 1]
    elif ivl == 2:
        return [2, 3]
    elif ivl < 7:
        fuzz = int(ivl*0.25)
    elif ivl < 30:
        fuzz = max(2, int(ivl*0.15))
    else:
        fuzz = max(4, int(ivl*0.05))
    # fuzz at least a day
    fuzz = max(fuzz, 1)
    return [ivl-fuzz, ivl+fuzz]
1. The function _fuzzedIvl() is only called for intervals greater than one day. For the sub-day learning cards introduced in the previous point, a smaller fuzz of up to 5 minutes is also applied:

maxExtra = min(300, int(delay*0.25))
fuzz = random.randrange(0, maxExtra)

2. The fuzz percentage decreases as intervals grow (25%, then 15%, then 5%), but the absolute fuzz in days keeps increasing.
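To visualize the fuzz windows, here is the same rule restated as a standalone function, evaluated on a few sample intervals (the random draw itself is omitted):

```python
# Same ranges as _fuzzIvlRange(), restated outside the Scheduler class.
def fuzz_window(ivl):
    if ivl < 2:
        return (1, 1)
    if ivl == 2:
        return (2, 3)
    if ivl < 7:
        fuzz = int(ivl * 0.25)
    elif ivl < 30:
        fuzz = max(2, int(ivl * 0.15))
    else:
        fuzz = max(4, int(ivl * 0.05))
    fuzz = max(fuzz, 1)          # fuzz at least a day
    return (ivl - fuzz, ivl + fuzz)

fuzz_window(4)    # -> (3, 5): 25% of 4 days = 1 day of fuzz
fuzz_window(10)   # -> (8, 12): max(2, int(1.5)) = 2 days of fuzz
fuzz_window(100)  # -> (95, 105): 5% of 100 days = 5 days of fuzz
```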

A Better Anki SRS Algorithm?

The SM-2 algorithm, on which Anki is based, was released in 1987 in SuperMemo 1.0. It was revised several times since:

Each version iterates on the deficiencies of the previous one. You can find a short summary of the main changes or a (very) long summary of the history of SuperMemo. The short version is probably too terse to understand the improvements, and the long version is probably too detailed to digest everything. (It took me more than 5 hours to read it but it was worth the read!)

SuperMemo 2 was great. Its simple algorithm has survived in various mutations to this day in popular apps such as Anki or Mnemosyne. However, the algorithm was dumb in the sense that there was no way of modifying the function of optimum intervals. The findings of 1985 were set in stone. Memory complexity and stability increase were expressed by the same single number: E-factor. It is a bit like using a single lever in a bike to change gears and the direction of driving.

Piotr Wozniak, Original author of SuperMemo

From a high-level perspective, the main motivation for every version is to determine better optimal intervals (= the ideal periods between reviews of a single card) so that the forgetting index is close to 10% (= recall of 90% is acceptable).

From a low-level perspective, SuperMemo experimented with several approaches. The first major version (SM-2) introduced the ease factor to capture the difficulty of an item (the lower the ease factor, the more difficult the item, and the shorter the interval). The ease factor is multiplied by the previous interval to determine the next interval.
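For reference, here is a minimal sketch of the original SM-2 scheduling rule (as published for SuperMemo; this is not Anki’s adaptation). Grades range from 0 to 5, and the first two intervals are fixed at 1 and 6 days:

```python
# Minimal SM-2 sketch: returns the next (interval, repetition, ef) triple.
def sm2_next(interval, repetition, ef, q):
    if q < 3:                        # failed recall: start the repetitions over
        return 1, 0, ef
    if repetition == 0:
        interval = 1                 # first successful review: 1 day
    elif repetition == 1:
        interval = 6                 # second: 6 days
    else:
        interval = round(interval * ef)   # then: previous interval * ease
    # the grade nudges the ease factor, floored at 1.3
    ef = max(1.3, ef + (0.1 - (5 - q) * (0.08 + (5 - q) * 0.02)))
    return interval, repetition + 1, ef

# Three perfect answers (q=5) starting from the default ease of 2.5:
i, r, e = sm2_next(0, 0, 2.5, 5)    # -> 1 day
i, r, e = sm2_next(i, r, e, 5)      # -> 6 days
i, r, e = sm2_next(i, r, e, 5)      # -> 16 days (round(6 * 2.7))
```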

The successive iterations became more and more elaborate by adding new dimensions, in particular what SuperMemo calls the two-component model: stability and retrievability (complementing the difficulty represented by the E-Factor). Stability tells you how long a piece of knowledge can last in memory. Retrievability tells you how easy it is to recall a piece of knowledge. These notions may appear similar but they aren’t. “If you take two memories right after a review, one with a short optimum interval, and the other with a long optimum interval, the memory status of the two must differ,” explains Piotr Wozniak. “Both can be recalled perfectly (maximum retrievability) and they also need to differ in how long they can last in memory (different stability).”

What follows is an example of the optimum factors (OF) matrix used in SM-4/SM-5. The matrix ignores the retrievability dimension, which was introduced in SM-6.

A two-dimensional matrix is easier to represent but the logic is similar with more dimensions. Initially, the matrix was defined based on prior measurements in SuperMemo. After each answer, the grade tells SuperMemo how well the interval “performed.” If the grade is low, the interval was too long. If the grade is high, the interval was too short. The matrix entry is updated accordingly and matrix smoothing is applied (= if a value increases, a smaller increase can be beneficial to its neighbors too).

The two-component model of long-term memory has remained the foundation of SuperMemo since its introduction in SM-4 in 1989. Piotr Wozniak was already pessimistic about a better, faster, and more effective algorithm as early as 1994. The versions of the algorithm that appeared afterward didn’t introduce a breakthrough comparable to SuperMemo abandoning the SM-2 algorithm in 1989, the same algorithm that keeps popping up in new applications.

"New" Applications

Many applications relying on SRS appeared in popular app stores more or less recently: Quizlet, Memrise, Duolingo, LingoDeer, Brainscape, Lingvist, Chegg, RemNote, Mochi, Memcode, …

  • Mochi’s algorithm is very simple. The card interval is doubled after each correct answer, and cut in half otherwise.

  • Memrise’s algorithm is similar to Mochi’s. The card interval increases using the following steps: 4 hours, 12 hours, 24 hours, 6 days, 12 days, 48 days, 96 days, and 6 months. Any wrong answer moves back the card to the first interval.

  • Quizlet’s algorithm has gone through several iterations. The first implementation simply repeated all the questions you got wrong. The second implementation was similar to Anki: the card interval increases by a factor of approximately 2.2 and wrong answers reset the interval to one day. The current implementation relies on machine learning and uses the millions of collected answers to determine the recall probability, i.e., the chance you answer correctly. This allows, for example, reducing the interval for words with irregular spellings when learning a foreign language.

  • Duolingo’s algorithm is similar to Quizlet’s. Duolingo has millions of students who generate billions of statistics about language learning every day. Like Quizlet, Duolingo uses machine learning to predict how likely you are to remember any given word at any time. This is represented by the strength meter (still strong, pretty good, time to practice, overdue) below every lesson.

  • RemNote’s algorithm is customizable like Anki and most settings will look familiar to Anki users, especially after following this tutorial.

  • Memcode’s algorithm also uses SM-2.
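To contrast with Anki’s machinery, Mochi’s rule as described above fits in a couple of lines (the one-day floor below is my assumption):

```python
# Sketch of Mochi-style doubling/halving (floor of one day assumed).
def mochi_next(interval, correct):
    return interval * 2 if correct else max(1, interval // 2)

interval = 1
for answer in (True, True, True, False):
    interval = mochi_next(interval, answer)
# intervals: 1 -> 2 -> 4 -> 8 -> 4 days
```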

In my opinion, Anki is not perfect but there is no need to focus too much on optimizing it:

  • Adding more dimensions? What if you inadvertently review a card, for example when explaining the idea to a coworker? No algorithm can detect this and postpone the next review. No algorithm will ever be perfect.
  • Using machine learning? Applying the lessons from other learners is great for common datasets. For example, if most French users have trouble learning a particular English word, chances are future French users will need shorter intervals too. But what about custom-edited cards on subjects such as science, management, or parenting? What about your interest in any of these subjects? We remember more easily what we are passionate about. Machine learning excels when there are patterns, but learning is profoundly a personal, unique experience.

Therefore, I think we should focus more on optimizing our practices rather than the tools. Here are two key practices:

  • Devote time to understand. Learning is a three-step process: encoding, storage, and retrieval. Anki helps store information for a longer period by reviewing it (“use it or lose it”). But Anki depends on how well the encoding happened. You cannot learn something you haven’t understood first. Therefore, you must devote (a lot of) time to writing your own flashcards. A poor encoding process will make the best SRS algorithm useless.

  • Devote time to learn. Trying Anki is easy. Sticking to it is hard. Many users quickly abandon Anki, probably because its benefits only become visible after several years of making it a habit. And everyone knows changing habits is hard; otherwise, Atomic Habits would not have been the #1 best-selling book on Amazon last year. A lack of motivation will make the best SRS algorithm useless.

One last important thing,

Learning is one of the most enjoyable things in the world.

Piotr Wozniak

To Remember
  • A Spaced Repetition System (SRS) counteracts the effect of the forgetting curve. Memory decay is inevitable but can be influenced.
  • SRS systems can be implemented with or without a computer. The Leitner system remains popular.
  • SRS systems often target a retention close to 90% (= 10% of cards are answered incorrectly).
  • SuperMemo introduced the first SRS algorithm running on a computer (SM-2).
  • SM-2 continues to be used by most applications including Anki, despite having been abandoned in SuperMemo three decades ago.
  • Anki makes the SM-2 algorithm highly configurable and uses different queues to manage cards differently based on whether they are new, in learning, or simply in review.
  • Most algorithms use the item difficulty (known as the ease factor) to determine optimal intervals. SuperMemo goes well beyond and also uses memory stability and memory retrievability.
  • Recent SRS applications rely on machine learning to exploit the specificities of the learning materials (ex: English words with irregular spellings) and to use the information collected from their massive dataset of users to tune their algorithm. SuperMemo never chose this approach.
  • The perfect SRS algorithm will never exist. No algorithm can determine if you are passionate about a subject, or if you review by chance the content of a card at work during a discussion with a coworker (in which case an “ideal” algorithm must postpone the next review).
  • Creating great flashcards and making reviewing them a habit have probably a far bigger impact than any improvement in the SRS algorithm you use.

To Go Further
  • The Anki Website explains succinctly the main differences between its algorithm and SM-2.
  • Anki Database Structure: The most up-to-date guide to the Anki internal database schema, which was more than useful during the writing of this article.
  • The Ease Factor Problem: Interesting insight about the impact of changing the ease factor after a lapse.
  • A great video to introduce most of the notions covered in the Anki section.
  • Last but not least, the true history of spaced repetition: An extensive coverage of the subject by Piotr Wozniak. A reference.

About the author

Julien Sobczak works as a software developer for Scaleway, a French cloud provider. He is a passionate reader who likes to see the world differently to measure the extent of his ignorance. His main areas of interest are productivity (doing less and better), human potential, and everything that contributes to being a better person (including a better dad and a better developer).
