Why SQLite Forensics?
SQLite is a widely popular database format that is used extensively almost everywhere. Both iOS and Android devices employ SQLite as a storage format of choice, with built-in and third-party applications relying on SQLite to keep their data. A wide range of desktop applications use SQLite and even Microsoft has chosen this format for storing Windows 10 timeline data!
This is why every digital forensics investigator and cyber incident responder must know the SQLite features and peculiarities, as well as the proper tools to extract as much information from SQLite databases as possible.
In this article, we will review the most forensically interesting SQLite features, dangers of using a non-forensic tool for SQLite analysis and offer a set of requirements for a proper tool to use.
We will use Belkasoft X, a DFIR tool by Belkasoft to illustrate the concepts being discussed.
SQLite features of interest
The following SQLite features are important to be aware of and use in the course of your investigations:
- Freelists: a special area inside a SQLite database, which contains recently deleted data
- Journal and Write-Ahead Log (WAL) files: transactional files, which can store recently added or recently deleted data
- Unallocated space: not to be confused with hard drive unallocated space, this SQLite feature allows finding deleted data even outside of freelists
Recovering SQLite deleted data from freelists
What deleted data could be
There are many things which may cause a database record to be deleted. This is not just a user hitting the Delete button or swiping a line in their mobile chat application to the left.
Among other reasons that an item may become deleted:
- Application automated history cleaning (e.g., scheduled to be cleaned once a month)
- Application automated archiving history (older history is stored in a backup, not necessarily in a SQLite format, while it is cleaned from the original database file)
- A secret chat disappeared after a "time bomb" triggered. Many chat apps would store a secret message until a given amount of time has passed since it is received or read by a counterparty. After that it is deleted automatically
- Browser history can be cleared, whether manually or automatically
- A specialized application, which cleans up various traces, can be involved
- System data can be cleaned automatically by an operating system (for example, Timeline data for Windows)
This means that if you recover something from a SQLite database, it does not necessarily mean that that item was explicitly and intentionally deleted by a user.
Things to know about SQLite freelists
In a SQLite database, regular (meaning, not deleted and not being a part of an ongoing transaction) records are stored in a file called the "Database Image".
Things to know about the "database image":
- The database image is broken up into "pages" of a fixed size
- The size of each page is specified in the file header
- Typically, all pages are of the same size and this size is specified when the database is created
- Each page may have one of the many roles assigned by SQLite. One of these roles is storing the actual data
- Each page has a number assigned by the SQLite engine. Page numbering starts with 1
A SQLite database may have one or more unused pages. These pages may be created and reserved for future use to improve performance; in this case, they contain no meaningful data. Such unused pages in the database file are stored on what is called a "freelist".
Here is how SQLite.org explains the meaning of freelists:
"A database file might contain one or more pages that are not in active use. Unused pages can come about, for example, when information is deleted from the database. Unused pages are stored on the freelist and are reused when additional pages are required."
In a nutshell, when some data is deleted from a SQLite database, it is not deleted immediately; instead, it goes to a freelist.
There is an exception though: an option called "auto_vacuum", affects whether freelist exist in a database or not.
Auto_vacuum, secure delete, and how they affect forensic investigation
An "auto_vacuum" is one of the so-called "pragma statements", a SQLite-specific SQL extension, which affects the way SQLite works. Particularly, auto_vacuum specifies what happens with deleted data.
By default, this setting is set to "none", which means that auto-vacuum is disabled. Quoting sqlite.org, "When auto-vacuum is disabled and data is deleted data from a database, the database file remains the same size. Unused database file pages are added to a "freelist" and reused for subsequent inserts. So no database file space is lost."
This is good for a forensic investigator: by default, there are freelists in every SQLite database. However, there are other options.
If "auto_vacuum" is set to "full", no freelists are kept inside the database. Instead, upon every transaction commit, freelist pages are moved to the end of the database file, and after that the file is truncated to exclude these pages. It is also important to mention that during this process, the data itself is not wiped, meaning that potentially it can be carved and recovered. This may be useful if you are working with an unencrypted and uncompressed hard drive; while this will not work for a mobile device, unless you can make a physical dump and it is unencrypted, a pretty rare possibility these days.
"Secure_delete" is another important pragma statement. By default, this option is set to "off", which means that there is no secure delete employed. However, it is possible to set it to "on", which will dramatically affect the deletion operation from a forensics perspective. Doing this will enforce all deleted content to be overwritten with zeros (note: "fast" setting for this option may or may not do the same, depending on circumstances).
What do we mean by "possible to set on and off", who is in charge? The answer is: developers of the specific application you are dealing with. It is up to the developers who design and program their application, to decide on whether or not they employ secure deletion or automatic vacuum of their databases. This means that your possibility to recover deleted data may vary from application to application and depends on settings chosen by the corresponding application vendor.
What happens with freelist data next?
Pages that are deleted from a SQLite database with "auto_vacuum" set to "off", will be placed in a freelist. At this point in time, it is possible for a specialized tool to extract data, however, this possibility is not there forever:
- Freelist data can be wiped. This may happen from an application call to VACUUM SQLite statement, what will rebuild the database and truncate it, excluding unneeded freelist pages
- Freelist pages can be claimed by the database engine to store other data. This means that if the database is being actively used and new data is written to it, freelist pages may disappear more quickly
Until then, the deleted pages are sitting in the freelist.
Importance of using a specialized tool for freelist analysis
It is important to mention that freelist data is not accessible through the database engine via regular means (such as a standard SQLite API). Standard database viewers and SQLite components do not provide access to information stored in the freelist.
To give an example, the popular SQLite database browser: DB Browser for SQLite, will not show you this data. On the screenshot below you can see that DB Browser did not find any chats stored in the Messages table of the Skype database:
The database size is more than 300Kb, which gives investigators a hint that it should not be empty. At this point, they may think to use a specialized tool, but if not, they are going to lose lots of information potentially crucial for the case.
Belkasoft X implements a low-level SQLite database parsing without relying on standard APIs, which gives the tool an access to all of the database peculiarities, including freelists. On the screenshot below you can see the same database, opened in Belkasoft X's SQLite Viewer:
The product recovered as many as 53 chats, each one of them might have become a cornerstone in your case. This is why with SQLite forensics, you should opt-in for a specialized DFIR tool rather than a freely downloadable non-forensic utility.
Journal Files and Write Ahead Logs
While the freelist offers access to deleted SQLite pages, the write ahead log (or "WAL") can become a doorway to accessing records not yet committed into the main database.
But before we dive deep into WAL files, let us review the transaction file previously used by SQLite: journal files.
Historically, SQLite used journaling to guard against potential write errors. This mechanism was called "Rollback Journals". Every time the SQLite engine was about to write data to a page, the original content of that page was backed up into a separate journal file.
If the write operation concluded successfully, the engine would remove the journal file. If there was ever a problem writing data, or the write operation was interrupted for whatever reason, the journal file would remain on the disk. As a result, the next time the database is started, the engine would discover those journal files and roll the entire database back to its previous state by merging data from rollback journals into the main database.
Journal files existed before SQLite version 3.7.0.
Modern versions of SQLite employ a completely different journaling mechanism. Called "Write Ahead Log", it is basically an upside-down version of Rollback Journals.
With write ahead logs, the engine does not touch the main database file when a new or altered page is written. Instead, for the time being, new or altered data is stored in a separate file called Write Ahead Log.
Interestingly, the new data may remain in the WAL file for a very long time until committed by the "Checkpoint" event. Until then, the database will read new data from the WAL instead of using the main database file.
The Checkpoint event occurs automatically after the WAL size reaches a certain size (by default, this is 1000 pages). Changes can be committed via the database API or manually with the following SQL command: PRAGMA wal_checkpoint;
For many types of artifacts, one thousand records are a lot. An entire chat session may never trigger a single checkpoint. A few days of browsing may not even force the Checkpoint event. This means that for a few days, weeks, or even longer, the information in the write ahead log will remain uncommitted.
Importance of using a specialized tool for Journal and WAL analysis
Similar to freelists, neither Journal nor WAL data is accessible through the database engine via regular means (such as a standard SQLite API). However, there is a significant difference in how a non-specialized tool work with WAL files.
General database tools (e.g., DB Browser for SQLite) will attempt to commit write ahead logs before accessing data. What does this mean?
- If your media is writable, you are going to alter your evidence! You will see WAL records in your tool, but your entire results may not be accepted: the main SQLite database file will be changed, its hash value will not match the original, WAL file will disappear from the media
- If it is not writable, commit will not happen. The data will not get into the main database file and thus will not be shown in your product of choice. You will not see any uncommitted data
Both options could be a catastrophe for your case.
The worst thing here is that even digital forensic software tools may use standard SQLite drivers. This means you may experience the same consequences as noted above.
Belkasoft X supports low-level parsing for both Rollback Journal files and WAL files, without attempting to alter anything. This means that you will not alter your evidence but at the same time, will get maximum results available.
It is worth mentioning that a WAL may contain more than just new, uncommitted records. As an example, it can also contain altered records. This means that by analyzing the main database file along with the corresponding WAL file, you can see two versions of the same record, which may give you an extra bit of information (for example, a first version of a chat message before it was edited after sending).
Belkasoft X shows Rollback Journal and WAL items with a colorful tag in the Record type column:
Journal items have an aurous color
WAL records are highlighted with a magenta tag
A handy feature of Belkasoft X is that it merges items from regular SQLite records, as well as journal and WAL, in the same Artifacts view if those items originate an artifact. This means that if a chat or a browser link was recovered from a WAL or journal file (or a freelist), they will still be added to the correspondent node, such as Chats or URLs of the Artifacts window. This is indeed user-friendly given that not all users are necessarily aware of SQLite peculiarities and could miss important information otherwise.
A deleted Telegram chat is automatically merged along with existing chats by Belkasoft X. The "Is deleted" property conveniently shows that it was a deleted item.
SQLite unallocated space
There is one more interesting concept, every DFIR examiner must be aware of, when conducting SQLite forensics: SQLite unallocated. Though it sounds like a hard drive's unallocated space, it is quite different.
As mentioned above, a SQLite database is broken into pages. Some of these pages are what they call "leaf table B-trees", which contain the data. In turn, these leaf table B-trees contain cells. Normally, SQLite will attempt to place a new cell towards the end of the B-tree page. The newly added cells will reside towards the end of the page until they are allocated for a new cell, the space before the first cell starts is unallocated. Unallocated space can be empty, or it may contain remnants of previously used pages (for example, freelist pages).
To put it simply, unallocated space in SQLite databases are page fragments containing random chunks of data. This is similar to unallocated space on a regular hard drive, the difference is that there could be many unallocated areas in a database. These unallocated fragments do not contain valid data or pointers and represent areas of available space that are ready to accept new data, when it is written to the database.
Unlike freelists, unallocated space is not formally referenced from anywhere in a SQLite database. This is why it is very difficult and time-consuming to examine it. In most cases it is impossible to determine which page (and, correspondingly, which table) is used to contain these fragments.
Belkasoft X displays all areas of unallocated space in SQLite databases within its built-in SQLite Viewer. To view it, simply open the database of interest and navigate to "Unallocated space" tab, shown below all database tables:
"Raw data" tab shows all records found in SQLite unallocated
Most of these records are useless, but some may give you a hint of previously deleted data. The product can carve these areas for signatures of artifacts it supports (for example, chats and browsers), and if found, such results are shown under the tab called "Carved data":
It is worth mentioning that since the product cannot attribute these chunks to any profile and even, in a general case, an application, they will not be shown under the Artifacts window of the product.
Built-in SQLite Viewer: useful perks
SQLite Viewer window, built in Belkasoft X, has a number of useful features for a digital forensic investigator or cyber incident responder—apart from deleted data recovery. To name a few:
- Database properties view
- Table structure view
- Setting column type
- In-place picture BLOB preview
Database properties view
On the top left corner of the SQLite Viewer, you can find Properties node. With its help you can review basic properties of the opened database, such as page size, number of pages (including freelist), journal type and hash value:
Table structure view
By default, when you click on a table in the table list on the left, you will be presented with rows belonging to this table. However, you may want to review the table structure, including column types and constraints, such as 'Primary key' and 'Not null':
Setting column type
A table can contain various data which you can interpret right inside the SQLite Viewer. For example, if a datetime stamp is stored in Unix (Epoch) time, you can set the column type to present you the human-readable time by right clicking on the column header and selecting the proper type in the "Choose column type" menu:
Timestamp column is originally shown as an integer value, let us change that
Unix UTC time selected and is now shown
It is quite handy that the product remembers the column type selection and uses it for report generation, should you run it from within the SQLite Viewer.
In-place picture BLOB preview
One particularly useful column type is "Picture". Many applications, including WhatsApp, store picture previews ("thumbnails") right inside the database. In the screenshots above, you can see the "thumbnail" column on the left of "timestamp". The data is binary and actually presents a small version of a picture, sent via a WhatsApp message.
Belkasoft X conveniently shows these thumbnails in-place, right on each table row, if one of column types is set to "Picture":
Reviewing WhatsApp database, table "message_thumbnails": "thumbnail" column type is set to "Picture". You can preview the thumbnail right inside the grid!
Belkasoft X allows you to create a report for any SQLite table. You can right click inside the grid and then either "Create report for all items" or "Create report for checked items". There are multiple formats available, including XLSX, PDF, HTML, CSV and others. As mentioned above, column type selection is respected during the report generation.
In most cases the proper report format would be CSV or XLSX, since tables typically have too many columns to fit into a text format such as PDF or HTML. However, if you prefer one of the latter, you can choose only needed columns by adjusting set of columns in the "Advanced options" window on the "Columns" tab:
SQLite forensics: interesting real use cases
Articles on SQLite forensics typically review chat applications like Skype and WhatsApp, or browsers like Chrome and Chromium-based. Not to overlap with them, we will describe two other interesting cases.
First case: Windows 10 Timeline
The author of this article has analyzed his own traveling laptop, which has Windows 10 installed, with Belkasoft X. The laptop was used extensively for a couple of years and thus had a rich Windows 10 Timeline history.
The ratio of deleted and transactional records vs. regular SQLite records is quite amazing: out of total 6430 records, found by Belkasoft X, almost 45% (2842) were freelist items! Besides, 117 items fell into journal file category:
A real Windows 10 Timeline database typically has numerous deleted and transactional items
Windows 10 Timeline is a good example of a SQLite database, having many deleted items without a user explicitly deleting them.
Using a non-forensic tool or a forensic tool with poor SQLite support (e.g. employing standard SQLite API) may result in you losing up to 50% of meaningful Timeline data!
More on Windows 10 Timeline analysis you can find in our article "How to Analyze Windows 10 Timeline with Belkasoft Evidence Center".
Second case: recovering a SQLite database from memory
In this case we had a memory dump taken with the help of our free Windows memory capturing tool, Belkasoft Live RAM Capturer, from a running Windows 7 machine. The dump was analyzed with Belkasoft X with an option to carve for all signatures supported by the product, and to extract memory processes.
As a result, the software extracted 66 processes, including several Chrome.exe processes:
From inside one of the running Chrome instances, some SQLite databases were carved:
Located under the "Other files" node, 13 SQLite databases are shown, which were found in the memory of this particular Chrome process
Opened in a built-in SQLite Viewer, one of databases gave the following result:
It looked like the database was heavily overwritten: it had just two tables with a small number of records. When we looked inside one of the tables, we found out that all six records recovered by the product were actually deleted.
By mentioning this example, we would like to show the importance of a forensic tool like Belkasoft X in your toolset. Imagine, how much time and knowledge one will have to have to repeat the same analysis manually:
- Step 1: Analyze memory dump and extract processes
- Step 2: Inside every memory process look for a signature for a SQLite database and carve out all of those databases
- Step 3: Recover database schema (standard tools may fail on this step because the database is not completely valid, being partially overwritten)
- Step 4: For one of the tables, find freelist data and recover it (standard tools will not find freelist data)
Or you can just enjoy a fully automated analysis process.
An in-depth analysis of SQLite databases is needed for almost every DFIR investigation because this format is so widespread on both computer and mobile devices. The SQLite format has a number of interesting peculiarities, including freelist pages, Rollback Journal, Write Ahead Log and unallocated space.
All of them must be considered when it comes to the analysis of SQLite, otherwise you will be losing deleted data, not yet committed data, older versions of updated data and so on.
For some applications, the volume of non-regular SQLite data may reach 50% (for example, Windows 10 Timeline). We have seen databases with up to 95% of freelist data, and many of competitor tools were failing to even open such SQLites.
Having a proper DFIR tool with in-depth SQLite support, not relying on standard SQLite API, is an absolute must for your toolset. One such tool is Belkasoft X, which offers solid support for parsing regular and deleted SQLite data, shows WAL and journal records, recovers and carves SQLite unallocated records, and can even process databases from RAM.