Aura Books

How chapters work in MP3 audiobooks (and why most players miss them)

MP3 was not designed for audiobooks. Chapter markers were retrofitted into the format through ID3v2 metadata frames, and the result is a mess that almost every player handles differently. Here is what’s really going on, and how to make chapters work reliably.

The two ways MP3 audiobooks “have” chapters

  1. Folder-as-book. The most common pattern in the wild: each chapter is its own MP3 file, named 01 - Prologue.mp3, 02 - The Discovery.mp3, and so on. There are no chapter frames inside the files; the “chapters” are just the file boundaries. This works in any player that supports folder imports.
  2. One file with embedded markers. A single MP3 covers the whole book, and CHAP frames in its ID3v2 tag point to each chapter’s start and end time. This is rare in distributed audiobooks but is what tools like m4b-tool and ffmpeg produce when you ask for embedded chapters in MP3.

What ID3v2 CHAP frames actually contain

ID3v2 (the tagging format that replaced the original ID3v1 tag) gained a frame type called CHAP through the ID3v2 Chapter Frame Addendum. A CHAP frame holds:

  • An element ID — an opaque string that identifies the chapter within the file.
  • Start time and end time in milliseconds.
  • Start offset and end offset in bytes (almost always 0xFFFFFFFF, meaning “ignore me, use the times”).
  • Sub-frames for the chapter title (TIT2), an optional subtitle (TIT3), and even chapter art (APIC).
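
As a rough illustration of that layout, here is a minimal Python sketch that splits a CHAP frame body into its fields. It assumes you already have the frame body with the 10-byte frame header removed; the function name parse_chap_body and the sample bytes are made up for this example.

```python
import struct

def parse_chap_body(body: bytes) -> dict:
    """Split a CHAP frame body into its fixed fields and raw sub-frame bytes."""
    # Element ID: Latin-1 text terminated by a single null byte.
    end = body.index(b"\x00")
    element_id = body[:end].decode("latin-1")
    # Four big-endian unsigned 32-bit integers follow the terminator.
    start_ms, end_ms, start_off, end_off = struct.unpack_from(">IIII", body, end + 1)
    return {
        "element_id": element_id,
        "start_ms": start_ms,
        "end_ms": end_ms,
        # 0xFFFFFFFF means "no byte offset, use the times instead".
        "start_offset": None if start_off == 0xFFFFFFFF else start_off,
        "end_offset": None if end_off == 0xFFFFFFFF else end_off,
        # Whatever remains is a run of ordinary ID3v2 frames (TIT2, TIT3, APIC...).
        "sub_frames": body[end + 1 + 16:],
    }

# Hypothetical sample: chapter "ch1" running from 0 ms to 272000 ms, no sub-frames.
sample = b"ch1\x00" + struct.pack(">IIII", 0, 272000, 0xFFFFFFFF, 0xFFFFFFFF)
chapter = parse_chap_body(sample)
print(chapter["element_id"], chapter["start_ms"], chapter["end_ms"])  # ch1 0 272000
```

The sub_frames bytes are left unparsed here on purpose: they use the same frame layout as the rest of the tag, so the same frame walker can be reused on them.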

A separate frame, CTOC (table of contents), groups CHAP frames into an ordered list and marks one TOC as the “top level” for the file. A player that wants to read chapters from an MP3 needs to:

  1. Find the ID3v2 tag: the bytes ID3 at the start of the file or, for a tag appended at the end (allowed in ID3v2.4), the 3DI footer.
  2. Walk the frames until it finds a top-level CTOC.
  3. Follow the element IDs into matching CHAP frames.
  4. Read the embedded TIT2 sub-frame for each chapter’s title.

None of this is technically hard, but it requires parsing frames nested inside other frames, which the flat happy path of a typical music tagger (title, artist, album) never exercises. That is the structural reason chapter support is so spotty in MP3.
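
The four reading steps listed above can be sketched in a few dozen lines of Python. This is a simplified illustration, not a full parser: it assumes a tag prepended to the file, ID3v2.3 or 2.4 only, no extended header, and no unsynchronisation. All function names and the demo tag are invented for the example.

```python
import struct

ENCODINGS = {0: "latin-1", 1: "utf-16", 2: "utf-16-be", 3: "utf-8"}

def synchsafe(b: bytes) -> int:
    """Decode a 4-byte synchsafe integer (7 useful bits per byte)."""
    return (b[0] << 21) | (b[1] << 14) | (b[2] << 7) | b[3]

def iter_frames(data: bytes, major: int):
    """Yield (frame_id, body) for a run of frames; also works on CHAP sub-frames."""
    pos = 0
    while pos + 10 <= len(data):
        frame_id = data[pos:pos + 4]
        if not frame_id.strip(b"\x00"):
            break  # reached padding
        raw = data[pos + 4:pos + 8]
        # v2.4 frame sizes are synchsafe, v2.3 sizes are plain big-endian.
        size = synchsafe(raw) if major == 4 else struct.unpack(">I", raw)[0]
        yield frame_id.decode("latin-1"), data[pos + 10:pos + 10 + size]
        pos += 10 + size

def split_id(body: bytes):
    """Split a null-terminated Latin-1 element ID from the rest of the body."""
    end = body.index(b"\x00")
    return body[:end].decode("latin-1"), body[end + 1:]

def chapter_title(sub_frames: bytes, major: int):
    """Step 4: read the TIT2 sub-frame, if there is one."""
    for frame_id, body in iter_frames(sub_frames, major):
        if frame_id == "TIT2" and body:
            codec = ENCODINGS.get(body[0], "latin-1")
            return body[1:].decode(codec, errors="replace").rstrip("\x00")
    return None

def read_chapters(tag: bytes):
    """Return [(start_ms, end_ms, title), ...] in top-level CTOC order."""
    # Step 1: find the header (sketch only handles a prepended v2.3/v2.4 tag).
    if tag[:3] != b"ID3" or tag[3] not in (3, 4) or tag[5] & 0x40:
        return []
    major = tag[3]
    frames = list(iter_frames(tag[10:10 + synchsafe(tag[6:10])], major))
    chaps = {}
    for frame_id, body in frames:
        if frame_id == "CHAP":
            element_id, rest = split_id(body)
            start_ms, end_ms = struct.unpack_from(">II", rest)
            chaps[element_id] = (start_ms, end_ms, chapter_title(rest[16:], major))
    # Step 2: look for a CTOC whose top-level flag (0x02) is set.
    for frame_id, body in frames:
        if frame_id == "CTOC":
            _, rest = split_id(body)
            flags, count, rest = rest[0], rest[1], rest[2:]
            if flags & 0x02:
                ordered = []
                for _ in range(count):  # Step 3: follow the element IDs.
                    child, rest = split_id(rest)
                    if child in chaps:
                        ordered.append(chaps[child])
                return ordered
    return []

def _frame(frame_id: bytes, body: bytes) -> bytes:
    """Build a v2.4 frame for the demo tag below."""
    n = len(body)
    size = bytes([(n >> 21) & 0x7F, (n >> 14) & 0x7F, (n >> 7) & 0x7F, n & 0x7F])
    return frame_id + size + b"\x00\x00" + body

# Hypothetical one-chapter tag, built by hand so the sketch runs without a file.
_times = struct.pack(">IIII", 0, 272000, 0xFFFFFFFF, 0xFFFFFFFF)
_chap = _frame(b"CHAP", b"ch1\x00" + _times + _frame(b"TIT2", b"\x03Prologue"))
_ctoc = _frame(b"CTOC", b"toc\x00\x03\x01ch1\x00")
demo_tag = b"ID3\x04\x00\x00" + _frame(b"", _chap + _ctoc)[:4] + _chap + _ctoc
chapters = read_chapters(demo_tag)
print(chapters)
```

A player that accepts any TOC rather than only a top-level one would simply drop the flags check; a player that needs titles would substitute “Chapter N” wherever the title comes back as None.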

Why most players ignore CHAP frames

For a music player, chapter frames are wasted complexity, because songs don’t have chapters. Tagging libraries such as Mutagen and TagLib can parse CHAP and CTOC, but the consumer player on top usually doesn’t read them. Apple Music, Spotify, Plexamp, foobar2000 in its default config: none of them surface CHAP chapters from an MP3 audiobook.

The players that do read them are almost all audiobook-specific: Smart AudioBook Player, BookFusion, Voice Audiobook Player on Android; Bound and BookPlayer on iOS; and Aura Books on the web. Even within that group, behavior varies: some require the CTOC to be flagged as “top level”, others accept any TOC; some need every CHAP frame to have a non-empty title, others fall back to “Chapter 1, 2, 3”.

How Aura Books handles each case

When you import an MP3 (or a folder of MP3s) we run three strategies in order:

  1. We read the ID3v2 tag with music-metadata-browser. If the file has a top-level CTOC that references CHAP frames, we use those as chapters.
  2. If we received multiple files in one import, we treat each file as a chapter, with the title taken from the TIT2 tag if present, otherwise from the filename with track-number prefixes stripped.
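  3. If a chapters.txt file accompanies the audio in the same folder, we parse it in the one-line-per-chapter “simple chapters” format that m4b-tool and similar tools produce. Note that mkvmerge uses the same name for a different layout built from CHAPTER01= and CHAPTER01NAME= pairs.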
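
To make the second strategy concrete, here is a small Python sketch of the “TIT2 if present, otherwise cleaned-up filename” rule. It is an illustration of the idea rather than Aura Books’ actual code; the regular expression and both function names are assumptions made for this example.

```python
import re
from pathlib import Path

# A leading track number followed by a separator or whitespace:
# "01 - ", "02_", "3. ", "004 ". A bare number such as "1984" is left alone.
_TRACK_PREFIX = re.compile(r"^\s*\d{1,3}(?:\s*[-_.)]+\s*|\s+)")

def title_from_filename(name: str) -> str:
    """Drop the extension and any track-number prefix from a filename."""
    stem = Path(name).stem
    title = _TRACK_PREFIX.sub("", stem).strip()
    return title or stem  # never return an empty title

def chapter_titles(files):
    """files: list of (filename, tit2_or_None) already in playback order."""
    return [tit2 or title_from_filename(name) for name, tit2 in files]

titles = chapter_titles([
    ("01 - Prologue.mp3", None),
    ("02 - The Discovery.mp3", "Chapter 1: The Discovery"),
])
print(titles)  # ['Prologue', 'Chapter 1: The Discovery']
```

Requiring a separator after the digits is a deliberate choice: without it, a file named 1984.mp3 would lose most of its title.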

The “simple chapters” format

If you’re adding chapters to an MP3 audiobook by hand, this is the easiest path. Create a text file next to your MP3 named chapters.txt with one chapter per line:

00:00:00.000 Prologue
00:04:32.000 Chapter 1 — The Discovery
00:31:18.500 Chapter 2 — Aftermath
01:02:44.000 Chapter 3 — A Letter

Aura Books and most other audiobook players will read this directly. The format is forgiving: timestamps can be HH:MM:SS or HH:MM:SS.mmm, and the title is everything after the first space.
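
A parser for this format fits in a dozen lines. The Python sketch below follows the rules just described (optional milliseconds, title after the first space); the function name and the choice to skip unparseable lines silently are assumptions for this example, not a description of any particular player.

```python
import re

# HH:MM:SS with optional .mmm, then whitespace, then the title.
_LINE = re.compile(r"^(\d+):([0-5]?\d):([0-5]?\d)(?:\.(\d{1,3}))?\s+(.*)$")

def parse_chapters_txt(text: str):
    """Return [(start_ms, title), ...]; lines that do not match are skipped."""
    chapters = []
    for line in text.splitlines():
        match = _LINE.match(line.strip())
        if not match:
            continue
        hours, minutes, seconds, frac, title = match.groups()
        # Pad the fraction so ".5" means 500 ms rather than 5 ms.
        millis = int((frac or "0").ljust(3, "0"))
        total = (int(hours) * 3600 + int(minutes) * 60 + int(seconds)) * 1000 + millis
        chapters.append((total, title.strip()))
    return chapters

sample = """00:00:00.000 Prologue
00:04:32 Chapter 1 - The Discovery
00:31:18.500 Chapter 2 - Aftermath
01:02:44.000 Chapter 3 - A Letter"""
parsed = parse_chapters_txt(sample)
print(parsed[2])  # (1878500, 'Chapter 2 - Aftermath')
```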

Adding CHAP frames properly (with mid3v2)

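If you want the chapters baked into the MP3 itself, the Python Mutagen package is the most direct route. Its mid3v2 command-line tool is built mainly around text, URL, comment, and picture frames, so treat the --CHAP and --CTOC syntax below as illustrative and check mid3v2 --help on your installed version before relying on it. The command looks roughly like: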

pip install mutagen
mutagen-inspect "book.mp3"          # see what’s already there
mid3v2 --CHAP="ch1,0,272000,Prologue" book.mp3
mid3v2 --CHAP="ch2,272000,1878500,Chapter 1" book.mp3
# ...add one --CHAP per chapter, then write the top-level TOC:
mid3v2 --CTOC="toc,top,ch1,ch2,ch3" book.mp3

The first argument is the element ID, then start and end times in milliseconds, then the chapter title. The CTOC marks the table of contents as top-level (top) and lists element IDs in order.
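
If your mid3v2 build rejects these options, the same frames can be written through Mutagen’s Python API, which does define CHAP and CTOC frame classes. The sketch below is one way to do it under some assumptions: you supply the chapter start times and the total duration in milliseconds yourself, element IDs are generated as ch1, ch2, and so on, and any existing CHAP/CTOC frames are replaced. The helper names chapter_spans and write_chapters are invented for this example.

```python
def chapter_spans(starts_ms, total_ms):
    """Pair each chapter start with the next start (or the end of the file)."""
    ends = list(starts_ms[1:]) + [total_ms]
    return list(zip(starts_ms, ends))

def write_chapters(path, chapters, total_ms):
    """chapters: list of (start_ms, title). Requires `pip install mutagen`."""
    # Imported here so the pure helper above works without Mutagen installed.
    from mutagen.id3 import ID3, CHAP, CTOC, CTOCFlags, TIT2, ID3NoHeaderError

    try:
        tags = ID3(path)
    except ID3NoHeaderError:
        tags = ID3()  # file has no ID3v2 tag yet
    # Assumption: we own the chapter list, so drop whatever was there before.
    tags.delall("CHAP")
    tags.delall("CTOC")

    spans = chapter_spans([start for start, _ in chapters], total_ms)
    element_ids = []
    for number, ((start, end), (_, title)) in enumerate(zip(spans, chapters), 1):
        element_id = f"ch{number}"
        element_ids.append(element_id)
        tags.add(CHAP(
            element_id=element_id,
            start_time=start,
            end_time=end,
            start_offset=0xFFFFFFFF,  # "ignore me, use the times"
            end_offset=0xFFFFFFFF,
            sub_frames=[TIT2(encoding=3, text=[title])],
        ))
    tags.add(CTOC(
        element_id="toc",
        flags=CTOCFlags.TOP_LEVEL | CTOCFlags.ORDERED,
        child_element_ids=element_ids,
        sub_frames=[TIT2(encoding=3, text=["Table of Contents"])],
    ))
    tags.save(path)

spans = chapter_spans([0, 272000, 1878500], 3764000)
print(spans)
```

A call such as write_chapters("book.mp3", [(0, "Prologue"), (272000, "Chapter 1")], total_ms) would then write both frame types in one pass; running mutagen-inspect on the file afterwards is a quick way to confirm what landed in the tag.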

When in doubt, use folder-as-book

All of the above is helpful if you’re a power user. For most people the practical advice is simpler: keep your MP3 audiobooks as folders, one MP3 per chapter, with the chapter title in the filename. Every modern audiobook player handles that layout, and you can re-import the same folder into a new device in a few seconds. The CHAP/CTOC route is only worth it if you specifically need single-file portability.
