ufraan

BitTorrent Internals

A practical explanation of BitTorrent internals: torrent files, bencoding, trackers, peer discovery, pieces, seeders, and leechers.

2026-05-01 · 12 min read

BitTorrent is a decentralized peer-to-peer (P2P) file-sharing protocol designed for fast, efficient distribution of large files over the internet.

Let's first see how we classically download files from the internet, and why we even need something like BitTorrent.

Client-server download model: one server serving multiple clients

The Problem with Client-Server Downloads

The client requests a file from the server; the server has the file and responds. But things get interesting when the download is large or many clients want it at once:

  • Server bandwidth is limited, so as more clients connect, speed slows down.
  • Speed of data transfer is capped by the server's upload capacity.
Alice downloading from Bob — limited by Bob's upload speed

If Bob's upload speed is 60 Mbps, then no matter how fast Alice's download speed is, the overall download speed cannot exceed 60 Mbps.

Peer-to-Peer Networks

In a P2P network, every party participating in the network has the exact same capabilities: they are all equal peers and can initiate conversations with each other.

The main highlight of P2P: even if a few nodes crash or are removed, the network keeps serving its purpose. No single point of failure.

This isn't just about outages; it also applies to the core service the network provides. For example, if the network's job is to serve files and one machine goes down, the other machines still share those files with whoever needs them. Service continues uninterrupted as long as enough peers holding the data stay online.

P2P networks come in two flavors:

  • Pure P2P: No central entity. Every node can connect to every other node.
  • Hybrid P2P: Has a central entity, used to share metadata about the data across peers — not the data itself.
Pure vs hybrid P2P architecture comparison

In the hybrid model, if the central entity goes down, the network and its services are affected, because peers lose their way of finding each other. This hybrid P2P architecture is what powers BitTorrent.

BitTorrent has a central entity called a tracker. Peers talk to each other, but to know who to talk to, they first consult the tracker.

Core Idea

The core idea of BitTorrent is to download a file from multiple machines concurrently.

We saw that download speed is limited by the upload capacity of the sender — be it a user, a server, or anything else. If you can download at 100 Mbps but the sender can only upload at 60 Mbps, you'll max out at 60 Mbps.

But what if instead of downloading from one machine, we distributed the file across the network and connected to 50 different clients simultaneously to download in parallel? That's the idea behind BitTorrent.

Multiple peers downloading different pieces of a file in parallel
  • Faster downloads.
  • Upload load is distributed among peers. Every peer may hold some fragment of the file and can serve it to others. You still get high download speeds, but the upload burden is shared across the network.
  • A large number of downloads puts only a small load on each peer, because it's highly distributed.
  • Breaking a file into smaller chunks boosts concurrency.
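
The bandwidth argument above comes down to simple arithmetic. The sketch below is illustrative only: `effective_rate_mbps` is a hypothetical helper, and it assumes every peer can devote its full upload capacity to you, which a real swarm does not guarantee.

```python
def effective_rate_mbps(download_mbps, peer_upload_mbps):
    """Upper bound on transfer rate: the senders' combined upload,
    capped by the receiver's own download capacity."""
    return min(download_mbps, sum(peer_upload_mbps))

# One sender uploading at 60 Mbps caps a 100 Mbps downloader at 60.
single = effective_rate_mbps(100, [60])

# Five peers uploading 30 Mbps each could saturate the same downloader.
parallel = effective_rate_mbps(100, [30] * 5)

print(single, parallel)
```

With a single sender the bottleneck is the sender's upload; with several senders the bottleneck shifts to the downloader's own link.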

A Simplified Download Flow

When a user wants to download a file, they sniff around the network to find peers that have the pieces. For this, they use a tracker.

The user goes to the tracker and says "I want this file." The tracker responds with a list of peers that have it. The user then connects directly to those peers and downloads the file.

Peer connecting to tracker and then to other peers for file chunks

Let's say a user wants a file that has 4 chunks. They go to the tracker, and the tracker responds with a list of peers sharing that file. The tracker does not know which peer holds which chunk, so the user connects to those peers, learns from each one which chunks it has, downloads each chunk, and concatenates them locally to get the full file.
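
As a rough illustration of this flow, here is a toy simulation with no networking at all. Everything in it is made up for the example: `SWARM`, `ask_tracker`, and `download` are hypothetical names, the swarm lives in a dictionary, and chunks are fetched one after another for clarity rather than in parallel.

```python
# Hypothetical in-memory swarm: peer name -> {chunk index: chunk bytes}.
SWARM = {
    "peer-a": {0: b"Bit", 1: b"Tor"},
    "peer-b": {1: b"Tor", 2: b"ren"},
    "peer-c": {3: b"t!"},
}

def ask_tracker():
    """Stand-in for the tracker: it returns who is in the swarm, not data."""
    return list(SWARM)

def download(num_chunks):
    peers = ask_tracker()
    chunks = []
    for index in range(num_chunks):
        # Find any peer that holds this chunk and fetch it from them.
        source = next(p for p in peers if index in SWARM[p])
        chunks.append(SWARM[source][index])
    # Concatenate the chunks in order to rebuild the file.
    return b"".join(chunks)

result = download(4)
print(result)
```

Note that the tracker stand-in only names peers; the chunk data always comes from the peers themselves.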

Key Terminology

These terms come in handy when doing a deep dive into the algorithms.

1. Pieces and Blocks

A file shared on the BitTorrent network is split into pieces, and a piece is what peers serve. Each piece is further split into blocks, with one block transferred per request.

For example, a 16 MB piece would be split into 1,024 blocks of 16 KB each. A piece cannot be served to others until all of its blocks have arrived, because it is verified as a whole against its hash. The local client concatenates pieces to reconstruct the original file.
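
To make the piece/block arithmetic concrete, the sketch below lists the (offset, length) pairs a client would request to fetch one piece. `block_requests` is a hypothetical helper, and the 16 KB block size is taken from the example above rather than from any particular client.

```python
BLOCK_SIZE = 16 * 1024  # 16 KB, the block size used in the example above

def block_requests(piece_length, block_size=BLOCK_SIZE):
    """Return (offset, length) pairs covering one piece, one per request."""
    requests = []
    offset = 0
    while offset < piece_length:
        # The final block may be shorter if the piece is not a multiple
        # of the block size.
        length = min(block_size, piece_length - offset)
        requests.append((offset, length))
        offset += length
    return requests

requests = block_requests(16 * 1024 * 1024)
print(len(requests))   # number of block requests for a 16 MB piece
print(requests[:2])
```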

File split into pieces, each piece split into blocks

2. Peer Set

A list of peers that a node can send pieces to or request pieces from. For example, A's peer set (as given by the tracker) might be C and E. So if A wants to send or receive a file, it does so through C and E.

Peer set diagram showing A connected to C and E

3. Active Peer Set

A subset of the peer set that you're actively transferring data with. You might get 50 peers from the tracker, but you won't connect to all 50 at once — only some of them. To keep overall network traffic in check, the active peer set is kept smaller (e.g., 10 out of 50).
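
A minimal sketch of narrowing a peer set down to an active peer set. Picking peers at random is purely an assumption made for illustration; real clients choose and rotate their active peers with more elaborate policies. `choose_active_peers` is a hypothetical name.

```python
import random

def choose_active_peers(peer_set, max_active=10, rng=random):
    """Pick at most max_active peers from the peer set to connect to."""
    peers = list(peer_set)
    return rng.sample(peers, min(max_active, len(peers)))

peer_set = [f"peer-{i}" for i in range(50)]  # hypothetical tracker response
active = choose_active_peers(peer_set)
print(len(active))
```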

Active peer set as a subset of the full peer set

4. Seeders & Leechers

  • Seeder: A peer that has all the pieces of a file and is actively sharing them in the network.
  • Leecher: A peer that is currently downloading.
Seeder vs leecher comparison diagram

A large number of seeders results in faster download speeds, since you get multiple concurrent uploads. If a torrent has only one seeder and you are the only downloader, it essentially becomes a classic client-server download. And if leechers far outnumber seeders, download speeds will take a hit.

BitTorrent is Popularity-Friendly

New and popular files will have many seeders and download faster. Old or unpopular files have fewer seeders and download slower.

For example, when a new version of an operating system is released, many people will want to download it. Ubuntu and Debian offer official torrent distributions, and there will be many seeders — so whoever wants to download gets fast speeds.

Applications of BitTorrent

  1. Downloading Linux distributions (often faster than a single FTP or HTTP server), large software, movies, games, and more.
  2. Sending patches to users (e.g., security patches). You can run a small BitTorrent-based system where you drop a file into one node and it automatically distributes across every machine in your network. Some large data centers use this approach to distribute security patches.
  3. Facebook uses BitTorrent to power massive deployments and distribute build artifacts across servers. Instead of thousands of servers all downloading a binary from one source, it splits the file across multiple places. The network gradually converges and every node ends up with the full file.

The Torrent File

To download or upload any file from the torrent network, you need a .torrent file. This file holds metadata about the file you want to download.

For example, if you want to download Ubuntu from the torrent network, the Ubuntu ISO has a corresponding .torrent file. You download that small file, which contains all the metadata, and your client uses it to fetch the actual ISO from the network.

Torrent file shown as bridge between user and P2P network

Lifecycle of a Torrent File

Seeders are seeding data in the network, and as long as at least one seeder is serving the file, the torrent is alive. Otherwise, the torrent is dead.

It's therefore very important to have at least one seeder — otherwise nobody can download the file.

User downloading torrent file via HTTP, then joining the P2P network

What separates BitTorrent from a typical blockchain or cryptocurrency network is that there's no built-in incentive for anyone to stay on as a seeder once their download finishes. Cryptocurrency rewards participation in the network; BitTorrent doesn't.

What Does the Torrent File Hold?

The torrent file is static — no matter when you download it, it will always have the same content. It holds metadata about the file, not the actual data.

A torrent file is essentially a dictionary of key-value pairs:

  1. announce: URL of the tracker. This tells your torrent client which tracker to contact to find peers in the network.
  2. created by: Name and version of the program that created the torrent.
  3. creation date: Creation timestamp in Unix epoch.
  4. encoding: Encoding used for strings in the info dictionary. Defaults to UTF-8.
  5. comment: Optional comment from the author.
  6. info: A dictionary describing the file(s) of the torrent. For example, if you're downloading Ubuntu, it would contain information about the Ubuntu image itself.

BitTorrent supports two types of downloads — single-file and multi-file — and the structure of the info dictionary varies depending on which one is used.

Single-file torrent info dictionary structure
Multi-file torrent info dictionary structure
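
Since the two layouts are easier to see than to describe, here is a sketch of both as Python dictionaries, using the key names from the BitTorrent specification. The tracker URL, file names, sizes, and the all-zero `pieces` values are placeholders invented for the example, and `total_size` is a hypothetical helper.

```python
# Single-file torrent: "info" describes one file directly.
single_file = {
    "announce": "http://tracker.example.com/announce",  # hypothetical URL
    "info": {
        "name": "ubuntu.iso",          # suggested file name
        "length": 3 * 1024 * 1024,     # file size in bytes
        "piece length": 1024 * 1024,   # bytes per piece
        "pieces": b"\x00" * 60,        # 3 placeholder 20-byte SHA-1 hashes
    },
}

# Multi-file torrent: "name" is the top-level directory and "files"
# lists each file with its size and path components.
multi_file = {
    "announce": "http://tracker.example.com/announce",
    "info": {
        "name": "my-album",
        "piece length": 1024 * 1024,
        "pieces": b"\x00" * 40,        # 2 placeholder hashes
        "files": [
            {"length": 1500000, "path": ["disc1", "track01.flac"]},
            {"length": 500000, "path": ["cover.jpg"]},
        ],
    },
}

def total_size(info):
    """Total payload size in bytes for either layout."""
    if "files" in info:
        return sum(f["length"] for f in info["files"])
    return info["length"]

print(total_size(single_file["info"]), total_size(multi_file["info"]))
```

The practical difference is that a single-file info dictionary carries `length` directly, while a multi-file one replaces it with a `files` list.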

File Data Information

The info dictionary also stores information about the pieces:

  1. piece length: Number of bytes in each piece.
  2. pieces: 20-byte SHA1 hash values for each piece, concatenated together.
Pieces stored with their SHA1 hashes in the torrent file

Every piece has the same size, given by piece length, except possibly the last one, which holds whatever remains. For example, a 1 GB file with a piece size of 1 MB would have 1024 pieces. The torrent file doesn't store the actual piece data; instead, for each piece it stores a 20-byte SHA1 hash, and all the hashes are concatenated together.
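
The sketch below shows how such a `pieces` value could be produced and how a downloaded piece could be checked against it. `hash_pieces` and `verify_piece` are hypothetical helper names, and the 2,500-byte "file" with 1,000-byte pieces is a toy stand-in for real sizes.

```python
import hashlib

def hash_pieces(data, piece_length):
    """Concatenate the 20-byte SHA-1 digest of each piece."""
    digests = []
    for start in range(0, len(data), piece_length):
        digests.append(hashlib.sha1(data[start:start + piece_length]).digest())
    return b"".join(digests)

def verify_piece(index, piece, pieces_field):
    """Check a downloaded piece against its expected hash."""
    expected = pieces_field[index * 20:(index + 1) * 20]
    return hashlib.sha1(piece).digest() == expected

data = b"x" * 2500                      # toy "file"
pieces_field = hash_pieces(data, piece_length=1000)
print(len(pieces_field) // 20)          # number of pieces
```

Because each hash sits at a fixed 20-byte offset, a client can verify any piece on its own, in whatever order the pieces arrive.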

Bencoding: The Torrent File Format

Torrent files are not JSON. They use a custom encoding format called **bencoding**.

When you open a .torrent file in a client like qBittorrent, the client first decodes the bencoded file to extract the metadata. The component that does this is called a bencoding decoder.

Bencoding Specification

Every torrent file is a bencoded dictionary. The bencoding specification supports only 4 data types: strings, integers, lists, and dictionaries.

Bencoding format breakdown showing how data types are encoded

So the entire torrent file is a bencoded dictionary.
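
For reference, the four types are written as follows: a string is `<length>:<bytes>` (for example `4:spam`), an integer is `i<number>e` (`i42e`), a list is `l<items>e`, and a dictionary is `d<key><value>...e` with string keys. Below is a minimal decoder sketch in Python. It is a teaching example with almost no validation, not a production parser, and `bdecode` and `_decode` are names chosen for this example.

```python
def bdecode(data):
    """Decode one bencoded value from bytes."""
    value, end = _decode(data, 0)
    if end != len(data):
        raise ValueError("trailing data after bencoded value")
    return value

def _decode(data, i):
    """Decode the value starting at offset i; return (value, next offset)."""
    lead = data[i:i + 1]
    if lead == b"i":                       # integer: i<number>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if lead == b"l":                       # list: l<items>e
        items, i = [], i + 1
        while data[i:i + 1] != b"e":
            item, i = _decode(data, i)
            items.append(item)
        return items, i + 1
    if lead == b"d":                       # dictionary: d<key><value>...e
        result, i = {}, i + 1
        while data[i:i + 1] != b"e":
            key, i = _decode(data, i)
            value, i = _decode(data, i)
            result[key] = value
        return result, i + 1
    if lead.isdigit():                     # string: <length>:<bytes>
        colon = data.index(b":", i)
        length = int(data[i:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError(f"unexpected byte at offset {i}")

meta = bdecode(b"d8:announce3:url4:infod6:lengthi10eee")
print(meta)
```

Strings come back as raw `bytes` rather than text, which matters for fields like `pieces` that hold binary hashes.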

I wrote a bencoding decoder in Go, which helped me understand the format much better: bencode-foo

BitTorrent Architecture

The BitTorrent architecture consists of four entities:

  1. The .torrent file
  2. Trackers
  3. Seeders
  4. Leechers

Pieces

Whenever a file is shared on the BitTorrent network, it's not shared in its entirety. It's first broken into pieces, which become the unit of transmission.

The downloader gets these pieces and concatenates them locally to form the complete file. All pieces are the same length, except possibly the last one.

For example, a 3 MB file with a piece size of 1 MB creates 3 pieces: p1, p2, p3.

Pieces of a file being transferred across the P2P network

When you join the network and download a piece from a seeder, your client verifies it and then tells the peers it is connected to: "I have this piece now. If anyone needs it, come to me instead."

As each peer completes a piece, it informs its connected peers. This is the power of P2P.
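
Here is a toy model of that announcement, with no networking involved. In the real wire protocol this is a `have` message sent to connected peers; the `Peer` class and its method names below are invented for illustration.

```python
class Peer:
    """Toy peer that tracks which pieces it and its neighbours hold."""

    def __init__(self, name, num_pieces):
        self.name = name
        self.have = [False] * num_pieces   # our own pieces
        self.neighbours = []               # peers we are connected to
        self.known = {}                    # neighbour name -> piece indexes

    def connect(self, other):
        self.neighbours.append(other)
        other.neighbours.append(self)

    def piece_completed(self, index):
        """Mark a piece as downloaded and tell every connected peer."""
        self.have[index] = True
        for peer in self.neighbours:
            peer.on_have(self.name, index)

    def on_have(self, sender, index):
        """Record that a neighbour now holds the given piece."""
        self.known.setdefault(sender, set()).add(index)

    def sources_for(self, index):
        """Names of neighbours known to hold the given piece."""
        return sorted(n for n, pieces in self.known.items() if index in pieces)

alice, bob, carol = (Peer(n, 3) for n in ("alice", "bob", "carol"))
alice.connect(bob)
alice.connect(carol)
alice.piece_completed(1)
print(bob.sources_for(1))
```

After Alice completes piece 1, both Bob and Carol can treat her as a source for it, even though she is still downloading the rest.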

Multiple peers sharing pieces with each other

Torrent File

A metafile that holds static information about the file: filename, size, piece information, and more. It does not hold the actual data.

One critical field it holds is the announce URL — the tracker URL. The tracker is the only central entity in the BitTorrent architecture, acting as a metadata store where peers find each other.

Seeder vs leecher comparison table
Seeder, tracker, and peer relationships

Each torrent is uniquely identified by an infohash: a SHA1 hash of the bencoded info dictionary in the .torrent file. The .torrent file itself is typically downloaded through a regular HTTP web server.
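
A minimal sketch of computing an infohash, assuming the info dictionary is already available as a Python dict. It includes a small bencoding encoder so the example is self-contained; `bencode` and `info_hash` are names chosen here, and the sample `info` values are made up. Real clients usually hash the exact info bytes as they appear in the .torrent file, since re-encoding a non-canonical file could produce a different hash.

```python
import hashlib

def bencode(value):
    """Encode ints, bytes/str, lists and dicts into bencoding."""
    if isinstance(value, int):
        return b"i" + str(value).encode() + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return str(len(value)).encode() + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Keys are byte strings and must be emitted in sorted order.
        encoded = {
            (k.encode("utf-8") if isinstance(k, str) else k): v
            for k, v in value.items()
        }
        body = b"".join(bencode(k) + bencode(encoded[k]) for k in sorted(encoded))
        return b"d" + body + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")

def info_hash(info):
    """SHA-1 digest of the bencoded info dictionary (20 raw bytes)."""
    return hashlib.sha1(bencode(info)).digest()

info = {
    "name": "demo.txt",
    "length": 5,
    "piece length": 16384,
    "pieces": hashlib.sha1(b"hello").digest(),
}
digest = info_hash(info)
print(digest.hex())
```

Sorting the keys makes the encoding deterministic, so two dictionaries with the same contents always produce the same infohash.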

Tracker

The tracker is the only central entity in this P2P network, and it's very lightweight.

For a given torrent, the .torrent file contains the tracker URL. Every peer in the network connects to this tracker to get metadata about who else is in the network.

There can be many trackers across the BitTorrent ecosystem, but for a given .torrent file you connect to the one tracker it names in announce.

The tracker does not download or transfer files. It only holds information about peers and their distribution — that's why it's so lightweight.

The core jobs of a tracker:

  1. Keep track of peers that hold the file.
  2. Keep track of peers that are downloading.
  3. Help peers find other peers to download content from.

A tracker is essentially a simple HTTP server that hands out peer information to the network and periodically collects stats from peers.

High-level architecture diagram showing torrent file, tracker, seeders, and leechers

When you have a .torrent file, you first extract info from it, then contact the tracker saying "I want to join your network." The tracker responds with roughly 50 peers that are part of this network.
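
Trackers commonly return that peer list in a compact binary form: 6 bytes per peer, a 4-byte IPv4 address followed by a 2-byte big-endian port. The sketch below assumes that compact format, which the text above does not specify. `parse_compact_peers` is a hypothetical helper and the addresses are documentation-range placeholders.

```python
import socket
import struct

def parse_compact_peers(blob):
    """Split a compact peer string into (ip, port) pairs."""
    if len(blob) % 6 != 0:
        raise ValueError("compact peer list must be a multiple of 6 bytes")
    peers = []
    for offset in range(0, len(blob), 6):
        # First 4 bytes: IPv4 address. Last 2 bytes: big-endian port.
        ip = socket.inet_ntoa(blob[offset:offset + 4])
        (port,) = struct.unpack(">H", blob[offset + 4:offset + 6])
        peers.append((ip, port))
    return peers

blob = bytes([192, 0, 2, 10, 0x1A, 0xE1, 192, 0, 2, 11, 0xC8, 0xD5])
peers = parse_compact_peers(blob)
print(peers)
```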

Peer set and state within the BitTorrent network

The tracker doesn't just send info to users — peers in the network also periodically report back to the tracker: downloaded amount, uploaded amount, which torrent they're part of, and more.
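
With an HTTP tracker, those reports travel as query parameters on a GET request to the announce URL. Here is a sketch of building such a request using the standard parameter names; the tracker URL and peer ID are placeholders, and `build_announce_url` is a hypothetical helper that does not actually send anything.

```python
from urllib.parse import quote, urlencode

def build_announce_url(announce, info_hash, peer_id, port,
                       uploaded, downloaded, left, event=None):
    """Build the GET URL a peer sends to an HTTP tracker."""
    params = {
        "info_hash": info_hash,     # 20 raw bytes, percent-encoded below
        "peer_id": peer_id,         # 20-byte ID chosen by the client
        "port": port,               # port this peer listens on
        "uploaded": uploaded,       # bytes uploaded so far
        "downloaded": downloaded,   # bytes downloaded so far
        "left": left,               # bytes still needed
    }
    if event is not None:           # "started", "completed" or "stopped"
        params["event"] = event
    return announce + "?" + urlencode(params, quote_via=quote)

url = build_announce_url(
    "http://tracker.example.com/announce",   # hypothetical tracker
    info_hash=bytes(range(20)),              # placeholder infohash
    peer_id=b"-XX0001-123456789012",         # made-up peer ID
    port=6881, uploaded=0, downloaded=0, left=1024, event="started",
)
print(url)
```

The `info_hash` parameter tells the tracker which torrent the peer belongs to, while `uploaded`, `downloaded`, and `left` carry the stats described above.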

Peer sets and established connections in the network
Peer set gossip and high-level architecture

That's BitTorrent. Still clever, still widely used.

Check out this playlist for a video walkthrough of the topic.