# Cantor set

The **Cantor ternary set** is a remarkable subset of the real numbers
named after German mathematician Georg Cantor who described the set
in 1883. It has the same cardinality as \(\mathbb{R}\), yet it
has zero Lebesgue measure.

The set can be constructed recursively by first considering the unit
interval \(C_{0}=[0,1]\). In the next step, we divide the set into
three equal parts and discard the open middle set. This leads to the
new set \(C_{1}=[0,\tfrac{1}{3}]\cup[\tfrac{2}{3},1]\). This procedure
is repeated on the two remaining subintervals \([0,\tfrac{1}{3}]\) and
\([\tfrac{2}{3},1]\), and iteratively on all remaining subsets such
that \(C_n = \tfrac{1}{3}C_{n-1} \cup (\tfrac{2}{3} +
\tfrac{1}{3}C_{n-1})\). The **Cantor set** is defined by the limit as
\(n\to\infty\), i.e., \(\mathcal{C} = \cap_{n=0}^{\infty} C_{n}\).

The recursion can be illustrated in python in the following way

```
import numpy as np
def transform_interval(x, scale=1.0, translation=0.0):
return tuple(map(lambda z: z * scale + translation, x))
def Cantor(n):
if n==0: return {(0,1)}
Cleft = set(map(lambda x: transform_interval(x, scale=1/3), Cantor(n-1)))
Cright = set(map(lambda x: transform_interval(x, translation=2/3), Cleft))
return Cleft.union(Cright)
```

```
print(Cantor(1))
```

```
{(0.6666666666666666, 1.0), (0.0, 0.3333333333333333)}
```

The following code displays the first 5 iterations of the algorithm
and clearly illustrates the *self-similarity* property in the sets \(C_{n}\).

```
def intervalprint(x, n=240):
sym = n*[' ']
val = np.linspace(0, 1, num=n)
for interval in x:
idx = np.where((val >= interval[0]) & (val <= interval[1]))[0]
for i in idx:
sym[i] = u'\u2500'
st = ''.join(sym)
print(st)
x = list(map(Cantor, [0,1,2,3,4,5]))
for i in range(0, len(x)): intervalprint(x[i])
```

```
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
──────────────────────────────────────────────────────────────────────────────── ────────────────────────────────────────────────────────────────────────────────
─────────────────────────── ────────────────────────── ────────────────────────── ───────────────────────────
───────── ───────── ──────── ───────── ───────── ──────── ───────── ─────────
─── ─── ─── ─── ─── ── ─── ─── ─── ─── ── ─── ─── ─── ─── ───
─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
```

It is clear that \(\mathcal{C}\) is **bounded** (being a subset of the
unit interval) and **closed** since it constructed by removing countably
many open intervals from a closed interval. It also has **zero measure**,
meaning that it does not contain any intervals.

To see this, note that in the first step the open middle set \((\tfrac{1}{3}, \tfrac{2}{3})\) is removed which has measure 1/3. From the remaining two closed sets, \([0, \tfrac{1}{3}]\) and \([\tfrac{2}{3}, 1]\), the open middle sets are removed in the next iteration: \((\tfrac{1}{3^{2}}, \tfrac{2}{3^{2}})\) and \((\tfrac{2}{3}+\tfrac{1}{3^{2}}, \tfrac{2}{3}+\tfrac{2}{3^{2}})\), each of measure \(\tfrac{1}{3^2}\). In the \(n\)th iteration \(2^n-1\) open intervals are removed each of length \(\tfrac{1}{3^{n}}.\) Hence, the Lebesgue measure of the Cantor set is \[m(\mathcal{C}) = m([0,1])-\sum_{k=0}^{\infty} \tfrac{2^{n}}{3^{n+1}} = 1-\frac{1}{3}\sum_{k=1}^{\infty} \left(\tfrac{2}{3}\right)^{n} = 1 - \frac{1}{3}(1-\tfrac{2}{3})^{-1} = 0. \]

In this sense, we can think of the Cantor set just containing “dust”.

It can also be shown that the Cantor set can be represented as \[ C =
\left\{ \sum_{i=1}^{\infty} \frac{x_{n}}{3^{n}} \mid
x_{n}\in{0,2}\right\}, \] from which it again
follows that the set is
uncountable. This can also be seen by considering an arbitrary
*countable subset* \[N = \{n_{1}, n_{2}, \ldots \}\subseteq
\mathcal{C}.\] In the construction of \(\mathcal{C}\), we must have
that \(n_{1}\) belongs to one of the two closed intervals in
\(C_{1}\). Hence, there is one interval \(I_{1}\) of length
\(\tfrac{1}{3}\) such that \(n_{1}\not\in I_{1}\). Similarly,
\(C_{2}\cap I_{1}\) is the union of two new closed intervals of length
\(\tfrac{1}{3^{2}}\), and \(n_{2}\) can at most be in one of these two
sets. Let \(I_{2}\subset I_{1}\) be this interval s.t. \(n_{2}\not\in
I_{2}\). Continuing in this fashion we construct a series of intervals
\(\mathcal{C}\supset I_{1}\supset I_{2}\supset \cdots \supset I_{k}
\supset\cdots\), where \(I_{k}\) is one of the closed intervals of
\(C_{n}\) of length \(\tfrac{1}{3^{n}}\) such that \(n_k\not\in
I_{k}\). The limit \(I_{\infty} = \cap_{k=1}^{\infty}I_{k}\) is
non-empty (see below), and by construction \(N\cap I_{\infty} =
\emptyset\). Hence given any arbitrary countable subset of
\(\mathcal{C}\) we can still find elements in \(\mathcal{C}\) that
does not belong to this subset, and thus there cannot exist a
bijection between \(\mathbb{N}\) and \(\mathcal{C}\) and we
conclude that \(\mathcal{C}\) is uncountable infinite.

Here we used **Cantor’s Intersection Theorem:** Let \((X, d)\) be a
complete metric space. Let \((I_{k})_{k\in N}\) be a sequence of decreasing non-empty,
closed subsets of \(X\) then \(\bigcap_{k\in N}I_{k} \neq
\emptyset.\)

*Proof:* For each \(k\in N\) choose \(x_{k}\in I_{k}\). The diameter of
\(I_{k}\) is defined as \(\operatorname{diam}(I_{k})=\sup\{𝑑(𝑥,𝑦):𝑥,𝑦∈
I_{k}\}\). We only need to consider the case where
\(\operatorname{diam}(I_{k})\to 0.\) In this case, let \(\epsilon>0\)
then there exists \(N>0\) such that \(d(x, x’)<\epsilon\) for all
\(x,x’\in I_{N}\). Let \(m,n>N\) then \(x_{n}, x_{m}\in I_{N}\) since
\(I_N \supset I_{N+1}\supset … \supset I_{m},I_{n}\), and thus
\(d(x_{n},x_{m})<\epsilon\). We conclude that
\((x_{k})_{k\in\mathbb{N}}\) is a Cauchy sequence. Since \(X\) is
complete \(x_n\to x\) as \(n\to\infty\) for some \(x\in X\). Note that
\((x_{n})_{n=N}^{\infty} \subset I_{N}\) and since \(I_{N}\) is
closed, the limit point also lies in the set \(x\in I_{N}\). As this
holds for any \(N\), we conclude that \(x\in\bigcap_{k\in N}I_{k}\).
This concludes the proof. Note, that in this case the limit point
\(x\) is in fact the only element. Since if we let \(x’\) be another element
then \(x,x’\in I_n\) for all \(n\in N\) but
\(d(x,x’)\leq\operatorname{diam}(F_{n})\to 0\) as \(n\to\infty\). Hence
\(d(x,x’)=0\), and therefore \(\bigcap_{k\in N}I_{k} = \{x\}.\)