# Cantor set

The Cantor ternary set is a remarkable subset of the real numbers named after German mathematician Georg Cantor who described the set in 1883. It has the same cardinality as $$\mathbb{R}$$, yet it has zero Lebesgue measure.

The set can be constructed recursively by first considering the unit interval $$C_{0}=[0,1]$$. In the next step, we divide the set into three equal parts and discard the open middle set. This leads to the new set $$C_{1}=[0,\tfrac{1}{3}]\cup[\tfrac{2}{3},1]$$. This procedure is repeated on the two remaining subintervals $$[0,\tfrac{1}{3}]$$ and $$[\tfrac{2}{3},1]$$, and iteratively on all remaining subsets such that $$C_n = \tfrac{1}{3}C_{n-1} \cup (\tfrac{2}{3} + \tfrac{1}{3}C_{n-1})$$. The Cantor set is defined by the limit as $$n\to\infty$$, i.e., $$\mathcal{C} = \cap_{n=0}^{\infty} C_{n}$$.

The recursion can be illustrated in python in the following way

  import numpy as np

def transform_interval(x, scale=1.0, translation=0.0):
return tuple(map(lambda z: z * scale + translation, x))

def Cantor(n):
if n==0: return {(0,1)}
Cleft = set(map(lambda x: transform_interval(x, scale=1/3), Cantor(n-1)))
Cright = set(map(lambda x: transform_interval(x, translation=2/3), Cleft))
return Cleft.union(Cright)

  print(Cantor(1))

{(0.6666666666666666, 1.0), (0.0, 0.3333333333333333)}


The following code displays the first 5 iterations of the algorithm and clearly illustrates the self-similarity property in the sets $$C_{n}$$.

  def intervalprint(x, n=240):
sym = n*[' ']
val = np.linspace(0, 1, num=n)
for interval in x:
idx = np.where((val >= interval) & (val <= interval))
for i in idx:
sym[i] = u'\u2500'
st = ''.join(sym)
print(st)

x = list(map(Cantor, [0,1,2,3,4,5]))
for i in range(0, len(x)): intervalprint(x[i])

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
────────────────────────────────────────────────────────────────────────────────                                                                                ────────────────────────────────────────────────────────────────────────────────
───────────────────────────                           ──────────────────────────                                                                                ──────────────────────────                           ───────────────────────────
─────────         ─────────                           ────────         ─────────                                                                                ─────────         ────────                           ─────────         ─────────
───   ───         ───   ───                           ───   ──         ───   ───                                                                                ───   ───         ──   ───                           ───   ───         ───   ───
─ ─   ─ ─         ─ ─   ─ ─                           ─ ─    ─         ─ ─   ─ ─                                                                                ─ ─   ─ ─         ─    ─ ─                           ─ ─   ─ ─         ─ ─   ─ ─


It is clear that $$\mathcal{C}$$ is bounded (being a subset of the unit interval) and closed since it constructed by removing countably many open intervals from a closed interval. It also has zero measure, meaning that it does not contain any intervals.

To see this, note that in the first step the open middle set $$(\tfrac{1}{3}, \tfrac{2}{3})$$ is removed which has measure 1/3. From the remaining two closed sets, $$[0, \tfrac{1}{3}]$$ and $$[\tfrac{2}{3}, 1]$$, the open middle sets are removed in the next iteration: $$(\tfrac{1}{3^{2}}, \tfrac{2}{3^{2}})$$ and $$(\tfrac{2}{3}+\tfrac{1}{3^{2}}, \tfrac{2}{3}+\tfrac{2}{3^{2}})$$, each of measure $$\tfrac{1}{3^2}$$. In the $$n$$th iteration $$2^n-1$$ open intervals are removed each of length $$\tfrac{1}{3^{n}}.$$ Hence, the Lebesgue measure of the Cantor set is $m(\mathcal{C}) = m([0,1])-\sum_{k=0}^{\infty} \tfrac{2^{n}}{3^{n+1}} = 1-\frac{1}{3}\sum_{k=1}^{\infty} \left(\tfrac{2}{3}\right)^{n} = 1 - \frac{1}{3}(1-\tfrac{2}{3})^{-1} = 0.$

In this sense, we can think of the Cantor set just containing “dust”.

It can also be shown that the Cantor set can be represented as $C = \left\{ \sum_{i=1}^{\infty} \frac{x_{n}}{3^{n}} \mid x_{n}\in{0,2}\right\},$ from which it again follows that the set is uncountable. This can also be seen by considering an arbitrary countable subset $N = \{n_{1}, n_{2}, \ldots \}\subseteq \mathcal{C}.$ In the construction of $$\mathcal{C}$$, we must have that $$n_{1}$$ belongs to one of the two closed intervals in $$C_{1}$$. Hence, there is one interval $$I_{1}$$ of length $$\tfrac{1}{3}$$ such that $$n_{1}\not\in I_{1}$$. Similarly, $$C_{2}\cap I_{1}$$ is the union of two new closed intervals of length $$\tfrac{1}{3^{2}}$$, and $$n_{2}$$ can at most be in one of these two sets. Let $$I_{2}\subset I_{1}$$ be this interval s.t. $$n_{2}\not\in I_{2}$$. Continuing in this fashion we construct a series of intervals $$\mathcal{C}\supset I_{1}\supset I_{2}\supset \cdots \supset I_{k} \supset\cdots$$, where $$I_{k}$$ is one of the closed intervals of $$C_{n}$$ of length $$\tfrac{1}{3^{n}}$$ such that $$n_k\not\in I_{k}$$. The limit $$I_{\infty} = \cap_{k=1}^{\infty}I_{k}$$ is non-empty (see below), and by construction $$N\cap I_{\infty} = \emptyset$$. Hence given any arbitrary countable subset of $$\mathcal{C}$$ we can still find elements in $$\mathcal{C}$$ that does not belong to this subset, and thus there cannot exist a bijection between $$\mathbb{N}$$ and $$\mathcal{C}$$ and we conclude that $$\mathcal{C}$$ is uncountable infinite.

Here we used Cantor’s Intersection Theorem: Let $$(X, d)$$ be a complete metric space. Let $$(I_{k})_{k\in N}$$ be a sequence of decreasing non-empty, closed subsets of $$X$$ then $$\bigcap_{k\in N}I_{k} \neq \emptyset.$$

Proof: For each $$k\in N$$ choose $$x_{k}\in I_{k}$$. The diameter of $$I_{k}$$ is defined as $$\operatorname{diam}(I_{k})=\sup\{𝑑(𝑥,𝑦):𝑥,𝑦∈ I_{k}\}$$. We only need to consider the case where $$\operatorname{diam}(I_{k})\to 0.$$ In this case, let $$\epsilon>0$$ then there exists $$N>0$$ such that $$d(x, x’)<\epsilon$$ for all $$x,x’\in I_{N}$$. Let $$m,n>N$$ then $$x_{n}, x_{m}\in I_{N}$$ since $$I_N \supset I_{N+1}\supset … \supset I_{m},I_{n}$$, and thus $$d(x_{n},x_{m})<\epsilon$$. We conclude that $$(x_{k})_{k\in\mathbb{N}}$$ is a Cauchy sequence. Since $$X$$ is complete $$x_n\to x$$ as $$n\to\infty$$ for some $$x\in X$$. Note that $$(x_{n})_{n=N}^{\infty} \subset I_{N}$$ and since $$I_{N}$$ is closed, the limit point also lies in the set $$x\in I_{N}$$. As this holds for any $$N$$, we conclude that $$x\in\bigcap_{k\in N}I_{k}$$. This concludes the proof. Note, that in this case the limit point $$x$$ is in fact the only element. Since if we let $$x’$$ be another element then $$x,x’\in I_n$$ for all $$n\in N$$ but $$d(x,x’)\leq\operatorname{diam}(F_{n})\to 0$$ as $$n\to\infty$$. Hence $$d(x,x’)=0$$, and therefore $$\bigcap_{k\in N}I_{k} = \{x\}.$$