What is a partial fraction decomposition and why should I care?

Partial fraction decomposition is a process that takes in the quotient of two polynomials and gives an equivalent form as the sum of terms with simple denominator. For example $$\frac{1}{1-x^2} = \frac{1}{2}\left(\frac{1}{x+1}-\frac{1}{x-1}\right)$$ Partial fraction decomposition is very important as it is the only way to integrate rational functions and it greatly simplifies taking their derivatives. Additionally partial fraction decomposition is required to take the inverse laplace transform of rational functions. It can even be used to solve differential equations. The typical way of actually carrying out a partial fraction decomposition is to start with the general partial fraction decomposition and then solve for the coefficients. For example to calculate the decomposition of \(\frac{1}{1-x^2}\) you'd start by writing $$\frac{A}{x-1}+\frac{B}{x+1} = \frac{A(x+1)+B(x-1)}{x^2-1} = \frac{Ax+A+Bx-B}{x^2-1} $$ Then by equating coefficients we get a system of 2 equations $$A+B = 0$$ $$\text{and } A-B = -1$$. This strategy of starting with the most general possible decomposition and solving for the coefficients works but it suffers from being very slow and difficult to work with. Just the setup requires n(n-1) polynomial multiplications. Fortunately there is a better way which weirdly enough can involve complex integration.

Complex calculus

Pulling in complex numbers doesn't initially seem very helpful but there are two important properties of complex numbers that are relevant here. Firstly, complex numbers guarentee that the denominator can be factored fully whereas for example \(\frac{1}{1+x^2}\) cannot be factored over the real numbers. Secondly in the complex numbers a point where the function is undefined does not break up the domain into pieces. This is very important because the behavior of the function around some singularity a is dominated by the coefficient of \(\frac{1}{z-a}\) so the hope is that by analyzing the function near its singularities we can construct each term individually without being (directly) impacted by the other parts of the function.

So what is calculus for complex numbers. As obvious as this may sound its basically the same as calculus for real numbers but now the numbers can be complex. For example the definition of the derivative is still $$f'(z_0) = \lim_{h\to 0} \frac{f(z_0+h)-f(z_0)}{h}$$ but now h can go to 0 on any path you can draw and the limit must be the same value. For example h could be \((a+bi)t\) or \(e^{-(1+i)t}\) or any other function that goes to 0. When the derivative of a function exists at some point or set we say it is holomorphic at that point or on that set By choosing \(h=t\) and \(h=it\) we get a necessary condition for complex differentiability which is that

$$\lim_{t \to 0} \frac{f(z_0+it)-f(z_0)}{it} \Bigg \rvert _{z_0} = \lim_{t \to 0} \frac{f(z_0+t)-f(z_0)}{t} \Bigg \rvert _{z_0}$$ By writing f(z) as a function \(\mathbb{R}^2 \to \mathbb{R}^2\) \(f(x,y) = (u(x,y),v(x,y))\) where \(u(x,y) := \Re(f(x+iy)), v(x,y) := \Im(f(x+iy))\) this becomes $$\frac{\partial f}{\partial y} = i\frac{\partial f}{\partial{x}}, \text{so } \frac{\partial (u(x,y)+iv(x,y))}{\partial y} = i\frac{\partial (u(x,y)+iv(x,y))}{\partial{x}}$$ or as it's more commonly written $$\frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}, \frac{\partial v}{\partial y} = \frac{\partial u}{\partial x} $$ This set of differential equations is called the Cauchy Riemann equations and they are satisfied by most commonly encountered functions. This condition is much stronger than each coordinate having a derivative. What these equations are saying is that at any point, a small neighborhood around that point only gets scaled or rotated relative to the other points.

Now moving on to integration there are two properties that would be nice for a complex integral to have. Firstly, the definition of an integral as the limit of a Riemann sum and secondly, the fundamental theorem of calculus. In addition for it to be a good integral it needs to agree with the Riemann integral where applicable. Like derivatives the "naive" generalization where you just let the numbers in the integral be complex turns out to work great. There is one caveat to complex integration however. While for two real numbers there is really only one path between them, for any two complex numbers there are an infinite number of different paths between them. The solution is that for complex integration you must, if necessary, specify the path that you choose to integrate on with some piecewise continuous differentiable function \(\gamma\) where \(\gamma(a) = z_0, \gamma(b)=z_n\). There is then a complex version of a Riemann sum given by $$\int_\gamma f(z) dz =\lim_{n\to \infty}\sum f(z_k)(z_k-z_{k-1}) = \lim_{n\to \infty} \sum f(\gamma(t_k))(\gamma(t_k)-\gamma(t_{k-1})) = \lim_{n\to \infty} \sum f(\gamma(t_k))\frac{\gamma(t_k)-\gamma(t_{k-1})}{t_k-t_{k-1}}(t_k-t_{k-1}) = \text{by the mean value theorem} \lim_{n\to \infty} \sum f(\gamma(t_k))\gamma'(c_k)(t_k-t_{k-1}) = \int_a^b f(\gamma(t))\gamma'(t) dt$$ Fortunately its often not actually necessary to specify the exact path. This is due to the somewhat surprising fact that if a function is holomorphic on some domain then the value of the integral doesn't change as you move the path through the domain. To prove this we can just use the Cauchy Riemann equations along with Greens theorem $$\oint_\gamma f(z) dz = \oint_\gamma (u(x,y)+iv(x,y))(dx+i dy) = \oint_\gamma u(x,y)+iv(x,y) dx -v(x,y)+iu(x,y)dy=\iint_\Sigma -(\frac{\partial v}{\partial x}+\frac{\partial u}{\partial y}) + i(\frac{\partial u }{\partial x}-\frac{\partial v}{\partial y})dx dy =\iint_\Sigma 0 dx dy = 0 $$

Now we can actually compute some integrals over closed loops. Unfortunately almost all of these integrals are boring except for one that is extremely important. Right away \(\forall n \in \mathbb{N}, z^n\) is holomorphic so \(\oint z^n dz = 0\). As for negative n < -1 let \(\gamma(t) = e^{it}\) then \(\oint_\gamma z^n dz = \int_0^{2\pi} ie^{i(n+1)t} dt = 0\). On the other hand if n = -1 $$\oint_\gamma \frac{1}{z} dz = \int_0^{2\pi} \frac{1}{e^{it}}ie^{it} dt = \int_0^{2\pi} i dt = 2\pi i$$ This is the key to the particular trick here for partial fraction decomposition and really many tricks throughout complex analysis. Here's a desmos graph that actually implements complex integration

Hopefully you can see how if the path has only boring holomorphic points inside, the integral is 0. Meanwhile if there is a singularity inside the integral it doesn't matter what the path is it will be some fixed value. This is the key to quickly performing partial fraction decomposition.

What Does This Have to do with Partial Fractions?

Simply put, complex integration is more or less the perfect tool to solve our problem. Integrating around a pole eliminates all the influence from the other parts of the function so by choosing many paths we can get information about each term individually rather than having to deal with the terms $$\text{let } \gamma_i \text{ be a simple closed curve around } z_i\text{ and no other roots of Q(z). Then } \prod\frac{1}{z-z_i} = \sum \frac{a_i}{z-z_i}, \oint_\gamma \prod\frac{1}{z-z_i} dz = \oint_\gamma \sum \frac{a_i}{z-z_i} dz = i2\pi a_i$$ This tells us that if we can evaluate these integrals then they give us the partial fraction decomposition for free. Unfortunately theres no way to integrate this symbolically without doing the partial fraction decomposition we're trying to avoid. Fortunately we can just directly apply the formula for complex integration with some convenient path. In this case a convenient path would be \(z_i+\varepsilon*e^{it}\). The exact path is, of course, arbitrary however, circles tend to be nice and well behaved. $$\oint_\gamma \prod \frac{1}{z-z_i} dz = \int_0^{2\pi} i\varepsilon e^{it} \prod \frac{1}{\varepsilon e^{it}+z_j-z_i} dt = \int_0^{2\pi} i\frac{\varepsilon e^{it}}{\varepsilon e^{it}} \prod_{i \neq j} \frac{1}{\varepsilon e^{it}+z_j-z_i} dt = \int_0^{2\pi} i\prod_{i \neq j} \frac{1}{\varepsilon e^{it}+z_j-z_i} dt $$ now because the inside of the integral is well behaved as \(\varepsilon \to 0\) we can take the limit $$\lim_{\varepsilon \to 0 } \int_0^{2\pi} i\prod_{i \neq j} \frac{1}{\varepsilon e^{it}+z_j-z_i} dt = \int_0^{2\pi} \lim_{\varepsilon \to 0 } i\prod_{i \neq j} \frac{1}{\varepsilon e^{it}+z_j-z_i} dt = \int_0^{2\pi} i\prod_{i \neq j} \frac{1}{z_j-z_i} dt = 2\pi i\prod_{i \neq j} \frac{1}{z_j-z_i}$$ and get that $$\oint_{z_j+\varepsilon e^{it}} \prod \frac{1}{z-z_i} dz = 2\pi i \prod_{i \neq j} \frac{1}{z_j-z_i} \text{or because }2\pi a_i = \oint_{z_j}\gamma \prod \frac{1}{z-z_i} dz,\text{ } a_i = \prod_{i \neq j} \frac{1}{z_j-z_i} \text{which is more commonly written } \lim_{z \to z_i} \frac{z-z_i}{q(z)-q(z_i)} = \frac{1}{q'(z_i)} $$

What about repeated root.

Repeated roots are an issue for the whole procedure as \(f(z)\) is no longer well behaved as \(\varepsilon \to 0\). Fortunately there is somewhat of a workaround. Using the same steps as before on \(\frac{(x-a)^{n-1}}{q(x)}\) we can determine that the coefficient of \(\frac{1}{(x-a)^n}\) is \(\frac{1}{q^{(n)}(a)}\). Then we can just subtract the bit that isn't well behaved and do the same process. Let f(a) be the part of q(x) that is not \((x-a)^n\). Then $$\frac{1}{(x-a)^nf(x)} -\frac{1}{f(a)(x-a)^n} = \frac{f(a)-f(x)}{(x-a)^nf(a)f(x)}$$ Now this is integrable if we multiply by \((x-a)^{n-2}\). Doing so results in $$\int_0^{2\pi} i \varepsilon e^{it} \frac{f(z_0)-f(\varepsilon e^{it}+z_0)}{(\varepsilon e^{it})^2f(z_0)f(\varepsilon e^{it}+z_0)}$$ now taking \(\lim_{\varepsilon \to 0}\) $$\int_0^{2\pi} i\frac{-f'(a)}{f(z_0) f(z_0)} = -2\pi i \frac{f'(z_0)}{f(z_0)^2} dt$$

what about when the numerator isn't 1

This is actually nice and simple. If the numerator isn't 1, you can expand as normal and then divide at the end. If $$\frac{1}{q(x)} = \sum \frac{a_i}{x-x_i} \text{then } \frac{p(x)}{q(x)} = \sum{\frac{p(x)(a_i)}{x-x_i}} \text{By the polynomial remainder theorem } \frac{p(x)a_i}{x-x_i} = h(x)+\frac{p(a)a_i}{x-x_i} \text{but without loss of generality }\frac{p(x)}{q(x)} \text{is a proper fraction so all the non fraction parts } h(x) \text{ cancel and we're left with} \frac{p(a)a_i}{x-x_i} $$ If there are higher powers of \(\frac{1}{x-a}\) polynomial division still works but its not as simple as p(a)

How About an Actual Example

To show how much easier this method is lets work through an example with both methods. I encourage you to get out a piece of paper and try (the simpler method) yourself. When I did it it really felt like magic seeing the coefficients come out without almost any work

$$\frac{P(x)}{Q(x)} = \frac{x^2+2x+3}{(x-3)^2(x-4)^2(x-2)}$$ $$\text{Conventional Method: } \frac{A}{(x-3)^2} +\frac{B}{x-3}+ \frac{C}{(x-4)^2} +\frac{D}{x-4}+ \frac{E}{x-2} = \frac{E(x-3)^2(x-4)^2+D(x-2)(x-4)(x-3)^2+C(x-2)(x-4)^2+B(x-2)(x-4)^2(x-3)}{(x-3)^2(x-4)^2(x-2)} = \frac{Ex^{4}-14Ex^{3}+73Ex^{2}-168Ex+144E+x^{4}D-12x^{3}D+53x^{2}D-102Dx+72D+x^{3}C-8x^{2}C+21Cx-18C+x^{4}B-13x^{3}B+62x^{2}B-128Bx+96B+x^{3}A-10x^{2}A+32Ax-32A}{(x-2)(x-4)^2(x-3)^2}. \text{collecting and equating like terms gets a system of 5 linear equations with 5 unknowns}$$ $$E+D+B=0$$ $$-14E-12D+C-13B+A=0$$ $$73E+53D-8C+62B-10A=1$$ $$-168E-102D+21C-128B+32A=2$$ $$144E+72D+96B-32A = 3$$ or in matrix form $$\begin{pmatrix}0&1&0&1&1\\ 1&-13&1&-12&-14\\ -10&62&-8&53&73\\ 32&-128&21&-102&-168\\ -32&96&-18&72&144\end{pmatrix} \begin{pmatrix}A\\B\\C\\D\\E\end{pmatrix} = \begin{pmatrix}0\\0\\1\\2\\3\end{pmatrix}$$ If you're really good with matrices feel free to solve this system. The solution is \begin{pmatrix}18\\26\\\frac{27}{2}\\ \frac{-115}{4} \\ \frac{11}{4}\end{pmatrix}

Meanwhile the other method is so simple its difficult to show work without feeling like I'm over explaining. Immediately we can get $$\frac{p(2)}{q'(2)(x-2)}+\frac{p(3)}{q''(3)(x-3)^2}+\frac{p(4)}{q''(4)(x-4)^2} = \frac{2^2+2(2)+3}{(2-3)^2(2-4)^2(x-2)} + \frac{3^2+2(3)+3}{(3-2)(3-4)^2(x-3)^2} + \frac{4^2+2(4)+3}{(4-2)(4-3)^2(x-4)^2} = \frac{11}{4(x-2)}+\frac{18}{(x-3)^2}+\frac{27}{2(x-4)^2} $$ To get the 1st power terms of x-3 and x-4 is a bit more work but still relatively simple. The first step is to pick a term to work on first (I chose x-3) and to subtract off the 2nd power term which gives us $$\frac{x^2+2x+3}{(x-2)(x-3)^2(x-4)} - \frac{18}{(x-3)^2} = \frac{x^2+2x+3-18(x-4)^2(x-2)}{(x-2)(x-3)^2(x-4)^2}$$ As a reminder, by construction the top part of that is divisible by 1 copy of x-3 because other wise it would have a remainder when dividing by 2 copies. That means we can divide by x-3 and get a new polynomial. Fortunately we don't even care about this polynomial. We just care about its value at x = 3. It may be undefined there but we can just take a limit. Thus we have $$\frac{ \lim_{x\to 3} \frac{x^2+2x+3-18((x-4)^2(x-2))}{x-3}}{(x-2)(x-3)(x-4)^2} = \frac{\frac{d}{dx}x^2+2x+3-18((x-4)^2(x-3))\rvert_{x=3}}{(x-2)(x-3)(x-4)^2} = \frac{2(3)+2-18(2(3-4)(3-2)+(3-4)^2)}{(x-2)(x-3)(x-4)^2} = \frac{26}{(x-2)(x-3)(x-4)^2} $$ Now we can apply the same formula and get \(\frac{26}{(3-2)(3-4)^2x-3} = \frac{26}{x-3}\). Now quickly doing the same process for x-4 yields $$\frac{\frac{d}{dx}(x^2+2x+3-\frac{27}{2}((x-2)(x-3)^2))\rvert_{x=4}}{(x-2)(x-3)^2(x-4)} = \frac{2(4)+2-\frac{27}{2}(2(4-3)(4-2)+(4-3)^2)}{(x-2)(x-3)^2(x-4)} = \frac{-115}{2(x-2)(x-3^2)(x-4)} \to \frac{-115}{2(4-2)(4-3)^2(x-4)} = \frac{-115}{4(x-4)} $$ So the whole solution is $$\frac{x^2+2x+3}{(x-2)(x-3)^2(x-4)^2} = \frac{11}{4(x-2)}+\frac{18}{(x-3)^2}+\frac{27}{2(x-4)^2} + \frac{26}{x-3}+\frac{-115}{4(x-4)}$$

Existence and Uniqueness

You may have noticed that partial fraction decomposition feels very vector-y in that there some collection of objects together with addition and scaling. In addition \(\frac{P(x)}{Q(x)}\) where Q(x) is fixed also feels very vector-y. In fact both of these are examples of vector spaces which lets us reframe our problem in terms of linear algebra. Specifically finding a partial fraction decomposition is the same as representing a vector in the \(\frac{P(x)}{Q(x)}\) basis in terms of the "simple fraction" basis. It is very simple to get a matrix that converts from the simple fractions to the complicated fraction by just putting everything over a common denominator and expanding. If these vectors are linearly independent they span the whole space and thus partial fraction decomposition is both unique and exists for all P(x). Indeed it is true that \(\sum \frac{1}{x-x_i}\) spans the whole space and form a basis. The proof is actually a great exercise. If you just want the proof you can read this paper but I encourage you to try and figure it out yourself. As a hint, a set of vectors is linearly independent iff \(\sum{a_i v_i} = 0 \implies a_i = 0\)

Challenge Problem

This is a totally optional but very cool application of partial fractions. I will outline the process, then solve below. The process will basically be as follows. First, using the same process as above develop a partial fraction decomposition of the cotangent function. Then, using the partial fraction decomposition, find a closed form for a particular infinite series. Then using particular values of that infinite series, solve the Basel problem.

For the first step use the fact that cotangent has poles at integer multiples of pi and calculate the residues around each pole. $$\oint_\gamma \frac{\cos(z)}{\sin(z)} dz = \int_0^{2\pi} i\varepsilon e^{it}\frac{\cos(\varepsilon e^{it}+n\pi)}{sin(n\pi+\varepsilon e^it)} dt = \int_0^{2\pi} \lim_{\varepsilon \to 0} \frac{i \varepsilon e^{it} cos(\varepsilon e^{it})}{\varepsilon sin(e^{it})} \frac{(-1)^n}{(-1)^n} dt = \int_0^{2\pi} i dt = 2\pi i$$ So the residue at any of the poles is 1. Thus $$\cot(x) = \sum_{n=-\infty}^{\infty} \frac{1}{x-n \pi}$$ Now by combining the positive and negative terms $$\frac{1}{x-n \pi} + \frac{1}{x+n \pi } =\frac{x+n\pi+x-n\pi}{x^{2}-n^{2}\pi^{2}} = \frac{2x}{x^{2}-n^{2}\pi^{2}} $$ $$\text{So} \cot(x) = \sum_{n=-\infty}^{\infty}\frac{1}{x+\pi n} = \frac{1}{x}+2x\sum_{n=1}^{\infty}\frac{1}{x^{2}-\left(n\pi\right)^{2}}$$ Now isolating the infinite series $$\frac{x \cot(x)-1}{2x^2} = \sum_{n=1}^{\infty}\frac{1}{x^{2}-\left(n\pi\right)^{2}} $$ Taking the limit as \(x\to 0 \) $$\lim_{x \to 0} \frac{x \cot(x)-1}{2x^2} = \lim_{x \to 0} \frac{x \cos(x)-\sin(x)}{2x^2 \sin(x)} = \sum_{n=1}^{\infty}\frac{1}{-\left(n\pi\right)^{2}} $$ which we can use L'Hôpital's rule on to get $$\lim_{x \to 0} \frac{x \cos(x)-\sin(x)}{2x^2 \sin(x)} = \lim_{x \to 0} \frac{-x\sin(x)}{2x\left(2\sin\left(x\right)+x\cos\left(x\right)\right)} = \lim_{x \to 0} \frac{-\sin(x) - x\cos(x)}{\left(4-2x^2\right)\sin\left(x\right)+8x\cos\left(x\right)} = \lim_{x \to 0} \frac{x\sin(x)-2\cos(x)}{-12x\sin\left(x\right)+\left(12-2x^2\right)\cos\left(x\right)} = -\frac{1}{6} $$ Since $$\sum_{n=1}^{\infty}\frac{1}{-\left(n\pi\right)^{2}} = -\frac{1}{6} \: , \: \sum_{n=1}^{\infty}\frac{1}{n^{2}} = \frac{\pi^2}{6}$$