I recently encountered a problem in which I needed to uniformly sample vectors with positive values in \(\mathbb{R}^n \) that had unit \(L_1 \) norm, where the \(L_1 \) norm is\[ \lVert (x_1, x_2, \ldots, x_n)^T \rVert = |x_1| + |x_2| + \ldots + |x_n| \]

The solution was actually quite straightforward: The isosurfaces of the \(L_1\) norm in \(\mathbb{R}^n\) look something like this, which each color corresponding to a particular isosurface:

l1-isosurface

I realized that to uniformly sample this red surface, I could sample \(n\) iid exponential random variables \( \{X_i \sim \text{Exp}(\lambda =1) \}_{i = 1 \ldots n} \) and then normalize the resulting vector \( X = (X_i, \ldots X_n)^T \) using its norm. This works because the pdf of \(X\) is \[\begin{aligned} f_X(x) &= f_{X_1}(x_1) f_{X_2}(x_2) \ldots f_{X_n}(x_n) \\ &= e^{-x_1} \ldots e^{-x_n} \\ &= e^{-(x_1 + x_2 + \ldots x_n)} \\ &= e^{-\lVert x \rVert_1}\end{aligned} \]

Meaning the probability of sampling a given point depends only on its norm, so re-scaling all vectors to have unit norm will yield a uniform sampling on the unit norm surface. This looks something like this:

Sampling of exponentials

Sampling of exponentials

Rescaling of sampled points

Rescaling of sampled points

This got me thinking: can this be used to sample arbitrary surfaces? Essentially the same technique can be used to uniformly sample the n-sphere \(S^n\) embedded in \(\mathbb{R}^{n+1}\) by instead using the \(L_2\) norm and sampling from a Normal random variable (you use the squared \(L_2\) norm but it still uses the symmetry of the exponential sampling). The main difference is that the \(L_1\) sampling was done for strictly positive values while this \(L_2\) norm sampling can be done for both positive and negative values.

First we focus on sampling vectors in all four quadrants.

Generalizing slightly, we could let our surface we want to sample be a Quadric, a surface defined as the solutions of the quadratic form\[ R = y^T A y + b^T y\]where \( A \) is a real symmetric matrix and \(b\) is a vector, and \(R\) is some positive scalar. Then, we want our joint random variable to have the the pdf

\[ f_Y(y) = \frac1{Z} \exp(-(y^T A y + b^T y)) \]

Where \(Z \) is our normalization constant to make \(f_Y\) a pdf.We can compute \(Z\) using the eigendecomposition of \(A\), \(A = Q \Lambda Q^T \) where \( \Lambda \) is a diagonal matrix with entries \( \lambda_i \) and \(Q\) is an orthogonal matrix ( \( Q^T Q = Q Q^T = I_{n\times n} \) ).

We calculate \[ \begin{aligned} Z &= \int_{\mathbb{R}^n} f_Y(y) dy \\ &= \int_{\mathbb{R}^n} \exp(-y^T A y - b^T y)dy \\ &= \int_{\mathbb{R}^n} \exp(-y^T Q \Lambda Q^T y - b^T Q Q^T y)dy \\ &= \int_{\mathbb{R}^n} \exp(-(Q^T y)^T \Lambda (Q^T y) - (Q^T b)^T (Q^T y))dy \\ &= \int_{\mathbb{R}^n} \exp(-x^T \Lambda x - (Q^T b)^T x) dx \\ &= \int_{\mathbb{R}^n} \exp(\sum_{i=1}^n -x_i^2 \lambda_i - (Q^T b)_i x_i) dx \\ &= \int_{\mathbb{R}^n} \prod_{i=1}^n \exp(-x_i^2 \lambda_i - (Q^T b)_i x_i)dx \end{aligned}\]

Here we reach a point where we’d like to exchange our product sign and our integral using Fubini’s theorem, but this integral may actually diverge. This is because if any of the eigenvalues \( \lambda_i \) is not strictly positive, our quadric form has unbounded surface area. Thus we are forced to impose the constraint that all the \(\lambda_i > 0\). Applying Fubini’s theorem we get

\[ \begin{aligned} Z &= \prod_{i=1}^n \int_{\mathbb{R}} \exp(-x^2 \lambda_i - (Q^T b)_i x) dx \\ &= \prod_{i=1}^n \sqrt{\frac{\pi}{\lambda_i}} \exp(\frac{(Q^T b)_i^2}{4\lambda_i}) \\ &= \sqrt{\pi^n \det(A)^{-1}} \exp(\sum_{i=1}^n \frac{(Q^T b)_i^2}{4\lambda_i}) \end{aligned} \]

This yields a well-defined pdf. Notice that this normalization constant grows unbounded as any of the \( \lambda_i \rightarrow 0 \).

Notice that above, we decomposed \(f_Y\) in terms of a product of \(n\) independent functions.We use these as our independent random variables which we will sample. Note that these are expressed in the eigenbasis of \(A\), and thus to obtain our final PDF we will have to apply the inverse of our \(Q^T\) isometry that we used in our integration.

Each \( f_i \) is defined as follows:

\[ \begin{aligned} f_i(x) &= \sqrt{\frac{\lambda_i}{\pi}} \exp(-\frac{(Q^T b)_i^2}{4\lambda_i}) \exp(-x^2 \lambda_i - (Q^T b)_i x) \\ &= \sqrt{\frac{\lambda_i}{\pi}} \exp(-\frac{(Q^T b)_i^2}{4\lambda_i} - x^2 \lambda_i - (Q^T b)_i x) \end{aligned} \]

Identifying the proper terms, we can express the CDF \(F_i\) as

\[ F_i(x) = \frac{1}{2} \left(\Phi\left(\frac{2 \lambda_i x+ (Q^T b)_i}{2 \sqrt{\lambda_i}}\right)+1\right)\]

Where \( \Phi \) is the Gaussian definite integral \( \Phi(x) = \frac{2}{\sqrt{\pi }}\int _0^x e^{-t^2} dt \).We then calculate the inverse CDF \( F_i^{-1} \) directly as

\[ F_i^{-1}(x) = \frac{2 \sqrt{\lambda_i} \Phi^{-1}(2 x-1)- (Q^T b)_i}{2 \lambda_i} \]

Using the inverse CDF method of sampling, we then can sample \(n\) iid uniform random variables \( U_i \). With the sample values \( {u_1, \ldots u_n } \), we compute the vector\[ x = (F_1^{-1}(u_1), \ldots, F_n^{-1}(u_n))^T \]Note that we need to perform our rescaling to constraint \( x \) to be on the surface in our eigenbasis. We compute\[ x’ = \frac{R}{x^T \Lambda x + (Q^T b)^T x} x \]and finally\[ y = Qx’ \]Where \( Q \) is applied in order to shift back to the principle axes of our quadric surface. This procedure will yield samples \( y \) that lie uniformly on our surface.