The Sparse Fourier Transform. Haitham Hassanieh

ACM Books

of P̂σ,τ,b x̂. The algorithm then computes ẑ, the rate-B subsampling of ŷ, as pictured in part (c). During estimation loops, the algorithm estimates x̂i based on the value of the nearest coordinate in ẑ, namely ẑ_{h_σ(i)}. During location loops (part (d)), the algorithm chooses J, the top dk (here, dk = 4) coordinates of ẑ, and selects the elements of [n] that are closest to those coordinates (the shaded region of the picture). It outputs the set I of preimages of those elements. In this example, the two coordinates on the left landed too close together under the permutation and form a "hash collision." As a result, the algorithm misses the second coordinate from the left in its output. Our guarantee is that each large coordinate has a low probability of being missed if we select the top O(k) samples.
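The bucketization the caption describes (permute, window, fold into B buckets, small FFT) can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: `G` is an arbitrary stand-in window rather than the flat-window filter of Chapter 2, and the parameter values are made up. The point is the identity it checks: folding the windowed, permuted signal into B chunks and taking a B-point FFT yields exactly the rate-B subsampling ẑ of ŷ.

```python
import numpy as np

def bucketize(x, B, sigma, tau, G):
    """Hedged sketch of the inner loop's bucketization.

    Permute x in time, multiply by the window G, fold into B buckets,
    and take a B-point FFT.  The result equals the rate-B subsampling
    of the full n-point spectrum of the windowed, permuted signal.
    """
    n = len(x)
    perm = x[(sigma * np.arange(n) + tau) % n]   # time-domain permutation
    y = G * perm                                 # window in the time domain
    z = y.reshape(n // B, B).sum(axis=0)         # fold: z_j = sum_m y_{j + B*m}
    return np.fft.fft(z)                         # zhat_i = yhat_{i * (n/B)}

rng = np.random.default_rng(0)
n, B = 256, 16
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
G = rng.standard_normal(n)                       # stand-in window; any filter works here
zhat = bucketize(x, B, sigma=97, tau=3, G=G)     # sigma odd, hence invertible mod n
yhat = np.fft.fft(G * x[(97 * np.arange(n) + 3) % n])
assert np.allclose(zhat, yhat[:: n // B])        # folding == subsampling yhat
```

The real algorithm only ever pays for the O(B log B) transform of the folded signal; the full ŷ above is computed solely to verify the identity.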

      Lemma 3.2 Define E to be the error tolerated in Lemma 3.1. Then for any i ∈ [n] with |x̂i| ≥ 4E,

Pr[i ∉ I] ≤ O(∊).

      Proof The proof can be found in Appendix A.2. ■

      Our SFT 1.0 algorithm shown in Algorithm 3.1 is parameterized by ∊ and δ. It runs L = O(log n) iterations of the inner loop, with parameters B = O(√(nk/(∊ log(n/δ)))) and d = O(1/∊), as well as δ.⁷

      Lemma 3.3 The algorithm runs in time O(√(nk log(n/δ)/∊) · log n).

      Proof To analyze this algorithm, note that

|I′| · L/2 ≤ ∑r |Ir| ≤ L · dk · n/B,

or |I′| ≤ 2dkn/B. Therefore, the running time of both the location and estimation inner loops is O(B log(n/δ)). Computing I′ and computing the medians both take linear time, namely O(Ldkn/B). Thus, the total running time is O((B log(n/δ) + dkn/B) log n). Plugging in B = O(√(nk/(∊ log(n/δ)))) and d = O(1/∊), this running time is O(√(nk log(n/δ)/∊) · log n). We require B = Ω(k/∊), however; for k > ∊n/log(n/δ), this would cause the run time to be larger. But in this case, the predicted run time is Ω(n log n) already, so the standard FFT is faster and we can fall back on it. ■
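The choice of B in this proof simply balances the two cost terms. A quick numeric check (with made-up parameter values and constants dropped) confirms that B = √(nk/(∊ log(n/δ))) is the right trade-off:

```python
import math

def inner_loop_cost(B, n, k, eps, delta):
    # The two terms from the proof, constants dropped: filtering costs
    # B*log(n/delta) per loop, while computing I' and the medians costs
    # d*k*n/B with d = O(1/eps).
    d = 1.0 / eps
    return B * math.log(n / delta) + d * k * n / B

n, k, eps, delta = 2**20, 50, 0.1, 1e-8          # illustrative values only
B_star = math.sqrt(n * k / (eps * math.log(n / delta)))  # equalizes both terms
for other in (B_star / 2, 2 * B_star):
    assert inner_loop_cost(B_star, n, k, eps, delta) < inner_loop_cost(other, n, k, eps, delta)
```

At B = B⋆ the two terms are equal, which is the standard way to minimize a sum of the form aB + c/B.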

      Theorem 3.1 Running the algorithm with parameters ∊, δ < 1 gives x̂′ satisfying

‖x̂′ − x̂‖²∞ ≤ (∊/k) · Err²(x̂, k) + δ²‖x̂‖₁²

with probability 1 − 1/n and running time O(√(nk log(n/δ)/∊) · log n).

      Proof Define

si = |{r : i ∈ Ir}|  and  I′ = {i : si ≥ L/2}.

      Lemma 3.2 says that in each location iteration r, for any i with |x̂i| ≥ 4E,

Pr[i ∈ Ir] ≥ 1 − O(∊) ≥ 3/4.

      Thus, E[si] ≥ 3L/4, and each iteration is an independent trial, so by a Chernoff bound the chance that si < L/2 is at most 1/2^Ω(L) < 1/n³. Therefore by a union bound, with probability at least 1 − 1/n², i ∈ I′ for all i with |x̂i| ≥ 4E.
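The quantitative content of this Chernoff step can be seen from exact binomial tails, under the assumption (illustrative numbers, not from the text) that each location iteration reports coordinate i independently with probability 3/4:

```python
import math

def tail_below_half(L, p=0.75):
    # Pr[Binomial(L, p) < L/2]: the chance that a coordinate caught with
    # per-iteration probability p appears in fewer than half of the L
    # location loops, and hence is dropped from I'.
    return sum(math.comb(L, s) * p**s * (1 - p) ** (L - s) for s in range(L // 2))

# The tail decays exponentially in L; with L = O(log n) this is 1/poly(n).
assert tail_below_half(60) < tail_below_half(30) < tail_below_half(10) < 0.1
```

Doubling L roughly squares the failure probability's decay, which is why L = O(log n) iterations suffice for the 1/n³ bound in the proof.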

      Next, Lemma 3.1 says that for each estimation iteration r and index i,

Pr[|x̂ʳi − x̂i| > E] ≤ O(∊) < 1/4,

where x̂ʳi denotes the estimate of x̂i computed in iteration r.

      Therefore, with probability at least 1 − 2^−Ω(L) ≥ 1 − 1/n³, the estimate from iteration r satisfies |x̂ʳi − x̂i| ≤ E in at least 2L/3 of the iterations r.

      Since Re(x̂′i) is the median of the Re(x̂ʳi), and more than half of the iterations satisfy |x̂ʳi − x̂i| ≤ E, there must exist two such r with Re(x̂ʳi) above Re(x̂′i) in one and below it in the other. Hence Re(x̂′i) lies between two values that are each within E of Re(x̂i), so |Re(x̂′i) − Re(x̂i)| ≤ E, and similarly for the imaginary axis. Then

|x̂′i − x̂i|² ≤ |Re(x̂′i) − Re(x̂i)|² + |Im(x̂′i) − Im(x̂i)|² ≤ 2E².
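This median step is easy to check numerically. In the sketch below (synthetic data, not from the text), 11 of L = 15 estimates land within E of the true coefficient and 4 are arbitrary outliers, yet the coordinatewise median is within √2 · E:

```python
import numpy as np

rng = np.random.default_rng(1)
true_val, E = 3.0 - 2.0j, 0.5
# 11 "good" estimates within distance E of the true coefficient ...
angles = rng.uniform(0, 2 * np.pi, 11)
radii = rng.uniform(0, E, 11)
good = true_val + radii * np.exp(1j * angles)
# ... and 4 adversarial outliers far away.
bad = 100.0 * (rng.standard_normal(4) + 1j * rng.standard_normal(4))
ests = np.concatenate([good, bad])
# Median of real and imaginary parts taken separately, as in the proof.
med = np.median(ests.real) + 1j * np.median(ests.imag)
assert abs(med - true_val) <= np.sqrt(2) * E
```

The bound holds deterministically whenever more than half the estimates are good, since at least 8 of the 15 real (and imaginary) parts then lie within E of the true value, trapping the median.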

      By a union bound over i ∈ I′, with probability at least 1 − 1/n² we have |x̂′i − x̂i| ≤ √2 · E for all i ∈ I′. Since all i ∉ I′ have x̂′i = 0 and |x̂i| ≤ 4E with probability 1 − 1/n², with total probability 1 − 2/n² we have

‖x̂′ − x̂‖∞ ≤ 4E.

      Rescaling ∊ and δ gives our theorem. ■

       3.2.3 Extension

      In this section, we present an extension of the SFT 1.0 algorithm that adds a heuristic to improve the runtime. We refer to this new algorithm as SFT 2.0; it is shown in Algorithm 3.2. The idea of the heuristic is to apply an aliasing filter to restrict the locations of the large coefficients. The algorithm is parameterized by an integer M that divides n. It applies the aliasing filter as a preprocessing step to SFT 1.0 and uses its output to restrict the frequency locations in the set Ir output by the location loops, as shown in Algorithm 3.2.

      Observe that yi = x i(n/M) for i ∈ [M]. Thus,

ŷi = ∑ x̂j, where the sum is over all j ∈ [n] with j ≡ i (mod M).

      This means that the filter is very efficient, in that it has no leakage at all. Also, it is simple to compute. Unfortunately, it cannot be "randomized" using Pσ,τ,b: after permuting by σ and b, any two colliding elements j and j′ (i.e., such that j ≡ j′ (mod M)) continue to collide. Nevertheless, if x̂j is large, then j mod M is likely to lie in T—at least heuristically on random input.
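The no-leakage claim is exact and easy to verify numerically. Here is a sketch under NumPy's unnormalized FFT convention, which introduces an M/n scaling that may differ from the text's normalization; the parameter values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
n, M = 240, 16                       # M must divide n
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = x[:: n // M]                     # aliasing filter: y_i = x_{i(n/M)}
yhat = np.fft.fft(y)                 # M-point DFT, only O(M log M) time
xhat = np.fft.fft(x)
# Each bucket is *exactly* the sum of the coefficients that alias into it:
# yhat_i = (M/n) * sum over j == i (mod M) of xhat_j -- no leakage at all.
folded = xhat.reshape(n // M, M).sum(axis=0) * (M / n)
assert np.allclose(yhat, folded)
```

This also makes the collision problem visible: two large coefficients with the same residue mod M always fold into the same bucket, no matter how the input is permuted.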

      SFT 2.0 assumes that all "large" coefficients j have j mod M in T. That is, we restrict our sets Ir to contain only coordinates i with (i mod M) ∈ T.
