3/3/2026 at 7:22:20 PM
How could this lend insight into why the Fast Fourier Transform approximates self-attention?

> Because self-attention can be replaced with FFT for a loss in accuracy and a reduction in kWh [1], I suspect that the Quantum Fourier Transform can also be substituted for attention in LLMs.
[1] "FNet: Mixing tokens with Fourier transforms" (2021) https://arxiv.org/abs/2105.03824 .. "Google Replaces BERT Self-Attention with Fourier Transform: 92% Accuracy, 7 Times Faster on GPUs" https://syncedreview.com/2021/05/14/deepmind-podracer-tpu-ba...
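For context on what the FNet paper actually substitutes: the token-mixing sublayer becomes a parameter-free 2D Fourier transform (over the sequence and hidden dimensions), keeping only the real part. A minimal numpy sketch of that idea, assuming nothing beyond the paper's description (function and variable names here are illustrative, not the paper's code):

```python
import numpy as np

def fnet_mixing(x):
    """FNet-style token mixing: a 2D DFT over the sequence and
    hidden dimensions, keeping only the real part. Unlike
    self-attention, there are no learned parameters here."""
    # x: (seq_len, d_model) array of token embeddings
    return np.fft.fft2(x).real

# Toy input: 4 tokens with 8-dimensional embeddings
x = np.random.default_rng(0).normal(size=(4, 8))
y = fnet_mixing(x)
assert y.shape == x.shape  # mixing preserves the embedding shape
```

The full model interleaves this with the usual feed-forward sublayers; the FFT only replaces the attention (mixing) step.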
"Why formalize mathematics – more than catching errors" (2025) https://news.ycombinator.com/item?id=45695541
Can the QFT (Quantum Fourier Transform) and the IQFT (Inverse Quantum Fourier Transform) also be substituted for self-attention in LLMs, and do Lean formalisms provide any insight into how or why?
by westurner
3/3/2026 at 10:56:29 PM
> Because self-attention can be replaced with FFT for a loss in accuracy and a reduction in kWh [1], I suspect that the Quantum Fourier Transform can also be substituted for attention in LLMs.

Couldn't figure out where you are quoting this from.
> Can the QFT Quantum Fourier Transform (and IQFT Inverse Quantum Fourier Transform) also be substituted for self-attention in LLMs
No. The quantum Fourier transform is just a particular factorization of the DFT as run on a quantum computer. It's not any faster if you run it on a classical computer. And running (part of) an LLM would be more expensive on a quantum computer (because using arbitrary classical data with a quantum computer is expensive).
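A sanity check on that point: the n-qubit QFT applies the same N×N unitary DFT matrix (N = 2^n) that classical code computes directly; the quantum circuit is only a gate-level factorization of it. A numpy sketch (using the common sign/normalization convention for the QFT; conventions vary by textbook):

```python
import numpy as np

n = 3                       # qubits
N = 2 ** n                  # amplitudes
j, k = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
# QFT unitary: F[j, k] = exp(2*pi*i*j*k / N) / sqrt(N)
F = np.exp(2j * np.pi * j * k / N) / np.sqrt(N)

state = np.random.default_rng(1).normal(size=N) + 0j
state /= np.linalg.norm(state)

# Same linear map as the classical inverse DFT, up to normalization:
assert np.allclose(F @ state, np.sqrt(N) * np.fft.ifft(state))
# F is unitary, as any sequence of quantum gates must be:
assert np.allclose(F.conj().T @ F, np.eye(N))
```

The quantum speedup is that the circuit applies F to 2^n amplitudes with O(n^2) gates, but reading the resulting amplitudes back out classically is exactly the expensive data-loading problem mentioned above.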
by wasabi991011
3/4/2026 at 4:50:48 PM
My mistake. That's actually a quote of myself, from an also tangential comment re: "Transformer is a holographic associative memory" (2025) https://news.ycombinator.com/item?id=43029899 .. https://westurner.github.io/hnlog/#comment-43029899

There's more to that argument though.
Is quantum logic more appropriate for universal function approximation than LLMs (self-attention), which must not do better than next-word prediction unless asked (due to copyright)?
If quantum probabilistic logic is appropriate for all physical things, then quantum probabilistic logic is probably better at simulating physical things.
If LLMs, like [classical Fourier] convolution, are an approximation and they don't do quantum logic, then they cannot be sufficient for simulating physical things.
But we won't know until we have enough coherent qubits and we determine how to quantum embed these wave states. (And I have some notes on this, involving stars in rectangular lattices, nitrogenated lignin, and solitons.)
Or, it's possible to reason about what will be possible given sufficient QC to host an artificial neural network. How to quantum embed a trained LLM into qubit registers (or qubit storage) and use programmable/reconfigurable quantum circuits to lookup embeddings and do only feed-forward better than convolution?
But the QFT and IQFT solve the discrete logarithm problem (via period finding, as in Shor's algorithm).
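For reference, the role the QFT plays in discrete-log and factoring is period finding: the Fourier transform of a periodic sequence is supported only at multiples of N/r, where r is the period. The same spectral structure can be illustrated with a classical FFT on the modular-exponentiation sequence from factoring 15 (a toy instance; the quantum version operates on amplitudes rather than samples):

```python
import numpy as np

a, M = 2, 15
N = 16                                   # number of samples (a power of 2)
seq = np.array([pow(a, x, M) for x in range(N)], dtype=float)
# 2^x mod 15 cycles 1, 2, 4, 8, 1, 2, ... with period r = 4
spectrum = np.abs(np.fft.fft(seq))

# Spectral energy appears only at multiples of N/r = 4
peaks = {i for i in range(N) if spectrum[i] > 1e-9}
assert peaks == {0, 4, 8, 12}
```

Reading the period r = 4 off those peak spacings is the step the QFT performs on a quantum register; the rest of Shor's algorithm is classical post-processing.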
There's probably a place for quantum statistical mechanics in LLMs. Probably also counterfactuals including Constructor Theory counterfactuals.
by westurner
3/3/2026 at 9:15:54 PM
This is just standard Fourier theory of being able to apply dense global convolutions as pointwise operations in frequency space? There's no mystery here. It's no different from a more general learnable parameterization of "Efficient Channel Attention (ECA)".
by gyrovagueGeist
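The convolution theorem being invoked here, sketched: a dense circular convolution over the token axis equals a pointwise multiplication in frequency space, so a learnable per-frequency filter amounts to a learnable global convolution. A minimal numpy check (variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8
x = rng.normal(size=n)        # signal, e.g. one channel over n tokens
h = rng.normal(size=n)        # a dense "global" filter

# Direct circular convolution: (x * h)[i] = sum_j x[j] * h[(i - j) mod n]
direct = np.array([sum(x[j] * h[(i - j) % n] for j in range(n))
                   for i in range(n)])

# Equivalent pointwise multiplication in frequency space
via_fft = np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)).real

assert np.allclose(direct, via_fft)
```

Learning np.fft.fft(h) directly (one complex weight per frequency) is the "more general learnable parameterization" reading of frequency-domain token mixing.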
3/3/2026 at 9:21:05 PM
> There’s no mystery here.
Yes and no. Yes, "no mystery," because for some reason there's this belief that studying math is useless and that suggesting it's worthwhile is gatekeeping. But also no, because there are deeper and more nuanced questions here; of course there are, but for some reason we're proud of our black boxes and act like there's no other way.
by godelski