5/18/2025 at 6:49:05 PM
I see that you're looking for clusters within PCA projections. You should look for deeper structure with hot new dimensionality reduction algorithms like PaCMAP or LocalMAP! I've been working on a project related to a sensemaking tool called Pol.is [1], reprojecting its wiki survey data with these new algorithms instead of PCA, and it's amazing what new insight they uncover (a minimal sketch of the reprojection step is below).
https://patcon.github.io/polislike-opinion-map-painting/
Painted groups: https://t.co/734qNlMdeh
(Sorry, only really works on desktop)
[1]: https://www.technologyreview.com/2025/04/15/1115125/a-small-...
by patcon
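For anyone who wants to try that reprojection step, here is a minimal sketch. It assumes the pacmap package and uses a synthetic stand-in for the vote matrix; variable names and preprocessing are illustrative, not taken from the linked project:

    # Sketch: reproject an opinion matrix with PaCMAP instead of PCA.
    # Assumes `pip install pacmap numpy`; the vote matrix below is a
    # synthetic stand-in, not an actual Pol.is export.
    import numpy as np
    import pacmap

    rng = np.random.default_rng(0)
    # agree (+1) / disagree (-1) / pass (0) votes, participants x statements
    X = rng.integers(-1, 2, size=(500, 40)).astype(float)

    # PaCMAP preserves local and global structure; defaults usually suffice.
    reducer = pacmap.PaCMAP(n_components=2, n_neighbors=10, random_state=0)
    coords = reducer.fit_transform(X)
    print(coords.shape)  # (500, 2), ready for plotting / cluster painting

    # Recent pacmap releases also ship LocalMAP (pacmap.LocalMAP), which
    # can be swapped in with the same fit_transform interface.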
5/18/2025 at 7:01:41 PM
Thanks for pointing those out. I hadn't seen PaCMAP or LocalMAP before, but they definitely look like the kind of structure-preserving approach that would fit this data better than PCA. Appreciate the nudge; going to dig into those a bit more.
by brig90
5/19/2025 at 3:21:14 AM
Try TDA ("mapper", or really, anything based on kernel-density-computed connectivity); it's a whole new world. This ain't your parents' "factor analysis". (A sketch of the basic mapper workflow is below.)
by loxias
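A minimal mapper sketch, assuming the kmapper (KeplerMapper) package; the lens and clustering choices here are illustrative defaults, not a recommendation from the comment above:

    # Sketch: the TDA "mapper" workflow with KeplerMapper.
    # Assumes `pip install kmapper scikit-learn numpy`.
    import kmapper as km
    from sklearn.cluster import DBSCAN
    from sklearn.datasets import make_circles

    X, _ = make_circles(n_samples=500, noise=0.05, factor=0.3, random_state=0)

    mapper = km.KeplerMapper(verbose=0)

    # 1. Project the data to a 1-D "lens" (here: the L2 norm of each point).
    lens = mapper.fit_transform(X, projection="l2norm")

    # 2. Cover the lens with overlapping intervals, cluster each preimage,
    #    and connect clusters that share points.
    graph = mapper.map(
        lens,
        X,
        cover=km.Cover(n_cubes=10, perc_overlap=0.3),
        clusterer=DBSCAN(eps=0.3, min_samples=5),
    )

    # The output is a graph (nodes = clusters, edges = shared points), not
    # a coordinate embedding, which is why mapper isn't, strictly speaking,
    # a dimensionality reduction algorithm.
    mapper.visualize(graph, path_html="mapper_output.html")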
5/19/2025 at 4:24:09 PM
Ooooo I will definitely check it out! It's strangely hard to find any comparisons in YouTube videos. It seems TDA isn't actually a dimensionality reduction algorithm, but something closely related, maybe?
by patcon
5/19/2025 at 11:04:00 AM
LLM interpretability also uses sparse autoencoders to find concept representations (https://openai.com/index/extracting-concepts-from-gpt-4/), and, more recently, linear probes.
by khafra
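To make both ideas concrete, here is a toy sketch, assuming PyTorch and scikit-learn; the activations are random stand-ins, and the layer sizes and penalty weight are illustrative, not the setup from the linked OpenAI work:

    # Sketch: (1) a sparse autoencoder over model activations,
    #         (2) a linear probe for a concept label.
    # Assumes `pip install torch scikit-learn numpy`; data is synthetic.
    import numpy as np
    import torch
    import torch.nn as nn
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    acts = rng.normal(size=(2048, 256)).astype(np.float32)  # stand-in activations

    # (1) Sparse autoencoder: overcomplete dictionary + L1 sparsity penalty,
    # so each hidden unit tends toward an interpretable "concept" direction.
    class SparseAutoencoder(nn.Module):
        def __init__(self, d_model, d_hidden):
            super().__init__()
            self.encoder = nn.Linear(d_model, d_hidden)
            self.decoder = nn.Linear(d_hidden, d_model)

        def forward(self, x):
            z = torch.relu(self.encoder(x))  # sparse concept activations
            return self.decoder(z), z

    sae = SparseAutoencoder(256, 1024)
    opt = torch.optim.Adam(sae.parameters(), lr=1e-3)
    xb = torch.from_numpy(acts)
    for _ in range(200):
        recon, z = sae(xb)
        loss = ((recon - xb) ** 2).mean() + 1e-3 * z.abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()

    # (2) Linear probe: fit a logistic regression directly on activations
    # to test whether a concept is linearly decodable.
    labels = (acts[:, 0] > 0).astype(int)  # synthetic concept label
    probe = LogisticRegression(max_iter=1000).fit(acts, labels)
    print("probe accuracy:", probe.score(acts, labels))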
5/18/2025 at 8:03:51 PM
I've had much better luck with UMAP than with PCA and t-SNE for reducing embeddings.
by staticautomatic
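For reference, a minimal UMAP sketch, assuming the umap-learn package; the embeddings are a synthetic stand-in and the parameters shown are the library defaults:

    # Sketch: reduce high-dimensional embeddings with UMAP.
    # Assumes `pip install umap-learn numpy`; embeddings are synthetic.
    import numpy as np
    import umap

    rng = np.random.default_rng(0)
    embeddings = rng.normal(size=(1000, 384)).astype(np.float32)

    # n_neighbors trades off local vs. global structure;
    # min_dist controls how tightly points crowd together.
    reducer = umap.UMAP(n_components=2, n_neighbors=15, min_dist=0.1,
                        random_state=0)
    coords = reducer.fit_transform(embeddings)
    print(coords.shape)  # (1000, 2)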
5/18/2025 at 10:43:43 PM
PaCMAP (and its descendant LocalMAP) is comparable to t-SNE at preserving both local and global structure, but without much fiddling with finicky hyperparameters.
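To illustrate the hyperparameter point, a side-by-side sketch with both libraries at their defaults (scikit-learn's t-SNE and pacmap's PaCMAP; the data is a synthetic stand-in):

    # Sketch: t-SNE vs PaCMAP on the same data, both at library defaults.
    # Assumes `pip install scikit-learn pacmap numpy`.
    import numpy as np
    import pacmap
    from sklearn.manifold import TSNE

    rng = np.random.default_rng(0)
    X = rng.normal(size=(600, 50)).astype(np.float32)

    # t-SNE output is sensitive to perplexity (and learning rate / iterations).
    tsne_coords = TSNE(n_components=2, perplexity=30.0,
                       random_state=0).fit_transform(X)

    # PaCMAP's defaults (n_neighbors, MN_ratio, FP_ratio) rarely need tuning.
    pacmap_coords = pacmap.PaCMAP(n_components=2, random_state=0).fit_transform(X)

    print(tsne_coords.shape, pacmap_coords.shape)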