1/10/2025 at 6:37:34 PM
I see that bounty at the bottom, so I'm tossing away my chances here, but this visualization is just asking to be mapped onto a Hilbert curve. [0] When you "stripe" the data like this, points that are sorted close together can end up pretty far apart: a step along the Y axis skips an entire row of data, whereas a step along the X axis is 1-to-1 with the source data. If you map it onto a Hilbert curve, the X and Y axes mean nothing, but points that are close together in the sorted list will be visually close together in the output image.
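A minimal sketch of the index-to-pixel mapping described above, using the standard iterative conversion from a position `d` along the curve to `(x, y)` on a `2**order` by `2**order` grid. The function name `hilbert_d2xy` and the choice of grid size are assumptions for illustration, not taken from the linked playground code.

```python
def hilbert_d2xy(order, d):
    """Map position d along a Hilbert curve to (x, y) on a 2**order square grid."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate/flip the sub-square so the pieces of the curve join up.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


# An order-1 curve visits (0, 0), (0, 1), (1, 1), (1, 0) in that order.
points = [hilbert_d2xy(1, d) for d in range(4)]
```

Each sorted item at rank `d` would be drawn at pixel `hilbert_d2xy(order, d)`, so consecutive ranks always land on edge-adjacent pixels.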
Since the first part of an ISBN is the country, the second part is the publisher, and the third part is the title, with a checksum at the end, I would remove the checksum and sort each one as a big number (no hyphens).
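A rough sketch of that sort key, assuming the input is a list of ISBN strings that may contain hyphens or spaces and whose last character is the check digit. The helper name `isbn_sort_key` and the sample ISBNs are illustrative assumptions.

```python
def isbn_sort_key(isbn):
    """Strip separators, drop the trailing check character, and read the rest as an integer."""
    compact = isbn.replace("-", "").replace(" ", "")
    # The last character is the check digit (it can be 'X' in ISBN-10), so drop it.
    return int(compact[:-1])


isbns = ["978-3-16-148410-0", "978-0-306-40615-7"]
ordered = sorted(isbns, key=isbn_sort_key)
```

The rank of each ISBN in `ordered` is then the position along the curve at which it would be plotted.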
You should end up with "islands", where you see big areas covered by big publishing countries, with these "islands" having bright spots for the publisher codes.
Bonus points for labeling these areas!
I set up something a while ago [1] for an interview that does this with weather data. It makes the seasons really obvious since they're all grouped together.
[0] https://en.wikipedia.org/wiki/Hilbert_curve
[1] https://graypegg.com/hilbert (https://github.com/graypegg/hilbertcurveplayground code if anyone wants to go for the prize using this! Please at least mention me if you decide to reuse this code, but I can't stop ya lol)
by graypegg
1/10/2025 at 7:03:50 PM
And there's a generalized Hilbert curve, the Gilbert curve, for non-power-of-two rectangular regions [0] (online demo [1]).
by abetusk
1/11/2025 at 12:25:46 AM
What property makes the Hilbert curve desirable compared to, say, a snake pattern, with which neighbouring ISBNs are also neighbours in the visualisation?
The worry I have with Hilbert curves is that they make the result look like there are distinct "squares" of data [0] when really this is just an artifact of how Hilbert curves work. In that sense, the current visualization is more useful, because it's straightforward to identify the location of each country in it.
[0] https://raw.githubusercontent.com/jakubcerveny/gilbert/maste...
by n2d4
1/12/2025 at 4:19:23 AM
In a snake pattern, the neighbouring pixels on the left and right are related, but the ones above and below have skipped a whole row.
And yeah, that's true! You end up with squares with Hilbert curves. But those squares are all "related" data. Then those squares are related to the squares near them. Zoom out more and that grouping of squares is related to the neighbouring macro-squares, etc.
Basically the square shape is a positive. Kind of like how charting the derivative lets you see how random/related information is, grouping into these squares gives you a visualization of pattern-ness, rather than any specific measurement.
by graypegg
1/12/2025 at 5:18:11 PM
> In a snake pattern, the neighbouring pixels on the left and right are related, but the ones above and below have skipped a whole row.
But this is also true in Hilbert curves across the boundaries of the "squares" that I mentioned. The two center pixels in the top row are much more distant than any two pixels would be in a snake pattern.
by n2d4
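One way to check the claim above is to measure, for each layout, the largest gap in sorted position between any two edge-adjacent pixels. This is only a sketch: the helper names (`snake_xy`, `worst_neighbour_gap`) and the grid sizes are assumptions, and it looks at the worst case only, not at how closeness is distributed on average.

```python
def hilbert_d2xy(order, d):
    """Map position d along a Hilbert curve to (x, y) on a 2**order square grid."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def snake_xy(width, d):
    """Row-by-row layout that reverses direction on every other row."""
    y, x = divmod(d, width)
    return (x if y % 2 == 0 else width - 1 - x), y


def worst_neighbour_gap(to_xy, n):
    """Largest difference in sorted position between two edge-adjacent pixels."""
    index_of = {to_xy(d): d for d in range(n * n)}
    worst = 0
    for (x, y), d in index_of.items():
        for neighbour in ((x + 1, y), (x, y + 1)):
            if neighbour in index_of:
                worst = max(worst, abs(d - index_of[neighbour]))
    return worst


for order in (2, 4, 6):
    n = 1 << order
    snake = worst_neighbour_gap(lambda d: snake_xy(n, d), n)
    hilbert = worst_neighbour_gap(lambda d: hilbert_d2xy(order, d), n)
    print(f"{n}x{n}: snake worst gap {snake}, Hilbert worst gap {hilbert}")
```

On a 4x4 grid the snake layout's worst gap is 7 (one full turn of two rows), while the Hilbert layout's worst gap is 13, across the seam between its first and last quadrants.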
1/11/2025 at 10:59:59 AM
> What property makes the Hilbert curve desirable compared to, say, a snake pattern, with which neighbouring ISBNs are also neighbours in the visualisation?
A 2D neighbourhood is better than a 1D one
> The worry I have with Hilbert curves is that they make the result look like there are distinct "squares" of data
that's the point, tho? instead of distinct lines of taken ISBNs in a row, you get distinct squares of taken ISBNs in a row - much more noticeable
by NooneAtAll3