1/15/2025 at 12:24:31 AM
I've been maintaining my personal website as plain HTML for five years now. I must say, I quite like this method. There's no substitute for practice when it comes to maintaining your skills at editing HTML and CSS.
Yes, you must copy and paste content, and not having a layout page is annoying at times. But the overhead of just doing it yourself is surprisingly small in terms of the time commitment.
Typically, I'll draft a post in MS Word, then open the git repo for my site (hosted on GitHub Pages), duplicate and rename the template.html page that includes the CSS, footer, and header for my site, and then copy my content into it. When I'm happy with everything, I'll make my commit, and a minute later it's live at my custom domain. Seeing that it takes only 11 KB and 26 ms to load my landing page is strangely delightful.
by uncheckederror
1/15/2025 at 5:21:10 AM
> copy and paste content and not having layout page is annoying at times
HTML was envisioned as an SGML application/vocabulary, and SGML has those power features, such as type-checked shared fragments/text macros (entities, possibly with parameters), safe third-party content transclusion, markup stream processing and filtering for generating a table of content for page or site navigation, content screening for removal/rejection of undesired script in user content, expansion of custom Wiki syntax such as markdown into HTML, producing "views" for RSS or search result pages in pipelines, etc. etc. See [1] for a basic tutorial.
[1]: https://sgmljs.net/docs/producing-html-tutorial/producing-ht...
by tannhaeuser
1/15/2025 at 8:18:24 AM
I didn't expect this to be serious and am surprised that the tutorial actually delivers. Way back when I was learning HTML, it was said that it was built with SGML, but this relation remained a total mystery to me.
by kreetx
1/15/2025 at 3:50:15 PM
> I didn't expect this to be serious and am surprised on that the tutorial actually delivers.
Same here. I believed that SGML was like the Lisp of markup languages, except that it went completely extinct. Good to see it's still usable, I feel like I want to try it out now (instead of making a third generation of my static site generator from scratch).
by TeMPOraL
1/16/2025 at 11:59:36 AM
I’ve become quite a fan of writing in SGML personally, because much of what you note is spot-on. Some of the points seem a bit of a stretch though.
Any type-checking inside of SGML is more akin to unused-variable checking. When you say that macros/entities may contain parameters, I think you are referring to recursive entity expansion, which does let you parameterize macros (but only once, and not dynamically within the text). For instance, you can set a `&currentYear` entity and refer to that in `copywrite "&currentYear/&currentDay`, but that can only happen in the DTD at the start of the document. It’s not the case that you could, for instance, create an entity to generate a GitHub repo link and use it like `&repoName = "diff-match-patch"; &githubLink`. This feature was used in limited form to conditionally include sections of markup since SGML contains an `IGNORE` “marked section”.
<!ENTITY % private-render "IGNORE">
...
<![%private-render[
<side-note>
I’m on the fence about including this bit.
It’s not up to the editorial standards.
</side-note>
]]>
SGML also fights hard against stream processing, even more so than XML (and XML pundits regret not deprecating certain SGML features like entities which obstruct stream processing). Because of things like this, it’s not possible to parse a document without having the entire thing from the start, and because of things like tag omission (which is part of its syntax “MINIMIZATION” features), it’s often not possible to parse a document without having _everything up to the end_.
Would love to hear what you are referring to with “safe” third-party transclusion and also what features are available for removal or rejection of undesired script in user content.
Apart from these I find it a pleasure to use because SGML makes it easy for _humans_ to write structured content (contrast with XML which makes it easy for software to parse). SGML is incredibly hard to parse because in order to accommodate human factors _and actually get people to write structured content_ it leans heavily on computers and software doing the hard work of parsing.
It’s missing some nice features such as namespacing. That is, it’s not possible to have two elements of the same name in the same document with different attributes, content, or meanings. If you want to have a flight record and also a list of beers in a flight, they have to be differentiated otherwise they will fail to parse.
<flight-list>
<flight-record><flight-meta pnr=XYZ123 AAL number=123>
</flight-list>
<beer-list>
<beer-flight>
<beer Pilsner amount=3oz>Ultra Pils 2023
<beer IPA>Dual IPA
<beer Porter>Chocolate milk stout
</beer-list>
DSSSL was supposed to be the transforms into RSS, page views, and other styles or visualizations. With XML arose XSL/XSLT which seemed to gain much more traction than DSSSL ever did. My impression is that declarative transforms are best suited for simpler transforms, particularly those without complicated processing or rearranging of content. Since `osgmls` and the other few SGML parsers are happy to produce an equivalent XML document for the SGML input, it’s easy to transform an SGML document using XSL, and I do this in combination with a `Makefile` to create my own HTML pages (fair warning: HTML _is not XML_ and there are pitfalls in attempting to produce HTML from an XML tool like XSL).
For more complicated work I make quick transformers with WordPress’ HTML API to process the XML output (I know, XML also isn’t HTML, but it parses reliably for me since I don’t produce anything that an HTML parser couldn’t parse). Having an imperative-style processor feels more natural to me, and one written in a programming language that lets me use normal programming conveniences. I think getting the transformer right was never fully realized with the declarative languages, which are similar to Angular and other systems with complicated DSLs inside string attribute values.
I’d love to see the web pick up where SGML left off and get rid of some of the legacy concessions (SGML was written before UTF-8 and its flexibility with input encodings shows it — not in a good way either) as well as adopt some modern enhancements. I wrote about some of this on my personal blog, sorry for the plug.
https://fluffyandflakey.blog/2024/10/11/ugml-a-proposal-to-u...
Edit: formatting
by dmsnell
1/16/2025 at 8:23:38 PM
Nice to meet a fellow SGML fan!
> When you say that macros/entities may contain parameters, I think you are referring to recursive entity expansion,
No, I'm referring to SGML data attributes (attributes declared on notations having concrete values defined on entities of the respective notation); cf. [1]. In sgmljs.net SGML, these can be used for SGML templating which is a way of using data entities declared as having the SGML notation (ie. stand-alone SGML files or streams) to replace elements in documents referencing those entities. Unlike general entities, this type of entity expansion is bound to an element name and is informed of the expected content model and other contextual type info at the replacement site, hence is type-safe. Data attributes supplied at the expansion site appear as "system-specific entities" in the processing context of the template entity. See [2] for details and examples.
Understanding and appreciating the construction of templating as a parametric macro expansion mechanism without additional syntax may require intimate knowledge of lesser known SGML features such as LPDs and data entities, and also some HyTime concepts.
> create an entity to generate a Github repo link
Templating can turn text data from a calling document into an entity in the called template sub-processing context so might help with your use case, and with the limitation to have to declare things in DTDs upfront in general.
> it’s not possible to parse a document without having the entire thing from the start, and because of things like tag omission (which is part of its syntax “MINIMIZATION” features), it’s often not possible to parse a document without having _everything up to the end_.
Why do you think so and why should this be required by tag inference specifically? In sgmljs.net SGML, for external general entities (unlike external parameter entities which are expanded at the point of declaration rather than usage), at no point does text data have to be materialised in its entirety. The parser front-end just switches input events from another external source during entity expansion and switches back afterwards, maintaining a stack of open entities.
Regarding namespaces, one of their creators (SGML demigod James Clark himself) considers those a failure:
> the pain that is caused by XML Namespaces seems massively out of proportion to the benefits that they provide (cf. [3]).
In sgmljs.net SGML, you can handle XML namespace mappings using the special processing instructions defined by ISO/IEC 19757-9:2008. In effect, element and attributes having names "with colons" are remapped to names with canonical namespace parts (SGML names can allow colons as part of names), which seems like the sane way to deal with "namespaces".
I haven't checked your site, but most certainly will! Let's keep in touch; you might also be interested in sgmljs.net SGML and the SGML DTD for modern HTML at [4], to be updated for WHATWG HTML review draft January 2025 when/if it's published.
Edit:
> Would love to hear what you are referring to with “safe” third-party transclusion and also what features are available for removal or rejection of undesired script in user content.
In short, I was mainly referring to DTD techniques (content models, attribute defaults) here.
[1]: https://sgmljs.net/docs/sgmlrefman.html#data-entities
[2]: https://sgmljs.net/docs/templating.html
by tannhaeuser
1/15/2025 at 8:13:28 AM
> Yes, you must copy and paste content and not having layout page is annoying at times. But the overhead of just doing it yourself is surprisingly small in terms of the time commitment.
This calls out for server side includes[0]. I so loved server side includes back in the late 90s. You still work in plain HTML and CSS, boilerplate can be centralized and not repeated, and clients receive the entire page in a single request.
by EvanAnderson
1/15/2025 at 1:03:48 PM
> But the overhead of just doing it yourself is surprisingly small in terms of the time commitment.
Holy cow. The sole reason I learned SSI and then PHP in 1998 was because I was sick of this after like 2 weeks.
This person has more patience in their pinky than I have ever had.
by wink
1/15/2025 at 12:29:52 AM
> Yes, you must copy and paste content
Many people who maintain their own sites in vanilla web technologies tend to create reusable functions to handle this for them. It can generate headers and the like dynamically so you don't have to change it on every single page. Though that does kill the "no javascript required" aspect a lot of people like.
Of course you could simply add a build step to your pure HTML site instead!
by culi
1/15/2025 at 1:24:13 AM
I recently learned the object tag can do what I wished for in the 90s... work as an include tag: <object data="footer.html"></object>
Turn your back for twenty-five years, and be amazed at what they've come up with! ;-)
Should reduce a lot of boilerplate that would get out of sync on my next project, without need for templating.
by mixmastamyk
1/17/2025 at 8:30:26 AM
Hey, I need to try this out, so it is like iframe except the frame part and all its issues?
by johnisgood
1/15/2025 at 2:15:53 AM
Unfortunately that will require the client to make additional web requests to load the page, effectively doubling latency at a minimum.
by liontwist
1/15/2025 at 5:26:10 AM
A few extra <object> in a blog post is a worthwhile tradeoff, if you're literally using raw HTML.
- HTTP/1.1 (1997) already reuses connections, so it will not double latency. The DNS lookup and the TCP connection are a high fixed cost for the first .html request.
- HTTP/2 (2015) further reduces the cost of subsequent requests, with a bunch of techniques, like dictionary compression.
- You will likely still be 10x faster than a typical "modern" page with JavaScript, which has to load the JS first, and then execute it. The tradeoff has flipped now, where execution latency for JS / DOM reflows can be higher than network latency. So using raw HTML means you are already far ahead of the pack.
So say you have a 50 ms time for the initial .html request. Then adding some <object> might bring you to 55 ms, 60 ms, 80 ms, 100 ms.
But you would have to do something pretty bad to get to 300 ms or 1500 ms, which you can easily see on the modern web.
So yes go ahead and add those <object> tags, if it means you can get by with no toolchain. Personally I use Markdown and some custom Python scripts to generate the header and footer.
by chubot
1/15/2025 at 4:22:55 PM
Yes, I’d add that not merely “raw html” but a file on disk can be served directly by Linux without context switches (I forget the syscall), and transferred faster than generation.
by mixmastamyk
1/17/2025 at 8:31:14 AM
sendfile? splice? io_uring?
by johnisgood
1/17/2025 at 7:50:37 PM
Yes, most likely sendfile.
by mixmastamyk
1/15/2025 at 2:56:57 AM
Sounds like premature optimization for a simple page. If the objects are sized their regions should be fillable afterward without need to resize and be cached for subsequent access.
by mixmastamyk
1/15/2025 at 4:00:45 AM
The other solutions are even easier and don’t double latency.
> be cached for subsequent access.
So now you need to set up cache control?
by liontwist
1/15/2025 at 4:18:52 PM
Nope and nope.
by mixmastamyk
1/15/2025 at 6:57:09 PM
Good explanation. I’ll stick with cat.
by liontwist
1/17/2025 at 7:49:03 PM
Have a look at the rest of the thread. Chubot explains at length, and I added a few points.
by mixmastamyk
1/15/2025 at 7:19:50 PM
I didn't know you could use object tags in that way! Thanks. That seems like a great solution if you're cool with an extra request.
by culi
1/15/2025 at 9:23:15 AM
Couldn't you sort of do that using server side includes back in the 90s? Assuming that your web server supported it.
by mrweasel
1/16/2025 at 1:31:05 AM
Yes, and a Makefile was an option as well. But an include tag was a no-brainer not long after html was invented. Especially after img, link, applet, frame, etc were implemented.
by mixmastamyk
1/15/2025 at 1:06:46 AM
I've adopted the idea that a blog post is archived when it's published; I don't want to tinker with it again. Old pages may have an old style, but that's OK, it's an archive. Copy/paste works great for this.
The only reason I use a blog engine now (Hugo) is for RSS. I kept messing up or forgetting manual RSS edits.
by 8organicbits
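For anyone hand-maintaining a feed, the error-prone part is mostly the XML boilerplate. Below is a minimal sketch of generating an RSS 2.0 feed from a list of posts using only Python's standard library. The `build_rss` name and the post dictionary keys (`title`, `url`, `published`) are assumptions made up for illustration; this is not how Hugo does it.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_rss(site_title: str, site_url: str, posts: list[dict]) -> str:
    """Return an RSS 2.0 document as a string.

    Each post is a dict with hypothetical keys: "title", "url", and
    "published" (a timezone-aware datetime).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = site_title
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = site_title

    for post in posts:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = post["title"]
        ET.SubElement(item, "link").text = post["url"]
        # The post URL doubles as the unique identifier here.
        ET.SubElement(item, "guid").text = post["url"]
        # RSS expects RFC 822 style dates; format_datetime produces them.
        ET.SubElement(item, "pubDate").text = format_datetime(post["published"])

    # ElementTree escapes special characters such as "&" in the text nodes.
    return ET.tostring(rss, encoding="unicode")
```

Running something like this from the same script that publishes a post would remove the manual feed edit, at the cost of a small build step.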
1/15/2025 at 2:21:37 AM
I really love this! I've seen it in action a couple times in the wild, and it's super cool seeing how the site's design has evolved over time.
It also has the benefit of forcing you to keep your URIs stable. Cool URIs don't change: https://www.w3.org/Provider/Style/URI.html
by promiseofbeans
1/15/2025 at 7:30:08 AM
Or, let me be cheeky: you could add some `<?php include('header.html'); ?>` in your html.
by arkh
1/15/2025 at 7:10:58 AM
> It can generate headers and the like dynamically so you don't have to change it on every single page
Yeah, I noped out of that and use a client-side include (webcomponent) so that my html can have `<include-remote remote-src='....'>` instead.
Sure, it requires JS to be enabled for the webcomponent to work, but I'm fine with that.
See https://www.lelanthran.com for an example.
[EDIT: Dammit, my blog doesn't use that webcomponent anymore! Here's an actual production usage of it: https://demo.skillful-training.com/project/webroot/ (use usernames (one..ten)@example.com and password '1' if you want to see more usage of it)]
by lelanthran
1/15/2025 at 7:06:34 PM
yeah clearly there's a lot of ways to solve this issue if javascript is enabled. But there's a big overlap between the folks who wanna use vanilla web technologies and the folks who want their site to run without javascript
by culi
1/15/2025 at 1:55:15 AM
Isn't using React with a static site generator framework basically the same thing but better?
by spoonfeeder006
1/15/2025 at 7:27:32 PM
Not remotely! Unless you meant Preact. React ships an entire rendering engine to the front-end. Most sites that use React won't load anything if javascript isn't enabled.
by culi
1/15/2025 at 9:24:59 AM
Then you'd have to learn React, and for many of us the point is that we really don't want to learn React, or other frontend frameworks.
by mrweasel
1/15/2025 at 3:40:57 AM
Yes, if you want to throw up in your mouth.
by datavirtue
1/15/2025 at 6:49:51 AM
In theory yes, in practice good luck maintaining that if you are just a solo blogger.
I doubt your blog would last a single month without some breaking change of some sort in one of the packages.
by realusername
1/16/2025 at 1:04:44 AM
you mean npm packages? why would you need to update those anyhow?
by spoonfeeder006
1/16/2025 at 2:50:59 AM
Because at some point it will cease to work? It needs upgrades like any other project.
Every upgrade in the JS world is very painful.
by realusername
1/16/2025 at 9:37:57 PM
Why will they stop working eventually? Assuming they are all self contained and you don't upgrade even node js for that project
Edit: Oh right, OS upgrades could do it. Or network keys changing etc...
by spoonfeeder006
1/16/2025 at 10:02:25 PM
Yeah I guess React + SSG isn't the best choice. Nano JSX might be better
by spoonfeeder006
1/15/2025 at 8:54:13 AM
Yes, it is. Unfortunately HN has a crazy bias against JavaScript (the least crazy part of the web stack) and in favour of HTML and CSS, even though the latter are worse in every meaningful way.
by lmm
1/16/2025 at 11:47:08 AM
It isn't crazy, judging by the number of times I've seen posts here and on other blogs talking about a 100k web page ballooning to 8Mb because of all the Javascript needed to "collect page analytics" or do user tracking when ads are included. Granted that may not be needed for personal websites, but for almost anything that has to be monetized you're going to get stuck with JS cancer because some sphincter in a suit needs for "number to go up".
by dickersnoodle
1/17/2025 at 12:39:49 AM
> I've seen posts here and on other blogs talking about a 100k web page ballooning to 8Mb because of all the Javascript needed to "collect page analytics" or do user tracking when ads are included
Perfect example. HN will see a page with 6Mb of images/video, 1Mb of CSS and 200Kb of JavaScript and say "look at how much the JavaScript is bloating that page".
by lmm
1/15/2025 at 9:06:00 AM
I don't even know where to begin with the pretence that you can compare HTML with JS and somehow conclude that one is 'better' than the other. They are totally different things. JS is for functionality, and if you're using it to serve static content, you're not using it as designed.
by oneeyedpigeon
1/15/2025 at 9:08:02 AM
I don't particularly care about "designed for". If you've got to serve something to make the browser display the static content you want it to, the least unpleasant way to do so is with JS.
by lmm
1/15/2025 at 3:57:00 PM
Least unpleasant to the developer. Most unpleasant to the user. It breaks all kinds of useful browser features (which frontend devs then recreate from scratch in JS, poorly; that's probably the most widespread variant of Greenspun's tenth rule in practice).
by TeMPOraL
1/16/2025 at 2:10:56 AM
> It breaks all kinds of useful browser features (which frontend devs then recreate from scratch in JS, poorly; that's probably the most widespread variant of Greenspun's tenth rule in practice).
Nah, it's the opposite. JS tends to perform better and be more usable for the same level of feature complexity (people who want more complex sites, for good reasons or bad, tend to use JS, but if you compare like with like), HN just likes to use them as a stick to reinforce their prejudices. (E.g. if you actually test with a screenreader, aria labels work better than "semantic" HTML tags)
by lmm
1/16/2025 at 12:22:35 PM
> E.g. if you actually test with a screenreader, aria labels work better than "semantic" HTML tags
Interesting how this is opposite to the recommendations from MDN, such as:
Warning: Many of these widgets are fully supported in modern browsers. Developers should prefer using the correct semantic HTML element over using ARIA, if such an element exists.
The first rule of ARIA use is "If you can use a native HTML element or attribute with the semantics and behavior you require already built in, instead of re-purposing an element and adding an ARIA role, state or property to make it accessible, then do so." -- which also refers to: https://www.w3.org/TR/using-aria/#rule1
Though I can believe that real life may play out different than recommendations.
Also, as I understand it, ARIA is orthogonal to JS, and it doesn't alter behavior for browser users.
by TeMPOraL
1/15/2025 at 9:56:35 AM
> Yes, you must copy and paste content and not having layout page is annoying at time
I think this was one of the most common usages of PHP in the beginning, at least for those who basically wrote static HTML/CSS and needed a header/footer. It was probably a gateway into more advanced dynamic pages, eventually ending up using databases and other advanced functionality.
<?php include('header.inc'); ?>
<p>Here's a list of my favourite movies</p>
<ul>
<li>...</li>
</ul>
<?php include('footer.inc'); ?>
It would be great if HTML had a similar capability. People have asked for it for over 30 years, so it's unlikely that it will be implemented now.
by throwaway04623
1/15/2025 at 12:12:15 PM
> > Yes, you must copy and paste content and not having layout page is annoying at time
> I think this was one of the most common usages of PHP in the beginning,
> <?php include('header.inc'); ?>
And other tools beforehand: basic CGI or even more basic server-side includes (https://en.wikipedia.org/wiki/Server_Side_Includes)
To reduce CPU and IO load on the server (or just in circumstances where SSI was not enabled on the server they had available) some would pre-process the SSI directives (obviously this doesn't work for dynamic results such as the output from many #exec examples), so all that is being served is simple static files ‑ a precursor to more complex modern static site builders.
> It would be great if HTML had a similar capability. People have asked for it for over 30 years, so it's unlikely that it will be implemented now.
That doesn't really fit with the intentions of HTML, and could impose a bunch of extra network latency overhead compared to using SSI instead, leading to either complete rendering delays or jumpy pages as included content is merged in in steps, though I have seen it implemented multiple ways using a bit of JS (some significantly more janky than others).
by dspillett
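The pre-processing idea above can be sketched in a few lines: expand the include directives once at build time and serve the result as a static file. This is a minimal sketch in Python, assuming only the `include file="..."` form of the directive (no `virtual=`, `#exec`, or variables) and paths relative to the including file. The `expand_includes` name and the depth limit are illustrative choices, not part of any server's behavior.

```python
import re
from pathlib import Path

# Matches both <!--#include file="x" --> and <!--# include file="x" -->.
INCLUDE = re.compile(r'<!--\s*#\s*include\s+file="([^"]+)"\s*-->')


def expand_includes(path: Path, depth: int = 0) -> str:
    """Return the text of `path` with include directives replaced inline."""
    if depth > 10:
        # Guard against files that include each other in a loop.
        raise RecursionError(f"includes nested too deeply at {path}")
    text = path.read_text(encoding="utf-8")

    def replace(match: re.Match) -> str:
        # Resolve the included file relative to the including file.
        included = path.parent / match.group(1)
        return expand_includes(included, depth + 1)

    return INCLUDE.sub(replace, text)
```

Writing the returned string to an output directory for every page gives plain static files that any host can serve, with the boilerplate still kept in one place.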
1/15/2025 at 4:34:50 PM
See my reply about the object tag. Suffers from lack of press I guess.
by mixmastamyk
1/15/2025 at 2:12:07 AM
> Yes, you must copy and paste content
Manual work is almost never a good solution. Try this:
for PAGE in *.page
do
cat header.html "$PAGE" footer.html > "$PAGE.html"
done
by liontwist
1/15/2025 at 3:53:00 AM
A slightly simpler version of the same is:
for PAGE in *.page
do
cat header.html "$PAGE" footer.html > "$PAGE.html"
done
As noted in a peer comment, the cat[0] command supports concatenating multiple files to stdout in one invocation.
HTH
EDIT: if you don't want the output file to be named "a.page.html" and instead it to be "a.html", replace the above cat invocation with:
cat header.html "$PAGE" footer.html > "${PAGE%.page}.html"
This syntax assumes use of a POSIX/bash/zsh shell.
0 - https://man.freebsd.org/cgi/man.cgi?query=cat&apropos=0&sekt...
by AdieuToLogic
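For readers who prefer a script over a shell loop, here is a rough Python equivalent of the `cat` approach above. It assumes the same layout as the shell version (a `header.html`, a `footer.html`, and `*.page` files in one directory); the `build_pages` name and the separate output directory are illustrative additions.

```python
from pathlib import Path


def build_pages(src_dir: Path, out_dir: Path) -> list[Path]:
    """Wrap every *.page file in src_dir with header.html and footer.html."""
    header = (src_dir / "header.html").read_text(encoding="utf-8")
    footer = (src_dir / "footer.html").read_text(encoding="utf-8")
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for page in sorted(src_dir.glob("*.page")):
        # page.stem drops the ".page" suffix, like "${PAGE%.page}" in the shell.
        target = out_dir / f"{page.stem}.html"
        body = page.read_text(encoding="utf-8")
        target.write_text(header + body + footer, encoding="utf-8")
        written.append(target)
    return written
```

Calling `build_pages(Path("."), Path("public"))` would turn `a.page` into `public/a.html`, which keeps generated files apart from the sources.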
1/15/2025 at 2:20:09 AM
Why not use server side includes? Most web servers support it, and it dates back to one of the early features of webservers.
<!--# set var="pagetitle" value="Main page" -->
<!--# include file="00__header.html" -->
... main content here
<!--# include file="00__footer.html" -->
by adamzochowski
1/15/2025 at 2:20:51 AM
Because that requires a server with the proper config and this is an HTML file. So it works in every environment, like locally on your machine, or GitHub pages.
by liontwist
1/15/2025 at 3:32:35 AM
`cat` supports multiple files, no? The whole point is that it concatenates. Why use 3 commands?
by 8n4vidtmkvmk
1/15/2025 at 3:57:44 AM
Because I’m typing on my phone and the line was long. Thanks!
by liontwist
1/15/2025 at 3:54:07 AM
Oh man, cattiness use of cat!
by dgfitz
1/15/2025 at 3:59:03 PM
More cats are strictly cuter than less cats.
by TeMPOraL
1/15/2025 at 11:08:59 AM
Unfortunately, this doesn't adjust the <title> element.
by vaylian
1/17/2025 at 10:47:55 AM
envsubst
by liontwist
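The per-page title can be handled with the same kind of placeholder substitution that `envsubst` performs. As a rough illustration in Python, `string.Template` fills a `$title` placeholder in the shared header. The header markup, the `render_page` name, and the `$title` placeholder are assumptions for this sketch and not taken from anyone's actual site.

```python
from html import escape
from string import Template

# Shared boilerplate with a single placeholder for the page title.
HEADER = Template(
    "<!DOCTYPE html>\n<html>\n<head><title>$title</title></head>\n<body>\n"
)
FOOTER = "</body>\n</html>\n"


def render_page(title: str, body: str) -> str:
    """Return a full HTML page with the title substituted into the header."""
    # Escape the title so characters like "&" and "<" stay valid in HTML.
    return HEADER.substitute(title=escape(title)) + body + FOOTER
```

The body is inserted as-is because it is already HTML, while the title is treated as plain text and escaped.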
1/17/2025 at 8:35:39 AM
sed? :D
by johnisgood
1/15/2025 at 9:32:37 AM
This is my workflow for my site, too, just replacing MS Word with Obsidian since it syncs over all my devices, allowing me to write/edit my future content wherever I am, then upload later.
I tried things like bashblog for a while, but it has some quirks like sometimes placing posts out of order when building the index page. That and I have zero use for the built-in analytics options or things like Disqus comments, so it seemed like I was really only using about 30% of what it was meant to do.
Here's a link to that for anyone interested. It's quite tweakable.
by 0xEF
1/15/2025 at 11:30:01 AM
> you must copy and paste content
I've started the Slab templating language[0] to be able to define reusable HTML fragments. It means using a dedicated tool but hopefully not needing to resort to a real programming language.
by thu
1/15/2025 at 4:21:35 AM
How do you handle syntax highlighting?
by begueradj
1/15/2025 at 7:21:16 AM
use esbuild to get rid of copy-pasting
by ycombinatrix