overhauling my website


by arya dradjica

As of yesterday, my website looks a little different. I added a navigation bar to every page and polished the visuals. But that was the trivial bit — in the background, I completely overhauled the website infrastructure. This blog is now powered by Typst! If you manage your website like me, you might find my new approach interesting.

Some background: Typst

I’ve been using Typst for a year or two now, and I’m incredibly happy with it. For those who don’t know, it’s a typesetting system like LaTeX; except it’s so much better that I can’t use LaTeX anymore. In my opinion, every long-lived piece of software accumulates reasons to be rewritten; and Typst is a manifestation of that. I want to take a moment to gush about it.

First of all, it’s a great language. Most of the time, it feels like Markdown; it’s simple, concise, and easy to work with. But where Markdown stretches into HTML, Typst stretches into a real programming language. No LaTeX-style macro substitution here; you get functions, loops, and a real type system. It feels like bits of Rust (which is used by the implementation) have trickled into it, and at least for me, that makes it very accessible. I would be remiss not to mention the math mode, which is so much easier to use than that of LaTeX. I really encourage you to check it out.

I’ll also speak to the compiler. Typst saw LaTeX’s issues with compilation times and really took it to heart — the result is a full-blown compiler with a serious incremental computation engine at its heart. It can compile your documents from scratch in under a second, but in incremental mode, you can get live updates for every character you type. It’s an open-source Rust codebase, available on GitHub. There’s even a third-party LSP server for code completion and error checking in your IDE of choice.

My time at university really benefited from knowing about Typst. While it wasn’t always the right choice (I couldn’t always get my classmates to join in the fun), it saved me quite a few headaches. If you’re writing papers, or other serious documents, I would highly recommend checking out Typst; even if you’re not in a position to stop using LaTeX and friends.

The blog life

I quite enjoy writing, and the old-school Internet, and so I’ve hosted a few websites over the years. But it’s not easy. Some people make the smart, pragmatic decision to write content first and figure out the rest later; I am unfortunately not one of those people. I have strong opinions about every aspect of hosting a website: styling and accessibility, Javascript, browser compatibility, network overhead, and even the raw HTML being served. While styling can be an endless rabbit hole, the bigger issue for me is the infrastructure for building and serving the website.

I’ve tried out various static site generators (SSGs), and all of them have left me frustrated. Hugo didn’t give me enough control over the generated HTML and didn’t have enough documentation around templating (as of a few years ago). Most other generators come with the ick factor of Javascript, but I did still try some; I found Pug interesting and played with Astro. But in the end, nothing quite stuck.

My solution for the last year has just been to write HTML by hand. It wasn’t ideal, but I got the level of control I needed. If you’re using an SSG, I would suggest peeking at the generated HTML and understanding how much work the SSG is doing for you; it’s often relatively little! On the other hand, my website did suffer for it; the styling grew inconsistent (because the CSS was embedded in every page) and I didn’t have syntax highlighting. The inconvenience of building a new page probably stopped me from writing more.

At the start of 2024, I was looking to write about Base32/64 conversion, and my glassy crate. Unfortunately, I never got around to finishing it (it has joined my ever-growing heap of things I will come back to one day), but I had come up with some interesting SIMD algorithms. I wanted to write about them, and express things more visually. Markdown and hand-written HTML were definitely off the table, so I ended up turning to Typst. Somewhere in my backups is a draft of a Typst document with some neat visuals of Base32 encoding, showing how bits and bytes get shuffled around, and the vague steps of a SIMD algorithm for it. I gave up because I had moved on from the project and couldn’t figure out how to turn it into a nice HTML blog post. As it turned out, HTML export was on Typst’s development roadmap, but they were busy with other things.

It turns out we can have nice things

Since then, every time I touched Typst, I would check on the tracking issue for HTML export. At the start of this year, I was ecstatic to see that work had begun; and with last month’s release of Typst 0.14.0, it seemed good enough to try out. I’ve spent the last few weeks exploring it, determining that it is in fact good enough for web development, and rewriting my entire website.

(I should also note, in case you stop reading here, that the feature is still highly experimental, and I can’t predict how it’s going to work in the future. I am assuming that further development will only add new features, and I’m okay with redoing my website to account for breaking changes. You might not be.)

I want to applaud the design effort that is going into this feature, which you can see on GitHub. There’s a keen awareness of the importance of doing things right. While exporting Typst’s style information will be supported eventually, the current implementation focuses on exporting the right semantic HTML. Users can fill in the gaps by explicitly inserting HTML elements, which are exposed with well-typed interfaces. Large, hard-to-tackle issues, like exporting multiple HTML files from a single Typst document, are taken seriously and are accounted for in each iteration of the design. The developers are aware that this feature will interact with many parts of the language and are not papering over those issues.

You can actually see the Typst source for this page. The files it references, such as (at the time of writing) style.typ, are exposed too. Let me explain how it works.

Every page of the website has a dedicated directory, storing index.typ and any additional assets. index.typ is a standalone Typst document that can be compiled into a standalone HTML page. Using a global show rule, it wraps the body of the page in manually-defined HTML structure. In short:

#show: body => {
  html.html({
    html.head({
      // 'context' to access global details
      context html.title(document.title)

      // A well-typed interface for '<meta charset="utf-8">'.
      html.meta(charset: "utf-8")

      // Typst will read and embed the CSS from 'style.css'.
      html.style(read("style.css"))
    })

    html.body({
      // The document body gets pasted here.
      body
    })
  })
}

Hello World!

I can define the exact HTML structure Typst should output, but for regular content, it will take over and generate well-formed, semantic HTML. I can make use of Typst as a programming language, reading files and processing document metadata. Even in the regular content, I can dip into explicit HTML elements on demand, e.g. <time> and <abbr>. I get fast compilation, great LSP support, and the ability to use one language across the entire system. It’s perfect :D

It didn’t take too long to extract the content from my hand-written HTML files and stick it in Typst. I think the most frustrating part was rewriting <a> tags into Typst #link()s, but some regex-fu got the job done. Of course, there are some issues I had to work around (e.g. extraneous <p> tags), but getting Typst to build the right HTML structure was surprisingly easy.
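For the curious, the regex rewrite amounts to something like this Python sketch. The pattern is deliberately naive (my assumption of roughly what such a one-off conversion looks like, not the exact command I ran); it ignores extra attributes and nested tags, which is fine for simple hand-written HTML.

```python
import re

# Turn '<a href="URL">text</a>' into Typst's '#link("URL")[text]'.
# Naive on purpose: no extra attributes, no nested markup.
def links_to_typst(html: str) -> str:
    return re.sub(
        r'<a\s+href="([^"]*)">(.*?)</a>',
        r'#link("\1")[\2]',
        html,
    )

print(links_to_typst('See <a href="https://typst.app">Typst</a>.'))
# → See #link("https://typst.app")[Typst].
```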

Once I had translated all my content into Typst, I had to figure out a way to generate the HTML for every page, so I could publish the new site. This turned out to be another rabbit hole.

Bonus round: abusing Ninja

I have a few dozen pages on this website, and they all need individual Typst invocations. I need a build system to track dependencies and update things as necessary. One day, with multi-file HTML output, that could just be the Typst compiler; but I managed to build something fun for now.

First, I had to pick a build system. I was pretty familiar with GNU Make already, but it obviously comes with a lot of historical baggage. Tup is a really cool build system, but it lacks features and I don’t think it’s been updated in a while. While Ninja is intended to be used by a larger build system, e.g. CMake or Meson, it turned out to have just enough features to be useful on its own.

Step one: every page in my website has an index.typ file which needs to be compiled into index.html. That Typst file has several dependencies, such as the global style.typ that provides the HTML structure, and the global style.css providing CSS to embed. The first important step was to make the build system aware of those dependencies so that it would rebuild the right things.

This turned out to be quite easy; a similar problem exists in C/C++ with header files. As such, C compilers can produce a dependency file in Makefile syntax, listing all the inputs they used. You can try this out with gcc -M. Make and other tools have evolved to consume these files; Ninja has the depfile option. Typst offers exactly the same functionality with --deps-format make.
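A Makefile-style dependency file is tiny: one rule with no recipe, naming the output and every input it was read from. A sketch of what one for a page might contain (paths are illustrative):

```make
# The page, and everything the compiler read while building it.
blog/overhaul/index.html: blog/overhaul/index.typ style.typ style.css
```

Pointing Ninja's depfile option at a file like this is all it takes for an edit to style.typ or style.css to trigger a rebuild of the page.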

Next, I had to enumerate all my pages for the build system. This was definitely the hardest part with Ninja: it explicitly disallows file globs for this kind of purpose, because they are inefficient to track. My other option was to dynamically generate a Ninja file containing the list of pages and include it. Ninja only supports dynamic regeneration of the top-level build.ninja file itself, which is a bit frustrating, but I made it work. I just had to find some way to inform Ninja when to rebuild that file…
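A configure step along these lines can do the enumeration; this is a sketch under my own assumptions (the rule body, flags, and paths are placeholders, not the actual script), but it shows the shape of the generated file:

```python
from pathlib import Path

# Emit a Ninja build file with one build statement per page directory.
# The 'typst' rule body is a placeholder; real flags depend on your setup.
def generate_ninja(root: Path) -> str:
    lines = [
        "rule typst",
        "  command = typst compile --format html $in $out",
        "  depfile = $out.d",
        "",
    ]
    # Every page is a directory containing an index.typ.
    for page in sorted(root.rglob("index.typ")):
        out = page.with_suffix(".html")
        lines.append(f"build {out}: typst {page}")
    return "\n".join(lines) + "\n"
```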

I realized that another piece of software is already doing the same work of tracking existing files, and that’s my VCS. I’m using Jujutsu, a very cool system that is compatible with Git but offers several niceties. I peeked at the .jj folder in my repository and figured that the file .jj/working_copy/checkout would change every time the working copy changed, so I pointed Ninja at it.

I also needed a way to find auxiliary assets for every page. These assets would be linked to by the HTML of the page, so I went a bit overboard and wrote a Python script powered by html.parser that would scan for <a> and <link> tags, extract referenced local assets, and generate another dependency file for those assets. Perhaps you could just query Jujutsu for source files in the matching directory; but eh, this was more fun. This ensures that unlinked files don’t get published by accident.
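A minimal sketch of that scanning idea, using html.parser as described (the class and its exact behavior are my reconstruction; the real script also emits a Ninja-compatible dependency file):

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Collect local assets referenced by <a href> and <link href> tags,
# skipping absolute URLs that point off-site.
class AssetScanner(HTMLParser):
    def __init__(self):
        super().__init__()
        self.assets = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        href = dict(attrs).get("href")
        # Keep only relative references: no scheme, no leading slash.
        if href and not urlparse(href).scheme and not href.startswith("/"):
            self.assets.append(href)

scanner = AssetScanner()
scanner.feed('<link href="style.css"><a href="https://typst.app">Typst</a>'
             '<a href="notes.pdf">notes</a>')
print(scanner.assets)  # → ['style.css', 'notes.pdf']
```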

In the end, here’s my overall setup:
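Putting the pieces together, the top-level build.ninja looks roughly like this. This is a reconstruction from the description above, not my literal file: the configure script's name, the Typst flags, and the example page path are all placeholders.

```ninja
# Regenerate build.ninja whenever the Jujutsu working copy changes.
# 'generator = 1' marks this rule as producing the build file itself.
rule configure
  command = python3 configure.py
  generator = 1

build build.ninja: configure | .jj/working_copy/checkout

# Compile one page to HTML, recording its inputs in a Makefile-style
# depfile so Ninja knows to rebuild when style.typ or style.css change.
rule typst
  command = typst compile --format html $in $out
  depfile = $out.d

# One build statement per page, emitted by configure.py.
build blog/overhaul/index.html: typst blog/overhaul/index.typ
```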

Now, ./do sets up Ninja and builds all my HTML. I can configure my Typst LSP to automatically build a particular page I’m editing (and it can use Typst’s incremental engine for that); but I haven’t explored how this interacts with Ninja’s update tracking. I typically make some changes, commit them in Jujutsu, and then run ./do publish.

If you like the idea of building your website in Typst, this might be a good setup. At least, I hope you’re now aware of implicit dependency tracking. In theory, I could share my entire setup; but it’s more complicated than what I explained here, and I’d have to strip out information about my publishing routine.

Wrapping up

I’m really happy with this setup! While the build system is definitely hacky, my main focus is on maintaining the site content — and in that respect, this has made my life so much better. Typst is just really good, and I’m glad I can use it for the bulk of my writing now. I’m really excited to see how Typst’s HTML export evolves further, but more importantly, I’m excited to write more.

P.S: What have I been up to?

My last post was about takeaway, my Rust library for work-stealing task queues. I didn’t notice the time slip by, but it’s been three months! I have some explaining to do.

The primary focus of my free time continues to be Krabby, in one way or another. The next step towards name resolution seemed to be writing a parser, so I started looking into that; but it quickly became way too complicated. Instead, I’m opting for a weird middle ground: I interleave parsing with global name resolution, so that name resolution can start executing before parsing finishes.

From the perspective of optimization, the hard part of name resolution is that source files can depend on each other. But it’s important to note that only the top-level items in every source file can be referenced. Instead of parsing everything in a source file before starting name resolution, I’m going to find the top-level items in each file from their token streams. This allows me to defer parsing until later, and it should be more resilient to parsing errors.

In order to identify items, like fn foo() {} and mod bar;, I need to search for keywords. This opened up another can of worms: I’m not interning identifiers properly right now, so searching for matching identifiers is tricky. I also need to fix this in order to compare identifiers between source files. So I’ve been working on identifier interning for the last two months.

Progress has been pretty good. My goal was to design a fast multi-threaded identifier interner, which I estimate could be 5-10x faster than the existing symbol_table crate; along the way, I accidentally made a really fast single-threaded interner too. According to my benchmarks, it’s 1.5x faster than string-interner, which I didn’t think would be possible. Anyways, thread-safe reallocation and rehashing are incredibly difficult, and it’s just taking me a while to wrap my head around things. You can track my progress on Codeberg.

Once I have a good multi-threaded interner, I can implement top-level item parsing and really tackle global name resolution. I have a fairly concrete plan for dealing with all its intricacies, with glob imports and ordering; but that plan hasn’t collided with the real world yet. I’m excited to see what happens.

There are plenty of non-critical things I want to set up for Krabby too. I have a draft of a lexer that’s twice as fast, using scalar SIMD techniques. I also want to implement fuzz testing to compare various lexer implementations. And the Krabby CLI could do with some love. If something like this sounds fun, maybe send me an e-mail?

P.P.S: Hosting and security

I also wanted to mention security in the context of website hosting. I don’t own the servers hosting my website; I am currently using Uberspace. While self-hosting could be fun, I haven’t evaluated the implications of making my IP address public. I’m not sure it’s a great idea. But that means I have to trust someone else to serve my website for me.

I’m terribly aware of the stereotype, but I’ve made my build system sign each page of my website before publishing it. It does so by collecting all the assets for a page (including index.typ and index.html) into an index.tar.xz tarball, which is then signed, and the detached signature is written to index.tar.xz.sig. You can fetch these for the current page, or any other page on this website. Each page receives an individual signature so that you don’t need to download the entire website to verify anything … and so that I can maintain hidden pages. Signatures are made with a dedicated website signing key, which is signed by my main key.