Setting up a blog, the tradeoffs
This post complements my welcome post and covers the more technical decisions I made when setting up this blog. When I started considering the idea, several options presented themselves:
- The first, obvious option was to simply use LiveJournal or any other online blogging service.
- The second option was to install some well-known blogging software, like WordPress, on the server.
- The third option was to actually make my own software.
The second option was very tempting, but it has one big drawback: attackers usually know the vulnerabilities in such software, which often makes it an easy target, sometimes even when it is kept up to date. Attacks against websites are very popular nowadays, and setting up web software that is used by a lot of other people automatically puts one more at risk. This is not to say I don't use ready-made web software at all, but I try to limit it where possible. Also, I've heard that WordPress is a rather massive piece of software and that updating it can be a headache. I don't really know of other blogging software and I haven't done any research.
The first option was interesting in many respects, mostly because I already have an LJ account that I use purely for reading. I could just post there and let them handle all the technical stuff. The problem, of course, is having to trust them with it. If they went down, I would probably lose my audience and all the comments, unless I were religious about backing it all up. Also, I consider my writings my own and I'm not too happy about handing them over to a third party.
The third option was obviously the most challenging and although it would put me fully in control, it represented quite an undertaking. Even so, it was worth considering, if I could devise a way to keep it simple.
All things considered, a blog consists mostly of static content, and the only thing that can be troublesome to handle is the comments. In addition to the content issue I brought up earlier, comments are the only potential security threat (because they are the only user input) and they need to be stored somehow.
So, my first idea was to just screw the comments and be happy with static content only. With that in mind, the third option was the logical choice.
Of course, purely static content is a bit impractical. It needs at least a little scripting to separate the posts and the comments from the page layout. Posts can just be stored as simple HTML files, with inline CSS if needed, and the page itself can consist mostly of HTML with some PHP to include the right post.
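To make the idea concrete, here is a minimal sketch of that lookup-and-include step. It is written in Python purely for illustration (the blog itself uses PHP), and the file naming scheme, the `{{post}}` placeholder and the function names are all assumptions of mine, not the actual implementation.

```python
import glob
import os


def find_post_file(posts_dir, number):
    """Return the path of the file holding post `number`, or None if absent."""
    # Hypothetical naming scheme: "<number>-<anything>.html", e.g. "3-tradeoffs.html".
    # int() ensures that nothing but a number ever ends up in the pattern.
    pattern = os.path.join(glob.escape(posts_dir), f"{int(number)}-*.html")
    matches = sorted(glob.glob(pattern))
    return matches[0] if matches else None


def render_page(layout, posts_dir, number):
    """Insert the stored post into the page layout at the {{post}} placeholder."""
    path = find_post_file(posts_dir, number)
    if path is None:
        body = "<p>No such post.</p>"
    else:
        with open(path, encoding="utf-8") as f:
            body = f.read()
    return layout.replace("{{post}}", body)
```

The post file is sent as-is, so the layout and the content stay completely separate; the only "logic" is the wildcard search for the right file.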
The navigation links were easy to set up with a little more code. Unlike most blogs, I decided to keep the "next" link always active, even when the post loaded is in fact the latest one. This way, checking whether there is a new post and navigating to it can be done in one operation (as opposed to reloading, then clicking next). The "previous" link is also always active, for symmetry.
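One way to get this behaviour, sketched in Python for illustration, is to always emit both links and clamp the requested post number when the page is loaded. The `post` query parameter and both function names are hypothetical, and whether the real code clamps or handles out-of-range numbers differently is an assumption on my part.

```python
def resolve_post_number(requested, latest):
    """Clamp a requested post number into the valid range 1..latest."""
    return max(1, min(requested, latest))


def nav_links(current):
    """Build the navigation links; both are always present."""
    # No check against the first or latest post here: an out-of-range
    # target is simply clamped when that page is requested.
    return {
        "previous": f"?post={current - 1}",
        "next": f"?post={current + 1}",
    }
```

With five posts published, following "next" from post 5 requests post 6, which resolves back to 5. Once a sixth post exists, the very same link shows it.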
After some more thinking, I still felt like it would be nice to let people at least send me their thoughts on my writings. The natural way to let them do this is by letting them reply by email. Just writing my email address in the clear and letting all the spam bots in the world know it was of course out of the question, but setting up a mailing form is pretty easy. The most important things to validate with such a form are that the data sent is actually valid UTF-8 and that the length of every field is reasonable. The content of course has to be sent explicitly as plain text.
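The two checks are short enough to sketch. The following Python is an illustration only: the size limits, the field names, the fixed subject line and the function names are assumptions, not what the blog's form actually does.

```python
from email.message import EmailMessage

# Hypothetical limits; pick whatever suits the form.
MAX_NAME_BYTES = 100
MAX_BODY_BYTES = 10_000


def decode_field(raw, max_bytes):
    """Return the field as text, or None if it is too long or not valid UTF-8."""
    if len(raw) > max_bytes:
        return None
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


def build_comment_mail(raw_name, raw_body, recipient):
    """Validate the submitted bytes and build a plain-text message, or return None."""
    name = decode_field(raw_name, MAX_NAME_BYTES)
    body = decode_field(raw_body, MAX_BODY_BYTES)
    if name is None or body is None:
        return None
    msg = EmailMessage()
    msg["To"] = recipient
    msg["Subject"] = "Blog comment"  # fixed text: no user input goes into headers
    # set_content() with a str produces a text/plain part.
    msg.set_content(f"Name: {name}\n\n{body}")
    return msg
```

Keeping all user input in the body, and out of the headers, is an extra precaution I added to the sketch; it avoids having to worry about line breaks smuggled into header fields.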
So, being able to receive comments is nice, but I thought it would be good to be able to repost the good ones. Doing this manually would quickly become a chore if their number grew. The only thing missing was a script to add the comments to the page. Adding to each post a second HTML file containing only its comments, and having a script append to it from the email, once again seemed simple enough. I would just add to each email a link that calls said script, so I could add the comment after reading it. Except that, of course, the web server doesn't have write access to those files. After a bit of searching, I decided to have the script launched by the link show a password prompt (over HTTPS), then connect to localhost over SSH with my credentials and launch another script to do the work.
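The script that does the actual work at the end of that chain can be very small. Here is a rough Python sketch of the append step only (the password prompt and the SSH hop are left out). The markup, the function names and the file path are hypothetical; the one point I would insist on is escaping the comment text before it lands in an HTML file.

```python
import html


def format_comment(author, text):
    """Turn a plain-text comment into an HTML fragment with all markup escaped."""
    safe_author = html.escape(author)
    # Escape first, then turn line breaks into <br> so paragraphs survive.
    safe_text = html.escape(text).replace("\n", "<br>\n")
    return (
        '<div class="comment">\n'
        f"<p><b>{safe_author}</b></p>\n"
        f"<p>{safe_text}</p>\n"
        "</div>\n"
    )


def append_comment(comments_path, author, text):
    """Append one formatted comment to the post's comments file."""
    # Append mode adds to the end and creates the file if it does not exist yet.
    with open(comments_path, "a", encoding="utf-8") as f:
        f.write(format_comment(author, text))
```

Since the comments file is only ever appended to, the page script can keep sending it as output unchanged.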
This last part turned out to be more complex than I had hoped for, but I did gain a few good things with this design, compared to doing it the traditional way and using a database:
- No need to make any additional web page for administering the comments and the posts. Posts can be written directly in place with a simple text editor. The only things that need to be done manually are maintaining a list that maps each post number to its date and giving the files the right names so they can be found. That's acceptable.
- Pretty good performance. The only expensive operations done when loading a post are finding the file containing the post itself and the one containing its comments, which is done with a wildcard search in a directory. This is a very common file system operation, so I expect it to run pretty fast. The content of those files is sent directly as output, which should be fairly fast as well. The only thing that may be slow is reposting a comment, but that doesn't matter.
- Easy to back up. Since everything is file-based, the only thing needed is to copy the directory containing the posts and the comments. For the comments not reposted, a backup of the emails is also needed.
- Easy to install. Again, it's just a question of putting the files in the right place and pointing the web server to them. For now some paths need to be edited by hand, but that can be cleaned up.
All in all, I'm pretty happy with how it turned out. I have left some implementation and configuration details out of my explanation, but they shouldn't matter very much. If anyone is interested in the source, I may release it as open source if I manage to decide which license to use.