Tabs versus Spaces:
An Eternal Holy War.

© 2000 Jamie Zawinski <jwz@jwz.org>


The last time the tabs-versus-spaces argument flared up in my presence, I wrote this. Gasoline for the fire? Maybe.

I think a big part of these interminable arguments about tabs is based on people using the same words to mean different things.

In what follows, I'm trying to avoid espousing my personal religion; I just thought it would be good to try to explain the various sects.

Anyway. People care (vehemently) about a few different things:

  1. When reading code, and when they're done writing new code, they care about how many screen columns the code tends to be indented by when a new scope (or sexpr, or whatever) opens.

  2. When there is some random file on disk that contains ASCII byte #9, the TAB character, they care about how their software reacts to that byte, display-wise.

  3. When writing code, they care about what happens when they press the TAB key on their keyboard.

Note that I make a distinction between the TAB character (which is a byte which can occur in a disk file) and the TAB key (which is that plastic bump on your keyboard which, when hit, causes your computer to do something).

As to point #1:

As to point #2, the tab character: there is a lot of history here.

As to point #3, the tab key: this is an editor user interface issue.

  1. Some editors (like vi) treat TAB as being exactly like X, Y, and Z: when you type it, it gets inserted into the file, end of story. (It then gets displayed on the screen according to point #2.)

    With editors like this, the interpretation of point #2 is what really matters: since TAB is just a self-inserting character, the way that one changes the semantics of hitting the TAB key on the keyboard is by changing the semantics of the display of the TAB character.

  2. Some editors (like Emacs) treat TAB as being a command which means ``indent this line.'' And by indent, it means, ``cause the first non-whitespace character on this line to occur at column N.''

    To editors like this, it doesn't matter much what kind of interpretation is assigned to point #2: the TAB character in a file could be interpreted as being mod-2 columns, mod-4 columns, or mod-8 columns. The only thing that matters is that the editor knows which interpretation of the TAB character is being used, so that it can properly put the file's characters on the screen. The decision of how many columns an expression should be indented by (point #1) and the decision of how those columns should be encoded in the file using the TAB character (point #2) are completely orthogonal.
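
To make the mod-N terminology concrete: under a mod-N interpretation, a TAB character moves the display to the next column that is a multiple of N. A minimal Emacs Lisp sketch of that arithmetic follows; the function name my-next-tab-stop is made up for illustration and is not part of Emacs.

```elisp
;; Hypothetical helper, for illustration only.
;; Return the column reached when a TAB character is displayed at
;; COLUMN (counted from 0) and tab stops sit at every multiple of WIDTH.
(defun my-next-tab-stop (column width)
  (* width (1+ (/ column width))))

(my-next-tab-stop 0 8)   ; => 8
(my-next-tab-stop 3 8)   ; => 8
(my-next-tab-stop 8 8)   ; => 16
(my-next-tab-stop 3 4)   ; => 4
```

The same byte therefore lands the following text in a different column depending on N, which is why two people reading the same file need to agree on the interpretation.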

So, the real religious war here is point #1.

Points #2 and #3 are technical issues about interoperability.

My opinion is that the best way to solve the technical issues is to mandate that the ASCII #9 TAB character never appear in disk files: program your editor to expand TABs to an appropriate number of spaces before writing the lines to disk. That simplifies matters greatly, by separating the technical issues of #2 and #3 from the religious issue of #1.

As a data point, my personal setup is the same as the default Emacs configuration: the TAB character is interpreted as mod-8 indentation; but my code is indented by mod-2.

I prefer this setup, but I don't care deeply about it.

I just care that two people editing the same file use the same interpretations, and that it's possible to look at a file and know what interpretation of the TAB character was used, because otherwise it's just impossible to read.

In Emacs, to set the mod-N indentation used when you hit the TAB key, do this:
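
For C-like modes this is presumably the c-basic-offset variable, which the vim notes later in this document name as the Emacs equivalent. A minimal sketch for a .emacs file, with 2 chosen only as an example value:

```elisp
;; Indent each nested level by 2 columns in CC Mode (C, C++, Java, ...).
(setq c-basic-offset 2)
```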

To cause the TAB file-character to be interpreted as mod-N indentation, do this:
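
The variable involved is presumably tab-width. Because tab-width becomes buffer-local when set, this sketch uses setq-default so that the value applies to every buffer; 8 is only an example:

```elisp
;; Display each TAB character as advancing to the next multiple of 8 columns.
(setq-default tab-width 8)
```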

To cause TAB characters to not be used in the file for compression, and for only spaces to be used, do this:
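
Assuming the variable meant here is indent-tabs-mode (the name the vim notes below give for this setting), a sketch would be:

```elisp
;; Make indentation commands insert only spaces, never TAB characters.
(setq-default indent-tabs-mode nil)
```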

You can also do this stuff on a per-file basis. The very first line of a file can contain a comment which contains variable settings. For the XP code in the client, you'll see many files that begin with
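
A representative first line for a C file might look like this; the particular values are an assumption for illustration, not a quotation from any specific file:

```text
/* -*- Mode: C; tab-width: 4 -*- */
```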

The stuff between -*-, on the very first line of the file, is interpreted as a list of file-local variable/value pairs. A hairier example:
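
One plausible form, combining the three variables discussed above in a single C comment (again, the values are illustrative assumptions):

```text
/* -*- Mode: C; tab-width: 4; indent-tabs-mode: nil; c-basic-offset: 2 -*- */
```

Read this way, the file says that TAB characters in it are to be displayed as mod-4, that new indentation should be written with spaces only, and that each nesting level is indented by 2 columns.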

If you have different groups of people with different customs, these kinds of explicit settings are really handy.

I believe vi has a mechanism for doing this sort of thing too, but I don't know how it works.

To keep myself honest (that is, to ensure that no tabs ever end up in source files that I am editing) I also do this in my .emacs file:

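  ;; Delete trailing whitespace, then expand any remaining tabs to spaces.
  (defun java-mode-untabify ()
    (save-excursion
      (goto-char (point-min))
      (while (re-search-forward "[ \t]+$" nil t)
        (delete-region (match-beginning 0) (match-end 0)))
      (goto-char (point-min))
      (if (search-forward "\t" nil t)
          (untabify (1- (point)) (point-max))))
    ;; Return nil so that Emacs goes on to write the file as usual.
    nil)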

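  ;; Run the cleanup whenever a java-mode buffer is saved.
  ;; (Emacs 22 and later call this variable write-contents-functions.)
  (add-hook 'java-mode-hook
            (lambda ()
              (make-local-variable 'write-contents-hooks)
              (add-hook 'write-contents-hooks 'java-mode-untabify)))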

That ensures that, even if I happened to insert a literal tab in the file by hand (or if someone else did when editing this file earlier), those tabs get expanded to spaces when I save. This assumes that you never use tabs in places where they are actually significant, like in string or character constants, but I never do that: when it matters that it is a tab, I always use '\t' instead.

Here are some details on vi, courtesy of Woody Thrower:

Standard vi interprets the tab key literally, but there are popular vi-derived alternatives that are smarter, like vim. To get vim to interpret tab as an ``indent'' command instead of an insert-a-tab command, do this:
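
Assuming the option meant here is vim's softtabstop, which makes the tab key move in steps of the given width instead of always inserting one literal TAB, a sketch for a .vimrc (2 is an example value):

```vim
" Make the tab key advance in 2-column steps.
set softtabstop=2
```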

To set the mod-N indentation used when you hit the tab key in vim (what Emacs calls c-basic-offset), do this:
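
The corresponding vim option is presumably shiftwidth, the number of columns used for each step of indentation; the value is an example:

```vim
" Indent by 2 columns per level.
set shiftwidth=2
```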

To cause the TAB file-character to be displayed as mod-N in vi and vim (what Emacs calls tab-width), do this:
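
This is presumably the tabstop option; a sketch with 8 as the example width:

```vim
" Display each TAB character as advancing to the next multiple of 8 columns.
set tabstop=8
```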

To cause TAB characters to not be used in the file for compression, and for only spaces to be used (what Emacs calls indent-tabs-mode), do this:
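
In vim this is most likely the expandtab option:

```vim
" Insert spaces instead of TAB characters.
set expandtab
```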

In vi (and vim), you can do this stuff on a per-file basis using ``modelines,'' magic comments at the top of the file, similarly to how it works in Emacs:
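
A sketch of a vim modeline for a C file, using the options named above; the values are illustrative, and modeline handling can differ between vi implementations:

```text
/* vim: set tabstop=8 shiftwidth=2 expandtab: */
```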

So go forth and untabify!

