(Text Processing) Paradigms Lost

Tom Scocca has written a brilliant essay in Slate today on the absurdities of Microsoft Word being the standard text processing tool in the age of digital publishing. I struggle to get students doing statistical and demographic analysis in R not to use Word because of all the unwanted junk it brings to the most trivial text-processing task. Using the word2cleanhtml website, Scocca shows how a two-word text chunk written in Word contains the equivalent of eight pages of unnecessary hidden text!

I encounter all the nonsense associated with the default “annoying typographical flourishes” that Scocca discusses in my role as associate editor of a couple of journals and a regular reviewer for NSF. Both of these roles make extensive use of web-based platforms for managing workflows associated with writing-intensive tasks (ScholarOne for editing and Fastlane for NSF) and both choke on the typographical annoyances Scocca enumerates (“smart” quotes, automatic em-dashes, etc.). When you do an NSF panel, you receive a briefing explaining that if you are going to write your panel summaries in Word, you need to turn off smart quotes and avoid other things that will lead to nonsense in the plain-text formatted fields of Fastlane. Of course, no one does this.
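
For anyone who forgets to turn smart quotes off, the cleanup can be done after the fact. Here is a minimal Python sketch of the idea. It assumes the text has already been pasted out of Word as Unicode, and the function name `to_plain_text` and the character mapping are my own illustrative choices, not anything Fastlane or ScholarOne provides. The mapping covers only the usual offenders and is certainly not complete.

```python
# Hypothetical helper: swap Word's "smart" punctuation for plain ASCII.
# The mapping is an illustrative subset, not an exhaustive list.
SMART_TO_PLAIN = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark (also used as apostrophe)
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # horizontal ellipsis
}

# str.maketrans accepts a dict of single characters to replacement strings.
_TABLE = str.maketrans(SMART_TO_PLAIN)


def to_plain_text(text: str) -> str:
    """Return text with the mapped smart characters replaced by ASCII."""
    return text.translate(_TABLE)
```

Running `to_plain_text` on a panel summary before pasting it into a plain-text field should leave ordinary text untouched and replace only the characters listed in the mapping.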

Don’t get me started on track changes…

I do the great majority of my own writing in a plain-text processor. My personal favorite is Aquamacs, a Mac-native variation on GNU Emacs. Emacs is definitely not for everyone. Scocca writes that he has turned to TextEdit, which is another Mac-native application, but there are plenty of other options that run on different systems. Here is a list of possibilities.

It will be interesting to see how online collaborative tools such as Google Docs change the way people do text processing. I find that more of my students do their work in Google Docs. It’s certainly not a majority yet, but the fraction is growing rapidly each year. As Scocca notes, Google Docs provides a much saner alternative to track changes, among other things.

Microsoft clearly needs to get serious and do a bit of innovation here if they want to stay in this particular game. I, for one, will not miss MS Word if it should go the way of WordStar.

2 thoughts on “(Text Processing) Paradigms Lost”

  1. Wow, Jamie, I find it amazing that there've been no comments preceding mine, in three months ...

    This struck a chord. I DETEST MS Word, but somehow must keep returning to it (pure coercion!) when my favorite little discoveries, "independent" little (shareware) word processing applications, go extinct. Plus of course the professional/publishing dictates which you discuss. And remember WriteNow? I liked that one; Terry Deacon introduced me to it.

    Problem is, I've invested — and ultimately wasted — SO much otherwise productive time over the years falling in love with applications of all sorts, i.e., alternatives to the "big guys," that aren't sustainable. I place part of the blame for this on a friend who turned me on to VersionTracker —> CNET ...

  2. I don't get a lot of comments generally, unless there is R code involved.

    Pages is a pretty slick inheritor of the WriteNow legacy and I don't think it's going away any time soon. I use it when I absolutely have to do something word-processory like write a letter of recommendation. I typically open most Word files with Pages too. It now works with EndNote and MathType as well. What would really take it to the next level for me is if they had a filter that could read in a LaTeX document and convert it to .pages format, taking all the typeset material and passing it through MathType the way that tex2word does.
