Robin
16 February 2008 @ 05:58 pm
Today, I decided to learn XSLT  
I have to shred XML movie data from IMDb into a relational structure, for a project at work. I whipped up something using Perl's XML::Simple, because it's a simple problem, but then I figured it would be nicer if I could use standards to translate from the XML to the insert statements, especially if I could use a stream-based parser to keep memory requirements lower...as you might imagine, IMDb has a lot of movie data. So, I decided to look into XSLT, which I hear is the de facto XML transformation standard, and really awesome, if you can wrap your head around it.
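
For the curious, the stream-based "shredding" idea might look something like the sketch below. It is written in Python rather than Perl or XSLT, and the `<movie>` layout, the `movie` table, and its columns are all made up for illustration; the real IMDb data almost certainly looks different. It uses `xml.etree.ElementTree.iterparse`, which hands over each element as soon as its end tag has been read, so the whole document never has to sit in memory at once.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data; the element and attribute names are assumptions.
SAMPLE = b"""<movies>
  <movie id="1"><title>Brazil</title><year>1985</year></movie>
  <movie id="2"><title>It's a Wonderful Life</title><year>1946</year></movie>
</movies>"""


def sql_quote(text):
    """Quote a string as a SQL literal by doubling single quotes."""
    return "'" + text.replace("'", "''") + "'"


def movie_inserts(source):
    """Yield one INSERT statement per <movie> element as it finishes parsing."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "movie":
            continue
        yield "INSERT INTO movie (id, title, year) VALUES ({}, {}, {});".format(
            int(elem.get("id")),
            sql_quote(elem.findtext("title", default="")),
            int(elem.findtext("year")),
        )
        # Release the finished element's children and text; an empty shell
        # stays attached to the root, which is cheap compared to the data.
        elem.clear()


statements = list(movie_inserts(io.BytesIO(SAMPLE)))
for statement in statements:
    print(statement)
```

Building SQL text by hand is only reasonable for a one-off bulk load like this one; anything talking to a live database would be better off with parameterized queries.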

Having been told, by a number of people, that it's actually a fairly difficult idea to wrap your head around, I set aside a large chunk of time to go and learn it. (I'm using the rest of that time to write this rant.) It took me 20 minutes to realize it was just a gimped version of LISP macros, and I'm embarrassed it took me that long. There's nothing conceptually innovative there; it's just a case of looking up the syntax when you need it.

I hate XML. For years and years, I saw the hype, and everyone was "learning" XML. I saw XML listed under "programming" sections in bookstores, and even on resumes. It's just a file format, people! Ever look at HTML? Now imagine that you can specify anything you want for element names between the angle brackets. Throw in a few optional headers at the top, and you've got XML. Want to specify which element names are allowed, inside of which other elements? Make a DTD, describing what elements can contain what other elements. This is not rocket science, and it's not even innovative -- LISP had the same type of hierarchical data structures, complete with a similar syntax, in 1959.

My main beef with it, I think, is that it's so godawful hard to read. Why oh why did anyone think that <name>content</name> was a good set of delimiters? Wouldn't it be clearer -- and more consistent with the underlying structure -- to use simple parentheses, like (name (content)), or even (name content)? That would be much easier to read. Less redundant. Oh noes, we have to count parentheses, instead of searching for a specific end tag! Err...except for the times when we have to count the tags too, because they're nested. Okay. It's a shame there's no hierarchical data syntax that uses that. Oh wait. Never mind. LISP data syntax. In 1959.

And now, there's XSLT. Well, since 1999 or so. We can embed control flow into our data! Now that control flow is in the same syntax as our data, imagine the possibilities for templating: we can intersperse data and code! Surely, this is innovative. Oh, wait. No. LISP made that innovative leap in 1959, with its partial-execution macro system. (Granted, in this instance, XSL may be easier to read than the LISP macro syntax.)

I admit that it's easier to specify a tree structure with an XML DTD than it is in LISP, or actually anything else I can think of. You can do it, though. Since 1959. Because data and code are the exact same thing in LISP (wow, what an innovation!), you can just "evaluate" the data as code: If it parses, it's legit.

I'm mentioning LISP a lot because it was first. All of these things have been around, and exist in a number of other languages. Perl, Ruby, Python, ML and lots of other languages have hierarchical data syntax. SAX parsing? Every compiler known to mankind uses a similar technology, since nearly the dawn of compilers. XPath? You have to index the heck out of your XML to make that fast, then you use -- surprise! -- relational databases to do it. The worst of both worlds: Hard to parse by humans, and hard to parse by machines!

XPath, XSLT, SAX ... they're all just libraries implemented for manipulating an arbitrarily decided "standard" syntax. There are better tools for getting each of those jobs done. It's (now) universally supported, so I suppose I'm stuck with it. That's really the only reason to use it, in my opinion. It just happens to be a very compelling reason.

So, yeah. XML is another stupid file format, amid a plethora of equally useful formats. The only thing making it special is organizational backing. Go ahead and use it, but stop thinking it's innately special somehow. Please? It's getting really old.
 
 
Robin
05 July 2006 @ 04:50 pm
Advice on programming advice  
So, I've been programming for a very long time, 19 years or so. As you might expect, people ask me for advice on learning to program.

Problem is, I learned to program in 1987. My first language was BASIC, my second language was FORTRAN, and my third language (in 1993) was C. I didn't really bother learning any sort of object-oriented languages until college, which for me started in 1996. I taught myself Perl and Java on a lark in 1999. I learned Smalltalk, LISP, ML, and a variety of other languages for classes in college. Somewhere along the line I decided to teach myself Ruby and Python, and I'm looking at O'Caml and Haskell now because they look really interesting, and significantly different than what I've seen before.

At this point, I figure I have a good grasp of how to design and architect a wide variety of programming tasks, and I can certainly explain my design decisions to people who are curious. What I can't explain, though, is how I reached those decisions. Why did I start thinking in that direction? Honestly, it's usually because I've seen something like that before, written by someone else, and it looked like a good fit.

Really, there's nothing special about programming. The programming language is really just a small part of programming. I don't use the first two languages I learned: They're obsolete, now. The crux of programming is really the logic aspect of it. If you want to be a good programmer, practice logic puzzles. That's really what programming is: Figure out the solution to the logic puzzle, and then describe it to the computer in a language the computer "understands".

When asked a programming question, in an interview for example, I figure out the solution before I select a language. Sometimes the solution is easiest for me in Perl, sometimes it's easiest for me in C, and sometimes it's easiest for me in LISP. I assume things will start to be easy for me in Ruby sometime soon.

Still, people ask me, "What language should I learn first?" Really, I have no idea. I could say to learn C, but that's probably not such good advice for a first language anymore. I've actually been saying "Ruby" these days, because it's simple to learn, I can point to some nice tutorials for it, and it's really close to the "solving the logic problem" aspect of programming. I wonder, though, if I would be nearly as good a programmer if I had learned Ruby before C -- learning C has forced me to learn how a computer actually works, and has taught me how to write very efficient code.

Anyway, what I'm trying to say is that I'm a decently experienced programmer who has no idea where he'd start learning to program today. Does anyone have any advice for me to pass on?
 
 
Robin
12 November 2005 @ 08:39 pm
an atypical entry  
I've found that I have a hard time thinking in Microsoft. I can think in C, C++, Smalltalk, Java, Bourne, BASIC, Perl, LISP, and any kind of scripting you could care to name. But, hand me C#, VB, J, or any of the other languages Microsoft has designed in the last ten years, and I just can't hack it. Well, I can, but it just seems ... the long way around. Don't get me wrong, this isn't a "Microsoft is Teh Evil" thing -- the things they do well, they do quite well, and deserve credit for. I just don't get it when it comes to the .NET platform.

 
 
Current Mood: tired
 
 
Robin
08 August 2004 @ 12:38 pm
If Richard Feynman applied for a job at Microsoft  
Interviewer: Now comes the part of the interview where we ask a question to test your creative thinking ability. Don't think too hard about it, just apply everyday common sense, and describe your reasoning process.

Here's the question: Why are manhole covers round?

Feynman: They're not. Some manhole covers are square. It's true that there are SOME round ones, but I've seen square ones, and rectangular ones.

Interviewer: But just considering the round ones, why are they round?

Feynman: If we are just considering the round ones, then they are round by definition. That statement is a tautology.

Interviewer: I mean, why are there round ones at all? Is there some particular value to having round ones?

Feynman: Yes. Round covers are used when the hole they are covering up is also round. It's simplest to cover a round hole with a round cover.

Interviewer: Can you think of a property of round covers that gives them an advantage over square ones?

Feynman: We have to look at what is under the cover to answer that question. The hole below the cover is round because a cylinder is the strongest shape against the compression of the earth around it. Also, the term "manhole" implies a passage big enough for a man, and a human being climbing down a ladder is roughly circular in cross-section. So a cylindrical pipe is the natural shape for manholes. The covers are simply the shape needed to cover up a cylinder.

Interviewer: Do you believe there is a safety issue? I mean, couldn't square covers fall into the hole and hurt someone?

Feynman: Not likely. Square covers are sometimes used on prefabricated vaults where the access passage is also square. The cover is larger than the passage, and sits on a ledge that supports it along the entire perimeter. The covers are usually made of solid metal and are very heavy. Let's assume a two-foot square opening and a ledge width of 1-1/2 inches. In order to get it to fall in, you would have to lift one side of the cover, then rotate it 30 degrees so that the cover would clear the ledge, and then tilt the cover up nearly 45 degrees from horizontal before the center of gravity would shift enough for it to fall in. Yes, it's possible, but very unlikely. The people authorized to open manhole covers could easily be trained to do it safely. Applying common engineering sense, the shape of a manhole cover is entirely determined by the shape of the opening it is intended to cover.

Interviewer (troubled): Excuse me a moment; I have to discuss something with my management team. (Leaves room.)

(Interviewer returns after 10 minutes)

Interviewer: We are going to recommend you for immediate hiring into the marketing department.

Keith Michaels
krm@sdc.cs.boeing.com
 
 
Current Mood: amused
Current Music: Apocalyptica - Harmageddon