Hixie On HTML Parsing

Waist deep in an ongoing effort to write a spec for parsing HTML documents, Ian Hickson offers up this tag soup riddle: what should the DOM look like for the following invalid code?

<!DOCTYPE html><em><p>XY</p></em>

The answer, as you’ll see in his brain dump, is neither simple nor obvious.

Now, if your head isn’t spinning after reading through that, you should also take the time to read his post from a few days ago, where he started his parsing-related brain dump. And if it is, well, take two aspirin, but at least take a look at Live DOM Viewer, the little utility he wrote and used in both posts, where you can drop in some markup and instantly see the resulting DOM tree.
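If you'd like a rough feel for why this is a riddle at all, here is a minimal sketch in Python of the most naive tree builder imaginable: push on a start tag, pop on a matching end tag, ignore everything else. It is not how Live DOM Viewer works and not what any browser does; the names `NaiveTreeBuilder`, `dump`, and `naive_tree` are made up for illustration, and only the standard library's `html.parser` is assumed.

```python
from html.parser import HTMLParser


class NaiveTreeBuilder(HTMLParser):
    """Builds a tree with a plain stack and no browser-style error recovery."""

    def __init__(self):
        super().__init__()
        self.root = {"name": "#document", "children": []}
        self.stack = [self.root]  # currently open elements, innermost last

    def handle_decl(self, decl):
        # decl arrives without the surrounding "<!" and ">".
        self.stack[-1]["children"].append({"name": "<!" + decl + ">", "children": []})

    def handle_starttag(self, tag, attrs):
        node = {"name": tag.upper(), "children": []}
        self.stack[-1]["children"].append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        # Naive rule (an assumption of this sketch): pop only when the end tag
        # matches the innermost open element; otherwise silently ignore it.
        if len(self.stack) > 1 and self.stack[-1]["name"] == tag.upper():
            self.stack.pop()

    def handle_data(self, data):
        self.stack[-1]["children"].append({"name": "#text: " + data, "children": []})


def dump(node, depth=0):
    """Return the tree as a list of indented lines, one per node."""
    lines = ["  " * depth + node["name"]]
    for child in node["children"]:
        lines.extend(dump(child, depth + 1))
    return lines


def naive_tree(markup):
    builder = NaiveTreeBuilder()
    builder.feed(markup)
    builder.close()
    return dump(builder.root)


if __name__ == "__main__":
    print("\n".join(naive_tree("<!DOCTYPE html><em><p>XY</p></em>")))
```

A builder this simple never has to decide what happens when tags overlap or when an element turns up somewhere it is not allowed, and it does not create the implied HTML, HEAD, and BODY elements that a browser adds. Those decisions are where the real parsing rules get complicated, so treat the output as a baseline to compare against what Live DOM Viewer shows, not as the answer to the riddle.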
