Validator.nu HTML Parser 1.2.1
May 25th, 2009 · No Comments
Tags: Uncategorized
This Week in HTML 5 – Episode 32
May 6th, 2009 · No Comments
Tags: Uncategorized
This Week in HTML 5 – Episode 31
April 23rd, 2009 · No Comments
Welcome back to "This Week in HTML 5," where I'll try to summarize the major activity in the ongoing standards process in the WHATWG and W3C HTML Working Group.
The big news this week is the <datagrid> element. This is a brand spanking new element introduced in r2962.
In the datagrid data model, data is structured as a set of rows representing a tree, each row being split into a number of columns. The columns are always present in the data model, although individual columns might be hidden in the presentation.
Each row can have child rows. Child rows may be hidden or shown, by closing or opening (respectively) the parent row.
Rows are referred to by the path along the tree that one would take to reach the row, using zero-based indices. Thus, the first row of a list is row "0", the second row is row "1"; the first child row of the first row is row "0,0", the second child row of the first row is row "0,1"; the fourth child of the seventh child of the third child of the tenth row is "9,2,6,3", etc.
The chains of numbers that give a row's path, or identifier, are represented by arrays of positions, represented in IDL by the RowID interface.
The root of the tree is represented by an empty array.
Each column has a string that is used to identify it in the API, a label that is shown to users interacting with the column, a type, and optionally an icon.
The possible types are as follows:
Keyword: Description
- text: Simple text.
- editable: Editable text.
- checkable: Text with a check box.
- list: A list of values that the user can switch between.
- progress: A progress bar.
- meter: A gauge.
- custom: A canvas onto which arbitrary content can be drawn.
Each column can be flagged as sortable, in which case the user will be able to sort the view using that column.
Columns are not necessarily visible. A column can be created invisible by default. The user can select which columns are to be shown.
When no columns have been added to the datagrid, a column with no name, whose identifier is the empty string, whose type is text, and which is not sortable, is implied. This column is removed if any explicit columns are declared.
Each cell uses the type given for its column, so all cells in a column present the same type of information.
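The row-path scheme in the excerpt above is easier to see in code. What follows is only a rough Python sketch, not the datagrid API: the Row class and both function names are made up for illustration. It parses a path string such as "9,2,6,3" into zero-based indices and walks a tree with them, treating the empty path as the root.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Row:
    """Hypothetical stand-in for a datagrid row; not part of the spec's API."""
    label: str
    children: List["Row"] = field(default_factory=list)


def parse_row_id(path: str) -> List[int]:
    """Turn "9,2,6,3" into [9, 2, 6, 3]; the empty string is the root ([])."""
    return [int(part) for part in path.split(",")] if path else []


def resolve(root: Row, row_id: List[int]) -> Row:
    """Follow zero-based indices down the tree, one level per index."""
    row = root
    for index in row_id:
        row = row.children[index]
    return row


# Root with two top-level rows; the first row has two child rows.
tree = Row("root", [
    Row("first", [Row("first-child"), Row("second-child")]),
    Row("second"),
])
```

With this tree, the path "0,1" lands on the second child of the first row, and an empty path returns the root itself.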
The other major change to the spec this week is the <keygen> element. As I mentioned in episode 12, someone went to the trouble of documenting the <keygen> element, and there has been a surprising amount of discussion about it in the past six months. Simply put, the keygen element represents a key-pair generator control. You include it in a <form>. When your browser submits the form, the private key is stored in the local keystore, and the public key is packaged and sent to the server. [r2960]
Not much else went into the spec this week, but there's been a lot of interesting activity around the web.
- A new W3C Working Draft of HTML 5 is out. As I've mentioned before, this is just a snapshot of progress-to-date. By its very nature, it is out of date as soon as it's published, since the working group continues to progress while the webmaster gnomes are publishing.
- Also published: the latest draft of "HTML 5 differences from HTML 4", compiled by Opera's Anne van Kesteren.
- Mozilla bug 465007: "Harmonize content sniffing in HTML 5 and Firefox." The next version of Firefox will sniff images the way the HTML 5 specification recommends. I am still opposed to content sniffing on philosophical grounds, but philosophy doesn't get you very far on the open web, and documented heuristics are better than undocumented heuristics. And interoperable, documented heuristics are even better!
- Speaking of content sniffing, Adam Barth's [PDF] whitepaper, Secure Content Sniffing For Web Browsers, is an excellent read.
- Henri Sivonen's The Last of the Parsing Quirks is equally fascinating.
- Superset encodings [Re: ISO-8859-* and the C1 control range] is an incredibly detailed look into the insane world of character encoding.
- You can still help us review HTML 5! Your input is important!
Tune in next week for another exciting episode of "This Week in HTML 5."
This item was originally posted at: http://blog.whatwg.org and is licensed under the MIT license
Tags: Weekly Review
The Road to HTML 5: Link Relations
April 17th, 2009 · No Comments
Welcome back to my semi-regular column, "The Road to HTML 5," where I'll try to explain some of the new elements, attributes, and other features in the upcoming HTML 5 specification.
The feature of the day is link relations.
In this article:
- What are link relations?
- How can I use link relations?
- Changes to link relations since HTML 4
- Extending rel even further
What are link relations?
Regular links (<a href>) simply point to another page. Link relations are a way to explain why you're pointing to another page. They finish the sentence "I'm pointing to this other page because..."
- ...it's a stylesheet containing CSS rules that your browser should apply to this document
- ...it's a feed that contains the same content as this page, but in a standard subscribable format
- ...it's a translation of this page into another language
- ...it's the same content as this page, but in PDF format
- ...it's the next chapter of an online book that this page is also a part of
And so on. HTML 5 breaks link relations into two categories:
Two categories of links can be created using the link element. Links to external resources are links to resources that are to be used to augment the current document, and hyperlink links are links to other documents. ...
The exact behavior for links to external resources depends on the exact relationship, as defined for the relevant link type.
Of the examples I just gave, only the first (rel=stylesheet) is a link to an external resource. The rest are hyperlinks to other documents. You may wish to follow those links, or you may not, but they're not required in order to view the current page.
Common link relations include <link rel=stylesheet> (for importing CSS rules) and <link rel=alternate type=application/atom+xml> (for Atom feed autodiscovery). HTML 4 defines several link relations; others have been defined by the microformats community. HTML 5 attempts to consolidate all the known link relations, clean up their definitions (if necessary), and then provide a central registry for future proposals.
How can I use link relations?
Most often, link relations are seen on <link> elements within the <head> of a page. Some link relations can also be used on <a> elements, but this is uncommon even when allowed. HTML 5 also allows some relations on <area> elements, but this is even less common. (HTML 4 did not allow a rel attribute on <area> elements.)
See the full chart of link relations to check where you can use specific rel values.
Changes to link relations since HTML 4
Link relations were added to the HTML 5 spec in November 2006. (Back then the spec was still called "Web Applications 1.0.") r319 kicked off a flurry of rel-related activity. The original additions were primarily based on research of existing web content in December 2005, using Google's cache of the web at the time. Since then, other relations have been added, and a few have been dropped.
rel=alternate
rel=alternate has always been a strange hybrid of use cases, even in HTML 4. In HTML 5, its definition has been clarified and extended to more accurately describe existing web content. For example, using rel=alternate in conjunction with the type attribute indicates the same content in another format. Using rel=alternate in conjunction with type=application/rss+xml or type=application/atom+xml indicates an RSS or Atom feed, respectively.
HTML 5 also puts to rest a long-standing confusion about how to link to translations of documents. HTML 4 says to use the lang attribute in conjunction with rel=alternate to specify the language of the linked document, but this is incorrect. The HTML 4 Errata lists four outright errors in the HTML 4 spec (along with several editorial nits); one of these outright errors is how to specify the language of a document linked with rel=alternate. (The correct way, described in the HTML 4 Errata and now in HTML 5, is to use the hreflang attribute.) Unfortunately, these errata were never re-integrated into the HTML 4 spec, because no one in the W3C HTML Working Group was working on HTML anymore.
- r324: rel=alternate added to HTML 5
- r485 defines how to use the media attribute in conjunction with rel=alternate
- r1942 makes the title attribute required for rel="alternate stylesheet".
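To see how the type and hreflang attributes change what rel=alternate means, here is a small Python sketch that reads link elements and labels each one. The class name, the labels, and the order of the checks are my own assumptions for illustration; the spec does not define a classifier like this, and a link carrying both type and hreflang could reasonably be read either way.

```python
from html.parser import HTMLParser

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class AlternateLinks(HTMLParser):
    """Collect <link rel=alternate> elements and label what each one means."""

    def __init__(self):
        super().__init__()
        self.found = []  # list of (kind, href) tuples, in document order

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        rel_tokens = (a.get("rel") or "").lower().split()
        # "alternate stylesheet" is a different beast, so skip it here.
        if "alternate" not in rel_tokens or "stylesheet" in rel_tokens:
            return
        if (a.get("type") or "").lower() in FEED_TYPES:
            kind = "feed"
        elif a.get("hreflang"):
            kind = "translation"
        elif a.get("type"):
            kind = "other format"
        else:
            kind = "alternate"
        self.found.append((kind, a.get("href")))


def classify(html: str):
    parser = AlternateLinks()
    parser.feed(html)
    return parser.found


# Hypothetical markup for a page with a feed, a translation, and a PDF copy.
sample = """
<link rel="alternate" type="application/atom+xml" href="/feed.atom">
<link rel="alternate" hreflang="fr" href="/fr/">
<link rel="alternate" type="application/pdf" href="/page.pdf">
"""
```

Calling classify(sample) labels the three links as a feed, a translation, and another format, in that order.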
rel=archives
New in HTML 5
rel=archives "indicates that the referenced document describes a collection of records, documents, or other materials of historical interest. A blog's index page could link to an index of the blog's past posts with rel="archives"."
rel=author (and the removal of the rev attribute)
New in HTML 5
rel=author is used to link to information about the author of the page. This can be a mailto: address, though it doesn't have to be. It could simply link to a contact form or "about the author" page.
rel=author is equivalent to the rev=made link relation defined in HTML 3.2. Despite popular belief, HTML 4 does not include rev=made, effectively obsoleting it. (You can search the entire spec for the word "made" if you don't believe me.)
Given that rev=made was the only significant non-typo usage of the rev attribute, HTML 5 added rel=author to make up for the loss of rev=made in HTML 4, thus allowing the working group to obsolete the rev attribute altogether. Other than the un/semi/sortof-documented rev=made value, people typo the "rev" attribute more often than they intentionally use it, which suggests that the world would be better off if validators could flag it as non-conforming.
The decision to drop the rev attribute seems especially controversial. The same question flares up again and again on the working group's mailing list: "what happened to the rev attribute?" But in the face of almost-universal misunderstanding (among people who try to use it) and apathy (among everyone else), no one has ever made a convincing case for keeping it that didn't boil down to "I wish the world were different." Hey, so do I, man. So do I.
- r329: rel=author added to HTML 5
- WA1: rev attribute (and followup two years later)
- Where did the "rev" attribute go?
- Removing @rev
- Absent rev?
rel=external
New in HTML 5
rel=external "indicates that the link is leading to a document that is not part of the site that the current document forms a part of." I believe it was first popularized by WordPress, which uses it on links left by commenters. I could not find any discussion of it in the HTML working group mailing list archives. Both its existence and its definition appear to be entirely uncontroversial.
rel=feed?
New in HTML 5, but may not be long for this world
rel=feed "indicates that the referenced document is a syndication feed." Right away, you're thinking, "Hey, I thought you were supposed to use rel=alternate type=application/atom+xml to indicate that the referenced document is a syndication feed." In fact, that's what everyone does, and that's what all browsers support. Firefox 3 is the only browser that supports rel=feed. (It also supports rel=alternate type=application/atom+xml.) The rel=feed variant was proposed in the Atom working group in 2005 and somehow found its way into HTML 5. Just yesterday, I was discussing whether HTML 5 should drop rel=feed due to lack of browser implementation and complete and utter lack of author awareness.
- r319: rel=feed added to HTML 5
- r335: more thorough definition
- Autodiscovery paces
- Inferring rel="feed" from the media type
- PaceEntryMediatype (2, 3)
- IRC chat about browser compatibility and usage within existing web content (continued on next page)
rel=first, last, prev, next, and up
HTML 4 defined rel=start, rel=prev, and rel=next to define relations between pages that are part of a series (like chapters of a book, or even posts on a blog). The only one that was ever used correctly was rel=next. People used rel=previous instead of rel=prev; they used rel=begin and rel=first instead of rel=start; they used rel=end instead of rel=last. Oh, and -- all by themselves -- they made up rel=up to point to a "parent" page.
HTML 5 includes rel=first, which was the most common variation of the different ways to say "first page in a series." (rel=start is a non-conforming synonym, for backward compatibility.) Also rel=prev and rel=next, just like HTML 4 (but mentioning rel=previous for back-compat). It also adds rel=last (the last in a series, mirroring rel=first) and rel=up.
The best way to think of rel=up is to look at your breadcrumb navigation (or at least imagine it). Your home page is probably the first page in your breadcrumbs, and the current page is at the tail end. rel=up points to the next-to-the-last page in the breadcrumbs.
- r319: rel=first/prev/next/last added to HTML 5
- r320: rel=up added to HTML 5
- r1126 and r1127 make it clear that rel=first/prev/next/last refer to any sequence of pages, not just a hierarchical structure.
- r1130 makes it legal to duplicate the up keyword in a single rel attribute.
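If you were cleaning up old markup, the synonym mess described above might be handled with something like the following Python sketch. The function names are hypothetical. Only start and previous are mentioned by HTML 5 itself; mapping begin and end is purely this sketch's assumption, based on the variants people actually used. The second helper illustrates the breadcrumb rule of thumb for rel=up.

```python
# Synonyms mentioned in the article, mapped to the HTML 5 keyword.
SYNONYMS = {
    "start": "first",     # HTML 4 keyword; non-conforming synonym in HTML 5
    "previous": "prev",   # mentioned by HTML 5 for back-compat
    "begin": "first",     # seen in the wild; mapping is this sketch's assumption
    "end": "last",        # seen in the wild; mapping is this sketch's assumption
}


def normalize_rel(rel: str) -> list:
    """Split a rel attribute on whitespace, lowercase it, and map synonyms."""
    return [SYNONYMS.get(token, token) for token in rel.lower().split()]


def up_target(breadcrumbs: list):
    """rel=up points at the next-to-last breadcrumb, or None at the top."""
    return breadcrumbs[-2] if len(breadcrumbs) >= 2 else None
```

For a breadcrumb trail of Home, Blog, Post, the rel=up target is Blog; a trail containing only Home has no rel=up target.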
rel=icon
New in HTML 5
rel=icon is the second most popular link relation, after rel=stylesheet. It is usually found together with shortcut, like so:
<link rel="shortcut icon" href="/favicon.ico">
All major browsers support this usage to associate a small icon with the page (usually displayed in the browser's location bar next to the URL).
Also new in HTML 5: the sizes attribute can be used in conjunction with the icon relationship to indicate the size of the referenced icon. [sizes example]
- <link rel=icon width="" height=""> (initial proposal, later changed to sizes)
- Re: <link rel=icon width="" height="">
- <link rel=icon sizes=?> What if sizes is incorrect?
- The sizes="" attribute for rel=icon
- r339: rel=icon added to HTML 5
- r1558 adds the sizes attribute, and r1559 adds an example
- r1712 says that, if the browser decides that multiple icons are appropriate and equally desirable, the last one listed should be used.
- r1713 further explains how to determine precedence if multiple icons are listed.
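Here is a rough Python sketch of how a browser might read the sizes attribute and apply the "last one listed wins" rule from r1712. Several things here are assumptions: the token grammar (WIDTHxHEIGHT tokens, or the keyword any) follows the HTML specification as I know it today rather than the 2009 draft, the fallback to the last icon listed is my own choice, and the finer precedence rules from r1713 are not modeled at all.

```python
import re

# One sizes token: two positive integers joined by "x" or "X".
SIZE_TOKEN = re.compile(r"^([1-9][0-9]*)[xX]([1-9][0-9]*)$")


def parse_sizes(value: str):
    """Parse a sizes value such as "16x16 32x32" into a set of (w, h) pairs.

    The keyword "any" is kept as the string "any"; malformed tokens are
    skipped.
    """
    sizes = set()
    for token in value.split():
        if token.lower() == "any":
            sizes.add("any")
            continue
        match = SIZE_TOKEN.match(token)
        if match:
            sizes.add((int(match.group(1)), int(match.group(2))))
    return sizes


def pick_icon(icons, wanted):
    """icons is a list of (href, sizes_value) pairs in document order.

    Return the last icon that advertises the wanted size (or "any"),
    falling back to the last icon listed (an assumption of this sketch).
    """
    chosen = None
    for href, sizes_value in icons:
        sizes = parse_sizes(sizes_value)
        if wanted in sizes or "any" in sizes:
            chosen = href  # later matches overwrite earlier ones
    if chosen is None and icons:
        chosen = icons[-1][0]
    return chosen
```

Given two icons that both claim 16x16, pick_icon returns whichever appears later in the document.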
rel=license
New in HTML 5
rel=license was invented by the microformats community. It "indicates that the referenced document provides the copyright license terms under which the current document is provided."
- real world semantics, the presentation that launched a thousand ships
- creative commons supports rel="license"
- rel="license" spec on microformats.org
- r340: rel=license added to HTML 5
- rel-license-issues
rel=nofollow
New in HTML 5
rel=nofollow "indicates that the link is not endorsed by the original author or publisher of the page, or that the link to the referenced document was included primarily because of a commercial relationship between people affiliated with the two pages." It was invented by Google and standardized within the microformats community. The thinking was that if "nofollow" links did not pass on PageRank, spammers would give up trying to post spam comments on weblogs. That didn't happen, but rel=nofollow persists. Many popular blogging systems default to adding rel=nofollow to links added by commenters.
rel=noreferrer
New in HTML 5
rel=noreferrer "indicates that the no referrer information is to be leaked when following the link." No browser currently supports this. [rel=noreferrer test case]
- r1119: rel=noreferrer added to HTML 5
- no referer attrribute for <a>, same reason as ping - initial discussion of a mechanism for blocking referrer information on individual links
- several messages about a way to disable referer headers for links
- "I don't think that spelling the attribute "noreferer" is consistent. It should be "noreferrer"."
- "I'll switch to [no]referrer."
- r1950 extends rel=noreferrer to also blow away the 'opener' when used with target=_blank
rel=pingback
New in HTML 5
rel=pingback specifies the address of a "pingback" server. As explained in the Pingback specification, "The pingback system is a way for a blog to be automatically notified when other Web sites link to it. ... It enables reverse linking -- a way of going back up a chain of links rather than merely drilling down."
Blogging systems, notably WordPress, implement the pingback mechanism to notify authors that you have linked to them when creating a new blog post.
- r343: rel=pingback added to HTML 5
- Pingback 1.0 specification
- WordPress Trackback Tutorial
rel=prefetch
New in HTML 5
rel=prefetch "indicates that preemptively fetching and caching the specified resource is likely to be beneficial, as it is highly likely that the user will require this resource." Search engines sometimes add <link rel=prefetch href="URL of top search result"> to the search results page if they feel that the top result is wildly more popular than any other. For example: using Firefox, search Google for CNN; view source; search for the keyword "prefetch".
Mozilla Firefox is the only current browser that supports rel=prefetch.
- What improves Web applications? "A big plus point would be to prefetch the next page so it loads immediately the 'next' button is clicked."
- r345: rel=prefetch added to HTML 5
- Link prefetching FAQ on Mozilla Developer Center
- test whether your browser supports rel=prefetch
rel=search
New in HTML 5
rel=search "indicates that the referenced document provides an interface specifically for searching the document and its related resources." Specifically, if you want rel=search to do anything useful, it should point to an OpenSearch document that describes how a browser could construct a URL to search the current site for a given keyword.
OpenSearch (and rel=search links that point to OpenSearch description documents) is supported in Microsoft Internet Explorer since version 7 and Mozilla Firefox since version 2.
- OpenSearch: Autodiscovery in HTML/XHTML
- r345: rel=search added to HTML 5
- Creating OpenSearch plugins for Firefox
- Available search plugins for Firefox
- Search Providers for Microsoft Internet Explorer
- r484 adds a note about OpenSearch documents in particular
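To give a feel for what "construct a URL to search the current site for a given keyword" means, here is a minimal Python sketch. It handles only the {searchTerms} placeholder from OpenSearch URL templates; the function name and the example.com template are hypothetical, and a real client would also need to deal with the other template parameters that OpenSearch defines.

```python
from urllib.parse import quote


def build_search_url(template: str, terms: str) -> str:
    """Substitute the {searchTerms} placeholder with the URL-encoded query."""
    return template.replace("{searchTerms}", quote(terms, safe=""))


# Hypothetical template, as it might appear in the template attribute of a
# Url element in an OpenSearch description document.
TEMPLATE = "http://example.com/search?q={searchTerms}"
```

The query is percent-encoded before substitution, so characters like spaces and ampersands cannot break the resulting URL.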
rel=sidebar
New in HTML 5
rel=sidebar "indicates that the referenced document, if retrieved, is intended to be shown in a secondary browsing context (if possible), instead of in the current browsing context." What does that mean? In Opera and Mozilla Firefox, it means "when I click this link, prompt the user to create a bookmark that, when selected from the Bookmarks menu, opens the linked document in a browser sidebar." (Opera actually calls it the "panel" instead of the "sidebar.")
Internet Explorer, Safari, and Chrome ignore rel=sidebar and just treat it as a regular link. [rel=sidebar test case]
- r346: rel=sidebar added to HTML 5
- r668 revamps the definition based on the concept of a "secondary browsing context."
rel=tag
New in HTML 5
rel=tag "indicates that the tag that the referenced document represents applies to the current document." Marking up "tags" (category keywords) with the rel attribute was invented by Technorati to help them categorize blog posts. Early blogs and tutorials thus referred to them as "Technorati tags." (You read that right: a commercial company convinced the entire world to add metadata that made the company's job easier. Nice work if you can get it!) The syntax was later standardized within the microformats community, where it was simply called "rel=tag".
Most blogging systems that allow associating categories, keywords, or tags with individual posts will mark them up with rel=tag links. Browsers do not do anything special with them, but they're really designed for search engines to use as a signal of what the page is about.
- rel-tag microformat specification
- rel-tag FAQ
- r346: rel=tag added to HTML 5
rel=contact
rel=contact was briefly part of HTML 5, but r1711 removed it because it conflicted with the same-named XFN relationship.
Extending rel even further
There seems to be an infinite supply of ideas for new link relations. In an attempt to prevent people from just making shit up, the WHATWG maintains a registry of proposed rel values and defines the process for getting them accepted.
Tags: Tutorials
This Week in HTML 5 – Episode 30
April 13th, 2009 · No Comments
Welcome back to "This Week in HTML 5," where I'll try to summarize the major activity in the ongoing standards process in the WHATWG and W3C HTML Working Group.
There has been very little spec-related activity this week, so I will briefly repeat Ian Hickson's request to Help us review HTML5 and then turn to a fascinating debate happening right now on the WHATWG mailing list.
The debate revolves around perceptions and expectations of privacy. Brady Eidson (Apple/WebKit) kicks off the discussion with Private browsing vs. Storage and Databases:
A commonly added feature in browsers these days is "private browsing mode" where the intention is that the user's browsing session leaves no footprint on their machine. Cookies, cache files, history, and other data that the browser would normally store to disk are not updated during these private browsing sessions.
This concept is at odds with allowing pages to store data on the user's machine as allowed by LocalStorage and Databases. Sur[e]ly persistent changes during a private browsing session shouldn't be written to the user's disk as that would violate the intention of a private browsing session. ...
1. Disable LocalStorage completely when private browsing is on. Remove it from the DOM completely.
2. Disable LocalStorage mostly when private browsing is on. It exists at window.localStorage, but is empty and has a 0-quota.
3. Slide a "fake" LocalStorage object in when private browsing is enabled. It starts empty, changes to it are successful, but it is never written to disk. When private browsing is disabled, all changes to the private browsing proxy are thrown out.
4. Cover the real LocalStorage object with a private browsing layer. It starts with all previously stored contents. Any changes to it are pretended to occur, but are never written to disk. When private browsing is disabled, all items revert to the state they were in when private browsing was enabled and writing changes to disk is re-enabled.
5. Treat LocalStorage as read-only when private browsing is on. It exists, and all previously stored contents can be retrieved. Any attempt to setItem(), removeItem(), or clear() fail.
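The difference between the third and fourth options in the list above (the "fake" object and the private layer) is subtle, so here is a toy Python model of both. None of this is a browser API: the Storage class and the private_storage function are invented for illustration, and "throwing out" the private data is modeled simply by discarding the scratch object.

```python
class Storage:
    """Tiny in-memory stand-in for LocalStorage (hypothetical, not a browser API)."""

    def __init__(self, initial=None):
        self._items = dict(initial or {})

    def get_item(self, key):
        return self._items.get(key)

    def set_item(self, key, value):
        self._items[key] = str(value)

    def snapshot(self):
        """Return a copy of the stored items, so callers cannot mutate ours."""
        return dict(self._items)


def private_storage(real: Storage, option: int) -> Storage:
    """Return the scratch storage a private session would see.

    Option 3 starts empty; option 4 starts from a copy of the real data.
    Either way, writes land in the scratch copy and never reach `real`.
    """
    if option == 3:
        return Storage()
    if option == 4:
        return Storage(real.snapshot())
    raise ValueError("this sketch only models options 3 and 4")
```

Under option 3 a site sees no trace of its earlier data, while under option 4 it can still read whatever it stored before the private session began.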
Ian Fette (Google/Chrome) explains how Google Chrome handles LocalStorage in "incognito" mode:
[W]hilst the [incognito] session is active, pages can still use a database / local storage / ... / and at the end of the session, when that [temporary] profile is deleted, things will go away. I personally like that approach, as there may be legitimate reasons to want to use a database even for just a single session.
Darin Fisher (Google/Chrome) follows up to clarify Google Chrome's behavior:
Chrome's "incognito mode" means -- is defined as -- starting from a clean slate (as if you started browsing for the first time on a new computer), and when you exit incognito mode, the accumulated data is discarded. That's all there is to it. The behavior of LocalStorage and Database in this mode is deduced easily from that definition.
Jonas Sicking (Mozilla/Firefox) explains his opposition to option 5:
My concern with this is the same as the reason we in firefox clear all cookies when entering private browsing mode. The concern is as follows:
- A search engine stores a user-id token in a cookie. They then use this token to server side store the users 10 last searches.
- A user uses this search engine to search for various items. Doing this causes the user-id token to be stored in a cookie.
- The user then switches to private browsing mode.
- The user makes a search for a present for his wife.
- The user switches back into normal browsing mode.
At this point it is still possible to see the search for the wifes present in the websites store of recent searches.
Something very similar could happen for localStorage I would imagine, where the user-identifing information is stored in the localStorage rather than a cookie.
Josh "timeless" Soref (core Firefox developer) explores the privacy implications of different options:
[Option 1: Disabling LocalStorage won't work because] Many sites will just assume that they know a given useragent supports localstorage, so they'll be surprised and break. This will mean that a user can't use certain sites.
[Option 2: Enabling LocalStorage with 0 quota] will enable sites to know that the user is browsing in private, which is probably also a violation of the user's trust model. If I were to be browsing in private, I wouldn't want most sites to know that I'm doing this.
[Option 4 or 5: Starting with existing LocalStorage data] means the site will know who you are (on average), and is almost certainly never what the user wants.
Jonas Sicking (Mozilla/Firefox) tentatively states:
For what it's worth, I believe we're currently planning on doing 2 in firefox.
Brady Eidson concludes:
I strongly share Jonas' concern that we'd tell web applications that we're storing there data when we already know we're going to dump it later. For 3 and 4 both, we're basically lying to the application and therefore the user.
... So far I'm standing by WebKit choosing #5 for now.
Drew Wilson summarizes his thoughts on the matter:
I think the #1 goal for incognito mode has to be "maximum compatibility" -- let sites continue to work, which kills options #1 & 2. A secondary goal for incognito mode would be "don't let sites know the user is in incognito mode" -- this kills approach #1 and #5, and possibly #2 (depending on whether there are significant non-incognito use cases that also have 0 local storage quota).
For my part, I agree with Drew, and I would add this: I use Google Chrome's "incognito mode" quite frequently when I'm developing websites. It's an easy way to test from a "blank slate" with no cookies and no cache, and it's much easier than juggling multiple profiles. If data in my LocalStorage "bleeds" into incognito mode, this use case would become unreliable and web development would be harder for me. (Bil Corry makes this point too.)
On a more philosophical level, it's nobody's business that I'm in private browsing mode. (Scott Hess makes this point too.) If authors can detect it, I consider that a serious bug. (Imagine the ha.ckers.org headline: "Safari Hole Allows Sites To Detect 'Private' Browsing, Punish Users.") Even worse, if LocalStorage could be used as a "super-cookie" for less-than-honorable sites to track me from normal usage to incognito usage, then it's not really "private browsing" in any sense of the word that matters.
In the early days of Greasemonkey, there were discussions of whether Greasemonkey should send or provide some detectable signal to page authors that Greasemonkey was running and the user had active scripts modifying the current page. To which I replied:
If Greasemonkey makes any overtures towards allowing web publishers to "opt out" or override my browsing experience in any way, I will immediately fork it and make it my life's mission to maintain the fork as long as possible.
Tune in next week for another exciting episode of "This Week in HTML 5."
Tags: Weekly Review
This Week in HTML 5 – Episode 29
April 7th, 2009 · No Comments
Welcome back to "This Week in HTML 5," where I'll try to summarize the major activity in the ongoing standards process in the WHATWG and W3C HTML Working Group.
The big news for the week of March 30th is the addition of a synchronous database API to the Web Storage spec (which was split out from the HTML 5 spec a few weeks ago). This new API defines a DatabaseSync object whose methods return SQLTransactionSync objects. This directly mirrors the asynchronous database API, which had already defined a Database object whose methods return SQLTransaction objects. [r2958]
Another interesting change this week is r2921, which adds the placeholder attribute to the <textarea> element. I tracked the initial discussion of the placeholder attribute in episode 8 and noted its appearance in HTML 5 in episode 13. Previously you could only use the placeholder attribute on <input type=text>, <input type=email>, <input type=url>, and <input type=password>, but Thomas Broyer pointed out that Google Code (among others) uses placeholder text on <textarea> elements. Such sites could now theoretically migrate their current script-based solutions to HTML 5 markup.
Other interesting changes this week:
- r2928 recommends that browsers should reset the text-indent property when rendering a <textarea> element.
- r2930 notes a strange edge case where paragraphs (<p> elements) can end up overlapping each other if they are used as fallback content within an <object> element.
- r2933 adds event handler DOM attributes like onclick to the WebIDL definition of the document object.
- r2936 allows the spellcheck attribute to be present with no value, as a synonym for spellcheck="true". I first mentioned the spellcheck attribute in episode 23, and again in The Road to HTML 5: spellchecking.
- r2937 allows <textarea wrap=off>.
- r2941 further tweaks the algorithm for parsing legacy color attributes, which is trickier than you might think.
- r2943 allows the width and height attributes of an <img> element to be 0.
Around the web:
- Jacob Seidelin posts a <canvas> cheat sheet.
- Peter-Paul Koch knows way more about dates than you do.
- Eric Meyer uses HTML 5 to present the results of the "A List Apart Survey 2008."
- Alex Nicolaou explains how the new versions of Google Mail and Google Calendar use HTML 5 for offline functionality.
- Jon Tan talks about practical ways to tweak your markup now to migrate to HTML 5 later.
- addfullsize.com is an entire site devoted to adding an attribute to the <img> element.
- Anne van Kesteren thinks URLs are tough, and he's right.
Tune in next week for another exciting episode of "This Week in HTML 5."
Tags: Weekly Review
Help us review HTML5!
April 2nd, 2009 · No Comments
Are you interested in reviewing HTML5 for errors?
- Jump in! All feedback is welcome, from anyone.
- Open the specification: the one-page version, the multipage version, or the PDF copy (A4, Letter).
- Start reading! See below for ideas of what to look for.
If you find a problem, either send an e-mail to the WHATWG list (whatwg@whatwg.org, subscription required), file a bug (registration required), send an e-mail to the public-html-comments@w3.org list (no subscription required), or send an e-mail directly to ian@hixie.ch.
If everything goes according to plan, all issues will get a response from the editor before October. You can track how many issues remain to be responded to on our graph.
What to look for
The plan is to see whether we can shake down the spec and get rid of all the minor problems that have so far been overlooked. Typos, confusion, cross-reference errors, as well as mistakes in examples, errors in the definitions, and major errors like security bugs or contradictions.
Anyone who helps find problems in the spec — however minor — will get their name in the acknowledgements section.
You don't really need any experience to find the simplest class of problems: things that are confusing! If you don't understand something, then that's a problem. Not all the introduction sections and examples are yet written, but if there is a section with an introduction section that isn't clear, then you've found an issue: let us know!
Something else that would now be good to search for is typos, spelling errors, grammar errors, and the like. Don't hesitate to send e-mails even for minor typos; all feedback, even on such small issues, is very welcome.
If you have a specific need as a Web designer, then try to see if the need is met. If it isn't, and you haven't discussed this need before, then send an e-mail to the list. (So for example, if you want HTML to support date picker widgets, you'd look in the spec to see if it was covered. As it turns out, that one is!)
If you have some specific expertise that lets you review a particular part of the spec for correctness, then that's another thing to look for. For example if you know about graphics, then reviewing the 2D Canvas API section would be a good use of your resources. If you know about scripting, then looking at the "Web browsers" section would be a good use of your time.
Staying in touch
You are encouraged to join our IRC channel #whatwg on Freenode to stay in touch with what other people are doing, but this is by no means required. You are also encouraged to post in the Discussion section on the wiki page for this review project, or in the blog comments below, to let people know what you are reviewing. You can get news updates by following @WHATWG on Twitter.
This Week in HTML 5 – Episode 28
April 2nd, 2009 · No Comments
Welcome back to "This Week in HTML 5," where I'll try to summarize the major activity in the ongoing standards process in the WHATWG and W3C HTML Working Group.
The big news for the week of March 23rd is that SVG can once again be included directly in HTML 5 documents served as text/html:
I've made the following changes to HTML5:
- Uncommented out the XXXSVG bits, reintroducing the ability to have SVG content in text/html.
- Defined <script> processing for SVG <script> in text/html by deferring to the SVG Tiny 1.2 spec and blocking synchronous document.write(). The alternative to this is to integrate the SVG script processing model with the (pretty complicated) HTML script processing model, which would require changes to SVG and might result in a dependency from SVG to HTML5. Anne would like to do this, but I'm not convinced it's wise, and it certainly would be more complex than what we have now. If we ever want to add async="" or defer="" to SVG scripts, then this would probably be a necessary part of that process, though.
- Added a paragraph suggesting: "To enable authors to use SVG tools that only accept SVG in its XML form, interactive HTML user agents are encouraged to provide a way to export any SVG fragment as a namespace-well-formed XML fragment."
- Added a paragraph defining the allowed content model for SVG <title> elements in text/html documents.
r2904 (and, briefly, r2910) give all the details of this solution. There are still a number of differences between the text in HTML 5 and the proposal brought by the SVG working group. Some of these are addressed further down in the announcement:
- SVG-in-XML is case-preserving; SVG-in-HTML is not.
- SVG-in-XML requires quoted attribute values; SVG-in-HTML does not.
- When SVG-in-XML uses CDATA blocks, they show up as CDATA nodes in the DOM; when SVG-in-HTML uses CDATA blocks, they show up in the DOM as conventional text nodes. [Clarified based on Henri's feedback]
- The <svg> element cannot be the root element of a text/html document.
Doug Schepers, who has been the SVG working group's HTML 5 liaison, does not like this solution:
To be honest, I think it's not a good use of the SVG WG's time to provide feedback when Ian already has his mind made up, even if I don't believe that he is citing real evidence to back up his decision. What I see is this: one set of implementers and authors (the SVG WG) and the majority of the author and user community (in public comments) asking for some sort of preservation of SVG as an XML format, even if it's looser and error-corrected in practice, and a few implementers (Jonas and Lachy, most notably) disagreeing, and Ian giving preference to the minority opinion. Maybe there is sound technical rationale for doing so, but I haven't been satisfied on that score.
Turning to technical matters, one of the features of web forms in HTML 5 is allowing the attributes for form submission on either the <form> element (as in HTML 4) or on the submit button (new in HTML 5). Originally, the attributes for submit buttons were named action, enctype, method, novalidate, and target, which exactly mirrored the attribute names that could be declared on the <form> element.
However, in January 2008, Hallvord R. M. Steen (Opera developer) noted that "INPUT action [attribute] breaks web applications frequently. Both GMail and Yahoo mail (the new Oddpost-based version) use input/button.action and were seriously broken by WF2's action attribute."
Following up in November 2008, Ian Hickson replied, "I notice that Opera still supports 'action' and doesn't seem to have problems in GMail; is this still a problem?" to which Hallvord replied, "GMail fixed it on their side a while ago. It is still a problem with Yahoo mail, breaking most buttons in their UI for a browser that supports 'action'. We work around this with a browser.js hack. ('Still a problem' means 'I tested this again a couple of weeks ago and things were still broken without this patch'.)"
Ian replied, "This is certainly problematic. It's unclear what we should do. It's hard to use another attribute name, since the whole point is reusing existing ones... can we trigger this based on quirks mode, maybe? Though I hate to add new quirks." Hallvord did not like that idea: "In my personal opinion, I don't see why re-using attribute names is considered so important if we can find an alternative that feels memorable and usable. How does this look? <input type="submit" formaction="http://www.example.com/">"
Finally, in March 2009, Ian replied:
That seems reasonable. I've changed "action", "method", "target", "enctype" and "novalidate" attributes on <input> and <button> to start with "form" instead: "formaction", "formmethod", "formtarget", "formenctype" and "formnovalidate".
And thus we have r2890: Rename attributes for form submission to avoid clashes with existing usage.
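To make the renamed attributes concrete, here is a minimal sketch of how a submit button's form* attributes could take the place of the matching attributes on its <form>. It is a plain-object model, not browser code: the function name resolveSubmission, the OVERRIDES table, and the example URLs are all hypothetical, and the rule that the button's value wins over the form's is stated here as an assumption rather than quoted from r2890.

```javascript
// Hypothetical model: attribute sets are plain objects, not DOM elements.
// Maps each <form> attribute to the renamed submit-button attribute.
const OVERRIDES = {
  action: "formaction",
  method: "formmethod",
  target: "formtarget",
  enctype: "formenctype",
  novalidate: "formnovalidate",
};

// Returns the effective submission settings, assuming (not confirmed by the
// post) that a value on the submit button takes precedence over the form's.
function resolveSubmission(formAttrs, buttonAttrs = {}) {
  const resolved = {};
  for (const [formName, buttonName] of Object.entries(OVERRIDES)) {
    if (buttonName in buttonAttrs) {
      resolved[formName] = buttonAttrs[buttonName];
    } else if (formName in formAttrs) {
      resolved[formName] = formAttrs[formName];
    }
  }
  return resolved;
}

// Usage: one form, with a second submit button that posts somewhere else.
const form = { action: "/save", method: "post" };
const deleteButton = { type: "submit", formaction: "/delete" };
console.log(resolveSubmission(form, deleteButton));
// → { action: '/delete', method: 'post' }
```

Because the button-side names no longer collide with action, scripts that store their own data in a button's action property (the pattern Hallvord described) are left alone.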
Other interesting changes this week:
- r2889 adds support for select.add(e) and select.options.add(e) with no second argument.
- r2888 defines how to determine the character encoding of Web Worker scripts. Briefly, it says to look for a Byte Order Mark, then look at the Content-Type HTTP header, then fall back to UTF-8.
- r2898, r2899, r2901, r2914, and r2916 define a locking mechanism to allow thread-safe read/write access to document.cookie and .localStorage. The lock is acquired during page fetching (which sets the cookie based on HTTP headers) and released once the cookie is set. It is also released automatically whenever something modal happens (such as window.alert()). (I first mentioned the discussion of this issue in episode 27. The problem is that Web Workers allows threaded client-side script execution, which means access to shared storage like document.cookie needs to be made explicitly thread-safe with some sort of locking mechanism.)
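The lookup order described for r2888 above (Byte Order Mark, then Content-Type, then UTF-8) can be sketched roughly as follows. This is an illustration under stated assumptions, not the spec's algorithm: the function name detectWorkerScriptEncoding is hypothetical, the set of BOMs checked (UTF-8 and both UTF-16 byte orders) is a guess, and the charset parsing is deliberately simplistic.

```javascript
// Hypothetical helper; the name and exact BOM list are assumptions.
// bytes: array-like of the first bytes of the script
// contentType: value of the Content-Type HTTP header, or null/undefined
function detectWorkerScriptEncoding(bytes, contentType) {
  // Step 1: a Byte Order Mark at the start of the script wins.
  if (bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) {
    return "utf-8";
  }
  if (bytes.length >= 2 && bytes[0] === 0xfe && bytes[1] === 0xff) {
    return "utf-16be";
  }
  if (bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xfe) {
    return "utf-16le";
  }

  // Step 2: otherwise use the charset parameter of the Content-Type header.
  if (typeof contentType === "string") {
    const match = /;\s*charset\s*=\s*"?([^";\s]+)"?/i.exec(contentType);
    if (match) {
      return match[1].toLowerCase();
    }
  }

  // Step 3: fall back to UTF-8.
  return "utf-8";
}

// Usage: no BOM, so the header's charset is used.
console.log(detectWorkerScriptEncoding([0x61], "text/javascript; charset=ISO-8859-1"));
// → iso-8859-1
```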
Tune in next week for another exciting episode of "This Week in HTML 5."
This Week in HTML 5 – Episode 27
April 2nd, 2009 · No Comments
Welcome back to "This Week in HTML 5," where I'll try to summarize the major activity in the ongoing standards process in the WHATWG and W3C HTML Working Group. In this episode, I'd like to highlight some of the discussions that I've missed in previous episodes.
- Greg Millam writes, "I'm one of the main engineers responsible for captioning support on YouTube, and I've joined the Chrome team at Google to attempt to help drive video captions and subtitling forward." Henri Sivonen replies, "I agree it makes sense to start with something simple. The markupless flavor of SRT would be such a format. However, supporting the formatting tags in later flavors of SRT is a can of worms." [full thread: Captions, Subtitles, and the Video Element]
- Robert O'Rourke writes, "Are there any plans to bring list headers from HTML3 into HTML5?" Ian Hickson replies, "You can do this in HTML5, using <figure> and <legend>." The thread continues in a number of directions. Marcus Ernst writes, "Anyway I would consider it even more appropriate to allow the list inside a paragraph," to which Ian replies, "We had this in the spec originally, but we dropped it due to a variety of issues (it made life harder for editors, it didn't work in text/html even when it looked like it did, people got confused...)." [full thread: List Headers]
- Drew Wilson writes, "There's currently no way to set or get cookies from workers, which makes various types of cookie-based operations problematic." Jonas Sicking replies, "Allowing cookie to be set would unfortunately create a synchronous communication channel between the worker and the main window." The discussion continues, focusing on issues of multi-threaded updates to document.cookie. Drew Wilson again: "Following up on this. I created two pages, one that tests cookies in a loop, and one that sets cookies in a loop, and ran them in separate windows in Firefox 3, IE7, and Chrome. Chrome and IE7 currently allow concurrent modification of document.cookies (i.e. the test loop throws up an alert). Firefox does not." [full thread: Accessing cookies from workers]
- There have been a number of overlapping discussions on whether and how to allow authors to embed RDFa in HTML 5 documents. See 1, 2, 3, and followups. Besides the technical arguments about how it would work, much of the discussion centers around the concept of distributed extensibility, which I've touched on before. For example, here is Chris Wilson (of the Microsoft IE development team): "We have had (in the past as well, imo, in the future) a requirement for decentralized extensibility - that is, that document/content authors can extend the set of elements with their own semantic or behavioral elements. I continue to think there is a requirement for that. (One might well ask why we didn't implement full XML in that case; I'll politely not answer from a historical context, but will point out that the draconian error handling and poor fallback story make delivering content in XML in the browser a poor solution in the ecosystem today.)"
- Steven Faulkner writes, "I have ... taken a stab at a RFC 2119 compatible definition for table summaries: http://esw.w3.org/topic/HTML/SummaryForTABLE/SummarySpecification." [full thread: Draft text for summary attribute definition, continued in March archives]
There has also been a vigorous debate about the license of the specification itself.
- Sam Ruby writes, "In my discussions with Ian and at Mozilla, I gathered that it was a shared understanding that by October that the license for the W3C license would be somehow open source friendly, and specifically that a Creative Commons Attribution license was something that was of common and general interest." The "open source friendly" clause is a reference to the fact that the spec does actually contain some code (in the form of WebIDL declarations), and vendors of open source browsers would like to include this code (or derive code from it) into their products.
- After much discussion, Philippe Le Hegaret (of the W3C) writes, "In response to requests from developers to make it easier to include portions of W3C specifications in software documentation, bug reports, code, and test cases, W3C have drafted a new Excerpt & Citation License. ... Uses like forking of a specification would remain prohibited to protect the due process and the consensus found in a chartered Working Group."
- Ian Hickson immediately replies, "Increasing license proliferation is a really bad idea here. I would be opposed to introducing yet another license. ... [The forking] use case is the main one that I'm concerned about, FWIW."
- Jonas Sicking explains his reasoning about allowing forking: "I think it would gain W3C a tremendous amount of trust if it were to allow [forking]. To many people, me included, having the gurentee that W3C can't go off 'into the weeds' means that I have don't have to worry about my time being wasted when I contribute. I think many people feel the same when they contribute to the forkable software I represent."
- Philippe notes that the W3C "isn't used to the concept of allowing a fork" of their specifications, which is one of the requirements of any "open source friendly" license.
- I believe Maciej Stachowiak (WebKit developer at Apple) best summarized the group's objections: "1) Preventing specification forks is not achievable through license terms; a sufficiently motivated party can create a new spec from scratch. 2) Preventing specification forks is probably not necessary; the one time it happened, the outcome was good and the effort merged back into a realigned W3C. 3) Due to 1 and 2, we should give more consideration to LGPL/GPL compatibility than prevention of forks via licensing terms."
- Ian Hickson agrees: "I do agree that the original use cases are (intentionally and explicitly) not all met by the proposal that was put forward, and I do think the original use cases were an accurate portrayal of the use cases that this working group has consensus on. Compatibility with open source (including GPL and LGPL projects), clear license terms (ideally reusing an existing license), and the ability to fork are all issues that working group members discussed and considered important previously."
[Further reading: Discussions with plh, Draft W3C Excerpt License]
Tune in next week for another exciting episode of "This Week in HTML 5."
This Week in HTML 5 – Episode 26
April 1st, 2009 · No Comments
Welcome back to "This Week in HTML 5," where I'll try to summarize the major activity in the ongoing standards process in the WHATWG and W3C HTML Working Group.
The big news for the week of March 16th is this announcement from Ian Hickson:
I've now split out the Server-sent Events and Storage APIs out of HTML5, and I've removed the text for Web Sockets, which was split out earlier. By popular demand I've also done some tweaks to the styling of these specs.
- HTML5
- http://dev.w3.org/html5/spec/
- Server-Sent Events
- http://dev.w3.org/html5/eventsource/
- Web Storage
- http://dev.w3.org/html5/webstorage/
- Web Workers
- http://dev.w3.org/html5/workers/
- Web Sockets
- http://dev.w3.org/html5/websockets/
- http://tools.ietf.org/html/draft-hixie-thewebsocketprotocol
It is my understanding that the desire is to publish the Server-Sent Events, Web Storage, Web Workers, and Web Sockets specs through the Web Apps working group, so that is what I put into the "status of this document" sections.
I would like to be able to put more permissive licenses (ideally MIT) on these drafts, rather than the W3C license.
The following sections still haven't been split out:
- URLs
- I'll remove this section as soon as DanC's draft is published.
- Content-Type sniffing
- I'll remove this section once Adam's draft is on a standards track.
- Timeout API
- This section is lacking an active editor.
- Origin
- I'm unsure what will happen with this section.
In IRC, Ian explained that all of these documents are generated from one master file:
# [21:02] <hixie> the source document is run through a bunch of scripts to generate the output documents
# [21:03] <hixie> from that one file i now generate one whatwg spec, four w3c specs, and an rfc
In other news, r2876 (WARNING: VERY LARGE) adds user stylesheets to the HTML 5 specification itself. If you view it in a browser that supports switching stylesheets (such as Firefox, under the View → Page Style submenu), you can choose between "Complete specification" (default), "Author documentation only," or "Highlight implementation requirements." The "Author documentation only" stylesheet hides all of the client parsing algorithms and focuses on the elements, attributes, and scripting features that web authors need to know about.
For example, the "author documentation" of the <img> element highlights the required attributes, how to create a new Image() dynamically, and the detailed requirements for providing alternate text, while completely hiding any mention of how image fetching fits into the client's task queue, the gory details of how clients resolve image URLs, or the security risks of allowing pages on the public internet to attempt to load images on the local network. On the flip side, "highlight implementation requirements" highlights these exact issues.
Critics who complained that the HTML 5 specification should be "just a markup language" will be able to have their cake and eat it too. Those who complained that HTML 5 was "too bloated" will have a little less to complain about now that several parts of it have been published as separate documents. On the other hand, critics who complained about these things as a cover for other agendas will have to continue complaining a little while longer.
Tune in next week for another exciting episode of "This Week in HTML 5."