A Hack's Guide To Unit Testing Generated HTML

UPDATE: Refactored and tweaked the validation method below by popular request!

I know this might make some people groan - but that's OK - I'm used to it. As Eilon Lipton always tells me:

You're a PM dude - you're not supposed to code

Which is true! And to keep the PM as Haack tradition alive, here is my latest attempt to completely devastate my reputation as a coder. Have at me, I love it.

Let's Put an X in Front Of HTML Too
The world's gone X-Crazy (XBox, OS X, ASPX pages, ActiveX...) and it seems that nothing's cool anymore unless there's an X associated with it. Maybe I can regain some of the rep I'm about to lose by changing my name to "RobX".

XRob?

Testing For Compliance
One of the main things that I get hammered for (with respect to the MVC Toolkit) is the lack of XHTML Compliance. I tried to pay as much attention as I could to it but... well, some things just slip through. I read somewhere that my slip-ups (particularly the method attribute on the form tag not being in quotes) were

...yet again evidence that Microsoft could care less [sic] about the HTML spec

On the contrary it was me, being lame.

I know what you're thinking - "can't you test for that somehow?" and up until now it meant copy/pasting a whole mess of HTML up to w3.org to run through the validator. But as of today I decided to let my Mr. Hack take over and created some code to ping w3.org automatically.

The Code
For my compliance Unit Tests (you wouldn't want to do this for every test, for obvious reasons) I'm creating a Select box, comme ci:

    [TestMethod]
    public void Select_BindToIntegerArray() {
        int[] numbers = { 1, 2, 3 };
        string select = SelectBuilder.Select("test", numbers, "", "", 0, false, null, null);

        //validate it (this makes a network call to w3.org)
        Assert.IsTrue(XHTMLValidator.IsValidXhtml(select));
    }

And I'm calling on my new hacked-up wunderclass - the XHTMLValidator. (UPDATE: Duncan Smart (?) refactored this function yet again - thanks!) Here's the code - have at me and make it hurt:

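    // Requires: using System.Collections.Specialized; using System.Net; using System.Text;
    public static bool IsValidXhtml(string htmlFragment) {

        // Form fields posted to the validator
        NameValueCollection values = new NameValueCollection();
        values["fragment"] = htmlFragment;
        values["prefill"] = "1";
        values["prefill_doctype"] = "xhtml10";

        // UploadValues POSTs the collection as URL-encoded form data;
        // the using block disposes the WebClient when we're done
        using (WebClient webClient = new WebClient()) {
            string postResult = Encoding.UTF8.GetString(
                webClient.UploadValues("http://validator.w3.org/check", values)
            );

            //lame check - but it works
            return postResult.Contains("Congratulations");
        }
    }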

Yes, I know. But you know what - it's an automatic way to make sure that my tags are compliant :).
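For the cases where you don't want to hit w3.org at all, here is a rough sketch of a cheaper local pre-check. It is not a replacement for the validator: it only tests whether a fragment is well-formed XML, which is enough to catch things like the unquoted form method attribute mentioned above. The class name `XhtmlPrecheck` and the method `IsWellFormed` are made up for this illustration and are not part of the XHTMLValidator. A fragment using HTML entities such as `&nbsp;` would fail here, since no DTD is loaded.

```csharp
using System;
using System.IO;
using System.Xml;

public static class XhtmlPrecheck
{
    // Hypothetical helper (an assumption for illustration, not the post's code).
    // Returns true when the fragment parses as well-formed XML.
    public static bool IsWellFormed(string htmlFragment)
    {
        var settings = new XmlReaderSettings
        {
            // Fragment mode allows more than one top-level element.
            ConformanceLevel = ConformanceLevel.Fragment
        };

        try
        {
            using (var reader = XmlReader.Create(new StringReader(htmlFragment), settings))
            {
                // Walk every node; the reader throws on the first syntax error.
                while (reader.Read()) { }
            }
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }
}
```

With this, `XhtmlPrecheck.IsWellFormed("<form method=post></form>")` comes back false because the attribute value is not quoted, while the quoted version passes. A test could run this first and only call the remote validator when the fragment gets through.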

If You're Still Here...
I'm also going to use the HTML Agility Pack that's up on CodePlex to make sure all the other bits that are supposed to be present in the generated HTML are indeed there. This is a really cool project for checking out your HTML - from their site:

This is an agile HTML parser that builds a read/write DOM and supports plain XPATH or XSLT (you actually don't HAVE to understand XPATH nor XSLT to use it, don't worry...). It is a .NET code library that allows you to parse "out of the web" HTML files. The parser is very tolerant with "real world" malformed HTML. The object model is very similar to what proposes System.Xml, but for HTML documents (or streams).
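As a rough idea of what such a presence check might look like, here is a minimal sketch using the HTML Agility Pack. The helper name `CountOptions` and the sample markup in the usage note are assumptions for illustration; the real output of SelectBuilder.Select may differ. It sticks to counting elements, since how the parser treats the text inside option tags can vary between versions of the library.

```csharp
using System;
using HtmlAgilityPack; // external library: the HTML Agility Pack

public static class SelectMarkupChecks
{
    // Hypothetical helper: counts <option> elements that are direct
    // children of the <select> with the given name attribute.
    public static int CountOptions(string html, string selectName)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var options = doc.DocumentNode.SelectNodes(
            "//select[@name='" + selectName + "']/option");

        return options == null ? 0 : options.Count;
    }
}
```

In the integer array test above, something like `Assert.AreEqual(3, SelectMarkupChecks.CountOptions(select, "test"));` would then confirm that every number produced an option, assuming the builder emits markup along the lines of `<select name="test"><option value="1">1</option>...</select>`.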

Hope someone finds this helpful...

Duncan Smart says:
Wednesday, January 30, 2008
Rob, you've specified that the data is "x-www-form-urlencoded" - but your data hasn't been URL-encoded at all.

Lance Fisher says:
Wednesday, January 30, 2008
I was thinking about trying to do something similar to validate my RSS feeds with feedvalidator.org, but I shied away from making calls to a web service in my unit tests. Don't you think that it could become a problem? It's too bad there aren't any html/rss/etc validator libraries (like .dlls) you could just add into your unit tests. Or are there?

David Fauber says:
Wednesday, January 30, 2008
Great article. I've just written some controls where my primary dissatisfaction has been that the HTML-generating methods are extremely difficult to write meaningful tests for. Not sure I'm going to go the exact same route, but it gives me some ideas and it's nice to see I'm not the only one dealing with this.

Ryan Smith says:
Wednesday, January 30, 2008
Rob, With respect to X (in this case XHTML) being a fad, I think the real issue is that unless your pages are coded in an XHTML transitional manner, IE will render them in quirks mode. Thus you end up with a web site that looks OK in one browser but completely messed up in the others. I think that's the reason why you get hammered for not validating as XHTML. Thanks for the post though. I can see this coming in handy in the near future.

Rob Conery says:
Wednesday, January 30, 2008
@Ryan: not poking any fun at XHTML :) - I absolutely understand the need for valid HTML and only wish I could have paid more attention.

Joe Chung says:
Wednesday, January 30, 2008
Your code looked fine except that you don't need to instantiate ASCIIEncoding (System.Text.Encoding.ASCII) and you should close and dispose your streams and dispose your stream reader either explicitly or via using statements. Also, like Duncan said, you should URL encode your data with HttpUtility.UrlEncode. It's a shame that we can't use the XhtmlConformance Web.config setting, but it only applies to server control output (and not Literal- or Label-encoded HTML either).

Shawn Oster says:
Wednesday, January 30, 2008
As much as one might cringe at making web requests in a unit test I definitely think this is the lesser of two evils. Sure, web requests make the tests a little brittle but at least you're testing the XHTML and that pretty much rocks. Nothing hacky about your code either, I think it's pretty damn clean and makes the best of a tough situation (that being the lack of a validator class library). I'm with Joe, those streams need to be disposed or be in a using statement; it's never the best idea to trust the garbage collector to clean up your resources for you (memory yes, streams and handles, no). You should update the code on this page with the revised goodness once you've made the small tweaks so it can live on in glory for the cut-n-pasters.

adminjew says:
Wednesday, January 30, 2008
SubSonic's default replacement letter for reserved types is... guess what, it's X.

Rob Conery says:
Wednesday, January 30, 2008
@adminjew: "That was Eric" @Shawn and others - thanks for the tips... I knew better :). Refactored and updated.

Ryan Lanciaux says:
Wednesday, January 30, 2008
Rob, Thank you! This will definitely come in handy -- and save some time!

josh says:
Thursday, January 31, 2008
How about Rob "ConX" Conery? .. hopefully not xcon -jx

Mike Minutillo says:
Thursday, January 31, 2008
Hey RobX, This is cool. @Lance Fisher - Would you use this as a "Unit Test" or a higher-level one? I would want this one pre-commit but not on every unit-test run I think. - Mike

Ben says:
Thursday, January 31, 2008
If you're going to be using this throughout your test suite then it'd be worth installing the validator on a local server. http://validator.w3.org/docs/install.html (There's a windows install guide as well http://validator.w3.org/docs/install_win.html)

Josh Stodola says:
Thursday, January 31, 2008
I think this is a slight indication as to the cause of legendary problems with the relationship between standards-compliance and Microsoft. Managers don't take the standards seriously enough, and here is raw proof of it. Rob, I know you understand the need for valid XHTML. So, why is your stuff not validating? I don't think it has anything to do with you being lame. Things slip through because your focus is elsewhere (as determined by your brain's order of importance). I am not sure that you should have a higher focus on standards (perhaps managers really do have bigger fish to fry), but at least have some XHTML-savvy sap validate your code before anybody else has the chance to give you shit about it. Anyways, the real reason I commented was to point out that there is already an API for this: http://blog.madskristensen.dk/post/Using-the-W3C-HTML-Validator-API.aspx Best regards...

Duncan Smart says:
Thursday, January 31, 2008
Sorry, but couldn't resist (did anyone say "fizz-buzz"?). WebClient is good for simple HTTP stuff like this:

    public static bool IsValidXhtml(string htmlFragment) {
        NameValueCollection values = new NameValueCollection();
        values["fragment"] = htmlFragment;
        values["prefill"] = "1";
        values["prefill_doctype"] = "xhtml10";

        WebClient webClient = new WebClient();
        string postResult = Encoding.UTF8.GetString(webClient.UploadValues("http://validator.w3.org/check", values));

        //lame check - but it works
        bool isValid = postResult.Contains("Congratulations");
        return isValid;
    }

Mike says:
Thursday, January 31, 2008
Rob, You are doing an excellent job with this blog, I am so glad you have joined the MS Team. Keep up the great work!

Lance Fisher says:
Thursday, January 31, 2008
@Josh, The fact that PMs are writing unit tests to validate XHTML indicates to me that they do care about standards. This is really cool. @Mike, Ideally, I would like to be able to validate the output in my unit tests, but I would not want to hit the w3c server on every unit test. So with this solution, which I really like in a lot of ways, I would rather just use it on a pre-commit like you suggest. However, there might be another way. I found this great article: http://www.thejoyofcode.com/Validator_Module.aspx They wrote up an HttpModule that validates the output of any aspx page and appends the results to the rendered page. The way they check the validation is by loading the DTD from the W3C and caching it. They use an XmlReader to read the document which will then throw errors if the page doesn't validate. So aside from using the whole HttpModule, I think it should be possible to validate the XHTML with a local DTD. Now whether this covers everything the W3C's validator covers, I don't know, but it would cover quite a bit and you wouldn't have to do a call over the network.

Rob Conery says:
Thursday, January 31, 2008
@Josh - thanks for the link, I was looking all over for this :). >>I think this is a slight indication as to the cause of legendary problems with the relationship between standards-compliance and Microsoft<< Hardly. We issued a CTP and I created that toolkit in 10 days because we needed it. I went through everything to try and be sure that it was all compliant and I missed 4 attributes out of 323 :p. >>Things slip through because your focus is elsewhere (as determined by your brain's order of importance).<< I'll remind you that ScottGu is the General Manager of our unit. He owns just about everything developer-related. Scott created MVC... Seriously though - PMs in Microsoft have a massive degree of freedom and we're expected to stay close to the code - not sit in a chair and push Gantt Charts. I wouldn't work there if that was the case. In truth I should have had a testing suite prepared but at the time I had never had to Unit Test HTML - I mean how do you do that properly? I had 10 days and so I relied on some old-fashioned testing. Not that it's an excuse, but honestly this is just a Preview and it never would go out the door (even as Alpha) without full testing (which is why I'm doing this now).

Josh Stodola says:
Thursday, January 31, 2008
I didn't realize the time frame at all. Regardless, I think I should have recognized that this is indeed a step in the right direction. A couple of years ago, I doubt anybody would have even considered validating the output. With that said, I apologize for criticizing an obviously respectable move. I don't believe standards-compliance is a religious priority for Microsoft yet, but it is definitely reassuring to see there is some transcendence. Keep up the great work, guys!

Josh Stodola says:
Thursday, January 31, 2008
By the way, Rob (feel free to delete this comment upon reading), I am not sure if you noticed this bug in your blog. When it animates the comment all AJAXY style (it fades in), the fonts end up being all wretched looking (not smooth). I came across this about a month ago, so if you care to fix it, you can: http://mattberseth.com/blog/2007/12/ie7_cleartype_dximagetransform.html

Rob Conery says:
Thursday, January 31, 2008
@Josh - I'm looking to move off this theme when I get a moment. Probably wait for WP 3 to come round maybe? But thanks for letting me know. No apologies necessary :) - but I do want people to know that MS is now thinking about a lot of this stuff due to some input from the new hires. Phil is pushing the TDD thing harder than ever - it's good stuff! And it's true - ScottGu codes a lot. And thank goodness for it...

Josh Stodola says:
Thursday, January 31, 2008
Wow, ScottGu codes? As if I didn't have enough respect for the guy already! Is he from the future? Thanks for the quick reply, it really is good to see you guys are thinking about this stuff. I'm pretty excited about it!

Pete Hurst says:
Saturday, February 02, 2008
So, what happens if the test content you're validating contains "Congratulations" buried in some *in*valid markup? Your user registration page will fail the unit test :)

Troy DeMonbreun says:
Monday, February 04, 2008
Rob, "PM as Haack tradition" - was that a Freudian Slip? ;-)

Rob Conery says:
Monday, February 11, 2008
@Pete: the idea here is that this is a Unit Test and hopefully the HTML you pass off to the testing bits won't trip you up. But yah - that's a bit of a snag :). I tried to find another way but you get back a massive blob of text and nothing really indicates a pass/fail that I could see. I know it's not optimal, but I tell ya, it caught some pretty crazy errors!

Igor says:
Sunday, March 09, 2008
Rob, Which part of your linq query is "predictive"?


About Me

Hi! My name is Rob Conery and I work at Microsoft on the ASP.NET team. I am the Creator of SubSonic and was the Chief Architect of the Commerce Starter Kit (a free, Open Source eCommerce platform for .NET)

I live in Kauai, HI with my family, and when my clients aren't looking, I sometimes write things on my blog (giving away secrets of incalculable value).