The rendering process for browsers is very well defined: it follows a rigid, explicit ruleset in which (virtually) every eventuality is accounted for and handled. This is a poor fit for machine learning, which works best when we have a large pool of examples and don't know the ruleset, letting the model figure it out. Even if you were to train a neural network to process that input, there are several things you would have to account for:
1. Variance in data.
Not all web pages are equal in length or complexity, and a neural network asked to generate rendered output directly from raw HTML would produce garbage most of the time.
2. Training time.
The time it would take for a neural network to learn HTML tags, attributes, the DOM tree, and every element (including new ones added every few years), along with how each one renders and behaves, would be extremely long: most likely several years on a fast computer, if it were possible at all.
3. Interactivity.
Web pages aren't just static; they change according to their HTML, CSS, and JavaScript. Not only would your system have to handle the rendering step, it would also have to understand JavaScript, a Turing-complete scripting language, as well as CSS, a simpler stylesheet language that is nonetheless deeply intertwined with HTML. If you thought learning the rendering process was hard, try training a neural network to handle complicated scripting patterns.
4. New standards.
Not all HTML is equal, because of differing standards. WHATWG began working on HTML5 in 2004, and browsers started implementing it not long after. In 2004, there were very few examples of HTML5 sites to train your network on in the first place. Sure, now it's standardized and nearly every website uses it, but what about HTML6? When its first specification is released (probably 2017-2025), virtually no websites will use it, because nothing will support it. Only when it finally becomes standard, probably in the late 2020s or early 2030s, will you have enough data to train your monstrous system of neural networks.
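To make the "rigid ruleset" point concrete, here is a minimal sketch (using Python's standard-library `HTMLParser`, chosen purely for illustration) showing that parsing HTML is deterministic: the same input always produces the same sequence of events, following fixed rules from the HTML standard, with no training data involved.

```python
# Sketch: HTML parsing is rule-based and deterministic, not learned.
# Python's stdlib HTMLParser fires a fixed sequence of events for a
# given input, per the parsing rules of the HTML standard.
from html.parser import HTMLParser


class EventLogger(HTMLParser):
    """Records every parse event so we can inspect the deterministic output."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        if data.strip():
            self.events.append(("text", data.strip()))


parser = EventLogger()
parser.feed("<p>Hello <b>world</b></p>")
print(parser.events)
# Identical input yields identical events on every run:
# [('start', 'p'), ('text', 'Hello'), ('start', 'b'),
#  ('text', 'world'), ('end', 'b'), ('end', 'p')]
```

A neural network, by contrast, would have to approximate this exact mapping from millions of examples, with no guarantee of getting any single page right.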
As for AI in general, one could argue that browsers already use "AI" in their rendering process: they intelligently decide what to render (taking CSS into account) and when, in order to minimize render time; they selectively apply different JavaScript parsers to different sections of code to optimize speed; and the whole system has been tuned against yet another ruleset to make rendering and interacting with a web page as seamless as possible. Your system will never be as good as what hundreds of humans have optimized over twenty years.
Trying to solve HTML rendering with neural networks is akin to trying to drive a nail with a screwdriver. It's just not going to work.
Hope this was helpful!