You have to take into account the test he runs - he shines the light into a milkbox and reads the overall output that way, which means that, depending on the intensity of the hotspot, the output will vary significantly.
The L2's and L4's reflectors are different, to begin with - the L2's reflector looks deeper than the L4's, at least in the shots I've seen on the net, which would give it a tighter and more intense hotspot. The L4's shallower ~20mm reflector provides the proverbial 'wall of light', but it means the hotspot is not as intense as it might otherwise be.
According to Craig's review of the L2 at the LED Museum, the L2 takes about 50 minutes to drop to 50% brightness on high, and 12.5 hours to drop to 50% brightness on low. The main thing with the L2 is that the head is 'dumb', which is why the body is longer - all the electronics are in the body, including the regulator for the low level.
You need to take into account beam characteristics when comparing 'output'. The L5 and L6, for example, have really TIGHT hotspots compared to the L4, which means they're better for longer distances than the L4 is - the brighter hotspot means you can actually illuminate something with an L5 or L6 at range that the L4's dimmer hotspot would not show.
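
To put rough numbers on the hotspot-versus-range point, here's a quick sketch using the inverse-square law (lux on target = hotspot intensity in candela / distance in metres squared). The candela figures are made up purely for illustration and are NOT measurements of any of these lights; the function names are my own, and the 0.25 lux cutoff is just an assumed "still usable" threshold.

```python
def lux_at(candela, distance_m):
    """Illuminance (lux) on a target, from the inverse-square law."""
    return candela / distance_m ** 2


def max_range_m(candela, min_lux=0.25):
    """Distance at which the hotspot falls to min_lux (assumed threshold)."""
    return (candela / min_lux) ** 0.5


# Hypothetical hotspot intensities, for illustration only.
tight_cd = 4000.0   # tight-hotspot light (L5/L6 style beam)
floody_cd = 1000.0  # 'wall of light' beam (L4 style)

for d in (5, 20, 50):
    print(f"{d:>3} m: tight {lux_at(tight_cd, d):7.2f} lux, "
          f"floody {lux_at(floody_cd, d):7.2f} lux")

print(f"tight reaches ~{max_range_m(tight_cd):.0f} m, "
      f"floody ~{max_range_m(floody_cd):.0f} m")
```

With those assumed figures, a hotspot four times as intense only doubles the usable range, since range goes with the square root of intensity. Total output doesn't come into it at all, which is why two lights with similar overall output can behave very differently at distance.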