For the layman/novice flashaholic:
There are a bunch of different variables that play into "brightness", but keeping it in layman's terms, you can boil it down to two things: the light's beam profile and percentages.
The beam profile (how the light looks on a white wall) can make a light seem much brighter or dimmer depending on how it's arranged. A good way to visualize this is a garden hose: let's say you turn the spigot on to 25%, and holding the hose in your hand you just let the water fall freely from the nozzle. Then you put your thumb over the nozzle - now the pressurized water streams far out and away rapidly. In which instance was there a greater amount of water? The answer, of course, is that the amount/flow of water was the same in both cases; the spigot was always at 25%, and you merely changed how the water came out of the nozzle. Conceptually, this works with light too: you can let the light spread out broadly as a close-range flood, or you can compact it into a far-throwing narrow stream - the stream can seem brighter to the eye simply because it reaches farther and/or has a more intense beam profile.
This matters in flashlights because a light that's a "thrower" will almost always seem brighter than a "flooder" of equal output. A good example: take a Maglite that's perfectly focused for a nice, intense hotspot. Note how bright it looks on a white wall, then remove the head from the Mag entirely and shine it at the wall again - now the wall looks far dimmer. In which configuration did the Mag output more lumens? Neither - the bulb put out the exact same amount of light in both tests.
So now that we know the eye can be easily fooled just by how a flashlight distributes its light, we need a way to gauge *total output*, not just the output in one small area. That test is called the ceiling bounce.
When you shine a flashlight at the ceiling of a darkened room, the room as you see it is now lit only by the *total output* of that light - you've removed the element of beam profile and can now see, at least roughly, how much light is being emitted. The test goes something like this: standing in a pitch-black room with the two flashlights you want to compare, shine the first light at the ceiling - hold it so that you can't see the end of the flashlight itself or its beam profile; pointing it up next to your ear works nicely. Now you're seeing the room lit by the total output of that light. Next, close your eyes, turn off or cover light one, switch to light two, and open your eyes - is the room brighter or dimmer? The answer reveals which light puts out more *lumens*, regardless of *throw*. (This method works quickly and decisively when the two lights differ by more than about 20%; below that, you may need to study the room for a full minute or so before switching lights to catch the small discrepancies.)
And speaking of percentages, they're something you have to take into account when looking at lumen numbers. Let's say you're outdoors on a moonless night and you turn on a 180-lumen light; it will appear very bright and you'll be able to light your way easily. Then you increase your output to 220 lumens: that's just a little bit brighter, enough that you notice a marginal difference. But here's the kicker - in the same situation, suppose you instead have only a 1-lumen keychain light; to fully dark-adapted eyes, that's actually "about right" for getting around, and you can navigate fine. Then you increase your output to 2 lumens, and WOW, that's much brighter, what a difference! Why is 1 vs. 2 such a profound difference when 180 vs. 220 is not? Percentages - 220 is only about 22% brighter than 180, so it's a small relative jump. But 2 is 100% more than 1, so there's literally twice as much light. Tiny absolute differences make big perceived differences at the low end of the scale, but barely register at the upper end.
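If it helps to see that arithmetic spelled out, here's a quick back-of-the-envelope sketch in Python (the helper name percent_increase is just mine, and the numbers are simply the examples above) showing why one jump feels small and the other feels huge:

```python
# Relative (percentage) increase between two output levels.
# Numbers are the examples from the paragraph above, nothing official.

def percent_increase(old_lumens, new_lumens):
    """Return how much brighter new_lumens is than old_lumens, in percent."""
    return (new_lumens - old_lumens) / old_lumens * 100

# 180 lm -> 220 lm: a modest relative jump
print(f"180 -> 220 lm: +{percent_increase(180, 220):.0f}%")  # ~ +22%

# 1 lm -> 2 lm: a doubling, which the eye notices immediately
print(f"  1 ->   2 lm: +{percent_increase(1, 2):.0f}%")      # +100%
```

Running it prints roughly +22% for the 180-to-220 jump and +100% for the 1-to-2 jump, which lines up with what your eyes tell you outdoors.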