I have read a lot of websites and watched a lot of YouTube videos talking about this topic: how much light your camera can gather, the famous detector size paradigm, and the aperture crop factor. None of them seemed clear to me. I found the various explanations quite fuzzy because they do not take the problem from its origin. Often these explanations go into various small technical considerations without first treating the big paradigm, which can be explained simply without any notion of technology or optics.
In what I am writing below, many details and technical points are missing, but these details bring only small variations to the whole paradigm. I also mostly consider perfect optics and so on; these details are not relevant here.
And to put that away: all that I explain below has little to do with the quality of a photograph. Photography is something else. But good news: understanding how things work does not alter the art and creation process.
I will avoid talking here about crop factor, f-number, pixel density, etc. After all, they are only features.
[ISO and exposure time are not introduced here because they are not needed. The ISO amplification comes after the light collection (see below). You can consider all the exposure times equal.]
I will talk about a 'system', which is composed of a lens and a detector. I am going to write obvious stuff, but I invite you to follow it to realize how simple the paradigm actually is. There is no need to involve complicated notions.
The function of the lens is to collect the light and project an image onto the detector. The combination of one property of the lens, the focal length, and one property of the detector, its size, makes what we call the field of view (fov): the portion of the world we are imaging.
A lens has another important property: its diameter. This is the part that lets the light come through. On camera lenses, one can virtually change the diameter of the light-collecting surface, the hole, with an iris diaphragm. It is easy to note that changing the collecting surface area (the iris size) changes the amount of light entering your camera and therefore changes the amount of light registered on the detector. This is quite obvious when you think about it: open your curtains completely and you will have more light in your bedroom than if you open them half way.
Something less obvious, but very easy to verify, is that changing the iris size does not change the fov.
So, let us consider two systems (100% efficient) with the same field of view, the same portion of the world being photographed. It does not matter how you achieve that; there are several ways to do it, and one is to change the detector size. So, same fov: the light is coming from the same 'slice' of the world. The amount of light you are gathering depends on the collecting surface area of your lens (the iris size on a camera). Therefore, two systems with the same field of view can only gather the same amount of light if they have the same entrance collecting surface area.
If that is clear, you can deduce the rest. There is no need here to introduce any kind of pixel size, density, etc. The number of pixels mostly affects the size of the smallest details in your picture. Other little differences are negligible in front of the size paradigm.
One can understand that a 'small' system will never be able to collect as much light as a 'big' system.
The f-number is useful for the photographer because, whatever the focal length of the lens, this number gives an idea of the camera settings for the picture (more info below). It is defined as: f-number = focal-length / iris-diameter
Therefore, let us put some numbers on what we have deduced above. Let us again consider two systems, one with a 50mm f/2 lens and one with a 100mm f/2 lens.
Both systems have the same field of view. You can achieve that in several ways: one is to change the detector size, another is to put more optical elements between the lens and an equally sized detector (e.g. a tele-converter on the 50mm or a 'speed booster' on the 100mm). It does not matter how you achieve it; what matters is that you are imaging the same thing, the same fov, with the two systems. Given the formula above, the 50mm f/2 lens has an iris diameter of 50/2 = 25mm and the 100mm f/2 lens has a diameter of 100/2 = 50mm. Therefore the collecting surface (proportional to the square of the diameter) is 4 times larger on the system with the 100mm f/2. And as we have seen above, this bigger system will gather 4 times more light than the small one.
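To make the arithmetic concrete, here is a minimal sketch in Python of the iris diameter and collecting area comparison (the function names are made up for illustration):

```python
import math

def iris_diameter_mm(focal_length_mm, f_number):
    """Iris (entrance pupil) diameter, from f-number = focal / diameter."""
    return focal_length_mm / f_number

def collecting_area_mm2(diameter_mm):
    """Area of the (assumed circular) collecting surface."""
    return math.pi * (diameter_mm / 2) ** 2

d50 = iris_diameter_mm(50, 2)     # 25 mm
d100 = iris_diameter_mm(100, 2)   # 50 mm
ratio = collecting_area_mm2(d100) / collecting_area_mm2(d50)
print(ratio)  # ~4: the 100mm f/2 gathers 4 times more light
```

Note that pi cancels out in the ratio, so only the squared diameters matter: (50/25)² = 4.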
People often talk about t-stop instead of f-stop. You can see the t-stop as the true efficiency of the lens at a given aperture: it takes into account the glass transmittance and the fact that the diaphragm is not a perfect circle. The f-stop is the theoretical value.
In science, when faced with a problem, it is often useful to extrapolate the principles to the extremes, see if they still make sense, and figure out the implications of your claims.
Let us imagine that you or I claim that both systems mentioned above can gather the same quantity of light. There is a priori nothing to stop you from making two other, more extreme systems, like a 600mm f/2 and an 8mm f/2 with the same field of view. Claiming that both systems can gather the same quantity of light is claiming that a tiny 4mm hole can gather as much light as a 300mm hole!
Imagine the implications of that claim. Among others, it would mean that you have completely solved the energy problem on our planet, because you could collect the same quantity of light (energy) with a tiny device as you would with a big one! No more large fields of solar panels. For most astronomical applications (apart from high angular resolution) big telescopes would not be needed anymore; little ones would do the trick! Why should the army bother with big parabolic antennas when they could shrink the receiver... the examples are endless. Yes, if you found a way to do that you could become a multi-billionaire and probably win the next Nobel prize in physics!
To give a more complete idea we need to talk about the cases where two systems give different fields of view. So let us take
A/ a 50mm f/2 and
B/ a 100mm f/2
both mounted on the same body, with the same detector size. They have two different fov.
To understand, I like to take two special cases and generalize from them afterwards.
First, imagine that you want to measure the light gathered from one star in a night sky.
A star viewed from your camera is a 'point source'; it is not 'resolved'. This means that it is impossible to distinguish the light coming from one part of the star or another. This has nothing to do with your pixel resolution but is a property of light: what limits you is the diameter of your lens. It also means that the spot left by the star on your detector is no longer connected to the size of the object but to the size of the entrance pupil (iris) and the wavelength of light you are looking at. The smaller the iris, the bigger (and dimmer) the spot. This limitation of nature is called the diffraction limit.
Anyway, the spot of the star on your detector is most likely concentrated on one pixel (star trails and camera vibration apart) on both the A) and B) systems. Since we are considering only one star here, the fov does not matter; all the rest of the image that is not 'the star' is wasted. From what we have seen, the amount of the star's light registered on your detector depends on the collecting surface area of the lens; therefore with the 100mm f/2 (d=50mm) the star will be 4 times brighter than with the 50mm f/2 (d=25mm).
Now, second case: you are not interested in the light of one single star but in the total amount of star light, the sum of all the stars in your field of view. With the 50mm f/2, the field of view will be (about) twice as large in width and height as with the 100mm f/2. Therefore there will be ~4 times more stars in the 50mm picture than in the 100mm picture, each star being 4 times less bright with the 50mm f/2 than with the 100mm f/2. So you got it: the total amount of star light in both the 50mm f/2 and 100mm f/2 pictures is the same. (Of course this assumes a uniform distribution of stars in the field, all with equal magnitude.)
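The bookkeeping of this star-field example can be sketched in a toy model (assuming a uniform star field and the same detector size on both systems; the function names and the density unit are made up for illustration):

```python
def per_star_flux(focal_mm, f_number):
    # Light from one unresolved star scales with the collecting area,
    # i.e. with the square of the iris diameter (focal / f-number).
    return (focal_mm / f_number) ** 2

def stars_in_fov(focal_mm, density=1.0):
    # Same detector: the fov width and height scale as 1/focal, so the
    # number of stars scales as 1/focal**2 (uniform star field assumed).
    return density / focal_mm ** 2

total_50 = per_star_flux(50, 2) * stars_in_fov(50)
total_100 = per_star_flux(100, 2) * stars_in_fov(100)
print(total_50, total_100)  # equal: same total star light in both pictures
```

The per-star factor of 4 and the star-count factor of 1/4 cancel exactly, which is the whole point of the f-number.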
On a 'normal' (not boring) picture it is the same, except that there are no stars but many more 'point sources', which can be considered as the smallest elementary things emitting or reflecting light that your camera can resolve. The bricks of light that make your picture.
All this to explain why the f-number is used and useful.
After all this you could tell me: so what? Why does the quantity of light matter anyway, when you can just brighten the picture?
The short answer is: because most of the noise in your picture is directly related to the amount of light you gather! Even with the most perfect detector in the world, nature limits the 'noise quality' of your image. To fight against it you need more time or more collecting surface. Again, let us consider perfect detectors in what follows.
Before going further, I note that talking about noise by itself is irrelevant here; what is relevant is how the quantity of noise compares to the quantity of light (the signal). For this we use a quantity: the Signal to Noise Ratio (SNR).
And, by the way, noise is a random alteration of a signal, which is a bit different from the everyday use of the word. For instance the 'noise of a car' is not 'noise' in physics but a parasitic 'signal'.
The SNR depends on the quantity of light you can gather, and I'll tell you why with an analogy. Don't try to extrapolate the analogy itself too much; it is limited, but I think it is useful to understand where the noise comes from.
Imagine you have a pool [a pixel] and you want to use it to measure the rain drop rate. You can count the number of drops falling into the pool in a given lapse of time [exposure time]. If you had plenty of time, you would spend an hour counting the number of drops [photons], then divide it by 3600 (the number of seconds in one hour) and would be able to say that there are, for instance, 4.23 drops per second.
It is likely that your neighbors, who have the same pool [neighbor pixels], will measure essentially the same thing. Because you got your measurement from a lot of drops [signal], your estimate of how many drops fall per second should be good enough that your neighbors find essentially the same thing as you.
Now imagine that you do not have one hour but only 10 seconds [exposure time] to count the drops. You can feel that you will count close to what you estimated with one hour of time, but water drops are not 'scheduled', so the number of drops that you count in your pool in 10 seconds can differ from the number falling into a neighbor's pool. You will count maybe 44 drops, but one neighbor [neighbor pixel] counted 40, another 49, then 42, 40, 39, 42, ... It is pretty unlikely, however, that anybody will count 3 or 200 drops, for instance.
You can retrieve all the neighbors' counts and classify them by how many found 1, how many found 2, 3, ... 42, 43, 50, ... This is called a distribution. This distribution will peak at something close to the average, 4.23 × 10s = 42.3, and the width, the spread, of this distribution is the noise.
This phenomenon is called shot noise or Poisson noise (or photon noise if we are talking about light). This particular distribution has the characteristic that the spread (the noise) is equal to the square root of the average (the signal). Therefore in our example the signal is ~42.3 and the noise is its square root: 6.5. So the signal to noise ratio is SNR = 42.3/6.5 = 6.5. The SNR thus increases with the square root of the signal.
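A small sketch of these numbers, assuming pure Poisson statistics (the function name is made up):

```python
import math

def shot_noise_snr(photons):
    # Poisson statistics: noise = sqrt(signal), so SNR = sqrt(signal).
    return photons / math.sqrt(photons)

signal = 4.23 * 10   # ~42.3 drops (photons) expected in 10 s
print(shot_noise_snr(signal))   # ~6.5, as in the text
# Quadrupling the collecting area quadruples the signal but only
# doubles the SNR (square root law):
print(shot_noise_snr(4 * signal) / shot_noise_snr(signal))  # ~2
```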
If you are limited in time (10s), the only way you can improve the system is by increasing the total surface of the pools to increase the number of drops (signal) and therefore increase the SNR. It does not matter much whether you choose to put several little pools [pixels] together or one big pool; the only little difference is the negligible space between pools [pixels] where you will not be able to collect drops [photons].
For light noise it is the same thing. Light is transported by particles named photons; however, photons have nothing to do with the image of a bullet or a water drop. It is hard to picture what a photon is, but for our explanation it does not matter. The only thing that matters in this analogy is that they are detected in quanta, like drops in the pool.
We can understand that when taking a picture of a uniform sky, for instance, one pixel will not register the same quantity of photons as the neighbor pixels, simply by the nature of how light is transported. This causes the noise in the image.
ISO does not change anything. ISO is related to a gain, an amplification factor. The relation between ISO and gain varies from one camera to another, notably when the size, but also the efficiency, of the detector varies. Increasing the gain (ISO on one given camera) is equivalent to multiplying your signal by a number to make the image brighter for your eyes. But when you multiply the signal you are also multiplying the noise by the same amount. Therefore the SNR is unchanged. When you increase the ISO it is because you had few photons to deal with in the first place, and this is why the picture looks noisy.
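A toy calculation of why amplification leaves the SNR unchanged (hypothetical numbers, photon noise only):

```python
import math

photons = 100                  # photons collected by one pixel (made-up value)
signal = float(photons)
noise = math.sqrt(photons)     # photon (shot) noise
print(signal / noise)          # SNR = 10.0

gain = 8                       # higher ISO = higher amplification (made-up value)
amplified_signal = gain * signal
amplified_noise = gain * noise  # the noise is amplified just as much
print(amplified_signal / amplified_noise)  # still 10.0: SNR unchanged
```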
Yes, in the real world detectors and electronics are not perfect and bring their part of the noise to the final image SNR. But you do not need them to explain most of the noise in your picture: most pictures are dominated by photon noise. You will tell me: bullshit, detectors are making progress and we can tell the difference between low light pictures from a 10-year-old camera and from a modern camera (size apart). It is true that the noise brought by the detectors has been reduced. But more importantly, the efficiency of the detectors has been improved (more photons are converted to electrons in a modern detector pixel), and the dead area between pixels has been reduced (the fill factor improved). All this means that the collected signal is higher, so... the SNR is higher on modern detectors. Conclusion: at more or less equal technology, the SNR is still driven by the system size.
Apart from optical quality, which is too technical and beyond the scope of this problem, when you have two systems with the same field of view and the same light gathering (you know what that means now), is there an obvious advantage for the system that has the bigger sensor?
One that I can think of, and that can be explained easily, is the dynamic range. Still talking about perfect detectors: each pixel has a given electron storage capacity; above this capacity it is full and cannot register light anymore (there are other ways to saturate, with the amplification, but let us keep it simple). If the capacity is 60 000 electrons on a perfect detector pixel, the dynamic range of one pixel is 60 000: we can record 60 000 levels of information (in reality the detector read noise is involved in the DR computation).
Imagine two systems that gather the same quantity of light from the same field of view, but one has a bigger detector. Both detectors are paved with the same pixels (same capacity, same size); obviously the one with the larger detector will have more pixels. We have imposed that the quantity of light gathered is the same for both systems, so the same quantity of light is spread over fewer pixels on the system with the smaller detector. Since the pixels have the same capacity, you will saturate the small detector before the big detector. So, without introducing any complicated technical inputs, you have a higher global image dynamic range on the bigger detector.
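A back-of-the-envelope sketch with made-up numbers (same total light on both systems, identical pixels, one detector with 4 times more of them):

```python
# Made-up numbers: same total light on both systems, identical pixels.
capacity = 60_000                 # electrons per pixel, as in the text
total_electrons = 2_000_000_000   # same collected light for both systems

pixels_small = 4_000_000          # smaller detector -> fewer pixels
pixels_big = 16_000_000           # twice the linear size -> 4x the pixels

per_pixel_small = total_electrons / pixels_small   # 500 e- per pixel
per_pixel_big = total_electrons / pixels_big       # 125 e- per pixel

# Headroom before saturation, for a uniformly lit scene:
print(capacity / per_pixel_small)  # 120.0: the small detector clips first
print(capacity / per_pixel_big)    # 480.0: the big detector clips 4x later
```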
Changing the pixel size of the large detector does not change this dynamic range difference, because larger pixels have higher capacity.
A collection of common sentences we can maybe answer or contradict from what we have seen above:
"The light gathering with my micro four third (MFT) and a 50mm f/2 is the same than a 100m f/2 on a full frame only the depth of field change because I use exactly the same setting on both camera."
Well, we can explain with what we saw above that this is not true. So why are the settings the same on both cameras? Because the gain (amplification factor) on the MFT detector is higher than on the FF for the same ISO setting, to compensate for its size disadvantage. This is done on purpose so you do not have to think about changing settings when you switch to a camera with a different detector but the same f-number lens.
"This camera detector have better high ISO performance than this one."
"You can increase your exposure time to shoot at lower ISO and therefore to have less noise."
These sentences are kind of right. But they are a bit confusing: they sound as if the ISO amplification is the main culprit for the noise in the image. We have seen that the SNR depends on the quantity of light, on top of the detector noise. When you shoot at high ISO it is because you do not have a lot of light to deal with (dark scene, fast shutter speed, or a less efficient system); this is why your picture is noisier. Actually, the detector read noise becomes lower, compared to the photon noise, when the ISO is increased!
I would say instead :
This camera detector is more light-efficient than that one.
You can increase your exposure time to gather more light and therefore have less relative noise.
"Full frame have better low light performance because they have bigger pixel and can gather more light."
The influence of pixel size exists but is much smaller than the sentence makes you think. The difference between 4 pixels, for instance, and 1 big pixel the size of the 4 is mainly the dead space between the pixels. However, with technology this dead space has been reduced a lot, especially with the addition of micro-lenses in front of each pixel that concentrate the light into the pixel. On a full frame format, doubling the number of pixels changes the total efficient area a bit, but not that much; in other words, the space between pixels is negligible compared to the detector size. It is more of a problem on a smartphone detector, where the detector is much smaller and the space between pixels becomes more influential.
Printing both full frame pictures at the same size, they will have basically the same SNR; however, you can print bigger with more, smaller, pixels.
To be accurate, when putting 4 pixels in place of 1 you are also adding the read noise of the 4 pixels: the read noise will be twice as high with 4 pixels as with one (it goes with the square root of the number of pixels). But this matters only in special cases where you want to measure very faint light; in most cases the photon noise dominates. Also, bigger pixels usually have more read noise, so the difference is not that big.
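The quadrature argument can be put in numbers (the read noise value is made up; photon noise aside):

```python
import math

read_noise_1px = 3.0   # electrons RMS per pixel read, a made-up value

# Independent read noises add in quadrature, i.e. the summed read noise
# grows with the square root of the number of pixels:
read_noise_4px = math.sqrt(4) * read_noise_1px
print(read_noise_4px / read_noise_1px)  # 2.0: twice the read noise

# With, say, 400 photons over the patch, the photon noise is
photon_noise = math.sqrt(400)   # 20 e-, dominating either read noise
print(photon_noise > read_noise_4px)  # True
```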
The sentence was 'more true' 10 years ago, when the space between pixels was larger and the pixel noise higher, but not that much anymore.
I would say:
Full frame cameras have better low light performance because they are usually sold and used with bigger lenses than APS-C or MFT.
Again, for the same fov, APS-C lenses usually have a smaller pupil diameter.
After you have said that, you can talk about the other little differences like pixel size, technology, etc.
After all that is said, and knowing the consequences of all the above, I have myself changed my FF system for a smaller APS-C system, just because I realized that I was carrying my big system less and less. I have however kept big aperture lenses, an 85mm f/1.4 coupled with a speed booster, for special rendering.
Also, all this talk about SNR would not be complete without mentioning noise reduction software. If your goal is to measure the light precisely, there is no such thing: the noise cannot be 'reduced'; that is an alteration of the signal, and if you remove the noise you remove signal. However, in photography you do not care about measuring the light; your eye does not care if one noisy pixel in a blue sky has been replaced by an extrapolation of its neighbor pixels. This is why noise reduction software does a great job of making the picture pleasing.
The funny thing is that in my daily job I am fighting against noise, while in my photography I sometimes add noise to get the mood I want.
Do not hesitate to post constructive comments or questions below.