Why movies look weird at 48fps, and games are better at 60fps, and the uncanny valley…
Damn you, Peter Jackson!
Let’s end this debate once and for all. Humans can see frame rates greater than 24fps (although plenty of people will argue that they can’t on the internet). I’ll explain more in a future post if necessary, but let’s take that as read.
Once you’ve accepted that fact, the next question is: why do movies at 48fps look “videoy”, while movies at 24fps look “dreamy” and “cinematic”? Why are games more realistic at 60Hz than at 30Hz?
The answer to all of this lies in two things – ocular microtremor, and center-surround receptive fields in the retina. And it predicts where the cut-off lies as well.
Holy oscillating oculomotors, Batman!
You might not know this, but your eyes are wobbling all the time, like a hummingbird on methamphetamines. They just plain jiggle in their sockets. It’s a surprise that you can see anything at all, in fact.
The question is why?
You may already know that you can only see an area of sharp focus roughly the size of a silver dollar held out at arm’s length. This is the part of your retina called the fovea, which is the nice, sharp, color-responsive part of your retina. Your brain stitches together information from this peephole into a version of the world that you actually see. It’s densely packed with color-receptive cells called cones.
Here, go read this Wikipedia article if you need to catch up on your retina knowledge. I’ll wait.
According to this paper (Physical limits of acuity and hyperacuity, Wilson S. Geisler, U Texas) from 1983, the physical limit of acuity for your eye is 6 arcseconds when looking at two parallel thin lines that are really close together (also known as vernier acuity).
Now there’s a formula which tells you the minimum you can possibly distinguish between two lines, with a camera of a given aperture, and it’s called the Rayleigh criterion. (Rayleigh was a pretty smart physicist, who liked to play with waves).
On that page I just linked, there’s a formula which tells you the best you should be able to hope for, for a human eye, under optimal circumstances:
θ = 1.22 λ/D ≈ 1.22×10⁻⁴ rad
… which is 25.16 arcseconds.
Yeah. So that’s a lot more than 6 arcseconds.
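If you want to see where that number comes from, here’s a quick sanity check of the Rayleigh formula – assuming green light (~550nm) and a well-dilated ~5.5mm pupil, both assumptions on my part:

```python
import math

wavelength = 550e-9   # green light, ~550nm (assumed)
pupil = 5.5e-3        # pupil diameter, ~5.5mm (assumed, well dilated)

theta_rad = 1.22 * wavelength / pupil          # Rayleigh criterion
theta_arcsec = math.degrees(theta_rad) * 3600  # radians -> arcseconds

print(f"{theta_rad:.3e} rad = {theta_arcsec:.2f} arcseconds")
# ~25 arcseconds -- a far cry from the 6 arcsecond vernier acuity limit
```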
What’s more, cones themselves are 30-60 arcseconds across – between 5 and 10 times the size of the smallest gap you can see.
So that’s theoretically impossible… Or it would be if your eye was just a simple camera. But it’s not. Your retina is actually a CPU all by itself, and does a lot of processing for you. It also has some pretty specialized elements – like the design of the cones themselves.
Let’s look at a cone…
Cones are highly specialized light-receptor cells that have evolved over a very long time to gather as much data as possible (in the form of light). They’re not just simple pixel-readers, though – they behave directionally, preferring light that hits them head-on. This is known as the Stiles-Crawford effect.
The shape of the top of a cone cell is why they’re called cones, and the Stiles-Crawford effect is why they’re cone-shaped. If you can discard light that’s coming off-axis, then you can better determine details – possibly even discriminating diffracted images and making them less fuzzy.
If you look at the picture, the tip of the cone is about 1/3rd the diameter of the cone. So we can take our 30-60 arcsecond measurement and divide it by 3 to get the actual fine-detail receptive field of the cone – give or take.
But now we have gaps in the image. If the sensors are more pin-prick like, how can they discriminate edges that are about the same width as the sensor itself?
The final piece of this puzzle is that the pattern of cones on your retina is not a fixed sensor; the sensor moves.
Ocular microtremor is a phenomenon where the muscles in your eye gently vibrate a tiny amount at roughly 83.68Hz (on average, for most people). (Dominant Frequency Content of Ocular Microtremor From Normal Subjects, 1999, Bolger, Bojanic, Sheahan, Coakley & Malone, Vision Research). It actually ranges from 70-103Hz.
No-one knows quite why your eye does this. (But I think I’ve figured it out).
If your eyes wobble at a known period, they can oscillate so that the light hitting the cones wanders across the cones themselves. (Each cone is 0.5-4.0µm across, and the wobble is approximately 1 to 3 photoreceptor widths – 150-2500nm – although it’s not precise.) We can use temporal sampling, with a bit of post-processing, to generate a higher-resolution result than you’d get from a single, fixed cone. What’s more, eyes are biological systems; something has to compensate for the fact that the little sack of jelly in your eye wobbles whenever you move it anyway – so why not use the extra data for something?
Tasty, tasty jelly.
So here’s the hypothesis. The ocular microtremors wiggle the retina, allowing it to sample at approximately 2x the resolution of the sensors. What do we have in the retina that could do this processing though?
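To make the idea concrete, here’s a toy 1-D sketch (entirely illustrative – the scene, sensor size, and shift amounts are made up): a coarse “sensor” samples a fine-grained scene twice, nudged by half a sensor-width between exposures, and interleaving the two exposures doubles the sample density:

```python
import numpy as np

rng = np.random.default_rng(0)
scene = rng.random(200)          # a fine-grained 1-D "world"

def expose(offset):
    # a coarse sensor: each "cone" averages 2 adjacent fine samples
    return scene[offset:offset + 198].reshape(-1, 2).mean(axis=1)

a = expose(0)   # 99 coarse samples
b = expose(1)   # same sensor, nudged half a cone-width (the "tremor")

# knowing the shift, interleave the exposures: 2x the sample density
recovered = np.empty(a.size + b.size)
recovered[0::2] = a
recovered[1::2] = b
print(a.size, recovered.size)   # 99 vs. 198 samples of the same scene
```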
Dolby 8.1 Center-Surround… er… Receptors
The receptive field of a sensory neuron is split into the center and the surround. It works like this:
…. and it’s really great for edge detection, which looks like this if you simulate it:
The cool thing is, this means that if you wobble the image, center-surround and off-center/surround cells will fire as they cross edges in the image. This gives you a nice pulse train that can be integrated along with the oscillation control signal, to extract a signal with 2x the resolution or more.
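Here’s a minimal simulation of that edge-detection behavior – a hand-rolled 1-D “on-center” kernel (the weights are illustrative, not physiological) convolved with a step edge:

```python
import numpy as np

# An "on-center" receptive field: positive center, inhibitory surround.
kernel = np.array([-0.25, -0.25, 1.0, -0.25, -0.25])

signal = np.array([0.0] * 6 + [1.0] * 6)         # a step edge
response = np.convolve(signal, kernel, mode="valid")
print(response)
# flat regions cancel to zero; the cell fires a biphasic pulse at the edge
```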
Bonus round: The Uncanny Valley
Nature likes to re-use components, and the center-surround feature of neurons is no exception. I like to think that this is the cause of the Uncanny Valley phenomenon, where the closer to “real” you look without being 100% on the money, the more disconcerting it feels.
Here’s an example from Wired magazine:
This is a big problem for videogames, because it makes getting to photorealistic human characters really difficult. Climbing out of that valley is, in fact, a total bitch. We’ll get there eventually though – but there’s a lot of subconscious details that we need to figure out to get there. (Which are hard to identify because their processing mostly happens at a pre-verbal, subconscious level in your brain).
Wait a minute. That curve looks a lot like something you might see with a center-surround receptive field. Which looks like this:
Specifically, it’s what you might get if you combine a linear trend line (from less-real to more-real) with a center-surround response in some fashion.
Nature LOVES to reuse building blocks. So it’s quite possible that this response-curve is part of the mechanism that the brain uses to discriminate things – or at least go from gross-feature comparison to high-detail comparison.
Imagine it like this: you’ve got a bunch of cells building up a signal which says “hey, this might be a human!”. That signal grows until more specialized feature-detection mechanisms kick in, and say “er, not quite” on top of that original signal. Eventually they say “Yep, that’s it!”, but in the mean time, thanks to the center-surround behavior collating the signals from lots of different gross-feature recognizers, it barks really loudly when you’re in the zone where that cell clicks on, but before you get it right.
So maybe our “this is an X” mechanism works – at the final recognition stages – via center-surround receptive fields.
Anyway, this is a bit off topic.
Side Effects of Ocular Microtremor, and frame rate
Let’s assume that if what you’re seeing is continuously changing and noisy (like real life), your brain can pick the sparse signal out of the data very effectively. It can supersample (as we talked about above), and derive twice the data from it. In fact, the signal has to be noisy for the best results – we know that from a phenomenon known as Stochastic Resonance.
What’s more, if we accept that an oscillation of 83.68Hz allows us to perceive double the resolution, what happens if you show someone pictures that vary (like a movie, or a videogame) at less than half the rate of the oscillation?
We’re no longer receiving a signal that changes fast enough to allow the super-sampling operation to happen. So we’re throwing away a lot of perceived-motion data, and a lot of detail as well.
If it’s updating higher than half the rate of oscillation? As the eye wobbles around, it’ll sample more details, and can use that information to build up a better picture of the world. Even better if we’ve got a bit of film-grain noise in there (preferably via temporal anti-aliasing) to fill in the gaps.
It just so happens that half of 83.68Hz is about 41Hz. So if you’re going to have high-resolution pulled properly out of an image, that image needs to be noisy (like film-grain) and update at > 41Hz. Like, say, The Hobbit. Or any twitch-shooter.
Less than that? Say, 24fps? Or 30fps for a game? You’re below the limit. Your eye will sample the same image twice, and won’t be able to pull out any extra spatial information from the oscillation. Everything will appear a little dreamier, and lower resolution. (Or at least, you’ll be limited to the resolution of the media that is displaying the image, rather than some theoretical stochastic limit).
Some readers of this article have suggested that this is all an artifact of motion-blur – double the frame rate, half the motion-blur, and you naturally get twice the sharpness.
It may play a part – though I’m not sure it plays a large one. For The Hobbit, the shutter was set to 1/64th of a second; for regular movies, the shutter exposes for 1/48th of a second. That’s not a halving – half the motion blur of 24fps film would require an exposure time of 1/96th of a second. So I suspect that motion blur isn’t the whole story here.
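For reference, the shutter arithmetic works like this (exposure = shutter angle / 360 / frame rate, the standard film convention):

```python
def exposure_time(fps, shutter_angle_deg):
    # standard film convention: the shutter is open for
    # (angle / 360) of each frame's duration
    return (shutter_angle_deg / 360.0) / fps

print(round(1 / exposure_time(24, 180)))  # 48 -> 1/48s: standard 24fps film
print(round(1 / exposure_time(48, 270)))  # 64 -> 1/64s: The Hobbit's settings
print(round(1 / exposure_time(48, 180)))  # 96 -> 1/96s: true "half the blur"
```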
The supersampling phenomenon has a name
It ends up that there’s an entire field of study in computational optics dedicated to the up-rezzing of images known as Super Resolution. It lets you take multiple images which would normally look like the one on the left of the image below, and turn them into the image on the right:
(image from the Wikipedia article linked above)
I suspect that ocular microtremors are part of the mechanism that the brain uses to do something similar. If you’re looking at frames of video, it’ll only be able to do its job if you have noise in the signal. Fortunately, most movies still do have random, Poisson-distributed noise – in the form of film grain. (Again, this plays back into that whole Stochastic Resonance phenomenon).
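A tiny sketch of why that grain helps rather than hurts: if each frame carries the same scene plus independent noise, averaging across frames converges on the scene. (The signal, noise level, and frame count here are all arbitrary – this shows only the noise-averaging ingredient of multi-frame Super Resolution.)

```python
import numpy as np

rng = np.random.default_rng(1)
truth = np.sin(np.linspace(0, 4 * np.pi, 400))   # the "real" scene

def noisy_frame():
    # same scene every frame, but fresh, independent "grain"
    return truth + rng.normal(0.0, 0.5, truth.size)

one_frame = noisy_frame()
stack = np.mean([noisy_frame() for _ in range(16)], axis=0)

err_one = np.abs(one_frame - truth).mean()
err_stack = np.abs(stack - truth).mean()
print(f"single frame error: {err_one:.3f}, 16-frame stack: {err_stack:.3f}")
# averaging 16 frames of independent grain cuts the noise by ~4x (sqrt(16))
```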
What’s the upshot of all this?
At 48Hz, you’re going to pull more detail out of the scene than at 24Hz, both in terms of motion and spatial detail. It’s going to be more than the 2x information you’d expect just from doubling the spatial frequency, because you’re also going to get motion information integrated into the signal alongside the spatial information. This is why, for whip-pans and scenes with lots of motion, you’re going to get much better results with an audience at faster frame rates.
Unfortunately, you’re also going to get the audience extracting much more detail out of that scene than at 24Hz – which makes it all look fake (because they can see that, well, the set is a set), and it’ll look video-y instead of dreamy, because of the extra motion extraction that can be done when your signal changes at 40Hz and above.
The short version is, to be “cinematic”, you really need to be well under 41Hz, and above the rate where motion becomes jerky – also known as the phi phenomenon or “apparent motion”—which is ~16Hz, so that the motion looks like motion.
Ah, you might be thinking… but video is 29.97Hz (for NTSC). Why does it look video-y?
Video isn’t really 29.97Hz…
It’s actually 59.94Hz for broadcast video. It’s just interlaced, so that you only show half of the lines from each frame, every 1/60th of a second. They don’t do this:
Snapshot –> Display Odd Lines –> Display Even Lines
… they do this:
Snapshot –> Display Odd Lines –> Snapshot –> Display Even Lines
… which is a whole different beast. (They may not even snapshot at all, depending on the camera; they may just sample the entire line as they shift it out really really fast from the CCD… so it becomes continuous – even though that may lead to rolling problems due to pixel persistence).
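A sketch of the two schemes, to make the difference concrete (the function names and timing model are mine, not anything standard):

```python
# Toy model of the two capture schemes described above.
FIELD_RATE = 60000 / 1001   # NTSC: ~59.94 fields per second

def segmented(frames):
    # Snapshot -> display odd lines -> display even lines
    # (both fields come from the SAME snapshot)
    for t, snap in enumerate(frames):
        yield (2 * t / FIELD_RATE, snap, "odd")
        yield ((2 * t + 1) / FIELD_RATE, snap, "even")

def true_interlace(snapshots):
    # Snapshot -> odd lines -> NEW snapshot -> even lines
    # (every field comes from a FRESH snapshot)
    for t, snap in enumerate(snapshots):
        yield (t / FIELD_RATE, snap, "odd" if t % 2 == 0 else "even")

for t, snap, field in true_interlace(["A", "B", "C", "D"]):
    print(f"t={t:.4f}s  snapshot {snap}  {field} lines")
# true interlacing samples the world ~59.94 times a second;
# the segmented scheme only ~29.97 times a second
```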
In other words, thanks to interlacing, broadcast video is above the ocular microtremor sampling Nyquist frequency.
This is going to be trickier for games, because unlike film – which has nice grain, at least 4K of resolution (in reality it’s something like 6000 ‘p’ horizontally for 35mm film and 12000 ‘p’ for IMAX), and no “pixels” per se thanks to the film grain (although digital has meant we need to recreate some of this) – we’re dealing with a medium that is resolution-limited (most games are 1920x1080 or lower). So we can’t get around our limitations in the same way. You can see our pixels. They’re bigger. And they’re laid out in a regular grid.
So if you really want the best results, you need to do your games at 12000x6750. Especially if someone’s borrowing an IMAX theatre to play them in.
Let’s get real.
Higher resolution vs. frame rate is always going to be a tradeoff. That said, if you can do >~38-43 fps, with good simulated grain, temporal antialiasing or jitter, you’re going to get better results than without. Otherwise jaggies are going to be even more visible, because they’re always the same and in the same place for over half of the ocular microtremor period. You’ll be seeing the pixel grid more than its contents. The eye can’t temporally alias across this gap – because the image doesn’t change frequently enough.
Sure, you can change things up – add a simple noise/film grain at lower frame rates to mask the jaggies – but you may get better results in some circumstances at > 43fps with 720p than at 30fps with 1080p with jittering or temporal antialiasing – although past a certain point, the extra resolution should paper over the cracks a bit. At least, that is, as long as you’re dealing with scenes with a lot of motion – if you’re showing mostly static scenes, have a fixed camera, or are in 2D? Use more pixels.
If you can use a higher resolution and a faster frame rate, obviously, you should go for that.
So my advice is:
- Aim for a frame rate > ~43Hz. On modern TV sets, this means pretty much 60Hz.
- Add temporal antialiasing, jitter or noise/film grain to mask over things and allow for more detail extraction. As long as you’re actually changing your sampling pattern per pixel, and simulating real noise – not just applying white noise – you should get better results.
- If you can still afford it, go for higher resolution.
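On that second point: one common way of generating a changing per-pixel sampling pattern (used by many temporal-antialiasing implementations, though engines vary) is a Halton low-discrepancy sequence for the per-frame sub-pixel jitter. A minimal sketch:

```python
def halton(index, base):
    # low-discrepancy sequence, commonly used for per-frame TAA jitter
    result, f = 0.0, 1.0
    while index > 0:
        f /= base
        result += f * (index % base)
        index //= base
    return result

# sub-pixel jitter offsets for 8 consecutive frames, in [-0.5, 0.5)
jitter = [(halton(i, 2) - 0.5, halton(i, 3) - 0.5) for i in range(1, 9)]
for dx, dy in jitter:
    print(f"{dx:+.3f}, {dy:+.3f}")
```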
As a bonus, at higher frame rates you can respond more quickly to the action in the game – which is essential for twitch games, where responding to the game matters. This is mostly a side effect of lower end-to-end latency (game loops are generally locked to how fast they can present a new frame, and it’s rare for input to be decoupled from this – so a faster frame rate means lower input lag). It may also be due to being able to see changes in the game more quickly as well – after all, it’s updating twice as fast at 60Hz. If the ocular microtremors play a part, that mechanism may also allow better motion extraction from the cones.
Of course, realistically, the proof of the pudding is in the eating. The only true test is to experiment, and get a variety of people to do an AB comparison. If more people prefer one than the other, and you have a large enough sample size? Go with the one more people like.
Backing it all up with some evidence
So it looks like this post went a wee bit viral.
In response, I guess I need to back up a few of my more tenuous science claims – hey, this is a blog post, ok? I didn’t submit it for publication in a journal, and I figured the rigorous approach would be an instant turn-off for most. Still, enough people have questioned the basis for this – so I’m going to at least bolster up the basic idea (ocular microtremor + stochastic resonance are used for hyperacuity).
So I did some digging around, and here we go – have a paper from actual scientists who did actual experiments (hey! it looks like other people got here before me… but I don’t think anyone’s tied it all back to its interaction with frame-rate before):
[PDF] Stochastic resonance in visual cortical neurons: does the eye-tremor actually improve visual acuity? – Hennig, Kerscher, Funke, Wörgötter; Neurocomputing Vol 44-46, June 2002, p.115-120
Abstract: We demonstrate with electrophysiological recordings that visual cortical cell responses to moving stimuli with very small amplitudes can be enhanced by adding a small amount of noise to the motion pattern of the stimulus. This situation mimics the micro-movements of the eye during fixation and shows that these movements could enhance the performance of the cells. In a biophysically realistic model we show in addition, that micro-movements can be used to enhance the visual resolution of the cortical cells by means of spatiotemporal integration. This mechanism could partly underlie the hyperacuity properties of the visual system.
“The stimulus used in the simulations is a typical vernier stimulus, which consists of two adjoining bars with a small relative displacement of d = 7.5”. The displacement is smaller than the distance between two photoreceptors (30”) and thus cannot be resolved. Hyperacuity, though, allows the detection of displacements in the order of 4” to 10”, which so far has been attributed to the spatial sampling of the ganglion cells […] we investigated the role of micro-movements on the resolution of vernier stimuli. […] It proved to be noise in the amplitude and frequency range of the ocular microtremor that shows a strong effect on acuity”
“[…] microtremor has an even stronger impact. As its amplitude increases, the discriminability reaches much higher values for low amplitudes.”
Given that this was a study performed on cat retinas, you might be wondering whether or not this still applies. Well, it ends up that cats have these things called X- and Y- cells, which are retinal ganglion cells which show (for X) “brisk-sustained” and (for Y) “brisk-transient” response.
[PDF] Sustained and transient neurones in the cat’s retina and lateral geniculate nucleus – Cleland, Dubin, Levick; J. Physiol Sep 1971; 217(2); 473-496
I won’t list all of the papers on this, but if you do a search on “cat retina X Y ganglion”, you can find a bunch of them.
These are the cells which do this kind of processing. There wasn’t much proof for them existing in humans until 2007 – here’s the layman’s summary from Science Daily from that discovery. They used a rather neat looking sensor to do the work – pretty much jamming it right up against the retina to see how it ticked:
(Half Life 3 confirmed)
This thing is the size of a pinhead, and contains 512 electrodes, and was used to record the activity of 250 retinal cells simultaneously – of which 5-10 were the data-processing upsilon cells.
And here’s the paper:
[HTML] Identification and Characterization of a Y-Like Primate Retinal Ganglion Cell Type – Petrusca, Grivich, Sher, Field, Gauthier, Greschner, Shlens, Chichilnisky, Litke; J. Neuroscience, October 2007, 27(41): 11019-11027
If you base your conclusion on the first paper I linked to, it’s likely that the Y-Cells are at least in part responsible for this response. Could be the midget cells though; I’m not sure.
Edit: 12/22 – made some edits to clean up a few sentences that were way too woolly, and added a little new material, including links to Super Resolution. I didn’t expect this post to blow up like this – normally my blog posts don’t make a splash like this. Thanks for popping by!
Edit: 12/23 – moar papers from real, actual researchers to back it up
Some of this post is speculation – at least until experiments are performed on this. It may actually be real new science by the end of the day. I’d love to hear from any actual professionals in the field who’ve done research in this area.
Simon Cooke is an occasional video game developer, ex-freelance journalist, screenwriter, film-maker, musician, and software engineer in Seattle, WA.
The views posted on this blog are his and his alone, and have no relation to anything he's working on, his employer, or anything else and are not an official statement of any kind by them (and barely even one by him most of the time).
Interesting, but I’m fairly certain this is actually why The Hobbit looks weird: http://blogs.valvesoftware.com/abrash/why-virtual-isnt-real-to-your-brain-judder/
The artifacts most see in The Hobbit appear to be an exact replica of what Michael Abrash found in a kind of “anti sweetspot” between not having enough FPS and not having enough motion blur.
The post also seems to misunderstand The Uncanny Valley, which while shown to be a real effect, applies only to humanlike objects. Our brain is genetically programmed to read human faces quite well, so imperfections are noticeable there. But again, that literally only applies to human faces, the uncanny valley effect does not exist nor apply to anything else.
“We’re no longer receiving a signal that changes fast enough to allow the super-sampling operation to happen” - this would imply that we would perceive all still photographs as having lower definition than moving pictures. Subjectively, that doesn’t seem to be true for me. Moreover, I don’t see any reason why a slowly-changing image would defeat the supersampling - after all, the effect relies on our eyeballs moving, not the subject. You just seem to be asserting that any image not changing faster than 41Hz (and stills would meet that) defeats super-sampling without explaining how.
After a short Twitter exchange with Simon, I think I now have a slightly better idea of what he’s discussing. I got confused because I understood microtremors as being a means by which our visual systems can work around some of its optical limitations. So if you’re looking at some very high-detail source image (e.g., reality, or a very high quality photo print) the microtremors enable you to perceive more detail than the basic optical constraints would allow with a simple single exposure. In this view, all the detail is ‘out there’ and microtremors enable the eye to get more of that detail.
That’s the scenario to which microtremors are adapted.
But the effect Simon discusses is a little different, and amounts to microtremors accidentally working well in a completely different scenario. In this case, the source material is somewhat deficient - any single frame of a film provides less detail than we are able to perceive, because it’s a noisy medium. However, the noise is different in each separate frame, meaning that in principle, it is possible to reconstruct a higher-quality image out of several frames each showing the same scene. And I think Simon’s point is that the mechanism our eyes use to extract more of the detail that’s “out there” when looking at something real also happens to enable us to perceive enhanced detail when looking at a noisy source, but that this only works if the noise is changing fast enough.
With 24fps, the frame-specific noise remains in place for too long, meaning that the microtremors successfully pick up the relatively low quality of each individual frame - this means we see the blur that’s really there, not the detail that could be inferred across multiple frames with varying noise. But with higher frame rates, we see a different frame at each microtremor, with different noise, and our visual systems are able to work with that to enable us to perceive more detail.
I was originally thrown by the “We’re no longer receiving a signal that changes fast enough to allow the super-sampling operation to happen” line because a changing signal is not something the microtremor mechanism inherently requires, so this seemed like a weird statement. The part that wasn’t clear is that if individual frames in the source image are noisy, sufficiently fast changes to that noise happen to enable the microtremors to ‘work’, even though the scenario for which microtremors are adapted (resolving a ‘perfect’ external image such as the one presented by reality) is quite different from the scenario at hand (inferring extra detail in a still scene from a series of images, each of which is deficient in slightly different ways).
Interesting though this is, I’m still not convinced that this is the primary factor in producing the “filmy” quality of films. If you run a film through a motion smoothing system (like a lot of modern TVs have) to up the frame rate to 100Hz, a film looks like video, losing its film-like feel, even though the film grain noise is still at a much lower frequency. I always turn motion smoothing off on my TV when watching a film for this very reason. The difference is night and day - watch a DVD or BluRay of a film without motion smoothing, and it looks like a film; watch the exact same source material with motion smoothing and it looks like video. This has led me to the provisional conclusion that the distinctive character of the motion artifacts at 24fps is the most significant contributor to a film-like quality.
24fps (or 25fps if you’re working on TV in PAL regions; 30fps elsewhere) certainly hides a multitude of sins. One of the distracting features of watching motion-smoothed film footage is that any slightly inept camera handling suddenly becomes a lot more visible. The massive amount of blur that 24fps effectively adds to anything that moves across the frame can mask a lot of unevenness. (That said, if you’re producing at 48fps, you’d just need to raise your game there; this particular problem only occurs when the film was shot on the assumption that it would be watched at 24fps.)
What he’s saying is that at slower frame rates the eye observes the same frame for the duration of its wobble, while at higher frame rates (like The Hobbit, and reality) it’s able to integrate across frames to make the wobble more effective. (There are actually cameras that move their sensors around to increase their resolution, and I imagine it’s in the same spirit.) These things would still be tied to resolution - as in, you’d just be viewing the same resolution with more perceptual clarity and maybe better edge resolution - if it weren’t for having the grain at the higher frame rate, which makes film resolution a more ambiguous and temporal thing that the eye is integrating over; and according to this article, half an oscillation just happens to be about 41Hz. As for photographs, it’s not that the eye sees them at a lower resolution (photographs are still at the frame rate of reality) - it’s that we see them as they are, however they’ve been printed out, in the perceptual resolution we see reality. (If they have noise in them, or poor resolution, or artifacts, we’ll be able to see those things.)
The judder he’s referring to mostly has to do with rendering on an HMD, not traditional movies captured on camera. The artifacts he’s referring to are due to the way the eye follows things while viewing an HMD. If your eye is locked on something being rendered by an HMD, then as you move your head the HMD thinks it should be creating motion blur - but relative to your eye, the thing you’re looking at isn’t moving, and the motion blur makes it look weird and blurry when it shouldn’t. HMD rendering accounts for this by having no motion blur, high frame rates, and low pixel persistence.
Movies are different - they’re fixed on a wall, not rendered based on your head and eye movements. I wouldn’t be surprised, though, if The Hobbit, with all its digital work and compositing, had lots of incorrect motion blur. (CG elements often don’t match the motion blur of the actual camera footage, since most of the time their motion blur is just a hack in post… probably part of the reason why a lot of animation looks so fake.)
I’ve done a few tests re-timing footage both from a 24fps source up to 48fps, and from a 48fps source back down to 24fps (all 2K footage). In both instances I found that including extra motion blur in the footage helped make the scene feel more cinematic. If you remove it, it does start to get really video-y looking.
But the way the footage is lit also makes a big difference. If there is a lot of detail and high-contrast lighting without a lot of grain, it also makes it look like video. It seems that reducing the contrast and having a bit of a flatter look, while adding post motion blur, creates some very easy-to-watch 48fps footage. As for high res: I’ve also done some testing with some 6K footage from a Red Dragon camera that I’ve received, and again, if it’s lit flatter (or pushed that way in post) it still looks good at 4K playback - but that’s only at 24fps. I haven’t yet been able to play back 48fps at 4K on my 4K display, since it’s not a valid UHD format… I’ll have to wait until I can get a new video card and play back at 60fps through HDMI 2.0 (if the computer can handle it, that is…)
The frequency aspect - better fidelity at 2x the sampling rate or higher - is a well-known phenomenon in DSP (Digital Signal Processing), formalized by a fellow named Nyquist. The math shows that an original analog signal (i.e. real life) can be digitized and reconverted back into the analog domain perfectly, as long as the original signal does not contain any frequency components above 1/2 the sampling rate. If it does, then the output contains harmonic artefacts known as aliasing.
I’ve also known about this for long enough that I forget the original papers on saccadic motion and its effect on image resolution.
The psychological effects are more recent - really only deeply investigated once animation techniques started hitting the valley. The original Shrek movie ran into that; the animators had to tone down the realism to get out of that discomfort zone.
No, it applies to literally everything. If you’d actually read the thing you’d know its endemic to all displays.
Cinematographers actually have to keep it in mind when deciding how a camera moves. So, that’s that then.
I find all this scientific mumbo-jumbo about the eye very interesting. On the conceptual side, here is a relevant old article that I find compelling. It argues for high frame rates in 3d movies (but not in 2d movies): http://www.shutterangle.com/2012/why-48-fps-is-good-for-3d-movies/ It actually puts 3d movies and games in the same category, opposed to 2d movies; the idea being that 3D is already a step towards reality, and if you take this step you should go all the way.
if you search google for “Mechanical Resonant Frequency of the Human Eye in Vivo” there is a paper from nasa with some interesting information about how the eye responds to vibration and the frequency it resonates at!
pretty interesting. i think maybe it corresponds to what you’re talking about? it says “when the head was vibrated, with either displacement or acceleration held constant, the curves of visual acuity are U-shaped, with the most severe attenuation of acuity in the range of 20 to 40 Hz.”
doesn’t matter now, but with wearable displays…maybe they could improve the resolution or perceived resolution by vibrating. :)
So basically, 83.68Hz interlaced would be the best compromise? (41.84Hz per field.) Or, if we cheat: you say the minimum is 70Hz, so a user could set the system to 35Hz per field. Hmm! I don’t see any way other than interlacing to achieve 70Hz and up. Yes, a few displays could do it, but with the bandwidth required, the best bet would be to use the minimum the eyes can accept (probably 35.001Hz per field, for 70Hz). As for this: Snapshot –> Display Odd Lines –> Display Even Lines - I’m not sure that can even exist in an interlaced environment!
My back-to-basics question is as follows: Before we discuss moving and changing images, what is the relationship between all these eye motions and visual acuity when viewing static images?
@David Warman mentioned saccadic motion. Pointers include http://en.wikipedia.org/wiki/Saccade , http://www.ncbi.nlm.nih.gov/books/NBK10991/ , and http://www.scholarpedia.org/article/Human_saccadic_eye_movements Some of these refer to an idea that their purpose is to reduce short term burn-in desensitization of sensor cells due to continuous exposure to the exact same stimulus. The magnitude of the ocular microtremors discussed here seems to be much smaller than the smallest saccadic movements mentioned elsewhere.
The science is entertaining, but essentially, it comes down to learned behavior. The “film look” is something we’ve been taught to expect - and revere. The typical reaction of most humans to change is: that’s wrong. Acceptance is a process.
Sure, you can add artificial motion blur to HFR movies to appease the audience. But the preferred thing would be to make ‘em suffer through it and accept that HFR is just objectively better. 24p is a historical limitation, nothing more.
Most new TV sets are getting people accustomed to high frame rates with their interpolation bs anyway.
If the brain reuses pathways like you suggest in regards to the uncanny valley, I wonder if perhaps that could explain the “sense of wrongness” we get occasionally when we suddenly wake up before a house fire or some other crisis. It could be that the stimuli matches just far enough to read “80% OK, but something’s not matching, so this hits the uncanny valley”– and in fact, it makes me wonder if we don’t have psyche reactions to this in all sorts of other areas where the brain reaction is rather sharp and negative.
It seems to me that the Nyquist argument is backwards. The Nyquist theorem relates a continuous oscillation to the minimum number of discrete samples needed to represent it (the sampling frequency), and states that the sampling frequency must be at least twice the oscillation frequency.
In the present case, we have a continuous oscillation of the eyeballs matched to a discrete sampling of the motion (the frame rate), so it would seem that the Nyquist theorem would require the frame rate to be twice the micro-tremor frequency, rather than half.
Another way to put it is that in order for the micro-tremor to do its job effectively, it needs reliable and consistent data at both extremes of the oscillation, i.e. twice per oscillation, at ~170Hz.
Due to a lot of spam (and manual moderation being a pain), I’m closing Wordpress comments on the blog. Facebook comments should still work.