Letterboxd Is Functioning as the Critic Now, and That Is a Problem
Letterboxd's aggregate star rating is now the first thing most viewers see about a film, and it is quietly replacing the job criticism used to do. An argument against treating aggregation as judgement.
The average Letterboxd rating for Joker: Folie à Deux is 2.1 stars. The average rating for Megalopolis is 2.6. The average for Anyone But You is 2.9. The average for Challengers is 3.8. The average for The Brutalist is 4.2. These numbers are the first information about each film that a substantial fraction of contemporary viewers will encounter, and they are, increasingly, the last information those viewers need before deciding whether the film is worth engaging with at all.
This is a problem. I want to describe it.
What Letterboxd actually is
Letterboxd, the platform, is excellent. Let me get this out of the way first. It is the best film-logging interface currently available. Its community is often insightful. Its feature set (lists, reviews, friend-following, watchlist management) is well-designed. The platform itself is not the problem.
The problem is what the platform’s aggregate ratings are being asked to do. Letterboxd aggregates user ratings into a single per-film star score, displayed prominently on each film’s page. The score is calculated from however many users have rated the film, with no weighting for user expertise, viewing context, or rating consistency. A fourteen-year-old who has seen three films and rated them all five stars contributes the same amount to the aggregate as a professional critic who has seen four thousand films and rates on a calibrated scale.
This aggregation, treated as a score, is functioning as a critical judgement. Viewers consult it the way they used to consult Roger Ebert’s review, or Janet Maslin’s, or any other reasonably trusted professional voice. The aggregate is, in contemporary film culture, replacing the specific practice of reading criticism.
What criticism does that aggregation cannot
Criticism, properly practiced, does specific things that aggregation structurally cannot do.
Criticism makes an argument. A good review does not just place the film on a quality scale. It describes what the film is attempting, whether the attempt succeeds, what the film compares to, what it might mean, what its specific choices (casting, cinematography, structure, tone) produce. The reader of criticism learns not just whether to see the film but how to see the film, what to pay attention to, what the film might be in conversation with.
Criticism rewards specific attention. A critic who has watched two thousand films, rewatched specific films, read specific film history, engaged with specific directors’ prior work, is not casting a simple thumbs-up or thumbs-down. They are bringing specific comparative knowledge to bear on a specific new work. The critic’s authority comes from the specific work they have done.
Criticism acknowledges difficulty. A serious film is often difficult on first viewing. The critic is in a position to say: this film will not fully reveal itself on a first pass, here is what to look for on a second viewing, here is what the director’s previous work suggests about what this film is actually doing. The aggregate score cannot do this. The aggregate score averages first-viewing responses from viewers who have often not read anything about the film beyond its synopsis.
The specific distortions
The aggregate rating produces specific distortions that are by now visible.
Difficult films are undervalued. A film that requires specific prior knowledge, or a second viewing, or patient engagement, will consistently receive lower first-viewing ratings than a film that delivers its rewards in the first hour. Megalopolis is a useful example. Whatever the film’s considerable problems, some of its ambitions are genuinely interesting, and the ambitions require specific cultural and cinematic context to register. The aggregate rating does not reward context. It reduces the film to the average of approximately ten thousand first-viewing Wednesday-night responses.
Populist films are overvalued relative to their actual craft. A well-constructed crowd-pleaser that delivers its genre’s expected pleasures efficiently will receive aggregate ratings well above its actual artistic achievement. Take Anyone But You: Sydney Sweeney and Glen Powell have chemistry; the film around them is competent; the aggregate rating treats the film as better than it is because the viewer had a good time on a Friday night. There is nothing wrong with having a good time on a Friday night. There is something wrong with treating the aggregate of those good times as aesthetic judgement.
Specific demographic biases are reinforced. Letterboxd’s user base skews male, white, and young. The aggregate ratings therefore overweight the preferences of that demographic. Films that specifically appeal to older audiences, women, or non-white viewers are systematically undervalued relative to their actual craft. The aggregate is not neutral. It is the specific preferences of a specific user base, and those preferences are not representative of what criticism, at its best, can identify.
The scale compresses. The effective range of Letterboxd’s five-star scale, for mainstream films, is roughly 2.5 to 4.0. Most films cluster inside that range. The aggregate therefore provides low resolution on precisely the films where resolution would be most useful. A 3.4 film and a 3.7 film are being perceived as meaningfully different, when the actual distance between them is smaller than the noise in the aggregation itself.
Why the aggregate feels more trustworthy than it is
There is a specific epistemic seduction that the aggregate produces. A number derived from thousands of individual ratings feels more reliable than a single critic’s opinion. The logic is: many observations, averaged, should be more accurate than one observation.
This logic is correct for measuring objective quantities (the temperature in Tokyo right now, for instance, is better measured by averaging across many thermometers than by one). It is incorrect for aesthetic judgement. Aesthetic judgement is not an objective quantity. It is a specific practice requiring specific training, specific comparative knowledge, specific attention. Averaging unreliable judgements does not produce reliable judgement. It produces a number with no particular meaning.
The aggregate rating looks like a measurement. It is not a measurement. It is a poll of people who may or may not know what they are talking about, averaged without weighting.
What I am asking for
I am not asking Letterboxd to change its platform. I am asking readers to treat its aggregate scores with the specific epistemic weight they deserve, which is approximately zero.
Read criticism. Read specific critics you have come to trust. Watch films before consulting aggregate scores, where possible, so that your own first-viewing response is not contaminated by the pre-existing aggregate. Rewatch films before concluding you have understood them. Accept that your own single viewing of a film, reduced to a one-to-five-star rating, is not a meaningful contribution to the collective knowledge about that film.
The cost of the aggregate’s dominance is the slow atrophy of the specific practice of reading criticism. That practice is doing work that the aggregate cannot do. If we stop practicing it, we will stop having access to what it produces.
The alternative to the aggregate is not silence. The alternative is specific argued judgement from specific writers who have done the specific work. That is still available. It costs more than glancing at a star rating. It is worth the cost.
Marcus believes good criticism is an argument. He is almost always angry about something, usually for good reason. Horror is his first language.