Discussion about this post

Rob Nelson:

I love it when you write about the academy. Each academic discipline is run by a collection of the smartest people in the room, chosen for lifetime employment by someone one or two degrees of separation from their dissertation advisor. This makes it extraordinarily difficult for academics to "think how flimsily constructed everything is that we know" because, to borrow Upton Sinclair's great line, their salary depends on their not understanding it.

Your insider-out perspective feels more likely to change minds, or at least open them a crack, than people from the humanities (we have our own problems) or, worse, historians, scolding them for not seeing the whole for the parts, for missing the life in lifeless numbers, or for correcting a date of publication, or whatever.

That said, as a historian, I need you and anyone who reads your comments to know that Charles Peirce (he pronounced it "purse" because Boston) published that wonderful essay in The Monist in 1891. Peirce was a weirdo, iconoclast, the son of a famous Harvard professor, an unrepentant asshole who endured an incurable and painful condition called facial neuralgia, and one of the finest metrologists and neologists in history who gave William James some of his best ideas. Peirce was the sort of writer Emerson had in mind when he said "Beware when the great God lets loose a thinker on this planet."

He was also mostly forgotten until John Dewey and a few other admirers gathered what they could find of his essays and got them published in that 1923 volume. This has led to a slow, steady revival of interest in his writing among historians and the occasional philosopher or member of the Santa Fe Institute's faculty.

The extent to which I can follow you into the thickets of probability maths and statistical arcana is due to my attempts to understand words that Peirce put down in his often difficult nineteenth-century prose. Hence, my over-excited response to seeing his name in one of your essays.

Let me leave you with my favorite of Peirce's neologisms: fallibilism. Peirce defined this as “the doctrine that our knowledge is never absolute but always swims, as it were, in a continuum of uncertainty and of indeterminacy.”

Rachel Childers:

It is incredibly validating to see a take on statistics that manages to pass between Scylla and Charybdis: on one side, recognizing that asymptotic limits and bounds can fail to be sharp, or even adequate, descriptions of typical behavior; on the other, accepting just how fragile the tail conditions are on which many of these bounds depend for their validity at all.

Working in a part of ML that asks for finite-sample bounds[1] has been incredibly humbling relative to both the applied person's "inference is a hurdle I must jump through to show others that I was right all along" and the theorist's idyllic holidays in asymptopia.

Even there, many of the introductory treatments essentially say: just assume sub-Gaussian tails, apply something like a Hoeffding inequality, and you get essentially the familiar Central Limit Theorem results without having to go to asymptotics. And in some cases this is fine! But as you move beyond the classification approaches where sub-Gaussianity is enforced by construction, you are forced into the same dilemma. For anything like a reasonable description of even fairly simple regression behavior in a typical case, you need to look beyond what the tail behavior gives you to the bulk of the distribution, to something like a Bernstein inequality, to get an appropriate "fast rates" condition. And in high dimensions none of these will give you a good average-case representation, and you need to run hat in hand to the statistical physicists to steal their random matrix theory (much of which is rigorously known only for Gaussian data, let alone sub-Gaussian).

But then, of course, sub-Gaussianity is in so many applications completely untenable for real data without a priori bounds. In many of these cases the CLT tells us we will still get basically the same thing "eventually," and Berry-Esseen will even tell us that it won't take forever, but in many cases the tail behavior propagates terribly for quite a while. Here, robust statistics in the style of Huber can help accelerate the process, though statisticians have only quite recently developed robust procedures with near-sharp finite-sample guarantees for estimating even the humble mean; see, e.g., median-of-means or Catoni estimators. The multivariate case gets even worse, resulting in procedures at the edge of computational feasibility.
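The contrast above can be made concrete with a minimal stdlib-only sketch: a Hoeffding half-width for bounded data, and a median-of-means estimator for heavy-tailed data where the sample mean is unstable. The function names and the symmetrized-Pareto test distribution are my own illustrative choices, not anything from the post.

```python
import math
import random
import statistics

def hoeffding_width(n, delta, a=0.0, b=1.0):
    """Two-sided Hoeffding half-width for the mean of n i.i.d. samples in [a, b]."""
    return (b - a) * math.sqrt(math.log(2.0 / delta) / (2.0 * n))

def median_of_means(xs, k):
    """Split xs into k equal blocks, average each block, return the median of the block means."""
    m = len(xs) // k
    block_means = [sum(xs[i * m:(i + 1) * m]) / m for i in range(k)]
    return statistics.median(block_means)

# Bounded data in [0, 1]: Hoeffding gives a clean, distribution-free 95% half-width.
print(round(hoeffding_width(n=1000, delta=0.05), 4))  # 0.0429

# Heavy-tailed data: a symmetrized Pareto(1.5) has mean 0 but infinite variance,
# so the plain sample mean converges slowly; median-of-means stays stable because
# the median is insensitive to the few blocks contaminated by extreme draws.
random.seed(0)
xs = [random.choice((-1, 1)) * random.paretovariate(1.5) for _ in range(100_000)]
print(median_of_means(xs, k=50))
```

Note that the Hoeffding width needs only boundedness, while median-of-means needs only a finite mean; that trade-off is exactly the bulk-versus-tails tension described above.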

It's almost as if there's no substitute for putting in the work and thinking hard case by case, and acknowledging how far we might be from getting things right...

Anyway, inspiring post; I would comment on the rest of it, beyond saying that I loved the Misak biography of Ramsey, but the classics part runs into my humbling lack of erudition. My Homer knowledge comes almost exclusively from heavily abridged excerpts and summaries in high school literature class, and from Maya Deane's "Wrath Goddess Sing," which is basically a genderswapped AU extended Agamemnon/transfem Achilles/transmasc Briseis throuple Iliad slashfic and maybe just possibly differs from canon in some details.

[1]: BTW, did you know that finite sample high-probability results for general GMM (a simple but seemingly rarely-done exercise I put in a paper a few years ago) require crazy tail assumptions? You almost certainly did, because you've run the simulations and worked out details for many special cases...
