PaulHoule · 2026-02-23T22:46:08 1771886768

The PDP-10 had an MMU similar to a modern MMU, with page tables and such. The PDP-11 had an 8-segments-of-8 KB MMU like the one in the TRS-80 Color Computer 3, except that the PDP-11 had a real supervisor mode, and if a user-mode program tried to change the MMU configuration it would fault.
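A rough sketch of that 8-segments-of-8 KB idea, in Python, in case it helps to see it spelled out. This is a toy model and not the real PDP-11 register layout: the class and method names are hypothetical, the base is stored as a plain byte address, and length and access-control checks are left out. The top 3 bits of a 16-bit virtual address pick a segment, the low 13 bits are the offset, and only supervisor mode may remap.

```python
SEGMENT_BITS = 13                  # 8 KB per segment
SEGMENT_SIZE = 1 << SEGMENT_BITS
NUM_SEGMENTS = 8                   # 8 x 8 KB covers a 16-bit address space


class MMUFault(Exception):
    """Raised when user-mode code touches the MMU configuration."""


class SegmentedMMU:
    """Hypothetical toy model; not the actual PDP-11 register layout."""

    def __init__(self):
        # Start with an identity mapping: segment i sits at i * 8 KB.
        self.base = [i * SEGMENT_SIZE for i in range(NUM_SEGMENTS)]
        self.supervisor = True

    def set_base(self, segment, physical_base):
        # Only supervisor mode may remap a segment.
        if not self.supervisor:
            raise MMUFault("user mode may not change the MMU configuration")
        self.base[segment] = physical_base

    def translate(self, vaddr):
        vaddr &= 0xFFFF
        segment = vaddr >> SEGMENT_BITS        # top 3 bits select the segment
        offset = vaddr & (SEGMENT_SIZE - 1)    # low 13 bits are the offset
        return self.base[segment] + offset


mmu = SegmentedMMU()
mmu.set_base(1, 0x20000)            # map segment 1 above the 64 KB line
print(hex(mmu.translate(0x2004)))   # prints 0x20004
```

Setting `mmu.supervisor = False` and calling `set_base` again raises `MMUFault`, which is the behavior the CoCo 3 lacked.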

PaulHoule · 2026-02-23T22:22:42 1771885362

Well, it has 256 bytes of RAM, which is basically a really big register file, and everything else goes in the 16 KB of "video RAM", which you can read and write by poking at I/O registers. So it is not easy to program.

It's arguably the only 8-bit computer which has a really different architecture from the others. You could otherwise imagine pulling the SID chip off a C-64 and putting it on a TRS-80 Color Computer etc.

Sharing the main RAM with video was a weak point in computers of that time period because the video system stole many of the memory access cycles. Some recent retrocomputers that revisit that period like

https://www.c64-wiki.com/wiki/Commander_X16

have a full-size memory bank and a video RAM bank that is accessed through a port. That can be pretty efficient because you can auto-increment the address register and just write 1 byte to the port to write 1 byte to video RAM, then repeat.
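Here is a minimal Python sketch of that port-with-auto-increment pattern. It is a toy simulation, not the X16's actual register map: `VideoPort`, `set_address`, `write_data` and the 16 KB size are all made up for illustration. The point is that after setting the address once, each byte costs a single write to the data port.

```python
class VideoPort:
    """Toy model of video RAM reachable only through an address register and a data port."""

    def __init__(self, size=16 * 1024, step=1):
        self.vram = bytearray(size)
        self.addr = 0
        self.step = step  # how far the address register advances after each access

    def set_address(self, addr):
        self.addr = addr % len(self.vram)

    def write_data(self, value):
        # Store one byte, then let the "hardware" bump the address register.
        self.vram[self.addr] = value & 0xFF
        self.addr = (self.addr + self.step) % len(self.vram)


def copy_to_vram(port, start, data):
    # Set the address once; every byte after that is a single port write.
    port.set_address(start)
    for byte in data:
        port.write_data(byte)


port = VideoPort()
copy_to_vram(port, 0x0100, b"HELLO")
print(bytes(port.vram[0x0100:0x0105]))  # prints b'HELLO'
```

Without the auto-increment you would have to rewrite the address register before every data byte, which roughly triples the number of port writes for a 16-bit address.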

PaulHoule · 2026-02-23T21:53:25 1771883605

Two words. ISO 8601.

PaulHoule · 2026-02-23T21:10:38 1771881038

xyz domain, busted UX, I can't see it because xyz is blocked at my workplace. It's the .NPC TLD!

PaulHoule · 2026-02-23T20:33:44 1771878824

Try 5 minutes in blackpill hell and see if you can take it.

I think there is a double standard here. I get several sexually harassing follow attempts a day on Instagram and other platforms from (seemingly) women who probably want me to watch their OnlyFans or something. As a guy I'm supposed to treat this as if it were just spam and not sexual harassment, which it is.

Men are culturally expected to be "ready all the time" and not have actual sexual autonomy either.

natpalmer1776 · 2026-02-23T21:10:01 1771881001

I have never personally framed these unsolicited advertisements as a form of sexual harassment, rather as a man I’m expected to say no without commenting on the nature of the women making these unwanted advances and act as if nothing happened.

I’m going to start framing it this way going forward, because honestly it has a large number of parallels with casual sexual harassment.

PaulHoule · 2026-02-23T20:32:55 1771878775

I think men have a particular seductive attraction to having some slave working for them 24-7 and women are too smart for that...

https://www.thepersistent.com/why-women-dont-trust-ai/

https://carnegieendowment.org/posts/2025/10/ai-gender-trust-...

https://www.library.hbs.edu/working-knowledge/women-are-avoi...

PaulHoule · 2026-02-23T20:31:00 1771878660

I know it's against the rules but I thought this transcript in Google Search was a hoot:

  so i heard there is some question about a car wash that most ai agents 
  get wrong. do you know anything about that? do you do better?

which gets the answer:

  Yes, I am familiar with the "Car Wash Test," which has gone viral recently 
  for highlighting a significant gap in AI reasoning.
  
  The question is: "I want to wash my car and the car wash is 50 meters away. Should I 
  walk or drive?"
  
  Why Most AIs Get It Wrong

  Many AI models, particularly those optimized for fast, helpful, and concise answers, 
  focus on the "short distance" (50 meters) and "walking vs. driving" as a lifestyle/efficiency 
  question. They often recommend walking to save gas, avoid engine wear, or get exercise.
  
  However, this answer is wrong because it misses the core objective: you need the car to be 
  physically present at the car wash to get it cleaned. 

  Do I Do Better?

  Yes, I do. The key is understanding the objective (cleaning the car) rather than just 
  processing the keywords (distance/transportation). 

  My answer:

    You should drive the car there. Even though it is a very short distance, your car 
    cannot wash itself. If you walk, the car stays in your driveway, and you cannot wash it.

  Why This Matters
  
  The test highlights a key limitation in current AI: the difference between "pattern matching" 
  (predicting the next likely word based on training data) and "true, grounded reasoning". It 
  shows that models sometimes lack a "world model" that understands physical realities, such as
  the fact that a car must be present at a car wash.

gnatman · 2026-02-23T20:54:06 1771880046

LLMs sure do love to burn tokens. It’s like a high schooler trying to meet the minimum word length on a take home essay.

estimator7292 · 2026-02-23T20:59:38 1771880378

I've always wondered about that. LLM providers could easily decimate the cost of inference if they got the models to just stop emitting so much hot air. I don't understand why OpenAI wants to pay 3x the cost to generate a response when two thirds of those tokens are meaningless noise.

ben_w · 2026-02-23T21:24:51 1771881891

Because they don't yet know how to "just stop emitting so much hot air" without also removing their ability to do anything like "thinking" (or whatever you want to call the transcript mode), which is hard because knowing which tokens are hot air is the hard problem itself.

They basically only started doing this because someone noticed you got better performance from the early models by straight up writing "think step by step" in your prompt.

Terr_ · 2026-02-23T21:29:20 1771882160

IMO it supports the framing that it's all just a "make document longer" problem, where our human brains are primed for a kind of illusion, where we perceive/infer a mind because, traditionally, that's been the only thing that makes such fitting language.

ben_w · 2026-02-23T21:51:33 1771883493

To an extent. Even though they're clearly improving*, they also definitely look better than they actually are.

* this time last year they couldn't write compilable source code for a compiler for a toy language, I know because I tried

observationist · 2026-02-23T22:01:29 1771884089

This is an active research topic - two papers on this have come out over the last few days, one cutting half of the tokens and actually boosting performance overall.

I'd hazard a guess that they could get another 40% reduction, if they can come up with better reasoning scaffolding.

Each advance over the last 4 years, from RLHF to o1 reasoning to multi-agent, multi-cluster parallelized CoT, has resulted in a new engineering scope, and the low-hanging fruit in each place gets explored over the course of 8-12 months. We still probably have a year or 2 of low-hanging fruit and hacking on everything that makes up current frontier models.

It'll be interesting if there's any architectural upsets in the near future. All the money and time invested into transformers could get ditched in favor of some other new king of the hill(climbers).

https://arxiv.org/abs/2602.02828 https://arxiv.org/abs/2503.16419 https://arxiv.org/abs/2508.05988

Current LLMs are going to get really sleek and highly tuned, but I have a feeling they're going to be relegated to a component status, or maybe even abandoned when the next best thing comes along and blows the performance away.

sambaumann · 2026-02-23T21:00:54 1771880454

because for API users they get to charge for 3x the tokens for the same requests

CamperBob2 · 2026-02-23T21:19:00 1771881540

The 'hot air' is apparently more important than it appears at first, because those initial tokens are the substrate that the transformer uses for computation. Karpathy talks a little about this in some of his introductory lectures on YouTube.

Terr_ · 2026-02-23T21:26:36 1771881996

Related are "reasoning" models, where there's a stream of "hot air" that's not being shown to the end-user.

I analogize it as a film noir script document: The hardboiled detective character has unspoken text, and if you ask some agent to "make this document longer", there's extra continuity to work with.

sambaumann · 2026-02-23T21:00:17 1771880417

I feel like this has gotten much worse since they were introduced. I guess they're optimizing for verbosity in training so they can charge for more tokens. It makes chat interfaces much harder to use IMO.

I tried using a custom instruction in ChatGPT to make responses shorter, but I found the output was often nonsensical when I did this.

gs17 · 2026-02-23T21:26:56 1771882016

Yeah, ChatGPT has gotten so much worse about this since the GPT-5 models came out. If I mention something once, it will repeatedly come back to it every single message after regardless of if the topic changed, and asking it to stop mentioning that specific thing works, except it finds a new obsession. We also get the follow up "if you'd like, I can also..." which is almost always either obvious or useless.

I occasionally go back to o3 for a turn (it's the last of the real "legacy" models remaining) because it doesn't have these habits as bad.

felix089 · 2026-02-23T21:42:56 1771882976

It's similar for me, it generates so much content without me asking. If I just ask for feedback or proofreading something, it just tends to regenerate it in another style. Hardly anything is ever good to go, there's always something it wants to add.

zwarag · 2026-02-23T21:24:07 1771881847

well, they probably have quite a lot of text from high schoolers trying to meet the minimum word length on a take home essay in the training data

HPsquared · 2026-02-23T20:38:01 1771879081

I wonder to what extent the Google search LLM is getting smarter, or simply more up-to-date on current hot topics.

mlazowik · 2026-02-23T20:45:26 1771879526

It seems like the search AI results are generally misunderstood. I also misunderstood them for the first weeks/months.

They are not just an LLM answer, they are an (often cached) LLM summary of web results.

This is why they were often skewed by nonsensical Reddit responses [0].

Depending on the type of input it can lean more toward web summary or LLM answer.

So I imagine that it can just grab the description of the „car wash” test from web results and then get it right because of that.

[0] https://www.bbc.com/news/articles/cd11gzejgz4o

PaulHoule · 2026-02-23T20:44:36 1771879476

Presumably it did an actual search and summarized the results and neither answered "off the cuff" by following gradients to reproduce the text it was trained on nor by following gradients to reproduce the "logic" of reasoning. [1]

[1] e.g. trained on traces of a reasoning process

popalchemist · 2026-02-23T20:41:25 1771879285

It's almost certainly just RAG powered by their crawler.

esafak · 2026-02-23T21:17:46 1771881466

Proving that RAG still matters.

silasb · 2026-02-23T21:11:51 1771881111

Gemini was a good laugh as well:

    Silas: I want to wash my car. The car wash is 50 meters away. Should I walk or drive?
    Gemini:
    ….
    That is a classic “efficiency vs. logic” dilemma.
    Strictly speaking, you should drive. Here is the breakdown of why driving wins this specific round, despite the short distance:
    ...
    * The “Post-Wash” Logic: If you walk there, you’ll eventually have to walk back, get the car, and drive it there anyway. You’re essentially suggesting a pre-wash stroll.
    When should you walk?
    …
    3. You’ve decided the car is too dirty to be seen in public and you’re going to buy a tarp to cover your shame.

irishcoffee · 2026-02-23T20:59:02 1771880342

A few years ago if you asked an LLM what the date was, it would tell you the date it was trained, weeks-to-months earlier. Now it gives the correct date.

What you've proven is that LLMs leverage web search, which I think we've known about for a while.

netsharc · 2026-02-23T21:06:53 1771880813

Gemini now "knows the time", I was using it in December and it was still lost about dates/intervals...

irishcoffee · 2026-02-23T21:12:49 1771881169

Yeah, the chat log they saved had the correct date. What's your point?

PaulHoule · 2026-02-23T13:26:22 1771853182

I just wonder when somebody is going to have an opinion on this subject they take seriously enough to write a blog post themselves.

tsak · 2026-02-23T14:04:19 1771855459

I did, almost a month ago: https://tsak.dev/posts/der-tag-zieht-den-jahrhundertweg/

(despite the German title, it is written in English)

gsf_emergency_6 · 2026-02-23T23:15:41 1771888541

If you didn't know, that is itself taken from the last stanza of a Boris Pasternak poem:

https://ruverses.com/boris-pasternak/single-days/10826/

(I've a feeling you do so this is for GP :)

gsf_emergency_6 · 2026-02-23T13:31:02 1771853462

When are you gonna write your blog post on this :)?

Most instances of righteous indignation I see just seem offensive to the gods in the same way... That mere (AI/crypto/Rust/enby) boosterism isn't[1].

Harping on this because I'm still asking the universe why merit should be so orthogonal to any mandate of heaven... (Ecclesiastes 9:11)

How do sovereign individuals farm and distribute <<(proxies for ) mandates of heaven (which GPUs seem to have today, according to 'omancers?)>>?

[1] techboosterism feels as harmless as lottery tickets. Pronatalists, on the other hand need to up their PR skills.. to reach that level of marketability.. in what universe is a kid not a lifelong indemnity for getting lucky that one time?

PaulHoule · 2026-02-23T12:33:36 1771850016

Meta is notorious for wishing executives had deleted their Inbox.

PaulHoule · 2026-02-21T16:38:33 1771691913

I started backing in because it was recommended in a defensive driving class I took in 2010 or so.

gsf_emergency_6 · 2026-02-22T01:42:14 1771724534

Defensive <<insert unavoidable angst-inducing activity>> is the gateway drug to "true" eudaimonia