God Emperor of Mastodon (new account)

@michal@sapka.pl

This is my new, future account. While most interactions still happen on the bsd.cafe one, I aim to migrate fully here.

I post mostly culture, Emacs, BSD and anti-AI ramblings. I don't curse and I try not to bait people into pointless fits of anger.

I am a nerd (or dork? I don't know), a software engineer, and I have started to hate what technology has become.

This is a self-hosted, single-user instance running on snac2.

Messages are auto-deleted every few weeks.
Homepage: https://michal.sapka.pl

339 following, 862 followers


[?]God Emperor of Mastodon (new account) » 🌐
@michal@sapka.pl

I'll allow myself to simply paste the GenAI summary (top post) from the ClaudeAI subreddit about Opus 4.8, so you don't have to read it:

Okay, so the general vibe here is less 'Hooray, new model!' and more 'Please, for the love of all that is holy, don't take away Opus 4.6'.

The consensus is overwhelmingly skeptical, bordering on hostile. The community is clearly still traumatized by the Opus 4.7 update, which most users consider a massive downgrade. The news that 4.8 "builds on 4.7" has gone down like a lead balloon.

The Cult of 4.6: Everyone is terrified of losing access to Opus 4.6, the subreddit's beloved gold standard. Many are reporting it's already gone from their model list, while others can still access it, causing widespread panic and threats to switch to competitors.

Token Terror: The number one concern after losing 4.6 is how fast 4.8 will incinerate usage limits. Early reports are not comforting, with users claiming it's even more token-hungry than 4.7, rendering the Pro plan a "sampler" at best.

Benchmark Blues: Nobody trusts the benchmark charts. The phrase "trust me bro" has been thrown around a lot, with users pointing out that 4.7 also had a pretty graph before it disappointed everyone.

Sonnet & Haiku Neglect: A significant number of users wish Anthropic would focus on improving the cheaper models instead of another incremental Opus release, especially given the cost concerns.

A few brave souls are reporting positive initial results, saying it feels like a fixed 4.7, but they are a tiny island in a vast ocean of doubt. The rest of the thread is basically a support group for people who miss the good old days of Opus 4.6.

Guess the vibe on the vibe island is significantly off.

This is in line with my understanding: models have stagnated, and most of the improvement comes from work on harnesses. And they can't do that without burning even more tokens.


[?]stfn » 🌐
@stfn@mastodon.com.pl

@michal this is the perfect occasion to use the "oh no anyway" meme


    [?]God Emperor of Mastodon (new account) » 🌐
    @michal@sapka.pl

    Oh, far from it. People use it. I use it (the saying is "for better or worse", but there's not much of the "better"). If it's worse, and the C-suite insists it's your fault because some internet celeb says it's the Second Coming, then it's your problem. I could easily burn 10k USD a month on 4.6 if I tried. I burned 2k USD simply by seeing how usable it is (the answer: not 2k USD worth). If it's worse and more expensive, and you need to use it, then, well. The CFO may make the wrong choice about which part of the equation to keep.

      [?]nicole mikołajczyk @ piwo 🔜 gpn [she/her] » 🌐
      @mkljczk@pl.fediverse.pl

      @michal i find it funny how model development has basically stagnated since i stopped using them, so i don’t have to check the new ones to know their weaknesses and strengths
