1/1/2026 at 7:40:29 AM
All these improvement in a single year, 2025. While this may seem obvious to those who follow the AI / LLM news, it may be worth pointing out again that ChatGPT was introduced to us in November 2022.
I still don't believe AGI, ASI or whatever AI will take over from humans in a short period of time, say 10 - 20 years. But it is hard to argue against the value of current AI, which many of the vocal critics on HN seems to have the opinion of. People are willing to pay $200 per month, and it is getting $1B dollar runway already.
Being more of a hardware person, the most interesting part to me is the funding of all the developments of the latest hardware. I know this is another topic HN hates because of the DRAM and NAND pricing issue. But it is exciting to see this from a long-term view, where the pricing is short-term pain. Right now the industry is asking: we have together over a trillion dollars to spend on capex over the next few years and will even borrow more if need be, so when can you ship us 16A / 14A / 10A and 8A or 5A, LPDDR6, higher-capacity DRAM at lower power usage, better packaging, higher-speed PCIe or a jump to optical interconnect? Every single part of the hardware stack are being fused with money and demand. The last time we have this was Post-PC / Smartphone era which drove the hardware industry forward for 10 - 15 years. The current AI can at least push hardware for another 5 - 6 years while pulling forward tech that was initially 8 - 10 years away.
I so wish I had bought some Nvidia stock. Again, I guess no one knew AI would be as big as it is today, and it is only just started.
by ksec
1/1/2026 at 1:26:48 PM
This is not a great argument:
> But it is hard to argue against the value of current AI [...] it is getting $1B dollar runway already.
The psychic services industry makes over $2 billion a year in the US [1], with about a quarter of the population being actual believers [2].
[1] https://www.ibisworld.com/united-states/industry/psychic-ser...
[2] https://news.gallup.com/poll/692738/paranormal-phenomena-met...
by wpietri
1/1/2026 at 1:36:19 PM
What if these provide actual value through placebo-effect?
by apexalpha
1/1/2026 at 3:07:44 PM
I think we have different definitions of "actual value". But even if I pick the flaccid definition, that isn't proof of value of the thing itself, but of any placebo. In which case we can focus on the cheapest/least harmful placebo. Or, better, solving the underlying problem that the placebo "helps".
by wpietri
1/1/2026 at 4:02:31 PM
I'll preface by saying I fully agree that psychics aren't providing any non-placebo value to believers, although I think it's fine to provide entertainment for non-believers.
> Or, better, solving the underlying problem that the placebo "helps".
The underlying problems are often a lack of a decent education and a generally difficult/unsatisfying life. Systemic issues which can't be meaningfully "solved" without massive resources and political will.
by computably
1/1/2026 at 7:03:44 PM
Actually, I'd go one step further and say they are harmful to everybody else.
It might just be my circles, but I've seen Carl Sagan's quote everywhere in the last couple of months.
“Science is more than a body of knowledge; it is a way of thinking. I have a foreboding of an America in my children’s or grandchildren’s time—when the United States is a service and information economy; when nearly all the key manufacturing industries have slipped away to other countries; when awesome technological powers are in the hands of a very few, and no one representing the public interest can even grasp the issues; when the people have lost the ability to set their own agendas or knowledgeably question those in authority; when, clutching our crystals and nervously consulting our horoscopes, our critical faculties in decline, unable to distinguish between what feels good and what’s true, we slide, almost without noticing, back into superstition and darkness.”
by jay_kyburz
1/1/2026 at 1:59:50 PM
You talking about psychics or LLMs?
by recursive
1/1/2026 at 2:17:42 PM
Yes
by grosswait
1/1/2026 at 5:06:40 PM
2022/2023: "It hallucinates, it's a toy, it's useless."
2024/2025: "Okay, it works, but it produces security vulnerabilities and makes junior devs lazy."
2026 (Current): "It is literally the same thing as a psychic scam."
Can we at least make predictions for 2027? What shall the cope be then! Lemme go ask my psychic.
by ctoth
1/2/2026 at 3:28:09 AM
I suppose it's appropriate that you hallucinated an argument I did not make, attacked the straw man, and declared victory.
by wpietri
1/1/2026 at 5:25:38 PM
2022/2023: "Next year software engineering is dead"
2024: "Now this time for real, software engineering is dead in 6 months, AI CEO said so"
2025: "I know a guy who knows a guy who built a startup with an LLM in 3 hours, software engineering is dead next year!"
What will be the cope for you this year?
by bopbopbop7
1/1/2026 at 8:15:30 PM
The cope + disappointment will be knowing that a large population of HN users will paint a weird alternative reality. There are a multitude of messages about AI that are out there, some are highly detached from reality (on the optimistic and pessimistic side). And then there is the rational middle, professionals who see the obvious value of coding agents in their workflow and use them extensively (or figure out how to best leverage them to get the most mileage). I don't see software engineering being "dead" ever, but the nature of the job _has already changed_ and will continue to change. Look at Sonnet 3.5 -> 3.7 -> 4.5 -> Opus 4.5; that was 17 months of development and the leaps in performance are quite impressive. You then have massive hardware buildouts and improvements to stack + a ton of R&D + competition to squeeze the juice out of the current paradigm (there are 4 orders of magnitude of scaling left before we hit real bottlenecks) and also push towards the next paradigm to solve things like continual learning. Some folks have opted not to use coding agents (and some folks like yourself seem to revel in strawmanning people who point out their demonstrable usefulness). Not using coding agents in Jan 2026 is defensible. It won't be defensible for long.
by aspenmartin
1/1/2026 at 8:25:58 PM
Please do provide some data for this "obvious value of coding agents". Because right now the only thing obvious is the increase in vulnerabilities, people claiming they are 10x more productive but aren't shipping anything, and some AI hype bloggers that fail to provide any quantitative proof.
by bopbopbop7
1/1/2026 at 8:39:15 PM
Sure: at my MAANG company, where I watch the data closely on adoption of CC and other internal coding agent tools, most (significant) LOC are written by agents, and most employees have adopted coding agents as WAU, and the adoption rate is positively correlated with seniority.
Like a lot of things LLM related (Simon Willison's pelican test, researchers + product leaders implementing AI features) I also heavily "vibe" check the capabilities myself on real work tasks. The fact of the matter is I am able to dramatically speed up my work. It may be actually writing production code + helping me review it, or it may be tasks like: write me a script to diagnose this bug I have, or build me a streamlit dashboard to analyze + visualize this ad hoc data instead of me taking 1 hour to make visualizations + munge data in a notebook.
> people claiming they are 10x more productive but aren't shipping anything, and some AI hype bloggers that fail to provide any quantitative proof.
what would satisfy you here? I feel you are strawmanning a bit by picking the most hyperbolic statements and then blanketing that on everyone else.
My workflow is now:
- Write code exclusively with Claude
- Review the code myself + use Claude as a sort of review assistant to help me understand decisions about parts of the code I'm confused about
- Provide feedback to Claude to change / steer it away or towards approaches
- Give up when Claude is hopelessly lost
It takes a bit to get the hang of the right balance but in my personal experience (which I doubt you will take seriously but nevertheless): it is quite the game changer and that's coming from someone who would have laughed at the idea of a $200 coding agent subscription 1 year ago
by aspenmartin
1/2/2026 at 2:30:35 AM
We probably work at the same company, given you used MAANG instead of FAANG.
As one of the WAU (really DAU) you’re talking about, I want to call out a couple things: 1) the LOC metrics are flawed, and anyone using the agents knows this - eg, ask CC to rewrite the 1 commit you wrote into 5 different commits, now you have 5 100% AI-written commits; 2) total speed up across the entire dev lifecycle is far below 10x, most likely below 2x, but I don’t see any evidence of anyone measuring the counterfactuals to prove speed up anyways, so there’s no clear data; 3) look at token spend for power users, you might be surprised by how many SWE-years they’re spending.
Overall it’s unclear whether LLM-assisted coding is ROI-positive.
by Denzel
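The attribution problem described in point 1 above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Commit` record, the `"human"` / `"agent"` labels and the `agent_loc_share` metric are hypothetical, not how any real company dashboard computes its numbers.

```python
from dataclasses import dataclass


@dataclass
class Commit:
    author: str  # hypothetical label: "human" or "agent"
    loc: int     # lines of code attributed to this commit


def agent_loc_share(commits):
    """Fraction of all LOC that sits in agent-attributed commits."""
    total = sum(c.loc for c in commits)
    agent = sum(c.loc for c in commits if c.author == "agent")
    return agent / total


# One 100-line commit written by a human.
before = [Commit("human", 100)]

# The same 100 lines after asking an agent to split them into five
# commits; each new commit is now attributed to the agent.
after = [Commit("agent", 20) for _ in range(5)]

print(agent_loc_share(before))  # 0.0
print(agent_loc_share(after))   # 1.0
```

The code that ships is identical in both cases, yet the metric moves from 0% to 100% agent-written, which is why a LOC share on its own says little about who did the work.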
1/1/2026 at 8:54:53 PM
Anecdotes don’t prove anything, especially ones without any metrics, and especially at MAANG where AI use is strongly incentivized.
Evidence is peer-reviewed research, or at least something with metrics. Like the METR study that shows that experienced engineers often got slower on real tasks with AI tools, even though they thought they were faster.
by bopbopbop7
1/1/2026 at 10:16:11 PM
That's why I gave you data! METR study was 16 people using Sonnet 3.5/3.7. Data I'm talking about is 10s of thousands of people and is much more up to date.
There are some counterexamples to METR in the literature, but I'll just say: "rigor" here is very difficult (including METR) because outcomes are high dimensional and nuanced, or ecological validity is an issue. It's hard to have any approach that someone wouldn't be able to dismiss due to some issue they have with the methodology. The sources below also have methodological problems, just like METR.
https://arxiv.org/pdf/2302.06590 -- 55% faster implementing HTTP server in javascript with copilot (in 2023!) but this is a single task and not really representative.
https://demirermert.github.io/Papers/Demirer_AI_productivity... -- "Though each experiment is noisy, when data is combined across three experiments and 4,867 developers, our analysis reveals a 26.08% increase (SE: 10.3%) in completed tasks among developers using the AI tool. Notably, less experienced developers had higher adoption rates and greater productivity gains." (but e.g. "completed tasks" as the outcome measure is of course problematic)
To me, internal company measures for large tech companies will be most reliable -- they are easiest to track and measure, the scale is large enough, and the talent + task pool is diverse (junior -> senior, different product areas, different types of tasks). But then outcome measures are always a problem...commits per developer per month? LOC? task completion time? All of them are highly problematic, especially because it's reasonable to expect AI tools would change the bias and variance of the proxy, so it's never clear if you're measuring the change in "style" or the change in the underlying latent measure of productivity you care about.
by aspenmartin
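To get a feel for how noisy the quoted "26.08% increase (SE: 10.3%)" figure is, a normal-approximation 95% confidence interval can be computed as below. This is a rough sketch: it assumes the standard error is expressed in percentage points and that a normal approximation is reasonable, neither of which is confirmed in the thread, and `normal_ci` is a hypothetical helper name.

```python
def normal_ci(estimate, std_err, z=1.96):
    """Two-sided confidence interval under a normal approximation.

    z=1.96 corresponds to roughly 95% coverage.
    """
    margin = z * std_err
    return estimate - margin, estimate + margin


# Figures quoted in the comment above; units assumed to be percentage points.
low, high = normal_ci(26.08, 10.3)
print(f"95% CI: {low:.1f}% to {high:.1f}%")  # 95% CI: 5.9% to 46.3%
```

Under those assumptions the estimate is distinguishable from zero, but the interval is wide enough to be compatible with both a small and a very large effect.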
1/1/2026 at 11:59:26 PM
Meta internal study showed a 6-12% productivity uplift.
by Ianjit
1/1/2026 at 10:38:44 PM
To be fair, I’ll take a non-biased 16 person study over “internal measures” from a MAANG company that burned 100s of billions on AI with no ROI that is now forcing its employees to use AI.
by bopbopbop7
1/2/2026 at 2:04:56 AM
What do you think about the METR 50% task length results? About benchmark progress generally?
by anorwell
1/2/2026 at 12:08:21 AM
I could have guessed you would say that :) but METR is not an unbiased study either. Maybe you mean that METR is less likely to intentionally inflate their numbers?
If you insist or believe in a conspiracy I don’t think there’s really anything I or others will be able to say or show you that would assuage you, all I can say is I’ve seen the raw data. It’s a mess and again we’re stuck with proxies (which are bad since you start conflating the change in the proxy-latent relationship with the treatment effect). And it’s also hard and arguably irresponsible to run RCTs.
All I will say is: there are flaws everywhere. METR results are far from conclusive. Totally understandable if there is a mismatch between perception and performance. But also consider: even if task takes the same or even slightly more time, one big advantage for me is that it substantially reduces cognitive load so I can work in parallel sessions on two completely different issues.
by aspenmartin
1/2/2026 at 12:28:24 AM
I bet it does reduce your cognitive load, considering you, in your own words "Give up when Claude is hopelessly lost". No better way to reduce cognitive load.
by bopbopbop7
1/2/2026 at 1:03:22 AM
I give up using Claude when it gets hopelessly lost, and then my cognitive load increases.
by aspenmartin
1/1/2026 at 10:40:57 PM
> - Give up when Claude is hopelessly lost
You love to see "Maybe completely waste my time" as part of the normal flow for a productivity tool
by insin
1/2/2026 at 12:09:51 AM
That negates everything else? If you have a tool that can boost you for 80% of your work and for the other 20% you just have to do what you’re already doing, is that bad?
by aspenmartin
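One rough way to put numbers on the 80% / 20% framing above is Amdahl's law: if a fraction p of the work is sped up by a factor s and the rest is unchanged, the overall speedup is 1 / ((1 - p) + p / s). The sketch below assumes the 80% figure from the comment is taken literally; the per-task speedup factors are illustrative guesses, not measurements from the thread.

```python
def overall_speedup(accelerated_fraction, factor):
    """Amdahl's law: end-to-end speedup when only part of the work gets faster."""
    untouched = 1.0 - accelerated_fraction
    return 1.0 / (untouched + accelerated_fraction / factor)


# 80% of the work accelerated, for a few hypothetical per-task speedups.
for factor in (2, 10, 1000):
    print(factor, round(overall_speedup(0.8, factor), 2))
# 2 1.67
# 10 3.57
# 1000 4.98
```

Even an arbitrarily fast tool on 80% of the work caps the total at 5x, and a 2x boost on that portion gives about 1.67x overall, which is consistent with both "clearly useful" and "far below 10x across the whole lifecycle".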
1/2/2026 at 1:45:44 AM
There's a reason why sunk cost IS a fallacy and not a sound strategy.
by shimman
1/1/2026 at 11:56:35 PM
The productivity uplift is massive, Meta got a 6-12% productivity uplift from AI coding!
by Ianjit
1/1/2026 at 8:28:04 PM
The nature of my job has always been fighting red tape, process, and stakeholders to deploy very small units of code to production. AI really did not help with much of that for me in 2025.
I'd imagine I'm not the only one who has a similar situation. Until all those people and processes can be swept away in favor of letting LLMs YOLO everything into production, I don't see how that changes.
by nsxwolf
1/1/2026 at 8:41:12 PM
No I think that's extremely correct. I work at a MAANG where we have the resources to hook up custom internal LLMs and agents to actually deal with that but that is unique to an org of our scale.
by aspenmartin
1/1/2026 at 9:37:53 AM
2025 was the year of development-tool-using AI agents. I think we'll shift attention to non-development-tool-using AI agents. Most business users are still stuck using ChatGPT as some kind of grand oracle that will write their email or powerpoint slides. There are bits and pieces of mostly technology demo level solutions but nothing that is widely used like AI coding tools are so far. I don't think this is bottlenecked on model quality.
I don't need an AGI. I do need a secretary type agent that deals with all the simple but yet laborious non technical tasks that keep infringing on my quality engineering time. I'm CTO for a small startup and the amount of non technical bullshit that I need to deal with is enormous. Some examples of random crap I deal with: figuring out contracts, their meaning/implication to situations, and deciding on a course of action; Customer offers, price calculations, scraping invoices from emails and online SaaS accounts, formulating detailed replies to customer requests, HR legal work, corporate bureaucracy, financial planning, etc.
A lot of this stuff can be AI assisted (and we get a lot of value out of ai tools for this) but context engineering is taking up a non trivial amount of my time. Also most tools are completely useless at modifying structured documents. Refactoring a big code base, no problem. Adding structured text to an existing structured document, hardest thing ever. The state of the art here is an ff-ing sidebar that will suggest you a markdown formatted text that you might copy/paste. Tool quality is very primitive. And then you find yourself just stripping all formatting and reformatting it manually. Because the tools really suck at this.
by jillesvangurp
1/1/2026 at 2:42:48 PM
> Some examples of random crap I deal with: figuring out contracts, their meaning/implication to situations, and deciding on a course of action
This doesn’t sound like bullshit you should hand off to an AI. It sounds like stuff you would care about.
by arcatech
1/1/2026 at 3:48:14 PM
I do care about it; kind of my duty as a co-founder. Which is why I'm spending double digit percentages of my time doing this stuff. But I absolutely could use some tools to cut down on a lot of the drudgery that is involved with this. And me reading through 40 pages of dense legal German isn't one of my strengths since I 1) do not speak German 2) am not a lawyer and 3) am not necessarily deeply familiar with all the bureaucracy, laws, etc.
But I can ask intelligent questions about that contract from an LLM (in English) and shoot back and forth a few things, come up with some kind of action plan, and then run it by our lawyers and other advisors.
That's not some kind of hypothetical thing. That's something that happened multiple times in our company in the last few months. LLMs are very empowering for dealing with this sort of thing. You still need experts for some stuff. But you can do a lot more yourself now. And as we've found out, some of the "experts" that we relied on in the past actually did a pretty shoddy job. A lot of this stuff was about picking apart the mess they made and fixing it.
As soon as you start drafting contracts, it gets a lot harder. I just went through a process like that as well. It involves a lot of manual work that is basically about formatting documents, drafting text, running pdfs and text snippets through chat gpt for feedback, sparring, criticism, etc. and iterating on that. This is not about vibe coding some contract but making sure every letter of a contract is right. That ultimately involves lawyers and negotiating with other stakeholders but it helps if you come prepared with a more or less ready to sign off on document.
It's not about handing stuff off but about making LLMs work for you. Just like with coding tools. I care about code quality as well. But I still use the tools to save me a lot of time.
by jillesvangurp
1/1/2026 at 4:07:49 PM
One of the lessons I learned running a startup is that it doesn't matter how good the professionals you hire are for things like legal and accounting, you still need to put work in yourself.
Everyone makes mistakes and misses things, and as the co-founder you have to care more about the details than anyone else does.
I would have loved to have weird-unreliable-paralegal-Claude available back when I was doing that!
by simonw
1/1/2026 at 2:47:41 PM
Agree. Even asking it can anchor your thinking.
by nrclark
1/1/2026 at 3:55:00 PM
`Also most tools are completely useless at modifying structured documents`
we built a tool for this for the life science space and are opening it up to the general public very soon. Email me I can give you access (topaz at vespper dot com)
by topaztee
1/1/2026 at 8:20:36 AM
> All these improvement in a single year
> hard to argue against the value of current AI
> People are willing to pay $200 per month, and it is getting $1B dollar runway already.
Those are 3 different things. There can be a LOT of fast and significant improvements while still remaining extremely far from the actual goal, so far that it actually looks like little progress.
People pay for a lot of things, including snake oil, so convincing a lot of people to pay a bit is not in itself a proof of value, especially when some people are basically coerced into this, see how many companies changed their "strategy" to mandating AI usage internally, or integration for a captive audience e.g. Copilot.
Finally yes, $1B is a LOT of money for you and I... but for the largest corporations it's actually not a lot. For reference Google earned that in revenue... per day in 2023. Anyway that's still a big number BUT it still has to be compared with, well how much does OpenAI burn. I don't have any public number on that but I believe the consensus is that it's a lot. So until we know that number we can't talk about an actual runway.
by utopiah
1/1/2026 at 8:23:06 PM
> People pay for a lot of things, including snake oil, so convincing a lot of people to pay a bit is not in itself a proof of value
But do you really believe e.g. Claude Code is snake oil? I pay $200 / month for Claude, which is something I would have thought monumentally insane maybe 1-2 years ago (e.g. when ChatGPT came out with their premium subscription price I thought that seemed so out of touch). I don't think we would be seeing the subscription rates and the retention numbers if it really was snake oil.
> Finally yes, $1B is a LOT of money for you and I... but for the largest corporations it's actually not a lot. For reference Google earned that in revenue... per day in 2023. Anyway that's still a big number BUT it still has to be compared with, well how much does OpenAI burn. I don't have any public number on that but I believe the consensus is that it's a lot. So until we know that number we can't talk about an actual runway.
this gets brought up a lot but I'm not sure I understand why folks on a forum called YCombinator, a startup accelerator, would make this sound like an obvious sign of charlatanism; operating at a loss is nothing new and anthropic / openAI strategy seems perfectly rational: they are scaling and capturing market share, and TAM is insane.
by aspenmartin
1/1/2026 at 9:39:03 AM
Investing a trillion dollars for a revenue of a billion dollars doesn't sound great yet.
by pjc50
1/1/2026 at 2:11:55 PM
Indeed, it's the old Uber playbook at nearly two extra orders of magnitude.
It is a large enough number to simply run out of private capital to consume before it turns cash flow positive.
Lots of things sell well if sold at such a loss. I’d take a new Ferrari for $2500 if it was on offer.
by steveBK123
1/1/2026 at 7:34:11 PM
Did Uber actually do a lot of capital investment? They don't own the cars, for example.
by pjc50
1/2/2026 at 1:46:45 AM
Uber nakedly broke the law and beat down labor, I'm honestly shocked none of the executives went to prison.
by shimman
1/1/2026 at 7:55:45 PM
I believe they spent a huge amount of money on incentives to help sign up drivers, and discounts to help attract customers.
by simonw
1/2/2026 at 1:08:16 AM
Yes, but that's loss leader rather than capital investment. You can't put a customer on the balance sheet and depreciate them. Once you've paid for a free ride, you own nothing tangible.
by pjc50
1/1/2026 at 6:56:12 PM
You say that as if Uber's playbook didn't work. Try this: https://www.google.com/finance/quote/UBER:NYSE
by aoeusnth1
1/1/2026 at 2:56:02 PM
Uber’s playbook worked for Uber
by derwiki
1/1/2026 at 10:13:41 PM
> Every single part of the hardware stack are being fused with money and demand. The last time we have this was Post-PC / Smartphone era which drove the hardware industry forward for 10 - 15 years. The current AI can at least push hardware for another 5 - 6 years while pulling forward tech that was initially 8 - 10 years away.
It’s very unclear how much end-consumer hardware and DIY builders will benefit from that, as opposed to server-grade hardware that only makes sense for the enterprise market. It could have the opposite effect, like hardware manufacturers leaving the consumer market (as in the case of Micron), because there’s just not that much money in it.
by layer8
1/1/2026 at 7:05:29 PM
It's a great tool, but right now it's only being used to feed the greed.
>> Again, I guess no one knew AI would be as big as it is today, and it is only just started.
People have been saying similar about self driving cars for years now. "AI" is another one of those expensive ideas that we'll get 85% of the way there and then to get the other 15% will be way more expensive than anyone will want to pay for. It's already happening - HW prices and electricity - people are starting to ask, "if I put more $ into this machine, when am I actually going to start getting money out?" The "true believers" are like, soon! But people are right to be hugely skeptical.
by HumblyTossed
1/1/2026 at 7:45:28 PM
There are some things it's really great at. For example, handling a css layout. If we have to spend trillions of dollars and get nothing else out of it other than being able to vertically center a <div> without wrestling with css and wanting to smash the keyboard in the process, it will all have been worth it.
by jliptzin
1/2/2026 at 5:15:42 AM
Not to be cheeky, but isn’t this just
display: flex; align-items: center;
now?
by falkensmaize
1/1/2026 at 8:28:23 PM
I agree -- skepticism is totally healthy. And there are so many great ways to poke holes in the true underlying narratives (not the headlines that people seem to pull from). E.g. evaluation science is a wasteland (not for want of very smart people trying very hard to get them right). How do we tackle the power requirements in a way that is sustainable? Etc. etc.
But stuff like this I'm not sure I understand:
> It's a great tool, but right now it's only being used to feed the greed.
if its a great tool, then how is it _only_ being used to "feed the greed" and what do you mean by that?
Also I think folks are quick to make analogies to other points in history: "AI is like the dot com boom we're going to crash and burn" and "AI is like {self driving cars, crypto, etc} and the promises will all be broken, its all hype" but this removes the nuance: all of these things are extremely different with very specific dynamics that in _some_ ways may be similar but in many crucial and important ways are completely different.
by aspenmartin
1/2/2026 at 12:49:07 AM
>> if its a great tool, then how is it _only_ being used to "feed the greed" and what do you mean by that?
Look around?
by HumblyTossed
1/2/2026 at 1:02:20 AM
Very confused, I still don’t know what you mean at all
by aspenmartin
1/1/2026 at 8:03:40 AM
Seems like Nvidia will be focusing on the super beefy GPUs and leaving the consumer market to a smaller player
by coffeebeqn
1/1/2026 at 9:55:34 AM
I don't get why Nvidia can't do both? Is it because of the limited production capabilities of the factories?
by Flow
1/1/2026 at 10:20:22 AM
Yes. If you're bottlenecked on silicon and secondaries like memory, why would you want to put more of those resources into lower margin consumer products if you could use those very resources to make and sell more high margin AI accelerators instead?
From a business standpoint, it makes some sense to throttle the gaming supply some. Not to the point of surrendering the market to someone else probably, but to a measurable degree.
by ACCount37
1/1/2026 at 12:42:57 PM
We will have to wait and see, but my bet is that Nvidia will move to the leading-edge N2 node earlier now that they have the margin to work with. Both Hopper and Blackwell were too late in the design cycle. The AI hype will continue to buy the latest and greatest, leaving gaming at a mainstream node.
Nvidia using a mainstream node has always been the norm, considering most fab capacity always goes to mobile SoCs first. But I expect the internet / gamers will be angry anyway because Nvidia does not provide them with the latest and greatest.
In reality the extra R&D cost of designing for the leading edge will be amortised by all the AI orders, which gives Nvidia a competitive advantage at the consumer level when they compete. That is assuming there is competition, because the most recent data show Nvidia owning 90%+ of discrete market share, 9% for AMD and 1% for Intel.
by ksec
1/1/2026 at 8:20:18 AM
AMD owns a lot of the consumer market already; handhelds, consoles, desktop rigs and mobile ... they are not a small player.
by _s
1/1/2026 at 8:21:28 AM
They said "smaller" not small.
by utopiah
1/1/2026 at 8:31:40 AM
These are not all improvements. Listed:
* The year of YOLO and the Normalization of Deviance
* The year that Llama lost its way
* The year of alarmingly AI-enabled browsers
* The year of the lethal trifecta
* The year of slop
* The year that data centers got extremely unpopular
by chias
1/1/2026 at 6:33:14 PM
Not that YOLO, PJ Reddie released that in 2015
by Y_Y
1/1/2026 at 3:26:09 PM
Said differently - the year we start to see all of the externalities of a globally scaled hyped tech trend.
by mbesto
1/1/2026 at 2:16:39 PM
> * The year that data centers got extremely unpopular
I was discussing the political angle with a friend recently. I think Big Tech Bro / VC complex has done themselves a big disservice by aligning so tightly with MAGA to the point AI will be a political issue in 2026 & 2028.
Think about the message they’ve inadvertently created themselves - AI is going to replace jobs, it’s pushing electric prices up, we need the government to bail us out AND give us a regulatory light touch.
Super easy campaign for Dems - big tech trumpers are taking your money, your jobs, causing inflation, and now they want bailouts !!
by steveBK123
1/1/2026 at 4:06:39 PM
> People are willing to pay $200 per month
Some people are of course, but how many?
> ... People are willing to pay $200 per month
This is just low-key hype. Careful with your portfolio...
by Atomic_Torrfisk
1/1/2026 at 10:21:24 AM
Is the AI progress in 2025 an outstanding breakthrough? Not really. It's impressive but incremental.
Still, the gap between the capabilities of a cutting edge LLM and that of a human is only this wide. There are only this many increments it takes to cross it.
by ACCount37
1/1/2026 at 7:26:12 PM
>> But it is hard to argue against the value of current AI, which many of the vocal critics on HN seems to have the opinion of.
What is the concrete business case? Can anyone point to a revenue producing company using AI in production, and where AI is a material driver of profits?
Tool vendors don’t count. I’m not interested in how much money is being made selling shovels...show me a miner who actually struck gold please.
by belter
1/2/2026 at 12:13:13 AM
A lot of programmers seem willing to pay for the likes of Claude Code, presumably because it helps them get more done. Programmers cost money so that's a potential cost saving?
by tim333
1/1/2026 at 10:02:12 AM
[flagged]
by tstrimple
1/1/2026 at 10:54:36 AM
Sam Altman [1] certainly seems to talk about AGI quite a bit
by cherryteastain
1/1/2026 at 10:30:14 AM
Honestly, I wouldn't be surprised if a system that's an LLM at its core can attain AGI. With nothing but incremental advances in architecture, scaffolding, training and raw scale.
Mostly the training. I put less and less weight on "LLMs are fundamentally flawed" and more and more of it on "you're training them wrong". Too many "fundamental limitations" of LLMs are ones you can move the needle on with better training alone.
The foundation of LLM is flexible and capable, and the list of "capabilities that are exclusive to human mind" is ever shrinking.
by ACCount37
1/1/2026 at 3:27:48 PM
They seem to be missing a bit on learning as you go and thinking about things and getting new insights.
by tim333
1/1/2026 at 11:03:32 PM
In-context learning and reasoning cover that already, and you could expand on that. Nothing prevents an LLM from fine-tuning itself either, other than its own questionable fine tuning skills and the compute budget.
by ACCount37
1/1/2026 at 3:28:18 PM
That depends on how you define AGI - it's a meaningless term to use since everyone uses it to mean different things. What exactly do you mean?!
Yes, there is a lot that can be improved via different training, but at what point is it no longer a language model (i.e. something that auto-regressively predicts language continuations)?
I like to use an analogy to the children's "Stone Soup" story whereby a "stone soup" (starting off as a stone in a pot of boiling water) gets transformed into a tasty soup/stew by strangers incrementally adding extra ingredients to "improve the flavor" - first a carrot, then a bit of beef, etc. At what point do you accept that the resulting tasty soup is not in fact stone soup?! It's like taking an auto-regressively SGD-trained Transformer, and incrementally tweaking the architecture, training algorithm, training objective, etc, etc. At some point it becomes a bit perverse to choose to still call it a language model
Some of the "it's just training" changes that would be needed to make today's LLMs more brain-like may be things like changing the training objective completely from auto-regressive to predicting external events (with the goal of having it be able to learn the outcomes of its own actions, in order to be able to plan them), which to be useful would require the "LLM" to then be autonomous and act in some (real/virtual) world in order to learn.
Another "it's just training" change would be to replace pre/mid/post-training with continual/incremental runtime learning to again make the model more brain-like and able to learn from its own autonomous exploration of behavior/action and environment. This is a far more profound, and ambitious, change than just fudging incremental knowledge acquisition for some semblance of "on the job" learning (which is what the AI companies are currently working on).
If you put these two "it's just training/learning" enhancements together then you've now got something much more animal/human-like, and much more capable than an LLM, but it's already far from a language model - something that passively predicts next word every time you push the "generate next word" button. This would now be an autonomous agent, learning how to act and control/exploit the world around it. The whole pre-trained, same-for-everyone, model running in the cloud, would then be radically different - every model instance is then more like an individual learning based on its own experience, and maybe you're now paying for the continual-learning compute rather than just "LLM tokens generated".
These are "just" training (and deployment!) changes, but to more closely approach human capability (but again, what do you mean by "AGI"?) there would also need to be architectural changes and additions to the "Transformer" architecture (add looping, internal memory, etc), depending on exactly how close you want to get to human/animal capability.
by HarHarVeryFunny