1/13/2025 at 11:58:22 AM
An old manager of mine once spent the day trying to kill a process that was running at 99% on a Windows box.
When I finally got round to seeing what he was doing I was disappointed to find he was attempting to kill the 'system idle' process.
by WediBlino
1/13/2025 at 4:29:12 PM
Years ago I worked for a company that provided managed hosting services. That included some level of alarm watching for customers.
We used to rotate the "person of contact" (POC) each shift, and they were responsible for reaching out to customers, and doing initial ticket triage.
One customer kept having a CPU usage alarm go off on their Windows instances not long after midnight. The overnight POC reached out to the customer to let them know that they had investigated and noticed that "system idle processes" were taking up 99% of CPU time and the customer should probably investigate, and then closed the ticket.
I saw the ticket within a minute or two of it reopening as the customer responded with a barely diplomatic message to the tune of "WTF". I picked up that ticket, and within 2 minutes had figured out the high CPU alarm was being caused by the backup service we provided, apologised to the customer and had that ticket closed... but not before someone not in the team saw the ticket and started sharing it around.
I would love to say that particular support staff never lived that incident down, but sadly that particular incident was par for the course with them, and the team spent an inordinate amount of time doing damage control with customers.
by Twirrim
1/13/2025 at 5:44:57 PM
In the 90s I worked for a retail chain where the CIO proposed to spend millions to upgrade the point-of-sale hardware. The old hardware was only a year old, but the CPU was pegged at 100% on every device and scanning barcodes was very sluggish.
He justified the capex by saying if cashiers could scan products faster, customers would spend less time in line and sales would go up.
A little digging showed that the CIO wrote the point-of-sale software himself in an ancient version of Visual Basic.
I didn't know VB, but it didn't take long to find the loops that do nothing except count to large numbers to soak up CPU cycles since VB didn't have a sleep() function.
by panarky
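To make the difference concrete, here is a minimal Python sketch (not the original VB, which is not shown in the thread) comparing a counting-style busy-wait with a real sleep. The names `busy_wait`, `polite_wait` and `cpu_seconds_used` are invented for this example; `time.process_time()` reports CPU time charged to the process, which makes the cost of spinning visible.

```python
import time


def busy_wait(seconds):
    """Spin until the deadline passes; burns CPU the whole time."""
    deadline = time.perf_counter() + seconds
    while time.perf_counter() < deadline:
        pass  # nothing useful happens here, the CPU is just kept busy


def polite_wait(seconds):
    """Ask the OS to wake us later; the CPU is free in the meantime."""
    time.sleep(seconds)


def cpu_seconds_used(wait_fn, seconds):
    """Return how much CPU time the process consumed while waiting."""
    start = time.process_time()
    wait_fn(seconds)
    return time.process_time() - start


busy_cpu = cpu_seconds_used(busy_wait, 0.2)
sleep_cpu = cpu_seconds_used(polite_wait, 0.2)
print(f"busy-wait used {busy_cpu:.3f}s of CPU, sleep used {sleep_cpu:.3f}s")
```

Both waits take about the same wall-clock time; only the busy-wait should show up as CPU load in a task manager.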
1/13/2025 at 7:40:29 PM
That's hilarious. I had a similar situation, also back in the 90s, when a developer shipped some code that kept pegging the CPU on a production server. He insisted it was the server, and the company should spend $$$ on a new one to fix the problem. We went back-and-forth for a while: his code was crap versus the server hardware was inadequate, and I was losing the battle, because I was just a lowly sysadmin, while he was a great software engineer. Also, it was Java code, and back then, Java was kinda new, and everyone thought it could do no wrong. I wasn't a developer at all back then, but I decided to take a quick look at his code. It was basically this:
1. take input from a web form
2. do an expensive database lookup
3. do an expensive network request, wait for response
4. do another expensive network request, wait for response
5. and, of course, another expensive network request, wait for response
6. fuck it, another expensive network request, wait for response
7. a couple more database lookups for customer data
8. store the data in a table
9. store the same data in another table. and, of course, another one.
10. now, check to see if the form was submitted with valid data. if not, repeat all steps above to back-out the data from where it was written.
11. finally, check to see if the customer is a valid/paying customer. if not, once again, repeat all the steps above to back-out the data.
I looked at the logs, and something like 90% of the requests were invalid data from the web form or invalid/non-paying customers (this service was provided only to paying customers).
I was so upset from this dude convincing management that my server was the problem that I sent an email to pretty much everyone that said, basically, "This code sucks. Here's the problem: check for invalid data/customers first.", and I included a snippet from the code. The dude replied-to-all immediately, claiming I didn't know anything about Java code, and I should stay in my lane. Well, throughout the day, other emails started to trickle in, saying, "Yeah, the code is the problem here. Please fix it ASAP." The dude was so upset that he just left, he went completely AWOL, he didn't show up to work for a week or so. We were all worried, like he jumped off a bridge or something. It turned into an HR incident. When he finally returned, he complained to HR that I stabbed him in the back, that he couldn't work with me because I was so rude. I didn't really care; I was a kid. Oh yeah, his nickname became AWOL Wang. LOL
by jimt1234
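The fix described above, checking for invalid data and non-paying customers before doing any expensive work, might look roughly like this in Python. Everything here is hypothetical: the original Java is not shown in the thread, and `PAYING_CUSTOMERS`, `is_valid_form`, `expensive_work` and `handle_request` are stand-ins invented for the sketch.

```python
# Hypothetical stand-ins; the real service's data and checks are not in the thread.
PAYING_CUSTOMERS = {"alice", "bob"}
expensive_calls = 0  # counts how often the slow path actually runs


def is_valid_form(form):
    """Cheap local check: required fields are present and non-empty."""
    return bool(form.get("customer")) and bool(form.get("payload"))


def expensive_work(form):
    """Placeholder for the database lookups, network requests and table writes."""
    global expensive_calls
    expensive_calls += 1
    return f"stored {form['payload']} for {form['customer']}"


def handle_request(form):
    # Reject bad input before touching the database or the network,
    # so there is nothing to back out afterwards.
    if not is_valid_form(form):
        return "rejected: invalid form"
    if form["customer"] not in PAYING_CUSTOMERS:
        return "rejected: not a paying customer"
    return expensive_work(form)


results = [
    handle_request({"customer": "alice", "payload": "order-1"}),
    handle_request({"customer": "", "payload": "order-2"}),
    handle_request({"customer": "mallory", "payload": "order-3"}),
]
print(results, expensive_calls)
```

If roughly 90% of requests are invalid, as the logs in the story suggested, ordering the checks this way means the slow path runs for only about one request in ten.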
1/13/2025 at 8:44:40 PM
Hehe, being a Java dev since the late 90’s meant seeing a lot of bad code. My favorite was when I was working for a large life insurance company.
The company’s customer-facing website was servlet based. The main servlet was performing horribly, time outs, spinners, errors etc. Our team looked at the code and found that the original team implementing the logic had a problem they couldn’t figure out how to solve, so they decided to apply the big hammer: they synchronized the doService() method… oh dear…
by eludwig
1/13/2025 at 9:12:26 PM
For those not familiar with servlets, this means serializing every single request to the server that hits that servlet. And a single servlet can serve many different pages. In fact, in the early days, servlet filters didn't exist, so you would often implement cross-cutting functionality like authentication using a servlet.
TBF, I don't think a lot of developers at the time (90's) were used to the idea of having to write MT-safe callback code. Nowadays thousands of object allocations per second is nothing to sweat over, so a framework might make a different decision to instantiate callbacks per request by default.
by foobazgt
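The effect of synchronizing the whole request handler can be imitated in Python with one global lock. This is only an analogy, not servlet code: `big_lock` plays the role of Java's `synchronized`, `time.sleep` stands in for slow request work, and `run_requests` is a name made up for the sketch. It reports the peak number of requests that were in flight at the same moment.

```python
import threading
import time

big_lock = threading.Lock()       # plays the role of `synchronized`
counter_lock = threading.Lock()   # only protects the bookkeeping below


def run_requests(serialize, n_requests=4, work_seconds=0.1):
    """Run n_requests handler threads; return the peak number in flight at once."""
    in_flight = 0
    peak = 0

    def handler():
        nonlocal in_flight, peak
        with counter_lock:
            in_flight += 1
            peak = max(peak, in_flight)
        time.sleep(work_seconds)  # stands in for a slow database or network call
        with counter_lock:
            in_flight -= 1

    def synchronized_handler():
        with big_lock:  # every request queues here, one at a time
            handler()

    target = synchronized_handler if serialize else handler
    threads = [threading.Thread(target=target) for _ in range(n_requests)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak


peak_serialized = run_requests(serialize=True)
peak_concurrent = run_requests(serialize=False)
print(peak_serialized, peak_concurrent)
```

With the global lock the peak can never exceed one, so total latency grows linearly with the number of waiting requests, which matches the timeouts and spinners in the story.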
1/14/2025 at 2:39:40 AM
I am a little confused. He was intentionally sabotaging performance?
by liontwist
1/14/2025 at 7:56:53 AM
People write code that does sleep statements when waiting for something else to happen. It makes sense in some contexts. Think of it like async/await with an event loop. Except you are using the OS scheduler like your “event loop”. And you sleep instead of awaiting.
Now, if your language lacks the sleep statement or some other way to yield execution, what should you do instead when your program has no work to do? Actually, I don’t know what the answer is.
by aoanevdus
1/14/2025 at 2:57:53 PM
Later versions of VB (4 and later IIRC) did have a sleep function, though many didn't bother using it and kept with their DoEvents loops instead (which would allow their own UI updates to process but still kept the CPU pegged as much as their process could). With earlier versions you could actually call the windows sleep API. Whether using the OS sleep() or the built-in function (itself just a wrapper around the OS sleep() function), it was worth calling DoEvents a couple of times first to ensure any progress information you'd updated on your UI had been processed, so the user can see it.
by dspillett
1/14/2025 at 4:33:32 PM
Thanks for explaining.
(I disagree that you should be sleeping for any OS event, this is what blocking kernel events do automatically)
by liontwist
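The point about blocking instead of sleeping in a loop can be sketched in Python with `threading.Event`, used here as a stand-in for whatever blocking primitive the OS or runtime provides. The names `work_ready` and `worker` are invented for the example. The worker neither spins nor polls on a timer; it is parked until another thread signals it.

```python
import threading
import time

work_ready = threading.Event()
results = []


def worker():
    # Blocks until the event is set; no polling loop and no guessed sleep interval.
    work_ready.wait()
    results.append("processed")


t = threading.Thread(target=worker)
cpu_before = time.process_time()
t.start()
time.sleep(0.2)        # the worker is parked for this whole interval
cpu_while_blocked = time.process_time() - cpu_before
work_ready.set()       # wake the worker
t.join()
print(results, f"{cpu_while_blocked:.3f}s CPU while blocked")
```

Compared with a sleep-and-recheck loop, the worker wakes as soon as the work arrives rather than at the end of the next sleep interval.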
1/13/2025 at 12:48:11 PM
That's what managers do.
Silly idle process.
If you've got time for leanin', you've got time for cleanin'
by m463
1/13/2025 at 12:46:27 PM
I abandoned Windows 8 for Linux because of a bug (?) where my HDD was showing it was 99% busy all the time. I had removed every startup program that could be and analysed thoroughly for any viruses, to no avail. Had no debugging skills at the time and wasn't sure the hardware could stand Windows 10. That's how Linux got me.
by cassepipe
1/13/2025 at 4:16:45 PM
Recent Linux distributions are quickly catching up to Windows and macOS. Do a fresh install of your favorite distribution and then use 'ps' to look at what's running. Dozens of processes doing who knows what? They're probably not pegging your CPU at 100%, which is good, but it seems that gone are the days when you could turn on your computer and it was truly idle until you commanded it to actually do something. That's a special use case now, I suppose.
by ryandrake
1/13/2025 at 4:55:12 PM
IME on Linux the only things that use random CPU while idle are web browsers. Otherwise, there's dbus and NetworkManager and bluez and oomd and stuff, but most processes have a fraction of a second used CPU over months. If they're not using CPU, they'll presumably swap out if needed, so they're using ~nothing.
by ndriscoll
1/13/2025 at 7:14:53 PM
This is one of the reasons I love FreeBSD. You boot up a fresh install of FreeBSD and there are only a couple processes running and I know what each of them does / why they are there.
by craftkiller
1/13/2025 at 8:21:29 PM
At least under some circumstances Linux shows (schedulable) threads as separate processes. Just be aware of that.
by m3047
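A small Python sketch of why this happens: on Linux each thread of a process has its own kernel thread ID, listed under `/proc/<pid>/task`, and tools such as `ps -L` or `top -H` display one row per thread. The helper `list_thread_ids` is invented for this example and returns an empty list on systems without `/proc`.

```python
import os
import threading


def list_thread_ids():
    """Return this process's kernel thread IDs from /proc, or [] if unavailable."""
    task_dir = "/proc/self/task"  # Linux-specific; absent on other systems
    if not os.path.isdir(task_dir):
        return []
    return sorted(int(name) for name in os.listdir(task_dir))


stop = threading.Event()
workers = [threading.Thread(target=stop.wait) for _ in range(3)]
for w in workers:
    w.start()

tids = list_thread_ids()  # on Linux: the main thread plus the three workers
stop.set()
for w in workers:
    w.join()

print(f"pid={os.getpid()} thread ids={tids}")
```

So a long list of entries does not necessarily mean a long list of separate programs; several of them may be threads of one process.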
1/13/2025 at 4:56:10 PM
this is why I use arch btw
by johnmaguire
1/13/2025 at 6:04:14 PM
Add Gnome3 and you can have that too! Source: me, an arch+gnome user, who recently had to turn off the search indexer as it was stuck processing countless multi-GB binary files...
by diggan
1/13/2025 at 7:50:31 PM
Exactly, or Void, or Alpine, but I love pacman.
by johnisgood
1/13/2025 at 5:15:33 PM
this guy arches
by rirze
1/14/2025 at 12:52:19 AM
I recommend using systemd-cgls to get a better idea of what's going on.
by ciupicri
1/13/2025 at 4:01:13 PM
Why is this such a huge issue if it merely shows it's busy, but the performance of it indicates that it actually isn't? Switching to Linux can be a good choice for a lot of people, the reason just seems a bit odd here. Maybe it was simply the straw that broke the camel's back.
by margana
1/13/2025 at 5:03:31 PM
1. I expect that a HD that is actually doing things 100% of the time is going to have its lifespan significantly reduced, and
2. If it isn't doing anything and it's just lying to you... when there IS a problem, your tools to diagnose the problem are limited because you can't trust what they're telling you
by RHSeeger
1/13/2025 at 5:58:09 PM
Over the years I have used top and friends to profile machines and identify expensive bottlenecks. Once one comes to count on those tools, the idea of one being wrong, and actually really wrong! --is just a bad rub.
Fixing it would be gratifying and reassuring too.
by ddingus
1/13/2025 at 2:00:37 PM
I had this happen with an nvme drive. Tried changing just about every setting that affected the slot.
Everything worked fine on my Linux install ootb
by saintfire
1/13/2025 at 4:50:23 PM
Windows 8/8.1/10 had an issue for a while where when it was run on a spinning rust HDD it would peg it out and slow the system to a crawl.
The only solution was to swap over to an SSD.
by BizarroLand
1/13/2025 at 3:38:33 PM
[dead]
by dr-detroit
1/13/2025 at 5:26:26 PM
To be fair, it is a really poorly named "process". The computer equivalent of the "everything's ok" alarm.
by nullhole
1/13/2025 at 5:39:57 PM
Long enough ago (win95 era) it wasn't part of Windows to sleep the CPU when there was no work to be done. It always assigned some task to the CPU. The system idle process was a way to do this that played nicely with all of the other process management systems. I don't remember when they finally added CPU power management. SP3? Win98? Win98SE? Eh, it was somewhere in there.
by chowells
1/13/2025 at 6:49:30 PM
I remember listening on FM radio to my 100MHz computer running FreeBSD, which sounded like calm rain, and to Windows 95, which sounded like a screaming monster.
by drsopp
1/14/2025 at 3:39:11 AM
There were a number of hacks to deal with this. RAIN was very popular back in the day, but AMNHLTM appears to have better compatibility with modern CPUs.
by eggsome
1/13/2025 at 6:42:11 PM
Reminds me of when I was a kid and noticed a virus had taken over a registry. From that point forward I attempted to delete every single registry file, not quite understanding. Between that and excessive bad website viewing, I dunno how I ever managed to not brick my operating system, unlike my grandma who seemed to brick her desktop in a timely fashion before each of the many monthly visits to her place.
by Agentus
1/13/2025 at 10:17:38 PM
The things grandmas do to see their grandsons regularly. Smart. :-)
by bornfreddy
1/13/2025 at 9:27:29 PM
I worked at a government site with a government machine at one time. I had an issue, so I took it to the IT desk. They were able to get that sorted, but then said I had another issue. "Your CPU is running at 100% all the time, because some sort of unkillable process is consuming all your cpu".
Yep, that was "System Idle" that was doing it. They had the best people.
by jsight
1/13/2025 at 12:20:00 PM
Did he have a pointy hair?
by belter
1/13/2025 at 7:44:12 PM
I wonder if, by making a process with "idle" in its name, you could end up in the reverse situation where users ignore it. Is there anything preventing an executable from being named System Idle?
by mrmuagi
1/13/2025 at 9:32:11 PM
You're keeping us in suspense. Did he ever manage to kill the System Idle process?
by kernal
1/13/2025 at 12:58:09 PM
Windows used to have that habit of making the processes CPU starved, and yet claiming the CPU was idle all the time.
Since the Microsoft response to the bug was denying and gaslighting the affected people, we can't tell for sure what caused it. But several people were in a situation where their computer couldn't finish any work, and the task-manager claimed all of the CPU time was spent on that line item.
by marcosdumay
1/13/2025 at 8:13:06 PM
As a former Windows OS engineer, based on the short statement here, my assumption would be that your programs are IO-bound, not CPU-bound, and that the next step would be to gather data (using a profiler) to investigate the bottlenecks. This is something any Win32 developer should learn how to do.
Although I can understand how "Please provide data to demonstrate that this is an OS scheduling issue since app bottlenecks are much more likely in our experience" could come across as "denying and gaslighting" to less experienced engineers and layfolk.
by nerdile
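A much cruder first check than a real profiler, sketched in Python: compare the CPU time a piece of work consumes with the wall-clock time it takes. A ratio near 1 suggests CPU-bound work, while a ratio near 0 means the program mostly waited, which is how a machine can feel stuck while the CPU reads as idle. The names `cpu_fraction`, `compute_heavy` and `wait_heavy` are invented for the example, and `time.sleep` only simulates waiting on disk or network.

```python
import time


def cpu_fraction(fn):
    """Run fn and return CPU time divided by wall-clock time for the call."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    fn()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu / wall


def compute_heavy():
    # Pure computation: the process is on the CPU nearly the whole time.
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total


def wait_heavy():
    # time.sleep stands in for a slow disk or network call: wall time passes,
    # but almost no CPU time is charged to the process.
    time.sleep(0.2)


cpu_bound_ratio = cpu_fraction(compute_heavy)
io_bound_ratio = cpu_fraction(wait_heavy)
print(f"compute: {cpu_bound_ratio:.2f}, waiting: {io_bound_ratio:.2f}")
```

A low ratio does not say what the program is waiting on; finding that out is the job of the profiling step suggested above.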
1/14/2025 at 1:01:25 AM
I'm not the original poster, but I ran into something similar late in Win 7 (Win 8 was in beta at the time). We had some painting software, and we used OpenMP to work on each scan-line of a brush in parallel.
It worked fine on Mac. On Windows though, if you let it use as many threads as there were CPUs, it would nearly 100% of the time fail before making it through our test suite. Something in scheduling the work would deadlock. It was more likely to fail if anything was open besides the app. Basically, a brush stroke that should complete in a tenth of a second would stall. If you waited 30-60 minutes (yes minutes), it would recover and continue.
I vaguely recall we used the Intel compiler implementation of OpenMP, not what comes with MSVC, so the fault wasn't necessarily a Microsoft issue, but could still be a kernel issue.
I left that company later that year, and MS rolled out Windows 8. No idea how long that bug stuck around.
by 1000100_1000101
1/13/2025 at 3:36:57 PM
> Since the Microsoft response to the bug was denying and gaslighting the affected people
Well. I wouldn't go that far. Any busy dev team is incentivized to make you run the gauntlet:
1. It's not an issue (you have to prove to me it's an issue)
2. It's not my issue (you have to prove to me it's my issue)
3. It's not that important (you have to prove it has significant business value to fix it)
4. It's not that time sensitive (you have to prove it's worth fixing soon)
It was exactly like this at my last few companies. Microsoft is quite a lot like this as well.
If you have an assigned CSAM, they can help run the gauntlet. That's what they are there for.
See also: The 6 stages of developer realization:
https://www.amazon.com/Panvola-Debugging-Computer-Programmer...
by RajT88
1/13/2025 at 4:37:57 PM
Even when you have an expensive contract with Microsoft and a direct account manager to help you run the gauntlet you still end up having to deal with awful support people.
Years ago at a job we were seeing issues with a network card on a VM. One of my coworkers spent 2-3 days working his way through support engineer after support engineer until they got into a call with one. He talked the engineer through what was happening. Remote VM, can only access over RDP (well, we could VNC too, but that idea just confuses Microsoft support people for some reason.)
The support engineer decided that the way to resolve the problem was to uninstall and re-install the network card driver. Coworker decided to give the support engineer enough rope to hang themselves with, hoping it'd help him escalate faster: "Won't that break the RDP connection?" "No sir, I've done this many times before, trust me" "Okay then...."
Unsurprisingly enough, when you uninstall the network card driver and cause the instance to have no network cards, RDP stops working. Go figure.
Co-worker let the support engineer know that he'd now lost access, and a guess why. "Oh, yeah. I can see why that might have been a problem"
Co-worker was right though, it did finally let us escalate further up the chain....
by Twirrim
1/15/2025 at 2:44:14 AM
But was it fixed after the driver reinstall?
by brokenmachine
1/13/2025 at 4:02:56 PM
>If you have an assigned CSAM
That's an unfortunate acronym. I assume you mean Customer Service Account Manager.
by ziddoap
1/13/2025 at 4:42:00 PM
Customer Success Account Manager. And I would agree - it is very unfortunate.
Definitely in my top 5 questionable acronym choices from MSFT.
by RajT88
1/14/2025 at 7:05:08 AM
That 1 to 4 gauntlet sounds awfully close to: https://youtube.com/watch?v=nb2xFvmKWRY
by robocat
1/13/2025 at 4:05:05 PM
Your reticence to accept the term gaslighting clearly indicates you've never had to interact with MSFT support.
by thatfunkymunki
1/13/2025 at 4:41:16 PM
On the contrary, I have spent thousands of hours interacting with MSFT support.
What I'm getting at with my post is the dev teams support has to talk to, which they just forward along their responses verbatim.
A lot of MSFT support does suck. There are also some really amazing engineers in the support org.
I did my time in support early in my career (not at MSFT), and so I understand well it's extremely hard to hire good support engineers, and even harder to keep them. The skills they learn on the job make them attractive to other parts of the org, and they get poached.
There is also an industry-wide tendency for developers to treat support as a bunch of knuckle-dragging idiots, but at the same time they don't arm them with detailed information on how stuff works.
by RajT88
1/13/2025 at 5:09:11 PM
> What I'm getting at with my post is the dev teams support has to talk to, which they just forward along their responses verbatim.
But the "support" that the end user sees is that combination, not two different teams (even if they know it's two or more different teams). The point is that the end user reached out for help and was told their own experiences weren't true. The fact that Dave had Doug actually tell them that is irrelevant.
by RHSeeger
1/13/2025 at 5:27:13 PM
I guess I see your point.
If we're going to call it gaslighting, then gaslighting is typical dev team behavior, which of course flows back down to support. It's a problem with Microsoft just like it is a problem for any other company which makes software.
by RajT88
1/13/2025 at 6:48:54 PM
I've never seen the same behavior from any other software supplier.
Almost every software company out there will jump into their customers' complaints, and try to fix the issue even when the root cause is not in their software.
by marcosdumay
1/13/2025 at 7:00:18 PM
I can't say I've seen it with every vendor. Or even internal dev team I've been an internal customer of - but I've seen it around a lot.
You might be lucky in that you've worked at companies where you are a big enough customer they bend over backwards for you. For example: If you work for Wal-Mart, you probably get this less often. They are usually the biggest fish in whatever pond they are swimming in.
by RajT88
1/13/2025 at 1:47:50 PM
I've never heard of this. How do you know it's windows "gaslighting" users, and not something dumb like thermal throttling or page faults?
by gruez
1/13/2025 at 2:07:31 PM
Well this is one possible scenario. Power management....
"Windows 10 Task Manager shows 100% CPU but Performance Monitor Shows less than 2%" - https://answers.microsoft.com/en-us/windows/forum/all/window...
by belter
1/13/2025 at 3:00:59 PM
It's gaslighting because it consists of people from Microsoft explicitly saying that it is impossible, it's not how Windows behaves, and the user's system is idle instead of overloaded.
Gaslighting customers was Microsoft's standard reaction to bugs until at least 2007, when I last oversaw somebody interacting with them.
by marcosdumay
1/13/2025 at 6:39:28 PM
To be fair, there are worse mistakes. It does say 99% CPU.
by fifilura
1/13/2025 at 12:19:25 PM
[dead]
by TacticalCoder