This is a disaster (for Intel)

LMG Clips
24 Jul 2024 · 08:33

Summary

TLDR: In this video, the discussion revolves around Intel's 13th and 14th generation chips facing significant stability issues, leading MMO publisher Alderon Games to switch to AMD servers. Despite initially working well, these Intel chips deteriorated, resulting in nearly 100% failure rates. The video delves into the potential recall implications, the immense cost and time required for Intel to fix these issues, and the broader impact on Intel's reputation, particularly with major clients like Dell and the U.S. government. The video underscores the serious challenges Intel faces in addressing these hardware problems.

Takeaways

  • 🚨 MMO publisher Alderon Games reports high failure rates in Intel's 13th and 14th gen chips, leading them to switch to AMD.
  • 🖥️ Alderon Games initially used Intel Core i7 and i9 chips for servers, which is not uncommon in the gaming industry despite traditional preferences for Xeon and EPYC.
  • ⚠️ The failure rates for these Intel chips are nearly 100%, with higher-end models (KS series) experiencing the worst issues.
  • 🔍 The failures appear to be hardware-related, as noted by tech reviewer Wendell from Level1Techs.
  • 🔧 A recall may be necessary if Intel cannot fix the issue, posing significant logistical and financial challenges.
  • 🏭 Fixing the hardware problem would take months, as it would require redirecting resources from future chip designs to the current faulty models.
  • 💼 The impact on major clients like government and large corporations could be substantial, as they rely heavily on stable and reliable server components.
  • 📉 The reputational damage for Intel extends beyond consumer trust, affecting relationships with large OEMs like Dell and HP.
  • 🕑 Intel is currently working on a fix for an ETB bug contributing to the instability, but there's no solution for the root problem yet.
  • 🤔 One interim solution might be for Intel to keep replacing defective chips under warranty, although this is not sustainable long-term.

Q & A

  • What company is experiencing issues with Intel's 13th and 14th gen chips?

    -Alderon Games is experiencing issues with Intel's 13th and 14th gen chips.

  • What action is Alderon Games taking in response to the stability issues?

    -Alderon Games is switching from Intel to AMD for their servers.

  • What specific problem is Alderon Games encountering with Intel's chips?

    -Alderon Games is encountering extremely high failure rates with Intel's chips, primarily due to errors from the chips.

  • Why is it not uncommon for game servers to use desktop chips like Core i7 and Core i9?

    -It's not uncommon because performance and value might be more important than features like ECC memory for utmost stability in game servers.

  • Which Intel chip models are reported to have the most issues?

    -The KS models of Intel chips are reported to have the most issues.

  • What potential solution is discussed for the chip failures?

    -A potential solution discussed is a recall by Intel, but it's complicated by the long time required to fix and manufacture new chips.

  • Why would a recall of the defective chips be a significant challenge for Intel?

    -A recall would be a significant challenge because it would require pulling R&D teams from future projects and take months to manufacture enough replacements.

  • What is the concern for large institutions like Dell or the U.S. government regarding the defective Intel chips?

    -The concern is the monumental cost and disruption caused by the defective parts in their systems.

  • What is the reputational risk for Intel if they cannot resolve the chip issues?

    -The reputational risk for Intel is losing trust with major customers like Dell, HP, and large institutions, who rely on Intel's historically reliable performance.

  • What is Intel's current status in addressing the root problem of the chip instability?

    -Intel is currently working on a fix for the ETB bug contributing to instability but has not yet indicated a fix for the root problem.

Outlines

00:00

💻 Alderon Games Switches to AMD Due to Intel Chip Issues

Alderon Games announces a shift from Intel's 13th and 14th gen processors to AMD, citing significant stability issues. The decision follows high failure rates in its servers, which run Intel Core i7 and i9 chips rather than the enterprise parts that vendors tend to steer clients towards. The video highlights how using desktop chips in servers is more common than expected, especially for game servers where performance and value are prioritized over features like ECC memory. The most affected Intel models are the 'K' series chips, which are experiencing the worst failures. The lower core count and lower clocked SKUs are less impacted. Wendell from Level1Techs suggests the problem looks like a hardware issue, and the hosts note that Intel might need to issue a recall if it can't resolve the problem.

05:00

🛠 The Complexity of Fixing Intel's Defective Chips

Fixing the defective Intel chips presents a significant challenge due to the long development cycle of modern processors. Even if Intel starts working on a hardware fix immediately, it would take months to produce a single unit and much longer to manufacture enough to replace all faulty chips. This delay poses a substantial problem, especially for large customers like the US government or major OEMs like Dell and Lenovo, who have designed systems around these defective parts. Intel is currently addressing an ETB bug contributing to the instability but has not yet fixed the root problem. The reputational damage to Intel is concerning, particularly with the major institutions whose trust has traditionally helped Intel hold its server market share.

Keywords

💡Intel 13th and 14th Gen Chips

These are Intel's 13th and 14th generation Core processors, the company's latest desktop generations at the time of the video. In the video, these chips are discussed in the context of their stability issues and high failure rates when used in servers by Alderon Games. The instability and potential defects in these processors are central to the video's theme of reliability in computing hardware.

💡Alderon Games

Alderon Games is a company mentioned in the video that has been experiencing issues with Intel's 13th and 14th generation processors in their servers. They reported high failure rates with these chips, leading them to switch to AMD processors. This example illustrates the practical implications of hardware reliability in the gaming industry.

💡Server Stability

Server stability refers to the consistent and reliable operation of server hardware and software. In the video, the instability of Intel's processors is highlighted as a significant problem for Alderon Games, causing operational disruptions. This concept is crucial as it underscores the importance of reliable hardware for companies relying on server-based operations.

💡Core i7 and Core i9

These are high-performance processors from Intel, often used in both consumer desktops and professional workstations. The video discusses Alderon Games' use of these consumer-grade CPUs in server environments, a role that would more typically be filled by server-specific processors like Intel's Xeon or AMD's EPYC.

💡Xeon and EPYC

Xeon and EPYC are Intel's and AMD's server-grade processors, respectively. These processors are designed for high reliability and performance in demanding server environments. The video contrasts these with consumer-grade processors, highlighting that companies like Alderon Games sometimes choose consumer CPUs for cost or performance reasons, despite potential stability trade-offs.

💡Hardware Recall

A hardware recall is an action taken by a manufacturer to address a widespread defect in a product, often involving the return or replacement of the defective items. The video speculates on the possibility of Intel needing to issue a recall for the faulty processors if a fix cannot be found, illustrating the potential scale and impact of such an issue on both consumers and large institutions.

💡R&D (Research and Development)

R&D involves the activities companies undertake to innovate and introduce new products or services. The video mentions Intel's R&D teams working on future processors, emphasizing the lengthy development cycles and the challenges of diverting resources to address issues with current products. This underscores the complexity and time investment involved in producing cutting-edge technology.

💡ETB (eTVB) Bug

"ETB" is how the captions and this summary render eTVB (Enhanced Thermal Velocity Boost), a boost feature on Intel's high-end desktop processors. The video mentions that Intel is working on a fix for a bug in this feature that it says contributed to the instability, while noting there is no indication that the company has a fix for the root problem. The bug is not detailed further in the video.

💡Consumer and Enterprise Impact

This concept refers to the differing effects that hardware issues can have on individual consumers versus large organizations or businesses. The video highlights concerns about Intel's reputation among major clients like the U.S. government and large OEMs (Original Equipment Manufacturers) such as Dell, compared to consumer market perceptions. This distinction is important for understanding the broader implications of hardware reliability issues.

💡Brand Reputation

Brand reputation is the public perception of a company or product. In the video, the potential damage to Intel's reputation is discussed, particularly concerning its reliability in the server market. The discussion includes how significant hardware issues can tarnish a company's reputation, affecting its relationships with large customers and influencing future business decisions.

Highlights

Alderon Games is switching its Intel 13th and 14th gen servers to AMD due to significant stability issues.

Intel's 13th and 14th gen chips are reported to have serious problems, with nearly 100% failure rates in some cases.

The chips initially worked fine but later deteriorated, with the failures overwhelmingly attributed to errors from the Intel chips.

There is a discussion on why Alderon Games is using Core i7s and Core i9s for servers instead of typical server-grade chips.

Game servers might prioritize performance and value over running ECC memory for utmost stability.

In Alderon's testing, lower core count and lower clocked SKUs seemed to be impacted less by the failures.

Level1Techs suggests that the issue looks like a hardware problem, which might necessitate a recall by Intel.

If Intel needs to fix the problem in hardware, it would require significant time and resources, pulling teams off future products.

Fixing the issue would be enormously costly for Intel, potentially taking months or more to manufacture new units.

The cost of defective parts is monumental for large customers like the United States government or OEMs like Dell.

Intel is currently working on a fix for the ETB bug that contributes to the instability but has no solution for the root problem.

Intel might have to rely on warranty replacements until the warranties expire, hoping to minimize the brand damage.

The reputational damage with major institutions like Dell and HP is more concerning than consumer brand damage.

Intel's server market share is partly maintained by the perception that choosing Intel is a safe decision.

This hardware issue might change the perception that 'you never get fired for buying Intel,' impacting Intel's reputation.

Transcripts

00:00

MMO publisher says Intel is selling defective chips. Yeah, Alderon Games says that it is switching its Intel 13th and 14th gen servers, and we touched on this last week, when I hadn't watched the Level1Techs video yet, but we did talk about how Wendell from Level1Techs has been talking about how Intel's 13th and 14th gen chips are having some serious problems. Well, Alderon says that they are switching them out for AMD following significant stability issues. According to Alderon, their servers have had extremely high failure rates, which is overwhelmingly due to errors from Intel chips.

00:39

And I can hear you already asking: why the devil is Alderon Games using Core i7s and Core i9s for servers? That's not as crazy as you might think. While Intel and AMD have at times certainly, and especially Nvidia, have certainly put pressure on barebones chassis manufacturers and solutions providers to steer their clients toward Xeon and EPYC and, I don't know, what's Nvidia's branding even? It's not Tesla anymore, but whatever. What's their branding for their GPUs, these H, GRID, who knows. But their enterprise solutions, while they certainly steer them, sometimes those enterprise solutions don't necessarily make the most sense. And in the case of something like a game server, where performance and value might be more meaningful to you than running ECC memory for the utmost stability, it could make sense to deploy just, like, 1U game servers, or even, I think I've seen ones that are more like four in a 1U. Or, oh man, what was that cool one that we looked at from Supermicro? It was about a year ago, but it was a bunch of just, like, Ryzen in little blades. Super cool. Anyway, the point is that it is way more common than you would think to run desktop chips.

02:17

So let's have a look at what the breakdown is here. In Alderon's testing, the chips initially worked fine but later deteriorated, failing at a rate of nearly 100%. And you can see here which models seem to be the most likely to be affected, although we don't know what exactly the mix is of their deployment. So the Ks, and this is already something that has been touched on, Ks seem to be experiencing the worst failures. The lower core count, lower clocked SKUs seem to be impacted less.

02:58

And according to Level1Techs, this looks like hardware. We talked about this last week, about how, you know, Intel could need to do a recall here if they can't figure out how to fix this.

Good Lord, what would this look like? Yeah, I have no idea. I don't think in my time as a PC enthusiast I have seen such a high ... are always rock solid. Like, well, here's the exception that proves the rule. Well, no, there's been stuff. I thought Coppermine was a huge disaster. That Pentium 3? Yeah, that's what I'm saying, like, in pretty much... like, that was before I was in high school. That's 25 years ago. It's been a decent assumption that your CPU is, like, not really the problem for a long time.

03:49

Absolutely freaking wild, man. I just, yeah, I have absolutely no idea what this would look like, because you've got to go a step beyond just, you know, "oh well, Intel takes the chips back," right? Like, that's a recall, right? No. What are they going to issue people in exchange? Like, what are they going to give them? Now, these are an inherent flaw. Okay, let's look at it from a couple of different angles. They can't, like, recall the car, replace the airbag and give you the car back. Yeah. Let's say I'm an end user customer, okay? I bought a motherboard from Gigabyte and some Crucial memory and a Seasonic power supply. I've got my computer here, right, and I got this bad Intel chip and I need a new one. Intel, here you go, you got my chip. Well, now what? They send back a fixed one?

04:45

Well, no. Intel's R&D teams, okay, are working on CPUs that are coming out next year, the year after, and the year after that. That gives you some idea of how long you're working on a chip before it ever sees the light of day. Just the spin-up time, if you watched our Intel fab tour, the spin-up time from "okay, we've got the design finished, at least in principle, and now we need a physical one in order to try to power it on" is, I forget if it was weeks or months, but it's a long flipping time because of all of the different steps that are required to manufacture a modern processor.

05:27

For Intel to fix this in hardware, this is enormously costly. They would have to pull teams that are working on future products and get them essentially back on the CPU design of 13th and 14th gen, and even if they did that yesterday, it would be months before they would have a single unit to ship to anyone. And when you consider how much of their manufacturing capacity would have been used over the last couple of years to ship all the chips that they had, it would be months or maybe even a year plus before they could actually manufacture enough of these bloody things in order to issue all the fixes.

06:16

So in the meantime, imagine you are not a gamer who just bought one chip and put it in your computer and are kind of going, "where the devil's my chip?" Now imagine that you're a customer like the United States government or Dell. Your expectations are now completely different, and there's a huge cost to you, especially if you're an OEM like a Dell or Lenovo or HP. The cost to you of these systems that you've designed around this now defective part is monumental.

06:47

Good luck, Intel. It's going to be rough. Yep. Intel is currently working on a fix for the ETB bug it says contributed to the instability, but there's no indication that the company has a fix for the root problem, says the last of our notes. Well, cool.

07:08

I mean, I guess, yeah, one option is they can just keep sending you more of them as long as you're under warranty, and then hope that everyone's warranty expires at some point and that this doesn't leave too bad of a stain on their brand. I think it's a pretty big stain on the brand. I think that's probably more realistic.

07:32

And the stain on the brand to consumers is not the one that I'm worried about if I'm Intel. Like, yeah, sure, whatever. The amount of cares for consumer sales right now has never been that... and, like, gamers already mostly kind of hate them anyway, so realistically, what difference does it make? It's the reputational damage with the Dells and the HPs and the large institutions that I'm really worried about, because a big part of why Intel has managed to maintain the kind of, for example, server market share that they have is because you never get fired for buying Intel. Not because their performance has necessarily been better. There are other reasons. You know, Intel makes investments that AMD traditionally hasn't, because they haven't had the resources for it, in terms of deployments and management and stuff like that. But a big part of it is you don't get fired for buying Intel. And now you might.


Related Tags
Intel, AMD, Server Issues, Chip Recall, Alderon Games, Tech News, CPU Stability, Gaming Servers, Enterprise Solutions, Hardware Failure