Dual 3090Ti Build for 70B AI Models

Ominous Industries
16 Mar 2024, 20:39

Summary

TLDR: In this video, the creator upgrades their PC by adding a refurbished Nvidia 3090 Ti to an existing setup to build a more capable large language model (LLM) machine. They test the new card on its own, fit both cards and a new power supply into the old case, and work through cable management and memory problems. The video ends with both GPUs running a 70B model together and plans for a future rebuild with a new motherboard and case.

Takeaways

  • Micro Center had refurbished 3090 Ti cards for $799 and refurbished 3090 Founders Edition cards for $699; the user bought a 3090 Ti.
  • The user wants to enhance their LLM (Large Language Model) machine with a second card to run larger models.
  • The refurbished graphics card came boxed with a power adapter.
  • The user's current PC case came from a Velocity Micro pre-built bought at Building 19 around 2008 or 2009, and has been through multiple iterations of use.
  • The user is not particularly skilled or concerned with cable management, which is evident in their build.
  • The user tests the new card by swapping it with the old one and performing a quick benchmark test.
  • The user mentions that the power supply unit (PSU) is mounted at the top, which is not common in newer cases.
  • The user's case has an unusual feature: aluminum wheels similar to those sold for newer Apple computers.
  • The user plans to transfer all components into a larger case but first tests the new card in the current setup.
  • The user hits an out-of-memory error when trying to load a 70B model and resolves it by increasing the swap file size.
  • The user successfully tests the new setup with two cards, reaching about 16.9 tokens per second with the 70B 4-bit quantized model.
  • The user acknowledges the need for a new motherboard and processor to run both cards at full x16 because of PCIe lane limitations.

Q & A

  • What refurbished graphics cards were available at Micro Center?

    -Micro Center had refurbished 3090 Ti cards and refurbished 3090 Founders Edition cards.

  • What was the price difference between the refurbished 3090 Ti and the regular 3090 at Micro Center?

    -The refurbished 3090 Ti was priced at $799, while the regular 3090 was priced at $699, a difference of $100.

  • Why does the speaker want to add another 3090 TI to their setup?

    -The speaker wants to add another 3090 TI to create a more potent LLM (Large Language Model) machine, allowing them to run larger models.

  • What does the refurbished graphics card come with according to the script?

    -The refurbished graphics card comes with a power adapter.

  • Where did the speaker originally purchase the case they are using?

    -The speaker bought the case from a local store in the New England area called Building 19.

  • What was the original purpose of the case before it became the speaker's main PC case?

    -The case was originally purchased as a fully built computer called Velocity Micro from Building 19 and has gone through various iterations, including mining crypto.

  • Why is the speaker considering getting a new motherboard?

    -The speaker is considering a new motherboard because the current processor does not have enough PCIe lanes to run both graphics cards at full x16.

  • What issue did the speaker encounter when trying to load the 70b model?

    -The speaker encountered an out-of-memory issue when trying to load the 70b model, despite having the correct amount of video RAM.

  • What did the speaker do to resolve the out-of-memory issue for the 70b model?

    -The speaker increased the size of their swap file to resolve the out-of-memory issue.

  • What is the estimated tokens per second speed for the model running on the dual GPU setup?

    -The estimated tokens per second speed for the model running on the dual GPU setup is around 16.9 tokens per second.

  • What is the speaker's plan for the current PC build after the test?

    -The speaker plans to temporarily use the current build until they get a larger motherboard and then redo the entire build with proper cable management.

Outlines

00:00

Upgrading to a More Potent LLM Machine

The video script describes the process of upgrading a personal computer to enhance its capabilities for running large language models (LLMs). The narrator buys a refurbished 3090 Ti from Micro Center, which also had refurbished 3090 Founders Edition cards, intending to add it to their existing setup. They test the new card to confirm it works before any move to a larger case. The script also reflects on the history of the current PC case, which was bought as a pre-built system from Building 19 and has seen various uses over the years. The narrator acknowledges their lack of cable management skills and proceeds to swap the old card with the new one, aiming to test it thoroughly before integrating it into their main setup.

05:01

Integrating Dual GPUs in a Compact Case

This paragraph details the attempt to fit two powerful graphics cards into a single case, despite the challenges of limited space and potential airflow issues. The narrator discusses whether the motherboard can support both GPUs, the process of physically fitting them into the case, and the concern that the new, larger power supply unit (PSU) might not fit or might sag under its own weight. They also touch on the aesthetic preference for an older, worn-out look in their builds, as opposed to modern RGB setups. After installing the cards and confirming that both are recognized, the narrator considers a future move to a new motherboard and processor with more PCIe lanes, so that the lanes need not be bifurcated, along with better airflow.
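The "both cards are recognized" check described above can be sketched in code. The video only shows this step on screen, so the following is an assumption about one way to do it rather than the narrator's actual method: it calls `nvidia-smi` with its CSV query flags and parses the result. The `SAMPLE` text is made-up illustrative output, with memory figures chosen to roughly match the 23 GB and 19 GB the narrator reads off later.

```python
import csv
import io
import subprocess

# Fields requested from nvidia-smi, in output column order.
QUERY_FIELDS = ["index", "name", "memory.used", "memory.total", "power.draw"]


def _to_float(value):
    """Return a float, or None when nvidia-smi reports a non-numeric value."""
    try:
        return float(value)
    except ValueError:
        return None


def parse_gpu_csv(text):
    """Parse `--format=csv,noheader,nounits` output into one dict per GPU."""
    gpus = []
    for record in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(record) != len(QUERY_FIELDS):
            continue  # skip blank or malformed lines
        index, name, used, total, power = (field.strip() for field in record)
        gpus.append({
            "index": int(index),
            "name": name,
            "memory_used_mib": _to_float(used),
            "memory_total_mib": _to_float(total),
            "power_draw_w": _to_float(power),
        })
    return gpus


def query_gpus():
    """Run nvidia-smi on this machine and return the parsed GPU list."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_gpu_csv(result.stdout)


# Hypothetical sample output for two cards (values are illustrative only).
SAMPLE = (
    "0, NVIDIA GeForce RTX 3090 Ti, 23552, 24564, 310.52\n"
    "1, NVIDIA GeForce RTX 3090 Ti, 19456, 24564, 295.10\n"
)

if __name__ == "__main__":
    for gpu in parse_gpu_csv(SAMPLE):
        print(gpu["index"], gpu["name"], gpu["memory_used_mib"], "MiB used")
```

On a machine with the NVIDIA driver installed, `query_gpus()` would return one entry per detected card, so a result of length two is the quick confirmation that the motherboard sees both.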

10:04

Testing and Troubleshooting the New Setup

The narrator proceeds with testing the new dual-GPU setup, initially facing issues with memory limitations when attempting to load a large model. They address this by increasing the size of their swap file, which allows for more temporary storage space. After successfully loading the model, they transfer the system onto a 512GB SSD and test the functionality of the new setup through a text generation web UI. The script highlights the increased intelligence and speed of the model compared to previous versions, noting a real-time, readable output at 16 tokens per second. The narrator also discusses the need for better cable management and plans for a future rebuild in a new case with improved design.
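The swap file fix can be made more concrete with a small check of host memory. The video does not show the commands used or explain why the loader ran out of memory; one plausible reading, stated here as an assumption, is that the loader stages weights in system RAM before copying them to the GPUs, so free RAM plus free swap has to cover that staging. The sketch below parses Linux `/proc/meminfo` text; the `SAMPLE` values are invented for illustration.

```python
def parse_meminfo(text):
    """Return /proc/meminfo fields as a dict of integer KiB values."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def host_headroom_gib(info):
    """Available RAM plus free swap, converted from KiB to GiB."""
    kib = info.get("MemAvailable", 0) + info.get("SwapFree", 0)
    return kib / (1024 ** 2)


def read_host_meminfo(path="/proc/meminfo"):
    """Read and parse the live file (Linux only)."""
    with open(path, encoding="utf-8") as handle:
        return parse_meminfo(handle.read())


# Hypothetical machine: 20 GiB of RAM available, 2 GiB of swap free.
SAMPLE = """MemTotal:       32768000 kB
MemAvailable:   20971520 kB
SwapTotal:       2097152 kB
SwapFree:        2097152 kB
"""

if __name__ == "__main__":
    headroom = host_headroom_gib(parse_meminfo(SAMPLE))
    print(f"RAM + swap headroom: {headroom:.1f} GiB")
```

With these sample numbers the headroom is well below the roughly 40 GB model download mentioned in the video, which is the kind of gap a larger swap file would close.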

15:07

Reassembling and Optimizing the PC Build

In this paragraph, the narrator focuses on the reassembly of their PC after the successful testing of the dual-GPU setup. They mention the intention to use the HDMI port from the motherboard to free up the graphics cards for their primary task. The script describes the process of reassembling the case, including the challenges of cable management and the use of zip ties to organize the cables. The narrator also discusses the need for a more permanent solution involving a larger motherboard and improved airflow, acknowledging that the current setup is a temporary one. They conclude by expressing satisfaction with the outcome of their weekend project and the functionality of the upgraded system.

20:13

Final Touches and Aesthetic Considerations

The final paragraph of the script discusses the final steps in reassembling the PC case and the aesthetic considerations involved. The narrator ensures that both graphics cards are visible within the case, despite the limited space, and expresses satisfaction with the visual outcome. They mention the difficulty in seeing both cards clearly but are pleased with the overall look. The script concludes with the successful reassembly of the case, with a focus on the aesthetic appeal of the build, and the anticipation of future improvements with a new case and motherboard.

Keywords

Micro Center

Micro Center is a well-known American retailer specializing in computers, electronics, and related products. In the video, the narrator mentions that Micro Center was selling refurbished 3090 Ti and 3090 graphics cards, and it is where they bought the 3090 Ti for this upgrade.

Refurbished

Refurbished refers to products that have been previously used, returned, or repaired and then certified for sale again. In the context of the video, the narrator bought refurbished graphics cards, which are likely to be more affordable than new ones, and still suitable for their intended use in a more potent machine setup.

3090 Ti

The 3090 Ti is a high-end graphics card model, part of NVIDIA's lineup, known for its powerful performance in gaming and professional applications like machine learning and image generation. The video's theme revolves around enhancing the narrator's existing setup by adding another 3090 Ti to improve computational capabilities.

Founders Edition

Founders Edition is NVIDIA's name for the graphics cards it designs and sells under its own brand, as opposed to cards built by board partners. In the script, the narrator mentions that Micro Center also had refurbished 3090 Founders Edition cards, a cheaper option alongside the 3090 Ti.

LLM (Large Language Model)

A Large Language Model (LLM) is a type of artificial intelligence that processes and generates human-like text based on input data. The video's narrator is interested in creating a more potent LLM machine, which is why they are upgrading their hardware with additional powerful graphics cards.

PCIe Lanes

PCIe Lanes refer to the data channels on a computer's motherboard that connect to peripheral devices like graphics cards. The script discusses the limitation of the current motherboard's PCIe lanes, which affects the performance of the graphics cards when used in tandem.

Bifurcated

In the context of PCIe lanes, bifurcation means splitting a set of lanes among multiple devices, for example running two cards at x8 each instead of one card at x16, which halves the link bandwidth available to each card. The narrator notes that the current processor does not have enough PCIe lanes to run both graphics cards at x16.
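The cost of sharing lanes can be put into numbers. This is a back-of-the-envelope sketch, not something shown in the video: it uses the published per-lane transfer rates for PCIe 3.0 and 4.0 and their 128b/130b line encoding to estimate one-direction link bandwidth. The 3090 Ti is a PCIe 4.0 card, but which generation the narrator's motherboard runs is not stated, so both are included.

```python
# Raw transfer rate per lane in gigatransfers per second.
TRANSFER_RATE_GT_S = {3: 8.0, 4: 16.0}

# PCIe 3.0 and 4.0 both use 128b/130b encoding: 128 payload bits per 130 sent.
ENCODING_EFFICIENCY = 128 / 130


def link_bandwidth_gb_s(generation, lanes):
    """Approximate one-direction bandwidth of a PCIe link in GB/s."""
    per_lane_gb_s = TRANSFER_RATE_GT_S[generation] * ENCODING_EFFICIENCY / 8
    return per_lane_gb_s * lanes


if __name__ == "__main__":
    for generation in (3, 4):
        for lanes in (16, 8):
            bandwidth = link_bandwidth_gb_s(generation, lanes)
            print(f"PCIe {generation}.0 x{lanes}: {bandwidth:.2f} GB/s")
```

Halving the lanes halves the figure, so a PCIe 4.0 card at x8 has about the same link bandwidth as a PCIe 3.0 card at x16. For inference with the model already resident in VRAM, link bandwidth mostly affects load time and traffic between the cards.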

Threadripper

Threadripper is a brand of high-end desktop processors by AMD, known for their multiple cores and threads, making them suitable for heavy multitasking and compute-intensive applications. The narrator plans to upgrade to a Threadripper processor with a new motherboard to better support their dual graphics card setup.

Cable Management

Cable management is the practice of organizing and routing cables in a computer case to maintain a clean and efficient build. The narrator admits that cable management is not their strong suit, as evidenced by the messy wiring in their current setup.

Power Supply Unit (PSU)

A Power Supply Unit, or PSU, is the component in a computer that supplies power to the system. The script mentions concerns about the PSU's size and mounting position in the case, as well as the need for additional power connectors for the new graphics cards.
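A rough power budget shows why a second card puts pressure on the PSU. The video does not state the new PSU's wattage, and the transcript's mention of "1,000" next to the old unit is ambiguous, so every number below except the GPU rating is a hypothetical placeholder. The 450 W figure is NVIDIA's rated board power for the RTX 3090 Ti.

```python
# NVIDIA's rated board power for the RTX 3090 Ti, in watts.
GPU_BOARD_POWER_W = 450


def system_load_w(num_gpus, cpu_w, other_w):
    """Sum rated GPU power with estimates for the CPU and everything else."""
    return num_gpus * GPU_BOARD_POWER_W + cpu_w + other_w


def psu_headroom_w(psu_w, load_w):
    """Watts left over (negative means the rated load exceeds the PSU)."""
    return psu_w - load_w


if __name__ == "__main__":
    # Hypothetical values: 125 W CPU, 75 W for drives, fans, and motherboard.
    load = system_load_w(num_gpus=2, cpu_w=125, other_w=75)
    for psu_w in (1000, 1600):  # illustrative PSU sizes, not from the video
        print(f"{psu_w} W PSU: {psu_headroom_w(psu_w, load)} W headroom")
```

These are worst-case rated figures; actual draw during LLM inference is often lower, and cards can be power-limited in software, so treat the result as a sizing guide rather than a measurement.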

Image Generation

Image generation refers to the process of creating visual content, often using AI or machine learning models. The video's narrator tests the new graphics card setup by running an image generation task, which is a practical application demonstrating the card's capabilities.

70B Model

The 70B model likely refers to a specific version of a large language model with a 70-billion-parameter configuration. The narrator attempts to download and run this model on their upgraded setup but encounters memory limitations, requiring adjustments to their system's swap file.
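The reason two 24 GB cards are needed follows from simple arithmetic on the weights. The sketch below is an estimate under stated assumptions, not the loader's real accounting: it counts only parameter storage at a given bit width, then adds a hypothetical overhead figure. The 7 GB overhead is chosen purely so the total matches the roughly 42 GB (23 GB plus 19 GB) the narrator reads off the two cards.

```python
def weight_size_gb(params_billion, bits_per_weight):
    """Storage for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits_in_vram(weights_gb, overhead_gb, num_cards, vram_per_card_gb=24.0):
    """True if weights plus overhead fit in the combined VRAM of the cards."""
    return weights_gb + overhead_gb <= num_cards * vram_per_card_gb


if __name__ == "__main__":
    four_bit = weight_size_gb(70, 4)    # 4-bit quantized 70B model
    sixteen_bit = weight_size_gb(70, 16)  # same model at 16-bit precision
    print(f"70B at 4-bit:  {four_bit:.0f} GB")
    print(f"70B at 16-bit: {sixteen_bit:.0f} GB")

    overhead_gb = 7.0  # hypothetical: cache, buffers, quantization metadata
    print("fits on one card: ", fits_in_vram(four_bit, overhead_gb, 1))
    print("fits on two cards:", fits_in_vram(four_bit, overhead_gb, 2))
```

At 4 bits per weight the model needs about 35 GB for weights alone, which exceeds a single 24 GB card but fits in the 48 GB that two cards provide. At 16 bits it would need about 140 GB, which is why the quantized version is used.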

Highlights

Micro Center had refurbished 3090 Ti cards for $799 and 3090s for $699.

The goal is to add another 3090 to create a more potent LLM machine.

The refurbished card comes with a power adapter.

Testing the new card by removing the old one for a quick bench test.

The case was bought as a fully built computer from Building 19.

The case has been in use since about 2008 or 2009 and has gone through many iterations.

The computer originally came with aluminum wheels similar to newer Apple computers.

Cable management is not a priority in this build.

The new card is tested for functionality before transferring to a larger case.

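An image generation model is run to confirm the card works under load (its name is garbled in the transcript as "open do Dolly Del").

The 70B model initially failed to load with an out-of-memory error even though there was enough video RAM; increasing the swap file size fixed it.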

A 512GB SSD was used to transfer everything for the model.

The new setup achieved 16.9 tokens per second for text generation.

The speaker calls their own jury-rigging of questionable build quality, and the two cards sit close together in a small case, which raises heat concerns.

A new motherboard and case are planned for a future rebuild.

The final assembly includes using the HDMI port from the motherboard.

The reassembled case showcases both graphics cards, despite the small size.

Transcripts

00:02

So Micro Center had 3090 Tis refurbished, as well as the 3090s, the Founders Edition ones. The Ti was $799 and the regular 3090 was $699, I believe. I've been wanting to get another 3090 Ti or 3090 to make a more potent LLM machine. I currently have one, so today I'm going to be adding in another one so that I can run larger LLMs.

00:39

This is what it looks like boxed up and refurbished. It comes with the power adapter, and we'll pull the card out.

01:03

[Applause]

01:14

Wonderful. So before I go about transferring this all into a new, larger case, I'm just going to test the new card by removing the old one that I have and making sure it works with some image generation stuff, just a quick bench test.

01:29

This case I actually bought as a fully built computer from a local store here in the New England area called Building 19. It was called a Velocity Micro, and it was a really expensive pre-build, and for some reason their whole shtick was to get things and sell them much cheaper at Building 19. So I've had this case since probably 2008 or 2009. It's gone through a large number of iterations, from mining crypto a long time ago to now doing LLM stuff. So unfortunately today is the last day that this case is going to be my main PC case, which is sad for me.

02:15

This computer also came with these cool aluminum wheels that you may have seen on newer Apple computers for the price of a whole entire computer, but these were a bit cheaper. Also, I find it now relevant to mention that cable management is not my strong suit, nor something that I pay much care to, so please don't be too hard on me for that.

02:40

So I'll quickly remove this card, swap it with the new one, and put everything back together.

03:08

So I have the new card in. Now I will simply plug in the existing wiring that was here, and voila. I'll bring it over to my test bench area and run it just to make sure everything's all right, and then I'll go about transferring all this into a much larger case.

03:31

Also, side note: if the camera did pan down to a large mess down here on the floor, this is not how I live. This is a workshop, so there are large amounts of trash and machinery and the like.

03:48

So it's fired up, it's running, and the card has lit up, which is a good sign. So now we're in the Ubuntu environment, and I'm just going to quickly run this to see if the card is being recognized, and it is. Got our power draw, VRAM, wonderful.

04:12

So the next step is to run the "open do Dolly Del" [name unclear], please correct me if that's an issue, which takes a little while to run. Then we'll open up the web interface so that we can do image generation.

04:34

Now this is up and I'm going to test it. And of course this is not why I've gotten a second GPU; this is just a way to quickly test this one and make sure everything's working all right.

04:51

All right, let's see. I usually get about that speed on the other card, so that's good. And we'll go over here and run this again just to see if it's being utilized more. Should be around... yep. And the, uh, "I go to Micro Center brow" picture did work. Well, let's make this adhere a bit more, all the way, shall we? And we'll try this one more time, as well as this while it's generating. Yep, perfect.

05:30

It is using a lot of power when it does this. I've had them generate some insanely large images, like 4,000 by 4,000. So that picture, the contrast isn't messed up on the camera, it actually looks that messed up. But the initial test has worked well, and I will now shove two cards into a case.

05:51

I put both of the cards in this case just to see what it will look like. Now, this motherboard is only temporarily here. I will eventually, soon, get a Threadripper with a new motherboard so that the PCIe lanes are not bifurcated (bifurcated? bifurcated? I don't know), because this processor doesn't have enough PCIe lanes to run these both at x16. But considering they're going to have to be this close anyway, which I know is probably horrible from an airflow perspective, I kind of want to just keep them in my case, which I'm essentially bonded to, until I get a new motherboard, and then just totally redo the build. From an engineering standpoint that's probably bad; however, I find that sometimes we look past that to make aesthetic decisions, and I'm no different.

06:42

So this case, uh, the power supply is mounted up top, which I noticed today, after going to Micro Center, does not seem to be the norm any longer. It seems like they all get mounted on the bottom, which, based on the size of these, seems a reasonable decision. I am not quite sure if the new one is going to fit in here without sagging massively, so I'll perhaps have to support it some way.

07:12

And one, two, three. Okay, those are the wrong screws. They're all the right screws.

07:21

Now you can see this computer is, uh, aged. I know a lot of people do nice new RGB builds, and I always prefer the more worn-out and old aesthetic, and this thing has been around the block, which I like. I like older, worn-out things, especially guitars.

07:55

If you hear fan noise right now, by the way, that's actually one of my, uh, light bulbs. I know that sounds weird, but the lighting I'm using, the bulbs have fans in them, and one of them is rather unhappy.

08:11

All right, I've got that. Just going to unplug everything, which I should have done. That's, hm.

08:36

I'm just going to reuse the cables that were here, and I will only need to add the additional ones for the extra graphics card.

08:50

Right, we now have this up. Of course there... 1,000. It's been a wonderful PSU, never any problems with it.

09:13

Now, oh man, this is of questionable, uh, build quality here, my jury-rigging. Truth be told, I really wanted to just cut up an old Power Mac G5 case and Frankenstein that into holding both of these graphics cards, but that requires a large amount of aluminum cutting, which I'm not so keen on having to do.

09:40

Let's check the size difference between the old and new. It actually does not seem as substantial as I first feared. Obviously the... there. So, and how long is that? 30 millimeters, 40 millimeters. Not too bad. I think we might be all right.

10:04

So I'll move this off to the side, and I will put this back. Not that way, not that way. That way.

10:37

This is a... Perhaps it is a good idea to just use a new case. So these don't go in this... thread it.

10:56

Fortunately, my fears of sagging have been quelled due to the fact that this does have some supporting elements of the case frame here, and these tabs here and here, so the PSU won't sag. Fortunately this came with screws, because I was trying to put the ones that were in there back in, and I must not have been in my right mind in the last iteration build of this computer, because they were completely the wrong size for the P... Wonderful. You can't see that.

11:47

I would lie to you and say I had fully taken this apart so that I could clean the dust from the fans in the front; however, that would be fibbing. I'm taking it apart to get the top of the case off, because that will allow me far, far easier access to the power supply to plug everything back in.

12:04

I noticed a lot of newer cases and builds seem to have an emphasis on ease of use in terms of actually plugging everything in and getting everything wired and [unclear]. So with this off, I now have a large amount of access to put the power supply back in, really to get everything...

12:32

I have everything wired up now, including the two connectors necessary for the graphics cards, and fortunately it all worked all right here. This case actually has pretty adequate depth for a larger power supply, including the cords protruding out. So now I'm just going to gently slam this in, and it is sagging a large bit, but it's also not screwed into the back, which gives it some lateral support that way.

13:04

Please do feel free to shame me for the wiring. I'm going to clean it up a bit before I turn it on, which will just consist of zip-tying things to one of these caddies. And now I'm going to test it real quick. I don't actually 100% know if this motherboard's going to be okay with supporting both of these cards from a PCIe standpoint, so I may have to run back out and get a new motherboard, which I'd prefer not to have to do, but if I must, I must.

13:31

It's all plugged in now. Both cards are on and running. I just have it naked, just 'cause I wasn't sure it would work with this motherboard. And you can see here, when I run this, both cards do pop up. So now I'm going to download a large model in the text generation web UI, a 70B 4-bit quant, and this will take a little while. About a 40 gigabyte download, so, yeah.

14:11

After a rather large amount of time, I realized that the 70B model would not load. It was saying it was out of memory even though I had the correct amount of video RAM. I needed to increase the size of my swap file. Essentially what that means is, say you have things on a desk like this and you need more space temporarily: you can move them to shelves and then, you know, move them back. I think that's right. I don't know, check it on ChatGPT later.

14:45

So this, uh, loaded now, and I transferred everything over to a 512 gig SSD that I had lying around, which was a bit involved of a process. So this is an instruct-following model. Let's see if this actually starts working. Chat, instruct, character gallery, this one. And I will say, "hello friend." I spelled hello wrong. Let's see what happens.

15:17

Oh, there: "What can I do for you today?" Five tokens a second. I'm not quite sure what the speed is supposed to be for a model of this size. Let me check the Nvidia panel to see what the utilization of the cards is currently.

15:41

So we have 23 gigs of RAM being used on one and then 19 gigs on the other. So let's just see what this says. Uh, sorry, this is [unclear] to see.

16:12

Cool, this works pretty well. I think I need to do some cleaning in terms of the response, uh, way, because it's actually showing the line breaks in format like that. So that did 17 tokens a second, 16.9. Seems pretty good. I'll run this again to see if the cards are heating up right now, because they are in an open-air environment, but this case is very small. So yeah, we're going up a bit. So eventually I'll swap them out of this case when I get a larger motherboard, with hopefully a Threadripper, but for now this is pretty cool.

16:50

It seems relatively intelligent. Let's ask it one more thing, maybe: "Tell me a story about a big Micro Center trip," and we'll leave this up to its interpretation. Okay, so it does know about a local Micro Center store. Pretty good.

17:22

This is definitely much more intelligent than the 7B Dolphin models I was using earlier on the single card, and speed-wise it's 16 tokens a second. I think I was getting like 50 to 60 on a single 3090 Ti with a 7B, uh, EXL2 model, I think that's the word. But this is totally readable in real time, you can see it.

17:46

This has been a pretty fun endeavor so far, just for a simple Saturday. I've got two cards in there which are both likely being suffocated due to the heat of one another; however, this will be a temporary solution. Now all that's left to do is put this case back together and clean up the wiring a bit.

18:11

Now for the grand reassembly. Now I'm going to quickly just try to tie these cables at least together with the zip ties that were included with the new power supply.

18:36

One last thing I want to note is that I am just going to use the HDMI port from the motherboard instead of from either of the graphics cards, so that they're not preoccupied with anything other than running what they need to be running.

18:56

Let's see if this is even possible to do with any form of [unclear] whatsoever. Again, this will be rebuilt in a new case with proper cable management sometime in the next few months.

19:22

And there's really not much I can do here. Well, for now, cool with it. Put these back, and the final piece of the puzzle will be the side, which could use a Swiffering. Here's some Swiffer pad ASMR.

20:01

All right, this needs more than I can give it with that. This on without pinching the cords... 3090s. And it's in. It's back together.

20:23

Kind of hard to see both cards in there, but it'll probably look pretty cool. Let me turn this [unclear] one off. There we go. Made no difference. Should look cool on.

Related Tags
PC Upgrade, GPU 3090 TI, Graphics Card, Machine Learning, Gaming Rig, Hardware Review, Micro Center, Tech DIY, Cable Management, Performance Test