GPT-4o VS Claude 3.5 Sonnet - Which AI is #1?
Summary
TLDR: This video provides a comprehensive, practical comparison between GPT-4o and Claude 3.5 Sonnet, focusing on real-world applications rather than benchmark tests. The host examines both models' performance in tasks like writing, summarizing, data analytics, coding, and reasoning. The tests reveal strengths and weaknesses in each model, highlighting that while Claude excels in coding and visualization, GPT-4o offers superior functionality in writing, summarization, and customizability. Despite some limitations, both models demonstrate significant capabilities, making the choice between them dependent on specific user needs.
Takeaways
- 😀 The video is a comprehensive test comparing GPT-4o and Claude 3.5 Sonnet, two AI models, focusing on practical applications rather than scientific benchmarks.
- 🔍 The test covers a range of practical uses, including writing, summarizing, vision, data analytics, coding, and reasoning to determine which AI is most practical for everyday work.
- 💰 Both AI models are available in paid versions, but the free versions are limited in usage, prompting the need for a subscription for extensive testing.
- 📝 In creative writing tasks, both GPT-4o and Claude 3.5 Sonnet performed well, with neither showing a clear advantage in generating product descriptions or emails.
- 📚 Text summarization capabilities were tested with both AIs providing accurate and concise summaries, with GPT-4o showing a slight edge in tone and detail.
- 🖼️ When analyzing a complex image, GPT-4o reported an incorrect time frame (250 AD to 1700 AD for a timeline that ends at 1100 AD), while Claude 3.5 Sonnet was more accurate from the start.
- 📊 In data analytics, both AIs were comparable, but GPT-4o had an advantage in creating PowerPoint presentations directly from data, a functionality lacking in Claude 3.5 Sonnet.
- 🏗️ Claude 3.5 Sonnet excelled in coding tasks, creating interactive visual dashboards and games, outperforming GPT-4o in these areas.
- 🔎 Research capabilities were found to be lacking in both AIs, with the video suggesting the use of other tools like Perplexity AI for more accurate research.
- 🤖 Complex reasoning tasks were handled well by both AIs, solving riddles and mathematical problems with correct logic and reasoning.
- 📱 For content creation, Claude 3.5 Sonnet provided a more usable tweet for social media, while GPT-4o's output was less practical and engaging.
- 🚀 The video highlights the importance of choosing the right AI tool based on specific needs, acknowledging the strengths and limitations of both GPT-4o and Claude 3.5 Sonnet.
Q & A
What is the main purpose of the video?
-The main purpose of the video is to conduct a practical head-to-head test comparing GPT-4o and Claude 3.5 Sonnet, two AI models, to determine which is more practical for everyday work and business use.
What types of tasks will the video cover in the comparison?
-The video will cover tasks such as writing, text summarizing, vision and data analytics, coding, and reasoning to evaluate the AI models' performance in everyday applications.
How does the video differentiate between a scientific test and a practical test?
-A scientific test is typically more structured and formal, like those in benchmark testing. A practical test, as used in the video, focuses on how the AI models perform in real-world scenarios and everyday tasks.
What are the limitations of the free versions of GPT-4o and Claude 3.5 Sonnet mentioned in the video?
-The free versions of both AI models are extremely limited in terms of usage, with Claude 3.5 Sonnet only allowing about 10 messages before requiring an upgrade to a subscription.
What is the first writing prompt given to both AI models in the video?
-The first writing prompt asks the AI models to create a short, punchy product description for a game-changing software tool in the world of marketing that revolutionizes customer relationship management for businesses.
How does the video evaluate the AI models' performance in text summarization?
-The video asks the AI models to provide two summaries of an article: one with two to three sentences and another with five to six sentences that includes more details.
What is the main difference between the AI models' capabilities in handling vision tasks as shown in the video?
-The main difference is that GPT-4o allows uploading of more images and has connected apps for easier image handling, while Claude 3.5 Sonnet has a feature called 'artifacts' for creating visual presentations and tables.
How does the video assess the AI models' performance in data analytics?
-The video tests the AI models' ability to analyze complex images and data, such as a graph representing interest rates, and to create visual presentations or tables based on the data.
What limitations does the video highlight regarding Claude 3.5 Sonnet's capabilities in research?
-The video highlights that Claude 3.5 Sonnet does not have internet access and therefore cannot provide current articles, reports, or relevant links for research. GPT-4o does have internet access, but it sometimes provides incorrect or made-up links.
What is the video's conclusion regarding the comparison between GPT-4o and Claude 3.5 Sonnet?
-The video concludes that Claude 3.5 Sonnet performs better in coding and data visualization using code, while GPT-4o has advantages in writing, summarization, and having a memory function. The choice between the two depends on the specific needs and use cases.
What additional capabilities does GPT-4o offer that Claude 3.5 Sonnet does not, according to the video?
-GPT-4o offers additional capabilities such as web browsing, image generation with DALL-E 3, a memory function to improve responses based on previous interactions, and the ability to build custom GPTs for specific tasks.
Outlines
🤖 AI Model Comparison: GPT-4o vs. Claude 3.5
This paragraph introduces a comparative test between the GPT-4o and Claude 3.5 Sonnet AI models, focusing on practical applications rather than scientific benchmarks. The test will cover various tasks like writing, summarizing, vision, data analytics, coding, and reasoning. Both AIs are used in their paid versions because the free versions are heavily usage-limited. The test begins with creative writing prompts for marketing a new CRM tool, assessing the AIs' ability to generate punchy product descriptions and emails within a specified word count.
📝 Writing and Summarization Capabilities
The paragraph discusses the AI models' performance in writing and summarization tasks. It highlights the AIs' ability to produce accurate and professional text without obvious AI-generated traits. The models are tested on summarizing an article into short and detailed versions, with Claude 3.5 showing quick response and accuracy, while GPT-4o provides a factual and toned-down summary. The paragraph also touches on the ability to switch models within the same chat in the paid version of ChatGPT for varied responses.
🔎 Vision and Data Analytics Tests
This section examines the AI models' capabilities in vision and data analytics. The AIs are tasked with interpreting a complex historical image and a graph representing US used-car interest rates. GPT-4o reports an incorrect time frame for the historical image, while Claude 3.5 identifies the time frame accurately. On the graph, both describe the trend correctly but neither identifies its specific subject. Both AIs can draft presentation content, but only ChatGPT can export it as a downloadable PowerPoint file.
🔍 Limitations and Research Flaws
The paragraph points out the limitations of the AI models in conducting research and providing accurate links. GPT-4o often generates non-functional links, while Claude lacks internet access altogether. The narrator recommends using other tools like Perplexity AI or Google Gemini for research purposes due to the inaccuracies and limitations of the AI models in this context.
💻 Coding and Interactive Dashboards
This paragraph assesses the AI models' coding abilities, specifically their capacity to create interactive dashboards from financial reports. Claude 3.5 excels in this task, quickly generating visual representations and interactive elements. In contrast, GPT-4o struggles to produce the same results, offering lengthy explanations and processes instead of direct coding solutions.
🎲 Game Development and Complex Reasoning
The paragraph explores the AI models' capabilities in game development and complex reasoning. Claude 3.5 creates a playable checkers game with interactive elements on the first prompt, although capturing pieces does not work, while GPT-4o fails to produce a working game. Both AIs perform equally well in solving logic puzzles and riddles, demonstrating their reasoning abilities.
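The handshake puzzle used in the reasoning test (66 handshakes, with every pair of guests shaking hands exactly once) reduces to solving n(n - 1)/2 = 66, whose roots are 12 and -11. The sketch below is a minimal Python check of that arithmetic and is not something shown in the video. The function name `guests_from_handshakes` is a hypothetical helper introduced here for illustration.

```python
import math


def guests_from_handshakes(handshakes: int) -> int:
    """Return the guest count n satisfying n * (n - 1) / 2 == handshakes.

    Hypothetical helper for illustration only.
    """
    if handshakes < 0:
        raise ValueError("handshake count cannot be negative")
    # Rearranged: n^2 - n - 2h = 0, so the discriminant is 1 + 8h.
    discriminant = 1 + 8 * handshakes
    root = math.isqrt(discriminant)
    if root * root != discriminant:
        raise ValueError("no whole number of guests gives this handshake count")
    # The quadratic has roots (1 + root) / 2 and (1 - root) / 2.
    # Only the positive root is a valid number of guests.
    return (1 + root) // 2


print(guests_from_handshakes(66))
```

Running it prints 12, the answer both models reached after discarding the negative root.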
📊 Content Creation and Social Media Optimization
The final paragraph evaluates the AI models' skills in content creation and social media optimization. Claude 3.5 effectively condenses a YouTube script into a concise tweet suitable for quick consumption, while GPT-4o provides both a tweet and a LinkedIn post, although the quality of the tweet is questionable. The paragraph emphasizes the importance of creating shareable and engaging content for social media platforms.
📝 Conclusion and Practical Usage Decisions
The conclusion summarizes the head-to-head test, highlighting Claude 3.5's strengths in coding and visual tasks, and GPT-4o's advantages in writing and summarization. The narrator discusses the limitations of both AIs, such as Claude's lack of web browsing and GPT-4o's occasional inaccuracies. The paragraph concludes with the narrator's personal decision to use both AIs for different tasks, emphasizing the importance of using the right tool for the job.
Keywords
💡AI models
💡Benchmark testing
💡Practical test
💡Paid version
💡Creative writing
💡Text summarizing
💡Vision capabilities
💡Data analytics
💡Coding
💡Reasoning
💡Content creation
Highlights
Comprehensive head-to-head test comparing GPT-4o versus Claude 3.5 Sonnet.
GPT-4o and Claude 3.5 Sonnet benchmark testing results discussed.
Practical test approach to evaluate AI models for everyday work and business.
Paid versions of both AI models are available but have usage limitations.
Creative writing prompt to assess AI's ability to generate product descriptions.
AI's performance in writing email drafts evaluated.
Preference for AI responses without additional explanatory text.
AI models' performance in text summarization tested with a complex article.
Comparison of AI's vision capabilities with a complex historical image.
AI's data analytics ability tested with a graph of interest rates.
ChatGPT's ability to create presentations from data is analyzed.
Claude's limitations in creating PowerPoint presentations directly.
AI's performance in coding tasks, such as creating a dashboard from a financial report.
Claude's superior performance in creating visual presentations with code.
AI's capability in creating interactive games like checkers.
AI's performance in complex reasoning tasks, such as solving riddles and mathematical problems.
AI's ability to condense YouTube scripts into concise tweets or LinkedIn posts.
Claude 3.5 Sonnet's overall lead in the practical test but with significant limitations.
Limitations of Claude in web browsing, image generation, and memory function.
Advantages of GPT-4o in custom AI model creation for specific tasks.
Recommendation to use both AI models depending on the task at hand.
Transcripts
in this video I want to do a
comprehensive head-to-head test
comparing GPT-4o versus Claude 3.5 Sonnet
which just came out and it beats GPT-4o
in their benchmark testing but I don't
want to do a scientific test like they
do in their benchmarks I want to do a
practical test how we use these AI
models every day for work and business
so then we could actually figure out
which one is the most practical model
that we use if we had to pick one which
one is the right one to pick so that's
typically how I do these tests with very
practical applications so I'm going to
cover about 10 different things and
we're going to use things like writing
text summarizing text and then we'll get
into vision and data analytics and then
we'll do some coding and reasoning too
at the end so both GPT-4o that I'm going
to use and the Claude 3.5 that I'm going
to use are going to be the paid version
but both of those are available
completely for free they are extremely
limited in how much you could use them
right now so I was using Claude 3.5 Sonnet
and I only got about 10 messages before
I ran out so I had to upgrade and get
the subscription to do a real
head-to-head test first let's start with
a writing prompt a lot of us use these
models to write all kinds of different
things right so this one is going to be
a little bit of a creative writing in
the world of marketing you're launching
a game-changing software tool
revolutionizing customer relationship
management for business write a short
punchy product description and I told it
exactly how many words in this case and
I'll do the same thing for Claude 3.5
here here's the result from ChatGPT it
went right to the answer introducing
customer connect the ultimate CRM tool
that transforms customer interactions
pretty good I asked it how many words
it was 41 our instruction is 50 words
approximately now Claude gave us a little
bit of a longer one it's 54 words here
but again I said approximately and he's
asking me if I wanted to adjust it this
one named it autoc CRM revolutionize
your customer relationships our AI-powered
platform automates follow-ups
delivers real-time insights and boosts
retention effortlessly okay so they both
did a good job nothing here that screams
that an AI wrote it and I typically use
these models to help with writing email
drafts so write an email introducing
this to our email list and keep it short
let's see what it comes up with now here
we got claude's answer and this time
again it does the same thing it gives us
that sentence that we don't need that's
not part of our email and it kind of
gives you a quick recap of what it's
done and then you'll have to kind of
copy and paste the middle part here
where chat GPT just gives us the copy
option right here I could just copy this
there's no extra words so I personally
prefer to have zero extra words when I
ask something in a prompt I just want
the answer I don't want it to kind of
explain it to me if I need it to explain
to me I would ask it to explain it to me
and I found chat GPT typically every
time I give it tasks where it has to
give me an answer that I could just copy
and paste it just does exactly that and
as I'm reading the tone of this email
here they both again kind of did an
equal job this one is a little bit
overly excited so it
sounds a little too promotional uses words like
boost which is very common with AI and
if I go to chat GPT again the exact same
thing happens there's boost I commonly
see that and I erased my memory here on
chat GPT because I've actually trained
it in the memory function which I've
made a different video about not to use
specific words but since Claude doesn't
have memory I excluded that and I just
cleared down my memory here so it
just writes like it would without any
kind of backend instructions here and
when it comes to writing I ran it across
10 different tests and there was not a
clear-cut winner I think if I was
comparing Claude 3.5 versus the previous
GPT model it would have been a clear
winner but with GPT-4o and Claude 3.5 Sonnet
right now I think when it comes to
writing it is pretty on par now next I want
to show you text summary and a lot of
times I consume information now using
these models when there's a large amount
of information or a huge article or a
big newsletter I usually just put it
here and I let it just summarize it for
me very quickly so let's do that here
okay here's the prompt provide two
summaries for this article the first one
two to three sentences the second one
five to six sentences and includes more
details and I'm going to use this
Anthropic introduction here about Sonnet
so I'm going to go ahead and copy this
whole page and I'll just paste it
here I typically just copy the entire
page with the footer and everything and
he knows what to do with it it's a much
faster way to do this I'll send this out
here and this time let me just kind of
show you the speed because this 3.5
Sonnet is actually pretty quick so it
paid attention to the length it got it
right I read through this and everything
was accurate there was no kind of
hallucination with this answer
everything was exactly from that article
okay here's ChatGPT's answer the short
summary again right on point the detailed
summary I really like the tone it was
not at all emotional very factual kind
of how I like it without much detail
prompting I didn't really tell it what
kind of tone to take wanted to see what
it does by default and one functional
thing I like here in ChatGPT actually if you
have the paid version you could use GPT-4
and actually get a different response
using a different model and compare your
results inside of Claude if I try to do
the same thing with a paid version I
could change models and use Opus for
example but then that's going to require
me to start a new chat so I can't
functionally use it the same way not
that I would do that often but sometimes
I find myself not quite liking what GPT-4o
gave me and I want to see what the
older GPT-4 model which is
still good gave me that's
something I could do in ChatGPT
okay now let's test its Vision
capabilities then I'll test its Vision
capabilities with some data analytics
but here I want to see if it could find
out what's in a very complex image this
is as weird of an image as I could find
this is world history in one image
that's what I Googled and just looking
at it you can't tell you can't make out
just about anything that's happening
maybe some of the years here you
could make out but nothing else let's
see if Claude and ChatGPT could figure it
out so again with these models you could
upload images here with Claude you could
upload five and each has to be under 10
megabytes with ChatGPT you could upload
10 here so that is one of the benefits
and it has connected apps so you could
actually connect to your OneDrive with
Microsoft or mine is connected to my
Google Drive which makes this a whole
lot more useful when it just comes to
these functionalities Claude is really
lacking when it comes to just basic
functionalities I'll point out a bunch of
them too as we go through this video but
let me upload that image and I'm just
going to press enter I'm not even going
to give it a prompt here this is ChatGPT with
GPT-4o okay and from what I could see it
found the name for it stream of time a
timeline map of the rise and fall of
different civilizations empires and
nations and it says it's from 250 AD to
1700 AD and I'm going to say analyze
this image and explain this to me so
sometimes just changing something and
giving it in different format bullet
point table I use that all the time okay
and it's giving us a really nice
presentation here in table format so it
looks like it's putting the different
country civilizations here and showing
the time period where they were around
the key events and so on and summary
okay really nice answer here that we got
out of ChatGPT assuming it is correct
let's go to Claude and do the same thing
okay again I'll just press enter to
start with no prompt okay so it got the
same theme it picked up on some of the
colors and it's telling us it's from
580 to 1180 let's make that same table
here and this one is using this thing
called artifacts which is something you
could turn on in settings is completely
beta right now but it kind of creates
things on the right side and it's great
for coding and visual presentations like
tables like this so I really like this
new update I covered this in a different
video that I posted about Claude 3.5 and
it created kind of a different table so
it just gave us the elements and the
description for these elements and
ChatGPT gave us something completely
different so this looks more usable but
when I'm looking at this it actually got
the time wrong I took a little bit closer
look at this image if you look a little
bit closer on the very bottom
the timeline ends at 1100 AD so
Claude was correct he got the
information right so I'm just going to
follow up here to ask for the similar
table that chat GPT gave me I'm going to
ask it for a timeline with each
civilization and the rise and fall okay
and this time it looks like it did a
better job and just to be sure I ran
this three different times and each time
I got a very similar result here so
basically chat GPT gave us something
that looks better right so at first
glance this looks more interesting
and it looks like it gave us better
information but it totally made up the
time that graph runs from 500 to 1100
this did not give us anything that is in
that time frame it kind of extended the
time frame here so I wouldn't take any
of this information at face value these
are some of the limitations of these
large language models in general you
can't just look at the output and just
assume it's right so sometimes it even
makes sense to have two different
subscriptions if you're using vision
capabilities or data analytics to just
run them both and then use your own
judgment to see actually which one makes
sense I had to really take a close look
at this picture to kind of try to figure
it out and this was one of the more
complex things that I've given these
large language models to analyze
but I could tell Claude 3.5 right now is
winning there here's kind of a
challenging graph to see what it comes
up with so I'm giving it this graph but
I cropped out what this graph represents
but what it is is the interest rate for
used cars in the US here so I'm going to
see what it comes up with okay here's
ChatGPT telling us this is a trends
graph from 2014 to
2024 it's stable all the way till the
pandemic then it has a dip which it's
telling us right here it has a 3% or it
dips below 3% which is right so it was
all the way around four dip to closer to
three and then significant increase
which is accurate let's see if it could
actually figure out what this represents
and he thinks it represents the federal
funds rate which again they do set the
interest rates so pretty close but I
didn't really think it would figure out
that this is for the car market in the
US in this time frame but I was curious
to see if it's going to do any type of
research if it's going to look online but
it came up with two different
conclusions here federal funds rate or
Central Bank policy which is not correct
let's go see what Claude did but that
was not my test by the way I just wanted
to see if it did that extra step right
now I want to see if it could just
analyze things pull in the numbers and
then use those numbers to do deeper
analysis and it looks like Claude again
no problems here it gave us the range it
told us the range of the interest rate
here and it told us exactly how it's
changing over time and this time I also
asked it what this graph represented and
it gave us five different options none
of them were very specific to what I was
going for but generally it's all about
the interest rate here and it kind of
figured that out but it wasn't specific
enough to use car market in the US and
the interest rate on that now I'm going
to follow up with chat GPT I'm going to
say create a presentation based on this
information now it's going to go through
here create these kind of slides for us
so it's giving us titles for the slides
then it's asking us would you like
a PowerPoint file created with
this content let's say yes okay it's
done and gave us a link let me go ahead
and download this link to see what it
gave us okay it looks like it gave us a
detailed PowerPoint here it does need
some styling it typically doesn't do the
styling here but PowerPoint has this AI
this designer AI where you could just go
ahead and select different designs here
from the side and get yourself a
finished presentation so nice job with
chat GPT and I asked Claude to create a
visual presentation and look what Claude
did here this is with the artifacts
option turned on again you could turn
that on in the settings but it wrote
some code and then it creates this
preview window and it created this nice
visual graph I mean this is kind of the
same as our current graph let me see if
it could make us a PowerPoint
presentation but this is really cool
right inside of your viewport here let
me see if it could create a PowerPoint
presentation here okay it's doing the
same kind of thing it's creating the
slides or the text here for the
different slides and what bullet point
should go in each one and it looks like
it cannot do that so I can't create or
edit or provide download links to
PowerPoints directly so all it was able
to do is kind of write code and create
this nice visual presentation for us
right within chat but in this case I did
want a PowerPoint presentation now
that's one of the big limitations of
Claude there's a lot of functionalities
like this one was a really useful
practical thing right I want to give it
some data just from an image get the
context from that and turn it into a
presentation chat GPT could do that in
one minute right and we could then use
PowerPoint to design it using the AI
inside of that Claude can't do that so it
could only do things like these visual
representations and again I ran this
through a bunch of different tests and I
think with data analytics so far in my
early testing they were pretty equal so
functionality goes to chat GPT but in
the function of data analytics they both
are about the same right now now at this
point I usually do a head-to-head test
with image generation but the only way
you get image generation right now is
using ChatGPT with a paid subscription
and that gives you access to DALL-E 3
that generates images for you Claude
cannot and never has been able to
generate images so I can't compare that
so that obviously is a point for
ChatGPT so if you need image generation in
your day-to-day work you'll have to use
another tool like Copilot which is free and
has DALL-E 3 built into it but you
can't just use Claude because that
doesn't have image generation at all so
if that's part of your workflow keep
that in mind now let's do a little bit
of research here I'm going to ask chat
PT write about ai's disruption in the
accounting industry and give me specific
links and articles and reports and here
you gave us some information I'm going
to go to the bottom of it to make sure
he's giving us some relevant links here
and for some reason the links are not
clickable so sometimes it makes up links
sometimes it gives us links that look
like hyperlinks but when you go to click
them they're not clickable I'm going to
tell it they're not clickable I asked it
to give me the links again and I still
couldn't click them the third time I
couldn't click them so I asked it to
give me the links like this so I could
copy and paste the links let's see nope
made up a page there let's try this one
cannot find this page okay so a lot of
times ChatGPT when you use it for
research when you need specific
information from specific websites and
resources like this it just does not
work it literally makes up links like
you're seeing here this has happened to
me probably every other time that I've
used chat GPT for research okay on the
other hand let's look at Claude so
Claude did again a nice job gave us
specific use cases of things that could
be disrupted by AI potential
challenges if I go to the bottom here I
don't have access to current articles or
reports okay so keep this in mind Claude
does not have internet access it never
has had internet access whereas GPT-4o has
internet access sometimes it makes huge
mistakes like you just saw but sometimes
it works so in this case it doesn't work
at all ChatGPT could follow URLs a lot
of times I'm optimizing my website for
example I give it a URL it goes and crawls
the website it tells me things to
improve I can't do things like that here
for research I would not use any of
these tools I wouldn't use Claude I
wouldn't use ChatGPT I would use
Perplexity AI so that is a great
research tool it uses the power of these
models in the background but it's really
designed to be a search engine that's AI
powered I have a different video about
that or I'll use Google Gemini and let
Google Gemini give me a snippet and I
know that is kind of pulling from more
accurate listings based on the Google
search right so both of these in my
opinion get a zero okay now let's do
some coding I'm going to see if I can
make a dashboard with these models we're
going to go to the Nvidia website here
and I'm going to pull in one of their
financial reports here so this is a
massive massive document I believe it's
98 Pages let's download this okay I'm
going to ask Claude I uploaded this
document turn this into a visual
dashboard here to see what we get and
usually if you have that artifacts
option turned on it starts writing the
code right here on the side and as soon
as it's done it turns into preview mode
where you could actually see the output
which is awesome this is one of my
favorite updates look at this it created
this visual update for me and it's
interactive so I could hover over things
wow this is nice all right let's see if
ChatGPT could do the same now the nice
thing by the way is both GPT-4o and Claude
3.5 Sonnet now they have such a big
context window that I could just use a
98 page document as part of my prompt
and upload that okay so ChatGPT just
gave us a bunch of information from that
document so it pulled in a bunch of
different numbers and things like that
it did not create the visual
presentation it's asking me if I want to
proceed I'll say yes and again it looks
like it gave us a ridiculously long
step-by-step process on how to use this
other app to do this outside of
ChatGPT it's not even attempting to
write oh it's still going it's not even
attempting to write any code for us
again I went back and forth three
different times with chat GPT to try to
just get it to do this before it used to
create interactive graphs and I think it still
does but for some reason in the last
couple of days I haven't got it to do
any type of coding or create any type of
interactive graphs here even when I give it
very specific instructions to do so okay
so when it comes to visualization of
data using code well Claude is obviously
the winner there now let's see if we
could create a game this time I'm going
to create a game of checkers I typically
do a game of snake or Tic Tac Toe let's
see if he could do a game of checkers
without again any information about what
kind of language to use in just 10
seconds he wrote the code and he gave us
the preview now let's see if it actually
works it says current player red let's
go ahead and try to move our piece from
here to here black from here to here and
I'll take this piece nope oh it almost
worked but it doesn't quite know how to
take a piece I asked ChatGPT to create a
game of checkers and this time it's
giving us again a bunch of text board
layouts okay where's the code okay there
we go we finally got it and he chose
python here and here's the python game
that chat GPT wrote it does not have any
pieces I can't start a new game okay so
it just made the board so I'll just give
it one prompt to try to fix it although
I didn't give Claude any prompts okay
here's a new one we got pieces this time
and okay it did not add any
functionality so it just basically
designed a game that doesn't do anything
so I don't know so far I've tested this
a handful of times and every single time
Claude 3.5 Sonnet when it came to any type
of coding it beat GPT-4o okay let's
test out complex reasoning here so
here's the prompt at a party each guest
shakes hands with every other guest
exactly once if there were a total of 66
handshakes how many guests were at the
party okay so ChatGPT created a nice formula
over here and I know the answer is 12 so
let's see if we get to that answer it
came up with two answers 12
and -11 since the guest count can't be negative
the number is 12 and it gave us that
answer let's try Claude okay Claude took
the same path it came up with 12 or -11
since the answer can't be negative the
number must be 12 okay let's try this
one what has a voice that can't speak it
has a bed but never sleeps it has a
mouth but never eats and it runs but it
has no feet okay Claude says this is a
riddle and the answer is a river and
ChatGPT same thing it thinks it's a
river okay I'm obviously not doing a
scientific test but you could see
they're both kind of doing a good job
when it comes to logic and solving
riddles and puzzles now this next one is
for content creation so what I'm going
to do is I'm going to give it a YouTube
script I'm just going to upload my last
YouTube script here and I'm going to ask
it to turn it into a tweet this is my
prompt extract the core lessons and
actionable items from this YouTube
script condense it into a concise tweet
or LinkedIn post suitable for quick
consumption and I'll add that YouTube
transcript here okay here's our tweet
Claude 3.5 Sonnet a new free AI model
from Anthropic outperforms GPT-4 it
forgot the 4o part on most benchmarks
improvements in speed vision 200k context
window introduces artifacts okay really
nice there is no reason why besides
fixing this little missing o here I
wouldn't use this right here it created a
couple hashtags it's good usually I'm
having a hard time finding even when I'm
building my own GPTs to do a really good
job where I would actually use that
tweet or that LinkedIn post okay so
ChatGPT decided to give us a tweet and
a LinkedIn post so the other one only
gave us one and then I guess I could use
it for both but look at this tweet new
Claude 3.5 is here faster forget
subscriptions dive into the future never
would I use that that's just extremely
uh bad okay if I see a post like this
I'm not following that person right that
is not a useful post so once again it
looks like Claude 3.5 Sonnet is beating
ChatGPT okay so with my practical test
Claude 3.5 Sonnet came out ahead overall but
huge limitations I want to point out so
if you had to choose between one of them
here's a huge limitation when it comes
to the paid subscription you don't have
web browsing with Claude a huge downside
the information is not going to be
relevant and up to date if you use this
for research I recommend you use
perplexity anyway for research even the
free version of perplexity is going to
beat both of these I think that's a huge
downside for Claude if you need to
create images huge downside for Claude
Claude doesn't have a good way to search
any previous conversations but neither
does ChatGPT but ChatGPT does if you
have the desktop app or if you have the
mobile app they have a search function
which is huge it's missing though right
now inside of the ChatGPT website for
some reason now two huge downsides with
Claude it doesn't yet have a memory
function and I had no idea how much
that was going to improve the functionality
of ChatGPT by default when you talk to
it sometimes it stores things to memory
and it gets smarter and smarter based on
your previous conversations and gives
you better responses so that's a huge
benefit of ChatGPT and the biggest
reason why I would choose GPT-4o over
Claude is because with a paid version of
ChatGPT you could build custom GPTs
those are very specific little mini GPTs
basically with your knowledge base and
with your very specific set of
instructions for my company we've built
well over I think we have 15 or 20
different custom GPTs and each one does
a very specific task at this point I
wouldn't really even know how to
function day-to-day without those Claude
obviously just doesn't have those
Copilot for some reason is getting rid
of those but that is my favorite part of
generative AI is those custom GPTs that
I could train to do just one thing
really really well where the broad
version of ChatGPT and Claude are just not
going to be that good at it right they
don't have that specific knowledge base
they don't have that specific set of
instructions so I've covered custom GPTs
in a different video and exactly how to
build them so if you haven't watched
that and you're not using that I highly
recommend them they will solve so many
issues for you and it will save you so
much time throughout the week so I'll
link that video here I hope you found
this head-to-head useful and you can
make a clear decision between the two
right now just from the function for my
personal use I kind of have to use both
because some of these coding just basic
coding things that I'm doing Claude is
just so much better with that so I'm
going to use Claude for that kind of
stuff I'm going to use ChatGPT for my
everyday writing summarization things
like that and I'm going to use
perplexity for research I'll see you on
the next video
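
Note on the handshake puzzle from the reasoning test: if n guests each shake hands with every other guest exactly once, the total is n(n-1)/2 handshakes. Setting that equal to 66 gives n^2 - n - 132 = 0, which factors as (n - 12)(n + 11) = 0, so the roots are 12 and -11 and only 12 makes sense as a guest count. The short Python sketch below checks that arithmetic. The function name `guests_from_handshakes` is made up for illustration and is not something either model produced in the video.

```python
import math


def guests_from_handshakes(handshakes: int) -> int:
    """Return n such that n * (n - 1) / 2 == handshakes.

    Hypothetical helper for illustration only. Raises ValueError when no
    whole number of guests produces exactly that many handshakes.
    """
    # n(n-1)/2 = h  =>  n^2 - n - 2h = 0  =>  n = (1 + sqrt(1 + 8h)) / 2
    discriminant = 1 + 8 * handshakes
    root = math.isqrt(discriminant)  # integer square root, rounded down
    if root * root != discriminant:
        raise ValueError(f"{handshakes} handshakes is not possible")
    # The negative root (1 - root) / 2 is discarded: a guest count can't be negative.
    return (1 + root) // 2


print(guests_from_handshakes(66))  # 12
```

For 66 handshakes the discriminant is 529, whose square root is 23, so the positive root is (1 + 23) / 2 = 12, matching the answer both models gave.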