New AI Tools That Are Actually Useful
Summary
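TLDR This video script introduces recent advances in AI applications and examples of how they are used in practice. The most notable items are the arrival of Claude 3, a ChatGPT competitor, and Google's upgrade of the assistant on Pixel phones. Claude 3 is stronger for certain uses, excelling at image recognition and idea generation. The script also covers updates to Microsoft's Copilot, the Brilliant.org learning platform, Google's Gemini, TTS Arena, a new interface that generates images with transparent backgrounds, Stability AI's image-to-3D model conversion, and Pika's lip-sync feature, along with examples of how to use them. It also touches on the geospy.ai app and its potential impact on privacy.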
Takeaways
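- 🔥 Generative AI is very hot right now, and new practical applications keep appearing.
- 🤖 ChatGPT is regarded as a useful application; most people agree it has everyday use cases.
- 🌟 Claude 3 claims to beat GPT-4 in certain use cases, especially image recognition and brainstorming.
- 🔍 The website chat.lmsys.org lets you try Claude 3 and GPT-4 for free and compare their performance.
- 📚 Microsoft's Copilot added a notebook feature that supports prompts of up to 18,000 characters.
- 🎨 Brilliant.org is an interactive learning platform with over 100 courses that helps users get more out of AI tools.
- 📱 Google's Gemini replaces Google Assistant on Pixel phones, providing a smarter assistant.
- 🗣️ TTS Arena is a text-to-speech comparison platform where users compare the output of different speech synthesizers.
- 🖼️ A new interface for Automatic1111 generates images with transparent backgrounds, which according to the video no diffusion model could do before.
- 🤖 Stability AI released a new feature that converts images into 3D models.
- 🎥 Pika Labs offers a new feature that syncs the lips of characters in a video to supplied text.
- 🌍 geospy.ai is an app that infers geographic location from image content, though it currently only locates to the city level.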
Q & A
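Which generative AI application is attracting attention right now?
-Claude 3, a ChatGPT competitor, is attracting attention. It is said to outperform ChatGPT in certain use cases.
In what ways is Claude 3 said to be better than ChatGPT?
-Claude 3 is said to be stronger at image recognition and at generating ideas and brainstorming. In particular, its responses to prompts that include images are reported to be better.
Is there a way to try Claude 3 for free?
-Yes, you can try Claude 3 for free through the site chat.lmsys.org. The site aims to crowdsource the ranking of different chatbots.
What updates were added to Microsoft's Copilot?
-Microsoft's Copilot gained a notebook feature that allows prompts of up to 18,000 characters, plus new Copilot GPTs that currently come with a handful of presets.
What kind of platform is Brilliant.org?
-Brilliant.org is an interactive learning platform, and the sponsor of the video, with over 100 courses. The video presents it as a way to build the background skills needed to get the most out of AI tools.
What is Google's Gemini?
-Google's Gemini is a large language model that replaces Google Assistant on Pixel phones. The assistant becomes smarter as a result and can now do things like create tasks and check email.
What kind of service is TTS Arena?
-TTS Arena is a site for comparing text-to-speech systems. Users enter a prompt, listen to the output of two different speech synthesizers, and vote for the one they prefer.
What does the new Automatic1111 interface offer?
-The new Automatic1111 interface generates images with transparent backgrounds. According to the video, no currently available diffusion model can do this on its own.
What is Stability AI's new release?
-Stability AI's new release generates a 3D model from an image. It is a very appealing capability for people who are not 3D artists.
What is the new Pika Labs feature?
-The new Pika Labs feature syncs the lips of a character in a video to supplied text, so that generated speech and the character's mouth movements match.
What does the geospy.ai app do?
-The geospy.ai app analyzes an image and indicates the geographic location where it was taken. It can currently identify the city only, but a pro version is planned to provide exact coordinates.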
Outlines
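🔥 The Rise of AI and ChatGPT's Rival
This paragraph describes the current state of generative AI and, in particular, the role of ChatGPT, which is widely accepted as an example of a genuinely useful application. It also introduces Claude 3, a ChatGPT competitor that performs better in certain use cases, notably image recognition, idea generation, and brainstorming. A way to try it for free is also covered.
🌟 Google's Gemini and the Evolution of AI Tools
This paragraph covers Google replacing the assistant on Pixel phones with its Gemini language model, the updates to Microsoft's Copilot, and the Brilliant.org learning platform. Gemini works as a smarter assistant (the presenter compares Google Assistant to Apple's Siri) and can now carry out more tasks. Copilot gained a notebook feature for handling long prompts and Copilot GPTs that provide presets for specific purposes. Brilliant.org is presented as a place to learn the skills needed to get the most out of AI.
🎉 New AI Use Cases and Technical Progress
This paragraph introduces new use cases and advances in AI technology. It touches on Google's Gemini replacing Google Assistant, TTS Arena (a site for comparing text-to-speech synthesizers), a new interface that generates images with transparent backgrounds, and Stability AI's new image-to-3D model generation. It also covers the new Pika Labs feature that syncs a video character's lips to text. Finally, it discusses the geospy.ai app, which uses AI to identify the geographic location of an image.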
Keywords
💡generative AI
💡ChatGPT
💡Claude 3
💡Google Assistant
💡AI video
💡image recognition
💡transparent backgrounds
💡geolocation
💡AI tools
💡Brilliant.org
Highlights
Generative AI is currently very popular with new, practical applications emerging frequently.
ChatGPT is widely recognized as a useful AI tool with everyday applications.
A new ChatGPT competitor, Claude 3, claims to be superior in certain use cases.
In the presenter's testing, Claude 3 excels at image recognition and brainstorming, outperforming GPT-4 in these areas.
A website, chat.lmsys.org, allows users to compare Claude 3 and GPT-4 for free.
Microsoft's Copilot has been updated with new features, including a notebook feature for long prompts.
Brilliant.org, an interactive learning platform, is highlighted as a resource for enhancing AI tool usage.
Google's Gemini is replacing Google Assistant on Pixel phones, offering advanced capabilities.
The TTS Arena is introduced as a platform for comparing text-to-speech synthesizers.
A new interface within Automatic1111 generates images with transparent backgrounds, a significant advancement.
Stability AI has released a tool that converts images to 3D models, showcasing impressive results.
Pika Labs has developed a feature that syncs video character lips to provided text, improving upon synthetic voice technology.
Geospy.ai is an app that can determine the geolocation of an image, raising privacy concerns.
The video emphasizes the importance of staying informed about AI advancements to protect oneself from potential misuse.
The AI Advantage channel provides a playlist of previous videos for those interested in learning more about AI use cases.
The video concludes by encouraging viewers to subscribe for updates on the latest AI use cases.
Transcripts
Straight up, generative AI is on fire right now. There's no two ways about it. We just keep getting
new applications that are actually useful left and right. And look, the word useful is a very
subjective word. But I think most people watching these videos would agree that ChatGPT is useful.
That is the one app where most of the population would probably agree that there's some everyday
use cases for that, that just really makes sense. Well, this week, we have a ChatGPT competitor
that is legitimately better in certain use cases. That's exciting. And then also Google went ahead
and upgraded their assistant on Pixel phones. So they effectively have an AI powered assistant in
their phones already. Independent of what you think about the recent product releases, this
is interesting. And then there's many more little things like this spy software, you can just put
in an image and it uses AI to detect where that was taken. Or this image generator that actually
generates images with transparent backgrounds, something that was just not possible up until now,
you had to have some level of photo editing skills to get those results. So without further ado,
let's dive into another week of AI use cases. These are all the apps that came out over the
course of the last week that you could be putting to work today. So let's just start talking about
Claude because this is probably the biggest upgrade of the week. Most people agree that ChatGPT is
the number one app they use and the number one app that is useful for them when it comes to AI
tools. Well, Claude 3 claims to be a better GPT-4, but so do many apps, right? Is it actually? Yeah,
it kind of is. Not across the board: it lacks a lot of the features of ChatGPT, and with certain use
cases, I would still stick to GPT-4. But in many use cases that really matter, it is better. Now,
I'm not going to go into depth on this, I created a dedicated video on that. You can check out that
video over here. If you just want a summary, then I would say these benchmarks generally
don't matter. What matters is the use cases and the use cases that I tested really impressed me,
particularly the image recognition is best in class. Whenever you're prompting with images,
you want to be using Claude 3, and its ability to generate ideas and brainstorm is excellent.
That is one of the things I go to ChatGPT for all the time. And from my initial testing, which I've
done a lot of, Claude 3 just seems to be better across the board at that. The recommendations
make more sense. With other use cases of mine, like creative writing or coding, it seems to not
be that clear. In some cases, GPT-4 is better. In some cases, Claude 3 is better. Again, if you
want a more in-depth discussion on this, check out the dedicated video on this. One last thing
that I want to point out is that there's actually a site where you can use it for free. You can try
it for free and compare it to GPT-4. And that site is chat.lmsys.org. We talked about this before.
This project aims to crowdsource the ranking of different chatbots. And as a part of that,
you can actually go to the website, go to this arena tab on top and switch the model to Claude 3
Opus. This is their flagship model. On the other side, you can switch to GPT-4-1106-preview and
you'll be comparing the best version of Claude to the best version of GPT-4 today. Down here,
you just enter the prompt, then it runs it in both and you get the results for free. Now,
the whole point of this site is for you to rate it. So, you know, if you have a preference, go
down here and pick which one you prefer. But there you go. That's a free way to try Claude 3 and GPT-4.
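The arena idea described here, turning many head-to-head votes into a ranking, can be sketched with an Elo-style rating update. This is only a minimal illustration and not necessarily the method the site itself runs; the model names, the starting rating of 1000, and the K-factor of 32 are assumptions made up for the example.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A is preferred over B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(ratings: dict, winner: str, loser: str, k: float = 32.0) -> dict:
    """Apply one pairwise vote in place and return the ratings dict."""
    exp_win = expected_score(ratings[winner], ratings[loser])
    delta = k * (1.0 - exp_win)  # an upset win moves the ratings further
    ratings[winner] += delta
    ratings[loser] -= delta
    return ratings


# Hypothetical models and votes, each vote recorded as (preferred, other).
ratings = {"model_a": 1000.0, "model_b": 1000.0}
votes = [("model_a", "model_b"), ("model_a", "model_b"), ("model_b", "model_a")]
for winner, loser in votes:
    update_ratings(ratings, winner, loser)

# Highest rating first: this ordering is the crowdsourced leaderboard.
leaderboard = sorted(ratings, key=ratings.get, reverse=True)
print(leaderboard)
```

Each vote moves points from the loser to the winner, so the total stays constant and a model climbs only by being preferred more often than its current rating predicts.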
As you can see here, sometimes it is overloaded, but yesterday I used it all day and it worked
perfectly. All right, let's move on to the next use case. All right, next up very briefly, we have
some updates to Microsoft's Copilot. We talked about this before, but some of these updates are
really nice. And by the way, this is completely free. But before I show you the updates,
I'm gonna head up to the settings and change it to the dark mode. Way better. So what I want to talk
about here is the notebook feature, which actually allows for a whopping 18,000 character prompts.
Good stuff. Okay, it actually took me a while to understand this feature, but what this is really
good for is refining prompts. Because it takes so much context, you can copy-paste a long document
into here and you will get the output right here. And then you can refine it in here as you go. It's
really all about the big input window. And the second feature they added here is these Copilot
GPTs. So as of now, there's only a few presets, but they're opening this up as they go. And I just
found this direction very interesting. That's why I wanted to share. So you can go for one of these
use cases like a fitness trainer, for example, and then you have prompt recommendations and it takes
on the persona of a fitness trainer. It's really nice to see them integrate these different things that
we've seen in other apps. And this is all free. But now let's move on. AI tools like this are
great, but they can't do it all for you. To get the most out of a tool like Midjourney,
you have to know a little bit about graphic design, photography, or painting. And that's
exactly where Brilliant.org comes in, the sponsor of today's video. Brilliant is an interactive
learning platform where you can acquire the tools you need to truly get the most out of AI and
achieve your goals. Answer a few simple questions when you sign up and Brilliant will help you find
the learning path for you. This includes courses and Brilliant has over 100 courses available for
you to try for free right now. And every month they're adding more. One thing that's really great
is that they add these case studies like this one in the data analysis learning path called Topping
the Charts with Spotify. It uses real Spotify data to teach you about advanced probabilities,
making the concepts relatable, and genuinely fun to learn. If you ever felt like learning isn't
for you, let's be real, it's probably because of traditional education. And that's not the case
with Brilliant. Their bite-sized interactive lessons get you hands-on with the concepts,
a technique that has been proven time and time again to help you learn more efficiently. Here
on this channel, we're all lifelong learners because that's what you need to be if you want
to stay on top of AI. Most of the things we cover here are brand new, so by the very definition,
if you want to use it, you gotta learn it. And Brilliant is here to support you on that journey.
Go to brilliant.org slash the AI advantage and start your learning journey today for free for
the first 30 days. And the first 200 people that sign up will get 20% off the annual subscription.
Thanks again to Brilliant for sponsoring this video. But now let's move on to the next use case,
which is actually Google's Gemini that replaces their assistant. And before the comments fill up
with Google Gemini hate, as it usually happens recently, hear me out. This is actually too good
not to cover. I think this just makes so much sense. Because essentially, what they're doing
is they're replacing Google Assistant (on Apple, that would be Siri) with their Gemini
large language model. So it just becomes so much smarter. And if you paid close attention, yes,
they did this right on release, but it couldn't do a lot of things. It couldn't create tasks or look
into your emails. Now it can. But as an iPhone user, I didn't get to test this yet. Luckily,
one of the AI Advantage team members, Daniel, has a Pixel phone, and he tested this out. So
here's a little screen recording on how it works. You can use voice input to tell it things like,
do you have any unread email messages, and then it uses Google Workspace to actually look
at your emails, and it gives you all the subject lines. And then you could prompt on top of that,
summarize all of them, which ones are important, related to my work, whatever you might be doing.
Set a reminder to take the trash out, or ask about upcoming football matches. Now if you look across
the internet, some people claim that they might have jumped the gun on this, that it's not a fully
developed product. But I really like the fact that they took this bold step. And this is one of those
use cases where it's just a no-brainer. Would I want a better Siri in my iPhone? Absolutely.
Will it come? For sure. When? I have no idea. It might be 2030. We shall wait and see. It just
makes so much sense though, right? Because usually these assistants, they're pretty two-dimensional,
and with a large language model in there, especially one of the better ones, plus the
ability to actually interact with some of my apps, eliminating the need for me to go around and,
you know, press buttons and spend time setting it up. That's a real no-brainer, so I'm glad Google
took the step here. So yeah, if you have a Pixel phone, you can switch out your Google Assistant
for Gemini. And you'll officially be running around with a supercomputer in your pocket that
has an AI assistant that runs on it. The future is here, I guess. Okay, let's move on to the next
one. And this one is actually super fun. Okay, so this is the TTS Arena, aka text-to-speech arena.
And if you've been following the channel, you might have seen me talk about chatbot arena, where
you run a prompt and it gives you two outputs with different large language models, and then you get
to vote which one you prefer. This effectively crowdsources the ranking of these chatbots instead
of relying upon these benchmarks that have kind of lost their credibility across the last few months,
if you ask me. So this is the same concept, but for speech synthesis. In other words, I could go
in here and say something like, welcome to the AI Advantage, and then I could synthesize this,
and it's going to generate this with two random synthesizers. And honestly, when I was testing
this out, I just found myself stuck in here doing this for 10 minutes. It feels like a game,
and the game loop here is really spot on. Just test out one. Welcome to the AI Advantage. And
then another. Welcome to the AI Advantage. Yeah, so in my opinion, clearly the first one is better.
And then you just go into the next round. You can randomly generate phrases like this. The lathe to
make brass objects. That's terrible. He used the lathe to make brass objects. It's a bit better.
And then you just keep doing this. And then you can tab over here on the leaderboard, and you'll
see the best speech generators according to the site. Maybe less of a use case, but more of a
resource. Nevertheless, a fantastic site that I think could be useful to many of you. All right,
moving on. Very briefly, this just popped up on my radar a few minutes before I started
recording this, literally. And what this is, is a new interface that you can use inside of
Automatic1111 that generates images with transparent backgrounds. So if you're not aware, this
is not possible with any diffusion model as of now. DALL-E, Stable Diffusion XL, Midjourney,
it doesn't matter. None of them can do transparent backgrounds. That is actually a bigger deal than
you think, because once you have a transparent background that is actually flawless, you can
composite it really easily with other things. Like, think about a lot of the problems we have
with AI images now, like the text not working out properly. Well, if you could generate two images,
one where it's just the text, stylized just like the background image, and then the background
image, you could connect them really easily, and you wouldn't have to create everything in
one process, which seems to be a big problem with these diffusion models. Another great thing about
this is that it eliminates the need to remove the background in some workflows. So people are
building these workflows where they link a lot of nodes together, and oftentimes one of them removes
the background, and then everything can happen in an automated fashion. But the problem is, if
you have automated background removal, it's just not going to work every single time, especially
in cases like this, when you're trying to remove the background behind hair, for example. That's
particularly tricky. In Photoshop, there's an extra tool for that, and then you have different sliders
that you adjust to make it really look good. And if you use an automated tool, it's never really
going to be perfect at this. Well, you avoid that entire process if you generate the image without
the background from the get go. Same thing with glass, notoriously tricky, and you need to know
the specific Photoshop technique to do this well. And automated apps weren't good with this. So I'm
excited to see where this leads. We're going to cover the workflows people built with this when
something new comes out. But I thought this was significant enough to include this and to
let you know about it, as in my opinion, this is a bigger leap than most people realize. AI is just
becoming better every single week. Okay, moving on to something we're going to try here. It's a
new release by Stability AI. And in my humble opinion, as a non-3D artist, this is probably
the best image-to-3D model that I've seen yet. And again, this is just my very brief take,
I tested a few images, but it's actually quite good. And you can try it out over here on this
Hugging Face Space. Okay, so what we're going to do here is we're going to screenshot one of the
Sora generated videos. So this takes images as an input. So we need an image. And what I'm going to
do here is I remember this little character down here, I'm just going to take this guy. Oh, yeah,
that's the one. I love this little dude. And what we're going to do is we're just going to
screenshot him, that's important to get the full body. And then we're going to just upload him over
here. And look, in real time, I'm just going to say generate. And if this model is not overloaded,
it's actually surprisingly fast. So let's have a look. 17 seconds later. Not bad, right? Okay,
it looks like somebody punched him in the face or something. But this is pretty good. And I
found this result to be consistent across multiple examples. Obviously, some of the showcase examples
are some of the best ones you'll find. Look at all the detail on this robot. Not bad. So yeah,
Stability AI pushing the envelope on what's possible with these models. And again, I
understand that most people won't be using these. But yeah, I find it interesting that these use
cases do come up. And over time, we'll see these integrated into apps we know and love. And then
next up, a new feature in the AI video world: Pika Labs has shipped something for everybody where it
takes a video and it syncs the lips to the text that you provide it with. So if you generated a
synthetic voice, now you can make the character in the video match that. "Pika now supports lip
sync. And it is truly spectacular. Wait, hold up, hold up, hold up. These things can talk for real."
Now let me tell you, I went ahead and tested this and it's really hit or miss. Now I gotta say the
voices are really good because they partnered with ElevenLabs to generate those. But the lip syncing,
it's okay. If you have animated characters, it works really well. I wouldn't even bother trying
it on photorealistic things; that doesn't work at all. Or I think it would be more accurate to
say that it would never fool a human into thinking that's a real video of a human talking. But for animated
stuff, it can be really useful. And if you're trying to tell a story and your character talks,
it's a no brainer. You need this in your workflow. Okay, then I have one more use case for today. And
this one is really, let's say, I was very hesitant in putting this into the video. But you know what?
I think you should know about it. It's trending all across Instagram. And it's this app called
geospy.ai. And essentially, you put in an image and it tells you the geolocation of the image based
on what it sees. Now the thing is not fully rolled out. You can sign up for early access
to the pro version, which will be able to provide you with the exact coordinates of the image. And
this stuff is a bit creepy. And if I had a choice whether this thing should exist and be available to the
public or not, yeah, I would go with not, without hesitation. But that's just not how it works. This
thing exists, and I feel a responsibility to actually share the existence of it with you.
In my testing, it has been quite good. But as of now, it can only locate the city, not the exact
coordinates. Okay, so if I head on over to Google Maps, and I go into Street View, I'm just going
to screenshot a random part of this. So I don't know if you're not a GeoGuessr pro, I don't think
you could tell that this is a specific city. I don't know, maybe. And all I'm going to do is
upload this. And this tool is free right now. So you can try this on any image on your phone. I'm
just going to upload this and we're going to see if it identifies it correctly. And yeah, there
you go. Country: Portugal, city: Lisbon, estimated coordinates. Again, these just identify the city,
not the exact location; that is supposed to come in the pro version. Either way, I think it's better
to be aware of these things rather than shutting your eyes and pretending like these things are not
happening. Also, in our weekly newsletter, we did multiple segments on deepfakes and fake telecalls,
because I think it's really interesting for you to know about this stuff. The only way to protect
yourself is to be informed. If you're just going to shut your eyes and pretend like AI technologies
are not progressing, well, you're just opening yourself up to people abusing technologies like
this. You know, now that you've heard that an app like this is under development, and other
people are going to do that too, and in a few years it might be quite common, you might just
reconsider the next thing that you share on your Instagram story, or you might make your profile
private, whatever it might be. I think education around these topics is the only way forward. And
I hope I can contribute my part to that. And with that being said, those were all of this
week's use cases. If you want to learn more, we have a playlist with all of the previous videos.
So just because something came out a few weeks ago doesn't mean it's not super useful. Particularly
last week's videos seem to be really useful to a lot of things. I still use GigaBrain for research
most days. And this week, I've been using Claude every single day. So new and useful stuff is
coming out all the time. If you want to stay on top of it, subscribe to the channel. And I'll see
you next Friday when we look at next week's use cases. All right, that's it for now. See you soon.
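The compositing idea from the transparent-background segment (generate the text as its own layer, then place it over a separately generated background) comes down to the standard alpha "over" operation. What follows is a minimal pure-Python sketch on tiny hand-written pixel grids, assuming straight (non-premultiplied) alpha; the function names and sample pixels are hypothetical, and a real workflow would more likely use an image library.

```python
from typing import List, Tuple

# (r, g, b, a), each channel in 0.0..1.0, straight (non-premultiplied) alpha
Pixel = Tuple[float, float, float, float]
Image = List[List[Pixel]]


def over(fg: Pixel, bg: Pixel) -> Pixel:
    """Composite one foreground pixel over one background pixel."""
    fr, fgreen, fb, fa = fg
    br, bgreen, bb, ba = bg
    out_a = fa + ba * (1.0 - fa)
    if out_a == 0.0:
        # Both pixels fully transparent: nothing to blend.
        return (0.0, 0.0, 0.0, 0.0)

    def blend(f: float, b: float) -> float:
        return (f * fa + b * ba * (1.0 - fa)) / out_a

    return (blend(fr, br), blend(fgreen, bgreen), blend(fb, bb), out_a)


def composite(fg_img: Image, bg_img: Image) -> Image:
    """Composite two images of the same size, pixel by pixel."""
    return [[over(f, b) for f, b in zip(fg_row, bg_row)]
            for fg_row, bg_row in zip(fg_img, bg_img)]


# Hypothetical 1x3 "text" layer: opaque white, fully transparent, half-transparent red.
text_layer = [[(1.0, 1.0, 1.0, 1.0), (0.0, 0.0, 0.0, 0.0), (1.0, 0.0, 0.0, 0.5)]]
# Hypothetical opaque blue background of the same size.
background = [[(0.0, 0.0, 1.0, 1.0)] * 3]

result = composite(text_layer, background)
print(result)
```

Where the top layer is fully transparent the background shows through unchanged, which is why a layer generated with a clean alpha channel needs no background-removal step before it is combined with another image.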