Using Healthscores in ACI to troubleshoot issues
Summary
TL;DR: In this tutorial, Jody and Javad explore the use of health scores in Cisco's Application Centric Infrastructure (ACI) for troubleshooting network issues. Health scores are assigned to every object in ACI, including endpoints, endpoint groups, and application profiles. When a network component fails, the health score of every object that depends on it decreases, signaling a problem. Javad demonstrates this by shutting down an interface on a traffic generator, which causes the health score to drop. The video then guides viewers through using the health score to pinpoint the affected switch node and port, and explains how long health scores are expected to take to recover after the issue is resolved.
Takeaways
- 💡 Health scores in ACI are attached to every object, providing a unique way to monitor the health of the network fabric.
- 🔍 Endpoints, endpoint groups, and application profiles in ACI all have health scores that reflect their operational status.
- 📉 A decrease in health score indicates a potential issue with the corresponding network entity, such as a server or interface going down.
- 🛠️ ACI's health scores can guide troubleshooting by pinpointing which part of the application or network is affected when an issue occurs.
- 📊 The health score provides a quick overview of the network's health at the tenant level, with a score of less than 100 signaling a need for investigation.
- 🔎 Drilling down into the health score can lead to specific nodes and ports that are impacted, aiding in the identification of the root cause of a problem.
- 🔌 The video demonstrates a practical scenario where an interface is shut down, causing a health score decrease, and then restored to observe the recovery.
- ⏱️ There is a 'soaking period' during which the health score may not immediately return to 100% after an interface is brought back up.
- 🔗 The video emphasizes the importance of monitoring health scores and acting upon changes as part of maintaining a healthy ACI fabric.
- 📚 For further learning, the video suggests visiting unofficialaciguy.com or YouTube for more ACI how-to guides, design guides, and best practices.
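As a rough illustration of the "score below 100 means investigate" takeaway, the sketch below parses a tenant health payload and lists the tenants that need attention. The JSON shape (an `imdata` list containing `fvTenant` objects with a `healthInst` child whose `cur` attribute holds the current score) is modeled on APIC-style REST responses, but it is an assumption here and should be checked against your APIC version. The tenant name and the score are made-up sample values, and `tenant_health` and `needs_investigation` are hypothetical helper names, not part of any Cisco tooling.

```python
import json

# Illustrative payload only: the shape is an assumption modeled on an
# APIC-style response, and the tenant name and score are made up.
SAMPLE_RESPONSE = """
{
  "imdata": [
    {"fvTenant": {
      "attributes": {"name": "example-tenant", "dn": "uni/tn-example-tenant"},
      "children": [
        {"healthInst": {"attributes": {"cur": "80"}}}
      ]
    }}
  ]
}
"""


def tenant_health(payload: str) -> dict:
    """Map each tenant name found in the payload to its current health score."""
    scores = {}
    for item in json.loads(payload).get("imdata", []):
        tenant = item.get("fvTenant")
        if tenant is None:
            continue  # skip objects that are not tenants
        name = tenant["attributes"]["name"]
        for child in tenant.get("children", []):
            health = child.get("healthInst")
            if health is not None:
                scores[name] = int(health["attributes"]["cur"])
    return scores


def needs_investigation(scores: dict, threshold: int = 100) -> list:
    """Return the tenants scoring below the threshold, sorted by name."""
    return sorted(name for name, score in scores.items() if score < threshold)


if __name__ == "__main__":
    print(needs_investigation(tenant_health(SAMPLE_RESPONSE)))
```

Keeping the parsing separate from the threshold check means the same check can be reused for endpoint groups or application profiles if their scores are collected some other way.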
Q & A
What is the primary focus of the video?
-The video focuses on using health scores within the Application Centric Infrastructure (ACI) to troubleshoot issues in a network fabric.
What is unique about ACI's approach to troubleshooting?
-ACI has a unique approach where it attaches a health entity to every object within the system, including endpoints, endpoint groups, and application profiles, allowing for a health score to reflect the status of each component.
How does the health score system work when a server or interface fails?
-If a server or interface fails, the health score, which is typically 100, will decrement to reflect the reduced health of the application or entity, indicating a problem.
What action does the video demonstrate to simulate a network issue?
-The video demonstrates shutting down an interface on a traffic generator to simulate a network issue and observe how ACI's health scores react to this change.
How does ACI respond when an interface is down?
-When an interface is down, ACI reads the configuration to understand its role in an application and then updates the health score to notify the administrator of the issue.
What steps are taken to identify the problematic interface in the video?
-The video shows how to drill down from the application profile to the endpoint group and then to the specific node and port to identify the problematic interface.
What is the significance of the health score dropping to 80 in the video?
-A health score dropping to 80 indicates that the application profile has been impacted, suggesting that one or more of the interfaces associated with the profile are down.
How does the video demonstrate the resolution of the network issue?
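-The video demonstrates resolving the issue by bringing the shut-down interface back up. A cable that was unplugged and plugged back in is mentioned only as a hypothetical cause that a server team might find. Once the interface is up, the health score is expected to recover over time.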
What is the 'soaking period' mentioned in the video?
-The 'soaking period' refers to the delay before faults clear and the health score returns to 100 after an interface is brought back up. The presenters estimate two to three minutes, although in the demonstration the score recovered in under 30 seconds.
What additional guidance does the video provide for monitoring health scores?
-The video advises that if an interface is brought back up and the health score does not immediately improve, it is normal due to the soaking period, and one should wait for the health score to update.
Where can viewers find more resources on ACI?
-Viewers can find more resources on ACI, including how-to's, design guides, and best practices, on the website unofficialaciguy.com or on YouTube.
Outlines
🔍 Exploring Health Scores in ACI for Troubleshooting
This paragraph introduces the concept of health scores within the Cisco Application Centric Infrastructure (ACI) and how they can be utilized for troubleshooting network issues. Jody explains that every object in ACI, such as endpoints, endpoint groups, and application profiles, has an associated health score. The health score is designed to reflect the status of network components and can help identify when a device or interface fails. Jody demonstrates this by having Javad disable an interface on a traffic generator, which is part of an application within ACI. The health score decreases from 100 to 80, indicating the impact. The video then guides viewers through the process of identifying the problematic interface by drilling down into the health scores of different components until the specific node and port are identified.
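The drill-down described above can be sketched as a walk over a tree of health-scored objects, following the lowest-scoring child at each level. This is a toy model for illustration, not how the APIC computes or exposes health. The scores 80 and 30, node 1004, and port eth1/5 come from the video; the tenant and application profile names, the unaffected EPG, and the node and port scores are made-up placeholders, and `HealthNode` and `drill_down` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class HealthNode:
    """One object in a simplified health tree (toy model, not an APIC class)."""
    name: str
    score: int
    children: list = field(default_factory=list)


def drill_down(node: HealthNode) -> list:
    """Follow the lowest-scoring child at each level and return the path of names."""
    path = [node.name]
    while node.children:
        worst = min(node.children, key=lambda child: child.score)
        if worst.score >= 100:
            break  # everything below this level is healthy
        path.append(worst.name)
        node = worst
    return path


# Placeholder names and scores except where noted in the lead-in.
EXAMPLE_TREE = HealthNode("example-tenant", 80, [
    HealthNode("example-app-profile", 80, [
        HealthNode("net-100-epg", 30, [
            HealthNode("node-1004", 30, [
                HealthNode("eth1/5", 0),
            ]),
        ]),
        HealthNode("unaffected-epg", 100),
    ]),
])

if __name__ == "__main__":
    print(" -> ".join(drill_down(EXAMPLE_TREE)))
```

The walk stops as soon as every child is at 100, which mirrors the manual process of clicking into whichever object shows a reduced score until a node and port are reached.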
🔧 Restoring Health Scores and Interface Recovery in ACI
In this paragraph, the focus is on recovering a network interface in ACI and the subsequent recovery of health scores. It begins with the tenant-level view, where a health score below 100% produces a warning and signals that something needs investigation, even though no fault was listed at the moment Javad checked. The video then shows the interface being brought back up, emphasizing that there is a 'soaking period' before the health score reflects the restored status. Jody estimates this at two to three minutes and reassures viewers that it is normal for the score not to return to 100% immediately. In the demonstration the score actually returns to 100% in under 30 seconds, confirming that the issue is resolved. The video ends with a prompt for viewers to visit unofficialaciguy.com or YouTube for more resources on ACI.
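Waiting out the soaking period can be sketched as a polling loop with a timeout. This is a minimal sketch under stated assumptions: `fetch_score` is a hypothetical callable you would supply (for example, one that reads the score from the APIC), and the default timeout and interval are arbitrary choices, not values recommended in the video. The sleep and clock functions are injectable so the loop can be exercised without real waiting.

```python
import time


def wait_for_recovery(fetch_score, target=100, timeout_s=300.0, interval_s=10.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_score() until it reaches target or timeout_s elapses.

    Returns True if the score recovered, False if the timeout was hit first.
    """
    deadline = clock() + timeout_s
    while True:
        if fetch_score() >= target:
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated readings instead of a real controller: 80, then 90, then 100.
    readings = iter([80, 90, 100])
    recovered = wait_for_recovery(lambda: next(readings), sleep=lambda s: None)
    print("recovered" if recovered else "still degraded")
```

A timeout well above the two to three minutes mentioned in the video leaves room for the soaking period before the result is treated as a real problem.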
Keywords
💡Health Scores
💡ACI (Application Centric Infrastructure)
💡APIC (Application Policy Infrastructure Controller)
💡Endpoints
💡Endpoint Groups
💡Application Profiles
💡Troubleshoot
💡Traffic Generator
💡Interface
💡Static Path Binding
💡Soaking Period
Highlights
Introduction to using health scores in ACI for troubleshooting fabric issues.
Explaining the unique feature of health entities attached to every object in ACI.
Description of how health scores reflect the status of endpoints and application profiles.
Demonstration of how health scores change when a server or interface is lost.
Practical example of taking down an interface and observing health score changes.
Explanation of how ACI reads intent and notifies when a configured interface is down.
Guidance on using health scores to identify the problematic interface.
Drilling down into the health score to pinpoint the affected node and port.
Discussion on the importance of static path binding in identifying switch and port issues.
Procedure for checking the health score after resolving an issue.
Explanation of the expected time for health scores to improve after an interface is brought back up.
Observation of the health score returning to 100 after an interface is restored.
Emphasis on the importance of investigating when health scores are less than 100%.
Instruction on how to find faults when the health score indicates a problem.
Highlight of the warning system in ACI that alerts users to health score drops.
Summary of the process for using health scores to troubleshoot and resolve network issues in ACI.
Encouragement for viewers to explore more ACI how-to's, design guides, and best practices.
Transcripts
welcome to unofficial aci guy
this is jody today we're going to look
at using health scores in aci to
troubleshoot issues in our fabric
let's take a look all right now we're
going to look at the
additional ways that we can troubleshoot
using the apic
now like all other networking
devices aci and the apic will
allow you to configure and send syslogs
snmp traps things such as that but
one of the unique things about aci is
the way that this
is built we have
a health entity attached to every
object in aci so what i mean by every
object
the endpoint in aci which is your bare
metal server or a vm
that makes up your applications the
endpoint
groups the application profiles all of
those have health scores tied to them
the idea being is that if we have 10 or
15 devices that make up
an application in one of our networks
and we lose a server we lose an
interface from a server
that 100 that you see there should be
decremented it should go down
to indicate that the health of that
application
or that entity has gone down so we're
going to
have javad come through he's going to go
in and take down
an interface to the spirent traffic
generator that we've been using to do
some of the testing
and because that is configured we've
configured that
interface to be a part of one of our
applications aci will
read our intent and say hey you've
configured this
to be a part of your application it's
now down i need to let you know about
that
so we're going to shut down an interface
we're not going to tell you which one it
is
but we're going to use the health score
inside of the tenant
to help guide us to which interface it
is so javad
take it away thank you uh here as you
can see
i'm looking at tenant part and
the health score right now shows 100 so
everything looks good
i'm going to go to my traffic generator
i've got
palmer sending to dingo traffic back and
forth
and i'm going to go in palmer i'm going
to shut
this interface offline
we lost the link if i go back here
and let me refresh my screen
as you can see now we have some
losses and
now we're going to go ahead and drill in so
from the
application profile point of view
net 100 epg net 102 and net 104 they've
been impacted
and you can see the health score has gone
down to 80. so let's just focus on one
of these
i'm going to look at net 100 epg drill
it down
and you see now it's 30 click again
so now it tells us node 1004
location and now if i go
over here it tells us the port eth 1/5
is the one that's connected to the traffic
generator and has been impacted
so if we go in sorry john if we go into
that application epg now that it's
given us the 1004
eth 1/5 we can look at the
static path binding
and see if we have a static path
binding for that right
that's right here static node 1004
eth 1/5 okay so
so that's a quick way to find out
which switch and which
port
were impacted
so if we go back to the if we go back
and
we you know whatever we talk to the
server team
they figure out the cable was unplugged
or whatever and they go through and
they plug it back in
i would expect these scores to improve
after that interface comes back up
correct
that's correct so also from the system
point of view
and if you look at tenant oh you get a
warning
yeah you see that the warning on
so if i click on the
less than 99
which brings me back to and also you see
from the summary
there is something going on see that it
says eddie
so then i'll go to my health
and as you see health is less than 100
percent
and let me see if there's any faults
here right now no there's no fault but
since the interface has gone down and
the health score
is definitely lower we need to investigate
that's how we you know
right okay
oh very good
so what do we take bring the interface
back up
and then i mean it will take it will
obviously take a little while for
once the interface comes up the health
isn't going to
improve right away there is a um a
soaking period that these
uh faults and that these health scores
go through
uh before it will come back to 100 so
it'll take two to three minutes
for this uh for this to be reflected in
inside of aci
so if you're bringing an interface back up
and it doesn't come back right away
don't worry about that that's expected
i just brought up the interface so we're
just going to have to wait and
as i mentioned it takes a couple of
minutes for it to
to show their health go back to 100.
party back up so that took maybe we
won't have to pause the video
let's turn it back to 100 or so that
took less than
30 seconds for it to come back up okay
very good well that's it for health
scores inside of aci javad thanks
for your time
luis thanks sir thank you thank you
thanks for watching the video today if
you'd like more aci how-to's design
guides and best practices check us out
on the web at unofficialaciguy.com or on
youtube