Breaking the Chain of Blame: How to Get True Test Observability? - Ken Hamric, Tracetest.io

CNCF [Cloud Native Computing Foundation]
29 Jun 2024 · 16:29

Summary

TL;DR: The talk shows how test observability can break the "chain of blame". It walks through a case study of a Node.js application in which a failing Playwright test is traced to a deprecated third-party API endpoint. The speaker advocates using observability proactively, in pre-production as well as production and by the whole team rather than only SREs, so that issues are caught before they reach users. The talk also introduces trace-based testing as a way to diagnose failures faster and without blame.

Takeaways

  • 🔍 **Observability in Testing**: The speaker emphasizes the importance of using observability not just in production but also in pre-production to proactively find issues before release.
  • 🛠️ **Breaking the Blame Game**: The session aims to illustrate how observability can help break the cycle of blame in software development by providing clear insights into where issues originate.
  • 🤖 **Playwright Test Example**: The script uses a Playwright test on a Node.js application to demonstrate how observability can be integrated into testing to identify problems.
  • 📈 **Trace Test Observability**: Introducing the concept of test observability, which includes having a trace for every test and using those traces for trace-based testing.
  • 🔗 **System Under Test**: The application used in the example is a simple Node.js app with a frontend, two services, and an asynchronous process that communicates with an external API.
  • 🔬 **Blame Game Scenario**: The script outlines a scenario where a test failure leads to a blame game between different teams, highlighting the inefficiency of this approach.
  • 👥 **Cross-Functional Collaboration**: The need for collaboration between QA engineers, automation engineers, backend developers, and external vendors is stressed to effectively utilize observability.
  • 📊 **Test Observability Benefits**: The benefits of test observability are discussed, including the ability to diagnose problems faster and reduce the impact of the blame game.
  • 🛑 **Root Cause Analysis**: The script describes how observability helped in identifying the root cause of a test failure, which was traced back to a deprecated third-party API.
  • 🔄 **Actionable Insights**: The session concludes with actionable insights, encouraging the use of existing instrumentation, adding observability in both pre-prod and prod, and adopting a proactive rather than reactive approach to testing.
  • 📝 **Postmortem Analysis**: The importance of conducting a postmortem after an outage to understand the root cause and prevent future occurrences is highlighted.

Q & A

  • What is the main goal of the presentation?

    -The main goal is to introduce the concept of using observability in a testing environment, specifically in pre-production, to proactively prevent issues rather than only using it reactively in production.

  • Who are the key participants mentioned in the presentation?

    -The key participants are QA engineers, automation engineers, backend developers, a front-end developer, an engineer from an external vendor (PokeAPI), and the pointy-haired boss, who is also the founder of Tracetest.

  • What problem is being addressed in the presentation?

    -The problem addressed is the failure of a Playwright test in the pipeline, which halted the process. The issue was traced back to a third-party API endpoint that had been deprecated.

  • What does the application under test consist of?

    -The application is a Node.js app with a React front end, two services, and an API. It verifies requests, throws them on a message bus, returns a 200 status, and performs asynchronous processing involving a cache, PokeAPI, and a database.
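The accept-then-process flow described above can be sketched as follows. This is a minimal illustration, not the demo application's code: the names `handleImport`, `processQueue`, and `fetchPokemon` are hypothetical, and an in-memory array and two maps stand in for the message bus, cache, and database. What it shows is that the API returns 200 as soon as the request is queued, so a failure in the asynchronous step is invisible to the caller.

```typescript
// Hypothetical sketch: every name here is illustrative, not the demo app's real code.
type ImportRequest = { id: number };
type Pokemon = { id: number; name: string };

const queue: ImportRequest[] = [];           // stands in for the message bus
const cache = new Map<number, Pokemon>();    // stands in for the cache
const database = new Map<number, Pokemon>(); // stands in for the database

// API side: validate the request, enqueue it, and acknowledge immediately.
function handleImport(body: unknown): { status: number } {
  const id = (body as { id?: unknown } | null)?.id;
  if (typeof id !== "number" || !Number.isInteger(id) || id <= 0) {
    return { status: 400 };
  }
  queue.push({ id });
  return { status: 200 }; // the work is accepted, not yet completed
}

// Worker side: drain the queue, consulting the cache before the external API.
async function processQueue(
  fetchPokemon: (id: number) => Promise<Pokemon>
): Promise<string[]> {
  const errors: string[] = [];
  while (queue.length > 0) {
    const { id } = queue.shift()!;
    try {
      let pokemon = cache.get(id);
      if (!pokemon) {
        pokemon = await fetchPokemon(id); // the external call that can fail
        cache.set(id, pokemon);
      }
      database.set(id, pokemon);
    } catch (err) {
      errors.push(`import ${id} failed: ${(err as Error).message}`);
    }
  }
  return errors;
}
```

Because the caller already holds a 200 by the time `processQueue` runs, a UI test only notices the problem at a later step that depends on the imported record, which is consistent with the failure surfacing during the delete step.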

  • What specific problem occurred with the Playwright test?

    -The Playwright test failed during the delete Pokemon process. It was found that the import process was not working, leading to the test failure.

  • How was the root cause of the issue identified?

    -The root cause was identified by using observability tools to trace the process, which showed errors when reaching out to the PokeAPI. It was found that the API endpoint had been deprecated.
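To illustrate how a trace points at the origin of a failure, here is a small sketch that picks out the deepest errored spans, meaning the errored spans that have no errored child. The `TraceSpan` shape, the span names, and the error messages are made up for the example and are not taken from the talk's actual trace.

```typescript
// Hypothetical sketch: span shape and sample data are illustrative only.
type TraceSpan = {
  id: string;
  parentId?: string;
  name: string;
  error?: string; // present only when the span recorded a failure
};

// Return errored spans with no errored child: the deepest failures in the trace.
function rootCauseSpans(spans: TraceSpan[]): TraceSpan[] {
  const errored = spans.filter((s) => s.error !== undefined);
  const parentsOfErrors = new Set(errored.map((s) => s.parentId));
  return errored.filter((s) => !parentsOfErrors.has(s.id));
}

// Made-up trace: the worker fails because its outbound HTTP call failed.
const sampleTrace: TraceSpan[] = [
  { id: "1", name: "POST /pokemon/import" },
  { id: "2", parentId: "1", name: "queue publish" },
  { id: "3", parentId: "2", name: "import worker", error: "import failed" },
  { id: "4", parentId: "3", name: "HTTP GET external-api", error: "request failed" },
];

const suspects = rootCauseSpans(sampleTrace);
```

In this sample the worker span and the outbound HTTP span are both marked as errors, but only the HTTP span is returned, so attention goes to the external call rather than to the service that merely propagated the error.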

  • What concepts are introduced to resolve such issues?

    -The concepts of test observability and trace-based testing are introduced. Test observability means capturing a trace for every test run, and trace-based testing means using those traces as the basis for test assertions.

  • How does Trace-based testing work?

    -Trace-based testing validates a system's behavior using the trace it generates while a test runs, rather than only the response returned to the caller. Assertions are made against the spans in the trace to verify expected behavior at each step.
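A minimal sketch of span-level assertions follows. It is not Tracetest's syntax or API: the `RecordedSpan` and `SpanAssertion` types, the span names, and the sample status codes are assumptions made for illustration. Each assertion selects a set of spans and then checks a condition on every selected span.

```typescript
// Hypothetical sketch: types, span names, and attribute values are illustrative.
type RecordedSpan = {
  name: string;
  attributes: Record<string, string | number>;
};

type SpanAssertion = {
  description: string;
  select: (span: RecordedSpan) => boolean; // which spans the rule applies to
  check: (span: RecordedSpan) => boolean;  // what must hold for each of them
};

// Run every assertion against the trace and collect human-readable failures.
function runTraceTest(spans: RecordedSpan[], assertions: SpanAssertion[]): string[] {
  const failures: string[] = [];
  for (const assertion of assertions) {
    const selected = spans.filter(assertion.select);
    if (selected.length === 0) {
      failures.push(`${assertion.description}: no matching span`);
      continue;
    }
    for (const span of selected) {
      if (!assertion.check(span)) {
        failures.push(`${assertion.description}: failed on "${span.name}"`);
      }
    }
  }
  return failures;
}

// Made-up spans standing in for the trace captured during one test run.
const capturedSpans: RecordedSpan[] = [
  { name: "POST /pokemon/import", attributes: { "http.status_code": 200 } },
  { name: "HTTP GET external-api", attributes: { "http.status_code": 500 } },
];

const importAssertions: SpanAssertion[] = [
  {
    description: "every HTTP span returns 200",
    select: (s) => "http.status_code" in s.attributes,
    check: (s) => s.attributes["http.status_code"] === 200,
  },
  {
    description: "the import writes to the database",
    select: (s) => s.name.startsWith("db "),
    check: () => true,
  },
];

const traceTestFailures = runTraceTest(capturedSpans, importAssertions);
```

With this sample data the first rule fails on the external call and the second fails because no database span exists at all, which is the kind of evidence a response-only check would miss when the API has already returned 200.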

  • What are the benefits of using test observability?

    -The benefits include quicker identification of issues, better trace quality as developers start caring about traces, and proactive prevention of problems by verifying in pre-production environments.

  • What are the key takeaways from the presentation?

    -Key takeaways include leveraging existing instrumentation and observability backends, using observability in both pre-production and production environments, using it proactively, and increasing its adoption across the entire team to reduce the blame game.

Related Tags
Observability, Testing, Diagnostics, Blame Game, Trace Testing, Automation, API Testing, DevOps, SRE, Quality Assurance, System Monitoring