Managing Observability Data at the Edge with the OpenTelemetry Collector and OTTL - Evan Bradley
Summary
TL;DR: In this session, Evan, a contributor to the OpenTelemetry Collector and OTTL, introduces the audience to the Collector's capabilities and OTTL's role in observability pipelines. He presents a case study of 'Global Telescopes' to demonstrate how to use OTTL for data processing, including filtering, parsing, and redacting sensitive data. The session also covers routing and sampling strategies, showcasing the flexibility of OTTL for telemetry data management. Evan wraps up with a sneak peek at upcoming features and concludes with a Q&A, inviting participants to explore and test OTTL's capabilities.
Takeaways
- 📈 Evan has been contributing to the OTel Collector and OTTL for about two years.
- 🔄 The OTel Collector is middleware in observability pipelines, processing and routing data through its internal pipeline model.
- 🔍 OTTL is a language for reading from and writing to data as it flows through the OTel Collector, offering flexibility and a common configuration format.
- 🏭 A hypothetical case study is presented involving a company called Global Telescopes, which has applications hosted worldwide and deals with data privacy and telemetry processing.
- 🛠 The Collector processes data through receivers, processors, and exporters, allowing for filtering, enriching, and routing of data.
- 🔐 Data redaction and handling of PII are demonstrated, including the use of hashing (SHA-256) to protect sensitive information.
- 📊 The routing and sampling of data in a centralized collector are covered, with a focus on handling errors and payment service data.
- 📄 Examples of OTTL statements and configurations are provided, illustrating the setup and processing within the OTel Collector.
- 🆕 New features include optional parameters, functions as parameters, and additional functions that enhance the capabilities of OTTL.
- 🔧 Future improvements aim to handle list values and stabilize OTTL in the transform processor, making the system more user-friendly.
Q & A
What is the main focus of Evan's presentation?
-Evan's presentation focuses on the OTel Collector and OTTL, including an introduction to these tools, a hypothetical case study, and how to apply these components to various setups.
What are the OTel Collector and OTTL?
-The OTel Collector is a middleware in observability pipelines that processes and routes data. OTTL is an easy-to-read language that allows reading from and writing to data as it flows through the Collector.
How does the internal pipeline model of the OTel Collector work?
-The internal pipeline model of the OTel Collector consists of different components, such as receivers, processors, and exporters. Data enters through receivers, is processed by processors, and is sent out by exporters.
What is the purpose of connectors in the OTel Collector?
-Connectors in the OTel Collector are used to connect pipelines and perform tasks such as routing data. They add flexibility to the pipeline model.
What is the main advantage of using OTTL in the OTel Collector?
-The main advantage of using OTTL is its flexibility and common configuration format, which allows users to work with data in the Collector without worrying about input or output formats.
What is the hypothetical case study presented by Evan?
-The case study involves a company called Global Telescopes, which needs to handle data from applications hosted worldwide, comply with local data privacy laws, and process telemetry data through sidecar collectors and a centralized collector.
How does the filter processor help in the case study?
-The filter processor is used to filter out unnecessary data, such as noisy logs at debug level, to reduce the amount of data that needs to be processed further.
What is the purpose of redacting data in the case study?
-Redacting data, such as purchase IDs considered PII, ensures that sensitive information is not exposed when data leaves the region. It also helps in handling data deletion requests by hashing the PII.
How does the routing connector function in the case study?
-The routing connector directs data to the appropriate backend based on annotations. Data from newly acquired teams is routed to their old backend, while other data is routed to the company-wide backend.
What new features have been added to OTTL recently?
-Recent additions to OTTL include optional parameters, functions as parameters, and 15 new functions. Future improvements are planned for handling list values and stabilizing OTTL in the transform processor.
Outlines
👋 Introduction and Overview
Evan introduces himself and outlines the session's agenda. He discusses the OTel Collector and OTTL, their components, and their functionality. He also sets the stage for a hypothetical scenario to demonstrate the practical application of these tools.
🔄 OTTL and Data Processing
Evan explains OTTL's role in data processing within the OTel Collector. He provides a brief overview of OTTL's syntax and its use for setting attributes. He then moves into the hypothetical case study of Global Telescopes, detailing how data is processed, filtered, and routed through different collectors.
🔍 Filtering and Parsing Data
Evan elaborates on the process of filtering and parsing logs using the OTel Collector. He describes the use of the filter processor to remove unnecessary data, like noisy logs, and the parsing of JSON logs into a structured format. He also introduces the concept of redacting sensitive information, such as purchase IDs, using SHA-256 hashing.
🔄 Routing and Sampling
Evan discusses the routing and sampling of data in the OTel Collector. He explains how data is routed to different pipelines based on annotations and how tail sampling is used to reduce data volume. Specific conditions for sampling, such as errors and payment services, are highlighted.
🆕 New Features and Testing
Evan covers recent updates and new features added to OTTL, including optional parameters and functions as parameters. He also talks about upcoming improvements and provides links to documentation. Additionally, he addresses questions from the audience about persistence, retry logic, and the importance of hashing PII close to the source.
Keywords
💡Observability
💡OTel Collector
💡OTTL (OpenTelemetry Transformation Language)
💡Pipeline
💡Receivers
💡Processors
💡Exporters
💡Connectors
💡Case Study
💡Filter Processor
💡Tail Sampling
Highlights
Introduction to Evan, the speaker, who has been contributing to the OpenTelemetry Collector and OTTL for about two years.
Overview of the agenda for the session, including an introduction to the OpenTelemetry Collector, OTL, and a case study.
The OpenTelemetry Collector is a middleware for observability pipelines, capable of processing and routing data with an internal pipeline model.
Explanation of the components in the Collector's pipeline, including receivers, processors, and exporters.
The innovative use of connectors to link pipelines together for advanced routing and processing.
OTTL (the OpenTelemetry Transformation Language) is introduced as an easy-to-read language for data manipulation within the Collector.
OTTL's flexibility and standardization across different components in the Collector.
Case study of Global Telescopes, a conglomerate dealing with data privacy laws and telemetry processing.
Use of sidecar collectors to filter and redact data before it leaves the region, complying with local data privacy laws.
Introduction of a hypothetical situation where Global Telescopes acquired a new company, requiring data routing changes.
Implementation of tail sampling to reduce costs in the centralized collector's data processing.
The use of filter processors to eliminate noisy logs and unnecessary span events.
OTTL's cache feature for temporary data storage during pipeline processing.
Parsing JSON logs into a structured format for further processing and redaction.
Redaction of PII (Personally Identifiable Information) such as purchase IDs using hashing techniques before data transmission.
Handling data deletion requests by hashing identifiers to locate and remove data from the backend.
Routing data using the routing connector with OTTL support for efficient data management.
Tail sampling policy implementation to selectively process and reduce data volume in the companywide backend.
New features in OTTL, including optional parameters and an expansion of functions to enhance flexibility.
Future plans for OTTL to handle list values and to stabilize OTTL in the transform processor for improved functionality.
Debugging capabilities added to OTTL for testing and development, including debug logging and traces.
Transcripts
All right, hello Seattle, how are we doing? All right, good, just making sure you're awake. A little bit about me: I'm Evan, and I help maintain both the OTel Collector and OTTL, both of which I've been contributing to for roughly two years now. Before we get started, I want to quickly go over what we're going to cover today. I'm going to give a quick intro to the OTel Collector; I know we've talked about it a little bit, but we're going to go a little deeper, so I just want to make sure everyone's on the same page. Then I'm going to cover what OTTL is. After that I'm going to introduce a hypothetical situation that I've devised that we're going to solve using OTTL and a handful of popular components, and I'm hoping that by the end of this you'll have an idea of how you could apply these components to your own setups.
First, for anyone who isn't familiar: the Collector is a middleware in your observability pipelines and can process and route data as it flows through. The Collector's flexibility comes from its internal pipeline model, which is composed of different components that you can string together. Data comes into the Collector through receivers (you'll see those on the left there), which translate an external format into the Collector's internal pipeline data format, and the data stays in this internal format until it leaves the Collector. After receivers, the data goes through processors, which can redact, filter, enrich, etc., and then finally it leaves the Collector at exporters, which translate it into an external format and send it somewhere else, for example to an OTLP endpoint. Something really cool, though, is that you can connect pipelines together with components called connectors. You can do all sorts of things with connectors, but the one thing I'm going to focus on today is routing data.
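To make the pipeline model concrete, here is a minimal, hypothetical Collector config sketch showing receivers feeding processors feeding exporters, with a connector joining two pipelines. The component choices (forward connector, otlphttp exporter, the example endpoint) are my own illustration, not from the talk.

```yaml
receivers:
  otlp:
    protocols:
      grpc: {}

processors:
  batch: {}

connectors:
  # The forward connector simply passes data from one pipeline into another.
  forward: {}

exporters:
  otlphttp:
    endpoint: https://backend.example.com

service:
  pipelines:
    traces/in:
      receivers: [otlp]
      processors: [batch]
      exporters: [forward]   # data leaves this pipeline through the connector...
    traces/out:
      receivers: [forward]   # ...and enters this one before reaching the real exporter
      exporters: [otlphttp]
```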
So, a quick intro to OTTL. OTTL is an easy-to-read language that allows reading from and writing to data in place as it flows through the Collector. It's steadily becoming the standard way to work with data in the Collector, since it's flexible and offers a common configuration format across a whole bunch of different components that use it. And since all Collector components work with this internal data format, you can use OTTL without having to worry about the input or output format of the data. At the bottom here you can see an example OTTL statement; this just sets an attribute where some name matches a regular expression, so hopefully pretty straightforward and easy to read.
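The slide itself isn't reproduced here, but a statement of the kind described, setting an attribute where a name matches a regular expression, might look like this in the transform processor. The attribute key and pattern are made up for illustration.

```yaml
processors:
  transform:
    trace_statements:
      - context: span
        statements:
          # Set a hypothetical attribute on spans whose name matches a regex.
          - set(attributes["app.route_group"], "checkout") where IsMatch(name, "^/cart/.*")
```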
Moving into the case study, let's consider a company, Global Telescopes, a telescope manufacturing and sales conglomerate that sells to organizations worldwide. To serve its customers where they are, it has applications hosted in regions around the world, but this comes with some complexity: to deal with local data privacy laws and scale their telemetry processing, the applications send their data to sidecar collectors that filter out extra data and redact it before it leaves the region. After it leaves the region, the data is collected into a centralized collector, where they can take actions that need to be handled companywide or otherwise require a single collector instance. For this example, let's say the gateway collector needs to do two things. First, as conglomerates do, Global Telescopes just acquired another company that is being added to its consumer retail arm, and the new teams haven't yet fully integrated with the rest of the company, so the data from their apps needs to be routed into their old backend. Second, the rest of the data, which is routed into the companywide backend, needs to be sampled to cut down on costs; they want to use tail sampling for this, which needs to happen in a single collector to work properly. So if you look at this setup here, this is basically a pipeline diagram for a single region, showing what this would look like inside the collectors we've configured. Data comes in through OTLP, is processed in the sidecar by the filter and transform processors, and is then sent on to the second collector, which determines where the data needs to go using the routing connector and finally samples, with the tail sampling processor, the data that ends up in the companywide backend.
Diving into it, let's start with the sidecar. First, we want to use the filter processor to filter out a bunch of data. Sometimes developers leave their log level at a more verbose setting than they should and these logs are pretty noisy, so we want to filter them out; or, as Jamie was talking about earlier, maybe they have some extra span events from their instrumentation that we want to get rid of. Regardless, the filter processor can do the job. For this example specifically, Collector log severity levels are stored as integers and lower numbers mean noisier logs, so we want to cut out anything that's at debug level or lower. However, we still want to keep info- and error-level logs, so those are passed through. With that we've cut out quite a bit of data.
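A minimal sketch of a filter processor config along those lines, using the severity-number fields the processor exposes through OTTL; the exact conditions from the slide aren't shown here.

```yaml
processors:
  filter:
    error_mode: ignore
    logs:
      log_record:
        # Drop log records below INFO severity (i.e. DEBUG and TRACE).
        # Note: records with an unset severity_number (0) would also match this condition.
        - severity_number < SEVERITY_NUMBER_INFO
```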
We can now move on to additional processing on the logs: we want to parse them. The logs are sent to the Collector in JSON, but we want them in a structured format, so we need to parse them here. Before we dive too deep into this, I do want to call out OTTL's cache feature, which basically serves as a way to store temporary data inside a map while you're working with something. The cache starts out empty before the first statement, and after the very last statement it's cleared once again before the next payload comes in, so it's purely temporary and is only used to hold data while we're doing these computations. What we're going to do is parse the body, put it in the cache, and then take the parsed attributes map and parsed log body out and put them on the structured log. As a result of this, we get a structured log.
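A sketch of what those statements could look like in the transform processor, following the pattern from the processor's documentation. The "attributes" and "message" fields of the JSON payload are hypothetical names.

```yaml
processors:
  transform:
    log_statements:
      - context: log
        statements:
          # Parse the JSON body into the temporary cache map.
          - merge_maps(cache, ParseJSON(body), "upsert") where IsMatch(body, "^\\{")
          # Promote the parsed attributes map onto the log record's attributes.
          - merge_maps(attributes, cache["attributes"], "upsert") where cache["attributes"] != nil
          # Replace the raw JSON body with the parsed message field.
          - set(body, cache["message"]) where cache["message"] != nil
```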
With that, we can now redact some things out of it. Let's say we have a purchase ID and it's considered PII, so we want to redact it before sending it to our backend. To make things interesting, let's say we also need to handle data deletion requests. In order to do that, we would need to be able to locate the data inside our backend given some customer input; say they give us the purchase ID and we want to be able to find the equivalent data for it. We could hash it, say with a SHA-256 hash. First, I'm not a lawyer, so don't take this as advice for how to handle PII; it's just illustrative. But what we could do here is take the purchase ID out of an attribute, match it with a regular expression, use the match group as input to a SHA-256 function, and then replace the attribute with the result. The cool thing about this is that the SHA256 function is actually an OTTL function; it's not hardcoded into the replace_pattern function at all, and you could replace it with whatever hashing algorithm you wanted. Additionally, that whole function argument is optional, so if you didn't pass it in, it would just use the capture group without hashing it at all, but if you want it, you can have it there too.
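A sketch of the redaction statement being described. The attribute name and regex are illustrative; SHA256 here is an ordinary OTTL converter passed in as the optional function argument.

```yaml
processors:
  transform:
    log_statements:
      - context: log
        statements:
          # Replace the purchase ID with the SHA-256 hash of the capture group.
          # Dropping the trailing SHA256 argument would keep the raw capture group instead.
          - replace_pattern(attributes["purchase.id"], "^(.*)$", "$$1", SHA256)
```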
So with all of the data redacted, the data can now safely leave the region, and we can move on to the second collector, whose pipelines are highlighted in this diagram. We're going to route and then we're going to sample. Let's start with routing: after we route, if the data goes to the retail pipeline, no more processing is needed, that team will handle it, so we can cover that quickly. We can do it with the routing connector. The routing connector also supports OTTL, and we have it pretty easy here: all of our applications are already annotated with the branch of the company they belong to, so we can just say that anything starting with retail goes to the retail backend, and we're done; anything else goes to the companywide backend. With that, we're done with the retail pipeline.
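A sketch of the routing connector configuration being described. The attribute name business.unit and the pipeline names are assumptions; the talk only says applications are annotated with their branch of the company.

```yaml
connectors:
  routing:
    # Anything not matched below goes to the companywide pipeline.
    default_pipelines: [traces/companywide]
    table:
      # Route anything annotated as part of the retail arm to the retail pipeline.
      - statement: route() where IsMatch(attributes["business.unit"], "^retail")
        pipelines: [traces/retail]
```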
So we can move on to our companywide pipeline, which is mostly for industrial telescopes, and again we need to do sampling there, so we're going to use this tail sampling policy. It does a handful of checks using OTTL before determining whether to sample a trace: if any of these conditions matches, the trace is sampled; if none of them match, it's dropped. First, we want to make sure that if there's an error we know about it, so we're going to sample all errors. Then, we really like to be paid, so anything from the payment service is definitely sampled. For everything else we use a pretty rudimentary sampling algorithm: we get the hexadecimal representation of our random trace ID and check whether it starts with one of the 16 hexadecimal characters, in this case "a", which gives us a one-in-16 chance of the data being sampled. With this we've cut down the data that we have, we're good to go, and the data is then forwarded to the companywide backend.
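A sketch of tail sampling policies matching that description, using the processor's OTTL-based condition policy; the payment service name is a guess.

```yaml
processors:
  tail_sampling:
    policies:
      # Keep every trace that contains an error.
      - name: errors
        type: ottl_condition
        ottl_condition:
          error_mode: ignore
          span:
            - status.code == STATUS_CODE_ERROR
      # Keep everything from the payment service.
      - name: payment-service
        type: ottl_condition
        ottl_condition:
          error_mode: ignore
          span:
            - resource.attributes["service.name"] == "payment"
      # Roughly 1-in-16 sampling: keep traces whose hex trace ID starts with "a".
      - name: one-in-sixteen
        type: ottl_condition
        ottl_condition:
          error_mode: ignore
          span:
            - IsMatch(trace_id.string, "^a")
```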
And this is the resulting config. You can see here that it's pretty simple; everything's configured using the same kind of configuration format, it's all OTTL. I do want to call out that this is not a production-ready config; there are a couple of recommended options that I've left out just to make the config a bit shorter. But this should hopefully give you the idea that you can use a variety of different components, that they're all configured in the same sort of way, and that you can more or less tweak and query how you like.
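The slide's full config isn't reproduced here, but a condensed sketch of how the gateway collector's pieces could be wired together looks roughly like this; pipeline names, endpoints, and the routing attribute are hypothetical.

```yaml
receivers:
  otlp:
    protocols:
      grpc: {}

connectors:
  routing:
    default_pipelines: [traces/companywide]
    table:
      - statement: route() where IsMatch(attributes["business.unit"], "^retail")
        pipelines: [traces/retail]

processors:
  tail_sampling:
    policies:
      # Abbreviated: the full policy list is shown above.
      - name: one-in-sixteen
        type: ottl_condition
        ottl_condition:
          error_mode: ignore
          span:
            - IsMatch(trace_id.string, "^a")

exporters:
  otlp/retail:
    endpoint: retail-backend.example.com:4317
  otlp/companywide:
    endpoint: companywide-backend.example.com:4317

service:
  pipelines:
    traces/in:
      receivers: [otlp]
      exporters: [routing]
    traces/retail:
      receivers: [routing]
      exporters: [otlp/retail]
    traces/companywide:
      receivers: [routing]
      processors: [tail_sampling]
      exporters: [otlp/companywide]
```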
Finally, I do want to cover some new features that we've added recently. The first two I've covered earlier. The first is optional parameters: for example, this parse key-value function has a default delimiter, and users might want to override that, so they're given the option to set their own delimiter if they want. Similarly, with functions as parameters, you can pass in a function if your function accepts one, as long as it matches the function signature; not common, but useful for complex use cases. And then finally, we've added 15 new functions so far this year and we're continually adding more, so if there's functionality that you felt was missing before, check back, because it might be there now.
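As a small sketch of what optional parameters look like in practice; the attribute names are illustrative, and ParseKeyValue's delimiter arguments are the optional parameters being overridden in the second statement.

```yaml
processors:
  transform:
    log_statements:
      - context: log
        statements:
          # Default delimiters: "=" between key and value, " " between pairs.
          - 'set(attributes["kv"], ParseKeyValue(attributes["raw"]))'
          # Optional parameters override the defaults.
          - 'set(attributes["kv"], ParseKeyValue(attributes["raw"], ":", ","))'
```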
Going forward, looking a little bit ahead, we're going to be looking at how to handle list values; that's a bit of a gap right now that we'd like to improve. Then we're going to be looking at trying to stabilize OTTL in the transform processor, looking ahead to hopefully consolidating that list of processors, possibly a few more, into the transform processor, just to make it easier for users to determine which processor to use. And here are some doc links if you want to scan or type those in; I'll leave this up, but we're done. I think we're ready for questions now.
Right, okay. So, I'm definitely not the expert on this, but there are extensions that allow you to persist data in the pipelines on disk or in a store like S3 or something like that, so that could be one option. Second, you would also want some retry logic in your pipeline: if you have a collector crash, it's got some data in its pipeline, but you've persisted that, so it's safe, and since you're still going to be sending data to it, you'll want retries for when the collector comes back up.

Okay, that's a tough question and I'm definitely not qualified to answer it, but my understanding of how that would work is that I would probably recommend sharding: you check the trace ID and then use it to route to a particular collector. Does that sound like a good approach? Okay, cool. Okay, thanks.

So, the reason here was flexibility, but I would recommend doing it as close to the source as possible. And I'm sorry, I forgot to repeat the question: the question is, is there a reason you would want to hash PII in the collectors as opposed to as close to the application as possible? Again, I would definitely do that as close as possible; there's no reason to do it any further out. The reason I did it here is that usually you don't want PII to leave the region, right? So if you put a collector as close to your application as possible, you can be sure it won't leave the region unredacted.

A good question, and I'm happy you asked, actually, because I didn't take the time to call it out. The question is, is there any good way to test OTTL? And the answer is yes: Tyler actually just added some debug logging, so if you turn on debug logging in the Collector, it'll print out debug logs that show the state of the data before and after an execution. We're also, and this is being reviewed right now, I think it's pretty close to being merged, adding traces as well to the transform processor.
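For reference, turning on the Collector's own debug-level logging is done through its telemetry settings; a minimal sketch, assuming the debug logs mentioned here are emitted at that level:

```yaml
service:
  telemetry:
    logs:
      level: debug   # print debug logs, including transform processor before/after state
```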