Breaking Monero Episode 05: Input Selection Algorithm
Summary
TLDRIn this episode of 'Breaking Monero', the hosts delve into the intricacies of the ring input selection algorithm, exploring beyond the term 'randomly selected'. They discuss the evolution from a completely random distribution to a recent zone selection and finally to a matching distribution model, based on empirical observations. The conversation highlights the importance of balancing the selection algorithm with ring size for optimal privacy, addressing heuristics like the 'newest output' and 'coinbase outputs' to enhance plausible deniability in ring signatures.
Takeaways
- 🔒 The video discusses the Manero ring input selection algorithm, which is crucial for maintaining privacy through ring signatures in cryptocurrency transactions.
- 🔄 The term 'randomly selected' is criticized for being vague and inaccurate when describing the selection of decoys in ring signatures.
- 📊 The script explains that a completely random distribution of ring inputs can lead to heuristics that adversaries might use to de-anonymize transactions, such as the tendency to spend newer outputs more often.
- 🌐 An improved approach is the 'recent zone selection', which gives preference to more recent outputs within a certain time frame, thus making it harder to identify the actual spent output.
- 📉 The Manero team has moved towards a 'matching distribution' model, which is based on empirical observations and aims to mimic real-world transaction patterns more closely.
- 📘 The script emphasizes the importance of the selection algorithm's continuous iteration to counteract new and existing heuristics that could compromise privacy.
- 🤖 The selection algorithm incorporates elements of randomness, ensuring that no two transactions will have identical ring signatures, even if they follow the same model.
- 💰 The discussion includes the handling of 'coinbase outputs', which are newly generated funds and are treated differently in the selection algorithm to avoid heuristics based on their novelty.
- 🔍 The script mentions the complexity of creating a selection algorithm that is resistant to all possible heuristics, acknowledging that it's an ongoing challenge.
- 🔄 The importance of balancing improvements to the selection algorithm against potential unintended consequences is highlighted.
- 🔑 The video concludes with the goal of providing the best plausible deniability through ring signatures, with a commitment to ongoing improvement and iteration.
Q & A
What is the main topic discussed in the 'Breaking Monero' episode?
-The main topic discussed in the episode is the Monero ring input selection algorithm, focusing on the nuances and specifics of how decoys are selected in ring signatures.
Why is the term 'randomly selected' considered vague in the context of Monero's ring signatures?
-The term 'randomly selected' is considered vague because it does not accurately describe the complex process behind the selection of decoys in Monero's ring signatures, which involves more than just random chance.
What is a 'recent zone selection' in the context of Monero's ring input selection algorithm?
-A 'recent zone selection' refers to a method where the algorithm is more likely to select decoys from a specific recent time period, such as the last 1.8 days, to make the selection appear more plausible and less predictable.
What are the potential issues with a completely random distribution method for ring signatures?
-A completely random distribution method can lead to unintended consequences, such as the creation of strong heuristics that an adversary might use to guess the real output based on the age of the outputs, often assuming newer outputs are more likely to be spent.
How does the matching distribution algorithm differ from the completely random one?
-The matching distribution algorithm is based on empirically observed distributions and mathematical models, making the selection of outputs more representative of actual spending patterns rather than purely random.
What is a 'coinbase output' in the context of Monero?
-A 'coinbase output' is a special output in every Monero block of transactions that generates new money as part of the protocol, rewarding miners for their work.
Why might an adversary consider coinbase outputs as decoys rather than the real spend?
-An adversary might consider coinbase outputs as decoys because they are newly generated money and it is assumed that people are less likely to spend this 'new money' as the true spender.
What is the significance of the ring size in relation to the selection algorithm?
-The ring size is significant because it affects the effectiveness of the selection algorithm. A larger ring size can help mitigate the shortcomings of a less-than-perfect selection algorithm, while an improved selection algorithm can make better use of a given ring size.
How does the selection algorithm need to evolve to maintain privacy in Monero?
-The selection algorithm needs to evolve continuously to counter new heuristics and analysis methods that adversaries might develop, ensuring that the privacy provided by ring signatures is maintained and strengthened over time.
What is the ultimate goal of the Monero team regarding the ring signature selection algorithm?
-The ultimate goal of the Monero team is to provide the best plausible deniability possible with ring signatures, and they aim to achieve this by continuously iterating and improving the selection algorithm.
Outlines
🔍 Exploring Monero's Ring Signatures and Input Selection
This paragraph delves into the intricacies of Monero's ring signature input selection algorithm. Initially, the concept of 'randomly selected' decoys was criticized for its vagueness and potential inaccuracies. The speaker introduces a visual example to illustrate different selection methods, starting with a completely random distribution that could lead to heuristic vulnerabilities, such as the tendency to spend newer outputs. The paragraph discusses the evolution of Monero's approach, moving from a simple recent zone selection to a more sophisticated matching distribution model based on empirical observations. The aim is to enhance privacy by making the selection of decoys less predictable and more representative of actual spending patterns.
🛠 Refining Monero's Selection Algorithm Against Heuristics
The second paragraph continues the discussion on Monero's input selection algorithm, emphasizing the continuous refinement process to counteract various heuristics that could compromise privacy. It highlights the challenge of balancing the inclusion of coinbase outputs, which are special transaction outputs generating new money, to avoid creating patterns that adversaries might exploit. The speaker explains how the algorithm was adjusted to mitigate the heuristic involving the age of outputs and the introduction of a small window to select outputs, resulting in fewer coinbase outputs being chosen. The paragraph also touches on the complexity of addressing multiple heuristics and the ongoing effort to improve the selection algorithm to maintain Monero's privacy standards.
🔄 The Arms Race of Privacy: Iterating Monero's Selection Algorithm
The final paragraph wraps up the discussion by emphasizing the ongoing arms race in financial privacy. It acknowledges the inevitability of trade-offs in refining the selection algorithm and the importance of addressing both major and minor heuristics. The speaker uses the analogy of road maintenance to illustrate the continuous need for improvement and adaptation in the face of new challenges. The paragraph also discusses the relationship between ring size and the selection algorithm, noting that increasing the ring size can mitigate the impact of a less-than-perfect algorithm, while an improved algorithm makes better use of a larger ring. The goal remains to provide the best plausible deniability through ring signatures, with a commitment to ongoing iteration and improvement.
Mindmap
Keywords
💡Ring Input Selection Algorithm
💡Ring Signatures
💡Decoys
💡Heuristics
💡Coinbase Outputs
💡Random Distribution
💡Recent Zone Selection
💡Matching Distribution
💡Plausibility
💡Continuous Improvement
💡Privacy
Highlights
Exploring the nuances of Manero's ring input selection algorithm and the inaccuracies of the term 'randomly selected'.
Introduction to the concept of ring signatures and decoys in the context of Manero's privacy features.
Demonstration of a completely random distribution algorithm and its potential flaws.
Heuristics that favor spending newer outputs over older ones and their implications for privacy.
The introduction of a recent zone selection to counteract the heuristics based on output age.
The evolution from a random system to a matching distribution model based on empirical observations.
How the newest outputs are more likely to be selected in the current Manero selection algorithm.
The importance of the timing aspect in the selection algorithm and its role in enhancing privacy.
Addressing the challenge of coinbase outputs and their potential as a heuristic for adversaries.
Modifications to the selection algorithm to mitigate against heuristics involving coinbase outputs.
The complexity of creating a selection algorithm that is resistant to all possible heuristics.
The analogy of plugging holes to describe the continuous improvement of the selection algorithm.
The relationship between ring size and the selection algorithm in providing strong privacy.
The arms race in financial privacy and the iterative process of improving selection algorithms.
The importance of balancing changes to the selection algorithm to avoid inadvertently creating new vulnerabilities.
The goal of providing the best plausible deniability possible with ring signatures in Manero.
The commitment to continuous iteration and improvement in Manero's ring signature selection algorithm.
Transcripts
welcome back to breaking monaro today we
are talking about Manero
ring input selection algorithm we've
spoken in the past about ring signatures
and decoys in other previous episodes
and suring specifically has talked about
how the decoys are selected but it uses
a very vague word called a vague phrase
at least called randomly selected right
and in this episode we're gonna get far
more nuance far more specific about what
we mean by this this mysterious phrase
and also the phrase itself isn't very
accurate either so it's important to add
some additional clarification here on
what's actually happening and there's a
lot more to random than
behind-the-scenes so I'm going to start
with the screenshare showing an example
of some of my narrows input selection
algorithms over the past so on the top
here you can see an example of a
completely random distribution algorithm
algorithm so on the Left you have old
outputs that were generated at the very
very beginning of my narrows history on
the right you have new outputs that are
generated very very recently especially
within the past few days or so let's say
that the green circle is the actual
output that was spent this is the real
money to sent and the blue ones are the
decoys that are selected now a
completely random distribution method
might sound great to begin with because
you know any input could be selected for
any reason but this leads to a lot of
unintended like consequences as a result
so you can make pretty strong heuristics
that say people are far more likely to
spend new money than old money so as a
result the latest input the green one
highlighted here is most likely to be
the real one and well you don't
necessarily have the ground truth to
prove that this is true it could be
tested as very reliable over time you
could make the potentially very strong
heuristic there and you can see in the
example on the screen on the first on
the first line there that is the case
because that's often the case right so
Manero sought to improve upon this
iterate upon this and you can see on the
second line there there's an example of
a recent zone selection
so you have again the whole history of
Menards outputs but you have a short
recent zone period where you're more
likely to select other decoys from the
specific period so the narrows code
might specify for instance that the
recent zone needs to be about 1.8 days
and that you should select about half of
the decoys from this said recent zone so
you can see on this example here that
about half the decoys are selected from
this recent zone and then for the rest
of the tale going back to previous tie
like the very beginning of my narrows
history you still have the ability to
select these outputs but they're less
common than new outputs and this helps
address the specific heuristic we're
speaking about where the the latest
output is the most the latest output in
the ring is usually the true one because
now you have a more latest out latest
decoys included in this ring so
therefore you have a more plausible
selected outputs in this case and the
recent zone was nice and simple it was a
really easy way to implement this sort
of feature and it would definitely was
an improvement over the existing
completely random system Manero began
with but it's not ideal
and so Manero has moved to what more
resembles the bottom line there which is
a matching distribution one that is
based on empirically observed
distributions based off what we've
Manero and outside Reacher's researchers
found with Bitcoin and Manero it's a
mathematical model so you can see that
in this case the newest outputs are even
more likely to be selected for instance
so this hopefully this diagram helps
show how it's not just about how many
inputs there are in a transaction it's
also about how you select them and
there's a lot of implications on how
these are actually selected but it's
more than just timing as I show here
timing is just one part of how this is
done and for it to this end Sarang is
going to speak a little bit more
specifically about other factors
involved in the selection algorithm yeah
absolutely
I mean and it's worth noting that you
know technically the way that we still
do it and the way that we've always done
it did have elements of randomness to it
I mean random is kind of it's a
it's often a really poor word right so
for example even though typically the
outputs that you'll choose kind of
follow that particular mathematical
model there is an element of randomness
involved so you know two people can
definitely choose very very different
rings but you know on average they'll
follow a pattern that looks
approximately like the pattern that
you've showed so there's always still
randomness into it but like you said
timing heuristics or you know guesses
that an adversary might they make make
based just on the age of the output like
you said are only one heuristic they're
pretty big heuristic right because you
know in general for a lot of old
transactions you could guess what you
thought the newest one was and and you
know you might be right although you
couldn't prove it you know implicitly
but timing is just one part and we've
iterated since then to kind of mitigate
against other smaller heuristics that
were not timing based on us the one
example deals with something called
coinbase outputs if you're not familiar
with the term basically every Manero
block of transactions that is generated
has a special output in it that
generates new money as part of the
protocol that's kind of what helps to
reward miners for doing work in part and
those coinbase outputs I like to think
of as you know newly generated money so
in general do people spend newly
generated money or coin based outputs as
the true spender you know probably not
necessarily as often as non-new might or
nonpoint base outputs so for example if
I happen to choose ring that contained I
don't know 10 coin based outputs and
then my true output which was not a coin
base output an adversary might look at
that and think hmm I would say it's much
more likely that this person you know
didn't spend a coin base out because
that's all very new money so that could
be a heuristic that they might use they
might think coin based outputs are
probably decoys well in that case that
would kind of imply that we should
select fewer a coin based outputs as
part of our rings how many is too many
or too few I mean that's not a very
well-defined problem with a very
well-defined solution but as we've
iterated our selection algorithm to make
it better against this you know guess
newest heuristic involving output age
you know we probably introduced more
coin based outputs then some people
would have liked so we made a slight
modification to the algorithm where
instead of just choosing a block and
then yanking a decoy out of it which
tends to give us more coin based outputs
than fewer instead we actually look at a
very small window around that particular
block so our effectively increasing the
size of the bin from which we get to
choose our outputs and what that ends up
meaning statistically is that we end up
choosing fewer coin based outputs which
kind of mitigates against this much
smaller heuristic and that of course is
not the only heuristic type you might
come up with either so coin based
outputs are one thing an adversary might
use to look at to try to make guesses
timing which we've worked on of course
and have talked about might be one that
a adversary might use um there's other
ones for example if I have a transaction
that has two different inputs each of
those has a separate ring maybe the
adversary is able to look at the
different decoys and outputs that are in
those rings and maybe the adversary will
find that there's a transaction way back
when in the blockchain that generated
two different outputs and maybe one of
those outputs appears in one of the
inputs to my new transaction and the
other output appears as an input to the
other one again it's just a guess
because it may have happened by chance
but probably not the adversary might try
to conclude that the outputs that were
generated in a previous transaction are
now being spent by me and might make
some conclusions based on that again
without external information a heuristic
is not a proof but it gives the
adversary something that they might try
to guess so in general this is very very
complex I will say right now it is
pretty impossible to get rid of all
possible heuristics so we can always
make our selection algorithms better and
as Justin pointed out and as I've kind
of hinted at we have done this over time
we iterate to get better and better a
good way to think about this is
something that Justin brought up in fact
kind of with like a plugging of a whole
analogy if you want to give that what
you kind of liked yeah of course so um
what an example like we we know of a
specific heuristic for instance the the
guessed newest heuristic might be an
example so we can iterate Manero
selection algorithm to help counter this
sort of heuristic and the actual
effectiveness of it but in doing so
we're still choosing some other way to
select outputs that people can't develop
heuristics for so there's no limit to
the number of heuristics that people can
come up with they can continue making
complicated heuristics over time pretty
much no matter what we do so we're
always plugging these holes that we're
aware of and the biggest holes we know
of but we might indirectly be making
smaller holes we might
making holes that were not necessarily
aware of because maybe the heuristics
haven't been conceived yet especially by
participants in the Monaro community so
this is definitely something that will
need continuous improvement it needs
continuous iteration in order to make it
better to keep patching these holes
you can't just pave a road and never
expect to have potholes you have to
suffer through it like the Minnesota
winters like I have where you go and
have to keep patching these potholes
that keep appearing right they keep
coming out and when you think you're
done some trucks gonna drive over it and
come up with and make a new one right so
these these sort of circumstances keep
happening apologies for that terrible
analogy I'm trying to place terrain here
but yeah we try to address the big ones
first the example of the guest nuke uist
but there might have been some
consequences as an example by changing
the selection algorithm to follow a more
mathematical distribution we actually
selected more of those coin based
outputs and we needed to go back and
sort of refine how we did this work was
every improve was every iteration still
an improvement overall from what we can
see now it should be yes we addressed
the existing heuristics and done more
good than harm but there's always going
to be some sort of trade-off that we're
going to sort of keep playing with and
so to speak yeah absolutely I mean we we
definitely receive reports all the time
about people who come up with you know a
particular small pattern that they might
see among certain transactions or
outputs based on the way that we make
our selections and some of those are
very very small um doesn't mean that you
know as optimal that we keep them in
there but you know to some extent it's
not necessarily always obvious how to
make an absolute good change to counter
some of those small heuristics without
inadvertently kind of ruining the work
that we've done for some of the much
bigger heuristics so you know if we see
a small heuristic pointed out and we
don't change our algorithm because of it
you know it doesn't mean that we are not
concerned about those heuristics but it
means that we always have to balance the
good that we'd be doing by making such a
change with the inadvertent harm that
might be caused by it so it's it's
always an arms race right financial
privacy as a whole as you're probably
learning from this video series is an
arms race analysis only gets better over
time and that's great you know just
because we receive small reports
sometimes and big reports other
about different heuristics that come up
doesn't mean we don't want to see them I
love seeing those reports I love
learning more about what other people
are doing with this analysis but it just
reminds me that it is an arms race
analysis gets gets better we get better
because of that you know that might
invite more analysis which just has kind
of this spiraling effect toward us
getting better
yes speaker speaking about spiraling
effects towards getting better one of
the great things about ring signatures
is that the selection algorithm and the
ring size sort of go hand in hand and
sort of a positive or negative feedback
loop it's supposed you're purposely
making things worse but as Mineiro
increases its ring size it sort of
decreases the severe negativeness of
perhaps a bad selection algorithm there
are some limitations of a selection
algorithm if you just keep picking more
and more inputs for example right more
and more decoys the you know
shortcomings of a specific algorithm
might decrease will or should ideally
decrease meanwhile if you improve your
selection algorithm you make better use
of the decoys that you have so these two
components are really critically
important and having strong privacy in
venero making the most out of his ring
signatures because if you have a larger
ring size with the terrible selection
algorithm then you're not going to have
great privacy because you're able to
develop really strong here as six
potentially and likewise if you have a
really even if you had a some mechanism
of having a perfect algorithm under
every circumstance but you had a like
really small ring size for example then
you're also not great either so these
things really critically work hand in
hand at providing really the privacy
that ring signatures offer it's more
than just what they say they provide out
of the box which is one out of inman
arrows cases he's now 10 11 one out of
11 are possibly spent it's all the
additional metadata itami analysis
coinbase metadata that is associated
with this and so the selection algorithm
needs to adjust over time to to
compensate for this otherwise we could
just pick the first 10 outputs that were
ever generated on maneras blockchain and
be like oh it's either the latest order
of the first 10 and that's obviously not
very
so you know it's input selection
algorithm is very important from
narrowing signatures
all right Sarang do you have any last
closing thoughts to leave the viewer
with just that our goal still remains to
provide the best plausible deniability
possible with ring signatures and over
time we continue to learn about better
and better ways to do that and so we
keep iterating and iterating and
iterating to get better and we're
absolutely not perfect now but we
continue to try to get better and better
all right Thank You Sarang thank you
everyone for watching this episode of
breaking Manero we will catch you in the
next one thank you
5.0 / 5 (0 votes)