The Game of Self-Care: Tit for Tat with Your Future Self

What game theory and Buddhist philosophy reveal about why you sabotage your own future

Oct 20, 2024

Today, I would like to muse about game theory combined with anattā (no-self), and explore what practical implications such a mental model could have for the ways we treat ourselves.

Anattā is actually not quite what I am describing; it is just the closest word I could find, so you will have to forgive me for deviating a bit from the concept. I'll try to explain more specifically what I mean:

I recently found myself rewatching an old video (embedded below) by Dr. Alok Kanojia (or Dr K, as he is known on the internet thanks to his YouTube channel HealthyGamer) on why it is so hard to be consistent.

This particular video is one of the more shining examples of why I admire Dr K so much: he brings together his training as a psychiatrist, his background as a monk, and his avid interest in gaming into a beautifully integrated narrative.

In this case, he gives an exposition of Avidya that I had not quite heard before. Avidya, in Sanskrit, means “ignorance” or “lack of knowledge” and refers to a misunderstanding of what the self is.

There are multiple different interpretations of Avidya in Hinduism and Buddhism, and Dr K’s explanation of it is so peculiar that I cannot really tell if it comes more from the Buddhist side, the Hindu side, or if it is his personal take. Either way, the mental model that he postulates in the video is that we constantly spawn into someone else's body and mind.

I.e. every morning you are actually a new “operator” spawning into somebody else's body. You inherit memories etc, but the essence of you is new, and stays around for a rather short time.

If one entertains this mental model, it provides an explanation for why it is so hard to do things that give rewards in the future. If we are new selves that only ever get to exist today, if the self is essentially a mayfly, then why not use our credit card to buy a pizza today, especially since a prior version of ourselves did that too?

If we save money and eat something healthy today, the next us is better off for our sacrifice, but why would we do that when we can have pizza now and have the future us deal with the consequences?

I personally find this mental exercise delightful and interesting. I used to be quite provoked by much religious terminology and ideas, but lately I have started evaluating ideas (especially ideas that we cannot validate) according to the criteria of usefulness rather than correctness.

What if you thought of yourself as a completely different person compared to yesterday?

Basically, the mental exercise here is, what if your consciousness actually isn’t bound to your specific being, but actually rotates around in everyone all the time?

I find it to be reminiscent of the scenario expressed in ‘The Egg’, a wonderful short story by Andy Weir that has been beautifully captured in animation by Kurzgesagt: The Egg - A Short Story

The Nucleus Accumbens and Future Discounting

Shifting away from the spiritual perspective, Dr K also talks about the neurological reasons for this, and it has to do with the workings of the nucleus accumbens, part of the basal ganglia.

This part of the brain is, roughly simplified, the brain’s reward center: instrumental in mediating pleasure, reinforcement learning, motivation and decision-making. It’s a dopamine hub, sort of.

A well documented behavior of humans related to this is called Future Discounting and refers to the tendency of individuals to value immediate rewards more highly than future ones—essentially prioritizing short-term gratification over long-term benefits.

In Future Discounting experiments, participants are presented with choices between receiving a smaller amount of money immediately or a larger amount at some point in the future. The idea is to assess how much people “discount” the value of future rewards.

Participants are asked something like:

“Would you prefer $50 now or $100 in a year?”

“Would you prefer $50 now or $80 in 6 months?”

Before you continue reading, I really, really encourage you to spend a few seconds inspecting your own intuition about the two questions above. Then do some quick math on what interest rate you are being offered in each scenario, and see if your first intuition matches your second-guessing rationale.

Writing this chronicle can be very elucidating - prior to writing this one, I thought of Future Discounting as an example of humans being irrational:

In studies, people consistently tend to heavily discount rewards as the delay increases. For example, someone might prefer $50 now over $100 in a year, even though the delayed option offers a much better return.

Hyperbolic discounting is actually rational

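One detail that I’ve previously overlooked when reading about this is that discounting isn’t linear, but hyperbolic: the perceived value of a reward falls off roughly as 1/(1 + k × delay), dropping steeply at first and then flattening out (which is kinda like an exponential curve, but not).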

In the early 1980s, psychologist George Ainslie discovered something peculiar. He found that while a lot of people would prefer $50 immediately rather than $100 in 6 months, they would not prefer $50 in 3 months rather than $100 in 9 months.

The handwavy explanation for this behavior is that both options feel “far away”, but it turns out that this is selling our automated System 1 neurology short. Our “conscious mind”, System 2, is often thought of as “the rational” mind, but when it comes to heuristics for complex decision scenarios with unpredictable factors (such as, you know, reality) then System 1 often has some surprisingly well-oiled approaches.

In this case, the intuitive decision making of humans is shown to follow a non-linear hyperbolic discounting curve. I.e. it’s absolutely not arbitrary and seems evolved to this specific point, especially when you consider that rats, pigeons and monkeys also do hyperbolic discounting.
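
To make Ainslie's reversal concrete, here is a minimal Python sketch. It assumes the common textbook form of hyperbolic discounting, value = amount / (1 + k × delay), and compares it with plain exponential (compound-interest style) discounting. The function names, k = 0.25 per month and the 15% monthly rate are illustrative assumptions of mine, picked only so the numbers are easy to follow; they are not fitted to any study.

```python
def hyperbolic_value(amount, delay_months, k=0.25):
    """Perceived value today under hyperbolic discounting: amount / (1 + k * delay)."""
    return amount / (1 + k * delay_months)


def exponential_value(amount, delay_months, monthly_rate=0.15):
    """Perceived value today under exponential discounting: amount / (1 + r) ** delay."""
    return amount / (1 + monthly_rate) ** delay_months


def prefers_smaller_sooner(value_fn, shift_months):
    """True if $50 at `shift_months` feels better than $100 six months after that."""
    sooner = value_fn(50, shift_months)
    later = value_fn(100, shift_months + 6)
    return sooner > later


for name, value_fn in [("hyperbolic", hyperbolic_value), ("exponential", exponential_value)]:
    print(name, "takes $50 now over $100 in 6 months:", prefers_smaller_sooner(value_fn, 0))
    print(name, "takes $50 in 3 months over $100 in 9 months:", prefers_smaller_sooner(value_fn, 3))
```

With these made-up parameters, the hyperbolic chooser grabs the $50 when it is available right now but waits for the $100 once both options are pushed three months out. The exponential chooser makes the same choice in both cases, because shifting both options by the same delay never changes its ranking.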

Hyperbolic discounting can seem irrational at first glance. The example questions I mentioned earlier (“Would you prefer $50 now or $100 in a year?” and “Would you prefer $50 now or $80 in 6 months?”) work out to 100% and roughly 156% APY respectively (the second is a 60% return in six months, or 120% as a simple annual rate), so on paper the reasonable thing to do is to pick the deferred amount, which is often why these experiments make some economists go "hoomans are so stupid loollll".
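
For anyone who wants to check the arithmetic on those two offers, here is a small sketch. The function names are my own, and it shows both ways of annualizing: with compounding (what APY normally means) and without (a simple annual rate). For the six-month offer the two differ: 60% in half a year is 120% as a simple annual rate, but about 156% once you compound it.

```python
def compound_annual_rate(amount_now, amount_later, years):
    """Yearly rate (APY-style, with compounding) implied by waiting for the later amount."""
    return (amount_later / amount_now) ** (1 / years) - 1


def simple_annual_rate(amount_now, amount_later, years):
    """Yearly rate without compounding: the return for the period scaled up to one year."""
    return (amount_later / amount_now - 1) / years


offers = [
    ("$50 now or $100 in a year", 50, 100, 1.0),
    ("$50 now or $80 in 6 months", 50, 80, 0.5),
]
for label, now, later, years in offers:
    print(label)
    print(f"  compounded: {compound_annual_rate(now, later, years):.0%}")
    print(f"  simple:     {simple_annual_rate(now, later, years):.0%}")
```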

However, us fleshy beings do not live on paper, we live in reality, where we have to factor in the interest variance and risk of the deal falling through.

There are mathematical models for doing that the ultra rational way (for example including a random walk for interest rates), and when you factor that in, the result is pretty much the same as hyperbolic discounting. That means our intuition accounts for so many factors that it seems irrational, which is FUCKING SUPER COOL.
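
Here is one way to see how uncertainty can turn "rational" exponential discounting into a hyperbolic curve. Note that this is not the random-walk interest rate model mentioned above; it is a simpler stand-in that I am assuming for illustration. Suppose the deal falls through at some constant rate that you do not know, and your uncertainty about that rate follows an exponential distribution with mean k. For any single rate the discounting is exponential, but averaging over all the rates you consider possible gives exactly 1 / (1 + k × delay). The sketch checks that numerically; the function names, sample count and k = 0.2 are hypothetical.

```python
import math
import random


def average_exponential_discount(delay, mean_rate, samples=100_000, seed=0):
    """Average exp(-rate * delay) over rates drawn from an exponential distribution."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        # expovariate takes 1 / mean, so the sampled rates average out to mean_rate
        rate = rng.expovariate(1 / mean_rate)
        total += math.exp(-rate * delay)
    return total / samples


def hyperbolic_discount(delay, k):
    """Hyperbolic discount factor: 1 / (1 + k * delay)."""
    return 1 / (1 + k * delay)


k = 0.2  # assumed average failure rate per time unit
for delay in (1, 5, 20):
    simulated = average_exponential_discount(delay, k)
    print(f"delay {delay:>2}: simulated {simulated:.3f}, hyperbolic {hyperbolic_discount(delay, k):.3f}")
```

The two columns should agree to within sampling noise, which is the sense in which a gut that discounts hyperbolically could be doing sensible bookkeeping about risk.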

I strongly recommend this fantastic article by Chris Propel, which makes the argument in a very robust way: Hyperbolic discounting — The irrational behavior that might be rational after all

Trusting your gut: A rational decision

There are millions of years of evolution behind the brain, and in this case, what seems like the irrational decision is, in fact, the wise decision, and our brains do it completely automatically.

I often find it hard to trust my own intuition, second-guessing myself all the time, and part of me rolls my eyes when I hear sentences like “trust your gut” or “follow your heart” or “trust in the wisdom of the body”, but when I read articles like this, I realize that I should pay more respect to the heuristics of my human intuition.

Our intuitions are studied, hijacked and exploited by companies, especially in recent years, and that has perhaps painted intuition in a bad light. I realize that I might have blamed my intuition for being stupid when in fact we should be blaming exploiters of it for being sinister.

What if you realized that you just started your life, and only have 5 minutes to live?

Speaking of sinister opponents, let me drag us from this dangerously far-off digression back on track (as it is also quite frankly muddying up a point I am about to make later), and move us to how this relates to game theory.

In his video on consistency and avidya, Dr K makes an analogy to gaming, and paints a scenario where players take turns playing 5 minutes each in a game. I.e. you randomly get dropped into a game, you play 5 minutes, and then control of the game character moves to another player.

I find it really interesting to think about how this would change how I play. For example, I would focus way less on the end result, and more on enjoying myself and the moment. At the same time, I would be acutely aware that the next player is the same as me and will get my game state, so I would try to leave it in an enjoyable state. If we win “together” I would also be more inclined to be disciplined and do my best, I think.

A particularly interesting thought is that the everyday gaming behavior of typing “gg” and quitting as soon as things start to go sour, as I would do when I start losing a game of Hearthstone, just becomes inconceivable.

If these are the only 5 minutes I have to enjoy the game, then playing as well as I can in a given situation, taking as much delight and learning as possible from the way down, becomes rewarding in itself.

Plus, I can do so without the guilt/shame of having put myself in the bad situation, and can completely accept that it isn’t my fault, but it is my responsibility. As such, I will do my best and maybe even turn the ship around.

However, as Dr K points out, a game like this would go sour when one player “defects”. An example of this would be a player losing control of their presence, patience and equanimity, and impulsively going for a high-risk, quick win. This way, they get the chance of sweet victory within their 5-minute window. It will likely fail, and cost everyone the game, but since you only have 5 minutes to play, isn’t that the “rational” choice?

I think this very eloquently illustrates the concept of Avidya, where seeing oneself as separate from others causes suffering to oneself and also leaves everyone involved (including oneself, assuming everyone defects) objectively worse off.

The Prisoner’s Dilemma: MUCH more to this one than I thought

But another thought that comes to mind is that the scenario Dr K paints is also highly reminiscent of the Prisoner’s Dilemma in game theory.

Many will be familiar with the Prisoner’s Dilemma, but in case you aren’t, I’ll briefly describe it here so that you don’t have to open a browser tab:

Imagine two prisoners, each isolated from the other, with a choice to make: they can either collaborate (stay silent) or defect (betray the other).

If they both collaborate, they get a light sentence.
If one defects while the other collaborates, the defector goes free while the collaborator gets a heavy sentence.
If they both defect, they both get a harsh punishment.

The logical, self-serving choice is to defect, because whatever the other prisoner does, defecting gives you the shorter sentence.

However, if both prisoners follow that line of reasoning, they end up worse off than if they had trusted each other. It’s a maddeningly simple scenario, but it highlights how real life—like this dilemma—isn’t a single isolated event. We’re always part of repeated interactions, where being nice (cooperating) actually leads to better outcomes over time. You can almost feel the tension between short-term gain and long-term success humming underneath it all.
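
The one-shot logic can be written down as a tiny payoff table. The sentence lengths below are hypothetical numbers of my own; only their ordering matters (going free beats a light sentence, which beats a harsh one, which beats the heavy one).

```python
# Years in prison for (my_move, their_move). Hypothetical values, lower is better.
SENTENCE = {
    ("cooperate", "cooperate"): 1,   # both stay silent: light sentence
    ("cooperate", "defect"): 10,     # I stay silent, they betray me: heavy sentence
    ("defect", "cooperate"): 0,      # I betray, they stay silent: I go free
    ("defect", "defect"): 5,         # both betray: harsh punishment
}


def best_reply(their_move):
    """The move that minimizes my own sentence, given what the other prisoner does."""
    return min(("cooperate", "defect"), key=lambda mine: SENTENCE[(mine, their_move)])


for their_move in ("cooperate", "defect"):
    print(f"If they {their_move}, my best reply is to {best_reply(their_move)}")

print("Both defect:   ", SENTENCE[("defect", "defect")], "years each")
print("Both cooperate:", SENTENCE[("cooperate", "cooperate")], "year each")
```

Defecting is the best reply either way, and yet two prisoners who both follow that advice serve longer than two who both stay silent. That gap is the whole dilemma.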

To be honest, I’ve historically had a bit of a disdain for the Prisoner’s Dilemma, as it is often used as an example of how self-serving behavior is better for the individual. It always felt rather sad to me, but also wrong in a way that I could not verbalize until I ran across Robert Axelrod’s work on the Prisoner’s Dilemma.

What Game Theory Reveals About Life, The Universe, and Everything (Veritasium)

In this video, Veritasium interviews Axelrod, who describes the work that is the basis for his book The Evolution of Cooperation. It is a really good video, but I’ll describe the relevant context here to keep you from tabbing away:

Axelrod’s work flipped my perspective on the Prisoner’s Dilemma - rather than focusing on isolated decisions, he explored what happens when the dilemma is repeated over time—a much more realistic scenario, given that life isn’t a one-off event.

In his famous tournaments, different strategies (simple bots) were pitted against each other in repeated rounds of the dilemma, and what emerged was surprisingly hopeful.

Nice, Forgiving and Retaliatory guys win in the end

Self-interested strategies actually consistently did the worst in these tournaments. The most successful strategies weren’t those rooted in ruthless self-interest, but rather those that embodied three key qualities: they were

A) nice (they began by cooperating),
B) forgiving (they were willing to move past betrayals if the other player returned to cooperation), and
C) retaliatory (they weren’t pushovers—they punished defection, but only after it occurred).

What’s uplifting about this is that, despite the conventional wisdom that “defecting” is logical, it turns out that in repeated interactions (which is a facetiously succinct definition of “society” 😉), being kind, fair, and a little tough consistently wins out in the end.
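
To get a feel for how such a tournament works, here is a toy round-robin in Python. It is a sketch under my own assumptions, not a reproduction of Axelrod's setup: there are only four hand-picked bots, 200 rounds per match, each bot also plays a copy of itself, and the points are the commonly used 3 for mutual cooperation, 1 for mutual defection, 5 for betraying a cooperator and 0 for being betrayed.

```python
from itertools import combinations_with_replacement

# Points for (my_move, their_move); "C" is cooperate, "D" is defect.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5), ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def tit_for_tat(mine, theirs):
    """Nice, retaliatory and forgiving: start with C, then copy their last move."""
    return theirs[-1] if theirs else "C"


def always_defect(mine, theirs):
    """Not nice: pure short-term self-interest."""
    return "D"


def always_cooperate(mine, theirs):
    """Nice and forgiving, but never retaliates."""
    return "C"


def grudger(mine, theirs):
    """Nice and retaliatory, but never forgives a defection."""
    return "D" if "D" in theirs else "C"


def play_match(strategy_a, strategy_b, rounds=200):
    """Play two strategies against each other and return their total scores."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        gain_a, gain_b = PAYOFF[(move_a, move_b)]
        score_a += gain_a
        score_b += gain_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


def run_tournament(strategies, rounds=200):
    """Every strategy meets every other strategy, plus a copy of itself, once."""
    totals = {strategy.__name__: 0 for strategy in strategies}
    for a, b in combinations_with_replacement(strategies, 2):
        score_a, score_b = play_match(a, b, rounds)
        totals[a.__name__] += score_a
        if a is not b:  # in self-play, count the score only once
            totals[b.__name__] += score_b
    return totals


totals = run_tournament([tit_for_tat, always_defect, always_cooperate, grudger])
for name, score in sorted(totals.items(), key=lambda item: -item[1]):
    print(f"{name:>16}: {score}")
```

In this tiny pool, always_defect wins or ties every individual match it plays, yet ends up with the lowest total, while tit_for_tat and grudger share first place. The ranking depends heavily on which bots are in the pool, so treat it as an illustration of the mechanism and nothing more.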

Adapting Game Theory to self-care/discipline in a game of consciousnesses

Can we learn something from this that we can apply in our collaboration with ourselves?

If we humor the notion that our selves are mayflies of consciousness, teleported into our bodies and brains, just for today - and maybe even only until lunchtime, or the end of this pomodoro - then choosing between pizza and salad does, in a way, become a game theory experiment.

Can we somehow take inspiration from the Prisoner’s Dilemma, and apply Axelrod’s winning bot principles of Nice, Forgiving and Retaliatory to ourselves?

Seems like a stretch, but stretching analogies is my fourth favorite type of stretching, so we’ll have fun trying!

1) A Nice Strategy

Let’s roll with Nice first. This one is rather simple to envision - this would entail doing something that is decidedly nice for your future self.

This might entail clearing out your email inbox at the very end of the day in order to give yourself the grace to do guilt-free creative writing in the morning. Or going to sleep early instead of watching the newly released episode of Severance (I am looking forward to the second season more than life itself).

Nice is not the same as Altruistic or Ascetic: I want to highlight that viewing ourselves in this wonky mayfly-consciousness-game-theory model makes it nonsensical to be self-abnegating (renouncing one’s own interests for the sake of others).

I.e. spending your day preparing for a joyless presentation the next day does not make sense for a mayfly. Your next-day-self will enjoy having it done, but if they also self-abnegate their entire day, the being's entire life will be joyless (i.e. this would be the inverse of a scenario where all mayflies always opt for the “do cocaine, eat pizza, watch The Office, future self will deal with problems” strategy).

2) A Retaliatory Strategy

A more interesting challenge is how we can be Retaliatory to ourselves.

This introduces the need for the concept of holding a grudge into the model (if there is no concept of a grudge we cannot model forgiving), and for there to be a grudge we kind of need to have the concept of another. So for the purposes of this thought experiment, I suppose we will think of our future and past selves as “the others”.

Let’s assume that we wake up on the morning of our important presentation. We feel horrible, realizing that we have slept very little, having stayed up watching the new season of Severance all night.

We open up the presentation slides, which are half-finished at best, and realize that we have to scramble to get them done.

In that situation, our former “the others” didn’t behave in a way that was supportive for us. In a game theoretical sense, they certainly defected.

While there are objectively bad strategies for the Prisoner’s Dilemma, there isn’t any single “best” strategy. A strong one is Tit for Tat, which is simple: when the other player defects, Tit for Tat immediately defects too, and after that it goes back to being nice as soon as the other player does (we’ll get to Forgiving in a bit).
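
Tit for Tat is small enough to fit in a few lines. The sketch below is my own minimal version (the function name is made up): it opens by cooperating and then simply copies whatever the other player did in the previous round.

```python
def tit_for_tat_replies(opponent_moves):
    """Return Tit for Tat's moves against a known sequence of opponent moves.

    Moves are "C" (cooperate) or "D" (defect).
    """
    my_moves = []
    for round_number in range(len(opponent_moves)):
        if round_number == 0:
            my_moves.append("C")  # nice: always open with cooperation
        else:
            my_moves.append(opponent_moves[round_number - 1])  # mirror their last move
    return my_moves


opponent = ["C", "C", "D", "C", "C"]
print("opponent:   ", " ".join(opponent))
print("tit for tat:", " ".join(tit_for_tat_replies(opponent)))
```

Against an opponent playing C, C, D, C, C this gives C, C, C, D, C: one round of retaliation right after the defection, then straight back to cooperating.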

When we are dealing with ourselves, punishing is a bit awkward to model (people certainly do it, with penance etc.), and it kind of only makes sense if you think of the other as your evil twin. If the others are a group of consciousnesses, then retaliation doesn’t quite work, but if we think of it as crackdowns, it might work?

For example, it might entail canceling the streaming subscription, or if this defection has occurred several times before, throwing out the TV completely (something that yours truly is seriously pondering).

Maybe you also schedule some spinning class in the morning the next day, or maybe an even harder commitment box?

As an example of a crackdown in my own life: To force our own hands in getting started with exercise, me and my girlfriend pre-purchased 3 months of sessions 3 times per week with a personal trainer.

That was truly a commitment box of steel - if I did not drag myself there in the morning, I'd have wasted a lot of money, my girlfriend would be stood up, my personal trainer disappointed, and they would both be in the same time and place with my tardiness as a most handy discussion topic.

A prison for myself of my own design that I am incredibly proud of.

3) A Forgiving Strategy

Forgiving is the third key aspect of a successful Prisoner’s Dilemma strategy.

If we are playing life as the Tit for Tat strategy, after we have thrown out our TV, we’ll let bygones be bygones and be kind to our future self the next day (assuming they were kind to us even though we threw out the TV)

A tit for tat playing against tit for tat will just be nice to each other all the time, but let's be honest, life is hard and we cannot really be Tit for Tat all the time. Sometimes, we are just not very good, and we play as “Tester — start by attacking; if retaliation is received, then back off and start cooperating for a while; then, throw in another attack; tester is designed to see how much it can get away with.”

Tit for Tat lets bygones be bygones, but the problem with it is that it might get stuck in loops where punishing just goes back and forth. Therefore, a modified version of Tit for Tat, called Generous Tit for Tat, emerged in the field. It is basically a Tit for Tat that nearly always retaliates, but sometimes it doesn’t.
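
Here is a small sketch of that punishment loop and how a bit of generosity breaks it. Two identical players mirror each other, and one of them slips up exactly once. The function names, the 10% forgiveness chance and the round in which the slip happens are all assumptions of mine for illustration.

```python
import random


def respond(their_last_move, forgiveness, rng):
    """Tit for Tat reply; with probability `forgiveness`, let a defection slide."""
    if their_last_move == "D" and rng.random() >= forgiveness:
        return "D"
    return "C"


def play_with_one_slip(forgiveness, rounds=200, slip_round=10, seed=1):
    """Two identical players; player A defects once by accident in `slip_round`.

    forgiveness=0.0 is plain Tit for Tat, anything above is Generous Tit for Tat.
    """
    rng = random.Random(seed)
    last_a, last_b = "C", "C"  # treat the imaginary round before the start as friendly
    history = []
    for round_number in range(rounds):
        move_a = respond(last_b, forgiveness, rng)
        move_b = respond(last_a, forgiveness, rng)
        if round_number == slip_round:
            move_a = "D"  # the one bad day
        history.append((move_a, move_b))
        last_a, last_b = move_a, move_b
    return history


def count_mutual_cooperation(history):
    """Number of rounds in which both players cooperated."""
    return sum(1 for pair in history if pair == ("C", "C"))


plain = play_with_one_slip(forgiveness=0.0)
generous = play_with_one_slip(forgiveness=0.1)
print("plain Tit for Tat, friendly rounds after the slip:   ", count_mutual_cooperation(plain[10:]))
print("Generous Tit for Tat, friendly rounds after the slip:", count_mutual_cooperation(generous[10:]))
```

With plain Tit for Tat the single slip echoes back and forth for the rest of the game, so there is never another round where both players cooperate. With the generous version, one of the two eventually lets a defection slide and both settle back into cooperation.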

I am acutely aware that my analogy is getting quite stretched out at this point, but I think that there is something to be said for randomly forgiving your former self sometimes. Sometimes the day is just randomly bad and crackdowns will only make things worse: the removal of the TV will just make the others sadder, not nicer, as they were already trying to be nice, but randomly couldn’t.

Idea: Receive a letter from you yesterday every morning

On a final note, this whole experiment gave me the idea of evening journaling in the form of a letter to the others. In the Prisoner’s Dilemma, a crucial component of the dilemma is that the prisoners are not allowed to communicate.

We cannot really talk to our future or past selves, but we can send a letter for our future selves to read. I imagine, perhaps, a journal entry describing what your experience of the day was like: how you experienced that you were supported (which can be thought of as a specific form of gratitude journal entry) and what you did to support future you (or how you cracked down hard on something because you did not like picking up the slack of other yous).

It’s a kind of cool format in general for a journal, writing what expectations you have of the next you for the coming day (and perhaps what you thought of the expectations placed on you by your predecessor).

Kindness or Crackdown today?

I hope that your Sunday self was kind to you, giving you a well-supported Monday. If not, will you let it be bygones and hazard to be kind to Tuesday, or will they be facing some tough love with a much needed crackdown?

Follow tip: Milos Makes 3D maps (with R)

If you find delight in maps and dataviz - I ran across a really cool niche account this week, Milos Makes Maps - "I paint the world with R and teach you how to unleash your inner map artist." Beautiful work and extensive R tutorials.

Be a light, not a judge, be a model, not a critic

funfun.email
