## How to optimize the optimization process that is recovery from anorexia.

*If you prefer, you can find abridged versions of the three parts in this miniseries on Psychology Today starting here.*

## Part 1: Introducing cost functions and their limitations

“Optimizing for” something has a simple colloquial meaning of setting things up to get as much as you can of something—anything from health to happiness to salary payments in the bank at the end of the month. It also has a more specific meaning in the mathematics of optimization: selecting the best option (with regard to an agreed-upon scoring metric) from a set of alternatives.

In the last two parts of my recent series on cognitive dissonance, I used optimization in the first sense: as a simple concept to structure questions we can usefully ask ourselves about what we’re doing in life and why. In this new series of posts I’d like to explore how the more technical aspects of optimization theory can also provide tools to help in recovery from an eating disorder. This series has been written in collaboration with my partner James Anderson, an engineer/mathematician who specializes in optimization amongst other topics (and who also made this excellent contribution to the blog a few years ago), so you needn’t take my word for it! Our aim is to convince you that a little maths goes a long way when it comes to any big life decisions you may be making, recovery included.

So what are we doing when we’re optimizing for something, in technical terms? A typical optimization problem consists of three key elements: the decision variables (the choices you’re making), the cost function (often referred to as the objective function), and a set of constraints (not all problems necessarily have these). Imagine you’re an investment banker tasked with creating a stock portfolio. Here the decision variables are the numbers of each stock you want to buy. The objective is to maximize the predicted profits (an alternative objective could be to minimize risk), and the constraint is that you have only a certain amount of money to spend. The choice of stocks that produces the maximum profit without exceeding the budget is the optimal choice.
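To make the three ingredients concrete, here is a minimal sketch of the banker’s problem in Python. The two stocks, their prices, their predicted profits, and the budget are all made-up numbers for illustration (none of them come from a real market), and simply trying every combination is only workable because the example is tiny.

```python
from itertools import product

# Hypothetical numbers, chosen only for illustration.
PRICE = {"A": 30, "B": 50}            # price of one share of each stock
PREDICTED_PROFIT = {"A": 5, "B": 8}   # the banker's model of profit per share
BUDGET = 200                          # constraint: total money available


def total_price(n_a, n_b):
    """Money needed to buy n_a shares of A and n_b shares of B."""
    return n_a * PRICE["A"] + n_b * PRICE["B"]


def total_profit(n_a, n_b):
    """Objective function: predicted profit of the portfolio."""
    return n_a * PREDICTED_PROFIT["A"] + n_b * PREDICTED_PROFIT["B"]


# Decision variables: how many shares of A and of B to buy.
max_a = BUDGET // PRICE["A"]
max_b = BUDGET // PRICE["B"]
feasible = [
    (n_a, n_b)
    for n_a, n_b in product(range(max_a + 1), range(max_b + 1))
    if total_price(n_a, n_b) <= BUDGET  # keep only portfolios within budget
]

# The optimal choice is the feasible portfolio with the largest objective.
best = max(feasible, key=lambda portfolio: total_profit(*portfolio))
print("best portfolio (shares of A, shares of B):", best)
print("predicted profit:", total_profit(*best), "spend:", total_price(*best))
```

Changing any of the made-up numbers changes which portfolio comes out on top, which is a first hint of how much the answer depends on the model you feed in.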

Likewise, we can frame recovery from an eating disorder as an optimization problem. Here our goal is to either maximize some utility, e.g. health or happiness, or minimize costs, e.g. time off work/studies, or discomfort, or weight gain (more on this later), or probably to do both simultaneously. An algorithm is at play, with some degree of explicitness, constantly trying to find the optimal solution to the choice between getting more of the utility and incurring more of the costs. Optimization algorithms are iterative, i.e. they begin at some initial point with a candidate decision variable, and then proceed by incrementally adjusting the decision variable until it is not possible to make the objective function cost any smaller (or larger if you’re maximizing).

Figure 1 shows a simple example of an optimization problem with one variable and no constraints. Our goal is to choose a value of x that makes our cost function as small as possible. On the horizontal axis are the possible values of x that we can choose from. The points on the red curve tell us what the cost is for choosing that particular value of x. In this example, the optimal solution (the choice of x that provides the lowest cost) is attained at x=2, and the associated cost is -7. By definition of it being optimal, there is no other choice of x that returns a lower cost. Because this example is very small (only one decision variable), we can easily visualize the problem. The optimal point denoted by the gold circle is at the bottom of the bowl. A sub-optimal choice of x is any other point on the red curve (one choice is given by the green circle, which has a cost of 9; with the cost function used here, that cost is attained at x=-2 and at x=6). Intuitively, it seems that to obtain an optimal solution, we could pick any point on the curve and simply walk downhill until it flattens out. You may have heard of “machine learning” in the news a lot recently. Believe it or not, this idea of walking downhill is behind the scenes in just about every ML technique invented.

The cost function we used was x^2-4x-3. (This is just made up for illustration; there’s nothing meaningful about this choice other than it looks pretty!) The red line is obtained by plotting this value for various choices of x. In our banking example, should our banker be dabbling in the FTSE 100, then the curve would exist in 100 dimensions and we wouldn’t be able to draw it and visually find the bottom. But even in 100 dimensions, mathematically the concept of down is still defined and so we can use a computer to tell us which way to go.
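In one dimension, “asking the computer which way is down” can be sketched in a few lines of Python. This is a minimal gradient-descent sketch on the cost function above; the starting point, step size, and stopping tolerance are arbitrary choices of ours for illustration, not tuned values.

```python
def cost(x):
    """The illustrative cost function from the text: x^2 - 4x - 3."""
    return x**2 - 4 * x - 3


def slope(x):
    """Derivative of the cost; a positive value means the curve rises to the right."""
    return 2 * x - 4


def walk_downhill(x_start, step_size=0.1, tolerance=1e-8, max_steps=10_000):
    """Repeatedly take a small step in the downhill direction until the ground is flat."""
    x = x_start
    for _ in range(max_steps):
        g = slope(x)
        if abs(g) < tolerance:  # flat enough: we are at the bottom of the bowl
            break
        x -= step_size * g      # step against the slope, i.e. downhill
    return x


x_best = walk_downhill(x_start=6.0)
print(f"x = {x_best:.4f}, cost = {cost(x_best):.4f}")
```

Whatever starting point you try, the walk ends up (to within the tolerance) at x=2 with a cost of -7, because this bowl has only one bottom.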

So now we have the basics of how the decision variable, the cost function, and the constraints interact in any optimization process. Optimization is a powerful framework for generating decisions in response to complex sets of aims and constraints, but like everything, it has its limitations. The main two inherent limitations of optimization algorithms are as follows:

**The optimization algorithm is based on a model of the process, and the model is the only reality that exists for the purposes of this algorithm.** With respect to the banking example, our friendly investment banker has to model how she thinks the market will behave, and from this extract a cost function. Of course, there is no model that exactly describes this process and so simplifying assumptions are made. She may also introduce other types of assumption that make solving the problem more tractable (more on this later in the series). The point is, at some point the cost function and constraints are specified and you have to set about solving the problem. If your model is garbage, your solution will be too. This means a solution that is technically optimal (satisfies the constraints, and is the best choice according to your cost function) can be practically useless, like if the banker’s definition of shareholder value bears no relation to what any actual shareholder would want, or (in the personal context) if your definition of happiness actually amounts to “what my mother taught me happiness should look like” and you’re not a lot like your mother. In such cases, what is optimal in your model will not in fact be the best thing you can do.

**Optimization algorithms are myopic.** At each iteration, the algorithm will always choose to update its choice of decision variable so as to reduce its cost. Graphically, this corresponds to always going downhill until you no longer can. In the example in Figure 1 this strategy is a good one, because no matter where you start, going downhill always reduces your cost. Unfortunately, real life doesn’t look like Figure 1 (it would be kind of dull if it did). A more lifelike example is shown in Figure 2.

In this example, simply going downhill won’t guarantee that you obtain the lowest cost. The optimal solution (the choice of x that has the lowest cost associated with it) is shown in gold. However, there are three “valleys” in this example and using the “head downhill” strategy will fail to take you to the optimal solution if you start in the wrong place. If you were to start at the grey triangle, going downhill will take you to the blue circle, and here you get stuck. It’s true that you’re better off now than the position you started at, but the “locally optimal” choice of x has a cost which is far worse than the true optimal solution.

If you start from the grey star, a slightly strange thing happens. Moving in the downhill direction (to the right—which corresponds to the algorithm choosing a larger x) will decrease your cost. However, if you were to go left, a small initial increase in cost would then give way to a large decrease in cost as you find the optimal solution at the foot of the valley. So which way should you go? Looking at the figure, you’re probably saying to yourself, “go left, it’s obvious—a small discomfort followed by a huge gain”! You are of course correct. But it’s not quite that simple. We’ve neglected to tell you precisely what information the algorithm has access to when working out where to go. Recall, the algorithm’s only goal is to choose a decision vector x that makes the cost as small as possible. The easiest way to explain how the algorithm chooses x to achieve this is through metaphor.

Imagine you’re skiing on a mountain range with lots of valleys and your hotel is at the bottom of the lowest one. You’ve been skiing all day, lost track of time, and now it’s getting dark and more annoyingly you’re surrounded by thick fog. A few minutes ago you got off a chairlift which took you up to a pass. All you can see is the ground around you within a 6 foot radius. It’s relatively straightforward to work out which way down is by looking at the ground, but what you can’t assess is whether this piste is taking you down into the right valley. You now have three options:

- Stay where you are—you’re guaranteed not to get home tonight and risk freezing (though if you’re lucky a pisteur will turn up and rescue you).
- Ski downhill—you’re in a densely populated ski area with resorts at the bottom of every valley, and even if this is the wrong valley, an expensive cab ride home is better than a cold night on the mountain.
- Sidestep or carry your skis uphill—hopefully you’ll get back to a sign telling you which way to your resort (unfortunately you don’t know how far up you will have to go, you’ll definitely expend a lot of energy, and it’s possible your resort isn’t even in the neighbouring valley).

To summarize, the optimization algorithm has the same local information as our unfortunate skier. With access to only that information, skiing down the mountain (reducing the cost) certainly has its benefits: temperature increases as altitude decreases, and ski resorts are likely positioned at the foot of a piste. But it’s also possible that clambering uphill to get to the ridge will enable you to find your valley and get home safely. The problem is, you can’t see what’s over the ridge.
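The foggy skier can be sketched in Python too. The landscape below is a hypothetical three-valley curve that we made up for illustration (it is not the curve in Figure 2), and the search is deliberately crude: look one small step to the left and right, move to whichever is lower, and stop when neither is.

```python
# Hypothetical landscape (NOT the curve in Figure 2): three parabolic valleys,
# each described by (position of its floor, height of its floor).
VALLEYS = [(-3.0, -1.0), (1.0, -5.0), (5.0, -2.0)]


def cost(x):
    """Height of the landscape at x: the lowest of the three parabolas."""
    return min((x - centre) ** 2 + floor for centre, floor in VALLEYS)


def ski_downhill(x, step=0.01, max_steps=100_000):
    """Local search with only local information, like the skier in the fog."""
    for _ in range(max_steps):
        here, left, right = cost(x), cost(x - step), cost(x + step)
        if here <= left and here <= right:
            break  # neither neighbour is lower: stuck at the foot of some valley
        x = x - step if left < right else x + step
    return x


for start in (-2.0, -1.0, 4.0):
    end = ski_downhill(start)
    print(f"start {start:+.1f} -> stops near x = {end:+.2f}, cost {cost(end):+.2f}")
```

In this made-up landscape the deepest valley has its floor at x=1. Starting at x=-1 gets you there, but starting only a little further left, at x=-2, leaves you at the foot of the shallow valley near x=-3, and the search has no way of knowing that something better lies over the ridge.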

This optimization viewpoint provides an interesting lens through which to view semi-recovery. You were initially ill, and you have made some progress towards recovery. Through a sequence of choices you have got a bit better, but then something happens that makes you realize you’re not done yet. Let’s say you notice you’re getting a bit obsessive about having the “right” snacks in your bag whenever you go out. You take some action: you stop packing the snacks. Unfortunately this gets you to the foot of the wrong valley: when you go out you no longer have to bother with the forward planning, but you get hungry, you don’t manage to spontaneously buy things to eat, and your old dysfunctional hunger responses kick back in. Only drastic (non-obvious) action (i.e., going uphill even though it feels unnatural) will get you to full recovery, the global optimum. In this case that might look like making a plan for always taking snacks with you, and of bigger and scarier and more varied kinds than you were having before, until the distorting effects of your long-term malnutrition are fully eliminated and you get relaxed about spontaneously buying things when you want them. Rather than pretending you can already do without any planning, temporarily planning *more* is what will actually get you fully better.

This starts to show how we can use the optimization lens as a way of thinking about how we ended up in semi-recovery in the first place. Recall the two limitations of an optimization algorithm. Limitation 1 (the model is the only reality for the algorithm) suggests that one reason for getting stuck was that for some reason your model of health (or whatever your recovery endpoint is) was incorrect. It was implicitly modelled (the cost function was chosen) such that healthy living looked like Figure 1, in which case any improvement in eating habits (e.g. eating more like healthy people do) would take you closer to perfect health, when in fact, reality looks much more like Figure 2, and there are choices that reduce local costs but get you further away from health. (In our example, skipping straight to a casual “I don’t need to carry food around with me” attitude, when you’re not yet physically robust enough to weather long periods without food without slipping back into disordered ways.) Armed with this incorrect model, Limitation 2 dooms you to failure. You make small gains, but unfortunately towards a local optimum (functional and passing as “normal” in the day-to-day, but getting further from actually resolving your malnutrition and its after-effects). If you’re lucky, the cost of being stuck in the local valley is not too different from being fully healthy, but often the cost is much more.

Coming back to the first limitation, the problem of not knowing (whether) the deeper valley exists at all may be a question of model accuracy too. Let’s say you want to stop counting calories. You’re considering deliberately aiming for some weight gain to help you stop, because calorie-counting’s purpose for you has always been to prevent you from gaining weight, so making weight gain an explicit aim should help you get rid of it. You may overestimate the steepness and/or length of the local incline in making the calculation about whether to incur the weight-gain cost. For instance, you may assume that as soon as you count less you’ll balloon in size, or even that your counting is an effective weight control strategy in the first place, when in fact neither is true: there may be far more effective mechanisms available to regulate bodyweight than your counting habit, and the amount you’ll need to gain to reach that self-sustaining stability is, let’s say, 2 kg rather than 10. An extreme case of this, where model fit starts to look increasingly delusional, is in obsessive compulsive disorder (a temporary version of which is often a consequence of malnutrition), where the short-term costs of not doing the checking or counting or other ritualized behaviour are radically overestimated, the long-term costs of doing it are radically underestimated, and the model is equally radically adrift from the real world. We’ll come back to the question of how to test out whether the model you have is actually representative of reality, and how to make it more so if it isn’t.

In conclusion, then, the locally optimal point denoted by the blue circle is the mathematical modelling of the classic semi-recovered state: thinking that this life—in which you’re not terribly unwell anymore but are still restricting and exercising and counting and doing all you can to control your body size—is as good as it gets. You’re too terrified of the costs of change (most often, these amount to not much more than “weight gain”) to give yourself a chance to get to really recovered—because you’re not sure whether really recovered (the gold circle) really exists, or is possible for you. You hesitate, and then quite likely you do nothing, because even the local optimum has stability, which is attractive—but this is a problem when the stability is keeping you somewhere globally suboptimal. One half of the problem is that the valley you’re in feels like home because you’ve been here so long that it seems like it should be. The other half of the problem is that you don’t know what the costs of getting out of it really are, or how most efficiently to incur them. What you need is a trusty tour guide to get you over the ridge and back to where you should be.

In the next part of this series, we’ll move beyond the limitations of optimization as they apply to semi-recovery to explore more ways in which the recovery process is an optimization process and can itself be optimized by viewing it through this lens. This amounts, if you like, to turning yourself into the tour guide you trust.

## Part 2: Multiple cost functions, aka optimizing for more than one thing at once

In Part 1 we considered two limitations of optimization processes and the light they shed on pseudo-recovery. Let’s now think more about what the implications of optimization are for getting full recovery to happen. In Part 1 we only considered problems where there was just a single objective, i.e. just one thing to be minimized or maximized. Of course this is not necessarily realistic. In this post we will concentrate on the more lifelike situation of optimization with multiple objectives. For simplicity, we focus on the case of two objectives, but everything carries over to an arbitrary number of objectives. Equally, the cartoon idealization of optimization as “finding the floor of the valley” that we introduced in Part 1 will carry over to this setting, as we’ll be combining the two objectives into a single parameterized objective function (sounds far more complicated than it really is) to be minimized.

Recall our friendly stockbroker. While her job is to build a profitable portfolio for her investors, most investors don’t have the stomach for highly volatile stocks, even though these often lead to greater profits. In order to keep everyone happy she should ideally minimize risk whilst maximizing profit. Unfortunately, these are conflicting objectives. Government bonds are very stable, but the profit margin is small. A new startup may return huge profits, but may also tank! It’s not possible to make huge profits while taking no risk. In this multi-objective setting, there is no longer a single obviously optimal solution (as there was in the “deepest valley” case with only one objective). **Whenever you’re trying to optimize for more than one objective, there is always a family of solutions**. Each member of this family will be optimal for a particular weighting of objectives. Let’s make this concrete using our investment banking example. Our banker wants to select a portfolio with two objectives: minimize loss (for our purposes this is the same as maximizing profit: we view a negative loss as a profit) and minimize risk. We combine these two objectives into one optimization problem with a new cost function that looks like

$$\text{cost}(x) = w \cdot \text{loss}(x) + (1-w) \cdot \text{risk}(x),$$

recalling that the goal is to pick *x* to make the cost as small as possible. The parameter *w* can take any value in between 0 and 1, and our friendly banker selects a value to control how much she cares about profits versus risk. For example, if she selects *w=1* then the cost reduces to simply minimizing loss(*x*), i.e. the optimization problem she solves to select her portfolio ignores risk. To see why this is the case, set *w=1* in the cost and you end up with 0 × risk(*x*), which is always equal to zero regardless of the choice of *x*. (We said it’s not as complicated as it sounds!) Using an identical argument, if she chooses *w=0*, then when she optimizes, she ignores making a profit in favour of being risk-averse. A value of *w=0.5* places equal emphasis on profit and risk mitigation, and any other choice favours one over the other. The point is, any choice of *w* will result in an optimization problem for which there exists an optimal solution, and you cannot say one solution is “better” than another. Technically they are all optimal. In Figure 3 we depict this graphically.

The curve represents the family of solutions, one for each choice of *w*. Points not on the curve represent either suboptimal or impossible solutions. In the suboptimal case, risk could be decreased without making less profit, profit could be increased without incurring any more risk, or both! The cloud illustrates the space where you can do both at once to improve your solution: there’s really no reason to hang around here. Points to the left of the curve are unattainable: either the conflicting objectives simply don’t permit you to make this choice, or else to be at this point means violating a constraint, e.g., the banker spending more money than she has access to. An extreme case of this is indicated by the heart. Here you would make huge profits while incurring no risk at all. Sounds great, but never going to happen!
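Here is a minimal Python sketch of how sweeping the weight *w* traces out a family of solutions. Everything specific in it is an assumption of ours for illustration: the decision variable *x* is taken to be the fraction of the money put into the risky startup (the rest goes into bonds), profit is taken to grow in proportion to *x*, and risk is taken to grow with the square of *x*. None of these shapes comes from a real market model.

```python
def profit(x):
    """Hypothetical profit (arbitrary units) when a fraction x goes into the startup."""
    return x


def loss(x):
    """Loss is negative profit, as in the text."""
    return -profit(x)


def risk(x):
    """Hypothetical risk (arbitrary units): grows faster the more goes into the startup."""
    return x**2


def combined_cost(x, w):
    """Weighted combination of the two objectives, with 0 <= w <= 1."""
    return w * loss(x) + (1 - w) * risk(x)


# Candidate decisions: fractions 0.00, 0.01, ..., 1.00 of the money in the startup.
CANDIDATES = [i / 100 for i in range(101)]


def best_portfolio(w):
    """The candidate that minimizes the combined cost for this weighting."""
    return min(CANDIDATES, key=lambda x: combined_cost(x, w))


for w in (0.0, 0.25, 0.5, 0.75, 1.0):
    x = best_portfolio(w)
    print(f"w = {w:.2f}: x = {x:.2f}, profit = {profit(x):.2f}, risk = {risk(x):.2f}")
```

Each value of *w* picks out a different mix, from everything in bonds at *w=0* to everything in the startup at *w=1*, and each of those mixes is optimal for its own weighting. Plotting the resulting risk against profit would give a curve of the kind described above, under these made-up assumptions.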

Again, the key message here is that we cannot say that any one point on the curve (i.e., any single solution) is better than any other: they’re all technically optimal (just for different values of the weight parameter *w*). Let’s say your two cost functions are, with whatever degree of explicitness, ill health and bodyweight. (We choose ill health rather than good health for simplicity, so we’re minimizing both the “bad” things: as with minimizing loss, negative ill health is good health! And we choose bodyweight because fear of weight gain tends to be the greatest sticking point for the majority of people who struggle to get fully better from restrictive eating disorders, i.e. this fear is a primary driver in the psychological optimization process.) This means there are multiple different combinations of ill health and bodyweight that will satisfy the optimality requirement, because where you get more of one you typically get less of the other.

In the perceived trade-off between ill health and bodyweight, being *extremely* underweight or overweight is similar to the heart point in Figure 3: it is not allowed because a constraint in the optimization problem would rule this out. Of course, if your model of health is wrong, and does allow for runway-model stick-thinness, then the corresponding optimization problem doesn’t bear any resemblance to reality and being extremely underweight will be seen as a viable optimal solution. This is an example of Limitation 1 mentioned in Part 1: If your model is garbage, so is your solution. A lot of what it means to grow up and gather maturity and wisdom is probably (hopefully!) that we learn to make our models less like garbage, by working out what the meaningful constraints are and within what range they operate, often through the pain of trial and error. What you like to eat and how much of it is just one example, but a critical one—and one that far too many people let be guided by poor dietary science, advertising, peer pressure, and other factors that have little to do with what will help *your* life feel good.

Alongside the problem of lacking the appropriate model constraints, a possibly more important (because more common) problem arises when the model is correct, but you look at extreme points on the curve of solutions. There may be points on the curve that correspond to solutions of optimization problems where it is possible to be under- or overweight relative to health. Such a situation occurs if too much “weight” is given to one objective, i.e., *w* is very close to 1 or 0. Although technically an optimal solution, that doesn’t necessarily make it a sensible choice. For example, this may correspond to a BMI of 18, which although healthy according to some of the more irresponsible of the researchers who conduct clinical trials on therapeutic interventions for anorexia, is not compatible with full health for most people, especially not for almost anyone emerging from long-term malnutrition. Thinking back to our banker, this could correspond to a choice of *w=0*, with the portfolio being highly risk-averse but the profit minuscule. Here you’re prioritizing keeping your weight low over reducing your ill health, and the outcome is the predictable standard case post-anorexia: years or decades of stuckness in partial recovery.

As the example of the misguided clinical definitions of a “healthy” BMI suggests, it is easy to revert to “textbook” versions of an optimization process, where you treat readymade parameters as all you need, without remembering that the constraints and the weightings need to be right for *you*. Algorithms of often fiendish complexity are at work in our lives all the time, whether or not we realize it, and if we don’t realize it, the default parameters will tend to dominate. Given how low-quality much eating disorder treatment is, and how well it aligns with the most common anorexic fears, the defaults are unlikely to do you as much good as your tailored versions could: They’re likely to push you into errors such as fruitless psychologizing and/or adhering to nonsensically low BMI limits (Troscianko and Leon, 2020). Your personal definition of health combined with your body’s physiology, for example, means that not all points on the theoretically optimal curve are equally optimal for you. In the optimization processes constantly running to dictate our behaviours (anything from whether to walk or run over to the window, to whether to up my energy intake by 500 calories from tomorrow), choices are being made, with priors shaped by a lifetime’s worth of learning, and decisions reached via rapid and complex simulations of predicted outcomes. Turning a critical gaze on some of these often near-automatic choices, via the injection of a bit more individually oriented intuition (or common sense, whatever you want to call it), is often crucial to helping algorithmically optimal translate into actually good.

In a more directly positive sense, too, the existence of a curve of optimal solutions is a useful thing to bear in mind because it counters the paralysis it’s easy to feel during recovery at the idea that there is only one endpoint. This often manifests in the form of feeling you’ve “messed up recovery” already so there’s no point carrying on; that you have no clue where to go from here; that all your options feel wrong, etc. Pretty much everyone feels some version of this at some point, partly because every recovery process is nonlinear: something as life-changing and extended and periodically frightening as this always involves stopping and starting, regressing and restarting, making good progress in some areas while flatlining on others. An unhelpful notion of the perfect (and nonexistent) recovery process, which is likely to make you give up when your recovery doesn’t correspond to it, arises partly as a function of an unhelpful notion of the perfect (and nonexistent) recovery outcome: the idea that there’s a single, possibly quite brittle, hard-to-find destination, and that all the odds are stacked against you ever actually stumbling upon it.

This is in stark contrast to the reality, which is that fully recovered is by definition flexible, nonsingular. Being well again is about comfortably inhabiting a range of options, in your bodyweight and everything else. It’s anorexia, after all, that has always insisted on narrow stasis. Relatedly, the perceived brittleness of recovery as process and endpoint may relate to the perceived narrowness of “normality” as a guide and/or an intended endpoint. You can read more about the complications involved in aspiring to normality in a previous post (“Who wants to be normal?”), but for our purposes here, it’s worth noting that there are as many normalities as there are sampling methods: what you decide are your relevant dimensions, your relevant demographic, your way of ascertaining the actual behaviours and/or values of the demographic you’re observing—all these factors determine what gets spat out at the end that you choose to label “normality”. If you choose to sample from catwalk fashion or bikini competitions, you’ll end up with a particularly unpleasant brittleness of definition. All real people and sets of people have their curves, and the extent to which they’re OK with moving around on them, and away from and back towards them, is a decent proxy for how relaxed a life they’re living, and how much you might want to aspire to something similar. If you choose to emulate in your later recovery that subset of your friends who go on about how much less good they’d feel without their 3 x weekly Peloton and their quarterly juice cleanses, you can expect to end up as brittle as they are—actually more so, because of your history of disordered eating and exercise. If you choose more flexible models, or create your own if no decent ones are available around you, you have a far better chance of achieving a recovery that deserves the name.

In the third and final part of this series, we’ll run through some options to help you keep the optimization process that is recovery on track despite the complexities of competing objectives.

## Part 3: Applied optimization

In Part 2 of this series we considered what happens when you have an optimization problem (let’s say recovery) with multiple competing objectives (let’s say getting healthy whilst having a bodyweight that is acceptable to you). In this final part of the series, we consider some troubleshooting options to help recovery stay on track despite the complexities of competing objectives that change as you do.

As we saw in Part 2, as soon as you have more than one objective, you have more than one optimal solution: specifically, you have a whole curve populated with optimal solutions. It often doesn’t feel this way when we’re doing something difficult where we’re having to balance multiple aims. The most immediate reason why we don’t see the curve is usually fear and misinformation. Hiding the curve from you is what fear does best, because fearing something means, in optimization terms, trying to minimize the hell out of it. This means that you radically reduce the number of available optimization options. With respect to Figure 3 in Part 2, this would be as if you could see only a tiny fraction of the curve that has all the optimal points on it, because e.g. only a tiny range of weight gain is acceptable to you thanks to your fear.

This fear is probably responsible for the fact that the majority of people who have anorexia don’t get fully better, i.e. never even get to the optimal curve. And even if it doesn’t prevent you from reaching the optimal curve, fear might well persuade you, once you’re on it, that you’re not on a curve, i.e. inhabiting one of a family of equally good points, but at a unique optimal point that must not be deviated from. You might have got to a weight where you are truly capable of full health, but be terrified about gaining or losing even a kilo or two, in case that wrecks everything. This sense of having no options, no freedom of movement, will reliably endanger any potentially fully-recovered state through paradoxical fear of its endangerment. It’s worth remembering that misinformation comes in many forms, including dressed up in medical clothing. For example, one form of life-wrecking misinformation is the notion that a BMI of 20 to 25 is healthy and that this is all any of us should aim for or consider acceptable, whatever our genetics or our life circumstances.

Figure 4 gives a crude approximation of the biomedical view, in which BMI 20 to 25 counts as uniformly “healthy” and every increment above or below is an immediate jump into the next category, all of which are considered problematic, and increasingly so as their distance from “healthy” increases. This is reflected by an increase in cost with each increment. A standard somewhat more sensible view is given in green, with a curve denoting gradual increases in costs away from a “low-20s” BMI. An anorexic viewpoint given in red is the “20 is already fat” view. Recall that, viewed as an optimization problem, the goal is to “choose” a BMI that takes you to the bottom of these curves. All these views are profoundly limited in their applicability to your specific life and health, not least because BMI is such a profoundly limited measure and because the optimum may be quite a bit higher than we tend to think (Nuttall, 2015), but it’s obvious that the anorexic model is considerably more damaging than the other two, while shaped by both. A therapeutic process will typically try to align the red curve closer with the green.

Emily has listened to a depressing number of anecdotes about doctors and therapists telling their patient/client at the start of weight restoration, “don’t worry, we won’t let you get too fat, you can stop when you get to 20 [or insert even lower number here]”. This type of approach is great for pandering to the eating disorder and so reducing some forms of conflict within the therapeutic relationship, and potentially for getting initial buy-in from the ill person. Unfortunately, it also condemns their recovery effort to failure, because dictating your health based on an arbitrary number is what you’ve been doing all along, and in no way solves your problem. It also makes people who have already got to 19 or 20 and are in no way better yet feel they’ve done something terribly wrong, when all they’ve done is the thing most people do: not go far enough. The “get to a minimally ‘healthy’ BMI and then start dieting” model of “recovery” is a function of the terror of “overweight” that has grown medically normalized even amongst professionals treating people whose primary problem is that very terror.

This brings us to the question of where the optimization objectives come from. Why should bodyweight be pitted against health in the first place? Let’s say that your model includes this health/bodyweight tradeoff because your concept of attractiveness is tied tightly to body size and shape, i.e. overall you believe that slimness correlates with attractiveness. Your belief that this is the case has numerous dodgy foundations (explored in this pair of “Is thin beautiful?” posts), but it is the model you’re currently imposing on reality. “Reality” here is other people’s perceptions of you (the people you want to be attractive for), and let’s say that it happens that 99% of the people you want to attract would actually find you *more* attractive at a higher weight than you’re at now. Your messed-up model is thus pushing you to keep engaging in weight control behaviours to avoid an outcome that would make you both healthier and more attractive. The model may also be predicated on the assumption that being “attractive” to other people will result in more success, happiness, or other things you want. This too may not turn out to be true: for instance, being “attractive” beyond a certain level may result in less, not more, professional respect. The point is, how we expect the world to work and how it actually does work are often not particularly well aligned. Again, this is illustrated in Figure 4. The eating disorder sufferer’s curve gets arbitrarily high for increases in BMI beyond a certain point and never increases as BMI decreases. In contrast, the notional “average” person considers both very high and very low BMIs to be poor “choices”.

What to do? How do you improve your model? The best way is to perform tests against the structure of reality followed by feedback of the results into the prior model to create an updated, better one. In the real world, tests against reality usually look like changing one thing and seeing what else changes, e.g. in this case gaining 5 kilos and seeing whether the relevant other people interact differently with you, and if so how. Another method might be via model-focused attitude change prior to behaviour (and then body) change. For example, one way of changing a model is to change the weightings on the various functions it includes, e.g. in this case deciding on a 2:1 ratio of ill health to bodyweight. Doing so would put you on a completely different part of the curve in Figure 3 from the fear-dictated segment you were on before. One way to make this change really happen might be by giving yourself an opportunity to exacerbate your cognitive dissonance around body ideals, as I described in this post: People who have to publicly criticise the thin ideals they currently endorse experience a level of unpleasant dissonance that can drive significant attitude change in service of reducing the dissonance. That is, you come to really believe your criticisms (that the thin ideal is one of the best ways to keep women docile, say) because it’s so uncomfortable having to make those criticisms while still deep down believing the opposite (that being thinner really is just better).
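To make the idea of changing the weightings concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the decision variable `x` is an abstract, unitless choice between 0 and 1 (not a bodyweight or BMI), and the two cost shapes are simple made-up curves rather than anything read off Figure 3. The only point is to show that the same pair of objectives yields a different "best" choice once the ratio between them changes.

```python
import numpy as np

# Hypothetical, unitless decision variable: 101 candidate choices from 0 to 1.
x = np.linspace(0.0, 1.0, 101)

# Two illustrative objectives in tension (both shapes are assumptions):
cost_ill_health = (1.0 - x) ** 2   # cheapest at x = 1
cost_other = x ** 2                # cheapest at x = 0


def best_choice(weight_ill_health, weight_other):
    """Return the x that minimizes the weighted sum of the two costs."""
    total = weight_ill_health * cost_ill_health + weight_other * cost_other
    return float(x[np.argmin(total)])


equal = best_choice(1.0, 1.0)       # 1:1 weighting
reweighted = best_choice(2.0, 1.0)  # 2:1 weighting towards reducing ill health

print(equal, reweighted)
```

With these made-up curves, the 1:1 weighting settles at the midpoint, while the 2:1 weighting moves the chosen point about two thirds of the way towards the end that minimizes ill health. Nothing about the objectives themselves has changed, only how much each one counts.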

In general, though, attitude change tends to be more reliably achieved by starting with behaviour, not least because behaviours have a two-in-one efficacy: they directly change both physical (e.g. bodyweight) and also psychological states. In the end, nothing will really make you care less about being thin other than getting less thin—partly because of the cognitive rigidity and obsessiveness that are malnutrition’s inevitable consequences, partly because, as with most other everyday worries, fearing and avoiding things tends to make far greater spectres of them than actually living them.

Therapy, counselling, coaching, and other forms of professional support are of course another context where model improvement is one of the main stated aims. When you’re seriously ill, everything tends to look bad, and when you’re seriously ill with an eating disorder, change (involving weight restoration) tends to look especially bad. Therapy may help you see that there is in fact a downhill, cost-reducing direction to be (often quite straightforwardly) taken. Returning to our simple one-objective model in Part 1, therapy may also be invaluable in helping you out of the shallow valley towards the one where things are genuinely good, not just tolerable. Therapeutic guidance may, more generally, help you improve the accuracy of your model by smoothing out the perceived spikiness of the curve, as we show in Figure 5.

Figure 5 shows how the anorexic view often changes once recovery has started. During recovery, each new change that induces weight gain is perceived as coming with significant costs. Each time you do make the change and gain the weight, you tend to get a little bit more sensible about it: you recognise that you’re getting all kinds of benefits (i.e. negative costs!) associated with the weight you’ve gained, and you have a little more confidence in your ability to make the changes and keep making them. Still, it often remains difficult. You gain a bit, you get stuck thanks to your perception of radically increased costs from continuing for the next little phase, then you continue anyway. There will probably be significant perceived costs to allowing your weight to increase further once it’s reached the biomedical “healthy” range.

Crucially, just as in the previous figures, this is all about perceptions, not reality. And your “pretty much zero perceived costs” may turn out to be anywhere! It’s really unlikely to be anywhere near where the eating disorder viewpoint would originally have put it, though. It’s amazing how different things seem looking back compared with looking forward: you weigh 20 kg more than you used to, and you feel less fat and less bothered about fat than you ever did at the lower weight. It feels like magic, but it isn’t: it’s just your cost function finally getting some updates. Throughout this process, one of the jobs of anyone supporting you through your recovery is to help you smooth out the perceived costs curve: to reduce the spikiness of the spikes, so that each incremental portion of progress is less agonized. In this sense, it’s precisely the mismatch in models, and the fact that the therapist’s is a closer fit to ground truth, that drives the therapeutic method.
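One simple way to picture a cost function "getting some updates" is as a repeated correction of a prediction towards what was actually experienced. The sketch below is only an analogy under stated assumptions: the starting spike (`10.0`), the cost actually experienced (`1.0`), and the `learning_rate` of `0.5` are all hypothetical numbers in arbitrary units, and the update rule is a generic error-correction step, not a claim about how perception works in any individual.

```python
def update_perceived_cost(perceived, observed, learning_rate=0.5):
    """Move the perceived cost part of the way towards the observed cost."""
    return perceived + learning_rate * (observed - perceived)


perceived = 10.0  # hypothetical spike: how costly the next step looks in advance
observed = 1.0    # hypothetical cost actually experienced after taking the step

history = [perceived]
for _ in range(5):
    # Each round of "try it, see what happens" feeds back into the model.
    perceived = update_perceived_cost(perceived, observed)
    history.append(perceived)

print(history)
```

In this toy version each spike is smaller than the last, but the perceived cost never quite jumps straight to the observed one. That gap is where support from other people can help, by making each correction a little larger than it would be alone.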

A less happy possibility is model discrepancy in a different sense: with respect to the objective function being worked towards. It sometimes happens, especially with “severe and enduring” cases of anorexia, that therapists or other professionals are tacitly optimizing for damage limitation rather than recovery. In such a case we have the opposite of the desired effect of helping you towards the home valley: here the therapist thinks you’re at the grey triangle in Figure 2 (see Part 1) and that you don’t have what it takes to get all the way over to the globally optimal gold circle. They may think they have good reasons for this, for instance that aiming for full recovery, with the intensive demands for behavioural and physical change this entails, is more likely to result in further deterioration than in success. Regardless of whether any evidence supports this (we know of none that does), not making the therapeutic agenda explicit is never acceptable. The fact that it often happens is a reminder, however, not to take for granted that everyone’s model or objective function is what you think it is, or is the same as yours. This is all the more important when they have as powerful a status in your life as anyone does who is helping manage your recovery.

We hope this may have given you some new ideas about how to configure the optimization processes you’re engaged in—the recovery one and any others. Finally, it’s worth reiterating the basic point that all this is in a fundamental sense a *process*. Recovery is nothing if not iterative experimentation: try something, see what happens, assess, remodel, try something new. And, again, when optimizing for more than one objective in tension with each other, there are always multiple solutions. These two facts combined open out into a more flexible way of conceiving of recovery than may be your default.

Just remember: You can only ever optimize the model you have, and hope it’s representative of real life—or take steps to make it more so. There’s a bunch of choices where you can’t say one is better than the other. They’re just different: they give you more of one thing and less of another. There’s no free lunch: everything is a tradeoff, and you get to decide which you like best. Don’t let anyone else decide for you.

## References

Nuttall, F. Q. (2015). Body mass index: Obesity, BMI, and health: A critical review. *Nutrition Today*, *50*(3), 117–128. Open-access full text here.

Troscianko, E. T., & Leon, M. (2020). Treating eating: A dynamical systems model of eating disorders. *Frontiers in Psychology*, *11*, 1801. Open-access full text here.
