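Nocedal and Wright's optimization book defines an isolated local minimizer as follows: a point $x^\star$ is an isolated local minimizer if there is a neighborhood $\mathcal{N}$ of $x^\star$ such that $x^\star$ is the only local minimizer in $\mathcal{N}$.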
Observe that an isolated local minimizer is defined in terms of local minimizers. Therefore, if $x^\star$ is an isolated local minimizer, there exists a neighborhood $\mathcal{N}$ of $x^\star$ such that $f(x^\star) \leq f(x)$ for all $x \in \mathcal{N}$, and $x^\star$ is the only local minimizer in $\mathcal{N}$.
But the $\leq$ seems pointless: because $x^\star$ is isolated, we actually have $f(x^\star) < f(x)$ for all $x \in \mathcal{N}$ with $x \neq x^\star$.
So why is an isolated local minimizer defined via local minimizers, instead of as a "unique strict local minimizer"? Defining it through local minimizers rather than strict local minimizers seems redundant, if not confusing.
$\endgroup$
1 Answer
$\begingroup$ Let us compare the following definitions.
(Definition 1) A point $x^*$ is an isolated local minimizer if there is a neighborhood $\mathcal N$ of $x^*$ such that $x^*$ is the only local minimizer in $\mathcal N$.
(Definition 2) A point $x^*$ is an isolated local minimizer if there is a neighborhood $\mathcal N$ of $x^*$ such that $x^*$ is the only strict local minimizer in $\mathcal N$.
You seem to imply that these definitions are equivalent. However, this is not true.
Consider a function $f:\mathbb R\to\mathbb R$ such that $0$ is a strict local minimizer but not an isolated local minimizer (such a function exists according to your comment). Replacing $f(t)$ by $f(-t)$ if necessary, we may assume that there are local minimizers $t_k > 0$ of $f$ with $t_k \to 0$. Now we define the function $$ g:\mathbb R^2\to\mathbb R, \qquad x \mapsto f(\|x\|). $$ It can be seen that $(0,0)$ is a strict local minimizer of $g$. It can also be seen that $(0,0)$ is not an isolated local minimizer in the sense of Definition 1: every point on a circle $\|x\| = t_k$ is a local minimizer of $g$, and these circles shrink to $(0,0)$. However, other than $(0,0)$ there are no strict local minimizers in a neighborhood of $(0,0)$: if $y\neq (0,0)$ is a local minimizer, then every neighborhood of $y$ contains points $z \neq y$ with $\|y\|=\|z\|$ and therefore $g(y)=g(z)$. So $(0,0)$ is an isolated local minimizer in the sense of Definition 2.
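To make the construction concrete, here is a small numerical sketch. It assumes the specific choice $f(t) = t^4\cos(1/t) + 2t^4$ with $f(0)=0$, which is my assumption and not something fixed by the answer (the answer only requires that some such $f$ exists). The helper names `df`, `g` and `local_minimizer_of_f` are likewise made up for this illustration. The code brackets sign changes of $f'$ to locate local minimizers of $f$ approaching $0$, and then compares $g$ at a point on the corresponding circle with $f$ at that radius.

```python
import math

def f(t):
    """Assumed example: f(t) = t^4 cos(1/t) + 2 t^4 for t != 0, and f(0) = 0."""
    if t == 0.0:
        return 0.0
    return t**4 * (math.cos(1.0 / t) + 2.0)

def df(t):
    """Derivative of f for t != 0."""
    return 4.0 * t**3 * (math.cos(1.0 / t) + 2.0) + t**2 * math.sin(1.0 / t)

def g(x1, x2):
    """Radial extension g(x) = f(||x||) on R^2."""
    return f(math.hypot(x1, x2))

def local_minimizer_of_f(k, iterations=100):
    """Bisect f' on a bracket where it goes from negative to positive.

    For k >= 1, writing u = 1/t:
      at u = (2k + 1.5) * pi, f'(t) = t^2 (8t - 1) < 0 (since t < 1/8),
      at u = (2k + 0.5) * pi, f'(t) = t^2 (8t + 1) > 0.
    """
    lo = 1.0 / ((2 * k + 1.5) * math.pi)
    hi = 1.0 / ((2 * k + 0.5) * math.pi)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if df(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Local minimizers of f for k = 1, ..., 5; they shrink toward 0 as k grows.
minimizers = [local_minimizer_of_f(k) for k in range(1, 6)]

for t in minimizers:
    h = 1e-3 * t
    on_circle = g(t * math.cos(1.0), t * math.sin(1.0))
    print(f"t = {t:.6f}: "
          f"f(t-h) - f(t) = {f(t - h) - f(t):.2e}, "
          f"f(t+h) - f(t) = {f(t + h) - f(t):.2e}, "
          f"g on circle - f(t) = {on_circle - f(t):.1e}")
```

Under this assumed $f$, larger values of `k` give minimizers closer to $0$, and each one corresponds to a whole circle on which $g$ is constant, so none of the points on that circle can be a strict local minimizer of $g$.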
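Since Definition 1 is the more restrictive of the two (every point that is isolated in the sense of Definition 1 is also isolated in the sense of Definition 2, but not conversely), the definitions are genuinely different, and theorems that conclude that a point is an isolated local minimizer in the sense of Definition 1 are stronger.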
$\endgroup$