Let's start with question 1): how does JS divergence handle zeros?
by definition:
\begin{align}
D_{JS}(p||q) &= \frac{1}{2}[D_{KL}(p||\frac{p+q}{2}) + D_{KL}(q||\frac{p+q}{2})] \\
&= \frac{1}{2}\sum_{x\in\Omega} \left[p(x)\log\left(\frac{2 p(x)}{p(x)+q(x)}\right) + q(x)\log\left(\frac{2 q(x)}{p(x)+q(x)}\right)\right]
\end{align}
where $\Omega$ is the union of the supports of $p$ and $q$. Now let's assume one distribution is zero where the other is not; without loss of generality (by symmetry) say $p(x_i) = 0$ and $q(x_i) \neq 0$. The $p(x_i)$ term vanishes under the convention $0\log 0 = 0$, so for that point in the sum we get
$$\frac{1}{2}q(x_i)\log\left(\frac{2q(x_i)}{q(x_i)}\right) = q(x_i)\frac{\log(2)}{2},$$
which is finite, unlike the corresponding term in the KL case, where $q(x_i)\log\left(\frac{q(x_i)}{p(x_i)}\right)$ diverges when $p(x_i) = 0$.
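To make this concrete, here is a minimal numerical sketch in plain NumPy (the `kl` and `js` helpers and the example distributions are just for illustration) showing that JS stays finite in exactly this situation while the corresponding KL divergence blows up:

```python
import numpy as np

def kl(p, q):
    # D_KL(p || q) with the convention 0 * log(0 / q) = 0;
    # evaluates to inf if q is zero anywhere p has mass.
    mask = p > 0
    with np.errstate(divide="ignore"):
        return np.sum(p[mask] * np.log(p[mask] / q[mask]))

def js(p, q):
    # D_JS(p || q) = 0.5 * [D_KL(p || m) + D_KL(q || m)],  m = (p + q) / 2
    m = 0.5 * (p + q)
    return 0.5 * (kl(p, m) + kl(q, m))

p = np.array([0.0, 0.5, 0.5])   # p(x_1) = 0
q = np.array([0.4, 0.3, 0.3])   # q(x_1) != 0

print(kl(q, p))   # inf: the x_1 term is q(x_1) * log(q(x_1) / p(x_1)) with p(x_1) = 0
print(js(p, q))   # ~0.164: the x_1 term alone contributes q(x_1) * log(2) / 2 ~ 0.139
```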
Now onto 2): in GANs, why does JS divergence produce better results than KL?
The asymmetry of KL divergence privileges one distribution over the other, which is not ideal from an optimization perspective here: $D_{KL}(p||q)$ and $D_{KL}(q||p)$ penalize mismatches very differently. Additionally, KL divergence cannot handle non-overlapping distributions, which is a real problem because both distributions are approximated through sampling, so there is no guarantee their supports overlap (see the sketch below). JS divergence addresses both issues and leads to a smoother, bounded objective, which is why it is generally preferred. A good resource is this paper, which investigates this in more detail.
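As an illustration of the non-overlap point (reusing the `kl` and `js` helpers from the sketch above, with made-up distributions), consider two distributions with completely disjoint support, as can happen when real and generated data are each represented by samples that never coincide:

```python
p = np.array([0.5, 0.5, 0.0, 0.0])   # all mass on the first two bins
q = np.array([0.0, 0.0, 0.5, 0.5])   # all mass on the last two bins

print(kl(p, q))   # inf: q is zero everywhere p has mass, so KL gives no usable value
print(js(p, q))   # log(2) ~ 0.693: finite and bounded even with zero overlap
```

KL returns an infinite (and hence uninformative) value, whereas JS stays bounded at its maximum of $\log 2$.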