I would first say: consider the advice of Thomas W in the comment above and think about whether you really need to discretize your variables. I'd also question the wisdom of training a reasonably sized network with a GA rather than a dedicated neural-net training algorithm, which is very likely to perform much better. However, assuming you really do need to do this, the basic approach is generally the following.
- Decide on your range and precision. For the sake of simplicity, I'm going to assume you use, say, $L=16$ bits to represent every weight. You can encode some variables at higher precision than others -- you're in charge of writing the encode/decode functions, so it's entirely up to you -- but here I'm assuming you don't. Your encoding is then a vector of length $16n$, where $n$ is the number of weights in the network. You could use a string instead, but that's probably a net loss of efficiency: most of what you'll be doing is converting those bits to integers to do the arithmetic needed to decode the weights, so I'd just make it a vector of $16n$ integers.
For the range, you just need the minimum and maximum values you're going to allow. Again, for simplicity, I'm going to say the values range between $-10$ and $10$, denoted as $R_{\min}=-10$ and $R_{\max}=10$.
- Your GA then just manipulates the $0$s and $1$s. To evaluate fitness, you have to decode each of those $16$-bit blocks into a double between $-10$ and $10$. With $16$ bits, you have $2^{16}=65536$ possible values. The total range of output values is $10-(-10)=20$, so the step between adjacent values is $20/(2^{16}-1)\approx 0.000305$. Let $0000000000000000$ be the smallest possible string and map it onto the value $-10$. Now, every increment to that binary string just adds about $0.000305$ to the output value, so the all-ones string maps onto $10$.
Putting all this together, if $b_i$ is the integer decoded from the $i$-th block of $16$ bits, then $w_i = \frac{b_i}{2^L - 1}(R_{\max} - R_{\min}) + R_{\min}$. Do that for each block of $16$ bits, giving you each weight in the network (there's a minimal decode sketch just after this list). Run your network on your training data and use whatever accuracy metric you like as the fitness function for the GA.
- For the GA itself, you don't really have to do anything special. The GA just sees an arbitrary string of $16n$ bits, and you can do crossover and mutation directly on that apparently meaningless bit string.
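Here's a minimal sketch of the decode step in Python. The name `decode_weights`, the NumPy representation, and the default parameters are just my choices for illustration of the scheme above, not anything canonical:

```python
import numpy as np

# Running-example parameters: 16 bits per weight, weights in [-10, 10].
L = 16
R_MIN, R_MAX = -10.0, 10.0

def decode_weights(bits, n_weights, L=L, r_min=R_MIN, r_max=R_MAX):
    """Map a flat 0/1 vector of length L * n_weights to n_weights floats in [r_min, r_max]."""
    bits = np.asarray(bits).reshape(n_weights, L)
    powers = 2 ** np.arange(L - 1, -1, -1)      # most significant bit first
    ints = bits @ powers                        # each integer is in 0 .. 2**L - 1
    return r_min + ints / (2**L - 1) * (r_max - r_min)

# The all-zero genome decodes to a vector of r_min values.
genome = np.zeros(3 * L, dtype=int)
print(decode_weights(genome, n_weights=3))      # [-10. -10. -10.]
```

Load the decoded weights into your network, run it over the training set, and return the accuracy (or whatever metric) as the fitness.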
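And the operators really can ignore the structure entirely. Here's a sketch of one-point crossover and per-bit mutation on such a genome; the names and the mutation rate are again just illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def one_point_crossover(parent_a, parent_b):
    """Swap the tails of two bit vectors at a random cut point."""
    point = rng.integers(1, len(parent_a))
    child_a = np.concatenate([parent_a[:point], parent_b[point:]])
    child_b = np.concatenate([parent_b[:point], parent_a[point:]])
    return child_a, child_b

def bit_flip_mutation(genome, rate=0.01):
    """Flip each bit independently with probability `rate`."""
    mask = rng.random(len(genome)) < rate
    return np.where(mask, 1 - genome, genome)
```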
Your question asked about going from a vector of weights to a vector of strings, which is the opposite of the conversion described above. Generally, for a GA, you don't need a phenotype->genotype conversion: you just generate genotypes using the genetic operators and use the genotype->phenotype conversion to evaluate candidate solutions. However, there are some reasons you might want to go the other way. For instance, you might want to seed the population with a known set of weights, and so you'd need the ability to convert those weights to binary.
Basically, in that case, just invert that conversion function. The idea is to map the floating-point weight between $R_{\min}$ and $R_{\max}$ to the integer that sits at the same relative position in the range $0$ to $2^{L}-1$. For example, if the weight is $-10$, then it's the smallest possible value, so it maps onto the smallest possible integer, namely $0$. Similarly, a weight of $0$ maps onto roughly $(2^{L}-1)/2$, because it's halfway between the min and max of both ranges. If you play around with it a bit, you should see that the expression is $\big \lfloor \frac{w_i - R_\min}{R_\max - R_\min} (2^L - 1) + \frac{1}{2} \big \rfloor$, i.e. round to the nearest integer. That gives you an integer in the range $0$ to $2^L - 1$. From there, just do a standard binary encoding (or Gray code, whatever) to get a bit string you can manipulate.
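A minimal sketch of that inverse, matching the decode sketch above (again, `encode_weights` and its defaults are just my naming for illustration, and it uses plain binary rather than Gray code):

```python
import numpy as np

def encode_weights(weights, L=16, r_min=-10.0, r_max=10.0):
    """Map floats in [r_min, r_max] to a flat 0/1 vector, L bits per weight."""
    weights = np.asarray(weights, dtype=float)
    # Nearest integer in 0 .. 2**L - 1 at the same relative position in the range.
    ints = np.rint((weights - r_min) / (r_max - r_min) * (2**L - 1)).astype(np.int64)
    ints = np.clip(ints, 0, 2**L - 1)
    # Standard binary expansion, most significant bit first.
    powers = 2 ** np.arange(L - 1, -1, -1)
    return ((ints[:, None] // powers) % 2).reshape(-1)

# A weight of 0.0 lands roughly halfway through the integer range (about 2**15).
print(encode_weights([0.0]))    # -> [1 0 0 ... 0]
```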