Image channels are not specific to machine learning; they are a basic part of how computers store and process images.
A channel is one number per pixel. Most colour images are stored with red, green and blue channels, as you probably know. Some images are stored in greyscale with just one channel, which holds the brightness (the "white amount") of each pixel.
An RGB image is usually stored like this (an interleaved layout): pixel 0 red amount, pixel 0 green amount, pixel 0 blue amount, pixel 1 red amount, pixel 1 green amount, pixel 1 blue amount, pixel 2 red amount, ...
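The same values could also be rearranged so that each channel is stored as one block (a planar layout): pixel 0 red amount, pixel 1 red amount, pixel 2 red amount, ..., pixel 99999 red amount, pixel 0 green amount, pixel 1 green amount, ..., pixel 99999 green amount, pixel 0 blue amount, ..., pixel 99999 blue amount.
That is less common in image files, although some processing libraries (PyTorch tensors, for example) use it.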
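To make the two layouts concrete, here is a minimal sketch in plain Python that converts between them. The function names are made up for illustration, and it assumes the image is just a flat list of numbers with a known channel count.

```python
def interleaved_to_planar(data, channels=3):
    """Rearrange [R0, G0, B0, R1, G1, B1, ...] into [R0, R1, ..., G0, G1, ..., B0, B1, ...]."""
    if len(data) % channels != 0:
        raise ValueError("data length must be a multiple of the channel count")
    planar = []
    for c in range(channels):
        # Every `channels`-th value, starting at offset c, belongs to channel c.
        planar.extend(data[c::channels])
    return planar


def planar_to_interleaved(data, channels=3):
    """Inverse of interleaved_to_planar."""
    if len(data) % channels != 0:
        raise ValueError("data length must be a multiple of the channel count")
    pixel_count = len(data) // channels
    interleaved = []
    for p in range(pixel_count):
        for c in range(channels):
            # Channel c occupies one contiguous block of pixel_count values.
            interleaved.append(data[c * pixel_count + p])
    return interleaved


# Two RGB pixels: (10, 20, 30) and (40, 50, 60).
pixels = [10, 20, 30, 40, 50, 60]
print(interleaved_to_planar(pixels))  # [10, 40, 20, 50, 30, 60]
```

Both layouts hold exactly the same numbers; only the order differs, so converting back and forth loses nothing.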
A greyscale image only has one channel and it's stored like this: pixel 0 white amount, pixel 1 white amount, pixel 2 white amount, pixel 3 white amount, ...
A black-and-white image also has only one channel (a white channel), but each value in that channel can only be 0 brightness or maximum brightness. Such images can be stored with just 1 bit per pixel.
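As a rough illustration of the 1-bit-per-pixel idea, the sketch below turns 8-bit greyscale values into black-and-white and packs eight pixels into each byte. The function name, the threshold of 128 and the bit order (first pixel in the highest bit) are all assumptions chosen for the example; real file formats each define their own conventions.

```python
def pack_black_and_white(grey, threshold=128):
    """Threshold 8-bit greyscale values and pack the results 8 pixels per byte."""
    packed = bytearray((len(grey) + 7) // 8)  # round up to whole bytes
    for i, value in enumerate(grey):
        if value >= threshold:
            # Most significant bit first: pixel 0 goes into bit 7 of byte 0.
            packed[i // 8] |= 0x80 >> (i % 8)
    return bytes(packed)


# Eight pixels alternating black and white fit into a single byte.
row = [0, 255, 0, 255, 0, 255, 0, 255]
print(pack_black_and_white(row))  # b'U' (the byte 0x55, binary 01010101)
```

Eight greyscale bytes shrink to one byte here, which is where the storage saving comes from.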
Alpha is an extra transparency channel that some pictures have. Alpha 0 means fully transparent. Maximum alpha means fully opaque. Half-maximum alpha means the image is partially see-through at that pixel. Photos usually don't have an alpha channel, but computer graphics that are designed to be displayed on top of other pictures often do.
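Here is a small sketch of how alpha is typically used when drawing one pixel on top of another. It assumes 8-bit channels and "straight" (non-premultiplied) alpha; the function name is made up for illustration.

```python
def blend_pixel(foreground, background, alpha, max_alpha=255):
    """Draw a foreground RGB pixel over a background RGB pixel using straight alpha."""
    a = alpha / max_alpha  # 0.0 = fully transparent, 1.0 = fully opaque
    return tuple(
        round(a * f + (1 - a) * b)  # weighted average of the two colours
        for f, b in zip(foreground, background)
    )


red = (255, 0, 0)
blue = (0, 0, 255)
print(blend_pixel(red, blue, 0))    # (0, 0, 255): only the background shows
print(blend_pixel(red, blue, 255))  # (255, 0, 0): only the foreground shows
```

Values of alpha in between give a mix of the two colours, which is what makes the top image look see-through.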
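There are also more exotic systems like YCbCr, where you have a brightness channel (Y), a blue-difference channel (Cb, roughly how much bluer the pixel is than its brightness alone would suggest), and a red-difference channel (Cr, the same idea for red). Mostly we just convert those to RGB before processing.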
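For reference, converting one YCbCr pixel to RGB can be sketched as below. This assumes the full-range 8-bit variant used by JPEG/JFIF files (BT.601 coefficients, with Cb and Cr centred on 128); other variants, such as studio-range video or BT.709, use different constants. The function name is made up for illustration.

```python
def ycbcr_to_rgb(y, cb, cr):
    """Convert one full-range 8-bit YCbCr pixel (JFIF convention) to RGB."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    # Round to integers and clamp to 0..255, since the maths can overshoot slightly.
    return tuple(max(0, min(255, round(v))) for v in (r, g, b))


# With Cb and Cr both at their midpoint (128) there is no colour, only grey.
print(ycbcr_to_rgb(128, 128, 128))  # (128, 128, 128)
```

In practice an image library does this conversion for you when it loads the file, so you rarely need to write it by hand.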