Theoretically it might be possible, but practically it is not.
You can argue from the Turing machine perspective: the Intel 8080 is Turing complete (given unlimited time and memory), so in principle it can run any program, including a neural network.
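To make the "in principle" point concrete, a neuron's forward pass is nothing but multiplies and adds, which any general-purpose CPU can perform. Below is a rough Python sketch using only integer (fixed-point) arithmetic of the kind an 8-bit CPU could emulate; the scaling factor and the toy values are arbitrary assumptions for illustration, not anything specific to the 8080.

```python
# Rough sketch: one neuron's forward pass using only integer (fixed-point)
# arithmetic, to show the computation needs nothing a general-purpose CPU
# cannot do. Values are scaled by 2**8 (an arbitrary choice).
SCALE = 256  # fixed-point scaling factor (assumption for illustration)

def fixed_point_neuron(inputs, weights, bias):
    """Compute ReLU(w . x + b) where every value is an integer scaled by SCALE."""
    acc = bias * SCALE               # accumulator held at SCALE**2 resolution
    for x, w in zip(inputs, weights):
        acc += x * w                 # integer multiply-accumulate
    acc //= SCALE                    # rescale back to SCALE resolution
    return max(0, acc)               # ReLU, still an integer

# Toy example: inputs ~0.5, 0.25, -1.0 and weights ~0.2, 0.8, 0.1 in fixed point
print(fixed_point_neuron([128, 64, -256], [51, 205, 26], 13))  # ~0.25 * SCALE
```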
In spite of the above, you will face insurmountable challenges in implementing your system.
CPUs are designed to handle calculations sequentially, whereas most neural-network workloads are massively parallel. You need a GPU (or an AI ASIC) that can process them in a massively parallel manner to get a significant speedup.
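The core workload of a neural network is matrix-vector products, and each output element is independent of the others, which is exactly what a GPU exploits. Here is a rough Python sketch (layer sizes are made up for illustration) of the loop a sequential CPU has to grind through one element at a time:

```python
# Rough sketch: the matrix-vector product at the heart of one dense layer.
# Each out[i] depends only on row i of the weights, so a GPU can compute
# thousands of them in parallel; a sequential CPU does them one at a time.
def dense_layer(weights, inputs):
    n_out, n_in = len(weights), len(inputs)
    out = [0.0] * n_out
    for i in range(n_out):           # these iterations are all independent...
        s = 0.0
        for j in range(n_in):        # ...and each one is n_in multiply-adds
            s += weights[i][j] * inputs[j]
        out[i] = s
    return out

# 256 outputs x 784 inputs ~= 200,000 multiply-adds for a single small layer
weights = [[0.01] * 784 for _ in range(256)]
print(dense_layer(weights, [1.0] * 784)[0])
```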
Additionally, GPUs are excellent at floating-point math. Floating-point arithmetic handles real numbers with fractional parts, which is exactly what the weights and activations of a neural network are. For example, an Intel Core i7-6700K is capable of about 200 GFLOPS (billions of floating-point operations per second), while an NVIDIA GTX 1080 GPU is capable of about 8,900 GFLOPS, a difference of roughly 45x. (Tyler J 2017)
If you decide to use the Intel 8080 (0.29 MIPS at 2 MHz, and no floating-point hardware at all), you would need millions of processors, and an enormous budget, just to reach one GFLOPS. You can follow this link to see how the cost of computing has fallen over the years: https://en.wikipedia.org/wiki/FLOPS
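A rough back-of-envelope calculation makes the gap concrete. The GPU and CPU figures are the ones quoted above; the 8080 figure is a loose assumption, since with no FPU every floating-point operation has to be emulated, here assumed to cost about 1,000 integer instructions.

```python
# Back-of-envelope comparison; the 8080 per-FLOP cost is an assumption.
i7_flops  = 200e9       # ~200 GFLOPS (Intel Core i7-6700K, as cited above)
gtx_flops = 8.9e12      # ~8,900 GFLOPS (NVIDIA GTX 1080, as cited above)
mips_8080 = 0.29e6      # 0.29 million instructions per second at 2 MHz
insns_per_flop = 1000   # assumed cost of software-emulated floating point

flops_8080 = mips_8080 / insns_per_flop             # ~290 FLOP/s per chip
print(f"GPU vs CPU speedup: ~{gtx_flops / i7_flops:.0f}x")
print(f"8080 chips needed for 1 GFLOPS: ~{1e9 / flops_8080:,.0f}")
```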
Another problem concerns RAM. To run a neural network efficiently you need to load it fully into memory, and it will be a huge challenge to squeeze a neural network into the 64 KB address space that the Intel 8080 can access.
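To put the 64 KB figure in perspective, here is a rough parameter count for even a very small fully connected network on MNIST-sized inputs (the 784-64-10 layer sizes are an illustrative assumption):

```python
# Rough sketch: memory needed just for the weights of a tiny MLP
# (784 -> 64 -> 10, an illustrative choice for 28x28 MNIST images).
layers = [784, 64, 10]
params = sum(n_in * n_out + n_out            # weights + biases per layer
             for n_in, n_out in zip(layers, layers[1:]))
print(f"parameters: {params:,}")                          # ~50,000
print(f"as 32-bit floats: {params * 4 / 1024:.0f} KB")    # ~200 KB
print(f"even as 1-byte weights: {params / 1024:.0f} KB")  # ~50 KB of a 64 KB space
```

Even quantized to one byte per weight, this toy network would eat most of the address space before you account for code, activations and the data itself.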
Interconnect bandwidth will also be a huge bottleneck. Modern GPUs support high-speed links for GPU-to-GPU communication: for example, NVIDIA's NVLink has a peak speed of around 80 GB/s, while PCIe 3.0 runs at around 30 GB/s. Without high-bandwidth interconnects you will not achieve any speedup, despite using a distributed system with many processors.
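A rough illustration of why this matters: in data-parallel training, each processor has to exchange a full set of gradients every step. The 25-million-parameter model and the 9600 bit/s serial link below are assumptions purely for illustration; the NVLink and PCIe figures are the ones quoted above.

```python
# Rough sketch: time to ship one full set of gradients between processors.
# Model size and the serial-link speed are illustrative assumptions.
grad_bytes = 25e6 * 4                     # 25M parameters as 32-bit floats

links = {
    "NVLink (~80 GB/s)":   80e9,
    "PCIe 3.0 (~30 GB/s)": 30e9,
    "9600 bit/s serial":   9600 / 8,      # bytes per second
}
for name, bytes_per_s in links.items():
    print(f"{name}: {grad_bytes / bytes_per_s:.6g} s per gradient exchange")
```

Over the assumed serial link, a single gradient exchange takes on the order of a day, which is why a slow interconnect erases any gain from adding processors.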
You will also face significant challenges in programming neural-network algorithms for your 8080-based system. Most programmers today use object-oriented programming, which enables code reuse and simplifies design and maintenance, and OOP languages such as Java, C++ and Python have libraries that greatly simplify the process of programming a neural network.
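As a rough illustration of how much those libraries take care of, a small classifier in Keras is only a few lines. The layer sizes and training settings here are arbitrary; this is a sketch, not a tuned model.

```python
# Rough sketch: the kind of high-level code modern libraries make possible.
# Layer sizes and optimizer choice are arbitrary illustration, not a recipe.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(x_train, y_train, epochs=5)  # training is one more line
```

On the 8080 you would be writing the forward pass, backpropagation and the numerics yourself, in assembly or a primitive compiler.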
When the 8080 processor was designed back in 1974, OOP was still in its infancy, and the programming tools of the day, i.e. compilers and debuggers, would be considered archaic by today's standards. I mean, good luck debugging that system.
Last but not least, you need big data (or at least a substantial dataset) to train your neural network on; without it your model will be ineffective. Storage media of the 8080 era held on the order of a few hundred KB per disk. For comparison, even the small MNIST dataset is roughly 11 MB compressed, and realistic datasets such as ImageNet run to well over 100 GB. An 8080-based system simply cannot hold the storage any real ML dataset needs.
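For a rough sense of scale against period storage (the ~250 KB figure for an 8-inch floppy and the dataset sizes are approximations):

```python
# Rough sketch: dataset sizes vs. 8080-era storage. Floppy capacity is an
# approximation of a single-density 8-inch disk; dataset sizes are approximate.
floppy_kb = 250                                # ~250 KB per 8-inch floppy
datasets_mb = {"MNIST (compressed)": 11, "ImageNet (ILSVRC)": 150_000}
for name, mb in datasets_mb.items():
    print(f"{name}: ~{mb * 1024 / floppy_kb:,.0f} floppies")
```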
For the above reasons, my conclusion is that the 8080 processor lacks the resources necessary to implement any effective DL algorithm, and networking millions of them together will not provide any substantial speedup either.