I wanted to know how the performance of my net would compare to the same net in TensorFlow. Not too specific, just a rough approximation.
This is hard to answer in specific terms, because benchmarking is hard to do well and the results are often misleading.
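If you do want a rough number for your own net, here is a minimal timing sketch. It assumes a hypothetical `dense_forward` function standing in for one layer of a hand-written net; the `benchmark` helper and its `warmup`/`repeats` parameters are my own names, not part of any framework. The idea is only to discard warm-up runs and report a median rather than a single measurement.

```python
import statistics
import time


def benchmark(fn, *args, warmup=3, repeats=10):
    """Return the median wall-clock time of fn(*args) in seconds."""
    # Warm-up runs let caches and lazy initialisation settle
    # before anything is measured.
    for _ in range(warmup):
        fn(*args)

    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)

    # The median is less sensitive to a single slow outlier than the mean.
    return statistics.median(timings)


def dense_forward(weights, inputs):
    """Hypothetical stand-in for one layer of a hand-written net."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


if __name__ == "__main__":
    weights = [[0.01 * (i + j) for j in range(64)] for i in range(64)]
    inputs = [1.0] * 64
    print(f"median: {benchmark(dense_forward, weights, inputs):.6f} s")
```

To compare against a framework, you would pass the equivalent framework call to the same helper with the same input sizes. On a GPU the work is usually queued asynchronously, so the timed function also has to wait for the device to finish, otherwise you only measure the launch.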
The main point of TensorFlow, as I see it, is to make it easier to use a GPU. It also gives you access to a large ecosystem of programs written in Python/JavaScript that still deliver C++-level performance.
How fast is TensorFlow compared to self written neural nets?
This answers the general question of using TensorFlow/PyTorch vs. a custom solution, rather than your specific question of how much of a speedup you'd get.
There is a relatively recent MIT paper, Differentiable Programming for Image Processing and Deep Learning in Halide, that discusses the trade-off between performance, flexibility, and time spent coding a solution across three different approaches.
Specifically, they compared solutions written in their own language, Halide, against PyTorch and CUDA. The paper gives this example:
Consider the following example. A recent neural network-based
image processing approximation algorithm was built around a new
“bilateral slicing” layer based on the bilateral grid [Chen et al. 2007; Gharbi et al. 2017]. At the time it was published, neither PyTorch
nor TensorFlow was even capable of practically expressing this
computation. As a result, the authors had to define an entirely
new operator, written by hand in about 100 lines of CUDA for the
forward pass and 200 lines more for its manually-derived gradient
(Fig. 2, right). This was a sizeable programming task which took
significant time and expertise. While new operations—added in just
the last six months before the submission of this paper—now make
it possible to implement this operation in 42 lines of PyTorch, this
yields less than 1/3rd the performance on small inputs and runs
out of memory on realistically-sized images (Fig. 2, middle). The
challenge of efficiently deriving and computing gradients for custom
nodes remains a serious obstacle to deep learning.
So in general you'll probably get better performance from TensorFlow/PyTorch than from a custom C++ implementation. In specific cases, though, if you know CUDA on top of C++, you will be able to write more performant programs.
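To give a feel for the "manually-derived gradient" cost mentioned in the quote, here is a toy sketch of what a custom implementation has to do for every new operation: write the forward pass, derive the gradient by hand, and then verify it. The single sigmoid neuron and all function names here are assumptions chosen for illustration; they are not taken from the paper, and the bilateral slicing layer it describes is far more involved.

```python
import math


def forward(w, b, x):
    """Toy 'custom layer': a single sigmoid neuron, y = sigmoid(w*x + b)."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


def manual_grad(w, b, x):
    """Hand-derived gradient of y with respect to (w, b)."""
    y = forward(w, b, x)
    dy_dz = y * (1.0 - y)  # derivative of the sigmoid at z = w*x + b
    return dy_dz * x, dy_dz


def numerical_grad(w, b, x, eps=1e-6):
    """Central finite differences, used only to verify manual_grad."""
    dw = (forward(w + eps, b, x) - forward(w - eps, b, x)) / (2 * eps)
    db = (forward(w, b + eps, x) - forward(w, b - eps, x)) / (2 * eps)
    return dw, db


if __name__ == "__main__":
    w, b, x = 0.5, -0.2, 1.5
    print("manual:   ", manual_grad(w, b, x))
    print("numerical:", numerical_grad(w, b, x))
```

With TensorFlow/PyTorch the gradient comes from automatic differentiation, so this derive-and-check step only comes back when you write your own operator.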