It depends on the number of classes; we are getting good results with about 40 training examples per class.
A good way to get an idea about this is to run a test with an increasing amount of training data, evaluating the result as you go along. Obviously, with a small set (e.g. 3 sentences per class), accuracy will be very poor, but it should quickly increase and then stabilise at a higher level. Beyond that point, adding more data will probably yield only a small improvement or none at all.
Collecting this data would not only give you confidence in your conclusion, it would also give you a good supporting argument when you have to ask for more training data, or have to justify the classifier's poor performance if the data set does turn out to be too small.
So, set up an automated 10-fold cross-validation, feed increasing amounts of your available data into it, sit back, and graph the results.
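In scikit-learn this kind of experiment is called a learning curve. Below is a minimal sketch of the idea, assuming a simple bag-of-words text classifier; the 20 newsgroups categories are only a stand-in for your own sentences and class labels (the data set is downloaded on first use).

```python
# Sketch: accuracy vs. amount of training data, with 10-fold cross-validation.
# Assumes scikit-learn and matplotlib; swap in your own texts and labels.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_20newsgroups
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.model_selection import learning_curve

# Placeholder data set; replace with your own sentences and class labels.
data = fetch_20newsgroups(subset="train",
                          categories=["sci.med", "rec.autos", "comp.graphics"])

# Simple bag-of-words classifier; any text classifier will do here.
clf = make_pipeline(TfidfVectorizer(), MultinomialNB())

# Score 10%, 20%, ..., 100% of the training data, each with 10-fold CV.
sizes, train_scores, test_scores = learning_curve(
    clf, data.data, data.target,
    train_sizes=np.linspace(0.1, 1.0, 10),
    cv=10, scoring="accuracy")

plt.plot(sizes, test_scores.mean(axis=1), marker="o")
plt.xlabel("Number of training examples")
plt.ylabel("Cross-validated accuracy")
plt.show()
```

If the curve has flattened out well before you reach your full data set, more data is unlikely to help much; if it is still climbing at the right-hand edge, that is your argument for collecting more.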