2

I'm looking to search my directory of sql queries and for the most prevalent database tables in all of the files. To do so, I'm hoping to use grep to isolate these names but I can't quite figure out the regular expression to pull just the table names.

I figure in order to isolate the tables, I'm looking for the string of text after the words FROM or JOIN.

I hoping folks can help me build off of this, but currently my grep statements are just finding the occurrences of these words, but how do I pull the string directly proceeding these keywords?

grep -R "FROM" | grep -R  "JOIN"
  • 2
    It would be really helpful if you could add in a part of an (example) file. We can't read your mind, y'know. – grooveplex Jul 11 '16 at 16:33
  • And, that won't work. The first grep has no input other than STDIN, which it ignores, and produces no output for the (ignored) input of the second grep. Read http://askubuntu.com/help/how-to-ask – waltinator Jul 11 '16 at 16:58

1 Answers1

3

I pulled some sample queries from Google and made a couple sql files with the following data in them for example purposes:

file1.sql

SELECT Orders.OrderID, Customers.CustomerName, Orders.OrderDate
FROM Orders
INNER JOIN Customers
ON Orders.CustomerID=Customers.CustomerID;
SELECT Customers.CustomerName, Orders.OrderID
INTO CustomersOrderBackup2013
FROM CustomersOriginal
LEFT JOIN Orders
ON Customers.CustomerID=Orders.CustomerID;

file2.sql

-- Dumping data for table `user_details`
INSERT INTO `user_details` (`user_id`, `username`, `first_name`, `last_name`, `gender`, `password`, `status`) VALUES
(1, 'rogers63', 'david', 'john', 'Female', 'e6a33eee180b07e563d74fee8c2c66b8', 1);

The first thing you're looking for is the OR operator for grep:

$ grep 'INTO\|FROM\|JOIN' *.sql
file1.sql:FROM Orders
file1.sql:INNER JOIN Customers
file1.sql:INTO CustomersOrderBackup2013
file1.sql:FROM CustomersOriginal
file1.sql:LEFT JOIN Orders
file2.sql:INSERT INTO `user_details` (`user_id`, `username`, `first_name`, `last_name`, `gender`, `password`, `status`) VALUES

That will get the entire lines from the original files. Next let's use sed to pull out the table names:

$ grep 'INTO\|FROM\|JOIN' *.sql | sed -r 's/.*?(FROM|INTO|JOIN)\s`?([^` ]*).*/\2/g'
Orders
Customers
CustomersOrderBackup2013
CustomersOriginal
Orders
user_details

Finally, let's eliminate duplicate tables:

$ grep 'INTO\|FROM\|JOIN' *.sql | sed -r 's/.*?(FROM|INTO|JOIN)\s`?([^` ]*).*/\2/g' | sort -u
Customers
CustomersOrderBackup2013
CustomersOriginal
Orders
user_details
DynamicBits
  • 453
  • 2
  • 8