
I want to create a custom Ubuntu Server distro that installs automatically and runs only one specific program on the target machine. I've found a very good guide here:

How to create a customized ubuntu server iso

I want to remove all the bloat that comes with a normal install. Is there a way to determine which packages are not used at all, so that I can remove them from my distro?

On a side note, if you do follow the guide mentioned above and you're running Ubuntu 14.04, use the following link to fix the bug with kickstart (I don't have enough reputation to add it to the comments, so if you do, please do):

kickstart bug fix ubuntu 14.04

10k3y3
  • The easiest way is to start with the basics and add what you need. There are distros like Arch that are built like this. – Nattgew Oct 15 '14 at 15:58
  • I am personally a big fan of arch and run it on my desktop, however these systems will be installed in remote locations and need to be maintained by people who have limited Linux knowledge. We standardise on Ubuntu to keep a "central" knowledge base and to make "fault finding" simpler across multiple sites. – 10k3y3 Oct 16 '14 at 08:53
  • From looking here it seems like if you don't choose any of the services, it's going to be pretty basic to begin with. You'd have to be more specific about the program you need to run, and I'd have to look at a server install to see what's there. – Nattgew Oct 16 '14 at 14:54
  • It's custom software that we use to record data from an instrument connected to a serial port. I was hoping to make the distro as small as possible, but I reckon it's going to be more effort than it's worth if I want to stick to Ubuntu. Thanks for your input. – 10k3y3 Oct 20 '14 at 08:26
  • You could also look into Ubuntu Core. – Nattgew Oct 20 '14 at 14:34

1 Answer


You can check which files were accessed by a program using strace:

-e trace=file
   Trace all system calls which take a file name as  an
   argument.   You can think of this as an abbreviation
   for  -e trace=open,stat,chmod,unlink,...   which  is
   useful   to   seeing   what  files  the  process  is
   referencing.  Furthermore,  using  the  abbreviation
   will  ensure  that  you don't accidentally forget to
   include a call like  lstat  in  the  list.   Betchya
   woulda forgot that one.

So, something like:

$ strace -fe trace=file -o log /bin/bash -c ''
$ awk -F\" '!a[$2]++&&/\//{print $2}' log | xargs dpkg -S 2>/dev/null | awk -F: '!a[$1]++{print $1}'
bash
libtinfo5
libc6

Be warned, though: this output is not particularly useful one way or the other:

  • It doesn't tell you which packages aren't used by a command, because a package can be used in ways other than accessing a file (or the command may have accessed a file that a package created but that isn't recorded in dpkg's database).
  • It doesn't tell you which packages the command could have done without. For example, if I trace an interactive bash session, the number of packages is a lot higher, mostly because the completion scripts those packages provide are counted too. The list even includes GRUB! And GRUB clearly isn't needed by bash.
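If you want to experiment with the strace approach anyway, the one-liner is easier to tweak when split into two steps. This is only a rough sketch: `strace_paths` and `paths_to_packages` are names I made up, and unlike the one-liner it keeps only first arguments that are absolute paths, rather than any line containing a slash.

```shell
#!/bin/sh
# strace_paths and paths_to_packages are hypothetical helper names.

# Print each unique absolute path that appears as the first quoted
# argument of a system call in a strace log.
strace_paths() {
    # Split on double quotes: field 2 is the first quoted string.
    awk -F'"' '$2 ~ /^\// && !seen[$2]++ { print $2 }' "$1"
}

# Read paths on stdin and print each owning package once.
paths_to_packages() {
    # dpkg -S prints "package: /path"; complaints about paths that
    # no package owns go to stderr and are discarded.
    xargs dpkg -S 2>/dev/null | awk -F: '!seen[$1]++ { print $1 }'
}

# Usage (needs strace and dpkg):
#   strace -f -e trace=file -o log /bin/bash -c ''
#   strace_paths log | paths_to_packages
```

Keeping the path list as a separate step also lets you look at the paths that dpkg doesn't know about, which the combined pipeline silently drops.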

What you should do is start with ubuntu-minimal and install only those things needed for the program to run over and above that (you'll know which ones are needed when the program dies of mysterious errors).
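One common source of those mysterious errors is a missing shared library. Assuming your program is a dynamically linked binary (I don't know that it is), something like this sketch would list the libraries the linker can't resolve; `missing_libs` is a made-up helper name.

```shell
#!/bin/sh
# missing_libs is a hypothetical helper name.
# Print the shared libraries that the dynamic linker cannot find
# for the given binary.
missing_libs() {
    # ldd prints "libfoo.so.1 => not found" for unresolved libraries.
    ldd "$1" 2>/dev/null | awk '/not found/ { print $1 }'
}

# Usage (the path is a placeholder):
#   missing_libs /path/to/your/program
```

Note that `dpkg -S` only searches installed packages, so to find which package ships a missing library you'd need something like `apt-file search`, if you have apt-file installed.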

muru