2

I would like to know the total size of the Ubuntu repositories here. Is there a command (perhaps involving wget) that reports the total size of all files inside this directory without downloading them?

Maythux
a06e

5 Answers

2

Doing a quick du -hd1 on my own personal Ubuntu mirror of the official repository containing 10.04 Lucid, 12.04 Precise and 14.04 Trusty, I'm consuming 418GB of disk space (not including the Extras and Partner repositories):

$ du -hd1
1.1G    ./dists
417G    ./pool
418G    .
$

$ du -hd1 dists
160M    dists/lucid
2.1M    dists/lucid-backports
42M     dists/lucid-proposed
58M     dists/lucid-security
93M     dists/lucid-updates
200M    dists/precise
2.4M    dists/precise-backports
71M     dists/precise-proposed
59M     dists/precise-security
102M    dists/precise-updates
256M    dists/trusty
888K    dists/trusty-backports
40M     dists/trusty-proposed
7.4M    dists/trusty-security
16M     dists/trusty-updates
1.1G    dists
$

$ du -hd1 pool
217G    pool/universe
171G    pool/main
5.4G    pool/restricted
24G     pool/multiverse
417G    pool
$

My mirror contains 32-bit, 64-bit and source data, updated once every 24 hours.

Remember that certain packages are shared between releases. Even if each release's mirror were, for argument's sake, 200GB on its own, combining three releases would not automatically consume 600GB, because the mirror keeps only one copy of each unique file.

Let's be honest, 500GB across three LTS releases is not a whole lot of disk space these days...

1

I think this answers the question well, as it will allow you to get the directory size of any open directory (not just a repo [apt-mirror]) without downloading any files. It's also fairly simple and quick.

TL;DR

Install rclone and replace the URL with whatever you want.

Install Rclone (Binaries available here)

curl https://rclone.org/install.sh | sudo bash

Get Directory Size (Replace URL with any open directory, make sure not to remove :http:)

rclone size --http-url http://ubuntu.uni-klu.ac.at/ubuntu/pool/ :http:

Explanation

Using rclone + http with optional mount will do the trick.

This gives you the freedom to check the size in several ways. Either run rclone size http: directly, or run rclone mount http: directory/, then cd directory/ and use du -sh, du -hd1, ncdu (from here) or (NOT recommended) ls -shR.

This might be your best option:

You might want to avoid hammering the server by adjusting the values and optionally adding/removing --fast-list in this command:

rclone size http: -v --tpslimit 5 --bwlimit 500K --checkers 5 --fast-list

Adjust up or down according to your needs and what you think the server can handle. For example, in just a couple minutes, I was able to use rclone size on a server that I thought would be fine with it, and got these results returned.

rclone size --http-url http://apollo.sese.asu.edu/data/ :http: --checkers 100

Total objects: 195669

Total size: 123.619 TBytes (135920738673216 Bytes)
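If you want the byte count in a script rather than in the human-readable summary, rclone size also accepts a --json flag. The snippet below is only a rough sketch: the helper name bytes_from_rclone_json is made up for illustration, and it assumes the JSON output contains a numeric "bytes" field.

```shell
# Hypothetical helper: extract the "bytes" field from `rclone size --json` output
# read on stdin. Assumes the field is a plain integer.
bytes_from_rclone_json() {
    sed -n 's/.*"bytes":[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Example usage (this queries the server, so it is left commented out):
# rclone size --json --http-url http://ubuntu.uni-klu.ac.at/ubuntu/pool/ :http: | bytes_from_rclone_json
```

Keeping the parsing in a separate function means the same throttling flags (--tpslimit, --checkers and so on) can be added to the rclone call without touching the extraction step.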

1

Without being able to SSH into the server and run du on the directory it doesn't seem likely you can get this info. However, you may be able to use wget --spider for this purpose. Source
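As a rough sketch of how the wget --spider idea might look: in spider mode wget requests each file without saving it and logs a "Length: N ..." line whenever the server reports a size, so those lines can be summed. The function names sum_lengths and spider_size are made up for illustration. The sketch assumes the server sends Content-Length headers and that wget runs in the C locale so the "Length:" label is not translated. Directory index pages are still fetched to discover links and may add a small overcount.

```shell
# Hypothetical helper: add up the byte counts from wget's "Length: N ..." log lines.
# Lines such as "Length: unspecified" are skipped because they carry no number.
sum_lengths() {
    awk '/^Length: [0-9]+/ { total += $2 } END { printf "%.0f\n", total }'
}

# Hypothetical wrapper: crawl a URL in spider mode (nothing is saved) and sum the sizes.
# wget writes its log to stderr, hence the 2>&1.
spider_size() {
    LC_ALL=C wget --spider -r -np "$1" 2>&1 | sum_lengths
}

# Example usage (this crawls the server, so it is left commented out):
# spider_size http://ubuntu.uni-klu.ac.at/ubuntu/pool/main/
```

This still issues one request per file, so on a large repository it can take a long time and puts load on the server.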

larouxn
0

Just an idea.

Start by recursively downloading all index.html files with wget:

wget -r -np -A "*.html" http://ubuntu.uni-klu.ac.at/ubuntu/pool/main

Then run another command in the same folder (you can do this while the first command is still running):

# Sum the K/M sizes listed in the downloaded index pages (result in bytes).
find . -type f -print0 | xargs -0 cat | grep -oP '[0-9]+(\.[0-9]+)?[KM]' | \
sed 's/\([0-9.]*\)K/(\1*1024)/g; s/\([0-9.]*\)M/(\1*1024*1024)/g' | \
paste -sd+ | bc

The number that the command prints is the approximate total size in bytes (the index pages list rounded sizes). Note that you have to wait for the first command to finish before the second command prints the correct total.

chaos
0

You can use apt-mirror:

Install it via:

sudo apt-get install apt-mirror

Configure it as described in this tutorial, then run:

sudo apt-mirror

It will report the size of the repositories. The advantage of apt-mirror over the other answers is that you can find the size of each repository separately: for example, keep only main and comment out the others to get the size of main alone, and so on.
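For illustration, a minimal /etc/apt/mirror.list for measuring only main might look like the fragment below. This is a sketch, not the linked tutorial's configuration: the mirror URL, the release name (trusty) and the base_path are assumptions, so substitute your own.

```
# Assumed storage location for the mirror
set base_path /var/spool/apt-mirror

# Only main is enabled, so apt-mirror reports the size of main alone
deb http://archive.ubuntu.com/ubuntu trusty main
# deb http://archive.ubuntu.com/ubuntu trusty universe
# deb http://archive.ubuntu.com/ubuntu trusty multiverse
# deb http://archive.ubuntu.com/ubuntu trusty restricted

clean http://archive.ubuntu.com/ubuntu
```

Uncommenting one component at a time and rerunning gives a per-component figure. Be aware that apt-mirror starts downloading after it prints the size, so interrupt it if you only want the number.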

Maythux