
I see three checksums in a .deb package:

  1. md5sum
  2. sha1
  3. sha256

Why do we need 3 checksums? Can we use any one of these to uniquely identify a Debian package?

Eliah Kagan

2 Answers


Yes, you can use any one of those sums to identify a package.
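As a rough illustration of what each of those sums is, here is a minimal Python sketch that computes all three digests of a file in a single pass. The function name `package_digests` and the chunk size are my own choices for the example, not anything dpkg or apt provides, and the path you pass in is whatever .deb you have downloaded.

```python
import hashlib


def package_digests(path, chunk_size=1 << 20):
    """Return the MD5, SHA-1 and SHA-256 hex digests of a file.

    Hypothetical helper for illustration; the file is read once and
    each chunk is fed to all three hash objects.
    """
    hashers = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}
```

Any one of the three returned strings can then be compared against the value published for the package; if the file is altered, all three change.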

Back in the early days of Debian, before apt, before Ubuntu, back when dpkg roamed the Earth freely as the apex package manager, Debian users manually downloaded packages and then manually ran md5sum to verify a non-corrupt download. md5sum went out of style about 20 years ago, as early iterations of apt began to automatically verify downloads as part of the new repository system.

Debian shifted from md5sum to the more secure sha1, and later to the much more secure sha256, as the project's security gurus determined that ever-greater computing power over the decades made their packages vulnerable to sophisticated attacks.

However, many legacy packaging methods (like debhelper) and infrastructure (like alioth) threw errors if the older hashes were not also generated. Cleaning out legacy infrastructure is a complex problem. It's not the code; it's the people who have set up workflows that rely upon their favorite tools, and don't really want to change. They are volunteers, so compelling change is rarely a realistic option. So infrastructure cleanup is slow. However, note that this community's willingness to openly discuss change, and to accept that change might be slow, is arguably one of Debian's great strengths.

Someday md5 and sha1 will be gone. But Debian isn't quite there yet.

user535733
  • In addition, md5, sha1 and sha256 together are a lot better than any one of them alone. A collision in one of them will probably not be a collision in any of the others. Using several hashing algorithms has trivial overhead, but brings some benefit in terms of difficulty to attack. – vidarlo Aug 13 '19 at 04:50
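The idea in this comment can be sketched in Python as follows: accept the data only if every recorded digest matches, so an attacker would need a simultaneous collision in all of the algorithms. The function name `matches_all` and the shape of the `expected` mapping are assumptions made for the example; this is not how apt itself is implemented.

```python
import hashlib


def matches_all(data: bytes, expected: dict) -> bool:
    """Return True only if every recorded digest matches the data.

    `expected` maps a hashlib algorithm name (e.g. "sha256") to a hex
    digest. An empty mapping is rejected rather than trivially accepted.
    """
    if not expected:
        return False
    return all(
        hashlib.new(name, data).hexdigest() == value.lower()
        for name, value in expected.items()
    )
```

A single mismatching digest is enough to reject the data, even when the other two match.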

This does not really answer the question, but some additional context may nonetheless be appreciated, since "identify a package" can mean many different things. The hash sums tell you that the package's content matches the description (.dsc) file. And yes, any one of these hashes will tell you if that content was changed.

What they do not tell you is whether the content is unchanged since the package left the developer. Anyone could come up with a package and generate a new .dsc file, with matching but different hashes. You want to check the package's signature to ensure that the package is what it should be, matching hashes or not.

You can also sign packages yourself and trust your own signature. This way you can modify packages. The .changes file links the binaries to a source tree. This may then help to "identify" the functionally equivalent packages across the different hardware platforms for which you (or the distribution's build daemons) rebuilt them. But across platforms, the binaries will have different hashes.

For most package management use cases, knowing the package name and version plus the signature is sufficient. The version also gives you a "newer than" ordering, which hashes cannot provide.

smoe