- Mar 01, 2020
-
-
Simon Tatham authored
We already had a function pcs_require_residue_1() which lets you ask PrimeCandidateSource to ensure it only returns numbers congruent to 1 mod a given value. pcs_require_residue_1_mod_prime() is the same, but it also records the number in a list of prime factors of n-1, which can be queried later. The idea is that if you're generating a DSA key, in which the small prime q must divide p-1, the upcoming provable generation algorithm will be able to recover q from the PrimeCandidateSource and use it as part of the primality certificate, which reduces the number of bits of extra prime factors it also has to make up.
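For illustration, the construction this enables can be sketched in Python (the names and the Fermat-test stand-in here are invented for the sketch; the real code is C and uses Miller-Rabin):

    import secrets

    def fermat_test(n, rounds=32):
        # Probabilistic stand-in for a proper Miller-Rabin test: fine
        # for a sketch, not for production use.
        return n > 3 and all(pow(secrets.randbelow(n - 3) + 2, n - 1, n) == 1
                             for _ in range(rounds))

    def gen_dsa_p(q, bits):
        # Every candidate of the form p = 2*k*q + 1 is congruent to
        # 1 mod q, so q is a known prime factor of p-1 from the outset -
        # exactly the fact a Pocklington-style certificate can reuse.
        while True:
            k = secrets.randbits(bits - q.bit_length() - 1)
            p = 2 * k * q + 1
            if p.bit_length() == bits and fermat_test(p):
                return p

    p = gen_dsa_p(1117, 64)      # 1117 is prime
    assert (p - 1) % 1117 == 0   # q divides p-1 by construction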
-
Simon Tatham authored
This implements an extended form of primality verification using certificates based on Pocklington's theorem. You make a Pockle object, and then try to convince it that one number after another is prime, by means of providing it with a list of prime factors of p-1 and a primitive root. (Or just by saying 'this prime is small enough for you to check yourself'.)

Pocklington's theorem requires you to have factors of p-1 whose product is at least the square root of p. I've extended that to support factorisations only as big as the cube root, via an extension of the theorem given in Maurer's paper on generating provable primes.

The Pockle object is more or less write-only: it has no methods for reading out its contents. Its only output channel is the return value when you try to insert a prime into it: if it isn't sufficiently convinced that your prime is prime, it will return an error code. So anything for which it returns POCKLE_OK you can be confident of.

I'm going to use this for provable prime generation. But exposing this part of the system as an object in its own right means I can write a set of unit tests for this specifically. My negative tests exercise all the different ways a certification can be erroneous or inadequate; the positive tests include proofs of primality of various primes used in elliptic-curve crypto. The Poly1305 proof in particular is taken from a proof in DJB's paper, which has exactly the form of a Pocklington certificate, only written in English.
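The square-root case of the theorem is easy to state as a standalone checker. A minimal sketch in Python (an illustration of the maths, not of the Pockle API; it assumes the listed factors are themselves already known to be prime, as they would be recursively in a real certificate):

    from math import gcd, isqrt

    def pocklington_ok(p, factors, g):
        # Accumulate F = the part of p-1 built from the given primes.
        F = 1
        for q in set(factors):
            while (p - 1) % (F * q) == 0:
                F *= q
        if F <= isqrt(p):
            return False    # not enough of p-1 factored: can't conclude
        if pow(g, p - 1, p) != 1:
            return False    # Fermat condition fails
        # For each listed prime q, g^((p-1)/q) - 1 must be coprime to p.
        return all(gcd(pow(g, (p - 1) // q, p) - 1, p) == 1
                   for q in set(factors))

    # 65537 = 2^16 + 1: p-1 is fully factored as 2^16, and 3 is a
    # primitive root, so ([2], 3) certifies it.
    assert pocklington_ok(65537, [2], 3)

The cube-root extension weakens the F > sqrt(p) condition at the cost of extra conditions on how p-1 decomposes relative to F.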
-
Simon Tatham authored
Even simpler than the existing mp_add_integer_into.
-
Simon Tatham authored
This takes ordinary integer square and cube roots (i.e. not mod anything) of mp_ints.
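A sketch of the usual integer Newton iteration for this (illustrative Python, not the mp_int code):

    def nth_root(x, n):
        # Largest r with r**n <= x, by Newton iteration on integers,
        # starting from a guaranteed overestimate.
        if x < 2:
            return x
        r = 1 << -(-x.bit_length() // n)    # 2^ceil(bits/n) > x^(1/n)
        while True:
            s = ((n - 1) * r + x // r ** (n - 1)) // n
            if s >= r:
                return r
            r = s

    assert nth_root(12345**3 + 1, 3) == 12345
    assert nth_root(12345**3 - 1, 3) == 12344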
-
Simon Tatham authored
The functions primegen() and primegen_add_progress_phase() are gone. In their place is a small vtable system with two methods corresponding to them, plus the usual admin of allocating and freeing contexts. This API change is the starting point for being able to drop in different prime generation algorithms at run time in response to user configuration.
-
Simon Tatham authored
Now we don't even bother with picking an mp_int base value and a small adjustment; we just generate a random mp_int, and if it's congruent to anything we want to avoid, throw it away and try again. This should cause us to select completely uniformly from the candidate values in the available range. Previously, the delta system was introducing small skews at the start and end of the range (values very near there were less likely to turn up, because they fell within the delta radius of a smaller set of base values).

I was worried that doing this would be slower, because of having to do a big pile of 'reduce mp_int mod small thing' every time round the loop: the virtue of the delta system is that you can set up the residues of your base value once and then try several deltas using only normal-sized integer operations.

But now I look more closely, we were computing _all_ the residues of the base point every time round the loop (several thousand of them), whereas now we're very likely to be able to throw a candidate away after only two or three, if it's divisible by one of the smallest primes, which are also the ones most likely to get in the way. So it's probably actually _faster_ than the old system (although, since uniformity was my main aim, I haven't timed it; I've only noticed that it seems to be fast _enough_).
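The shape of the new scheme, reduced to a Python sketch (list of small primes abbreviated; the real code screens against several thousand):

    import secrets

    SMALL_PRIMES = [2, 3, 5, 7, 11, 13]     # the real list is much longer

    def candidate(bits):
        while True:
            n = secrets.randbits(bits) | (1 << (bits - 1))  # force top bit
            # Most rejected candidates fail on the first prime or two,
            # which is why this is cheaper than it looks.
            if all(n % p != 0 for p in SMALL_PRIMES):
                return n

Every surviving candidate is drawn uniformly from the acceptable values in the range, which was the point of the change.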
-
Simon Tatham authored
Having just written a comment about how it was almost inconceivably improbable that you _wouldn't_ be successful in finding a suitable g on the very first number you tried, I couldn't help noticing that in fact my very next DSA key generation test had to try twice. Had I made a mistake in my probability theory? No, it turns out: I find g by raising consecutive numbers to the power (p-1)/q and looking to see if they're not 1, but I start with 1 itself, which, along with -1, is one of the only numbers that _can't_ work! Save a bit of pointless effort and iterate up from 2 instead.
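The fixed search, as a toy Python sketch (tiny p and q for the example; the real code works on mp_ints):

    def find_g(p, q):
        # g must have order q mod p; any h with h^((p-1)/q) != 1 will
        # do, and h = 1 can never qualify, so start from 2.
        for h in range(2, p):
            g = pow(h, (p - 1) // q, p)
            if g != 1:
                return g

    g = find_g(23, 11)            # finds g on the first try, with h = 2
    assert g != 1 and pow(g, 11, 23) == 1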
-
Simon Tatham authored
Thanks to Pavel Kryukov's CI for pointing this out: VS 2015 doesn't support C99 hex floating-point literals. (VS 2017 does, and in general we're treating this as a C99-permitted code base these days. So I was tempted to just increase the minimum required compiler version and leave this code as it is. But since the use of that particular floating literal was so totally frivolous and unnecessary, I think I'll leave that for another day when it's more important!)
-
- Feb 29, 2020
-
-
Simon Tatham authored
This further cleans up the prime-generation code, to the point where the main primegen() function has almost nothing in it. Also now I'll be able to reuse M-R as a primitive in more sophisticated alternatives to primegen().
-
Simon Tatham authored
The old API was one of those horrible things I used to do when I was young and foolish, in which you have just one function, and indicate which of lots of things it's doing by passing in flags. It was crying out to be replaced with a vtable.

While I'm at it, I've reworked the code on the Windows side that decides what to do with the progress bar, so that it's based on actually justifiable estimates of probability rather than magic integer constants. Since computers are generally faster now than they were at the start of this project, I've also decided there's no longer any point in making the fixed final part of RSA key generation bother to report progress at all. So the progress bars are now only for the variable part, i.e. the actual prime generations.

(This is a reapplication of commit a7bdefb3, without the Miller-Rabin refactoring accidentally folded into it. Also, this time I've added -lm to the link options; for some reason its absence _didn't_ cause me a link failure last time round. No idea why not.)
-
Simon Tatham authored
This reverts commit a7bdefb3. I had accidentally mashed it together with another commit. I did actually want to push both of them, but I'd rather push them separately! So I'm backing out the combined blob, and I'll re-push them with their proper comments and explanations.
-
Simon Tatham authored
The old API was one of those horrible things I used to do when I was young and foolish, in which you have just one function, and indicate which of lots of things it's doing by passing in flags. It was crying out to be replaced with a vtable.

While I'm at it, I've reworked the code on the Windows side that decides what to do with the progress bar, so that it's based on actually justifiable estimates of probability rather than magic integer constants. Since computers are generally faster now than they were at the start of this project, I've also decided there's no longer any point in making the fixed final part of RSA key generation bother to report progress at all. So the progress bars are now only for the variable part, i.e. the actual prime generations.
-
Simon Tatham authored
The more features and options I add to PrimeCandidateSource, the more cumbersome it will be to replicate each one in a command-line option to the ultimate primegen() function. So I'm moving to an API in which the client of primegen() constructs a PrimeCandidateSource themself, and passes it in to primegen().

Also, changed the API for pcs_new() so that you don't have to pass 'firstbits' unless you really want to. The net effect is that even though we've added flexibility, we've also simplified the call sites of primegen() in the simple case: if you want a 1234-bit prime, you just need to pass pcs_new(1234) as the argument to primegen, and you're done.

The new declaration of primegen() lives in ssh_keygen.h, along with all the types it depends on. So I've had to #include that header in a few new files.
-
Simon Tatham authored
I just ran into a bug in which the testcrypt child process was cleanly terminated, but at least one Python object was left lying around containing the identifier of a testcrypt object that had never been freed. On program exit, the Python reference count on that object went to zero, the __del__ method was invoked, and childprocess.funcall started a _new_ instance of testcrypt just so it could tell it to free the object identifier - which, of course, the new testcrypt had never heard of!

We can already tell the difference between a ChildProcess object which has no subprocess because it hasn't yet been started, and one which has no subprocess because it's terminated: the latter has exitstatus set to something other than None. So now we enforce by assertion that we don't ever restart the child process, and the __del__ method avoids doing anything if the child has already finished.
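The shape of the guard, with illustrative names rather than the exact testcrypt.py code:

    class ChildProcess:
        def __init__(self):
            self.exitstatus = None           # None == not yet terminated
        def funcall(self, fn, args):
            pass                             # stub: real code talks to the child
        def terminate(self):
            self.exitstatus = 0              # never restarted after this

    class Value:
        def __init__(self, proc, ident):
            self.proc, self.ident = proc, ident
        def __del__(self):
            # A dead child's object identifiers mean nothing, and trying
            # to free one would (wrongly) have spawned a fresh testcrypt.
            if self.proc.exitstatus is None:
                self.proc.funcall("free", [self.ident])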
-
Simon Tatham authored
When I'm writing Python using the testcrypt API, I keep finding that I instinctively try to call vtable methods as if they were actual methods of the object. For example, calling key.sign(msg, 0) instead of ssh_key_sign(key, msg, 0). So this change to the Python side of the testcrypt mechanism panders to my inappropriate finger-macros by making them work!

The idea is that I define a set of pairs (type, prefix), such that any function whose name begins with the prefix and whose first argument is of that type will be automatically translated into a method on the Python object wrapping a testcrypt value of that type. For example, any function of the form ssh_key_foo(val_ssh_key, other args) will automatically be exposed as a method key.foo(other args), simply because (val_ssh_key, "ssh_key_") appears in the translation table.

This is particularly nice for the Python 3 REPL, which will let me tab-complete the right set of method names by knowing the type I'm trying to invoke one on. I haven't decided yet whether I want to switch to using it throughout cryptsuite.py.

For namespace-cleanness, I've also renamed all the existing attributes of the Python Value class wrapper so that they start with '_', to leave the space of sensible names clear for the new OOish methods.
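One way to get that effect in Python, sketched with a single-entry table and a hypothetical global function (the real mechanism differs in detail):

    METHOD_PREFIXES = [("ssh_key", "ssh_key_")]   # (type name, fn prefix)

    class Value:
        def __init__(self, typename, ident):
            self._typename, self._ident = typename, ident
        def __getattr__(self, name):
            for typename, prefix in METHOD_PREFIXES:
                if self._typename == typename:
                    fn = globals()[prefix + name]  # e.g. ssh_key_sign
                    # bind self as the function's first argument
                    return lambda *args: fn(self, *args)
            raise AttributeError(name)

so that key.sign(msg, 0) resolves to ssh_key_sign(key, msg, 0), and dir() on the wrapper can be taught to advertise the right method names for tab completion.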
-
Jacob Nevins authored
As a simple alias for "curve25519-sha256@libssh.org". This name is now standardised in RFC8731 (and, since 77516578, we have the extra validation mandated by the RFC compared to the libssh spec); also it's been in OpenSSH for ages (since 7.4, 2016-12, 0493766d56).
-
- Feb 28, 2020
-
-
Simon Tatham authored
This expands our previous check for the public value being zero, to take in all the values that will _become_ zero after not many steps. The actual check at run time is done using the new is_infinite query method for Montgomery curve points. Test cases in cryptsuite.py cover all the dangerous values I generated via all that fiddly quartic-solving code. (DJB's page http://cr.yp.to/ecdh.html#validate also lists these same constants. But working them out again for myself makes me confident I can do it again for other similar curves, such as Curve448.)

In particular, this makes us fully compliant with RFC 7748's demand to check we didn't generate a trivial output key, which can happen if the other end sends any of those low-order values.

I don't actually see why this is a vital check to perform for security purposes, for the same reason that we didn't classify the bug 'diffie-hellman-range-check' as a vulnerability: I can't really see what the other end's incentive might be to deliberately send one of these nonsense values (and you can't do it by accident - none of these values is a power of the canonical base point). It's not that a DH participant couldn't possibly want to secretly expose the session traffic - but there are plenty of more subtle (and less subtle!) ways to do it, so you don't really gain anything by forcing them to use one of those instead. But the RFC says to check, so we check.
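The run-time part of the check is the easy bit once the representation is in view: in (X, Z) projective coordinates on a Montgomery curve, the identity is exactly Z == 0, and the clamped scalars are multiples of 8, so every low-order input collapses to the identity under the ladder. A sketch in Python (illustrative, not PuTTY's C):

    def ladder_output_to_shared_secret(X, Z, p):
        # is_infinite: the identity is exactly Z == 0.
        if Z % p == 0:
            raise ValueError("low-order public value: trivial shared secret")
        return (X * pow(Z, p - 2, p)) % p    # normalise to affine x

Rejecting Z == 0 is equivalent to RFC 7748's instruction to reject an all-zero output, since normalising the identity would yield 0.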
-
Simon Tatham authored
This uses the new quartic-solver mod p to generate all the values in Curve25519 that can end up at the curve identity by repeated application of the doubling formula.
-
Simon Tatham authored
I'm going to want to use this for finding special values in elliptic curves' ground fields. In order to solve cubics and quartics in F_p, you have to work in F_{p^2}, for much the same reasons that you have to be willing to use complex numbers if you want to solve general cubics over the reals (even if all the eventual roots turn out to be real after all). So I've also introduced another arithmetic class to work in that kind of field, and a shim that glues that on to the cyclic-group root finder from the previous commit.
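The arithmetic itself is small: represent elements of F_{p^2} as a + b*sqrt(d) for some fixed non-residue d mod p. A hypothetical Python class in that spirit (not the actual shim):

    class QF:
        # Elements a + b*sqrt(d) of F_{p^2}, with d a non-residue mod p.
        def __init__(self, p, d, a, b=0):
            self.p, self.d, self.a, self.b = p, d, a % p, b % p
        def __add__(self, other):
            return QF(self.p, self.d, self.a + other.a, self.b + other.b)
        def __mul__(self, other):
            # (a + b*sqrt(d))(a' + b'*sqrt(d))
            #   = (aa' + bb'd) + (ab' + a'b)*sqrt(d)
            return QF(self.p, self.d,
                      self.a * other.a + self.b * other.b * self.d,
                      self.a * other.b + self.b * other.a)

    i = QF(7, 3, 0, 1)    # sqrt(3) in F_49, since 3 is a non-residue mod 7
    assert ((i * i).a, (i * i).b) == (3, 0)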
-
Simon Tatham authored
I'm about to want to solve quartics mod a prime, which means I'll need to be able to take cube roots mod p as well as square roots. This commit introduces a more general class which can take rth roots for any prime r, and moreover, it can do it in a general cyclic group. (You have to tell it the group's order and give it some primitives for doing arithmetic, plus a way of iterating over the group elements that it can use to look for a non-rth-power and roots of unity.) That system makes it nicely easy to test, because you can give it a cyclic group represented as the integers under _addition_, and then you obviously know what all the right answers are. So I've also added a unit test system checking that.
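The additive-group test case is pleasingly concrete: in (Z_n, +), 'raising g to the rth power' means computing r*g mod n, so the rth root of x has the closed form x * r^(-1) mod n, and the general root-finder can be checked against that. A sketch of the oracle (Python 3.8+ for the modular inverse):

    def additive_rth_root(x, r, n):
        # In the cyclic group (Z_n, +), the "rth power" of g is r*g
        # mod n, so the root of x is x * r^(-1) mod n (this requires
        # gcd(r, n) == 1).
        return (x * pow(r, -1, n)) % n

    n, r = 101, 3
    for x in range(n):
        assert (r * additive_rth_root(x, r, n)) % n == x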
-
Simon Tatham authored
That will let me keep them in sets.
-
Simon Tatham authored
I'm about to want to reuse it.
-
Simon Tatham authored
I'm about to want to expand the underlying number-theory code, so I'll start by moving it into a file where it has room to grow without swamping the main purpose of eccref.py.
-
Simon Tatham authored
To begin with, this allows me to add a regression test for the change in the previous commit.
-
Simon Tatham authored
ecc_montgomery_normalise takes a point with X and Z coordinates, and normalises it to Z=1 by means of multiplying X by the inverse of Z and then setting Z=1. If you pass in a point with Z=0, representing the curve identity, then it would be nice to still get the identity back out again afterwards. We haven't really needed that property until now, but I'm about to want it.

Currently, what happens is that we try to invert Z mod p; fail, but don't notice we've failed, and come out with some nonsense value as the inverse; multiply X by that; and then _set Z to 1_. So the output value no longer has Z=0.

This commit changes things so that we multiply Z by the inverse we computed. That way, if Z started off 0, it stays 0. Also made the same change in the other two curve types, on general principles, though I don't yet have a use for that.
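The essence of the change, as a Python sketch (the real code is C and constant-time; Fermat inversion here plays the role of the modular inverse that quietly produces a useless answer for 0):

    def invert(z, p):
        return pow(z, p - 2, p)      # happens to return 0 when z == 0

    def normalise_old(X, Z, p):
        Zinv = invert(Z, p)
        return (X * Zinv % p, 1)     # bug: Z is *set* to 1 unconditionally

    def normalise_new(X, Z, p):
        Zinv = invert(Z, p)
        # Multiplying Z by its own "inverse" leaves Z == 0 alone, so
        # the identity survives normalisation.
        return (X * Zinv % p, Z * Zinv % p)

    p = 2**255 - 19
    assert normalise_old(5, 0, p)[1] == 1    # identity destroyed
    assert normalise_new(5, 0, p)[1] == 0    # identity preserved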
-
Simon Tatham authored
If a point doubles to the identity, we should return the identity, rather than throwing a Python divide-by-zero exception.
-
Simon Tatham authored
I'm getting tired of maintaining it as 2/3 compatible; 2 is on the way out anyway and I'm losing patience. In future, if it breaks in 2, I think I'm going to stop caring.
-
- Feb 26, 2020
-
-
Jacob Nevins authored
-
- Feb 25, 2020
-
-
Simon Tatham authored
I tried to do an SFTP upload through connection sharing the other day and found that pscp sent some data and then hung. Now I've debugged it, what seems to have happened was that we were looping in sftp_recv() waiting for an SFTP packet from the remote, but we didn't have any outstanding SFTP requests that the remote was going to reply to. Checking further, xfer_upload_ready() reported true, so we _could_ have sent something - but the logic in the upload loop had a hole through which we managed to get into 'waiting for a packet' state.

I think what must have happened is that xfer_upload_ready() reported false so that we entered sftp_recv(), but then the event loop inside sftp_recv() ran a toplevel callback that made xfer_upload_ready() return true. So, the fix: sftp_recv() is our last-ditch fallback, and we always try emptying our callback queue and rechecking upload_ready before we resort to waiting for a remote packet.

This not only fixes the hang I observed: it also hugely improves the upload speed. My guess is that the bug must have been preventing us from filling our outgoing request pipeline a _lot_ - but I didn't notice it until the one time the queue accidentally ended up empty, rather than just sparse enough to make transfers slow.

Annoyingly, I actually considered this fix back when I was trying to fix the proftpd issue mentioned in commit cd97b7e7, but decided fixing ssh_sendbuffer() was a better idea. In fact it would have been an even better idea to do both! Oh well, better late than never.
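The reordered loop, reduced to a skeleton with the moving parts passed in as callables (descriptive names, not the actual pscp.c identifiers):

    def upload_loop(done, upload_ready, send_request,
                    run_callbacks, recv_and_process):
        while not done():
            while upload_ready():
                send_request()           # keep the request pipeline full
            if run_callbacks():          # True if any callback actually ran
                continue                 # readiness may have changed: recheck
            recv_and_process()           # last resort: block for a reply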
-
- Feb 23, 2020
-
-
Simon Tatham authored
There's always one.
-
Simon Tatham authored
This is built more or less entirely out of pieces I already had. The SOCKS server code is provided by the dynamic forwarding code in portfwd.c. When that accepts a connection request, it wants to talk to an SSH ConnectionLayer, which is already a trait with interchangeable implementations - so I just provide one of my own which only supports the lportfwd_open() method. And that in turn returns an SshChannel object, with a special trait implementation all of whose methods just funnel back to an ordinary Socket. Result: you get a Socket-to-Socket SOCKS implementation with no SSH anywhere, and only a minimal amount of need to _pretend_ internally to be an SSH implementation.

Additional features include the ability to log all the traffic in the form of diagnostics to standard error, or log each direction of each connection separately to a file, or, for anything more general, to log each direction of each connection through a pipe to a subcommand that can filter out whatever you think are the interesting parts. Also, you can spawn a subcommand after the SOCKS server is set up, and terminate automatically when that subcommand does - e.g. you might use this to wrap the execution of a single SOCKS-using program.

This is a modernisation of a diagnostic utility I've had kicking around out-of-tree for a long time. With all of last year's refactorings, it now becomes feasible to keep it in-tree without needing huge amounts of scaffolding. Also, this version runs on Windows, which is more than the old one did. (On Windows I haven't implemented the subprocess parts, although there's no reason I _couldn't_.)

As well as diagnostic uses, this may also be useful in some situations as a thing to forward ports to: PuTTY doesn't currently support reverse dynamic port forwarding (in which the remote listening port acts as a SOCKS server), but you could get the same effect by forwarding a remote port to a local instance of this. (Although, of course, that's nothing you couldn't achieve using any other SOCKS server.)
-
Simon Tatham authored
This is probably overdue; everyone else seems to have settled on it as the preferred RSA key exponent for some time. And now that the descendant of mp_mod_short supports moduli up to 2^32 instead of 2^16, I can actually add it without the risk of assertion failures during prime generation.
-
Simon Tatham authored
It's now a subroutine specific to RSA key generation, because the reworked PrimeCandidateSource system can handle the requirements of DSA generation automatically.

The difference is that in DSA, one of the primes you generate is used as a factor in the generation of the other, so you can just pass q as a factor to pcs_require_residue_1, and it can get the range right by itself. But in RSA, neither prime is generated with the other one in mind; they're conceptually generated separately and independently, apart from that single joint restriction on their product.

(I _could_ have added a feature to PrimeCandidateSource to specify a range for the prime more specifically than a few initial bits. But I didn't want to, because it would have been more complicated than doing it this way, and also slightly less good: if you invent one prime first and then constrain the range of the other one once you know the first, then you're not getting an even probability distribution of the possible _pairs_ of primes - you're privileging one over the other and skewing the distribution.)
-
Simon Tatham authored
I've replaced the random number generation and small delta-finding loop in primegen() with a much more elaborate system in its own source file, with unit tests and everything.

Immediate benefits:

 - fixes a theoretical possibility of overflowing the target number of bits, if the random number was so close to the top of the range that the addition of delta * factor pushed it over. However, this only happened with negligible probability.

 - fixes a directional bias in delta-finding. The previous code incremented the number repeatedly until it found a value coprime to all the right things, which meant that a prime preceded by a particularly long sequence of numbers with tiny factors was more likely to be chosen. Now that we select candidate delta values at random, that bias should be eliminated.

 - changes the semantics of the outermost primegen() function to make them easier to use, because now the caller specifies the 'bits' and 'firstbits' values for the actual returned prime, rather than having to account for the factor you're multiplying it by in DSA. DSA client code is correspondingly adjusted.

Future benefits:

 - having the candidate generation in a separate function makes it easy to reuse in alternative prime generation strategies

 - the available constraints support applications such as Maurer's algorithm for generating provable primes, or strong primes for RSA in which both p-1 and p+1 have a large factor. So those become things we could experiment with in future.
-
Simon Tatham authored
This still isn't the true random generator used in the live tools: it's deterministic, for repeatable testing. The Python side of testcrypt can now call random_make_prng(), which will instantiate a PRNG with the given seed. random_clear() still gets rid of it. So I can still have some tests control the precise random numbers received by the function under test, but for others (especially key generation, with its uncertainty about how much randomness it will actually use) I can just say 'here, have a seed, generate as much stuff from that seed as you need'.
-
Simon Tatham authored
This is another application of the existing mp_bezout_into, which needed a tweak or two to cope with the numbers not necessarily being coprime, plus a wrapper function to deal with shared factors of 2. It reindents the entire second half of mp_bezout_into, so the patch is best viewed with whitespace differences ignored.
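Structurally, it's the familiar extended-Euclid fact dressed up for mp_ints: from any (g, u, v) with u*a + v*b == g you can read off the gcd, after first stripping the factors of 2 the two inputs share (the wrapper for shared factors of 2 suggests the constant-time core wants an odd operand; ordinary Euclid, used below, doesn't care). A Python sketch of the shape:

    def ext_gcd(a, b):
        # Extended Euclid: (g, u, v) with u*a + v*b == g, with no
        # requirement that a and b be coprime.
        u0, v0, u1, v1 = 1, 0, 0, 1
        while b:
            q, a, b = a // b, b, a % b
            u0, u1 = u1, u0 - q * u1
            v0, v1 = v1, v0 - q * v1
        return a, u0, v0

    def gcd(a, b):
        shift = 0
        while a | b and a % 2 == 0 and b % 2 == 0:
            a, b, shift = a // 2, b // 2, shift + 1
        return ext_gcd(a, b)[0] << shift

    assert gcd(24, 36) == 12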
-
Simon Tatham authored
This is a third random-number generation function, with an API in between the too-specific mp_random_bits and the too-general mp_random_in_range. Now you can generate a value between 0 and n without having to either make n a power of 2, or tediously allocate a zero mp_int to be the lower limit for mp_random_in_range. Implementation is done by sawing the existing mp_random_in_range into two pieces and exposing the API between them.
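One standard construction for 'uniform below n' is rejection sampling from the enclosing power of 2; a sketch of that idea (not necessarily the mpint implementation):

    import secrets

    def random_upto(n):
        # Uniform in [0, n): sample bit_length(n) bits and reject
        # overshoots. Each attempt succeeds with probability > 1/2.
        assert n > 0
        while True:
            x = secrets.randbits(n.bit_length())
            if x < n:
                return x

(Python's secrets.randbelow() does the same job natively; spelling it out just shows the structure.)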
-
Simon Tatham authored
This is a version of mp_lshift_fixed_into() which allocates the output number, which it can do because you know the size of the original number and are allowed to treat the shift count as non-secret. (By contrast, mp_lshift_safe() would be a nonsensical function - if you're trying to keep the shift count secret, you _can't_ use it as a parameter of memory allocation! In that situation you have no choice but to allocate memory based on a fixed upper bound.)
-
Simon Tatham authored
There was previously no safe left shift at all, which is an omission. And rshift_safe_into was an odd thing to be missing, so while I'm here, I've added it on the basis that it will probably be useful sooner or later.
-
Simon Tatham authored
Unlike the ones in mpint.c proper, these are not intended to respect the constant-time guarantees. They're going to be the kind of thing you use in key generation, which is too random to be constant-time in any case.

I've arranged several precautions to try to make sure these functions don't accidentally get linked into the main SSH application, only into key generators:

 - declare them in a separate header with "unsafe" in the name

 - put "unsafe" in the name of every actual function

 - don't even link the mpunsafe.c translation unit into PuTTY proper

 - in fact, define global symbols of the same name in that file and the SSH client code, so that there will be a link failure if we ever try to do it by accident

The initial contents of the new source file consist of the subroutine mp_mod_short() that previously lived in sshprime.c (and was not in mpint.c proper precisely because it was unsafe). While I'm here, I've turned it into mp_unsafe_mod_integer() and let it take a modulus of up to 32 bits instead of 16.

Also added some obviously useful functions to shrink an mpint to the smallest physical size that can hold the contained number (rather like bn_restore_invariant in the old Bignum system), which I expect to be using shortly.
-