Minimizing b*s mod c over a range of s; "the worst" case

Tim Peters uunet!ksr.com!tim
Sun Apr 7 16:09:33 PDT 1991


Turns out that a "hill-climbing" approach to this sub-problem *can* be
done reliably, quickly, & straightforwardly, if you're willing (unlike I
was <grin>) to (a) keep it simple (perhaps sacrificing some run-time),
and (b) think about it first <embarrassed snarl>.

Suppose we're given fixed 1 <= b < c, gcd(b,c)=1, 0 <= slo <= shi, and
want to find an s in [slo,shi] that minimizes (in the modular sense)
b*s mod c  over all s in [slo,shi].

Notes:

  - By "minimizes (in the modular sense)" I mean gets closest, but
    not exactly equal, to 0, where c-1 and 1 are both 1 away from 0,
    etc.

  - Some of the bounds can be relaxed; e.g., 0 <= slo isn't really
    necessary, but because people have different ideas about what
    mod should do when given inputs of mixed signs, I'm trying to avoid
    stupid arguments by restricting the problem to cases where everyone
    agrees <grin -- but slo < 0 (and the like) aren't needed to solve
    the "worst case input" problem anyway>.
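
Before the clever stuff, here's the spec pinned down as a brute-force
Python sketch (Python stands in for the C-- below, since its ints are
already arbitrary-precision; the function name is mine).  Hopeless for
big c, but handy as a cross-check:

```python
from math import gcd

def modmin_brute(b, c, slo, shi):
    """The s in [slo, shi] whose b*s % c lands closest to 0 in the
    modular sense (but not exactly on 0), found by trying every s.
    Assumes 1 <= b < c, gcd(b, c) == 1, 0 <= slo <= shi."""
    assert 1 <= b < c and gcd(b, c) == 1 and 0 <= slo <= shi
    best_s = best_dist = None
    for s in range(slo, shi + 1):
        r = b * s % c
        if r == 0:
            continue                # "closest, but not exactly equal, to 0"
        dist = min(r, c - r)        # distance to 0 going either way around
        if best_dist is None or dist < best_dist:
            best_s, best_dist = s, dist
    return best_s
```

E.g. with b=3, c=7 the residues for s=1,2,3 are 3, 6, 2, at modular
distances 3, 1, 2 from 0, so modmin_brute(3, 7, 1, 3) is 2.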

A trick that makes this straightforward is to solve instead two
sub-problems, and paste them together at the end:

A) Find the smallest s in [slo,shi] at which b*s mod c is maximal (in the
   ordinary sense).

B) Find the smallest s in [slo,shi] at which b*s mod c is non-zero but
   minimal (in the ordinary sense).

Notes:

  - Given solutions to #A and #B, the solution to the original problem
    is clearly whichever of the two maps closest to 0 in the modular
    sense.

  - If [slo,shi] contains no more than c integers, then because
    gcd(b,c)=1 was given, b*s mod c is unique for each s in [slo,shi]
    (b*s1 mod c = b*s2 mod c implies s1 & s2 are congruent mod c since
    gcd(b,c)=1 implies b has a multiplicative inverse mod c), and we
    could rephrase #A and #B as "Find the unique s in ...".
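
In brute-force Python (small-c sanity checking only; the names are
mine), the decomposition and the final paste look like:

```python
def solve_A(b, c, slo, shi):
    """Smallest s in [slo, shi] at which b*s % c is maximal."""
    return max(range(slo, shi + 1), key=lambda s: (b * s % c, -s))

def solve_B(b, c, slo, shi):
    """Smallest s in [slo, shi] at which b*s % c is non-zero but minimal."""
    return min((s for s in range(slo, shi + 1) if b * s % c),
               key=lambda s: (b * s % c, s))

def modmin(b, c, slo, shi):
    """Paste: whichever of the two answers maps closer to 0 modularly."""
    sa, sb = solve_A(b, c, slo, shi), solve_B(b, c, slo, shi)
    # sa's residue approaches 0 from below (distance c - residue),
    # sb's from above (distance = residue)
    return sa if c - b * sa % c <= b * sb % c else sb
```

Any residue r satisfies min(r, c-r) >= min(rmin, c - rmax), so the best
modular distance really is achieved by one of the two answers pasted here.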

I'll sketch how to solve #A; #B is very much the same.  Most people on
this list seem most comfortable with C, so here it is in C-- (a custom
version of C that implicitly declares all variables to be arbitrary-
precision integers <grin>):

smallest_s_in_slo_to_shi_such_that_b_times_s_mod_c_is_maximal(
    b, c, slo, shi )
{
    s = slo;
    /* dist will always be the distance from b*s%c to c-1,  */
    /* so minimizing dist corresponds to maximizing b*s%c   */
    dist = c-1 - (b*s % c);  /* dist in [0,c-1] */
    while (dist) {   /* the *usual* exit is in the middle of the loop, */
		     /* but in the rare case dist *does* reach 0 we're */
		     /* obviously done */

	/* the function referenced in the next line is straightforward */
	/* to write from the proof of Knuth's 6.4S -- won't explain it */
	/* here */
	i = smallest_i_ge_1_such_that_b_times_i_mod_c_le_dist( b, c, dist );

	/* now i in [1,c-1], i is the smallest integer in that range */
	/* that maps into [1,dist], hence the smallest s' > s that   */
	/* maps closer to c-1 than s maps is s'=s+i		     */

	/* note that since 1 <= dist and gcd(b,c)=1 at the call, */
	/* smallest_i_ge_1_such_that_b_times_i_mod_c_le_dist can */
	/* always find an i in [1,c-1] such that b*i%c <= dist	 */

	/* how many times can we bump s by i and still get closer? */
	decrease_in_dist = b*i % c;
	n1 = dist / decrease_in_dist;  /* must be >= 1 */

	/* how many times can we bump s by i without exceeding shi? */
	n2 = (shi - s) / i;  /* must be >= 0 */

	n = n1 < n2 ? n1 : n2;
	if ( n == 0 ) break;  /* the usual exit */

	s += n * i;
	dist -= n * decrease_in_dist;
    }

    return s;
}
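
Here's a straight transliteration into Python (again standing in for
C--), with the Knuth-6.4S-derived helper replaced by a naive linear
scan -- so this sketch is only quick for small c, but the hill-climbing
skeleton is the same; the short names are mine:

```python
def smallest_i(b, c, dist):
    """Naive stand-in for
    smallest_i_ge_1_such_that_b_times_i_mod_c_le_dist: the smallest
    i >= 1 whose residue b*i % c lands in [1, dist].  Since gcd(b,c)=1
    and dist >= 1, some i in [1, c-1] always qualifies."""
    i = 1
    while not 1 <= b * i % c <= dist:
        i += 1
    return i

def solve_A_climb(b, c, slo, shi):
    """Smallest s in [slo, shi] at which b*s % c is maximal."""
    s = slo
    # dist is the distance from b*s%c to c-1, so minimizing dist
    # corresponds to maximizing b*s%c
    dist = c - 1 - b * s % c
    while dist:
        i = smallest_i(b, c, dist)
        # s+i is the smallest s' > s mapping closer to c-1 than s does
        decrease_in_dist = b * i % c
        n1 = dist // decrease_in_dist   # bumps before overshooting c-1
        n2 = (shi - s) // i             # bumps before passing shi
        n = min(n1, n2)
        if n == 0:
            break                       # the usual exit
        s += n * i
        dist -= n * decrease_in_dist
    return s
```

Since the climb only ever moves when it strictly improves dist, it halts
at the first s attaining the maximum, matching the "smallest s" spec.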

Notes:

  - This approach is easy to adapt to the more general problem of
    finding an s in [slo,shi] that makes b*s%c closest (in the modular
    sense) to some fixed integer a -- just (carefully ...) initialize
    "dist" to be the (positive) distance from b*slo%c to a-1 instead of
    to c-1 (& likewise in the routine for solving #B).

  - By doing a binary search in

       smallest_i_ge_1_such_that_b_times_i_mod_c_le_dist
    and
       smallest_i_ge_1_such_that_b_times_i_mod_c_ge_dist

    (the latter is needed to solve #B) to get close to the appropriate
    convergent before looking at the "fine structure", the run-time in
    practice appears to be O(log(c)), and I suspect it would be pretty
    easy to prove that O(log(c)*log(log(c))) is a bound on the worst-
    case time (I'm just counting bignum operations there, assuming that
    bignum + - * / take O(1) time -- bad assumption, but this ain't a
    thesis <grin>).
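
The first note's "closest to a" generalization can at least be pinned
down brute-force style (Python; `modmin_to_a` and its tie-breaking are
my own choices, and as before this is a small-c cross-check, not the
fast algorithm):

```python
def modmin_to_a(b, c, slo, shi, a):
    """The s in [slo, shi] whose b*s % c is closest (modular sense) to a
    without hitting a exactly -- the a=0 case is the original problem."""
    best_s = best_dist = None
    for s in range(slo, shi + 1):
        r = b * s % c
        if r == a % c:
            continue                          # close, but not exactly equal
        dist = min((r - a) % c, (a - r) % c)  # modular distance to a
        if best_dist is None or dist < best_dist:
            best_s, best_dist = s, dist
    return best_s
```

E.g. with b=3, c=7, a=4, the residues for s=1,2,3 are 3, 6, 2, at
modular distances 1, 2, 2 from 4, so s=1 wins.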


Running all the possibilities for 17-digit-decimal -> IEEE (normal)
double input through an algorithm built on the above finally yields "the
worst case" (this took 35 minutes of some-fast-flavor-of-SPARC time,
running interpreted Icon, where at least half the dynamic statements
were "do nothing" assertions involving bignum arithmetic):

2.6153245263757307e65 = (in binary)

1.0011110111100000000001011011110101100010000011011110
  11111111111111111111111111111111111111111111111111111111111111111011...
								   x
  * 2^217

It's a teensy bit too small to be exactly representable, and "x" marks
the bit (the 66th excess bit) that finally distinguishes this from an
exactly-representable case; compute a working result good to fewer than
66 excess bits and you'll incorrectly fail to set the inexact flag, or,
under to-minus-infinity and to-0 rounding, you'll get the wrong answer.
Compute a working result good to more than that (by hook or by crook --
multiply by reciprocal approximations, use a teensy power-of-10 table,
... it's only the accuracy of the final working result that counts), and
you'll get everything exactly right.
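
That claim is easy to double-check with exact integer arithmetic, since
2.6153245263757307e65 is exactly 26153245263757307 * 10**49 (a Python
sketch, splitting the exact binary expansion at the 53-bit boundary;
the names are mine):

```python
def split_bits(sig17, pow10):
    """Exact binary expansion of sig17 * 10**pow10, split into the
    leading 53 bits (the would-be double significand) and the excess
    bits that follow."""
    bits = bin(sig17 * 10 ** pow10)[2:]
    return bits[:53], bits[53:]

lead, excess = split_bits(26153245263757307, 49)
# a hair below exactly representable, so the excess bits open with a
# long run of 1s; the first 0 ending that run is the "x" bit above
ones_run = len(excess) - len(excess.lstrip("1"))
```

The run turns out to be 65 ones, so the 66th excess bit (a 0) is the
first that settles inexactness -- matching the "66" in the table below.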

i'll-attach-50-"of-the-worst"-to-satisfy-the-curiosity-seekers<grin>-ly
   y'rs  - tim

Tim Peters   Kendall Square Research Corp
tim@ksr.com,  ksr!tim@harvard.harvard.edu


1st column gives the input case.
2nd column says "<e" if it's a little bit less than exactly
   representable, ">e" if it's a little bit greater than exactly
   representable, and "<h" and ">h" similarly for cases close to
   being exactly halfway between adjacent representable numbers (these
   *may* cause problems due to double-rounding errors under nearest/
   even rounding -- see preceding msgs in this series)
3rd column gives the number of excess bits needed before the uncertainty
   is settled.

2.6153245263757307e65	<e	66
2.7176258005319167e-245	<e	65
9.4080055902682397e-227	>e	64
9.0372559027740405e161	>e	64
7.2298047222192324e162	>e	64
5.4897030182071313e45	<h	64
5.4223535416644243e161	>e	64
5.8483921078398283e73	<h	64
1.8074511805548081e163	>e	64
3.6149023611096162e163	>e	64
7.2298047222192324e163	>e	64
9.0372559027740405e159	>e	64
1.8074511805548081e160	>e	64
3.6149023611096162e160	>e	64
7.2298047222192324e160	>e	64
9.7338774138954421e-274	>e	64
2.8863057365094743e192	<h	63
5.7726114730189486e192	<h	63
1.8074511805548081e161	>e	63
3.6149023611096162e161	>e	63
9.7338774138954421e-273	>h	63
4.1489164733416129e-154	>h	63
8.2978329466832258e-154	>h	63
9.5290783281036445e-17	<h	63
1.9058156656207289e-16	<h	63
3.8116313312414578e-16	<h	63
7.6232626624829156e-16	<h	63
4.1391519190645203e181	<h	63
8.2783038381290406e181	<h	63
6.4409240769861689e-143	<h	63
8.8133809804950961e-292	>e	63
2.4691002732654881e-99	>e	63
4.9382005465309762e-99	>e	63
9.8764010930619524e-99	>e	63
9.3590659012335325e-151	>h	63
1.8718131802467065e-150	>h	63
3.7436263604934130e-150	>h	63
7.4872527209868260e-150	>h	63
1.4974505441973652e-149	>h	63
2.9949010883947304e-149	>h	63
5.9898021767894608e-149	>h	63
6.8985865317742005e180	<e	62
1.3797173063548401e181	<e	62
2.7594346127096802e181	<e	62
8.7321175067455367e-253	<h	62
6.9705112365474009e-207	<h	62
8.2310814274709837e232	<h	62
7.3429396004640239e122	<h	62
3.5412292230362827e137	>e	62
7.0824584460725654e137	>e	62



More information about the Numeric-interest mailing list