PCH

I'm a college student taking quite a few math classes right now, including one seminar-style course where everyone does independent research and then presents it. One of my classmates had a really interesting presentation that caught my eye, and I was able to help him a bit by gathering some empirical data to support a claim.

The Problem

For those unfamiliar, an integer m is a primitive root modulo n if raising m to successive powers mod n will eventually give you all possible integers mod m (except 0). For example, 3 is a primitive root of 5 because 3^0 == 1, 3^1 == 3, 3^2 == 4, and 3^3 == 2 (mod 5).

My classmate was interested in studying a certain type of primitive root: numbers that are primitive roots mod p, where p is a prime, but not mod p^2. (He calls these z-primitive roots). To investigate, he wrote a script in Mathematica to find these roots. Of his sample, only roughly 30% of primes had any. He used that data as a springboard to make conjectures, but mentioned during the presentation that it'd be really useful if he had more. He said his script was kinda slow (understandable, the problem is hard and involves factoring large numbers) and it had taken his computer two hours to spit out the last thousand results.

At this, my ears perked up a little, because I was pretty sure I could do better. Plus I have a server, so even if I couldn't write a faster program, I could simply run it for longer. I asked him for his contact info so I could send him results, and that night I got to work.

The Solution

Since I'm learning Rust, I see every problem as a nail that I can hit with the Rust Hammer. It doesn't help that Rust is fast and well-documented and safe and just generally nice. I searched crates.io with a few keyword combinations, hoping to find a crate that would calculate primitive roots for me, because I didn't want to DIY it. To my surprise, there were none! The closest thing was the crate number_theory, which had functions for most first-year number theory stuff, but no primitive roots.

I decided to build off of that crate. I used a different crate to generate the prime numbers, which was computationally easy compared to finding their primitive roots. I wrote my own code to find and check primitive roots.

Importing stuff:

use std::fs::File;
use std::io::prelude::*;
use std::env;
use number_theory;
use libm::floor;
use divisors::get_divisors;
use flume;
use std::thread;

First, I had to find primitive roots mod a given prime p. Once you find the first primitive root, the rest are pretty easy: just take the known primitive root and raise it to all powers which are coprime to p-1. To find the first one, I had to search all the integers mod p, with a few performance improvements. First, if a number is a quadratic residue mod p, it cannot be a primitive root, so we can skip it. Checking quadratic residues is as simple as finding the Legendre symbol of n mod p, which is a computationally easy function in the number_theory crate.

Second, you can optimize the order in which you check integers mod p based on the value of p mod 8. This is due to a bunch of theorems mostly by Dirichlet that tell us about the distribution of quadratic residues and nonresidues of p. I read and took suggestions from this article about primitive roots.

From p mod 8, we can find a list of numbers modulo p that's front-loaded with likely primitive roots.

let check_list: Box<dyn Iterator<Item=u64>> = match p % 8 { 
	1 => {
		let p_f = p as f64;
		let p_over_3 = floor(p_f/3.0) as u64; 
		Box::new([2].into_iter().chain(p_over_3..p-1).chain(3..p_over_3))
	}
	3 => Box::new((2..p-1).rev()),
	5 => {
		let p_f = p as f64;
		let p_over_3 = floor(p_f/3.0) as u64;
		Box::new((p_over_3..p-1).chain(2..p_over_3))
	}
	7 => Box::new([2].into_iter().chain((3..p-2).rev())),
	_ => {panic!("p is not an odd prime! p mod 8 is even!");}
};

Then, checking each number, we discard it if it's a primitive root mod p:

if number_theory::NumberTheory::legendre(&n, &p)==1 {
    continue;
}

Now for the real meat of the "checking algorithm:" the order of any n mod m will divide the Carmichael lambda of m. For a prime p, the Carmichael lambda is p-1. So for each proper divisor of p-1, we just have to raise n to that power. If it's never 1, n is a primitive root mod p. If it equals 1 at some point, n must not be a primitive root.

Note: here, divs is a list of proper divisors of p-1. More on that later.

for n in check_list {
	if number_theory::NumberTheory::legendre(&n, &p)==1 {
		continue;
	}

	let mut is_pr = true;
	for d in divs {
		if number_theory::NumberTheory::exp_residue(&n, d, &p) == 1 {
			is_pr = false;
			break;
		}
	}
	if is_pr { return n; }
}

I put all this into a function called find_first_pr:

fn find_first_pr(p: u64, divs: &Vec<u64>) -> u64 {
    // p is a given prime, divs is a vector of divisors of p-1.
    let check_list: Box<dyn Iterator<Item=u64>> = match p % 8 {
        1 => {
            let p_f = p as f64;
            let p_over_3 = floor(p_f/3.0) as u64;
            Box::new([2].into_iter().chain(p_over_3..p-1).chain(3..p_over_3))
        }
        3 => Box::new((2..p-1).rev()),
        5 => {
            let p_f = p as f64;
            let p_over_3 = floor(p_f/3.0) as u64;
            Box::new((p_over_3..p-1).chain(2..p_over_3))
        }
        7 => Box::new([2].into_iter().chain((3..p-2).rev())),
        _ => {panic!("p is not an odd prime! p mod 8 is even!");}
    };


    for n in check_list {
        if number_theory::NumberTheory::legendre(&n, &p)==1 {
            continue;
        }
		
        let mut is_pr = true;
        for d in divs {
            if number_theory::NumberTheory::exp_residue(&n, d, &p) == 1 {
                is_pr = false;
                break;
            }
        }
        if is_pr { return n; }
    }
    return 0; // should never be called
}

Now that we've found one primitive root, we can find the rest:

fn find_all_prs(p: u64, divs: &Vec<u64>) -> Vec<u64> {
    // p is a given prime, divs is a vector of divisors of p-1.
    let g = find_first_pr(p, divs);
    // raise g to all k where gcd(k, p-1) = 1
    let mut all_prs = vec![];
    for k in 1..p {
        if number_theory::NumberTheory::gcd(&k, &(p-1)) == 1 {
            let new_pr = number_theory::NumberTheory::exp_residue(&g, &k, &p);
            all_prs.push(new_pr);
        }
    }

    return all_prs;
}

Easy!

Then to find z-primitive roots of p, we don't need to generate all the primitive roots mod p^2. We can just check our list of primitive roots mod p. The checking function looks pretty much the same as above, except now we're looking at divisors of p(p-1), the Carmichael lambda of p^2. Once again, these have been precomputed.

fn check_pr_mod_psquared(p: u64, n:u64, divs: &Vec<u64>) -> bool {
    // p is a prime, n is a PR mod p, divs are divisors of p(p-1).
    // returns true if n is a PR mod p^2 and false otherwise.
    for d in divs {
        if number_theory::NumberTheory::exp_residue(&n, d, &(p.pow(2))) == 1 {
            return false;
        }
    }
    return true;
}

Pretty neat! Why did I precompute the divisors? Well, at first, I computed p-1's divisors in find_first_pr, and p(p-1)'s divisors in check_pr_mod_psquared. However, this is terrible for performance. You're already sort of doing the same work twice, but then, when factoring p(p-1), your computer has to pull out a very large prime factor from a huge composite number. After I wrote and compiled the program this way, I profiled it, and something like 60% of processor time was dedicated to factoring these numbers. Now I compute p-1's divisors and sort of "manually" construct p(p-1)'s divisors from there:

let mut divs = get_divisors(p-1);
let prs = find_all_prs(p, &divs);
let mut divs_times_p = divs.clone().into_iter().map(|x| x*p).collect();
divs.push(p);
divs.push(p-1);
divs.append(&mut divs_times_p);

In the above code, the variable divs holds divisors of p-1 at first; then it's used to find primitive roots mod p; finally it's modified to include all divisors of p(p-1). After this code block, it'll be used to check primitive roots mod p^2.

How does this program save its results? I'm lazy so I chucked everything into a bunch of TSV files written line-by-line. The I/O of writing a bunch of lines to a file one-by-one could be problematic, but in this case the lines were computed slowly, so writing to disk wasn't bad compared to actually computing the results.

With multithreading in mind, I wrote a worker function that computes z-primitive roots for nth primes between a start and stop index. This is where the divisors of p-1 are computed, as well. If n is a primitive root mod p, but is not a primitive root mod p^2, then n gets written to the output file as a z-primitive root of p.

fn worker(start: u64, stop: u64) {
    let filename = format!("zprs_{}_{}.tsv", start, stop);
    let mut f = File::create(filename).expect("opening file failed");
    write!(f, "p\tzprs\tn_zprs\n").expect("write error");

    for i in start..stop {
        let p = number_theory::NumberTheory::nth_prime(&(i)).unwrap();
        // calculate divisors of p-1 only once 
        let mut divs = get_divisors(p-1);
        let prs = find_all_prs(p, &divs);
        let mut divs_times_p = divs.clone().into_iter().map(|x| x*p).collect();
        divs.push(p);
        divs.push(p-1);
        divs.append(&mut divs_times_p);

        let mut z_prs = vec![];
        for n in prs {
            if check_pr_mod_psquared(p, n, &divs) == false {
                z_prs.push(n);
            }
        }

        write!(f, "{:?}\t{:?}\t{}\n", p, z_prs, z_prs.len()).expect("write error");
    }
}

Last but not least, I used flume to multithread across four workers (number of cores in my server's dinky i3 CPU). A sender thread hands out ranges of indices to worker threads, which compute all the z-primitive roots for primes between those indices. In the end, a bunch of TSV files are generated, which can be concatenated (if desired) with bash line-editing tools.

fn main() {
    let mut start: u64 = env::args().nth(1).unwrap().parse().unwrap();
    let gap: u64 = env::args().nth(2).unwrap().parse().unwrap();
    let mut stop = start + gap;
    let mut interval = vec![start, stop];



    let (tx, rx) = flume::bounded::<Vec<u64>>(0); // rendezvous channel

    for _ in 0..4 {
       let rx = rx.clone();
        thread::spawn(move || {
            for msg in rx.iter() {
                worker(msg[0], msg[1]);
            }
        });
    }

    drop(rx);


    loop { // sender
        tx.send(interval).expect("error sending to worker thread");
        start += gap;
        stop += gap;
        interval = vec![start, stop];
    }
}

This program, once compiled, can be called at the command line with ./[executable name] start increment, where start is the index of prime to start at, and increment is the number of lines per file. The full crate is on github.

The development process for this program was pretty fun! It was the first time I'd tried multithreading in Rust, and while I did have to call a more Rust-literate friend to explain some errors to me, I came out of the process with a better understanding of how threads work. I also unit-tested all of the mathematical functions as I was writing it -- this was the first time I'd done that, and I'll definitely do it again. I caught some errors that would have taken me forever to debug without testing.

In the end, I ran this program on my server for maybe 48 hours, and it found all z-primitive roots of the first 120,000 primes. I was happy with these results, and so was my classmate when I sent them his way!