Looking for fast modular exponentiation code (386 compatible)

I am looking for fast modular exponentiation code (for RSA crypto): something that uses the hardware multiply instruction for the multiplications and Montgomery's method in place of modular division. Other methods are fine if they are even faster.

I would like to implement it on a 386.
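
To make the request concrete, here is a minimal single-word sketch in C of the kind of routine meant: square-and-multiply exponentiation where every modular multiplication goes through Montgomery reduction with R = 2^32. It is only an illustration under stated assumptions: the modulus is odd and fits in 32 bits, and the names neg_inverse32, redc, mont_mul and modexp are made up for this example. Real RSA needs the same idea extended to multi-word numbers, which this sketch does not attempt.

```c
#include <stdint.h>

/* Returns -n^(-1) mod 2^32 for odd n, using Newton iteration. */
static uint32_t neg_inverse32(uint32_t n)
{
    uint32_t x = n;           /* correct to 3 bits, since n*n = 1 (mod 8) */
    int i;
    for (i = 0; i < 5; i++)
        x *= 2u - n * x;      /* each step doubles the number of correct bits */
    return 0u - x;
}

/* Montgomery reduction: returns t / 2^32 mod n, assuming t < n * 2^32. */
static uint32_t redc(uint64_t t, uint32_t n, uint32_t nneg)
{
    uint32_t m  = (uint32_t)t * nneg;   /* chosen so that t + m*n = 0 (mod 2^32) */
    uint64_t mn = (uint64_t)m * n;      /* 32x32 -> 64 bit product */
    /* The low words of t and m*n sum to 0 or exactly 2^32, so the carry
       into the high word is 1 whenever the low word of t is nonzero. */
    uint64_t u  = (t >> 32) + (mn >> 32) + ((uint32_t)t != 0);
    if (u >= n)
        u -= n;                         /* u < 2n, one subtraction suffices */
    return (uint32_t)u;
}

/* Montgomery product: a * b / 2^32 mod n, for a, b < n. */
static uint32_t mont_mul(uint32_t a, uint32_t b, uint32_t n, uint32_t nneg)
{
    return redc((uint64_t)a * b, n, nneg);
}

/* base^exp mod n for odd n (assumption: n is odd and n > 1). */
uint32_t modexp(uint32_t base, uint32_t exp, uint32_t n)
{
    uint32_t nneg = neg_inverse32(n);
    uint32_t r1 = (uint32_t)((1ULL << 32) % n);        /* R mod n   */
    uint32_t r2 = (uint32_t)(((uint64_t)r1 * r1) % n); /* R^2 mod n */
    uint32_t x   = mont_mul(base % n, r2, n, nneg);    /* base in Montgomery form */
    uint32_t acc = r1;                                 /* 1 in Montgomery form */

    while (exp) {
        if (exp & 1)
            acc = mont_mul(acc, x, n, nneg);
        x = mont_mul(x, x, n, nneg);
        exp >>= 1;
    }
    return redc(acc, n, nneg);          /* convert back out of Montgomery form */
}
```

For example, modexp(4, 13, 497) should give 445. The two divisions (the % operators) happen only once during setup; the loop itself uses only multiplies, adds and shifts, and the 32x32 -> 64 bit products correspond to what the 386 MUL instruction produces in EDX:EAX.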

