# [ACCEPTED] Difference between random draws from scipy.stats....rvs and numpy.random (scipy)

Score: 16

scipy.stats.uniform actually uses numpy; here is the corresponding function in scipy.stats (mtrand is an alias for numpy.random):

```
class uniform_gen(rv_continuous):
    def _rvs(self):
        return mtrand.uniform(0.0, 1.0, self._size)
```
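A quick way to see this delegation in practice is to pass the same seed to both sides. The check below assumes a scipy version where `uniform`'s `_rvs` forwards directly to the numpy generator as shown above; with the default `loc=0, scale=1` no extra transformation is applied, so the draws should coincide.

```python
import numpy as np
from scipy import stats

# Seed both the scipy path and the raw numpy path identically.
# Because stats.uniform forwards to the underlying numpy generator,
# the resulting draws should be the same arrays.
a = stats.uniform.rvs(size=5, random_state=np.random.RandomState(0))
b = np.random.RandomState(0).uniform(0.0, 1.0, size=5)
print(np.allclose(a, b))
```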

scipy.stats has a bit of overhead for error checking and for making the interface more flexible. The speed difference should be minimal as long as you don't call uniform.rvs in a loop for each draw. Instead, you can get all random draws at once, for example (10 million):

```
>>> rvs = stats.uniform.rvs(size=(10000, 1000))
>>> rvs.shape
(10000, 1000)
```

Here is the long answer that I wrote a while ago:

The basic random numbers in scipy/numpy are created by the Mersenne-Twister PRNG in numpy.random. The random numbers for distributions in numpy.random are in cython/pyrex and are pretty fast.
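As a minimal sketch of that underlying generator: the legacy `numpy.random` interface is backed by the Mersenne-Twister (MT19937) algorithm, and `RandomState` gives you an explicitly seeded instance of it.

```python
import numpy as np

# RandomState wraps the Mersenne-Twister (MT19937) generator that
# backs the legacy numpy.random interface; seeding makes draws
# reproducible across runs.
rs = np.random.RandomState(12345)
draws = rs.normal(loc=0.0, scale=1.0, size=4)
print(draws.shape)
```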

scipy.stats doesn't have its own random number generator; random numbers are obtained in one of three ways:

• directly from numpy.random (e.g. normal, t, ...): pretty fast

• by transformation of other random numbers that are available in numpy.random: also pretty fast, because this operates on entire arrays of numbers

• generic: the only generic random number generation is by using the ppf (inverse cdf) to transform uniform random numbers. This is relatively fast if there is an explicit expression for the ppf, but can be very slow if the ppf has to be calculated indirectly. For example, if only the pdf is defined, then the cdf is obtained through numerical integration and the ppf is obtained through an equation solver. So a few distributions are very slow.
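The generic ppf path in the last bullet can be sketched directly: draw uniforms, then push them through the target distribution's inverse cdf. The exponential distribution here is just an illustrative choice with a cheap explicit ppf.

```python
import numpy as np
from scipy import stats

# Inverse-transform sampling: map uniform draws through the ppf
# (inverse cdf) of the target distribution. This is the generic
# fallback scipy.stats uses when no faster sampler exists.
u = np.random.uniform(size=100_000)
samples = stats.expon.ppf(u)

# The result is exponentially distributed (mean ~ 1.0 for the
# standard exponential).
print(samples.mean())
```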

Score: 6

I ran into this today and just wanted to add some timing details to this question. I saw what joon mentioned where, in particular, random numbers from the normal distribution were generated much more quickly with `numpy` than with `rvs` in `scipy.stats`. As user333700 mentioned, there is some overhead with `rvs`, but if you are generating an array of random values then that gap closes compared to `numpy`. Here is a Jupyter timing example:

```
from scipy.stats import norm
import numpy as np

n = norm(0, 1)
%timeit -n 1000 n.rvs(1)[0]
%timeit -n 1000 np.random.normal(0,1)

%timeit -n 1000 a = n.rvs(1000)
%timeit -n 1000 a = [np.random.normal(0,1) for i in range(0, 1000)]
%timeit -n 1000 a = np.random.randn(1000)
```

This, on my run with `numpy` version 1.11.1 and `scipy` 0.17.0, outputs:

```
1000 loops, best of 3: 46.8 µs per loop
1000 loops, best of 3: 492 ns per loop
1000 loops, best of 3: 115 µs per loop
1000 loops, best of 3: 343 µs per loop
1000 loops, best of 3: 61.9 µs per loop
```

So just generating one random sample with `rvs` was almost 100x slower than using `numpy` directly. However, if you are generating an array of values, then the gap closes (115 µs vs. 61.9 µs).

If you can avoid it, don't call `rvs` to get one random value a ton of times in a loop.
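A minimal sketch of the batching pattern the timings above suggest: draw the whole batch in one `rvs` call up front, rather than paying the per-call overhead inside a loop. The sizes here are arbitrary illustration values.

```python
import numpy as np
from scipy import stats

rng = np.random.RandomState(42)

# Slow pattern: one rvs call per draw, paying the setup and
# error-checking overhead every iteration.
slow = [stats.norm.rvs(random_state=rng) for _ in range(10)]

# Fast pattern: one rvs call producing the whole array at once.
fast = stats.norm.rvs(size=1_000, random_state=rng)
print(fast.shape)
```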
