Scaling the number of users instead of the hit rate

Thu Feb 16 16:55:20 UTC 2012

On 02/13/2012 05:24 AM, Hohl, Gerrit wrote:

> But my problem now is that I should be able to scale the number of users
> (robots) instead of the hit rate. 

Hi Gerrit,

    I assume that by "hit rate" you mean "request rate" and that by
"scale" you mean "change during a test".

Yes, Polygraph can vary the number of active robots during a test. This
is accomplished using "populus factors" in test Phase configurations:
http://www.web-polygraph.org/docs/reference/pgl/types.html#type:docs/reference/pgl/types/Phase

Please note that in simple configurations, more robots means a higher
request rate, so varying the number of robots during the test also
varies the request rate (just as, in most real-world situations, more
users means more incoming requests from those users). You _can_ vary the
number of robots while keeping the request rate constant, but that
requires more complex Phase configurations.
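
To illustrate the simple case, here is a minimal Python sketch (not PGL,
and not how Polygraph is implemented). It assumes that a populus factor
is the fraction of configured robots that are active and that each
active robot keeps its own request rate; the function and parameter
names are made up for this example.

```python
def simple_request_rate(total_robots, populus_factor, reqs_per_robot_per_min):
    """Total requests/sec when only the robot population is scaled.

    Assumption: populus_factor is the fraction of configured robots
    that are active, and the per-robot rate does not change.
    """
    active_robots = round(total_robots * populus_factor)
    return active_robots * reqs_per_robot_per_min / 60


# Growing the population from 10% to 100% grows the total rate with it.
for factor in (0.1, 0.5, 1.0):
    print(factor, simple_request_rate(5000, factor, 6))
```

Under these assumptions the total rate is proportional to the factor,
which is why keeping the rate constant needs something more than a
population change.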

> Our product is more sensitive to this

It may be important to define what _traffic properties_ the product is
sensitive to: total request rate, total authentication rate, number of
active concurrent connections, number of idle persistent connections,
number of unique IP source addresses, number of unique server addresses,
etc. Most of these traffic properties can be modeled with just one robot
or with 10000 robots.

It is tempting to say "I want to see if my proxy can serve a population
of one million potential users" and then try to create one million
robots representing those potential users, but you will run into
hardware, OS, and Polygraph limitations with that approach because of
the amount of waste it implies. A better strategy is to focus on the
important traffic properties (number of users is not a traffic
property!) and use the minimum/simplest configuration that can model
those properties.

> than the hit rate as our focus is on authentification. We don't do any
> caching or things like that. So is there a possibility to scale users?
> Or maybe I should ask the question in a different way: Is there a way I
> can calculate the number of expected robots from other values like
> Bench.client_side.peak_req_rate, Bench.client_side.max_host_load or
> Bench.client_side.max_agent_load?

If your configuration is reasonable, and you are using Bench and address
scheme objects to configure your test, then the number of robots will be
approximately

   bench.peak_req_rate / bench.client_side.max_agent_load
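
As a rough sketch of that estimate in Python (the function name and the
explicit unit conversion are mine, not Polygraph's, and the real
allocation also depends on the address scheme), using the values from
your configuration:

```python
import math


def estimated_robots(peak_req_rate_per_sec, max_agent_load_per_min):
    """Approximate robot count: peak request rate / per-robot load."""
    # Express the peak rate per minute so both rates use the same unit.
    peak_per_min = peak_req_rate_per_sec * 60
    return math.ceil(peak_per_min / max_agent_load_per_min)


# peak_req_rate = 500/sec, max_agent_load = 6/min
print(estimated_robots(500, 6))  # prints 5000
```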

Depending on the address scheme, Polygraph will allocate robot (and
server) IP addresses a little differently, but all address schemes are
designed to more-or-less spread the load among all your hosts. When that
is not possible, the configuration is usually not reasonable.

It is difficult to define "reasonable" precisely, but you can think of
it as "not self-contradictory" or "not allowing unrealistic
interpretations".

>         peak_req_rate = 500/sec;
> 
>         client_side = {
>                 max_host_load = 20000/sec;
>                 max_agent_load = 6/min;
>                 addr_space = ... ;
>                 hosts = ... ;
>         };
> 

For example, the above configuration tells me that you want a single
client-side host to do up to 20000 requests per second while each robot
on that host will do up to 6 requests per minute. This means you want
Polygraph to create up to 200000 robots on a single host. That may
already exceed hardware and/or socket descriptor/port limits. And then
you tell Polygraph to use 2 robots per IP address:

> PolyMix4As asPolyMix4 = {
>         agents_per_addr = 2;
> };

which would result in up to 100000 IP addresses per host. This is
probably not reasonable for today's operating systems.

Now, your peak_req_rate is just 500/sec so, in the actual test, there
should be fewer robots (about 5000) and fewer IP addresses (about 2500).
However, since max_host_load says that a single host can handle far more
than 500/sec, I expect Polygraph to refuse to use more than one
client-side host even if you have several configured, and that could
cause other problems.
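
The arithmetic behind these numbers can be checked with a short Python
sketch. The helper names are hypothetical, and hosts_needed() is only my
simplified guess at how the peak rate and max_host_load interact, not
Polygraph's actual algorithm:

```python
import math


def per_host_limits(max_host_load_per_sec, max_agent_load_per_min,
                    agents_per_addr):
    """Upper bounds on robots and IP addresses for one client-side host."""
    robots = max_host_load_per_sec * 60 // max_agent_load_per_min
    addrs = math.ceil(robots / agents_per_addr)
    return robots, addrs


def hosts_needed(peak_req_rate_per_sec, max_host_load_per_sec):
    """Assumed model: use as few hosts as the peak rate requires."""
    return math.ceil(peak_req_rate_per_sec / max_host_load_per_sec)


# max_host_load = 20000/sec, max_agent_load = 6/min, agents_per_addr = 2
print(per_host_limits(20000, 6, 2))  # prints (200000, 100000)
# peak_req_rate = 500/sec fits on a single host
print(hosts_needed(500, 20000))      # prints 1
```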

I recommend the following as a starting point:

0. Read the following to understand how Polygraph interprets your
configuration when it comes to allocation of IP addresses and agents:
http://www.web-polygraph.org/docs/userman/address.html

1a. Define theBench object so that all values are true and reasonable.
1b. Do not forget to use() theBench object.

2.  When calculating robot and server addresses, use the SpreadAs address
scheme. It has a more straightforward address allocation algorithm:
http://www.web-polygraph.org/docs/reference/pgl/types.html#type:docs/reference/pgl/types/SpreadAs

3a. If you want to know the details of the PolyMix-4 address allocation
scheme, please see:
http://www.web-polygraph.org/docs/workloads/polymix-4/
http://www.measurement-factory.com/docs/PolyMix-4/#Sect:3

3b. You may also use the pmix3-ips.pl script distributed with Polygraph
to see what PolyMix-3 or PolyMix-4 IPs will be needed for a given
request rate.

If you get strange results, attach your entire workload and console logs
(compress if needed) because the bug could be in how you use Bench and
address scheme objects or in host addresses, and that may not be visible
in a PGL snippet.

HTH,

Alex.