polygraph report and raw data

Dmitry Kurochkin dmitry.kurochkin at measurement-factory.com
Fri Apr 27 16:58:52 UTC 2012

Hi Shahab.

shahab bakhtiyari <shahabb at ifi.uio.no> writes:

> Hi Dmitry
> Thank you very much for your previous reply. I have a couple of
> configuration questions that I am sure are very easy for you to answer!
> 1. What "working set size" and "cache size" should I use for testing a
> 1GB disk cache? RAM is 4GB, but not all of it is used by the proxy (the
> proxy's RAM usage varies from 10% to 30%). I read on the website that
> "cache size" should be the whole proxy box cache plus the disk cache
> (4+1 = 5). But would that be correct, considering that the proxy does
> not use the whole cache?

There is no "cache size" in a PGL workload as such.  It may be a
variable specific to the workload, and its meaning depends on the
workload.  In your workload, the CacheSize variable is used to determine
the fill phase length.  This phase is intended to generate enough
traffic to fill the whole cache multiple times (2 in the workload you
are using).  The CacheSize value should be the total cache size, both
disk and RAM (not the amount of RAM on the system, but the amount of RAM
used by the proxy for caching).
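As a rough illustration of that sizing, the sketch below adds the disk
cache to the RAM the proxy uses for caching and multiplies the sum by
the fill factor of 2 mentioned above.  This is plain arithmetic in
Python, not PGL.  The function name is hypothetical, and treating 30% of
the 4GB RAM as cache is only an assumption based on the upper figure in
the question; measure what your proxy really uses.

```python
GB = 1024 ** 3

def fill_volume_bytes(disk_cache_bytes, ram_cache_bytes, fill_factor=2):
    """Return (total cache size, traffic needed to fill it fill_factor times)."""
    cache_size = disk_cache_bytes + ram_cache_bytes
    return cache_size, cache_size * fill_factor

# Assumption: 1GB disk cache, and the proxy uses about 30% of 4GB RAM for caching.
ram_cache = int(0.30 * 4 * GB)
cache_size, fill_volume = fill_volume_bytes(1 * GB, ram_cache)
print(f"CacheSize ~ {cache_size / GB:.2f} GB, fill traffic ~ {fill_volume / GB:.2f} GB")
```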

> If that is the case and I have to choose 5GB as the cache size, then I
> would need to choose a much larger working set size (maybe 1GB rather
> than the 100MB that I am currently using), right?

First of all, you need to understand what a working set is.  Please read
the working_set_length() PGL function description at [1].  A good value
for WSS to start with is the total cache size expressed as a number of
objects, i.e. cache_size / mean_object_size.  Keep in mind that setting
WSS too high (or leaving it unlimited) makes it impossible for the proxy
to reply with a hit for every offered hit, because Polygraph would
request objects that have already been replaced with newer ones in the
cache.  Setting WSS too small (like 100MB for a 5GB cache) may make it
"easier" for the proxy to reply with hits for every offered hit, because
it needs to cache just a small portion of recent objects.  Both
situations are bad and may lead to inaccurate test results.
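
The starting value above can be worked out as follows.  This is a
minimal Python sketch of the cache_size / mean_object_size division,
not PGL.  The 10KB mean object size is a made-up placeholder; use the
mean object size of your own workload.

```python
KB = 1024
GB = 1024 ** 3

def wss_objects(cache_size_bytes, mean_object_size_bytes):
    """Working set size in objects: how many mean-sized objects fit in the cache."""
    return cache_size_bytes // mean_object_size_bytes

# Assumption: 5GB total cache and a hypothetical 10KB mean object size.
print(wss_objects(5 * GB, 10 * KB))
```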

> 2. Could you please have a look here
> http://client.servebeer.com:8081/varnishYkGcache/ , at the config files
> and the error section, and tell me which part of the config I have to
> modify?

An error does not mean that there is something wrong with your workload.
It depends on your test and your environment.  E.g.  connection errors
may mean that your system configuration should be tuned, or that your
proxy has an issue, or that the request rate is just too high.

You should take a look at each error and investigate it.  Try to
identify what triggers an error (e.g. a particular request header) and
reproduce it with the simplest workload possible or even with a tool
like curl(1).  Polygraph should print a dump of the HTTP headers on the
console.
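
One possible way to replay a suspicious request is to turn the dumped
headers into a curl command line.  The Python helper below is only a
sketch: the function name, the proxy address, the URL and the header are
placeholders, not values from your test.  It relies on the standard curl
options -v (verbose), -x (proxy) and -H (extra request header).

```python
import shlex

def curl_command(url, proxy, headers):
    """Build a shell-safe curl command that sends the given headers via the proxy."""
    cmd = ["curl", "-v", "-x", proxy]
    for name, value in headers.items():
        cmd += ["-H", f"{name}: {value}"]
    cmd.append(url)
    return shlex.join(cmd)

# Placeholders: replace with the proxy, URL and headers from the console dump.
print(curl_command("http://example.com/", "http://127.0.0.1:8080",
                   {"Cache-Control": "no-cache"}))
```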

Some errors are very likely to be proxy issues, e.g. checksum mismatch.
You may want to fix the proxy or disable checksums in your workload.


> Thank you very much in advance
> Shahab
> On 26 March 2012 12:22, shahab bakhtiyari <shahabb at ifi.uio.no> wrote:
>> Hi guys
>> I have 2 questions and would really appreciate it if somebody could
>> help me.
>> 1. How can I get the raw data from polygraph? I mean the data that
>> "polygraph-reporter" gives to gnuplot to plot (I don't really like
>> gnuplot!!!)
>> 2. I currently have my setup in a private network with private IP
>> addresses (and of course have access to the internet as well). I am
>> wondering whether it is possible to add some more clients from Amazon
>> instances, since I only have 2 physical clients with a limited number
>> of robots?
>> Best regards
>> Shahab