From morakhad at cisco.com  Tue Aug 2 13:51:09 2011
From: morakhad at cisco.com (Mohammed Rakhada)
Date: Tue, 02 Aug 2011 14:51:09 +0100
Subject: Incorrect throughput being seen in report and differing PGL configs
Message-ID: <1312293069.2165.34.camel@localhost>

Hello,

I am using Web Polygraph version 4.4.0 and running against a proxy server
using 8 clients and 8 servers. After the report is generated I see the
following data:

label:             Tue Aug 2 13:52:51 BST 2011
throughput:        4196.00xact/sec or 608.43Mbits/sec
response time:     49msec mean
hit ratios:        18.86% DHR and 6.88% BHR
unique URLs:       820565xact (35.40% recurrence)
errors:            0.00% (8xact out of 1270224xact)
duration:          5.05min
start time:        Tue, 02 Aug 2011 12:53:18 GMT
workload:          available
Polygraph version: 4.4.0
reporter version:  4.4.0

However, when I look at the switch statistics, the number reported is much
lower (360Mbits/sec).

Could you clarify what the throughput value actually relates to?

I am also seeing the following message when trying to run
polygraph-reporter. Could you help me avoid this error?

PGL configuration in /opt/stress/scripts/tmp/sss-101-strs_1312291541.log
differs from the one in /opt/stress/scripts/tmp/sss-103-strs_1312291541.log

All the configs are identical apart from the "use" line at the bottom,
which I generate to pair up my servers and clients. The attached file
192.168.29.101.polygraph.pg is an example of this line; each server/client
pair has the same line, but the next pair has a different one.

Attached are my Polygraph files to help you understand my setup.

Please let me know if you require further information or clarification.

Thanks in advance.

Mohammed Rakhada
Systems Administrator
Cisco Ltd
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 192.168.29.101.polygraph.pg
Type: text/x-csrc
Size: 5610 bytes
Desc: not available
-------------- next part --------------
Robot sss101strs = {
    kind = "sss101strs";
    interests = [ "public": 30%, "foreign" ];
    //user traces
    foreign_trace = "/opt/home/mrakhada/t53.urls.httponly.ports";
    pop_model = { pop_distr = popUnif(); };
    recurrence = 25% / cntImage.cachable;
    req_rate = 1/sec;
    //origins = M.names;
    origins = [ '192.168.30.202:8080' ];
    http_proxies = [ '192.168.30.50:8080' ];
    //origins = S.addresses;
    addresses = [ '192.168.30.201' ** 1000 ];
};

Robot sss103strs = {
    kind = "sss103strs";
    interests = [ "public": 30%, "foreign" ];
    //user traces
    foreign_trace = "/opt/home/mrakhada/t53.urls.httponly.ports";
    pop_model = { pop_distr = popUnif(); };
    recurrence = 25% / cntImage.cachable;
    req_rate = 1/sec;
    //origins = M.names;
    origins = [ '192.168.30.204:8080' ];
    http_proxies = [ '192.168.30.50:8080' ];
    //origins = S.addresses;
    addresses = [ '192.168.30.203' ** 1000 ];
};

Robot sss105strs = {
    kind = "sss105strs";
    interests = [ "public": 30%, "foreign" ];
    //user traces
    foreign_trace = "/opt/home/mrakhada/t53.urls.httponly.ports";
    pop_model = { pop_distr = popUnif(); };
    recurrence = 25% / cntImage.cachable;
    req_rate = 1/sec;
    //origins = M.names;
    origins = [ '192.168.30.206:8080' ];
    http_proxies = [ '192.168.30.60:8080' ];
    //origins = S.addresses;
    addresses = [ '192.168.30.205' ** 1000 ];
};

Robot sss107strs = {
    kind = "sss107strs";
    interests = [ "public": 30%, "foreign" ];
    //user traces
    foreign_trace = "/opt/home/mrakhada/t53.urls.httponly.ports";
    pop_model = { pop_distr = popUnif(); };
    recurrence = 25% / cntImage.cachable;
    req_rate = 1/sec;
    //origins = M.names;
    origins = [ '192.168.30.208:8080' ];
    http_proxies = [ '192.168.30.60:8080' ];
    //origins = S.addresses;
    addresses = [ '192.168.30.207' ** 1000 ];
};

Robot sss109strs = {
    kind = "sss109strs";
    interests = [ "public": 30%, "foreign" ];
    //user traces
    foreign_trace = "/opt/home/mrakhada/t53.urls.httponly.ports";
    pop_model = { pop_distr = popUnif(); };
    recurrence = 25% / cntImage.cachable;
    req_rate = 1/sec;
    //origins = M.names;
    origins = [ '192.168.30.210:8080' ];
    http_proxies = [ '192.168.30.70:8080' ];
    //origins = S.addresses;
    addresses = [ '192.168.30.209' ** 1000 ];
};

Robot sss111strs = {
    kind = "sss111strs";
    interests = [ "public": 30%, "foreign" ];
    //user traces
    foreign_trace = "/opt/home/mrakhada/t53.urls.httponly.ports";
    pop_model = { pop_distr = popUnif(); };
    recurrence = 25% / cntImage.cachable;
    req_rate = 1/sec;
    //origins = M.names;
    origins = [ '192.168.30.212:8080' ];
    http_proxies = [ '192.168.30.70:8080' ];
    //origins = S.addresses;
    addresses = [ '192.168.30.211' ** 1000 ];
};

Robot sss113strs = {
    kind = "sss113strs";
    interests = [ "public": 30%, "foreign" ];
    //user traces
    foreign_trace = "/opt/home/mrakhada/t53.urls.httponly.ports";
    pop_model = { pop_distr = popUnif(); };
    recurrence = 25% / cntImage.cachable;
    req_rate = 1/sec;
    //origins = M.names;
    origins = [ '192.168.30.214:8080' ];
    http_proxies = [ '192.168.30.80:8080' ];
    //origins = S.addresses;
    addresses = [ '192.168.30.213' ** 1000 ];
};

Robot sss115strs = {
    kind = "sss115strs";
    interests = [ "public": 30%, "foreign" ];
    //user traces
    foreign_trace = "/opt/home/mrakhada/t53.urls.httponly.ports";
    pop_model = { pop_distr = popUnif(); };
    recurrence = 25% / cntImage.cachable;
    req_rate = 1/sec;
    //origins = M.names;
    origins = [ '192.168.30.216:8080' ];
    http_proxies = [ '192.168.30.80:8080' ];
    //origins = S.addresses;
    addresses = [ '192.168.30.215' ** 1000 ];
};

Server sss102strs = {
    kind = "sss102strs";
    contents = [ cntJPG: 40%, cntGIF: 45%, cntPNG: 14%, cntPDF: 1% ];
    direct_access = contents;
    //addresses = M.addresses;
    addresses = [ '192.168.30.202:8080', '192.168.30.202:80' ];
};

Server sss104strs = {
    kind = "sss104strs";
    contents = [ cntJPG: 40%, cntGIF: 45%, cntPNG: 14%, cntPDF: 1% ];
    direct_access = contents;
    //addresses = M.addresses;
    addresses = [ '192.168.30.204:8080', '192.168.30.204:80' ];
};

Server sss106strs = {
    kind = "sss106strs";
    contents = [ cntJPG: 40%, cntGIF: 45%, cntPNG: 14%, cntPDF: 1% ];
    direct_access = contents;
    //addresses = M.addresses;
    addresses = [ '192.168.30.206:8080', '192.168.30.206:80' ];
};

Server sss108strs = {
    kind = "sss108strs";
    contents = [ cntJPG: 40%, cntGIF: 45%, cntPNG: 14%, cntPDF: 1% ];
    direct_access = contents;
    //addresses = M.addresses;
    addresses = [ '192.168.30.208:8080', '192.168.30.208:80' ];
};

Server sss110strs = {
    kind = "sss110strs";
    contents = [ cntJPG: 40%, cntGIF: 45%, cntPNG: 14%, cntPDF: 1% ];
    direct_access = contents;
    //addresses = M.addresses;
    addresses = [ '192.168.30.210:8080', '192.168.30.210:80' ];
};

Server sss112strs = {
    kind = "sss112strs";
    contents = [ cntJPG: 40%, cntGIF: 45%, cntPNG: 14%, cntPDF: 1% ];
    direct_access = contents;
    //addresses = M.addresses;
    addresses = [ '192.168.30.212:8080', '192.168.30.212:80' ];
};

Server sss114strs = {
    kind = "sss114strs";
    contents = [ cntJPG: 40%, cntGIF: 45%, cntPNG: 14%, cntPDF: 1% ];
    direct_access = contents;
    //addresses = M.addresses;
    addresses = [ '192.168.30.214:8080', '192.168.30.214:80' ];
};

Server sss116strs = {
    kind = "sss116strs";
    contents = [ cntJPG: 40%, cntGIF: 45%, cntPNG: 14%, cntPDF: 1% ];
    direct_access = contents;
    //addresses = M.addresses;
    addresses = [ '192.168.30.216:8080', '192.168.30.216:80' ];
};

From dmitry.kurochkin at measurement-factory.com  Tue Aug 2 17:15:29 2011
From: dmitry.kurochkin at measurement-factory.com (Dmitry Kurochkin)
Date: Tue, 02 Aug 2011 21:15:29 +0400
Subject: Incorrect throughput being seen in report and differing PGL configs
In-Reply-To: <1312293069.2165.34.camel@localhost>
References: <1312293069.2165.34.camel@localhost>
Message-ID: <87oc074tvy.fsf@gmail.com>

Hi.

On Tue, 02 Aug 2011 14:51:09 +0100, Mohammed Rakhada wrote:
> Hello,
>
> I am using Web Polygraph version 4.4.0 and running against a proxy server
> using 8 clients and 8 servers. After the report is generated I see the
> following data:
>
> label:             Tue Aug 2 13:52:51 BST 2011
> throughput:        4196.00xact/sec or 608.43Mbits/sec
> response time:     49msec mean
> hit ratios:        18.86% DHR and 6.88% BHR
> unique URLs:       820565xact (35.40% recurrence)
> errors:            0.00% (8xact out of 1270224xact)
> duration:          5.05min
> start time:        Tue, 02 Aug 2011 12:53:18 GMT
> workload:          available
> Polygraph version: 4.4.0
> reporter version:  4.4.0
>
> However, when I look at the switch statistics, the number reported is
> much lower (360Mbits/sec).
>
> Could you clarify what the throughput value actually relates to?
>

Throughput on the index page of the HTML report is client-side reply
throughput, i.e. (size of all replies the clients received) / (test
duration). It does not include requests or replies sent by the servers.

I am not sure why you see lower throughput stats on the switch. You may get
wrong stats in the reporter if you specify a single log multiple times on
the command line. I do not think it is likely, but this may be a bug in the
Polygraph reporter or client.

If you believe the Polygraph stats are wrong, I recommend you start by
checking that the throughput in the reporter is calculated correctly from
the binary logs.
Make sure you do not specify any log twice in the reporter parameters. Try
generating a report for a single log: the throughput for multiple logs
should equal the sum of the per-log throughputs. You may also send us the
Polygraph binary logs for investigation.

> I am also seeing the following message when trying to run
> polygraph-reporter. Could you help me avoid this error?
>
> PGL configuration in /opt/stress/scripts/tmp/sss-101-strs_1312291541.log
> differs from the one in
> /opt/stress/scripts/tmp/sss-103-strs_1312291541.log
>
> All the configs are identical apart from the "use" line at the bottom,
> which I generate to pair up my servers and clients. The attached file
> 192.168.29.101.polygraph.pg is an example of this line; each
> server/client pair has the same line, but the next pair has a different
> one.
>

To get rid of the warning you should use the same workload for all
Polygraph client and server processes. That should be simple for your
current workload: use a single PGL Robot and a single Server, with
addresses, origins, and http_proxies set to the lists of all addresses you
need. E.g.:

Robot R = {
    ...
    origins = [ all server addresses ];
    addresses = [ all client addresses ];
    http_proxies = [ all HTTP proxy addresses ];
};

When a Polygraph client (server) starts, it checks the network interfaces
configured on the host and starts only those Robots (Servers) that use a
locally configured address. So when you run such a workload on different
hosts, agents with different addresses are started, which is what you want,
I guess. This would also result in all Robots making requests to all
Servers (with your current workload, Robots running on a given host always
make requests to a single Server through a single HTTP proxy).

Note: you may copy PGL objects to avoid setting the same properties
multiple times, e.g.:

Robot Base = {
    // common settings
    ...
};

Robot R1 = Base;
R1.req_types = ...; // set R1-specific properties
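Putting those two ideas together, a single shared workload for your setup
might look roughly like the sketch below. It is untested: I have only
merged the address lists from your attached .pg file, and the object names
and "kind" labels are arbitrary, so treat it as a starting point rather
than a drop-in replacement.

Robot AllClients = {
    kind = "strs-clt";
    interests = [ "public": 30%, "foreign" ];
    foreign_trace = "/opt/home/mrakhada/t53.urls.httponly.ports";
    pop_model = { pop_distr = popUnif(); };
    recurrence = 25% / cntImage.cachable;
    req_rate = 1/sec;

    // all origin server addresses from the per-host configs
    origins = [
        '192.168.30.202:8080', '192.168.30.204:8080',
        '192.168.30.206:8080', '192.168.30.208:8080',
        '192.168.30.210:8080', '192.168.30.212:8080',
        '192.168.30.214:8080', '192.168.30.216:8080'
    ];

    // all HTTP proxies
    http_proxies = [
        '192.168.30.50:8080', '192.168.30.60:8080',
        '192.168.30.70:8080', '192.168.30.80:8080'
    ];

    // all robot addresses; each host only starts the Robots whose
    // addresses are configured locally
    addresses = [
        '192.168.30.201' ** 1000, '192.168.30.203' ** 1000,
        '192.168.30.205' ** 1000, '192.168.30.207' ** 1000,
        '192.168.30.209' ** 1000, '192.168.30.211' ** 1000,
        '192.168.30.213' ** 1000, '192.168.30.215' ** 1000
    ];
};

Server AllServers = {
    kind = "strs-srv";
    contents = [ cntJPG: 40%, cntGIF: 45%, cntPNG: 14%, cntPDF: 1% ];
    direct_access = contents;

    // all server listening addresses; again, each host only starts the
    // Servers whose addresses are configured locally
    addresses = [
        '192.168.30.202:8080', '192.168.30.202:80',
        '192.168.30.204:8080', '192.168.30.204:80',
        '192.168.30.206:8080', '192.168.30.206:80',
        '192.168.30.208:8080', '192.168.30.208:80',
        '192.168.30.210:8080', '192.168.30.210:80',
        '192.168.30.212:8080', '192.168.30.212:80',
        '192.168.30.214:8080', '192.168.30.214:80',
        '192.168.30.216:8080', '192.168.30.216:80'
    ];
};

With a single Robot and a single Server like this, every host can share the
same workload file and the same use(...) line, so the "PGL configuration
differs" warning should go away.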
Regards,
  Dmitry

> Attached are my Polygraph files to help you understand my setup.
>
> Please let me know if you require further information or clarification.
>
> Thanks in advance.
>
> Mohammed Rakhada
> Systems Administrator
> Cisco Ltd

From rousskov at measurement-factory.com  Thu Aug 4 14:12:42 2011
From: rousskov at measurement-factory.com (Alex Rousskov)
Date: Thu, 04 Aug 2011 08:12:42 -0600
Subject: Incorrect throughput being seen in report and differing PGL configs
In-Reply-To: <87oc074tvy.fsf@gmail.com>
References: <1312293069.2165.34.camel@localhost> <87oc074tvy.fsf@gmail.com>
Message-ID: <4E3AA8DA.9090201@measurement-factory.com>

On 08/02/2011 11:15 AM, Dmitry Kurochkin wrote:
> On Tue, 02 Aug 2011 14:51:09 +0100, Mohammed Rakhada wrote:
> >> throughput: 4196.00xact/sec or 608.43Mbits/sec
> >>
> >> however when I look at the switch statistics the number reported is
> >> much lower (360Mbits/sec)
> >>
> >> Could you clarify what the throughput value actually relates to?
> >>
>
> Throughput on the index page of the HTML report is client-side reply
> throughput, i.e. (size of all replies the clients received) / (test
> duration). It does not include requests or replies sent by the servers.
>
> I am not sure why you see lower throughput stats on the switch. You may
> get wrong stats in the reporter if you specify a single log multiple
> times on the command line. I do not think it is likely, but this may be
> a bug in the Polygraph reporter or client.

Another possibility here is that the switch is counting traffic volumes
over a longer (or shorter!) period of time, while the Polygraph reporter
uses the response volume during the specified test phase(s).

If you want to investigate this further, running longer tests with fixed
response sizes while looking at the runtime switch stats may be useful.
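For the fixed-size part, a PGL Content definition along these lines might
do (an untested sketch; the cntFixed name and the 16KB value are arbitrary,
and const() is meant to be the constant-value distribution):

Content cntFixed = {
    kind = "fixed";
    size = const(16KB); // every response body would be exactly 16KB
    cachable = 80%;
};

// and in the Server definition:
// contents = [ cntFixed ];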
HTH.

Alex.

From morakhad at cisco.com  Thu Aug 4 14:22:42 2011
From: morakhad at cisco.com (Mohammed Rakhada)
Date: Thu, 04 Aug 2011 15:22:42 +0100
Subject: Incorrect throughput being seen in report and differing PGL configs
In-Reply-To: <4E3AA8DA.9090201@measurement-factory.com>
References: <1312293069.2165.34.camel@localhost> <87oc074tvy.fsf@gmail.com>
 <4E3AA8DA.9090201@measurement-factory.com>
Message-ID: <1312467762.2165.41.camel@localhost>

Hello Alex,

Thanks for your reply; there isn't a problem here. It was my
interpretation of the data. I didn't realise the throughput was averaged
across the length of the test, so in my test, as there was a slow ramp
period (and it was a short test), it was giving me the "middle" value.
All looks fine and as expected.

Thanks to Dmitry for his initial input; I meant to reply to him earlier
but was still verifying some of my tests and results.

What would be nice is a maximum and a 95th-percentile measurement so that
we can see what the actual measured peak was.

As for the hosts and differing configs, I've implemented the changes
Dmitry suggested and they work fine.

Thanks

Mohammed Rakhada

On Thu, 2011-08-04 at 08:12 -0600, Alex Rousskov wrote:
> On 08/02/2011 11:15 AM, Dmitry Kurochkin wrote:
> > On Tue, 02 Aug 2011 14:51:09 +0100, Mohammed Rakhada wrote:
> >> throughput: 4196.00xact/sec or 608.43Mbits/sec
> >>
> >> however when I look at the switch statistics the number reported is
> >> much lower (360Mbits/sec)
> >>
> >> Could you clarify what the throughput value actually relates to?
> >
> > Throughput on the index page of the HTML report is client-side reply
> > throughput, i.e. (size of all replies the clients received) / (test
> > duration). It does not include requests or replies sent by the
> > servers.
> >
> > I am not sure why you see lower throughput stats on the switch. You
> > may get wrong stats in the reporter if you specify a single log
> > multiple times on the command line. I do not think it is likely, but
> > this may be a bug in the Polygraph reporter or client.
>
> Another possibility here is that the switch is counting traffic volumes
> over a longer (or shorter!) period of time, while the Polygraph
> reporter uses the response volume during the specified test phase(s).
>
> If you want to investigate this further, running longer tests with
> fixed response sizes while looking at the runtime switch stats may be
> useful.
>
> HTH.
>
> Alex.
-------------- next part --------------
An HTML attachment was scrubbed...

From dmitry.kurochkin at measurement-factory.com  Thu Aug 4 18:13:14 2011
From: dmitry.kurochkin at measurement-factory.com (Dmitry Kurochkin)
Date: Thu, 04 Aug 2011 22:13:14 +0400
Subject: Incorrect throughput being seen in report and differing PGL configs
In-Reply-To: <1312467762.2165.41.camel@localhost>
References: <1312293069.2165.34.camel@localhost> <87oc074tvy.fsf@gmail.com>
 <4E3AA8DA.9090201@measurement-factory.com>
 <1312467762.2165.41.camel@localhost>
Message-ID: <87ei11gi4l.fsf@gmail.com>

Hi Mohammed.

On Thu, 04 Aug 2011 15:22:42 +0100, Mohammed Rakhada wrote:
> Hello Alex,
>
> Thanks for your reply; there isn't a problem here. It was my
> interpretation of the data. I didn't realise the throughput was averaged
> across the length of the test, so in my test, as there was a slow ramp
> period (and it was a short test), it was giving me the "middle" value.
> All looks fine and as expected.

Good to hear.

> Thanks to Dmitry for his initial input; I meant to reply to him earlier
> but was still verifying some of my tests and results.
>
> What would be nice is a maximum and a 95th-percentile measurement so
> that we can see what the actual measured peak was.

It may not be exactly what you want, but take a look at the "load trace"
plot on the "traffic rates, counts, and volumes" page. Also, the
"everything" page has stats for individual objects, e.g. "Object 'hits
and misses'".

If you are interested in "raw" stats, you can get them with the
polygraph-lx(1) and polygraph-ltrace(1) tools. E.g.:

$ ltrace --win_len 30sec --side clt --objects rep.rate LOG

would give you reply rate stats with a 30sec interval. The Polygraph stats
cycle length is 5sec by default and can be changed with the --stats_cycle
option.

Regards,
  Dmitry

> As for the hosts and differing configs, I've implemented the changes
> Dmitry suggested and they work fine.
>
> Thanks
>
> Mohammed Rakhada
>
> On Thu, 2011-08-04 at 08:12 -0600, Alex Rousskov wrote:
> > On 08/02/2011 11:15 AM, Dmitry Kurochkin wrote:
> > > On Tue, 02 Aug 2011 14:51:09 +0100, Mohammed Rakhada wrote:
> > >> throughput: 4196.00xact/sec or 608.43Mbits/sec
> > >>
> > >> however when I look at the switch statistics the number reported is
> > >> much lower (360Mbits/sec)
> > >>
> > >> Could you clarify what the throughput value actually relates to?
> > >
> > > Throughput on the index page of the HTML report is client-side reply
> > > throughput, i.e. (size of all replies the clients received) / (test
> > > duration). It does not include requests or replies sent by the
> > > servers.
> > >
> > > I am not sure why you see lower throughput stats on the switch. You
> > > may get wrong stats in the reporter if you specify a single log
> > > multiple times on the command line. I do not think it is likely, but
> > > this may be a bug in the Polygraph reporter or client.
> >
> > Another possibility here is that the switch is counting traffic
> > volumes over a longer (or shorter!) period of time, while the
> > Polygraph reporter uses the response volume during the specified test
> > phase(s).
> >
> > If you want to investigate this further, running longer tests with
> > fixed response sizes while looking at the runtime switch stats may be
> > useful.
> >
> > HTH.
> >
> > Alex.
>
> _______________________________________________
> Users mailing list
> Users at web-polygraph.org
> http://www.web-polygraph.org/mailman/listinfo/users