From PKadziolka at acppharma.pl Tue Aug 26 19:49:35 2008
From: PKadziolka at acppharma.pl (Piotr Kadziolka)
Date: Tue, 26 Aug 2008 21:49:35 +0200
Subject: needed help in interpreting results
Message-ID:

Hi,

Some time ago I ran a series of tests on Squid-2.5STABLE5. I wanted to confirm and verify the results obtained by Duane Wessels in his book about Squid. Duane measured how Squid performance depends on filesystem performance. I ran all tests on the same machine in an identical environment, but I obtained strange results for COSS (coss.png).

Could someone interpret these results? Can anyone explain the periodic peaks in response time? The adjustment algorithm in the workload (attached file pm4-pf2.pg) watches misses and hits and takes action when the value exceeds 3.0 sec or falls below 0.25 sec. The second image (diskd.png), which contains the trace for diskd, shows no similar peaks.

Thanks in advance,
Peter
-------------- next part --------------
A non-text attachment was scrubbed...
Name: diskd.png
Type: application/octet-stream
Size: 7853 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pm4-pf2.pg
Type: application/octet-stream
Size: 5049 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: coss.png
Type: application/octet-stream
Size: 8256 bytes
Desc: not available
URL:

From wessels at measurement-factory.com Tue Aug 26 23:03:22 2008
From: wessels at measurement-factory.com (Duane Wessels)
Date: Tue, 26 Aug 2008 17:03:22 -0600 (MDT)
Subject: needed help in interpreting results
In-Reply-To:
References:
Message-ID: <20080826165613.P47402@measurement-factory.com>

On Tue, 26 Aug 2008, Piotr Kadziolka wrote:

> Hi,
>
> Some time ago I ran a series of tests on Squid-2.5STABLE5. I wanted to
> confirm and verify the results obtained by Duane Wessels in his book
> about Squid. Duane measured how Squid performance depends on filesystem
> performance. I ran all tests on the same machine in an identical
> environment, but I obtained strange results for COSS (coss.png).
>
> Could someone interpret these results? Can anyone explain the periodic
> peaks in response time? The adjustment algorithm in the workload
> (attached file pm4-pf2.pg) watches misses and hits and takes action
> when the value exceeds 3.0 sec or falls below 0.25 sec. The second
> image (diskd.png), which contains the trace for diskd, shows no similar
> peaks.

What's the time span of the test? Any chance that the peaks could be caused by some daily cron job? But then you'd kind of expect to see the effect in the diskd data too.

Also, the way that the peaks take longer to decay makes me think that it's related to Squid filling up rather than a system cron job.

Duane W.

From PKadziolka at acppharma.pl Wed Aug 27 11:34:29 2008
From: PKadziolka at acppharma.pl (Piotr Kadziolka)
Date: Wed, 27 Aug 2008 13:34:29 +0200
Subject: needed help in interpreting results
In-Reply-To: <20080826165613.P47402@measurement-factory.com>
Message-ID:

The test was run on a low-end machine. Compared with today's hard drives I had a very, very small disk, so the test couldn't be long. It took nearly 500 minutes. That is too short a time for cron jobs to affect the results. Besides, no jobs were defined in cron and Squid logging was disabled.

If it was related to Squid filling up, what caused the peaks? I don't know the details of how COSS works, but at first I thought that it could be caused by writing data from the in-memory stripe to disk. But the default stripe size is 1 MB, so the peaks would occur much more frequently. I still have no idea what caused the peaks in the trace. Where can I find documentation about the COSS module?

From adrian at creative.net.au Wed Aug 27 15:09:48 2008
From: adrian at creative.net.au (Adrian Chadd)
Date: Wed, 27 Aug 2008 23:09:48 +0800
Subject: needed help in interpreting results
In-Reply-To:
References: <20080826165613.P47402@measurement-factory.com>
Message-ID: <20080827150948.GD30494@skywalker.creative.net.au>

On Wed, Aug 27, 2008, Piotr Kadziolka wrote:

> The test was run on a low-end machine. Compared with today's hard
> drives I had a very, very small disk, so the test couldn't be long. It
> took nearly 500 minutes. That is too short a time for cron jobs to
> affect the results. Besides, no jobs were defined in cron and Squid
> logging was disabled. If it was related to Squid filling up, what
> caused the peaks? I don't know the details of how COSS works, but at
> first I thought that it could be caused by writing data from the
> in-memory stripe to disk. But the default stripe size is 1 MB, so the
> peaks would occur much more frequently. I still have no idea what
> caused the peaks in the trace. Where can I find documentation about
> the COSS module?

squid/src/fs/coss/

It's not all that well documented, especially the changes that Steven and I made during the early Squid-2.6 lifetime to make it somewhat more useful.

Adrian

From net_engineer at hotmail.com Wed Aug 27 17:24:59 2008
From: net_engineer at hotmail.com (Rafael Vieira)
Date: Wed, 27 Aug 2008 17:24:59 +0000
Subject: Problems with address range while using more the 1 client
Message-ID:

Hello,

I have the following testing topology:

Workload: Polymix 4
2 Clients -> 1 Server

When I connect the first client, it gives me a network range as virtual addresses. When I run the second client, it gives me exactly the same network range. Is that normal? Shouldn't it give the second client a different network range? Does anybody know how to change this abnormal behavior?

Thanks
Rafael Vieira

From net_engineer at hotmail.com Wed Aug 27 19:25:02 2008
From: net_engineer at hotmail.com (Rafael Vieira)
Date: Wed, 27 Aug 2008 19:25:02 +0000
Subject: Problems with "period of innactivity" on server
Message-ID:

I have the following situation:

I have a server in one network and a client in another network. After starting the polymix 4 workload, they seem to be communicating without any problem (nothing is being blocked by my firewall).
After some time (about 15 minutes) the server stops, reporting a 'period of inactivity'. Why would it report inactivity if the client is started and sending packets to the server? Is there anything that can be done to solve this problem?

Thanks
Rafael Vieira

From rousskov at measurement-factory.com Wed Aug 27 21:42:20 2008
From: rousskov at measurement-factory.com (Alex Rousskov)
Date: Wed, 27 Aug 2008 15:42:20 -0600
Subject: Problems with address range while using more the 1 client
In-Reply-To:
References:
Message-ID: <1219873340.6064.64.camel@pail>

On Wed, 2008-08-27 at 17:24 +0000, Rafael Vieira wrote:

> I have the following testing topology:
>
> Workload: Polymix 4
>
> 2 Clients -> 1 Server
>
> When I connect the first client, it gives me a network range as
> virtual addresses. When I run the second client, it gives me exactly
> the same network range.
> Is that normal? Shouldn't it give the second client a different
> network range?

PolyMix robots should use different addresses.

Are you using the same workload file on all drone hosts? You should. Does your workload set the right host (primary) addresses for the two client hosts and the one server host?

If these hints do not help, please post your workload and verbose console log (you should always do that when asking test configuration or interpretation questions).

Thank you,

Alex.

From rousskov at measurement-factory.com Wed Aug 27 21:52:20 2008
From: rousskov at measurement-factory.com (Alex Rousskov)
Date: Wed, 27 Aug 2008 15:52:20 -0600
Subject: Problems with "period of innactivity" on server
In-Reply-To:
References:
Message-ID: <1219873940.6064.72.camel@pail>

On Wed, 2008-08-27 at 19:25 +0000, Rafael Vieira wrote:

> I have a server in one network and a client in another network. After
> starting the polymix 4 workload, they seem to be communicating without
> any problem (nothing is being blocked by my firewall).
>
> After some time (about 15 minutes) the server stops, reporting a
> 'period of inactivity'. Why would it report inactivity if the client
> is started and sending packets to the server?

Bugs notwithstanding, a Polygraph server reports an inactivity timeout when it does not receive any requests and cannot send any responses. You can control the timeout duration using the --idle_tout command line option. The default is 5 minutes.

It is possible that the server was receiving requests but then stopped because the network, the device under test, or the drones are overloaded. It is also possible that the client is misconfigured to send too few requests after the test starts.

Please post your workload and verbose console logs if you still need help with this.

Thank you,

Alex.

From net_engineer at hotmail.com Thu Aug 28 13:26:21 2008
From: net_engineer at hotmail.com (Rafael Vieira)
Date: Thu, 28 Aug 2008 13:26:21 +0000
Subject: Problems with address range while using more the 1 client
Message-ID:

> Are you using the same workload file on all drone hosts? You should.

Yes, I am.

> Does your workload set the right host (primary) addresses for the two
> client hosts and the one server host?

Yes, it does.

> If these hints do not help, please post your workload and verbose
> console log (you should always do that when asking test configuration
> or interpretation questions).

Attached are my workload, my server_side logs and my client_side logs. Thank you in advance.
Rafael Vieira
-------------- next part --------------
A non-text attachment was scrubbed...
Name: polygraph logs and workload.zip
Type: application/x-zip-compressed
Size: 16292 bytes
Desc: not available
URL:

From net_engineer at hotmail.com Thu Aug 28 13:36:03 2008
From: net_engineer at hotmail.com (Rafael Vieira)
Date: Thu, 28 Aug 2008 13:36:03 +0000
Subject: Problems with "period of innactivity" on server
Message-ID:

Thanks for the quick answer.

I don't know where it could be misconfigured. I just decreased the time to platDur = 300sec because we don't want the test to run for many hours. Is there any problem with doing that?

Attached are all relevant files.
Thank you.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: polygraph logs and workload.zip
Type: application/x-zip-compressed
Size: 18169 bytes
Desc: not available
URL:

From rousskov at measurement-factory.com Thu Aug 28 13:46:22 2008
From: rousskov at measurement-factory.com (Alex Rousskov)
Date: Thu, 28 Aug 2008 07:46:22 -0600
Subject: Problems with address range while using more the 1 client
In-Reply-To:
References:
Message-ID: <1219931182.6064.179.camel@pail>

On Thu, 2008-08-28 at 13:26 +0000, Rafael Vieira wrote:

> > Are you using the same workload file on all drone hosts? You should.
>
> Yes, I am.
>
> > Does your workload set the right host (primary) addresses for the
> > two client hosts and the one server host?
>
> Yes, it does.
>
> > If these hints do not help, please post your workload and verbose
> > console log (you should always do that when asking test
> > configuration or interpretation questions).
>
> Attached are my workload, my server_side logs and my client_side logs.

You forgot to attach the console log of the second client, so my response cannot be precise. However, please note that

- your workload contains three client hosts (.45, .46, and .47)
- according to your email, you are using two client hosts
- your peak request rate for the Bench is only enough for one host (each PolyMix-4 client host does up to 500/sec by default)

All of the above have to be in sync for the PolyMix-4 math to work. You can customize the workload to use many other combinations, but PolyMix guts and addressing schemes require that the three items above match.

Polygraph did try to warn you about the discrepancy, but the generic warning is not very clear:

> 000.01| fyi: the number of virtual agent addresses (76) is not divisible by the number of real host addresess (3); will not attempt to create agent addresses

A correct PolyMix setup will not have that warning.

HTH,

Alex.
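
The three items Alex lists can be compared mechanically before starting a run. What follows is a minimal Python sketch and not part of Polygraph: the function names, the ceiling-based estimate of how many hosts a peak rate calls for, and the 300 req/sec peak rate in the usage lines are assumptions made for illustration. Only the 500 req/sec per-host default, the three configured hosts, the two running hosts, and the 76 agent addresses come from the message above.

```python
import math

# From the thread: each PolyMix-4 client host does up to 500 req/sec by default.
DEFAULT_HOST_RATE = 500.0


def hosts_needed(peak_req_rate, per_host_rate=DEFAULT_HOST_RATE):
    """Smallest number of client hosts able to carry the peak request rate.

    Assumption: a plain ceiling division; the real PolyMix-4 math lives in
    the workload files and may differ.
    """
    return max(1, math.ceil(peak_req_rate / per_host_rate))


def check_bench(peak_req_rate, configured_hosts, running_hosts, agent_addresses):
    """Return a list of mismatches; an empty list means the setup is consistent."""
    problems = []
    needed = hosts_needed(peak_req_rate)
    if configured_hosts != needed:
        problems.append(
            f"workload lists {configured_hosts} client hosts, "
            f"but the peak rate calls for {needed}"
        )
    if running_hosts != configured_hosts:
        problems.append(
            f"{running_hosts} client hosts are running, "
            f"but the workload lists {configured_hosts}"
        )
    if agent_addresses % configured_hosts != 0:
        problems.append(
            f"{agent_addresses} agent addresses are not divisible "
            f"by {configured_hosts} hosts"
        )
    return problems


if __name__ == "__main__":
    # Hypothetical peak rate of 300 req/sec; the other numbers are from the thread.
    for problem in check_bench(300, 3, 2, 76):
        print("mismatch:", problem)
```

With the numbers from this thread all three checks fail, which is consistent with the generic divisibility warning being only one symptom of the mismatch.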

From rousskov at measurement-factory.com Thu Aug 28 13:55:12 2008
From: rousskov at measurement-factory.com (Alex Rousskov)
Date: Thu, 28 Aug 2008 07:55:12 -0600
Subject: Problems with "period of innactivity" on server
In-Reply-To:
References:
Message-ID: <1219931712.6064.187.camel@pail>

On Thu, 2008-08-28 at 13:36 +0000, Rafael Vieira wrote:

> I don't know where it could be misconfigured. I just decreased the
> time to platDur = 300sec because we don't want the test to run for
> many hours. Is there any problem with doing that?

No, adjusting platDur should not cause connectivity problems. However, your server is not getting any requests at all and quits after 5 minutes of inactivity:

> 000.02| starting 575 HTTP agents...
> 000.10| i-framp 0 0.00 -1 -1.00 0 575
> 000.19| i-framp 0 0.00 -1 -1.00 0 575
> ...
> 004.94| i-fill 0 0.00 -1 -1.00 0 575
> 005.02| i-fill 0 0.00 -1 -1.00 0 575
> 005.09| was idle for at least 5.00min
> 005.09| got 0 xactions and 0 errors

The server is not getting any requests because your proxy cannot resolve any PolyMix domain names. Here is one of the many error messages logged by polyclt:

> 000.02| Xaction.cc:74: error: 1/1 (c19) unsupported HTTP status code
> 1219929069.023472# obj: http://w1049.h1129o1101s1010.bench.tst/w1255599d.0ca53025:000000ca/t02/_00000001.htm flags: basic,GET, size: 0/-1 xact: 1255599d.0ca53025:00000496
> HTTP/1.0 503 Service Unavailable
> Server: squid/2.5.STABLE14
> Mime-Version: 1.0
> Date: Thu, 28 Aug 2008 13:00:18 GMT
> Content-Type: text/html
> Content-Length: 1355
> Expires: Thu, 28 Aug 2008 13:00:18 GMT
> X-Squid-Error: ERR_DNS_FAIL 0
...

If you run into more problems, consider looking at console logs. They often contain enough information to diagnose the problem. The console format is documented at http://www.web-polygraph.org/docs/reference/output/console.html

HTH,

Alex.
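
When a client console log contains many error dumps like the one Alex quotes, tallying the X-Squid-Error headers is a quick way to see whether a single cause such as ERR_DNS_FAIL dominates. The following is a minimal Python sketch under stated assumptions: the function name and the SAMPLE text are made up for illustration (SAMPLE reuses lines from the excerpt in this thread), and the script only does a plain text search; it is not a Polygraph tool and does not parse the console format.

```python
import re
from collections import Counter

# Matches proxy response header lines such as "X-Squid-Error: ERR_DNS_FAIL 0"
# and captures the error code that follows the header name.
SQUID_ERROR = re.compile(r"X-Squid-Error:\s*(\S+)")


def tally_squid_errors(log_text):
    """Count how often each X-Squid-Error code appears in a console log dump."""
    return Counter(SQUID_ERROR.findall(log_text))


# Illustrative input assembled from the excerpt quoted in this thread.
SAMPLE = """\
000.02| Xaction.cc:74: error: 1/1 (c19) unsupported HTTP status code
HTTP/1.0 503 Service Unavailable
Server: squid/2.5.STABLE14
X-Squid-Error: ERR_DNS_FAIL 0
"""

if __name__ == "__main__":
    for code, count in tally_squid_errors(SAMPLE).most_common():
        print(code, count)
```

To use it on a real run, read the saved polyclt console output into a string and pass it to tally_squid_errors in place of SAMPLE.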