Client side stats SSL workload 4.3.2

Michael Hendrie pmix at hendrie.id.au
Fri May 17 08:11:46 UTC 2013


On 17/05/2013, at 2:29 AM, Alex Rousskov <rousskov at measurement-factory.com> wrote:

> On 05/16/2013 03:22 AM, Dmitry Kurochkin wrote:
>> Hi Michael.
>> 
>> Michael Hendrie <pmix at hendrie.id.au> writes:
>> 
>>> Hi List,
>>> 
>>> I recently updated from Web-Polygraph 4.0.10 running on FreeBSD to
>>> Web-Polygraph 4.3.2 running on Linux (RHEL 6.1 clone) and now have some
>>> inconsistency being reported in the client side runtime stats.
>>> 
>>> The issue only seems to occur when polygraph-client is configured to send
>>> requests via a proxy.  In the client side stats, I'm seeing a significantly
>>> higher request rate than what was configured; the server side stats report
>>> as expected.  For example, on a test configured for 250 req/sec:
>>> 
>>> Client Side Stats:
>>> 030.41| i-top1 786670 421.83   1357   0.00   0 1671
>>>           ssl 448598 421.83   1360   0.00   0 1671
>>> 030.50| i-top1 788934 452.78   1379   0.00   0 1668
>>>           ssl 449857 452.78   1379   0.00   0 1668
>>> 030.58| i-top1 791229 458.98   1465   0.00   0 1660
>>>           ssl 451193 458.98   1465   0.00   0 1660
>>> 030.66| i-top1 793401 434.40   1461   0.00   0 1687
>>>           ssl 452449 434.40   1463   0.00   0 1687
>>> 030.75| i-top1 795476 415.02   1401   0.00   0 1723
>>>           ssl 453636 415.02   1402   0.00   0 1721
>>> 
>>> Server Side Stats:
>>> 030.45| i-top1 426331 227.03   2518   0.00   0 2200
>>>           ssl 426331 227.03   2518   0.00   0 1637
>>> 030.54| i-top1 427519 237.59   2539   0.00   0 2201
>>>           ssl 427519 237.59   2539   0.00   0 1638
>>> 030.62| i-top1 428710 238.19   2467   0.00   0 2171
>>>           ssl 428710 238.19   2468   0.00   0 1608
>>> 030.70| i-top1 429962 250.40   2503   0.00   0 2300
>>>           ssl 429962 250.40   2502   0.00   0 1737
>>> 
>>> 
>>> I didn't see this inconsistency in 4.0.10 and if I have a direct
>>> client->server test (no proxy) then stats are also reported correctly.
>>> 
>>> The proxy itself (squid) is reporting around 190 client req/sec:
>>> client_http.requests = 188.674032/sec
>>> 
>>> Having just read the changelog (after typing the above), I'm now wondering
>>> whether the following fix in the non-public 4.6.0 would resolve this issue?
>>> 
>>> *
>>> 
>>> - Make retried transactions obey request inter-arrival
>>>  distribution and, hence, Robot::req_rate setting.
>>> 
>>> *
>>> 
>>> Any thoughts/ideas?
>>> 
>> 
>> Since v4.2.0 CONNECT tunnel establishment is treated as a separate
>> transaction:
>> 
>> 	- Treat an HTTP CONNECT response as the end of the transaction.
>> 	  Collect and report stats dedicated to these new CONNECT-only
>> 	  transactions. These CONNECT transactions are often rather
>> 	  different from the in-tunnel transactions that follow.
>> 	  Isolating them also prevents CONNECT overheads from polluting
>> 	  the first in-tunnel transaction stats.
>> 
>> So an HTTPS GET transaction over a proxy is counted as two transactions on
>> the client side: CONNECT and GET.
>> 
>> Detailed stats are available in the HTML report or can be extracted from
>> the binary log with the lx tool.
>> 
>> The behavior is the same on all newer Polygraph versions, so upgrading
>> would not help.
> 
> .... but there are plans to make baseline reporting match configuration
> better when dealing with what we call "compound" transactions. The
> overall principle behind the planned changes is to have reported
> baseline request rate (i-lines in the console output) match the
> configured Bench::peak_req_rate (or equivalent), regardless of whether
> the workload involves CONNECT requests and/or authentication.
> 
> We will also report CONNECT (and authentication) activity separately,
> just like we report ssl activity separately today.
> 
> 
> We hope the above changes will reduce surprises caused by compound
> transactions involving multiple HTTP requests.
> 
> 

OK, so the numbers stack up then: squid is reporting ~190 req/sec (it only sees the CONNECTs, not the GETs that pass through the tunnel), and with peak_req_rate configured for 250 the ~430 req/sec being reported client side sounds about right.
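
Just to spell out the arithmetic for the archives (assuming roughly one CONNECT per tunnel, at about the rate squid sees):

    client-side i-line rate ~= configured GET rate + CONNECT rate
                            ~= 250/sec + ~190/sec
                            ~= 440/sec

which is in the same ballpark as the ~415-460 req/sec shown in the client side stats above.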

Unfortunately that means I have another problem; I was hoping the above could be related to what I'm seeing.

If I run a direct test (no proxy involved), it will go for hours and never miss a beat (the last one ran for 15 hours).  If I use the same workload but introduce a proxy with the --proxy switch, the test always bombs out somewhere between the 2 and 3 hour mark (I have tried at least 5 different times), even though none of the devices (including the proxy) are reporting high load or resource exhaustion.  Before it bombs, here's what the client console says:

136.74| i-top1 3588097 406.69   1583   0.00   0 1681
           ssl 2048966 406.69   1584   0.00   0 1679
136.83| i-top1 3590106 400.26   1600   0.00   0 1659
           ssl 2050132 400.26   1602   0.00   0 1633
136.91| i-top1 3592036 364.13   1695   0.00   0 1719
           ssl 2051152 364.13   1696   0.00   0 1653
137.04| PolyApp.cc:189: error: 1/1 (c58) internal timers may be getting behind
137.04| record level of timing drift: 2.48sec; last check was 6.45sec ago
137.27| i-top1 3592779  34.36  11075   0.00   0 1609
           ssl 2051667  34.36  11075   0.00   0 1102
137.30| PolyApp.cc:189: error: 2/2 (c58) internal timers may be getting behind
137.30| record level of timing drift: 15.40sec; last check was 15.37sec ago
137.40| Connection.cc:581: error: 1/3 (s32) Broken pipe
137.40| error: raw close on SSL connection failed
137.40| connection between 172.17.20.35:31263 and 172.16.0.13:8080 failed at 3 reads, 5 writes, and 2 xacts
137.40| Connection.cc:581: error: 2/4 (s32) Broken pipe
137.40| error: raw close on SSL connection failed
137.40| connection between 172.17.20.173:43974 and 172.16.0.13:8080 failed at 4 reads, 6 writes, and 3 xacts
137.41| Connection.cc:581: error: 4/6 (s32) Broken pipe
137.41| error: raw close on SSL connection failed
137.41| connection between 172.17.20.5:42461 and 172.16.0.13:8080 failed at 3 reads, 5 writes, and 2 xacts
137.41| Connection.cc:581: error: 8/10 (s32) Broken pipe
137.41| error: raw close on SSL connection failed
137.41| connection between 172.17.20.245:35371 and 172.16.0.13:8080 failed at 3 reads, 5 writes, and 2 xacts
137.41| Connection.cc:581: error: 16/18 (s32) Broken pipe
137.41| error: raw close on SSL connection failed
137.41| connection between 172.17.21.32:13088 and 172.16.0.13:8080 failed at 4 reads, 6 writes, and 3 xacts
137.43| i-top1 3593243  53.18  25644   0.00   0 1533
           ssl 2051843  53.18  25644   0.00   0  769
137.44| i-top1 3593243   0.00     -1  -1.00   0 1895
           ssl 2051843   0.00     -1  -1.00   0  748
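
In case it's relevant, the two runs differ only by the proxy switch on the client command line; roughly like this (the workload file name is just a placeholder, and I've left out the other options):

    # direct run: happily runs for 15+ hours
    polygraph-client --config workload.pg

    # proxied run: bombs out between the 2 and 3 hour mark
    polygraph-client --config workload.pg --proxy 172.16.0.13:8080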


Any ideas on this one?

Michael

