Two questions about Web Polygraph report. Need your help.

Alex Rousskov rousskov at
Tue Jul 22 04:01:33 UTC 2008


    [ Sorry for the delay with this response. It looks like your mailer
or its HTML attachments do not get through to me and others have not
responded. ]

> The Web Polygraph version is 3.1.5.
> In the traffic Stream/Object table, what do "all replies" and "page"
> mean? Is there any relation between the two items?

The "all replies" line corresponds to any response message. In other
words, that line combines statistics for all kinds of HTTP transactions.
Most other lines in the same tables depict stats for a given kind of
response message or a group of messages. The "all replies" line is used
when you want to answer a question like "what is the mean size of an
average response?" or "how long does an average HTTP transaction take?"

The "page" stats accumulate measurements from the multiple HTTP
transactions that work on delivering a single page. There is one
transaction that fetches the markup container and then zero or more
transactions that fetch embedded objects, such as images, referenced in
that container. All these individual HTTP transactions can run in
parallel.

Since some HTTP responses deliver a markup container or embedded
objects, some of the "all replies" stats contribute to "page" stats.
When embedded objects are fetched in parallel with the container or with
each other, the exact relationship between "all replies" and "page"
stats becomes rather complex. HTTP transactions that do not fetch HTML
containers or embedded objects do not contribute to "page" stats at all.
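Because embedded-object transactions overlap in time, a page's response
time is the span from the first request to the last reply, not the sum
of the individual transaction times. A hypothetical sketch (not
Polygraph's actual aggregation code) of that distinction:

```python
# Hypothetical illustration (not Polygraph source): each transaction is a
# (start, finish) pair in seconds. The page spans the container fetch and
# all embedded-object fetches that may run in parallel with it.
def page_duration(transactions):
    starts = [start for start, finish in transactions]
    finishes = [finish for start, finish in transactions]
    return max(finishes) - min(starts)

# A container (0.0-0.3s) plus two images fetched in parallel with it.
txns = [(0.0, 0.3), (0.1, 0.5), (0.1, 0.4)]
print(page_duration(txns))  # 0.5: shorter than summing per-transaction times
```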

> Question 2: From the Web Polygraph side, how do I count the unique
> URLs requested by the Web Polygraph client? I multiply the number of
> times the open Connection state was entered, from the Concurrent
> HTTP/TCP connection level table, by (1 - recurrence), where recurrence
> is a parameter in the robot configuration file. Am I right?

A single connection can be used for many requests, including requests
with repeated URLs, so connection stats will not help you here.

To estimate how many unique URLs were requested, you can compute:

    (1 - recurrence ratio) * number of transactions

You can get the number of transactions from the total count on the "all
replies" line in the report or at the end of the polyclt console output.
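For example, with assumed values of a 55% recurrence ratio and 100,000
total transactions (substitute the numbers from your own Robot
configuration and report):

```python
# Estimate unique URLs from report totals. Both input values below are
# assumed examples, not taken from any particular test.
recurrence = 0.55        # recurrence ratio from the Robot configuration
transactions = 100_000   # total count from the "all replies" line
unique_urls = (1 - recurrence) * transactions
print(round(unique_urls))  # 45000
```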

In production tests, the number of URLs in a working set (i.e., URLs
that may be requested at a given time) is usually more important than
the total number of unique URLs because the working set size is limited
while the total number of unique URLs (and even host names in unreleased
versions!) can grow without limits. To achieve perfect hit ratio, a
cache needs to store (the cachable part of) the working set and not
necessarily all the unique URLs.
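The difference can be sketched as follows; this is a simplified,
hypothetical model of working-set behavior (the `working_set_size` and
`recurrence` values are assumptions, and real Polygraph URL selection is
more elaborate):

```python
import random

# Hypothetical sketch: requests either revisit a URL from the current
# working set (with probability `recurrence`) or mint a fresh URL. The
# working set stays bounded, so a cache holding that many objects can
# serve every recurring request, while the total number of unique URLs
# minted keeps growing without limit.
working_set_size = 1000  # assumed bound on simultaneously "live" URLs
recurrence = 0.55        # assumed recurrence ratio
working_set = []
next_new_id = 0

def next_url():
    global next_new_id
    if working_set and random.random() < recurrence:
        return random.choice(working_set)        # revisit a live URL
    url = f"http://example.com/obj{next_new_id}"  # mint a fresh URL
    next_new_id += 1
    working_set.append(url)
    if len(working_set) > working_set_size:
        working_set.pop(0)                        # oldest URL retires
    return url

for _ in range(5000):
    next_url()
print(len(working_set), next_new_id)  # bounded set vs. growing total
```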
