Huge Domain List

unjc email unjc.email at gmail.com
Fri Apr 20 03:28:06 UTC 2012


Hi Alex,

I have a shorter list of 5000 domains that I usually test with.  It
works pretty well with a "poisoned" DNS server.  However, I found
that 5000 domains are still not quite enough to stress the
application.  That's why I want to run with a much longer list of
domains.

If Web Polygraph manages to run with the 30K-domain list, this will
likely become a stress test that I run regularly.  As you said, a
25-minute startup time is not very efficient for such a test.  Do you
know of any way to shorten the preparation time?  Would a sorted list
help?
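
For reference, below is a minimal sketch of how the file for the
#include trick quoted further down could be generated from a plain
list of domains.  It assumes one domain per input line, and it assumes
the included file should contain one single-quoted name per line, each
followed by a comma, so that it fits between 'firstname', and
'lastname' in the array.  The function names and the input path are
hypothetical.  The sketch also sorts and de-duplicates the names;
whether a sorted list actually shortens Polygraph's startup time is
not established here.

```python
def format_names(domains):
    """Return array entries: one quoted name per line, each with a trailing comma.

    Blank entries are dropped; names are lowercased, de-duplicated and sorted.
    """
    unique = sorted({d.strip().lower() for d in domains if d.strip()})
    return "".join(f"'{name}',\n" for name in unique)


def write_include(src_path, dst_path):
    """Read one domain per line from src_path and write the include file."""
    with open(src_path) as src:
        body = format_names(src)
    with open(dst_path, "w") as dst:
        dst.write(body)
```

Calling write_include("domains.txt", "/tmp/names") would then produce
the file referenced by the #include line, with "domains.txt" standing
in for wherever the raw list lives.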



Thanks,
Jacky

On Thu, Apr 19, 2012 at 7:52 PM, Alex Rousskov
<rousskov at measurement-factory.com> wrote:
> On 04/19/2012 03:31 PM, unjc email wrote:
>> I have a long list of (~30000) domains that I want Web Polygraph
>> clients to use in the request URLs.
>
>> I have tried using an address map in the workload, like this:
>
>> AddrMap M = {
>>      names = [ 'google.com','facebook.com','youtube.com','yahoo.com','live.com','baidu.com','blogspot.com','wikipedia.org'........];
>>      addresses = [ '192.168.1.1' ];
>> };
>
>
>> I believe the address list is too long: Web Polygraph throws the
>> "SynSym.cc:61: cannot cast string to addr" exception after trying to
>> start the test for more than 10 minutes.
>>
>> Please advise if there is another way to input my custom domain-list
>> for Webpolygraph to generate URLs.
>
>
> I just tried a simple.pg workload with an address map like yours that
> has 30000 domain names (formed from a local dictionary file). It does
> take 25min to parse and interpret 30K strings(*), but the test starts.
> Even the memory consumption looks reasonable at 100MB per process. The
> client then fails in my case because I do not have a name server set up
> to resolve those names, but I hope you do.
>
> I used the following #include trick to keep the workload readable:
>
>> AddrMap M = {
>>      names = [
>>      'firstname',
>> #include "/tmp/names"
>>      'lastname'
>>      ];
>>      addresses = [ '192.168.1.1:80' ];
>> };
>>
>> use(M);
>
>
> Does your workload work fine with, say, 10 custom domains?
>
> If yes, perhaps your input line is too long (for Polygraph or for your
> text editor)? Try the #include trick above, with every domain on its own
> line. It is more manageable that way.
>
>
> HTH,
>
> Alex.
> P.S.(*) Polygraph is not optimized to quickly grok 30K random names. In
> fact, the default algorithm may try to find a "range" pattern in those
> names so that they can be merged into a more compact representation. It
> is possible to optimize handling of a large number of random names as
> well, of course. Running more than a few tests with 25min startup times
> would not be very productive!


