From unjc.email at gmail.com Thu Apr 12 17:27:05 2012
From: unjc.email at gmail.com (unjc email)
Date: Thu, 12 Apr 2012 13:27:05 -0400
Subject: Realistic content simulation
Message-ID:

Hi there,

I have followed the Realistic content simulation manual to try to
create realistic image contents. Here are what I have done:

1. Created cdb file
I have created cdb file that contains jpg image files only.
> polygraph-4.3.1/src/csm/cdb --add wp.cdb media/*.jpg

Question: Can I add different file types (.jpg, .gif, .png) and kinds
(.flv, .html, .swf) into the cdb file?

2. add a PGL ContentType for images from that database

Content ImageContent = {
    kind = "image";
    mime = { type = "image/jpeg"; extensions = [ ".jpg" ]; };
    //obj_life_cycle = olcImage;
    //size = const(13KB);
    cachable = 0%;
    checksum = 1%;
    content_db = "workloads/wp.cdb"; // import content templateS
};

Server S = {
    kind = "S101";
    contents = [ ImageContent ];
    direct_access = contents;
    ...

I download an image from one of the URLs and try to view it on a
browser; but the image is shown broken.
http://hostname:9090/w1925a4f2.210b3b17:00000008/t03/_00000003.jpg

Please advise if I have missed any step in creating the realistic
image contents.

Thanks,
Jacky

From dmitry.kurochkin at measurement-factory.com Thu Apr 12 18:08:45 2012
From: dmitry.kurochkin at measurement-factory.com (Dmitry Kurochkin)
Date: Thu, 12 Apr 2012 22:08:45 +0400
Subject: Realistic content simulation
In-Reply-To:
References:
Message-ID: <87wr5komaa.fsf@gmail.com>

Hi Jacky.

unjc email writes:

> Hi there,
>
> I have followed the Realistic content simulation manual to try to
> create realistic image contents. Here are what I have done:
>
> 1. Created cdb file
> I have created cdb file that contains jpg image files only.
>> polygraph-4.3.1/src/csm/cdb --add wp.cdb media/*.jpg
>
> Question: Can I add different file types (.jpg, .gif, .png) and kinds
> (.flv, .html, .swf) into the cdb file?
>

PGL ContentType uses all objects in the given cdb database.
You can add any files to a single cdb database but PGL ContentType would
not distinguish between them which is probably not what you want.

>
> 2. add a PGL ContentType for images from that database
>
> Content ImageContent = {
>     kind = "image";
>     mime = { type = "image/jpeg"; extensions = [ ".jpg" ]; };
>     //obj_life_cycle = olcImage;
>     //size = const(13KB);
>     cachable = 0%;
>     checksum = 1%;
>     content_db = "workloads/wp.cdb"; // import content templateS
> };
>
> Server S = {
>     kind = "S101";
>     contents = [ ImageContent ];
>     direct_access = contents;
>     ...
>
>
> I download an image from one of the URLs and try to view it on a
> browser; but the image is shown broken.
> http://hostname:9090/w1925a4f2.210b3b17:00000008/t03/_00000003.jpg
>

You should use "--format verbatim" option when adding images to cdb
database. This would

Regards,
  Dmitry

>
> Please advise if I have missed any step in creating the realistic
> image contents.
>
> Thanks,
> Jacky
> _______________________________________________
> Users mailing list
> Users at web-polygraph.org
> http://www.web-polygraph.org/mailman/listinfo/users

From shahabb at ifi.uio.no Sat Apr 14 00:32:35 2012
From: shahabb at ifi.uio.no (Shahab Bakhtiyari)
Date: Sat, 14 Apr 2012 02:32:35 +0200
Subject: Authentication problem
Message-ID: <508fb1f910ffb52fff2236478747cce8@ulrik.uio.no>

Hi

I am doing some tests with polygraph, when I want to involve some
authentication, I get "segmentation fault" error. I use a simple pgl
file which works well otherwise.

I added these parts:

string[] endEast = credentials(10000, "east-end");
string[] endWest = credentials(15000, "west-end");

Robot R = {
    ...
    credentials = select(
        [ endEast, endWest ],1000); // just 1000 actual credentials
    ...
}

but when "use()"ing endEast and endWest

use(endEast,emdWest)

I will get the above error when I try to run the process.
here is the log:

Apr 14 01:55:35 server kernel: [23711.540866] polygraph-serve[19851]:
segfault at 4b ip 000000000000004b sp 00007fffe27b6d08 error 14 in
polygraph-server[400000+13f000]

what I need to do else or what I am doing wrong?

thank u in advance
--
Regards
Shahab B.

From unjc.email at gmail.com Thu Apr 19 21:31:30 2012
From: unjc.email at gmail.com (unjc email)
Date: Thu, 19 Apr 2012 17:31:30 -0400
Subject: Domain List
Message-ID:

Hello there,

I have a long list of (~30000) domains which I want Webpolygraph
clients to use them in the request URLs. Because the proxy server I am
testing would apply different policies upon different domains, I can't
really use dynamic domains.

I have tried using addressMap in workload as like below:

AddrMap M = {
    names = [ 'google.com','facebook.com','youtube.com','yahoo.com','live.com','baidu.com','blogspot.com','wikipedia.org'........];
    addresses = [ '192.168.1.1' ];
};

Robot R = {
    kind = "R101";
    pop_model = { pop_distr = popUnif(); };
    recurrence = 50%;
    req_rate = undef();
    origins = M.names;
}

I believe the address list is too long, Webpolygraph throws the
"SynSym.cc:61: cannot cast string to addr" exception after trying to
start the test for more than 10 minutes.

Please advise if there is another way to input my custom domain-list
for Webpolygraph to generate URLs.

Thanks,
Jacky

From rousskov at measurement-factory.com Thu Apr 19 23:52:41 2012
From: rousskov at measurement-factory.com (Alex Rousskov)
Date: Thu, 19 Apr 2012 17:52:41 -0600
Subject: Huge Domain List
In-Reply-To:
References:
Message-ID: <4F90A549.7020005@measurement-factory.com>

On 04/19/2012 03:31 PM, unjc email wrote:
> I have a long list of (~30000) domains which I want Webpolygraph
> clients to use them in the request URLs.
> I have tried using addressMap in workload as like below:

> AddrMap M = {
>     names = [ 'google.com','facebook.com','youtube.com','yahoo.com','live.com','baidu.com','blogspot.com','wikipedia.org'........];
>     addresses = [ '192.168.1.1' ];
> };

> I believe the address list is too long, Webpolygraph throws the
> "SynSym.cc:61: cannot cast string to addr" exception after trying to
> start the test for more than 10 minutes.
>
> Please advise if there is another way to input my custom domain-list
> for Webpolygraph to generate URLs.

I just tried a simple.pg workload with an address map like yours that
has 30000 domain names (formed from a local dictionary file). It does
take 25min to parse and interpret 30K strings(*), but the test starts.
Even the memory consumption looks reasonable at 100MB per process. The
client then fails in my case because I do not have a name server setup
to resolve those names, but I hope you do.

I used the following #include trick to keep the workload readable:

> AddrMap M = {
>     names = [
>         'firstname',
> #include "/tmp/names"
>         'lastname'
>     ];
>     addresses = [ '192.168.1.1:80' ];
> };
>
> use(M);

Does your workload work fine with, say, 10 custom domains?

If yes, perhaps your input line is too long (for Polygraph or for your
text editor)? Try the #include trick above, with every domain on its own
line. It is more manageable that way.

HTH,

Alex.
P.S.(*) Polygraph is not optimized to quickly grok 30K random names. In
fact, the default algorithm may try to find a "range" pattern in those
names so that they can be merged into a more compact representation. It
is possible to optimize handling of a large number of random names as
well, of course. Running more than a few tests with 25min startup times
would not be very productive!
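
Note on the #include trick above: judging by where the #include line
sits (between 'firstname', and 'lastname'), the included file has to
hold one quoted name per line, each followed by a comma. A minimal
sketch of producing such a file from a plain list of domains, assuming
Python is at hand. The function names and the file paths are
hypothetical and not part of Polygraph:

```python
def to_pgl_names(domains):
    """Return PGL array items: one single-quoted name per line, each ending with a comma."""
    items = []
    for raw in domains:
        name = raw.strip()
        if not name:
            continue  # skip blank lines in the input list
        if "'" in name or "," in name:
            # such characters would break the quoting of the PGL string
            raise ValueError(f"unexpected character in domain: {name!r}")
        items.append(f"'{name}',")
    return "\n".join(items) + "\n" if items else ""


def write_include(src_path, dst_path):
    """Read one domain per line from src_path and write the include file to dst_path."""
    with open(src_path) as src:
        text = to_pgl_names(src)
    with open(dst_path, "w") as dst:
        dst.write(text)
```

For example, write_include("domains.txt", "/tmp/names") would turn a
file with one domain per line into the file named in the #include
line. This only keeps the workload readable; it does nothing about the
startup time described in the P.S.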
From unjc.email at gmail.com Fri Apr 20 03:28:06 2012
From: unjc.email at gmail.com (unjc email)
Date: Thu, 19 Apr 2012 23:28:06 -0400
Subject: Huge Domain List
In-Reply-To: <4F90A549.7020005@measurement-factory.com>
References: <4F90A549.7020005@measurement-factory.com>
Message-ID:

Hi Alex,

I have a shorter list of 5000 domains that I usually test with. It
works pretty well with a "poisoned" DNS server. However, I found
that 5000 domains are still not quite enough for stressing the
application. That's why I want to run with a much longer list of
domains.

If Webpolygraph manages to run with the 30k domain list, it would
likely be a stress test I will run regularly. Therefore, as you said,
25-min of start time is definitely not very efficient for running a
test. Do you know if there is any way to shorten the preparation
time? Would a sorted list be helpful?

Thanks,
Jacky

From shahabb at ifi.uio.no Fri Apr 20 13:06:34 2012
From: shahabb at ifi.uio.no (shahab bakhtiyari)
Date: Fri, 20 Apr 2012 15:06:34 +0200
Subject: polygraph report and raw data
In-Reply-To:
References:
Message-ID:

Hi Dmitry
Thank you very much for your previous reply. I have a couple of
configuration questions that I am sure are very easy for you to answer!

1. what "working set size" and "cache size" should I use for a 1GB disk
cache testing? RAM is 4GB but not all cache is used by the proxy (it
varies from 10% to 30% of RAM usage by proxy). I read on the website that
"cache size" should be whole proxy box cache plus disk cache (4+1 = 5).
But would it be correct considering that proxy does not use the whole cache?

if it is the case and I have to choose 5GB as cache size, then I would need
to choose much larger working set size (maybe 1G rather than 100MB that I
am currently using), right?

2. Could you please have a look here
http://client.servebeer.com:8081/varnishYkGcache/ , seeing config files
and error part, and tell me, which part in config part I have to modify?

Thank you very much in advance
Shahab

On 26 March 2012 12:22, shahab bakhtiyari wrote:

> Hi guys
>
> I have 2 questions, really appreciate if somebody helps me,
>
> 1. how can I get the raw data from polygraph, I mean the data that
> "polygraph-reporter" gives to the gnuplot to plot it (I don't really like
> gnuplot!!!)
>
> 2. I currently have my set up in a private network with private IP
> addresses (and of course have access to internet as well), I am thinking
> whether it is possible or not, to add some more clients (since I only have
> 2 physical clients with a limited number of robots) from Amazon instances?
>
> Best regards
> Shahab

From rousskov at measurement-factory.com Fri Apr 20 14:02:46 2012
From: rousskov at measurement-factory.com (Alex Rousskov)
Date: Fri, 20 Apr 2012 08:02:46 -0600
Subject: Huge Domain List
In-Reply-To:
References: <4F90A549.7020005@measurement-factory.com>
Message-ID: <4F916C86.3090704@measurement-factory.com>

On 04/19/2012 09:28 PM, unjc email wrote:
> I have a shorter list of 5000 domains that I usually test with. It
> works pretty well with a "poisoned" DNS server. However, I found
> that 5000 domains are still not quite enough for stressing the
> application. That's why I want to run with a much longer list of
> domains.
>
> If Webpolygraph manages to run with the 30k domain list, it would
> likely be a stress test I will run regularly. Therefore, as you said,
> 25-min of start time is definitely not very efficient for running a
> test. Do you know if there is any way to shorten the preparation
> time? Would a sorted list be helpful?

A sorted list would not help. We need new code to import huge lists of
random domain names efficiently. Patches or sponsorships are welcome.
Cheers,

Alex.

From dmitry.kurochkin at measurement-factory.com Fri Apr 20 14:33:21 2012
From: dmitry.kurochkin at measurement-factory.com (Dmitry Kurochkin)
Date: Fri, 20 Apr 2012 18:33:21 +0400
Subject: Authentication problem
In-Reply-To: <508fb1f910ffb52fff2236478747cce8@ulrik.uio.no>
References: <508fb1f910ffb52fff2236478747cce8@ulrik.uio.no>
Message-ID: <87hawelbgu.fsf@gmail.com>

Hi Shahab.

Sorry for a delayed reply, I somehow missed your email.

Shahab Bakhtiyari writes:

> Hi
>
> I am doing some tests with polygraph, when I want to involve some
> authentication, I get "segmentation fault" error. I use a simple pgl
> file which works well otherwise.
>
> I added these parts:
>
> string[] endEast = credentials(10000, "east-end");
> string[] endWest = credentials(15000, "west-end");
>
> Robot R = {
>     ...
>     credentials = select(
>         [ endEast, endWest ],1000); // just 1000 actual credentials
>     ...
> }
>
> but when "use()"ing endEast and endWest
>
> use(endEast,emdWest)
>
> I will get the above error when I try to run the process.
>
> here is the log:
>
> Apr 14 01:55:35 server kernel: [23711.540866] polygraph-serve[19851]:
> segfault at 4b ip 000000000000004b sp 00007fffe27b6d08 error 14 in
> polygraph-server[400000+13f000]
>
> what I need to do else or what I am doing wrong?
>

You should not use() string arrays. Just remove
"use(endEast,emdWest);" line.

Segfault is a bug, of course. Thanks for reporting. We will look into
it.

Regards,
  Dmitry

> thank u in advance
> --
> Regards
> Shahab B.

From dmitry.kurochkin at measurement-factory.com Fri Apr 27 16:58:52 2012
From: dmitry.kurochkin at measurement-factory.com (Dmitry Kurochkin)
Date: Fri, 27 Apr 2012 20:58:52 +0400
Subject: polygraph report and raw data
In-Reply-To:
References:
Message-ID: <87mx5xjelv.fsf@gmail.com>

Hi Shahab.

shahab bakhtiyari writes:

> Hi Dmitry
> Thank you very much for your previous reply. I have a couple of
> configuration questions that I am sure are very easy for you to answer!
>
> 1. what "working set size" and "cache size" should I use for a 1GB disk
> cache testing? RAM is 4GB but not all cache is used by the proxy (it
> varies from 10% to 30% of RAM usage by proxy). I read on the website that
> "cache size" should be whole proxy box cache plus disk cache (4+1 = 5).
> But would it be correct considering that proxy does not use the whole cache?
>

There is no "cache size" in PGL workload. It may be a variable specific
to the workload. Its meaning depends on the workload.

In your workload CacheSize variable is used to determine the fill phase
length. This phase is intended to generate enough traffic to fill the
whole cache multiple times (2 in the workload you are using). CacheSize
value should be the total cache size, both disk and RAM (not the amount
of RAM on the system, but the amount of RAM used by proxy for cache).

>
> if it is the case and I have to choose 5GB as cache size, then I would need
> to choose much larger working set size (maybe 1G rather than 100MB that I
> am currently using), right?
>

First of all, you need to understand what working set is. Please read
working_set_length() PGL function description at [1].

A good value for WSS to start with is the total cache size,
i.e. cache_size / mean_object_size. Keep in mind that setting too high
(or unlimited) WSS makes it impossible for proxy to reply with hit for
every offered hit because Polygraph would request objects that have
already been replaced with newer ones in the cache. Setting too small
WSS (like 100MB for 5GB cache) may make it "easier" for proxy to reply
with hits for every offered hit because it needs to cache just a small
portion of recent objects. Both situations are bad and may lead to
inaccurate test results.

>
> 2. Could you please have a look here
> http://client.servebeer.com:8081/varnishYkGcache/ , seeing config files
> and error part, and tell me, which part in config part I have to modify?
>

An error does not mean that there is something wrong with your workload.
It depends on your test and your environment. E.g. connection errors
may mean that your system configuration should be tuned, or that your
proxy has an issue, or that the request rate is just too high.

You should take a look at each error and investigate it. Try to
identify what triggers an error (e.g. a particular request header) and
reproduce it with a simplest workload possible or even with a tool like
curl(1). Polygraph should print HTTP headers dump on console.

Some errors are very likely to be proxy issues, e.g. checksum mismatch.
You may want to fix the proxy or disable checksums in your workload.

Regards,
  Dmitry
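
Note on the sizing advice above: it comes down to a little arithmetic.
A rough sketch in Python, assuming a 1GB disk cache, a proxy that uses
about 25% of 4GB RAM for caching, and a 16KB mean object size. All
three numbers and the function names are illustrative assumptions, not
values taken from the workload; only the fill factor of 2 comes from
the message:

```python
KB = 1024
GB = 1024 ** 3


def total_cache_size(disk_cache, ram_total, ram_cache_fraction):
    """Disk cache plus the part of RAM the proxy actually uses for caching (bytes)."""
    return disk_cache + int(ram_total * ram_cache_fraction)


def wss_in_objects(cache_size, mean_object_size):
    """Starting working set size in objects: cache_size / mean_object_size."""
    return cache_size // mean_object_size


def fill_volume(cache_size, fill_factor=2):
    """Approximate traffic volume (bytes) needed to fill the cache fill_factor times."""
    return fill_factor * cache_size


# Assumed example numbers: 1GB disk cache, 25% of 4GB RAM, 16KB mean object
cache = total_cache_size(1 * GB, 4 * GB, 0.25)
wss = wss_in_objects(cache, 16 * KB)
fill = fill_volume(cache)
```

With these assumed numbers the total cache is 2GB, the starting WSS is
131072 objects, and the fill phase has to carry roughly 4GB of traffic.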