From alberto.klocker at bluereef.com.au Mon Sep 9 01:40:16 2013
From: alberto.klocker at bluereef.com.au (Alberto Klocker)
Date: Mon, 9 Sep 2013 11:40:16 +1000
Subject: Is a client robot an OS thread?
Message-ID:

Hi,

We run tests on a 10 Gb network environment with Polygraph as one of the
benchmarking tools, but we have found we are hitting 100% CPU usage on both
the poly-client and poly-server, which would indicate we need to spread the
processes and systems out.

How well does the reporting engine handle log files from multiple sources?

Say I were to script a system that launches two or three instances of the
client and server: will the reporting engine be capable of combining the
results into one report?

Would it be better if I wrote something that "merges" the results together?

I'm curious to see what others have done to ramp Polygraph up above its
single-core limits.

We can get 10 Gb throughput with large files, but my goal is to find the
maximum number of stable concurrent connections.

Any help is much appreciated!

Kevin Xie1 writes:

> Thanks for your quick response, Dmitry.
>
> However, I don't understand how a single thread can simulate multiple
> concurrent "users" submitting requests to web servers. Is it that the
> single thread iterates over all open connections one by one (submitting a
> request or processing a response one at a time)?

Sort of, though details are more complex than that. If you are
interested, search for I/O multiplexing.

> bound by CPU since just 1 core is used at a time, or by slow
> connections ..., do I miss something here, or my understanding is
> fundamentally wrong?

The fact that a single Polygraph process can use only a single core may
be a limitation indeed. But moving each simulated agent to a separate
OS thread is not the solution.

You can run multiple Polygraph client or server processes on a single
system. But that makes it harder to manage the test and usually
requires some auxiliary scripts to start all processes.

It would be nice to have SMP support in Polygraph and we have some ideas
about it. But so far there has not been enough interest in such a
project. Patches or sponsorships are welcome.

In any case, please do not draw conclusions about Polygraph performance
from the fact that it is single-threaded. Polygraph is successfully
used for high-performance tests in 10Gbit networks (though it may
require multiple Polygraph processes and/or systems).

Regards,
  Dmitry

> Appreciated for any light shed!
>
> Kevin
>
> -----Original Message-----
> From: Dmitry Kurochkin [mailto:dmitry.kurochkin at measurement-factory.com]
> Sent: May-03-12 3:19 PM
> To: Kevin Xie1
> Cc: users at web-polygraph.org
> Subject: Re: Is a client robot an OS thread?
>
> Hi Kevin.
>
> Kevin Xie1 writes:
>
>> Hi,
>>
>> Is a client robot a real OS thread? I'm using Web Polygraph on Linux,
>> but I don't see multiple threads in the OS during a test with 500
>> robots.
>
> Polygraph client and server processes are single-threaded. All Robot
> and Server agents are running in a single thread.
>
> Regards,
>   Dmitry
>
>> Thanks,
>> Kevin

Alberto Klocker | Software Developer | www.bluereef.com.au
_________________________________________________________________
D: +61 3 9895 8006 | T: +61 3 9898 8000 | @AlbertoKlocker
Please consider the environment before printing this email

From rousskov at measurement-factory.com Mon Sep 9 15:07:08 2013
From: rousskov at measurement-factory.com (Alex Rousskov)
Date: Mon, 09 Sep 2013 09:07:08 -0600
Subject: Is a client robot an OS thread?
In-Reply-To:
References:
Message-ID: <522DE41C.4080103@measurement-factory.com>

On 09/08/2013 07:40 PM, Alberto Klocker wrote:

> We run tests on a 10 Gb network environment with Polygraph as one of
> the benchmarking tools but have found we are hitting 100% CPU usage
> on both the poly-client and poly-server which would indicate we need
> to spread the processes and systems out.

Yes, you must run multiple Polygraph client and server processes to test
at 10Gb speeds [without using huge responses].

> How well does the reporting engine handle log files from multiple sources?

Reporter and lx handle them well because Polygraph has had to merge logs
from different _hosts_ since the early days of the first cache-offs. The
primary problem with the SMP test setup is in IP address management and
configuration. In our SMP tests, we usually:

  - create all IP aliases before the test using polygraph-aka;
  - use --fake-hosts to tell each Polygraph process which local IP
    aliases it should use; and
  - fool Polygraph into thinking that it is running on multiple hosts to
    make sure each client process generates appropriate load (in
    reality, each host is a CPU core).

We are working on removing the need for the above hacks in future
Polygraph releases. The above may not be relevant/important if your
workloads do not use PGL Benches and related features.

> Say I were to script a system that launches two or three instances of
> the client and server, will the reporting engine be capable of
> combining the results into one report?

Yes. Just give polygraph-reporter all the client-side and all the
server-side logs at once.

> Would it be better I write something that "merges" the results together?

No, and merging statistics correctly is actually not as easy as it may
seem.

> I'm curious to see what others have done to ramp Polygraph up above
> its single core limits.

Please see the above notes. We are working on publishing our 10Gb
results, which will allow us to share more details of our setup.

HTH,

Alex.
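
Alex's advice above (hand every client-side and server-side log to polygraph-reporter in a single invocation rather than merging anything by hand) could be scripted roughly as follows. This is only a minimal sketch, not a tested Polygraph recipe: the function name `reporter_cmd`, the single log directory, and the `clt.*.log` / `srv.*.log` naming pattern are assumptions made for illustration, and the script prints the command instead of running it. Log paths containing spaces are not handled.

```shell
#!/bin/sh
# Sketch: assemble ONE polygraph-reporter command line that names every
# client-side and server-side log found in a directory.
# Assumptions (not from the thread): logs live in a single directory and
# are named clt.N.log / srv.N.log.

reporter_cmd() {
    log_dir=$1
    cmd="polygraph-reporter"
    for log in "$log_dir"/clt.*.log "$log_dir"/srv.*.log; do
        # An unmatched glob stays literal; skip it instead of passing it on.
        [ -e "$log" ] || continue
        cmd="$cmd $log"
    done
    printf '%s\n' "$cmd"
}

# Dry run: print the command for the given (or current) directory.
reporter_cmd "${1:-.}"
```

Printing the command first makes it easy to check that no worker's log was left out before the report is generated.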
From pavel.kazlenka at measurement-factory.com Mon Sep 9 16:53:50 2013
From: pavel.kazlenka at measurement-factory.com (Pavel Kazlenka)
Date: Mon, 09 Sep 2013 19:53:50 +0300
Subject: Is a client robot an OS thread?
In-Reply-To:
References:
Message-ID: <522DFD1E.4080407@measurement-factory.com>

Hi Alberto,

On 09/09/2013 04:40 AM, Alberto Klocker wrote:
> Hi,
> We run tests on a 10 Gb network environment with Polygraph as one of
> the benchmarking tools but have found we are hitting 100% CPU usage on
> both the poly-client and poly-server which would indicate we need to
> spread the processes and systems out.
> How well does the reporting engine handle log files from multiple sources?
> Say I were to script a system that launches two or three instances of
> the client and server, will the reporting engine be capable of
> combining the results into one report?

All the Polygraph reporting tools (like polygraph-lx, polygraph-ltrace,
polygraph-reporter) support aggregating multiple logs. E.g., if you have
logs from three client workers named clt.1.log, clt.2.log, clt.3.log, you
can extract statistics from them simply using

  $ polygraph-lx clt.1.log clt.2.log clt.3.log

The same goes for the other tools. However, there is a minor bug that
prevents the ltrace tool from extracting logs from both the server and
client sides in one run:
https://bugs.launchpad.net/polygraph/+bug/1204983

> Would it be better I write something that "merges" the results together?

Given the above, you don't have to write anything.

> I'm curious to see what others have done to ramp Polygraph up above
> its single core limits.
> We can get 10 Gb throughput with large files but my goal is to find
> maximum, stable concurrent connections.
> Any help is much appreciated!

Sure, Alberto. There are a number of actions that could help you increase
Polygraph performance:

1) Run multiple instances of workers and bind them to different CPU cores.
   E.g., if you have 2 CPU cores on the machine that runs
   polygraph-client, run two clients like:

     taskset -c 0 polygraph-client --log clt.1.log ...
     taskset -c 1 polygraph-client --log clt.2.log ...

2) Tune CPU affinity for NIC interrupts so that they are handled by cores
   other than the ones already occupied by Polygraph workers (clients or
   servers). E.g., if you have 4 CPU cores, consider running three
   polygraph-client instances bound to cores 0-2 and binding NIC
   interrupts to core 3. This article should help you with SMP affinity
   in Linux:
   http://www.alexonlinux.com/smp-affinity-and-proper-interrupt-handling-in-linux
   There is no good formula for the ratio between workers' and NIC's
   cores, but top output could give you clues.

3) If you have 'virtual' (hyper-threading) cores, it's not a good idea to
   bind Polygraph workers to both the 'real' and 'virtual' instances of
   the same core.

4) If you have enough RAM, consider writing test results to an in-memory
   FS like tmpfs
   (https://www.kernel.org/doc/Documentation/filesystems/tmpfs.txt).

5) Tuning the Linux TCP stack for 10G networking is an important part
   too. I'd use this document as a guide:
   http://landley.net/kdocs/ols/2009/ols2009-pages-169-184.pdf

Best wishes,
Pavel

_______________________________________________
Users mailing list
Users at web-polygraph.org
http://www.web-polygraph.org/mailman/listinfo/users

From alberto.klocker at bluereef.com.au Tue Sep 10 05:59:42 2013
From: alberto.klocker at bluereef.com.au (Alberto Klocker)
Date: Tue, 10 Sep 2013 15:59:42 +1000
Subject: Is a client robot an OS thread?
In-Reply-To: <522DFD1E.4080407@measurement-factory.com>
References: <522DFD1E.4080407@measurement-factory.com>
Message-ID:

Thanks for all the help! You can really see a lot of hard work and love
has gone into Web-Polygraph.

I've written a Rails front end to handle the launching, gather all the
files and host the finished report, so your instructions fit perfectly.
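
Pavel's suggestions 1) and 2) earlier in the thread (one pinned worker per core, with one core left free for NIC interrupts) might be turned into a launcher along these lines. It is a hedged sketch rather than anything posted by the list members: the function name `client_cmds` and the `EXTRA_ARGS` variable are hypothetical, the choice to reserve the highest-numbered core is just one possible split, and the remaining polygraph-client options stay elided ("...") exactly as in Pavel's example. The script only prints the commands; it does not start any process or touch interrupt affinity.

```shell
#!/bin/sh
# Sketch: print one "taskset"-pinned polygraph-client command per worker
# core, keeping the highest-numbered core free for NIC interrupts.
# EXTRA_ARGS is a hypothetical variable for the polygraph-client options
# that the thread leaves elided as "...".

client_cmds() {
    total_cores=$1
    extra_args=$2
    workers=$((total_cores - 1))   # last core is reserved, not used by workers
    core=0
    while [ "$core" -lt "$workers" ]; do
        # Logs are numbered from 1 (clt.1.log, ...), cores from 0.
        printf 'taskset -c %d polygraph-client --log clt.%d.log %s\n' \
            "$core" "$((core + 1))" "$extra_args"
        core=$((core + 1))
    done
}

# Dry run for a 4-core machine.
client_cmds 4 "${EXTRA_ARGS:-...}"
```

With four cores this prints three commands pinned to cores 0, 1 and 2, matching the 4-core split Pavel describes. On hyper-threaded machines the core numbers would still need to be checked by hand so that two workers do not share one physical core.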