From rousskov at measurement-factory.com Wed Aug 1 22:33:43 2018
From: rousskov at measurement-factory.com (Alex Rousskov)
Date: Wed, 1 Aug 2018 16:33:43 -0600
Subject: Bench SMP mode
In-Reply-To: <5885a856e68e7b7c9e30f00df4586299@mail.gmail.com>
References: <5885a856e68e7b7c9e30f00df4586299@mail.gmail.com>
Message-ID: 

On 07/29/2018 06:15 PM, William Law wrote:

> I'm trying to move away from launching multiple polygraph-server/client
> instances via a script and allocating a specific core, fake_hosts
> subset and same config to utilising SMP mode (where, I assume, you
> launch one instance and it uses as many cores as it needs).

Your assumption is correct. You should not use fake hosts, though. Let
Polygraph create aliases for you.

> I have the following bench config:
>
> Bench sslBench = {
>     client_side = {
>         max_host_load = 100/sec;
>         max_agent_load = 1/sec;
>         addr_space = [ '198.18.24-27.10-249/22' ];
>         hosts = [ '198.18.24.2' ] ** 750;
>         cpu_cores = [ 65535 ];
>     };
>     server_side = {
>         max_host_load = client_side.max_host_load;
>         max_agent_load = client_side.max_agent_load;
>         addr_space = [ '198.18.28-29.10-249:443/23' ];
>         hosts = [ '198.18.28.2' ] ** 480;
>         cpu_cores = [ 65535 ];
>     };
> };

Your "750" and "480" should be the number of cores on hosts 198.18.24.2
and 198.18.28.2 (cores that you want to use). It is best to put that
multiplier inside the address array.

Your [ 65535 ] should be an array of arrays of core IDs, one inner array
per SMP worker, telling the corresponding worker which core(s) to use.

Here is a better (but untested) version of your Bench, using 4 cores per
drone, numbered 2 through 5:

Bench sslBench = {
    client_side = {
        max_host_load = 100/sec;
        max_agent_load = 1/sec;
        addr_space = [ 'lo::198.18.24-27.10-249/32' ];
        hosts = [ '198.18.24.2' ** 4 ];
        cpu_cores = [ [2], [3], [4], [5] ];
    };
    server_side = {
        max_host_load = client_side.max_host_load;
        max_agent_load = client_side.max_agent_load;
        addr_space = [ 'lo::198.18.28-29.10-249:443/32' ];
        hosts = [ '198.18.28.2' ** 4 ];
        cpu_cores = client_side.cpu_cores;
    };
};

Avoid sharing physical cores among virtual cores: two busy virtual cores
can do _less_ than the one real core they share.

If you really have 40 physical cores, and you want to use most of them,
then you would probably want to generate the cpu_cores array by a
script. We should add PGL function(s) to generate typical CPU affinity
map(s) for a given number of cores. Quality patches or sponsorships
welcome.

Note that I added loopback interfaces to addr_space and changed their
subnet to /32. You should let Polygraph create these aliases (and the
corresponding robots and servers) on the loopback interface and then
configure a couple of simple routes so that all agents can talk to each
other (or the proxy). No --fake_hosts!

> after about 7 minutes I was getting latency suddenly spiking to >40
> seconds. In that situation, none of the processes is running at more
> than 26% of the single core it's running on, and still way under the
> throughput that the DUT can handle.

I suspect your OS ran out of some resource like RAM for Polygraph
processes, conntrack buffer space, or ephemeral ports. Check system
logs. If that does not help, try monitoring with atop(1).

HTH,

Alex.
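
P.S. In case a concrete example helps with the routing part: assuming
the DUT answers at 198.18.24.1 on the client side and at 198.18.28.1 on
the server side (hypothetical addresses - adjust them to your lab),
untested routes like these should let the loopback-bound agents reach
each other through the DUT:

    # on the client drone: reach the server aliases through the DUT
    ip route add 198.18.28.0/23 via 198.18.24.1
    # on the server drone: route responses back through the DUT
    ip route add 198.18.24.0/22 via 198.18.28.1

The DUT, in turn, needs to know that the alias subnets live behind the
drones' physical addresses (198.18.24.2 and 198.18.28.2).
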
From william.law at tesserent.com Thu Aug 2 00:03:04 2018
From: william.law at tesserent.com (William Law)
Date: Thu, 2 Aug 2018 10:03:04 +1000
Subject: Bench SMP mode
In-Reply-To: 
References: <5885a856e68e7b7c9e30f00df4586299@mail.gmail.com>
Message-ID: 

Hi Alex,

> -----Original Message-----
> From: Alex Rousskov [mailto:rousskov at measurement-factory.com]
> Sent: Thursday, 2 August 2018 8:34 AM
> To: William Law; users at lists.web-polygraph.org
> Subject: Re: Bench SMP mode
>
> On 07/29/2018 06:15 PM, William Law wrote:
>
> > I'm trying to move away from launching multiple polygraph-server/client
> > instances via a script and allocating a specific core, fake_hosts
> > subset and same config to utilising SMP mode (where, I assume, you
> > launch one instance and it uses as many cores as it needs).
>
> Your assumption is correct. You should not use fake hosts, though. Let
> Polygraph create aliases for you.

I'm using fake hosts to limit each instance to a subset of the IPs in
the greater pool. This was helping to reduce the CPU load that each
agent was consuming.

> > I have the following bench config:
> >
> > Bench sslBench = {
> >     client_side = {
> >         max_host_load = 100/sec;
> >         max_agent_load = 1/sec;
> >         addr_space = [ '198.18.24-27.10-249/22' ];
> >         hosts = [ '198.18.24.2' ] ** 750;
> >         cpu_cores = [ 65535 ];
> >     };
> >     server_side = {
> >         max_host_load = client_side.max_host_load;
> >         max_agent_load = client_side.max_agent_load;
> >         addr_space = [ '198.18.28-29.10-249:443/23' ];
> >         hosts = [ '198.18.28.2' ] ** 480;
> >         cpu_cores = [ 65535 ];
> >     };
> > };
>
> Your "750" and "480" should be the number of cores on hosts 198.18.24.2
> and 198.18.28.2 (cores that you want to use). It is best to put that
> multiplier inside the address array.

I had done it this way because it then accurately produced the number of
servers and robots needed to cover the IP space.

> Your [ 65535 ] should be an array of arrays of core IDs, one inner
> array per SMP worker, telling the corresponding worker which core(s)
> to use.

That [ 65535 ] was simply the last value I tried while looking for
something the code wouldn't grumble about for that field.

> Here is a better (but untested) version of your Bench, using 4 cores
> per drone, numbered 2 through 5:
>
> Bench sslBench = {
>     client_side = {
>         max_host_load = 100/sec;
>         max_agent_load = 1/sec;
>         addr_space = [ 'lo::198.18.24-27.10-249/32' ];
>         hosts = [ '198.18.24.2' ** 4 ];
>         cpu_cores = [ [2], [3], [4], [5] ];
>     };
>     server_side = {
>         max_host_load = client_side.max_host_load;
>         max_agent_load = client_side.max_agent_load;
>         addr_space = [ 'lo::198.18.28-29.10-249:443/32' ];
>         hosts = [ '198.18.28.2' ** 4 ];
>         cpu_cores = client_side.cpu_cores;
>     };
> };
>
> Avoid sharing physical cores among virtual cores: two busy virtual
> cores can do _less_ than the one real core they share.

I might turn off hyperthreading then and give what you have provided a
go. I'll let you know.

> If you really have 40 physical cores, and you want to use most of them,
> then you would probably want to generate the cpu_cores array by a
> script. We should add PGL function(s) to generate typical CPU affinity
> map(s) for a given number of cores. Quality patches or sponsorships
> welcome.

I'm running 2 Dell R830s with 4x Xeon E5-4620 v4s and 256GB RAM each
(also 8x 10G and 6x 1G Ethernet ports).

> Note that I added loopback interfaces to addr_space and changed their
> subnet to /32.
> You should let Polygraph create these aliases (and the corresponding
> robots and servers) on the loopback interface and then configure a
> couple of simple routes so that all agents can talk to each other (or
> the proxy). No --fake_hosts!

I'm curious as to why you attach the IPs to lo, though, and not to the
interface that is connected to the DUT.

> > after about 7 minutes I was getting latency suddenly spiking to >40
> > seconds. In that situation, none of the processes is running at more
> > than 26% of the single core it's running on, and still way under the
> > throughput that the DUT can handle.
>
> I suspect your OS ran out of some resource like RAM for Polygraph
> processes, conntrack buffer space, or ephemeral ports. Check system
> logs. If that does not help, try monitoring with atop(1).

Conntrack isn't an issue here (it is not used on the client/server
endpoints), nor is RAM usage (see my note on the specs above). It might
be ephemeral ports, but I don't think I'm establishing enough
connections to exhaust the 1025-65535 port range that is configured.
Again, I'll check and let you know.

> HTH,
>
> Alex.

Kind Regards,

William Law

From william.law at tesserent.com Tue Aug 7 05:55:17 2018
From: william.law at tesserent.com (William Law)
Date: Tue, 7 Aug 2018 15:55:17 +1000
Subject: Bench SMP mode
In-Reply-To: 
References: <5885a856e68e7b7c9e30f00df4586299@mail.gmail.com>
Message-ID: <7f867bef1bd206b7f6bee7c79cbfef70@mail.gmail.com>

Hi All,

Some info, a bug report, and a progress update for you.

> >
> > Here is a better (but untested) version of your Bench, using 4 cores
> > per drone, numbered 2 through 5:
> >
> > Bench sslBench = {
> >     client_side = {
> >         max_host_load = 100/sec;
> >         max_agent_load = 1/sec;
> >         addr_space = [ 'lo::198.18.24-27.10-249/32' ];
> >         hosts = [ '198.18.24.2' ** 4 ];
> >         cpu_cores = [ [2], [3], [4], [5] ];
> >     };
> >     server_side = {
> >         max_host_load = client_side.max_host_load;
> >         max_agent_load = client_side.max_agent_load;
> >         addr_space = [ 'lo::198.18.28-29.10-249:443/32' ];
> >         hosts = [ '198.18.28.2' ** 4 ];
> >         cpu_cores = client_side.cpu_cores;
> >     };
> > };
> >
> > Avoid sharing physical cores among virtual cores: two busy virtual
> > cores can do _less_ than the one real core they share.

Here's the Bench config I've managed to get working. The strange core
progression is to utilise the primary logical core ID of each physical
core - I didn't want to go altering the behaviour of previous tests too
much by disabling hyperthreading as yet.

Bench sslBench = {
    client_side = {
        max_host_load = 10000/sec;
        max_agent_load = 50/sec;
        addr_space = [ 'em1::198.18.24-27.10-249/22' ];
        hosts = [ '198.18.24.2' ** 16 ];
        cpu_cores = [ [0,1],[2,3],[8,9],[10,11],[16,17],[18,19],[24,25],[26,27],[32,33],[34,35],[40,41],[42,43],[48,49],[50,51],[56,57],[58,59] ];
        // cpu_cores = [ [0,1],[2,3],[8,9],[10,11],[16,17],[18,19],[24,25],[26,27],[32,33],[34,35],[40,41],[42,43],[48,49],[50,51],[56,57],[58,59],[64,65],[66,67],[72,73],[74,75] ];
    };
    server_side = {
        max_host_load = client_side.max_host_load;
        max_agent_load = client_side.max_agent_load;
        addr_space = [ 'em1::198.18.28-29.10-249:443/23' ];
        hosts = [ '198.18.28.2' ** 16 ];
        cpu_cores = client_side.cpu_cores;
    };
};

I'm not getting the same latency spikes that I was before, but I have
run into a new problem - I can only configure a maximum of 16 core sets
before polygraph-server segfaults or bails with the following:

*** glibc detected *** /usr/local/bin/polygraph-server: double free or corruption (fasttop): 0x000000000285ac00 ***
======= Backtrace: =========
/lib64/libc.so.6[0x31d6a75e5e]
/lib64/libc.so.6[0x31d6a78cad]
/usr/local/bin/polygraph-server[0x40cbe1]
/usr/local/bin/polygraph-server[0x41ad92]
/usr/local/bin/polygraph-server[0x41baec]
/usr/local/bin/polygraph-server[0x4077ba]
/lib64/libc.so.6(__libc_start_main+0x100)[0x31d6a1ed20]
/usr/local/bin/polygraph-server[0x407409]

To get this result, I enter more than 16 cores and the corresponding
number of entries in cpu_cores. Let me know what other debug information
you want from me and I can DM the details to you.

This is a shame, as the test starts to max out all 16 server processes
at about 50% client test load and 5.4Gbit/s of traffic. I did leave it
running, though, and saw it hit 8Gbit/s with 30+ second latencies as the
phase started to back the load down.

Now the interesting thing is that if the processes are running with 2
cores allocated then they should have been showing 200% cpu usage per
process (2 cores consumed), not just 100% (1 core). Are the processes
not multithreaded in that way?

> script. We should add PGL function(s) to generate typical CPU affinity
> map(s) for a given number of cores. Quality patches or sponsorships
> welcome.

Not much of a C coder myself; I could probably cobble something
together, but it would not necessarily be quality :D Might see if we can
help another way.

> HTH,
>
> Alex.

Kind Regards,

William Law

From rousskov at measurement-factory.com Tue Aug 7 15:54:16 2018
From: rousskov at measurement-factory.com (Alex Rousskov)
Date: Tue, 7 Aug 2018 09:54:16 -0600
Subject: Bench SMP mode
In-Reply-To: <7f867bef1bd206b7f6bee7c79cbfef70@mail.gmail.com>
References: <5885a856e68e7b7c9e30f00df4586299@mail.gmail.com>
 <7f867bef1bd206b7f6bee7c79cbfef70@mail.gmail.com>
Message-ID: <4d137a51-1594-dd2a-65ef-eb8d531df756@measurement-factory.com>

On 08/06/2018 11:55 PM, William Law wrote:

> Here's the Bench config I've managed to get working. The strange core
> progression is to utilise the primary logical core ID of each physical
> core - I didn't want to go altering the behaviour of previous tests too
> much by disabling hyperthreading as yet.

I would list just one virtual core (e.g. odd ones) per worker. The
workload will look simpler and you might save a few CPU cycles by not
moving processes across virtual cores. No need to disable hyperthreading
(although doing that may improve overall performance a bit if you have
enough physical cores).
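
Until we add the PGL helper functions mentioned earlier, a rough and
untested shell sketch along the following lines could emit such a
one-core-per-worker cpu_cores array; it assumes the Linux sysfs topology
files and their comma-separated sibling list format:

    # keep only the first hyperthread sibling of each physical core
    cores=$(for f in /sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list; do
                cut -d, -f1 "$f"
            done | sort -un)
    # print a PGL array like: cpu_cores = [ [0], [1], [2] ];
    echo "cpu_cores = [ [$(echo $cores | sed 's/ /], [/g')] ];"

Paste the output into your Bench, dropping any cores that you want to
reserve for the OS and for interrupt processing.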

Another useful trick for high-load tests is to confine NIC interrupts
to one (or, if necessary, more) dedicated CPU cores as well. Just like
Polygraph, the interrupt processing code in the kernel probably benefits
from CPU affinity and, more importantly, confining it prevents
CPU-grabbing conflicts between it and Polygraph workers.

In summary, you want to give each CPU-hungry program (worker, kernel,
etc.) a dedicated CPU core (to the extent possible).

> I'm not getting the same latency spikes that I was before, but I have
> run into a new problem - I can only configure a maximum of 16 core sets
> before polygraph-server segfaults

We will try to find the time to reproduce and fix.

> Let me know what other debug information you want from me and I can DM
> the details to you.

A stack trace from the crash ("bt full" in gdb) would be a good start if
we cannot reproduce this.

> if the processes are running with 2 cores allocated then they should
> have been showing 200% cpu usage per process (2 cores consumed), not
> just 100% (1 core). Are the processes not multithreaded in that way?

You have one worker process per physical core. One process cannot
consume more than 100% of anything. Workers have no threads (for this
discussion, you can view each worker as a thread if you wish). And two
virtual cores are a red herring -- in a context of a single busy
process, they only add overheads.

Cheers,

Alex.

From william.law at tesserent.com Wed Aug 8 00:28:34 2018
From: william.law at tesserent.com (William Law)
Date: Wed, 8 Aug 2018 10:28:34 +1000
Subject: Bench SMP mode
In-Reply-To: <4d137a51-1594-dd2a-65ef-eb8d531df756@measurement-factory.com>
References: <5885a856e68e7b7c9e30f00df4586299@mail.gmail.com>
 <7f867bef1bd206b7f6bee7c79cbfef70@mail.gmail.com>
 <4d137a51-1594-dd2a-65ef-eb8d531df756@measurement-factory.com>
Message-ID: <065050103a9a6dc003c9ee775dcf5340@mail.gmail.com>

> > Here's the Bench config I've managed to get working. The strange core
> > progression is to utilise the primary logical core ID of each
> > physical core - I didn't want to go altering the behaviour of
> > previous tests too much by disabling hyperthreading as yet.
>
> I would list just one virtual core (e.g. odd ones) per worker. The
> workload will look simpler and you might save a few CPU cycles by not
> moving processes across virtual cores. No need to disable
> hyperthreading (although doing that may improve overall performance a
> bit if you have enough physical cores).

This is what the cpu information looks like on the boxes:

~]# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                80
On-line CPU(s) list:   0-79
Thread(s) per core:    2
Core(s) per socket:    10
Socket(s):             4
NUMA node(s):          4
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 79
Model name:            Intel(R) Xeon(R) CPU E5-4620 v4 @ 2.10GHz
Stepping:              1
CPU MHz:               2095.127
BogoMIPS:              4190.02
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              25600K
NUMA node0 CPU(s):     0,4,8,12,16,20,24,28,32,36,40,44,48,52,56,60,64,68,72,76
NUMA node1 CPU(s):     1,5,9,13,17,21,25,29,33,37,41,45,49,53,57,61,65,69,73,77
NUMA node2 CPU(s):     2,6,10,14,18,22,26,30,34,38,42,46,50,54,58,62,66,70,74,78
NUMA node3 CPU(s):     3,7,11,15,19,23,27,31,35,39,43,47,51,55,59,63,67,71,75,79
~]#

I'm working on the logic that every second cpu on a node (aka socket) is
the "logical" second hyperthread cpu. Looking at the CPU load when
pushing traffic I see the same behaviour with software interrupts
loading up particular cores.
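
To keep those software interrupts away from the worker cores (per your
interrupt-confinement suggestion, quoted below), I'm going to try
pinning the em1 queue IRQs to a small reserved core set, along these
lines. This is untested so far: irqbalance has to be stopped first, and
older kernels may only offer the hex-mask smp_affinity file instead of
smp_affinity_list:

    # spread the em1 queue IRQs round-robin over a few reserved cores
    service irqbalance stop
    reserved=(0 4 8 12)
    i=0
    for irq in $(grep em1 /proc/interrupts | awk -F: '{print $1}'); do
        echo ${reserved[i % ${#reserved[@]}]} > /proc/irq/$irq/smp_affinity_list
        i=$((i + 1))
    done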

>
> Another useful trick for high-load tests is to confine NIC interrupts
> to one (or, if necessary, more) dedicated CPU cores as well. Just like
> Polygraph, the interrupt processing code in the kernel probably
> benefits from CPU affinity and, more importantly, confining it prevents
> CPU-grabbing conflicts between it and Polygraph workers.
>
> In summary, you want to give each CPU-hungry program (worker, kernel,
> etc.) a dedicated CPU core (to the extent possible).

> > I'm not getting the same latency spikes that I was before, but I have
> > run into a new problem - I can only configure a maximum of 16 core
> > sets before polygraph-server segfaults
>
> We will try to find the time to reproduce and fix.

Unfortunately, the test is limited by this: I can easily max out the 16
cores on both sides. For the full 10Gbit throughput with my old test
profile, I think it would take about 22+ cores to see per-process cpu
consumption below 95% on both sides.

> > Let me know what other debug information you want from me and I can
> > DM the details to you.
>
> A stack trace from the crash ("bt full" in gdb) would be a good start
> if we cannot reproduce this.

See attached! (will email you direct if the list whinges). Setup for 20
cores, 1 dump with 2 cores per worker, the other with 1 core per worker.

> > if the processes are running with 2 cores allocated then they should
> > have been showing 200% cpu usage per process (2 cores consumed), not
> > just 100% (1 core). Are the processes not multithreaded in that way?
>
> You have one worker process per physical core. One process cannot
> consume more than 100% of anything. Workers have no threads (for this
> discussion, you can view each worker as a thread if you wish). And two
> virtual cores are a red herring -- in a context of a single busy
> process, they only add overheads.

Shame, thought the robots might have run as individual threads under a
worker, make more use of SMP.

> Cheers,
>
> Alex.

Regards,

William

-------------- next part --------------
~]# gdb --args /usr/local/bin/polygraph-server --cfg_dirs /usr/local/share/polygraph/polytests/em1 --config "10g-web1hr-smp-v2.pg" --log /var/log/polygraph/em1/smp/pclient-%worker.log --idle_tout 300sec
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-92.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
...
Reading symbols from /usr/local/bin/polygraph-server...done.
(gdb) run Starting program: /usr/local/bin/polygraph-server --cfg_dirs /usr/local/share/polygraph/polytests/em1 --config 10g-web1hr-smp-v2.pg --log /var/log/polygraph/em1/smp/pclient-%worker.log --idle_tout 300sec [Thread debugging using libthread_db enabled] 000.00| current time: 1533688009.243832 or Wed, 08 Aug 2018 00:26:49 GMT *** glibc detected *** /usr/local/bin/polygraph-server: double free or corruption (fasttop): 0x00000000007c11a0 *** ======= Backtrace: ========= /lib64/libc.so.6[0x31d6a75e5e] /lib64/libc.so.6[0x31d6a78cad] /usr/local/bin/polygraph-server[0x40cbe1] /usr/local/bin/polygraph-server[0x41ad92] /usr/local/bin/polygraph-server[0x41baec] /usr/local/bin/polygraph-server[0x4077ba] /lib64/libc.so.6(__libc_start_main+0x100)[0x31d6a1ed20] /usr/local/bin/polygraph-server[0x407409] ======= Memory map: ======== 00400000-00555000 r-xp 00000000 fd:00 1706329 /usr/local/bin/polygraph-server 00755000-00758000 rw-p 00155000 fd:00 1706329 /usr/local/bin/polygraph-servera 00758000-01572000 rw-p 00000000 00:00 0 [heap] 31d6600000-31d6620000 r-xp 00000000 fd:00 1441797 /lib64/ld-2.12.so 31d6820000-31d6821000 r--p 00020000 fd:00 1441797 /lib64/ld-2.12.so 31d6821000-31d6822000 rw-p 00021000 fd:00 1441797 /lib64/ld-2.12.so 31d6822000-31d6823000 rw-p 00000000 00:00 0 31d6a00000-31d6b8b000 r-xp 00000000 fd:00 1441800 /lib64/libc-2.12.so 31d6b8b000-31d6d8a000 ---p 0018b000 fd:00 1441800 /lib64/libc-2.12.so 31d6d8a000-31d6d8e000 r--p 0018a000 fd:00 1441800 /lib64/libc-2.12.so 31d6d8e000-31d6d90000 rw-p 0018e000 fd:00 1441800 /lib64/libc-2.12.so 31d6d90000-31d6d94000 rw-p 00000000 00:00 0 31d6e00000-31d6e02000 r-xp 00000000 fd:00 1441821 /lib64/libdl-2.12.so 31d6e02000-31d7002000 ---p 00002000 fd:00 1441821 /lib64/libdl-2.12.so 31d7002000-31d7003000 r--p 00002000 fd:00 1441821 /lib64/libdl-2.12.so 31d7003000-31d7004000 rw-p 00003000 fd:00 1441821 /lib64/libdl-2.12.so 31d7200000-31d7217000 r-xp 00000000 fd:00 1441801 /lib64/libpthread-2.12.so 31d7217000-31d7417000 ---p 00017000 fd:00 1441801 /lib64/libpthread-2.12.so 31d7417000-31d7418000 r--p 00017000 fd:00 1441801 /lib64/libpthread-2.12.so 31d7418000-31d7419000 rw-p 00018000 fd:00 1441801 /lib64/libpthread-2.12.so 31d7419000-31d741d000 rw-p 00000000 00:00 0 31d7a00000-31d7a83000 r-xp 00000000 fd:00 1441811 /lib64/libm-2.12.so 31d7a83000-31d7c82000 ---p 00083000 fd:00 1441811 /lib64/libm-2.12.so 31d7c82000-31d7c83000 r--p 00082000 fd:00 1441811 /lib64/libm-2.12.so 31d7c83000-31d7c84000 rw-p 00083000 fd:00 1441811 /lib64/libm-2.12.so 31d7e00000-31d7e15000 r-xp 00000000 fd:00 1441824 /lib64/libz.so.1.2.3 31d7e15000-31d8014000 ---p 00015000 fd:00 1441824 /lib64/libz.so.1.2.3 31d8014000-31d8015000 r--p 00014000 fd:00 1441824 /lib64/libz.so.1.2.3 31d8015000-31d8016000 rw-p 00015000 fd:00 1441824 /lib64/libz.so.1.2.3 31d8200000-31d821d000 r-xp 00000000 fd:00 1441831 /lib64/libselinux.so.1 31d821d000-31d841c000 ---p 0001d000 fd:00 1441831 /lib64/libselinux.so.1 31d841c000-31d841d000 r--p 0001c000 fd:00 1441831 /lib64/libselinux.so.1 31d841d000-31d841e000 rw-p 0001d000 fd:00 1441831 /lib64/libselinux.so.1 31d841e000-31d841f000 rw-p 00000000 00:00 0 31d8a00000-31d8a16000 r-xp 00000000 fd:00 1441823 /lib64/libresolv-2.12.so 31d8a16000-31d8c16000 ---p 00016000 fd:00 1441823 /lib64/libresolv-2.12.so 31d8c16000-31d8c17000 r--p 00016000 fd:00 1441823 /lib64/libresolv-2.12.so 31d8c17000-31d8c18000 rw-p 00017000 fd:00 1441823 /lib64/libresolv-2.12.so 31d8c18000-31d8c1a000 rw-p 00000000 00:00 0 31d8e00000-31d8e16000 r-xp 00000000 fd:00 1441812 
/lib64/libgcc_s-4.4.7-20120601.so.1 31d8e16000-31d9015000 ---p 00016000 fd:00 1441812 /lib64/libgcc_s-4.4.7-20120601.so.1 31d9015000-31d9016000 rw-p 00015000 fd:00 1441812 /lib64/libgcc_s-4.4.7-20120601.so.1 31d9200000-31d92e8000 r-xp 00000000 fd:00 1703947 /usr/lib64/libstdc++.so.6.0.13 31d92e8000-31d94e8000 ---p 000e8000 fd:00 1703947 /usr/lib64/libstdc++.so.6.0.13 31d94e8000-31d94ef000 r--p 000e8000 fd:00 1703947 /usr/lib64/libstdc++.so.6.0.13 31d94ef000-31d94f1000 rw-p 000ef000 fd:00 1703947 /usr/lib64/libstdc++.so.6.0.13 31d94f1000-31d9506000 rw-p 00000000 00:00 0 31dc200000-31dc3ba000 r-xp 00000000 fd:00 1710432 /usr/lib64/libcrypto.so.1.0.1e 31dc3ba000-31dc5ba000 ---p 001ba000 fd:00 1710432 /usr/lib64/libcrypto.so.1.0.1e 31dc5ba000-31dc5d5000 r--p 001ba000 fd:00 1710432 /usr/lib64/libcrypto.so.1.0.1e 31dc5d5000-31dc5e1000 rw-p 001d5000 fd:00 1710432 /usr/lib64/libcrypto.so.1.0.1e 31dc5e1000-31dc5e5000 rw-p 00000000 00:00 0 31dca00000-31dca03000 r-xp 00000000 fd:00 1442034 /lib64/libcom_err.so.2.1 31dca03000-31dcc02000 ---p 00003000 fd:00 1442034 /lib64/libcom_err.so.2.1 31dcc02000-31dcc03000 r--p 00002000 fd:00 1442034 /lib64/libcom_err.so.2.1 31dcc03000-31dcc04000 rw-p 00003000 fd:00 1442034 /lib64/libcom_err.so.2.1 31de200000-31de2dc000 r-xp 00000000 fd:00 1442133 /lib64/libkrb5.so.3.3 31de2dc000-31de4db000 ---p 000dc000 fd:00 1442133 /lib64/libkrb5.so.3.3 31de4db000-31de4e5000 r--p 000db000 fd:00 1442133 /lib64/libkrb5.so.3.3 31de4e5000-31de4e7000 rw-p 000e5000 fd:00 1442133 /lib64/libkrb5.so.3.3 31de600000-31de602000 r-xp 00000000 fd:00 1442001 /lib64/libkeyutils.so.1.3 31de602000-31de801000 ---p 00002000 fd:00 1442001 /lib64/libkeyutils.so.1.3 31de801000-31de802000 r--p 00001000 fd:00 1442001 /lib64/libkeyutils.so.1.3 31de802000-31de803000 rw-p 00002000 fd:00 1442001 /lib64/libkeyutils.so.1.3 31df600000-31df60a000 r-xp 00000000 fd:00 1442022 /lib64/libkrb5support.so.0.1 31df60a000-31df809000 ---p 0000a000 fd:00 1442022 /lib64/libkrb5support.so.0.1 31df809000-31df80a000 r--p 00009000 fd:00 1442022 /lib64/libkrb5support.so.0.1 31df80a000-31df80b000 rw-p 0000a000 fd:00 1442022 /lib64/libkrb5support.so.0.1 31dfa00000-31dfa29000 r-xp 00000000 fd:00 1442024 /lib64/libk5crypto.so.3.1 31dfa29000-31dfc29000 ---p 00029000 fd:00 1442024 /lib64/libk5crypto.so.3.1 31dfc29000-31dfc2a000 r--p 00029000 fd:00 1442024 /lib64/libk5crypto.so.3.1 31dfc2a000-31dfc2b000 rw-p 0002a000 fd:00 1442024 /lib64/libk5crypto.so.3.1 31dfc2b000-31dfc2c000 rw-p 00000000 00:00 0 31dfe00000-31dfe41000 r-xp 00000000 fd:00 1442137 /lib64/libgssapi_krb5.so.2.2 31dfe41000-31e0041000 ---p 00041000 fd:00 1442137 /lib64/libgssapi_krb5.so.2.2 31e0041000-31e0042000 r--p 00041000 fd:00 1442137 /lib64/libgssapi_krb5.so.2.2 31e0042000-31e0044000 rw-p 00042000 fd:00 1442137 /lib64/libgssapi_krb5.so.2.2 31e3200000-31e3262000 r-xp 00000000 fd:00 1712801 /usr/lib64/libssl.so.1.0.1e 31e3262000-31e3462000 ---p 00062000 fd:00 1712801 /usr/lib64/libssl.so.1.0.1e 31e3462000-31e3466000 r--p 00062000 fd:00 1712801 /usr/lib64/libssl.so.1.0.1e 31e3466000-31e346c000 rw-p 00066000 fd:00 1712801 /usr/lib64/libssl.so.1.0.1e 7ffff0000000-7ffff0021000 rw-p 00000000 00:00 0 7ffff0021000-7ffff4000000 ---p 00000000 00:00 0 7ffff5f7c000-7ffff7ff2000 rw-p 00000000 00:00 0 7ffff7ffd000-7ffff7ffe000 rw-p 00000000 00:00 0 7ffff7ffe000-7ffff7fff000 r-xp 00000000 00:00 0 [vdso] 7ffffffea000-7ffffffff000 rw-p 00000000 00:00 0 [stack] ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall] Program received signal SIGABRT, Aborted. 
0x00000031d6a324f5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 64 return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig); (gdb) bt full #0 0x00000031d6a324f5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 resultvar = 0 pid = selftid = #1 0x00000031d6a33cd5 in abort () at abort.c:92 save_stage = 2 act = {__sigaction_handler = {sa_handler = 0x7fffffffd2d8, sa_sigaction = 0x7fffffffd2d8}, sa_mask = {__val = { 140737488343744, 140737488348762, 31, 214055614991, 3, 140737488343754, 6, 214055614995, 2, 140737488343741, 3, 214055608260, 1, 214055614991, 3, 140737488343750}}, sa_flags = 10, sa_restorer = 0x31d6b57a13} sigs = {__val = {32, 0 }} #2 0x00000031d6a70417 in __libc_message (do_abort=2, fmt=0x31d6b58c00 "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:198 ap = {{gp_offset = 40, fp_offset = 48, overflow_arg_area = 0x7fffffffdc40, reg_save_area = 0x7fffffffdb50}} ap_copy = {{gp_offset = 16, fp_offset = 48, overflow_arg_area = 0x7fffffffdc40, reg_save_area = 0x7fffffffdb50}} fd = 9 on_2 = list = nlist = cp = written = #3 0x00000031d6a75e5e in malloc_printerr (action=3, str=0x31d6b58ed8 "double free or corruption (fasttop)", ptr=, ar_ptr=) at malloc.c:6360 buf = "00000000007c11a0" cp = #4 0x00000031d6a78cad in _int_free (av=0x31d6d8e120, p=0x7c1190, have_lock=0) at malloc.c:4846 size = fb = nextchunk = nextsize = nextinuse = prevsize = bck = fwd = errstr = locked = #5 0x000000000040cbe1 in Array::~Array (this=, __in_chrg=) at ../../src/xstd/Array.h:30 No locals. #6 0x000000000041ad92 in ~Array (this=0x7fffffffe1a0) at ../../src/xstd/Array.h:30 No locals. #7 PolyApp::makeAgents (this=0x7fffffffe1a0) at PolyApp.cc:899 localAgents = {_vptr.Array = 0x4e7d10, theItems = 0x1524640, theCapacity = 256, theCount = 180} cpuCores = {_vptr.Array = 0x4e7f30, theItems = 0x1515608, theCapacity = 32, theCount = 20} localHostCount = 20 agentBegin = cfgReportGap = { = {tv_sec = 5, tv_usec = 0}, } cfgNextReport = { = {tv_sec = 1533688015, tv_usec = 289139}, } ifaceMap = {theStaticIndex = {_vptr.Array = 0x4eebb0, theItems = 0x860630, theCapacity = 265, theCount = 265}, theDynamicIndex = {_vptr.Array = 0x4eebb0, theItems = 0x7c1670, theCapacity = 0, theCount = 0}, theCount = 193} agentCount = hosts = {> = {_vptr.Array = 0x4e7e50, theItems = 0x1522060, theCapacity = 32, theCount = 20}, } localCpuCores = {_vptr.Array = 0x4e7f30, theItems = 0x15152f8, theCapacity = 32, theCount = 20} agentEnd = localAgentAddreses = {> = {_vptr.Array = 0x4e7d70, theItems = 0x1524e50, theCapacity = 256, theCount = 180}, } #8 0x000000000041baec in PolyApp::run (this=0x7fffffffe1a0, argc=9, argv=0x7fffffffe398) at PolyApp.cc:1173 No locals. 
#9 0x00000000004077ba in main (argc=9, argv=0x7fffffffe398) at PolySrv.cc:73 app = { = { = {_vptr.FileScanTicker = 0x4e4cd0}, = {_vptr.BcastRcver = 0x4e4d88, theChannels = {_vptr.Array = 0x4e56b0, theItems = 0x7a5950, theCapacity = 16, theCount = 1}}, theCmdLine = { _vptr.CmdLine = 0x5101b0, theArgs = {_vptr.Array = 0x4e6170, theItems = 0x7a5ad0, theCapacity = 16, theCount = 9}, theOpts = {_vptr.Array = 0x4e5970, theItems = 0x7a5c40, theCapacity = 64, theCount = 34}, theAnonymParser = 0x0, thePrgName = {static npos = 2147483647, theBuf = 0x787dc0}}, thePglCfg = {static npos = 2147483647, theBuf = 0x7a62f0}, theIfaces = {_vptr.Array = 0x4e82b0, theItems = 0x154fab0, theCapacity = 194, theCount = 194}, theLocals = {_vptr.Array = 0x4e8250, theItems = 0x7a5910, theCapacity = 0, theCount = 0}, theContentSels = { _vptr.Array = 0x4e81f0, theItems = 0x7f5e60, theCapacity = 2, theCount = 2}, theAgentType = { static npos = 2147483647, theBuf = 0x77f300}, theBeepDoorman = 0x0, theIdleBeg = { = {tv_sec = -1, tv_usec = -1}, }, theIdleEnd = { = {tv_sec = -1, tv_usec = -1}, }, isIdle = false, theTickCount = 0, theStateCount = 0, theWorkerCount = 20}, } (gdb) -------------- next part -------------- ~]# gdb --args /usr/local/bin/polygraph-server --cfg_dirs /usr/local/share/polygraph/polytests/em1 --config "10g-web1hr-smp-v2.pg" --log /var/log/polygraph/em1/smp/pclient-%worker.log --idle_tout 300secGNU gdb (GDB) Red Hat Enterprise Linux (7.2-92.el6) Copyright (C) 2010 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-redhat-linux-gnu". For bug reporting instructions, please see: ... Reading symbols from /usr/local/bin/polygraph-server...done. 
(gdb) run Starting program: /usr/local/bin/polygraph-server --cfg_dirs /usr/local/share/polygraph/polytests/em1 --config 10g-web1hr-smp-v2.pg --log /var/log/polygraph/em1/smp/pclient-%worker.log --idle_tout 300sec [Thread debugging using libthread_db enabled] 000.00| current time: 1533687751.947364 or Wed, 08 Aug 2018 00:22:31 GMT *** glibc detected *** /usr/local/bin/polygraph-server: double free or corruption (fasttop): 0x00000000007c11a0 *** ======= Backtrace: ========= /lib64/libc.so.6[0x31d6a75e5e] /lib64/libc.so.6[0x31d6a78cad] /usr/local/bin/polygraph-server[0x40cbe1] /usr/local/bin/polygraph-server[0x41ad92] /usr/local/bin/polygraph-server[0x41baec] /usr/local/bin/polygraph-server[0x4077ba] /lib64/libc.so.6(__libc_start_main+0x100)[0x31d6a1ed20] /usr/local/bin/polygraph-server[0x407409] ======= Memory map: ======== 00400000-00555000 r-xp 00000000 fd:00 1706329 /usr/local/bin/polygraph-server 00755000-00758000 rw-p 00155000 fd:00 1706329 /usr/local/bin/polygraph-server 00758000-01572000 rw-p 00000000 00:00 0 [heap] 31d6600000-31d6620000 r-xp 00000000 fd:00 1441797 /lib64/ld-2.12.so 31d6820000-31d6821000 r--p 00020000 fd:00 1441797 /lib64/ld-2.12.so 31d6821000-31d6822000 rw-p 00021000 fd:00 1441797 /lib64/ld-2.12.so 31d6822000-31d6823000 rw-p 00000000 00:00 0 31d6a00000-31d6b8b000 r-xp 00000000 fd:00 1441800 /lib64/libc-2.12.so 31d6b8b000-31d6d8a000 ---p 0018b000 fd:00 1441800 /lib64/libc-2.12.so 31d6d8a000-31d6d8e000 r--p 0018a000 fd:00 1441800 /lib64/libc-2.12.so 31d6d8e000-31d6d90000 rw-p 0018e000 fd:00 1441800 /lib64/libc-2.12.so 31d6d90000-31d6d94000 rw-p 00000000 00:00 0 31d6e00000-31d6e02000 r-xp 00000000 fd:00 1441821 /lib64/libdl-2.12.so 31d6e02000-31d7002000 ---p 00002000 fd:00 1441821 /lib64/libdl-2.12.so 31d7002000-31d7003000 r--p 00002000 fd:00 1441821 /lib64/libdl-2.12.so 31d7003000-31d7004000 rw-p 00003000 fd:00 1441821 /lib64/libdl-2.12.so 31d7200000-31d7217000 r-xp 00000000 fd:00 1441801 /lib64/libpthread-2.12.so 31d7217000-31d7417000 ---p 00017000 fd:00 1441801 /lib64/libpthread-2.12.so 31d7417000-31d7418000 r--p 00017000 fd:00 1441801 /lib64/libpthread-2.12.so 31d7418000-31d7419000 rw-p 00018000 fd:00 1441801 /lib64/libpthread-2.12.so 31d7419000-31d741d000 rw-p 00000000 00:00 0 31d7a00000-31d7a83000 r-xp 00000000 fd:00 1441811 /lib64/libm-2.12.so 31d7a83000-31d7c82000 ---p 00083000 fd:00 1441811 /lib64/libm-2.12.so 31d7c82000-31d7c83000 r--p 00082000 fd:00 1441811 /lib64/libm-2.12.so 31d7c83000-31d7c84000 rw-p 00083000 fd:00 1441811 /lib64/libm-2.12.so 31d7e00000-31d7e15000 r-xp 00000000 fd:00 1441824 /lib64/libz.so.1.2.3 31d7e15000-31d8014000 ---p 00015000 fd:00 1441824 /lib64/libz.so.1.2.3 31d8014000-31d8015000 r--p 00014000 fd:00 1441824 /lib64/libz.so.1.2.3 31d8015000-31d8016000 rw-p 00015000 fd:00 1441824 /lib64/libz.so.1.2.3 31d8200000-31d821d000 r-xp 00000000 fd:00 1441831 /lib64/libselinux.so.1 31d821d000-31d841c000 ---p 0001d000 fd:00 1441831 /lib64/libselinux.so.1 31d841c000-31d841d000 r--p 0001c000 fd:00 1441831 /lib64/libselinux.so.1 31d841d000-31d841e000 rw-p 0001d000 fd:00 1441831 /lib64/libselinux.so.1 31d841e000-31d841f000 rw-p 00000000 00:00 0 31d8a00000-31d8a16000 r-xp 00000000 fd:00 1441823 /lib64/libresolv-2.12.so 31d8a16000-31d8c16000 ---p 00016000 fd:00 1441823 /lib64/libresolv-2.12.so 31d8c16000-31d8c17000 r--p 00016000 fd:00 1441823 /lib64/libresolv-2.12.so 31d8c17000-31d8c18000 rw-p 00017000 fd:00 1441823 /lib64/libresolv-2.12.so 31d8c18000-31d8c1a000 rw-p 00000000 00:00 0 31d8e00000-31d8e16000 r-xp 00000000 fd:00 1441812 
/lib64/libgcc_s-4.4.7-20120601.so.1 31d8e16000-31d9015000 ---p 00016000 fd:00 1441812 /lib64/libgcc_s-4.4.7-20120601.so.1 31d9015000-31d9016000 rw-p 00015000 fd:00 1441812 /lib64/libgcc_s-4.4.7-20120601.so.1 31d9200000-31d92e8000 r-xp 00000000 fd:00 1703947 /usr/lib64/libstdc++.so.6.0.13 31d92e8000-31d94e8000 ---p 000e8000 fd:00 1703947 /usr/lib64/libstdc++.so.6.0.13 31d94e8000-31d94ef000 r--p 000e8000 fd:00 1703947 /usr/lib64/libstdc++.so.6.0.13 31d94ef000-31d94f1000 rw-p 000ef000 fd:00 1703947 /usr/lib64/libstdc++.so.6.0.13 31d94f1000-31d9506000 rw-p 00000000 00:00 0 31dc200000-31dc3ba000 r-xp 00000000 fd:00 1710432 /usr/lib64/libcrypto.so.1.0.1e 31dc3ba000-31dc5ba000 ---p 001ba000 fd:00 1710432 /usr/lib64/libcrypto.so.1.0.1e 31dc5ba000-31dc5d5000 r--p 001ba000 fd:00 1710432 /usr/lib64/libcrypto.so.1.0.1e 31dc5d5000-31dc5e1000 rw-p 001d5000 fd:00 1710432 /usr/lib64/libcrypto.so.1.0.1e 31dc5e1000-31dc5e5000 rw-p 00000000 00:00 0 31dca00000-31dca03000 r-xp 00000000 fd:00 1442034 /lib64/libcom_err.so.2.1 31dca03000-31dcc02000 ---p 00003000 fd:00 1442034 /lib64/libcom_err.so.2.1 31dcc02000-31dcc03000 r--p 00002000 fd:00 1442034 /lib64/libcom_err.so.2.1 31dcc03000-31dcc04000 rw-p 00003000 fd:00 1442034 /lib64/libcom_err.so.2.1 31de200000-31de2dc000 r-xp 00000000 fd:00 1442133 /lib64/libkrb5.so.3.3 31de2dc000-31de4db000 ---p 000dc000 fd:00 1442133 /lib64/libkrb5.so.3.3 31de4db000-31de4e5000 r--p 000db000 fd:00 1442133 /lib64/libkrb5.so.3.3 31de4e5000-31de4e7000 rw-p 000e5000 fd:00 1442133 /lib64/libkrb5.so.3.3 31de600000-31de602000 r-xp 00000000 fd:00 1442001 /lib64/libkeyutils.so.1.3 31de602000-31de801000 ---p 00002000 fd:00 1442001 /lib64/libkeyutils.so.1.3 31de801000-31de802000 r--p 00001000 fd:00 1442001 /lib64/libkeyutils.so.1.3 31de802000-31de803000 rw-p 00002000 fd:00 1442001 /lib64/libkeyutils.so.1.3 31df600000-31df60a000 r-xp 00000000 fd:00 1442022 /lib64/libkrb5support.so.0.1 31df60a000-31df809000 ---p 0000a000 fd:00 1442022 /lib64/libkrb5support.so.0.1 31df809000-31df80a000 r--p 00009000 fd:00 1442022 /lib64/libkrb5support.so.0.1 31df80a000-31df80b000 rw-p 0000a000 fd:00 1442022 /lib64/libkrb5support.so.0.1 31dfa00000-31dfa29000 r-xp 00000000 fd:00 1442024 /lib64/libk5crypto.so.3.1 31dfa29000-31dfc29000 ---p 00029000 fd:00 1442024 /lib64/libk5crypto.so.3.1 31dfc29000-31dfc2a000 r--p 00029000 fd:00 1442024 /lib64/libk5crypto.so.3.1 31dfc2a000-31dfc2b000 rw-p 0002a000 fd:00 1442024 /lib64/libk5crypto.so.3.1 31dfc2b000-31dfc2c000 rw-p 00000000 00:00 0 31dfe00000-31dfe41000 r-xp 00000000 fd:00 1442137 /lib64/libgssapi_krb5.so.2.2 31dfe41000-31e0041000 ---p 00041000 fd:00 1442137 /lib64/libgssapi_krb5.so.2.2 31e0041000-31e0042000 r--p 00041000 fd:00 1442137 /lib64/libgssapi_krb5.so.2.2 31e0042000-31e0044000 rw-p 00042000 fd:00 1442137 /lib64/libgssapi_krb5.so.2.2 31e3200000-31e3262000 r-xp 00000000 fd:00 1712801 /usr/lib64/libssl.so.1.0.1e 31e3262000-31e3462000 ---p 00062000 fd:00 1712801 /usr/lib64/libssl.so.1.0.1e 31e3462000-31e3466000 r--p 00062000 fd:00 1712801 /usr/lib64/libssl.so.1.0.1e 31e3466000-31e346c000 rw-p 00066000 fd:00 1712801 /usr/lib64/libssl.so.1.0.1e 7ffff0000000-7ffff0021000 rw-p 00000000 00:00 0 7ffff0021000-7ffff4000000 ---p 00000000 00:00 0 7ffff5f7c000-7ffff7ff2000 rw-p 00000000 00:00 0 7ffff7ffd000-7ffff7ffe000 rw-p 00000000 00:00 0 7ffff7ffe000-7ffff7fff000 r-xp 00000000 00:00 0 [vdso] 7ffffffea000-7ffffffff000 rw-p 00000000 00:00 0 [stack] ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall] Program received signal SIGABRT, Aborted. 
0x00000031d6a324f5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 64 return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig); (gdb) bt full #0 0x00000031d6a324f5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 resultvar = 0 pid = selftid = #1 0x00000031d6a33cd5 in abort () at abort.c:92 save_stage = 2 act = {__sigaction_handler = {sa_handler = 0x7fffffffd2d8, sa_sigaction = 0x7fffffffd2d8}, sa_mask = {__val = { 140737488343744, 140737488348762, 31, 214055614991, 3, 140737488343754, 6, 214055614995, 2, 140737488343741, 3, 214055608260, 1, 214055614991, 3, 140737488343750}}, sa_flags = 10, sa_restorer = 0x31d6b57a13} sigs = {__val = {32, 0 }} #2 0x00000031d6a70417 in __libc_message (do_abort=2, fmt=0x31d6b58c00 "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:198 ap = {{gp_offset = 40, fp_offset = 48, overflow_arg_area = 0x7fffffffdc40, reg_save_area = 0x7fffffffdb50}} ap_copy = {{gp_offset = 16, fp_offset = 48, overflow_arg_area = 0x7fffffffdc40, reg_save_area = 0x7fffffffdb50}} fd = 9 on_2 = list = nlist = cp = written = #3 0x00000031d6a75e5e in malloc_printerr (action=3, str=0x31d6b58ed8 "double free or corruption (fasttop)", ptr=, ar_ptr=) at malloc.c:6360 buf = "00000000007c11a0" cp = #4 0x00000031d6a78cad in _int_free (av=0x31d6d8e120, p=0x7c1190, have_lock=0) at malloc.c:4846 size = fb = nextchunk = nextsize = nextinuse = prevsize = bck = fwd = errstr = locked = #5 0x000000000040cbe1 in Array::~Array (this=, __in_chrg=) at ../../src/xstd/Array.h:30 No locals. #6 0x000000000041ad92 in ~Array (this=0x7fffffffe1a0) at ../../src/xstd/Array.h:30 No locals. #7 PolyApp::makeAgents (this=0x7fffffffe1a0) at PolyApp.cc:899 localAgents = {_vptr.Array = 0x4e7d10, theItems = 0x1524640, theCapacity = 256, theCount = 180} cpuCores = {_vptr.Array = 0x4e7f30, theItems = 0x1515608, theCapacity = 32, theCount = 20} localHostCount = 20 agentBegin = cfgReportGap = { = {tv_sec = 5, tv_usec = 0}, } cfgNextReport = { = {tv_sec = 1533687757, tv_usec = 992722}, } ifaceMap = {theStaticIndex = {_vptr.Array = 0x4eebb0, theItems = 0x860630, theCapacity = 265, theCount = 265}, theDynamicIndex = {_vptr.Array = 0x4eebb0, theItems = 0x7c1670, theCapacity = 0, theCount = 0}, theCount = 193} agentCount = hosts = {> = {_vptr.Array = 0x4e7e50, theItems = 0x1522060, theCapacity = 32, theCount = 20}, } localCpuCores = {_vptr.Array = 0x4e7f30, theItems = 0x15152f8, theCapacity = 32, theCount = 20} agentEnd = localAgentAddreses = {> = {_vptr.Array = 0x4e7d70, theItems = 0x1524e50, theCapacity = 256, theCount = 180}, } #8 0x000000000041baec in PolyApp::run (this=0x7fffffffe1a0, argc=9, argv=0x7fffffffe398) at PolyApp.cc:1173 No locals. 
#9 0x00000000004077ba in main (argc=9, argv=0x7fffffffe398) at PolySrv.cc:73 app = { = { = {_vptr.FileScanTicker = 0x4e4cd0}, = {_vptr.BcastRcver = 0x4e4d88, theChannels = {_vptr.Array = 0x4e56b0, theItems = 0x7a5950, theCapacity = 16, theCount = 1}}, theCmdLine = { _vptr.CmdLine = 0x5101b0, theArgs = {_vptr.Array = 0x4e6170, theItems = 0x7a5ad0, theCapacity = 16, theCount = 9}, theOpts = {_vptr.Array = 0x4e5970, theItems = 0x7a5c40, theCapacity = 64, theCount = 34}, theAnonymParser = 0x0, thePrgName = {static npos = 2147483647, theBuf = 0x787dc0}}, thePglCfg = {static npos = 2147483647, theBuf = 0x7a62f0}, theIfaces = {_vptr.Array = 0x4e82b0, theItems = 0x154fab0, theCapacity = 194, theCount = 194}, theLocals = {_vptr.Array = 0x4e8250, theItems = 0x7a5910, theCapacity = 0, theCount = 0}, theContentSels = { _vptr.Array = 0x4e81f0, theItems = 0x7f5e60, theCapacity = 2, theCount = 2}, theAgentType = { static npos = 2147483647, theBuf = 0x77f300}, theBeepDoorman = 0x0, theIdleBeg = { = {tv_sec = -1, tv_usec = -1}, }, theIdleEnd = { = {tv_sec = -1, tv_usec = -1}, }, isIdle = false, theTickCount = 0, theStateCount = 0, theWorkerCount = 20}, } (gdb) From rousskov at measurement-factory.com Wed Aug 8 01:19:42 2018 From: rousskov at measurement-factory.com (Alex Rousskov) Date: Tue, 7 Aug 2018 19:19:42 -0600 Subject: Bench SMP mode In-Reply-To: <065050103a9a6dc003c9ee775dcf5340@mail.gmail.com> References: <5885a856e68e7b7c9e30f00df4586299@mail.gmail.com> <7f867bef1bd206b7f6bee7c79cbfef70@mail.gmail.com> <4d137a51-1594-dd2a-65ef-eb8d531df756@measurement-factory.com> <065050103a9a6dc003c9ee775dcf5340@mail.gmail.com> Message-ID: On 08/07/2018 06:28 PM, William Law wrote: > This is what the cpu information looks like on the boxes: > ~]# lscpu > Architecture: x86_64 > CPU op-mode(s): 32-bit, 64-bit > Byte Order: Little Endian > CPU(s): 80 > On-line CPU(s) list: 0-79 > Thread(s) per core: 2 > Core(s) per socket: 10 > Socket(s): 4 > NUMA node(s): 4 > Vendor ID: GenuineIntel > CPU family: 6 > Model: 79 > Model name: Intel(R) Xeon(R) CPU E5-4620 v4 @ 2.10GHz > Stepping: 1 > CPU MHz: 2095.127 > BogoMIPS: 4190.02 > Virtualization: VT-x > L1d cache: 32K > L1i cache: 32K > L2 cache: 256K > L3 cache: 25600K > NUMA node0 CPU(s): 0,4,8,12,16,20,24,28,32,36,40,44,48,52,56,60,64,68,72,76 > NUMA node1 CPU(s): 1,5,9,13,17,21,25,29,33,37,41,45,49,53,57,61,65,69,73,77 > NUMA node2 CPU(s): 2,6,10,14,18,22,26,30,34,38,42,46,50,54,58,62,66,70,74,78 > NUMA node3 CPU(s): 3,7,11,15,19,23,27,31,35,39,43,47,51,55,59,63,67,71,75,79 > ~]# > I'm working on the logic that every second cpu on a node (aka socket) is the > "logical" second hyperthread cpu. Looking at the CPU load when pushing > traffic I see the same behaviour with software interrupts loading up > particular cores. I hope somebody more knowledgeable about CPU architectures can validate your theory. I wish I could! As for the network interrupts, YMMV, but they tend to migrate towards busy workers/CPUs in the tests I have seen, which is not necessarily a good thing when the worker is close to maxing out a CPU core. Confining interrupts to dedicated cores may improve overall performance. > See attached! (will email you direct if the list whinges). Setup for 20 > cores, 1 dump with 2 cores per worker, the other with 1 core per worker. Thank you for sharing these helpful backtraces. >> You have one worker process per physical core. One process cannot >> consume more than 100% of anything. 
Workers have no threads (for this >> discussion, you can view each worker as a thread if you wish). And two >> virtual cores are a red herring -- in a context of a single busy >> process, they only add overheads. > Shame, thought the robots might have run as individual threads under a > worker, make more use of SMP. I am not sure I share your disappointment in terms of performance: A robot=thread model would only scale well for very busy robots, which is both unrealistic (in most cases) and already supported (by configuring one robot per worker). Polygraph was born before SMP became a thing on regular machines we used for drones. If we were to write it from scratch today, we would have used threads for ease of worker management/synchronization, but we would still not dedicate a thread to each robot because such rigid and expensive architecture would not scale in many realistic simulations that use thousands of robots. Cheers, Alex. From william.law at tesserent.com Wed Aug 8 02:16:52 2018 From: william.law at tesserent.com (William Law) Date: Wed, 8 Aug 2018 12:16:52 +1000 Subject: Bench SMP mode In-Reply-To: References: <5885a856e68e7b7c9e30f00df4586299@mail.gmail.com> <7f867bef1bd206b7f6bee7c79cbfef70@mail.gmail.com> <4d137a51-1594-dd2a-65ef-eb8d531df756@measurement-factory.com> <065050103a9a6dc003c9ee775dcf5340@mail.gmail.com> Message-ID: <5ff923507308cbb47167672f903792ad@mail.gmail.com> > As for the network interrupts, YMMV, but they tend to migrate towards > busy workers/CPUs in the tests I have seen, which is not necessarily a > good thing when the worker is close to maxing out a CPU core. Confining > interrupts to dedicated cores may improve overall performance. In multi-socket architectures PCI-e ports are bound to a particular socket (aka NUMA node). In this case the dual-port Intel X710 card that I am using is bound to the first socket. In top this shows as SI usage on the cores associated with that socket. > > See attached! (will email you direct if the list whinges). Setup for 20 > > cores, 1 dump with 2 cores per worker, the other with 1 core per worker. > > Thank you for sharing these helpful backtraces. Hope you find something useful! > I am not sure I share your disappointment in terms of performance: A > robot=thread model would only scale well for very busy robots, which is > both unrealistic (in most cases) and already supported (by configuring > one robot per worker). > > Polygraph was born before SMP became a thing on regular machines we used > for drones. If we were to write it from scratch today, we would have > used threads for ease of worker management/synchronization, but we would > still not dedicate a thread to each robot because such rigid and > expensive architecture would not scale in many realistic simulations > that use thousands of robots. All good, does well for what it is. Knowing more about the internal architecture allows me to work around it (more cores!) and still achieve the testing goals. > Cheers, > > Alex. 
Regards,

William

From rousskov at measurement-factory.com Tue Aug 14 15:47:58 2018
From: rousskov at measurement-factory.com (Alex Rousskov)
Date: Tue, 14 Aug 2018 09:47:58 -0600
Subject: Bench SMP mode
In-Reply-To: <4d137a51-1594-dd2a-65ef-eb8d531df756@measurement-factory.com>
References: <5885a856e68e7b7c9e30f00df4586299@mail.gmail.com>
 <7f867bef1bd206b7f6bee7c79cbfef70@mail.gmail.com>
 <4d137a51-1594-dd2a-65ef-eb8d531df756@measurement-factory.com>
Message-ID: 

On 08/07/2018 09:54 AM, Alex Rousskov wrote:
> On 08/06/2018 11:55 PM, William Law wrote:
>> I can only configure a maximum of 16 core sets before
>> polygraph-server segfaults

> We will try to find the time to reproduce and fix.

Done. A compressed patch for Polygraph v4 is attached.

HTH,

Alex.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: v4-f6a4949.patch.gz
Type: application/gzip
Size: 5508 bytes
Desc: not available
URL: 

From william.law at tesserent.com Thu Aug 16 00:28:34 2018
From: william.law at tesserent.com (William Law)
Date: Thu, 16 Aug 2018 10:28:34 +1000
Subject: Bench SMP mode
In-Reply-To: <6624b168567be658f533f2d40bcc5fe4@mail.gmail.com>
References: <5885a856e68e7b7c9e30f00df4586299@mail.gmail.com>
 <7f867bef1bd206b7f6bee7c79cbfef70@mail.gmail.com>
 <4d137a51-1594-dd2a-65ef-eb8d531df756@measurement-factory.com>
 <6077447b07539801482e21d12ab6f55a@mail.gmail.com>
 <6624b168567be658f533f2d40bcc5fe4@mail.gmail.com>
Message-ID: 

Hi Alex and everyone else,

> > Done. A compressed patch for Polygraph v4 is attached.
>
> Thanks for that. I've recompiled, will re-run tests and let you know
> how it goes.

The first test run completed with 20 cores specified in the bench
config. With a 1500-byte MTU on the Intel X710s:

  1.5 million packets/s
  9.6 Gbit/s

Most of the 20 cores on each side were running at 100% on the Xeon
E5-4620 v4s. HTTPS (TLSv1, 2048-bit key size) was the workload. I'd say
this is a winner :)

Some little things to tweak now to improve per-worker load, IP address
coverage, etc.

Thanks for your assistance!

Regards,

William